# Conversor IAE CNAE — Agent API Reference

Public, unauthenticated API for AI agents and MCP servers to query the Spanish IAE / CNAE 2025 / CNAE 2009 catalogs plus the AEAT modelos intelligence database.

## Overview
- **Base URL:** `https://lctcvzapdeljxhjoevvp.supabase.co/functions/v1`
- **Auth:** none (public)
- **Rate limit:** 50 requests / 5 minutes per IP
- **Daily Gemini embed cap:** 5,000 calls (shared across all IPs; lexical-only fallback when exhausted)
- **CORS:** `*` (browser-callable from anywhere)
- **No PII logged:** only query text + similarity scores + latency
- **Open dataset (CC-BY-4.0):** github.com/conversoriaecnae/iae-cnae-open-data, covering IAE, CNAE 2025, and the CNAE 2009→2025 correspondence table.
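
Because the limit is 50 requests per 5 minutes per IP, an agent may want to throttle itself before calling. The following is a minimal sketch of a client-side sliding-window guard. The class `SlidingWindowLimiter` and its methods are hypothetical names, not part of this API, and the server's exact window accounting is not documented here, so treat it as a conservative approximation.

```ts
// Hypothetical client-side guard for the documented 50 requests / 5 minutes limit.
// The server's real accounting may differ; this only avoids obvious overruns.
class SlidingWindowLimiter {
  private timestamps: number[] = [];
  private readonly maxRequests: number;
  private readonly windowMs: number;

  constructor(maxRequests: number = 50, windowMs: number = 5 * 60 * 1000) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
  }

  // Returns true and records the request if it fits in the current window.
  tryAcquire(now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    // Drop timestamps that have fallen out of the window.
    this.timestamps = this.timestamps.filter((t) => t > cutoff);
    if (this.timestamps.length >= this.maxRequests) return false;
    this.timestamps.push(now);
    return true;
  }

  // Milliseconds until the next request would be allowed (0 if allowed now).
  msUntilAvailable(now: number = Date.now()): number {
    const cutoff = now - this.windowMs;
    const live = this.timestamps.filter((t) => t > cutoff);
    if (live.length < this.maxRequests) return 0;
    const oldest = live[0] ?? now;
    return oldest + this.windowMs - now;
  }
}
```

A caller would check `tryAcquire()` before each request and, when it returns `false`, wait `msUntilAvailable()` milliseconds before retrying.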

## Discovery (handshake)
Call the `describe_conversor()` SQL RPC via supabase-js or REST to get a JSON self-description of the catalogs, RPCs, edge endpoints, and crosswalks. The description updates automatically as the schema evolves.
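
A minimal sketch of the REST variant of the handshake. It assumes the RPC is exposed at the standard Supabase PostgREST path `/rest/v1/rpc/<function_name>` on the same project host as the Base URL, and that the anon key is sent in the usual `apikey` and `Authorization` headers. The helper names `buildRpcRequest` and `describeConversor` are hypothetical, and the anon key itself is not reproduced here.

```ts
// Project host taken from the Base URL in the Overview.
const PROJECT_URL = "https://lctcvzapdeljxhjoevvp.supabase.co";

interface RpcRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Builds a PostgREST RPC request. Assumption: the RPC lives at the
// standard Supabase path /rest/v1/rpc/<function_name>.
function buildRpcRequest(
  fn: string,
  args: Record<string, unknown>,
  anonKey: string,
): RpcRequest {
  return {
    url: `${PROJECT_URL}/rest/v1/rpc/${encodeURIComponent(fn)}`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        apikey: anonKey,
        Authorization: `Bearer ${anonKey}`,
      },
      body: JSON.stringify(args),
    },
  };
}

// Performs the handshake and returns the parsed JSON self-description.
// The shape of the description is not specified here, hence `unknown`.
async function describeConversor(anonKey: string): Promise<unknown> {
  const { url, init } = buildRpcRequest("describe_conversor", {}, anonKey);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`describe_conversor failed: HTTP ${res.status}`);
  return res.json();
}
```

Keeping request construction separate from the network call makes it easy to reuse `buildRpcRequest` for the other RPCs listed in this reference.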

## Endpoints

### POST /conversor-search
Body: `{ q: string, type?: 'iae'|'cnae'|'cnae_2009'|'modelos'|'all', top_k?: number, threshold?: number }`
Returns: `{ results: Array<{ code, title, type, score, ... }>, latency_ms }`
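
A minimal sketch of a typed client for this endpoint. The request fields come from the body shape above; the field types inside `SearchHit` are assumptions (the reference only names the fields), and the helper names `buildSearchBody` and `conversorSearch` are hypothetical. The non-empty check on `q` is a client-side guard, not a documented server rule.

```ts
// Base URL from the Overview section.
const FUNCTIONS_URL = "https://lctcvzapdeljxhjoevvp.supabase.co/functions/v1";

type CatalogType = "iae" | "cnae" | "cnae_2009" | "modelos" | "all";

interface SearchBody {
  q: string;
  type?: CatalogType;
  top_k?: number;
  threshold?: number;
}

// Field types are assumptions; extra fields are allowed because the
// documented result shape ends with "...".
interface SearchHit {
  code: string;
  title: string;
  type: string;
  score: number;
  [extra: string]: unknown;
}

interface SearchResponse {
  results: SearchHit[];
  latency_ms: number;
}

// Trims the query and rejects empty input before spending a request.
function buildSearchBody(q: string, opts: Omit<SearchBody, "q"> = {}): SearchBody {
  const query = q.trim();
  if (query.length === 0) throw new Error("q must be a non-empty string");
  return { ...opts, q: query };
}

// Sends the search; no auth header is needed because the API is public.
async function conversorSearch(body: SearchBody): Promise<SearchResponse> {
  const res = await fetch(`${FUNCTIONS_URL}/conversor-search`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  if (!res.ok) throw new Error(`conversor-search failed: HTTP ${res.status}`);
  return (await res.json()) as SearchResponse;
}
```

For example, `conversorSearch(buildSearchBody("panadería", { type: "iae", top_k: 5 }))` would request up to five IAE matches for a bakery-related query.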

### POST /conversor-search/match-modelos-for-code
Body: `{ code: string, code_type: 'iae'|'cnae' }`
Returns: `{ results: Array<{ modelo_id, name, periodicidad, confidence, reason, fecha_proximo_plazo }> }`
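
A minimal sketch of how an agent might post-process the results of this endpoint, ordering applicable modelos by their next deadline. The field types are assumptions: in particular, `modelo_id` is typed as a string and `fecha_proximo_plazo` is assumed to be a nullable ISO `YYYY-MM-DD` date string, which is what makes plain string comparison valid. The helper names are hypothetical.

```ts
// Result row as named in the reference; types are assumptions.
interface ModeloMatch {
  modelo_id: string;
  name: string;
  periodicidad: string;
  confidence: number;
  reason: string;
  fecha_proximo_plazo: string | null; // assumed ISO date (YYYY-MM-DD) or null
}

// Request body for POST /conversor-search/match-modelos-for-code.
function buildMatchBody(code: string, codeType: "iae" | "cnae") {
  return { code, code_type: codeType };
}

// Keeps rows with a known deadline and enough confidence, soonest first.
// ISO dates sort correctly as strings, so no Date parsing is needed.
function upcomingByDeadline(
  results: ModeloMatch[],
  minConfidence: number = 0,
): ModeloMatch[] {
  return results
    .filter((r) => r.fecha_proximo_plazo !== null && r.confidence >= minConfidence)
    .sort((a, b) => {
      const da = a.fecha_proximo_plazo as string;
      const db = b.fecha_proximo_plazo as string;
      return da < db ? -1 : da > db ? 1 : 0;
    });
}
```

If the real `fecha_proximo_plazo` turns out to be a timestamp or a localized date, the comparison should parse it first instead of comparing strings.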

## RPCs (via supabase-js with the anon key)
The anon key is publishable. RPCs are `SECURITY DEFINER` with a locked `search_path`.

- `search_iae_codes(search_query, query_embedding?, match_threshold?)` — hybrid lexical + semantic, 9 signals
- `search_cnae_codes(...)`, `search_cnae_2009_codes(...)` — analogous
- `search_modelos(search_query, query_embedding?, match_threshold?)` — same shape, scoped to modelos
- `match_code_universe(query_embedding, search_query?, top_k?, match_threshold?)` — UNION ALL across all 3 code catalogs
- `match_modelos_for_iae(p_cod_iae, p_code_type)` — applicability join (works for IAE or CNAE)
- `get_modelo_details(p_modelo_id)` — JSONB blob: row + casillas + plazos_current + counts
- `get_iae_details(text)`, `get_cnae_details(text)`, `get_cnae2009_details(text)` — JSONB detail blobs
- `search_modelo_chunks(query_embedding, match_threshold?, match_count?, filter_topics?, filter_modelo_ids?)` — RAG retrieval over document_chunks for prose grounding
- `get_fiscal_brief_for_code(p_code, p_code_type, p_query_embedding?)` — single-call bulk endpoint combining applicability + RAG chunks + casillas
- `describe_conversor()` — JSONB self-description

## Embedding model
- Gemini Embedding 2 at 768-dim (Matryoshka-truncated)
- Query prefix: `task: search result | query: {text}`
- Documents are pre-embedded; agents only need to embed queries
- Recommended threshold: 0.55 for lexical-friendly matching, 0.65 for strict semantic matching
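
A minimal sketch of query-side preparation under the settings above. `formatQuery` applies the documented prefix. `truncateAndNormalize` assumes your embedding call returns a vector longer than 768 dimensions and that you truncate on the client; re-normalizing to unit length after truncation is common practice for Matryoshka embeddings but is an assumption about this service, not something the reference states. All function names are hypothetical.

```ts
const EMBED_DIM = 768;

// Thresholds recommended in this reference.
const THRESHOLD_LEXICAL_FRIENDLY = 0.55;
const THRESHOLD_STRICT_SEMANTIC = 0.65;

// Applies the documented query prefix before embedding.
function formatQuery(text: string): string {
  return `task: search result | query: ${text}`;
}

// Keeps the first `dim` components and rescales the result to unit L2 norm.
// Assumption: truncated vectors should be re-normalized before cosine search.
function truncateAndNormalize(vec: number[], dim: number = EMBED_DIM): number[] {
  if (vec.length < dim) {
    throw new Error(`expected at least ${dim} dimensions, got ${vec.length}`);
  }
  const head = vec.slice(0, dim);
  const norm = Math.sqrt(head.reduce((sum, x) => sum + x * x, 0));
  if (norm === 0) throw new Error("cannot normalize a zero vector");
  return head.map((x) => x / norm);
}

// Picks one of the two recommended thresholds.
function pickThreshold(strict: boolean): number {
  return strict ? THRESHOLD_STRICT_SEMANTIC : THRESHOLD_LEXICAL_FRIENDLY;
}
```

The output of `truncateAndNormalize` is what would be passed as `query_embedding` (or `p_query_embedding`) to the RPCs, with `pickThreshold` supplying `match_threshold`.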

## Example: full per-code fiscal brief in one round-trip
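```ts
import { createClient } from '@supabase/supabase-js';

// SUPABASE_ANON_KEY is the publishable anon key; embedQuery() is your own
// helper that returns a 768-dim query embedding (see "Embedding model").
const supabase = createClient('https://lctcvzapdeljxhjoevvp.supabase.co', SUPABASE_ANON_KEY);

const { data, error } = await supabase.rpc('get_fiscal_brief_for_code', {
  p_code: '501.3',
  p_code_type: 'iae',
  p_query_embedding: await embedQuery('obligaciones fiscales albañilería'),
});
if (error) throw error;
// data => { code, modelos_aplicables: [...], rag_chunks: [...], ... }
```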

## Calendar feed
- `GET /api/v1/calendario/{code_type}/{code}.ics` — iCalendar feed for a single IAE/CNAE code
- Subscribable from Google Calendar, Outlook, Apple Calendar
- Updates as plazos (filing deadlines) are loaded into `modelo_plazos`
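
A minimal sketch of building a feed URL from the path template above. The host that serves `/api/v1/calendario/...` is not stated in this reference, so `origin` is a required parameter rather than a guessed constant. The accepted `code_type` values are assumed to be `iae` and `cnae`, and the `webcal:` rewrite relies on the convention many calendar apps follow for subscription links. Both helper names are hypothetical.

```ts
// Assumed code types for the calendar feed.
type CalendarCodeType = "iae" | "cnae";

// Fills in the documented path template on a caller-supplied origin.
function buildCalendarUrl(
  origin: string,
  codeType: CalendarCodeType,
  code: string,
): string {
  const base = origin.replace(/\/+$/, ""); // drop trailing slashes
  return `${base}/api/v1/calendario/${codeType}/${encodeURIComponent(code)}.ics`;
}

// Rewrites http(s) to webcal so calendar apps treat the link as a subscription.
function toWebcal(url: string): string {
  return url.replace(/^https?:/, "webcal:");
}
```

For example, `buildCalendarUrl(origin, "iae", "501.3")` yields a path ending in `/api/v1/calendario/iae/501.3.ics` on whatever origin hosts the feed.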

## Source attribution
Every modelo row includes `oficial_url` pointing to the AEAT sede page. Every applicability row includes `confidence` (0.0–1.0) and `source_method` (`rule`|`llm`|`manual`). Every RAG chunk includes `source_anchor` linking back to `document_chunks` for citation.
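
A minimal sketch of how an agent could check and surface the attribution fields on an applicability row before citing it. The value range and the three `source_method` values come from the paragraph above; the interface and function names are hypothetical.

```ts
type SourceMethod = "rule" | "llm" | "manual";
const SOURCE_METHODS: readonly string[] = ["rule", "llm", "manual"];

interface ApplicabilityAttribution {
  confidence: number; // documented range: 0.0 to 1.0
  source_method: SourceMethod;
}

// Checks that a row carries attribution fields in the documented ranges.
function hasValidAttribution(row: {
  confidence?: unknown;
  source_method?: unknown;
}): boolean {
  const c = row.confidence;
  const m = row.source_method;
  return (
    typeof c === "number" &&
    c >= 0 &&
    c <= 1 &&
    typeof m === "string" &&
    SOURCE_METHODS.indexOf(m) !== -1
  );
}

// Renders a short provenance note suitable for showing next to an answer.
function describeProvenance(row: ApplicabilityAttribution): string {
  return `${row.source_method} (confidence ${row.confidence.toFixed(2)})`;
}
```

An agent might, for instance, show `describeProvenance(row)` alongside the modelo's `oficial_url` so that users can see how each applicability claim was derived.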

## Versioning
The agent API follows semver. Breaking changes get a `/v2` path. Discovery via `describe_conversor()` always returns the current schema.

## Contact
brian@conversoriaecnae.es
