API Usage¶
This page shows the concrete API calls for full-text extraction, including the LLM-backed path added for end-to-end validation.
Base URL¶
Local API development runs on http://localhost:8734.
Full-Text Extraction¶
CLI example¶
Standard backend¶
curl -X POST "http://localhost:8734/api/v1/text/process" \
-H "Content-Type: application/json" \
-d '{
"text": "The patient exhibits microcephaly and frequent seizures.",
"extraction_backend": "standard"
}'
LLM backend¶
curl -X POST "http://localhost:8734/api/v1/text/process" \
-H "Content-Type: application/json" \
-d '{
"text": "The patient exhibits microcephaly and frequent seizures.",
"extraction_backend": "llm",
"llm_mode": "two_phase"
}'
The LLM response keeps the same top-level shape and includes metadata such as
extraction_backend, llm_model, and llm_mode.
Public REST clients cannot select the LLM provider, model, or base URL. The
server owns the public LLM target and currently routes LLM extraction to
gemini/gemini-3.1-flash-lite-preview. Requests that include llm_model,
llm_provider, or llm_base_url are rejected; omit those fields and use
llm_mode: "two_phase" when requesting LLM extraction.
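Because requests carrying server-owned fields are rejected, a client may want to catch them before sending. The following is a minimal sketch of such a guard, using only the field names listed above. The helper name `build_llm_request` is hypothetical and is not part of Phentrieve.

```python
import json

# Fields this page says the public API rejects; the server owns the LLM target.
SERVER_OWNED_FIELDS = frozenset({"llm_model", "llm_provider", "llm_base_url"})


def build_llm_request(text: str, **extra) -> str:
    """Return a JSON body for POST /api/v1/text/process using the LLM backend.

    Hypothetical client-side helper; it only assembles the body and does not
    send anything.
    """
    forbidden = SERVER_OWNED_FIELDS.intersection(extra)
    if forbidden:
        raise ValueError(f"server-owned fields not allowed: {sorted(forbidden)}")
    body = {
        "text": text,
        "extraction_backend": "llm",
        "llm_mode": "two_phase",
        **extra,
    }
    return json.dumps(body)
```

Failing early on the client side gives a clearer error than a rejected request, but the server remains the authority on which fields are accepted.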
LLM extraction uses defense-in-depth prompt-injection controls: submitted text is treated as untrusted data, prompt templates separate data from instructions, and backend validation checks structured output before returning suggestions. These controls reduce risk but do not make the service appropriate for clinical decision support. Use the public REST API for research workflows only, and do not submit identifiable patient data to public deployments.
Production Environment¶
The FastAPI layer uses these environment variables for production LLM handling:
export PHENTRIEVE_ENV=production
export PHENTRIEVE_TRUSTED_PROXY_CIDRS="127.0.0.1/32,10.0.0.0/8"
export PHENTRIEVE_LLM_DAILY_LIMIT=3
export PHENTRIEVE_LLM_QUOTA_DB_PATH="../data/app/llm_quota.db"
- PHENTRIEVE_ENV controls whether the API is running in development or production mode.
- PHENTRIEVE_TRUSTED_PROXY_CIDRS defines which proxy networks are allowed to forward client IPs for quota tracking.
- PHENTRIEVE_LLM_DAILY_LIMIT sets the number of successful anonymous LLM API analyses allowed per UTC day.
- PHENTRIEVE_LLM_QUOTA_DB_PATH points to the SQLite database used for API quota persistence.
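To illustrate what a trusted-proxy CIDR list means in practice, here is a small sketch that parses a comma-separated value and checks whether a connecting peer falls inside it. It uses only the Python standard library. The function names are hypothetical, and this is not necessarily how the server implements the check.

```python
import ipaddress


def load_trusted_networks(raw: str):
    """Parse a comma-separated CIDR list such as PHENTRIEVE_TRUSTED_PROXY_CIDRS."""
    return [
        ipaddress.ip_network(part.strip())
        for part in raw.split(",")
        if part.strip()
    ]


def is_trusted_proxy(peer_ip: str, networks) -> bool:
    """Return True when the connecting peer lies inside a trusted network."""
    address = ipaddress.ip_address(peer_ip)
    return any(address in network for network in networks)


# Example with the value shown in the export line above.
networks = load_trusted_networks("127.0.0.1/32,10.0.0.0/8")
print(is_trusted_proxy("10.1.2.3", networks))     # inside 10.0.0.0/8
print(is_trusted_proxy("192.168.1.5", networks))  # outside both networks
```

Only a peer inside one of these networks would be allowed to supply a forwarded client IP; for any other peer, the connecting address itself would be the one counted against the quota.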
API Documentation¶
When the API is running, the OpenAPI pages are available at:
- Swagger UI: http://localhost:8734/docs
- ReDoc: http://localhost:8734/redoc
Profiles vs API¶
The HTTP API does not accept the CLI's --profile flag; request fields
are explicit. If you're moving a workflow from the CLI to the API, copy the
relevant fields from your profile into the request body. See
Configuration Profiles for the profile schema.
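As one way to carry profile settings over, the sketch below copies a single block from an already-parsed profile into a request body. It assumes the profile has been loaded into a plain dict and handles only the extraction.adaptive_rechunking block, which the Adaptive Re-Chunking section of this page says the request field mirrors. The function name is hypothetical, and other profile fields would need their own mapping.

```python
import json


def request_body_from_profile(text: str, profile: dict) -> str:
    """Copy the adaptive re-chunking block of a parsed profile into a request body.

    `profile` is assumed to be the already-parsed YAML mapping. Only the
    extraction.adaptive_rechunking block is handled in this sketch.
    """
    body = {"text": text}
    rechunking = profile.get("extraction", {}).get("adaptive_rechunking")
    if rechunking:
        # Copy so later edits to the body do not mutate the profile.
        body["adaptive_rechunking"] = dict(rechunking)
    return json.dumps(body)
```

The resulting string can be passed as the `-d` payload of the curl calls shown on this page.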
Adaptive Re-Chunking¶
TextProcessingRequest accepts an optional adaptive_rechunking field that
mirrors the YAML extraction.adaptive_rechunking block. When provided with
enabled: true, poor-quality chunks are subdivided at sentence boundaries
and re-queried. The full schema and trigger semantics are described in
Adaptive Re-Chunking.
Request example:
curl -X POST "http://localhost:8734/api/v1/text/process" \
-H "Content-Type: application/json" \
-d '{
"text": "Patient with intellectual disability, microcephaly, seizures, and short stature.",
"adaptive_rechunking": {
"enabled": true,
"quality_threshold": 0.55,
"margin_threshold": 0.03,
"max_depth": 2
}
}'
All adaptive_rechunking fields are optional; omitted fields fall through
to server-side defaults. The full set of knobs is:
- enabled (bool, default false)
- quality_threshold (float, default 0.55)
- margin_threshold (float, default 0.03)
- max_depth (int, default 2)
- min_chunk_chars (int, default 30)
- max_sentences_per_subchunk (int, default 3)
- overlap_sentences (int, default 1)
- score_improvement_gate (float, default 0.05)
- use_ontology_coherence (bool, default false; reserved, inert in v1)
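Since omitted fields fall through to server-side defaults, a client can send only the values it wants to override. The sketch below mirrors the documented fields and defaults in a dataclass and emits just the overrides. The class name and `to_payload` method are hypothetical, and the defaults are copied from this page rather than read from the server.

```python
from dataclasses import dataclass, fields


@dataclass
class AdaptiveRechunking:
    """Client-side mirror of the documented fields and their defaults."""

    enabled: bool = False
    quality_threshold: float = 0.55
    margin_threshold: float = 0.03
    max_depth: int = 2
    min_chunk_chars: int = 30
    max_sentences_per_subchunk: int = 3
    overlap_sentences: int = 1
    score_improvement_gate: float = 0.05
    use_ontology_coherence: bool = False  # reserved, inert in v1

    def to_payload(self) -> dict:
        """Return only the fields that differ from the documented defaults."""
        return {
            f.name: getattr(self, f.name)
            for f in fields(self)
            if getattr(self, f.name) != f.default
        }


print(AdaptiveRechunking(enabled=True, max_depth=3).to_payload())
```

If the server-side defaults ever diverge from the values listed here, sending every field explicitly would be the safer choice.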
When the feature is enabled and triggered, the response carries a
meta.adaptive_rechunking block summarizing what happened:
{
"meta": {
"extraction_backend": "standard",
"adaptive_rechunking": {
"enabled": true,
"trigger_count": 3,
"subdivided_count": 2,
"reverted_count": 1,
"max_depth_reached": 1,
"extra_chunks_added": 4
}
}
}
The block is omitted when adaptive_rechunking is disabled or omitted from
the request.
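Because the block can be absent, client code should not assume it exists. Here is a minimal sketch of reading it defensively from a response body, using only the keys shown in the example above. The function name is hypothetical.

```python
import json


def summarize_rechunking(response_text: str):
    """Return the meta.adaptive_rechunking block, or None when it is absent."""
    payload = json.loads(response_text)
    return payload.get("meta", {}).get("adaptive_rechunking")


sample = json.dumps(
    {
        "meta": {
            "extraction_backend": "standard",
            "adaptive_rechunking": {"enabled": True, "subdivided_count": 2},
        }
    }
)
summary = summarize_rechunking(sample)
if summary is not None:
    print(summary["subdivided_count"])
```

A `None` result means the block was not in the response, so callers can skip any re-chunking reporting without special-casing the request they sent.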