feat(config): normalize enable flags and add extraction.ocr subtable (#735)
## Summary
- Rename `extraction.enabled` to `extraction.enable` with TOML key
migration and env var deprecation (`MICASA_EXTRACTION_ENABLED` ->
`MICASA_EXTRACTION_ENABLE`)
- Add `[extraction.ocr]` subtable with `enable` (bool) and
`confidence_threshold` (int) fields for independent OCR control
- Wire OCR config through `DefaultExtractors` — OCR extractors
conditionally included based on `extraction.ocr.enable`, low-confidence
words filtered by `confidence_threshold`
- Remove `text_timeout` from the config surface — pdftotext is fast, the
30s safety net stays as an internal `DefaultTextTimeout` constant
- Fix `FormatDuration` to produce clean notation for whole minutes/hours
(`5m` not `5m0s`)
- Fix docs that incorrectly described `llm.timeout` as a 5s quick-op
timeout (it is a 5m HTTP response timeout)
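For illustration, the env var deprecation in the first bullet could be handled roughly as below. This is a minimal sketch, not the code from this commit: the function name `ResolveExtractionEnable`, the precedence of the new name over the old one, and the returned "deprecated" flag are all assumptions; only the two variable names come from the summary.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

const (
	envEnable     = "MICASA_EXTRACTION_ENABLE"  // new name
	envEnabledOld = "MICASA_EXTRACTION_ENABLED" // deprecated name
)

// ResolveExtractionEnable is a hypothetical helper. It reads the new
// variable first and falls back to the deprecated one. The second return
// value reports whether the deprecated name supplied the value, so the
// caller can print a warning. lookup has the signature of os.LookupEnv,
// which makes the helper easy to test with a map.
func ResolveExtractionEnable(lookup func(string) (string, bool), def bool) (bool, bool, error) {
	if raw, ok := lookup(envEnable); ok {
		v, err := strconv.ParseBool(raw)
		if err != nil {
			return def, false, fmt.Errorf("%s: %w", envEnable, err)
		}
		return v, false, nil
	}
	if raw, ok := lookup(envEnabledOld); ok {
		v, err := strconv.ParseBool(raw)
		if err != nil {
			return def, true, fmt.Errorf("%s: %w", envEnabledOld, err)
		}
		return v, true, nil
	}
	return def, false, nil
}

func main() {
	// The default of true is an assumption made for this sketch.
	enabled, deprecated, err := ResolveExtractionEnable(os.LookupEnv, true)
	if err != nil {
		fmt.Fprintln(os.Stderr, "config error:", err)
		os.Exit(1)
	}
	if deprecated {
		fmt.Fprintf(os.Stderr, "warning: %s is deprecated, use %s\n", envEnabledOld, envEnable)
	}
	fmt.Println("extraction enabled:", enabled)
}
```

Passing the lookup function in, rather than calling `os.LookupEnv` directly, keeps the fallback logic testable without touching the process environment.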
closes #729
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
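The `FormatDuration` fix named in the summary (`5m` not `5m0s`) can be pictured with the sketch below. It is an assumption about the approach, not the implementation in this commit: it trims the zero-valued trailing units that `time.Duration.String` always emits.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// FormatDuration renders d in Go duration notation but drops zero-valued
// trailing units. Hypothetical sketch; the real function may differ.
func FormatDuration(d time.Duration) string {
	s := d.String()
	// Duration.String prints every unit below the largest one, e.g.
	// "5m0s" or "1h0m0s". Remove a trailing "0s" that follows minutes...
	if strings.HasSuffix(s, "m0s") {
		s = strings.TrimSuffix(s, "0s")
	}
	// ...then a trailing "0m" that follows hours.
	if strings.HasSuffix(s, "h0m") {
		s = strings.TrimSuffix(s, "0m")
	}
	return s
}

func main() {
	fmt.Println(FormatDuration(5 * time.Minute))  // 5m
	fmt.Println(FormatDuration(time.Hour))        // 1h
	fmt.Println(FormatDuration(90 * time.Minute)) // 1h30m
	fmt.Println(FormatDuration(30 * time.Second)) // 30s
}
```

Durations with a non-zero seconds component, such as `1m30s`, pass through unchanged because neither suffix check matches.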
 # with small, fast models optimized for structured JSON output.
 # model = "qwen2.5:7b"
 
-# Timeout for pdftotext. Go duration syntax: "30s", "1m", etc. Default: "30s".
-# Increase if you routinely process very large PDFs.
-# text_timeout = "30s"
-
 # Maximum pages to OCR for scanned documents. 0 = no limit. Default: 0.
 # max_pages = 0
 
@@ -338,7 +335,7 @@ set in `[llm.chat]` and `[llm.extraction]`.
 |`model`| string |`qwen3`| Model identifier sent in chat requests. Must be available on the server. |
 |`api_key`| string | (empty) | Authentication credential. Required for cloud providers (Anthropic, OpenAI, etc.). Leave empty for local servers. |
 |`extra_context`| string | (empty) | Free-form text appended to all LLM system prompts. Useful for telling the model about your house or regional conventions. Currency is handled automatically via `[locale]`. |
-|`timeout`| string |`"5s"`| Max wait time for quick LLM operations (ping, model listing). Go duration syntax, e.g. `"10s"`, `"500ms"`. Increase for slow servers. |
+|`timeout`| string |`"5m"`| Max time for a single LLM response (including streaming). Go duration syntax, e.g. `"10m"`. Increase for slow models. |
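The `timeout` row above takes Go duration syntax. As a rough illustration of what that means for a value such as `"5m"`, the sketch below parses it with the standard library's `time.ParseDuration`. The helper name `ParseTimeout` and the rule that rejects non-positive values are assumptions made for this example, not behavior confirmed by the commit.

```go
package main

import (
	"fmt"
	"time"
)

// ParseTimeout is a hypothetical helper that parses a config string in
// Go duration syntax and rejects zero or negative values.
func ParseTimeout(raw string) (time.Duration, error) {
	d, err := time.ParseDuration(raw)
	if err != nil {
		return 0, fmt.Errorf("invalid timeout %q: %w", raw, err)
	}
	if d <= 0 {
		return 0, fmt.Errorf("invalid timeout %q: must be positive", raw)
	}
	return d, nil
}

func main() {
	// "5" has no unit suffix, so time.ParseDuration rejects it.
	for _, raw := range []string{"5m", "10m", "5"} {
		d, err := ParseTimeout(raw)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(raw, "parses to", d)
	}
}
```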