|
53 | 53 | - Need: stricter post-compilation visual checks — compare rendered PDF region against template spec |
54 | 54 | - Consider: pixel-level overlap detection for text/figure collisions |
55 | 55 |
|
56 | | -### [ ] Citation authenticity & hallucination |
57 | | -- LLM-generated references are frequently hallucinated (wrong author, wrong year, non-existent papers) |
58 | | -- Current pipeline has no citation verification step |
59 | | -- Need: post-write citation verification phase |
60 | | - - Cross-check each `\cite{}` entry against Semantic Scholar / CrossRef / Google Scholar API |
61 | | - - Verify: title exists, authors match, year matches, DOI resolves |
62 | | - - Flag or remove unverifiable citations |
63 | | -- Need: researcher agent should provide real BibTeX entries from actual database queries, not LLM memory |
64 | | -- Consider: mandatory `references.bib` sourced exclusively from API-fetched entries |
| 56 | +### [x] Citation authenticity & hallucination |
| 57 | +- Implemented API-first citation system (`ark/citation.py`) |
| 58 | +- LLM never writes BibTeX — all entries fetched from DBLP / CrossRef official APIs |
| 59 | +- Search cascade: DBLP → CrossRef → arXiv → Semantic Scholar |
| 60 | +- Researcher agent selects papers from API-verified candidate list only |
| 61 | +- Per-iteration verification: every review cycle re-verifies `references.bib` |
| 62 | +- Dual-source cross-confirmation (DBLP + CrossRef) |
| 63 | +- Preprint → published version auto-upgrade |
| 64 | +- Unused citation cleanup (removes uncited entries from `.bib`) |
| 65 | +- CLI tools: `ark cite-check`, `ark cite-search`, `ark cite-debug` |
65 | 66 |
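The search cascade and dual-source cross-confirmation listed above could be sketched roughly as follows. This is a minimal illustration, not the contents of `ark/citation.py`: the function names (`search_cascade`, `cross_confirm`, `normalize_title`), the record shape (a dict with `title` and `year`), and the stub sources are all assumptions made for the example. Real lookups would call the DBLP, CrossRef, arXiv, and Semantic Scholar APIs in place of the stubs.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical record shape: a plain dict with "title" and "year" keys.
Record = Dict[str, object]
# A source takes a query string and returns a record, or None on a miss.
Source = Callable[[str], Optional[Record]]


def normalize_title(title: str) -> str:
    """Lowercase and keep only alphanumerics so formatting differences are ignored."""
    return "".join(ch for ch in title.lower() if ch.isalnum())


def search_cascade(
    query: str, sources: List[Tuple[str, Source]]
) -> Optional[Tuple[str, Record]]:
    """Try each source in order and return (source_name, record) for the first hit."""
    for name, lookup in sources:
        record = lookup(query)
        if record is not None:
            return name, record
    return None  # no source knows this paper: treat the citation as unverified


def cross_confirm(a: Record, b: Record) -> bool:
    """Two independent sources agree if normalized title and year both match."""
    return (
        normalize_title(str(a["title"])) == normalize_title(str(b["title"]))
        and a["year"] == b["year"]
    )


# Stub sources standing in for real API clients (assumption: no network here).
def dblp_stub(query: str) -> Optional[Record]:
    return None  # simulate a DBLP miss


def crossref_stub(query: str) -> Optional[Record]:
    return {"title": "An Example Paper", "year": 2020}


hit = search_cascade(
    "an example paper", [("dblp", dblp_stub), ("crossref", crossref_stub)]
)
```

Here `hit` is the CrossRef stub's record, because the DBLP stub misses and the cascade falls through to the next source. Comparing on a normalized title plus year is one simple agreement rule; a real check might also compare authors or DOIs.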
|
66 | 67 | ### [ ] Table formatting |
67 | 68 | - Tables can overflow column/page width in two-column venues |
|