Commit f20f45a
fix: make entity_registry.research() local-only by default (#811)
* fix: make entity_registry.research() local-only by default
research() previously called _wikipedia_lookup() unconditionally,
sending entity names to en.wikipedia.org on every uncached lookup.
This violates the project's local-first and privacy-by-architecture
principles documented in CLAUDE.md.
Changes:
- research() now returns "unknown" for uncached words by default
- New allow_network parameter; callers must pass allow_network=True
  to enable Wikipedia lookups
- Wikipedia 404 now returns "unknown" instead of asserting "person"
  with 0.70 confidence, preventing entity registry poisoning
- Added privacy warning docstring to _wikipedia_lookup()
- Added tests for local-only default, opt-in network, 404 handling,
  and cache-not-persisted-on-local-only behaviour
Refs: #809
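
The behaviour described in the change list above can be sketched roughly as follows. This is an illustration only, not the project's code: the class name EntityRegistrySketch, the injected lookup callable, the "wiki_cache" key, and the "inferred_type"/"confidence" field names are all assumptions made for the example. Only research(), allow_network, _data, and the "word"/"confirmed" keys come from the commit message.

```python
from typing import Callable, Optional


class EntityRegistrySketch:
    """Hypothetical stand-in for the registry; not the project's actual class."""

    def __init__(self, lookup: Optional[Callable[[str], dict]] = None):
        # "wiki_cache" is an assumed key name for the lookup cache.
        self._data = {"wiki_cache": {}}
        # The lookup is injected so this sketch never touches the network itself.
        self._lookup = lookup or (
            lambda word: {"inferred_type": "unknown", "confidence": 0.0}
        )

    def research(self, word: str, allow_network: bool = False) -> dict:
        # Cached results are served regardless of the network flag.
        cached = self._data.get("wiki_cache", {}).get(word)
        if cached is not None:
            return cached
        if not allow_network:
            # Local-only default: report "unknown" and persist nothing.
            return {
                "word": word,
                "inferred_type": "unknown",
                "confidence": 0.0,
                "confirmed": False,
            }
        # Opt-in path: run the lookup, normalise the shape, then cache it.
        result = self._lookup(word)
        result.setdefault("word", word)
        result.setdefault("confirmed", False)
        self._data.setdefault("wiki_cache", {})[word] = result
        return result


registry = EntityRegistrySketch(
    lookup=lambda w: {"inferred_type": "person", "confidence": 0.9}
)
local = registry.research("Saoirse")                       # no lookup, nothing cached
remote = registry.research("Saoirse", allow_network=True)  # lookup runs, result cached
```

In this sketch the name only reaches the lookup when the caller opts in explicitly, and a local-only call leaves the cache untouched, which matches the intent stated in the commit message.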
* fix: improve research() cache read path and deduplicate test mocks
- Use .get() instead of .setdefault() for cache reads in research()
  so the local-only path never mutates _data unnecessarily
- Move .setdefault() to the network-write path only
- Use result.setdefault() for word/confirmed keys to ensure
  consistent return shape across all _wikipedia_lookup error paths
- Extract duplicated mock_result dict into _MOCK_SAOIRSE_PERSON
  constant shared by 3 test functions
1 parent f36d04e commit f20f45a
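
The difference between reading with .get() and reading with .setdefault() mentioned above can be shown with plain dicts. This is a minimal sketch: the "wiki_cache" key, the "inferred_type" field, and the normalize() helper are hypothetical names chosen for illustration, not taken from the project.

```python
# Reading through setdefault() inserts the missing key as a side effect.
data = {}
data.setdefault("wiki_cache", {}).get("Saoirse")
mutated_by_setdefault = "wiki_cache" in data

# Reading through get() leaves the dict untouched.
data = {}
data.get("wiki_cache", {}).get("Saoirse")
mutated_by_get = "wiki_cache" in data


def normalize(result: dict, word: str) -> dict:
    """Hypothetical helper: fill in missing keys without overwriting existing ones."""
    result.setdefault("word", word)
    result.setdefault("confirmed", False)
    return result


print(mutated_by_setdefault, mutated_by_get)
print(normalize({"inferred_type": "unknown"}, "Saoirse"))
```

Because setdefault() only writes a key that is absent, it suits the normalisation step: every result gains "word" and "confirmed" without clobbering values that a lookup path already set.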
2 files changed
Lines changed: 118 additions & 30 deletions