Commit ba4768f
feat: switch to jsdom for DOM processing and improve database queries
This commit replaces happy-dom with jsdom for HTML parsing and manipulation due to memory issues encountered during processing with happy-dom.
The following changes were made:
- Updated `vitest.config.ts` to use the `jsdom` environment.
- Replaced `happy-dom` with `jsdom` in `package.json` and `package-lock.json`.
- Updated `HtmlProcessor.ts` and `SemanticMarkdownSplitter.ts` to use `jsdom` for DOM operations.
- Added case-insensitive indexes to `library` and `version` columns in `init-db.sql`.
- Updated database queries in `DocumentStore.ts` to use `LOWER()` for case-insensitive matching of `library` and `version`.
- Updated `ScrapeTool.ts` to normalize the version using `semver`.

1 parent bc805fc commit ba4768f
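The database and version-handling changes listed above can be illustrated with a minimal sketch. It is not the code from this commit: the `documents` table, the index names, the `$1`/`$2` placeholder style, and both function names are assumptions made for illustration. `normalizeVersion` is a hand-written stand-in for the idea of normalizing a version string and does not call the `semver` package.

```typescript
// Hypothetical table and index names; the actual contents of init-db.sql
// are not shown in this commit description.
const CREATE_INDEXES_SQL: string[] = [
  "CREATE INDEX IF NOT EXISTS idx_documents_library_lower ON documents (LOWER(library));",
  "CREATE INDEX IF NOT EXISTS idx_documents_version_lower ON documents (LOWER(version));",
];

interface ParameterizedQuery {
  sql: string;
  params: string[];
}

// Builds a lookup that compares both columns through LOWER(), so that
// expression indexes like the ones above can serve the query.
// The placeholder style ($1, $2) depends on the database driver (assumption).
function buildFindByLibraryVersion(library: string, version: string): ParameterizedQuery {
  return {
    sql:
      "SELECT * FROM documents " +
      "WHERE LOWER(library) = LOWER($1) AND LOWER(version) = LOWER($2)",
    params: [library, version],
  };
}

// Stand-in for semver-based normalization (NOT the semver package):
// strips a leading "v" and pads a missing minor or patch part with 0.
// Returns null when the input is not a plain numeric version.
function normalizeVersion(raw: string): string | null {
  const match = /^v?(\d+)(?:\.(\d+))?(?:\.(\d+))?$/.exec(raw.trim());
  if (match === null) return null;
  const [, major, minor = "0", patch = "0"] = match;
  return `${Number(major)}.${Number(minor)}.${Number(patch)}`;
}
```

Under these assumptions, a caller would normalize the version once before storing or querying, then pass it to the query builder, so `React` with `v18` and `react` with `18.0.0` resolve to the same rows.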
File tree
8 files changed, +532 -53 lines changed
- src
- scraper/processor
- splitter
- store
- tools