Commit 6c76509

fix: remove overly aggressive html filtering
this commit:
- removes figure and sup tags from html filtering
- keeps alert and role=alert selectors commented out
- adds explanatory comments about alerts being used for important content

fixes #36
1 parent 389c8f5 commit 6c76509

1 file changed: +2 -4 lines changed

src/scraper/processor/HtmlProcessor.ts

Lines changed: 2 additions & 4 deletions
@@ -66,18 +66,16 @@ export class HtmlProcessor implements ContentProcessor {
     ".signup-form",
     ".tooltip",
     ".dropdown-menu",
-    ".alert",
+    // ".alert", // Known issue: Some pages use alerts for important content
     ".breadcrumb",
     ".pagination",
-    '[role="alert"]',
+    // '[role="alert"]', // Known issue: Some pages use alerts for important content
     '[role="banner"]',
     '[role="dialog"]',
     '[role="alertdialog"]',
     '[role="region"][aria-label*="skip" i]',
     '[aria-modal="true"]',
     ".noprint",
-    "figure",
-    "sup",
   ];
 
   constructor(options?: HtmlProcessOptions) {
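
The effect of the change above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's `HtmlProcessor` code: the `El` type and the `matches` and `shouldRemove` helpers are hypothetical, the selector lists are only a subset of the ones in the diff, and the matcher handles just three simple selector forms (tag, `.class`, `[attr="value"]`), whereas a real processor would presumably evaluate the selectors against a DOM.

```typescript
// Hypothetical, simplified element model (assumption; not the project's types).
interface El {
  tag: string;
  classes: string[];
  attrs: Record<string, string>;
}

// Subset of the selector list as it stands after this commit.
const selectorsAfter: string[] = [
  ".tooltip",
  ".dropdown-menu",
  ".breadcrumb",
  ".pagination",
  '[role="banner"]',
  '[role="dialog"]',
  '[role="alertdialog"]',
  '[aria-modal="true"]',
  ".noprint",
];

// The same subset plus the four entries this commit deactivates.
const selectorsBefore: string[] = [
  ...selectorsAfter,
  ".alert",
  '[role="alert"]',
  "figure",
  "sup",
];

// Matches only three simple forms: tag, .class, and [attr="value"].
function matches(el: El, selector: string): boolean {
  if (selector.startsWith(".")) {
    return el.classes.includes(selector.slice(1));
  }
  if (selector.startsWith("[") && selector.endsWith("]")) {
    const eq = selector.indexOf("=");
    const name = selector.slice(1, eq);
    const value = selector.slice(eq + 2, -2); // strip the quotes and "]"
    return el.attrs[name] === value;
  }
  return el.tag === selector;
}

// An element is filtered out if any selector in the list matches it.
function shouldRemove(el: El, selectors: string[]): boolean {
  return selectors.some((s) => matches(el, s));
}
```

With this sketch, an element such as `{ tag: "div", classes: ["alert"], attrs: { role: "alert" } }` is removed under `selectorsBefore` but kept under `selectorsAfter`, while an element with `role="alertdialog"` is removed under both lists because that selector is unchanged.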
