fix: allow JavaScript in archived pages to enable KaTeX/MathJax rendering #2541
idiottrader wants to merge 1 commit into karakeep-app:main
…ring Remove the -j flag from monolith arguments to preserve JavaScript in archived pages. This fixes issue karakeep-app#1243 where mathematical expressions weren't rendering in reader view.
No actionable comments were generated in the recent review.
Walkthrough
Modified the archiveWebpage subprocess invocation in crawlerWorker.ts by replacing the "-Ije" command-line flag with "-Ie" while leaving all other arguments unchanged.
Greptile Summary
This PR removes the
Key Changes:
Critical Security Concern:
Recommendations:
Confidence Score: 2/5
Important Files Changed
Last reviewed commit: b6e75c7
      no_proxy: serverConfig.proxy.noProxy?.join(","),
    },
-  })("monolith", ["-", "-Ije", "-t", "5", "-b", url, "-o", assetPath]);
+  })("monolith", ["-", "-Ie", "-t", "5", "-b", url, "-o", assetPath]);
Allowing JavaScript in archived pages creates a stored XSS risk. The archived HTML is later rendered using `dangerouslySetInnerHTML` in `BookmarkHTMLHighlighter.tsx` (line 411) without sanitization. While the `-I` flag provides some isolation, malicious JS from archived pages could still execute in users' browsers and potentially access localStorage, cookies, or make API calls. Consider one of the following:
- Using DOMPurify to sanitize the archived HTML before storage/display
- Rendering archived content in a sandboxed iframe with strict CSP
- Only allowing specific script sources needed for math rendering (e.g., KaTeX/MathJax CDN URLs)
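To make the sandboxed-iframe option concrete, here is a minimal sketch, not code from this repository. The helper names `escapeAttr` and `sandboxedArchiveFrame` are hypothetical, and it assumes the archived HTML is available as a string at render time. With `sandbox="allow-scripts"` and no `allow-same-origin`, scripts in the frame run in an opaque origin, so they should not be able to read the host app's cookies or localStorage. A strict CSP would still need to be added separately.

```typescript
// Hypothetical helpers for illustration only; not part of the karakeep codebase.

// Escape a string for use inside a double-quoted HTML attribute value.
// The ampersand must be replaced first so the quote entity is not double-escaped.
function escapeAttr(value: string): string {
  return value.replace(/&/g, "&amp;").replace(/"/g, "&quot;");
}

// Wrap archived HTML in an iframe that allows scripts (so KaTeX/MathJax can run)
// but omits allow-same-origin, which keeps the frame in an opaque origin.
function sandboxedArchiveFrame(archivedHtml: string): string {
  return `<iframe sandbox="allow-scripts" srcdoc="${escapeAttr(archivedHtml)}"></iframe>`;
}
```

In a React component the same idea would be an `<iframe sandbox="allow-scripts" srcDoc={html} />` element instead of `dangerouslySetInnerHTML`; whether highlighting in `BookmarkHTMLHighlighter.tsx` can work across an iframe boundary is an open question this sketch does not answer.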
Path: apps/workers/workers/crawlerWorker.ts
Line: 1428
Fixes #1243
Problem
Mathematical expressions weren't rendering in reader view because the monolith archiving tool was stripping JavaScript from archived pages.
Root Cause
In crawlerWorker.ts, the monolith command was called with -Ije flags where -j means "No JavaScript". Since MathJax and KaTeX require JavaScript to render math expressions, the math was not being rendered.
Solution
Changed the monolith arguments from ["-", "-Ije", ...] to ["-", "-Ie", ...], removing the -j flag while keeping -I (isolate the document from the Internet) and -e (ignore network errors).
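For reference, the change can be sketched as a small helper that builds the argument list. This is illustrative only: `monolithArgs` and its `allowJs` parameter are hypothetical names, and the actual call site in crawlerWorker.ts passes the array inline as shown in the diff.

```typescript
// Hypothetical helper for illustration; the real code passes this array inline.
function monolithArgs(url: string, assetPath: string, allowJs: boolean): string[] {
  // "-Ije" is the previous combined flag; dropping "j" (the "No JavaScript"
  // flag described in this PR) gives "-Ie".
  const flags = allowJs ? "-Ie" : "-Ije";
  // Remaining arguments are copied unchanged from the existing invocation.
  return ["-", flags, "-t", "5", "-b", url, "-o", assetPath];
}
```

Making the flag conditional like this would be one way to gate JavaScript preservation behind a setting rather than enabling it for every archive, though this PR does not do that.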
Testing
This fix should be tested with: https://transformer-circuits.pub/2025/attribution-graphs/methods.html
/claim #1243