JudolGuard 🛡️

AI-powered browser extension for detecting and blocking online gambling ("judol") content. Built with Plasmo (Manifest V3), powered by Gemma 4 E2B.

📖 Read the full story: WRITEUP.md — includes benchmarks, the fine-tuning journey, and how the base model beat our 7,336-sample LoRA fine-tune.

✨ Features

🧬 Hybrid detection — Aho-Corasick keyword matcher (100+ keywords, O(n)) for text + Gemma 4 E2B multimodal for images
⚡ Real-time blur — Images are blurred the moment Gemma 4 flags them, no waiting for batch completion
📝 Immediate text redaction — Gambling keywords in titles, descriptions, and paragraphs are redacted instantly (no API call)
🔀 Per-image AI analysis — Each image is sent individually to Gemma 4 with page keyword context, in parallel batches of 8
📊 Stats dashboard — Checked/blocked counters with weekly tracking in the popup
📥 CSV export — Download blocked sites log for reporting or research
🏠 BYOK support — Bring your own API key (OpenAI, DeepSeek, Groq, or any compatible endpoint)
🔒 Privacy-first — Your API key stays in local Chrome storage, never leaves your machine except to your chosen endpoint
📝 Report false positives — One-click GitHub issue pre-filled with all context
⚡ Visual indicator — Live floating badge with scan progress counter
🌐 Domain whitelist — Add trusted sites directly from the popup

📦 Installation

Option A: Download Pre-built (Recommended)

1. Download the latest build from GitLab (rayhanfadil/gemma_extension/-/releases) and grab the gemma_extension.zip artifact.
2. Extract the zip somewhere permanent.
3. Open chrome://extensions → toggle Developer mode ON (top-right).
4. Click Load unpacked → select the extracted folder.
5. ✅ Done! Click the puzzle icon → pin JudolGuard.

Option B: Build from Source

# Prerequisites: Node.js >= 18 and pnpm

# Clone from GitLab
git clone https://gitlab.com/rayhanfadil/gemma_extension.git
cd gemma_extension
pnpm install

# Build
pnpm build      # → build/chrome-mv3-prod/

# chrome://extensions → Developer mode → Load unpacked → select build/chrome-mv3-prod/

Requirements

You'll need an inference server running Gemma 4. See Server Setup below.

🏗 Architecture

┌─ Content Script ─────────────────────────────────────────────────────┐
│                                                                       │
│  Page Load                                                           │
│     │                                                                 │
│     ├─ 1. AC Keyword Scan (Aho-Corasick, 100+ keywords, O(n))       │
│     │      │                                                         │
│     │      ├─ Match found? → IMMEDIATE text redaction               │
│     │      │   ├─ Selective: blur + strikethrough + red bg          │
│     │      │   └─ Hide: remove from DOM                             │
│     │      │                                                         │
│     │      └─ Top 5 keywords → used as AI context                   │
│     │                                                                 │
│     ├─ 2. Extract ALL <img> elements (skip SVG, 1×1 trackers)       │
│     │      Resize to 512px → base64                                  │
│     │                                                                 │
│     └─ 3. Image Queue (parallel 8)                                   │
│            │                                                         │
│            ├─ Per image → send to Gemma 4 + keyword context          │
│            │   Prompt: "Analyze this image. Is it related to         │
│            │            online gambling? Answer ONLY YES/NO."        │
│            │                                                         │
│            └─ Result arrives → REAL-TIME action:                     │
│                ├─ YES → blur(30px) grayscale / remove                │
│                └─ NO  → skip                                        │
│                                                                       │
└──────────────────────┬───────────────────────────────────────────────┘
                       │  chrome.runtime.sendMessage
                       ▼
┌─ Background Service Worker ──────────────────────────────────────────┐
│                                                                       │
│  SCAN_KEYWORDS        → detectKeywords() from keyword-detector.ts    │
│                         Returns: matches[], totalScore, context       │
│                                                                       │
│  ANALYZE_IMAGE        → POST /v1/chat/completions                    │
│     │                   { model, messages: [{ role:"user",           │
│     │                     content: [text, image_url] }],             │
│     │                     max_tokens: 20, temperature: 0 }           │
│     │                                                                 │
│     ├─ Ollama       → /api/generate (native multimodal)              │
│     └─ OpenAI-compat → /v1/chat/completions + image_url[]            │
│                         (with optional Bearer token)                 │
│                                                                       │
│  ANALYZE_ARTICLE     → Text-only AI analysis (fallback when          │
│                         page has no images)                           │
│                                                                       │
└───────────────────────────────────────────────────────────────────────┘
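The ANALYZE_IMAGE request in the diagram above could be assembled roughly as follows. This is a minimal sketch, not the extension's actual code: `buildImageRequest` and `isFlagged` are hypothetical names, the way the page keywords are appended to the prompt is an assumption, and so is sending the image as a JPEG data URL.

```typescript
// Shape of an OpenAI-compatible chat completion request carrying one image.
interface ChatRequest {
  model: string;
  messages: Array<{
    role: "user";
    content: Array<
      | { type: "text"; text: string }
      | { type: "image_url"; image_url: { url: string } }
    >;
  }>;
  max_tokens: number;
  temperature: number;
}

// Prompt text as shown in the architecture diagram.
const PROMPT =
  "Analyze this image. Is it related to online gambling? Answer ONLY YES/NO.";

// Hypothetical helper: builds the request body for a single image.
// Assumption: keyword context is appended to the prompt as plain text.
function buildImageRequest(
  model: string,
  jpegBase64: string,
  keywords: readonly string[]
): ChatRequest {
  const context =
    keywords.length > 0 ? ` Page keywords: ${keywords.join(", ")}.` : "";
  return {
    model,
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: PROMPT + context },
          {
            type: "image_url",
            // Assumption: the resized image is encoded as JPEG.
            image_url: { url: `data:image/jpeg;base64,${jpegBase64}` }
          }
        ]
      }
    ],
    max_tokens: 20,
    temperature: 0
  };
}

// Hypothetical helper: anything that does not start with YES counts as not flagged.
function isFlagged(answer: string): boolean {
  return answer.trim().toUpperCase().startsWith("YES");
}
```

Treating every non-YES answer as "not flagged" keeps a malformed reply from blocking a page by accident; a stricter parser could treat it as an error instead.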

🔄 Detection Flow

1. User visits page
   │
2. 💠 JudolGuard · Scanning 0/42  (0%...)     ← floating progress badge
   │
3. AC scan (instant, no API):
   ├─ "gacor, slot, slot88, slot online, maxwin" ← 14 matches
   └─ → Immediately redact text elements with these keywords
   │
4. Extract 42 images → resize to 512px → queue
   │
5. Parallel batch (8 concurrent):
   ├─ Image 1: Gemma 4 → YES → blur instantly
   ├─ Image 2: Gemma 4 → NO
   ├─ Image 3: Gemma 4 → YES → blur instantly
   ├─ ...
   └─ Image 42: Gemma 4 → NO
   │
6. Complete:
   ├─ SAFE (0 blocked)
   │  └─ ✅ JudolGuard · Aman                    ← auto-dismiss 3s
   │
   └─ DETECTED (>0 blocked)
      ├─ ⛔ JudolGuard · 7 judol terdeteksi      ← auto-dismiss 6s
      ├─ Selective: flagged images blurred + text redacted
      ├─ Hide:      flagged images removed + text removed
      └─ Full:      fullscreen overlay (triggered if ANY image blocked)
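Step 5 above (8 concurrent requests, with each result acted on as it arrives) can be approximated with a small worker pool. This is a sketch under assumptions: `runPool` is a hypothetical name, and it uses a sliding window of at most `limit` in-flight calls rather than strict batches, which may differ from how the extension schedules its queue.

```typescript
// Runs `worker` over `items` with at most `limit` calls in flight at once.
// `onResult` fires as soon as each item resolves, so the caller can act
// (for example blur an image) without waiting for the rest of the queue.
// If a worker rejects, the returned promise rejects with that error.
async function runPool<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
  onResult: (result: R, index: number) => void
): Promise<void> {
  let next = 0; // index of the next unclaimed item, shared by all lanes

  async function lane(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      const result = await worker(items[index] as T, index);
      onResult(result, index);
    }
  }

  // Start up to `limit` lanes; each lane pulls items until none are left.
  const laneCount = Math.max(0, Math.min(limit, items.length));
  const lanes = Array.from({ length: laneCount }, () => lane());
  await Promise.all(lanes);
}
```

With this shape, a slow image only occupies one lane, so the other seven keep draining the queue instead of waiting for the whole batch to finish.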

Why Two Detection Engines?

| | Aho-Corasick (Text) | Gemma 4 (Images) |
| --- | --- | --- |
| What it catches | Page titles, descriptions, paragraphs, metadata | Gambling banners, slot screenshots, promo images |
| Speed | Instant: O(n) scan, no API call | ~7s per image (API call) |
| Accuracy | High recall, lower precision (keyword match) | High precision (visual understanding) |
| Scope | 100+ Indonesian gambling keywords | Any image, any language |
| Use case | Google Search results, blog posts, news articles | Image-heavy gambling sites, affiliate landing pages |
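To make the text engine concrete, here is a compact Aho-Corasick matcher. It is an illustrative sketch, not the contents of ahocorasick.ts: the names `AcNode`, `buildAutomaton`, and `findKeywords` are hypothetical, and lowercasing both keywords and text is an assumption about how matching is normalized.

```typescript
class AcNode {
  next = new Map<string, AcNode>(); // trie transitions by character
  fail: AcNode | null = null;       // failure link (null only for the root)
  out: string[] = [];               // keywords ending at this state
}

function buildAutomaton(keywords: readonly string[]): AcNode {
  const root = new AcNode();
  // 1. Insert every keyword into the trie.
  for (const raw of keywords) {
    const word = raw.toLowerCase();
    let node = root;
    for (let i = 0; i < word.length; i++) {
      const ch = word.charAt(i);
      let child = node.next.get(ch);
      if (!child) {
        child = new AcNode();
        node.next.set(ch, child);
      }
      node = child;
    }
    node.out.push(word);
  }
  // 2. Breadth-first pass to set failure links and merge outputs.
  const queue: AcNode[] = [];
  for (const child of root.next.values()) {
    child.fail = root;
    queue.push(child);
  }
  for (let head = 0; head < queue.length; head++) {
    const node = queue[head] as AcNode;
    for (const [ch, child] of node.next) {
      let f = node.fail;
      while (f !== null && !f.next.has(ch)) f = f.fail;
      child.fail = f === null ? root : (f.next.get(ch) as AcNode);
      child.out.push(...child.fail.out); // inherit matches that are suffixes
      queue.push(child);
    }
  }
  return root;
}

interface Match {
  keyword: string;
  end: number; // exclusive end offset in the lowercased text
}

// Single pass over the text, reporting every keyword occurrence.
function findKeywords(root: AcNode, text: string): Match[] {
  const lower = text.toLowerCase();
  const matches: Match[] = [];
  let node = root;
  for (let i = 0; i < lower.length; i++) {
    const ch = lower.charAt(i);
    while (node !== root && !node.next.has(ch)) node = node.fail ?? root;
    node = node.next.get(ch) ?? root;
    for (const keyword of node.out) matches.push({ keyword, end: i + 1 });
  }
  return matches;
}
```

The automaton is built once for the keyword list and reused for every page, so each scan costs time proportional to the text length plus the number of matches reported.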

🛡️ Blocking Modes

| Mode | Image Action | Text Action | Use Case |
| --- | --- | --- | --- |
| Full Block | Count flagged → overlay at end | No redaction (overlay covers everything) | Maximum protection (default) |
| Selective | Blur(30px) grayscale real-time | Blur + strikethrough + red background | Browse with partial visibility |
| Hide | Remove from DOM real-time | Remove from DOM | Clean browsing, no traces |
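The image column of the table above maps naturally onto a single dispatch function. The following is a sketch only: `applyImageAction`, `ImageLike`, and the outcome labels are hypothetical, and `grayscale(100%)` is an assumed amount since the table only says "grayscale".

```typescript
type BlockingMode = "full" | "selective" | "hide";
type ImageOutcome = "counted" | "blurred" | "removed";

// Structural subset of HTMLImageElement, so the sketch also runs outside a browser.
interface ImageLike {
  style: { filter: string };
  remove(): void;
}

// Applies the per-image action for a flagged image and reports what was done.
function applyImageAction(mode: BlockingMode, img: ImageLike): ImageOutcome {
  switch (mode) {
    case "selective":
      // Assumption: full grayscale; the blur radius comes from the table.
      img.style.filter = "blur(30px) grayscale(100%)";
      return "blurred";
    case "hide":
      img.remove();
      return "removed";
    case "full":
      // Nothing happens to the image itself; the caller tallies flagged
      // images and shows the overlay once scanning completes.
      return "counted";
  }
}
```

In a content script, an `HTMLImageElement` can be passed directly because it already has a `style.filter` property and a `remove()` method.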

False Positive? Report It!

Click "Laporkan Kesalahan" on the block overlay → opens a pre-filled GitHub issue with:

Blocked URL
Timestamp
Confidence score
Detection reason

We review each report and use it to improve detection.
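A pre-filled issue link like the one described above can be built from GitHub's `issues/new` page, which accepts `title` and `body` query parameters. This is a hedged sketch: `buildIssueUrl` and `BlockReport` are hypothetical names, the repository slug is a placeholder, and the title and body wording is an assumption.

```typescript
// The four fields listed above; the field names are illustrative.
interface BlockReport {
  url: string;
  timestamp: string;  // e.g. an ISO 8601 string
  confidence: number; // assumed to be a percentage, 0 to 100
  reason: string;
}

// Builds a link that opens GitHub's "new issue" form with title and body filled in.
// `repo` is an "owner/name" slug supplied by the caller.
function buildIssueUrl(repo: string, report: BlockReport): string {
  const params = new URLSearchParams({
    title: `False positive: ${report.url}`,
    body: [
      `Blocked URL: ${report.url}`,
      `Timestamp: ${report.timestamp}`,
      `Confidence score: ${report.confidence}%`,
      `Detection reason: ${report.reason}`
    ].join("\n")
  });
  return `https://github.com/${repo}/issues/new?${params.toString()}`;
}
```

`URLSearchParams` handles the percent-encoding, so blocked URLs that contain their own `?` and `&` characters survive the round trip.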

⚙️ Configuration

| Setting | Default | Description |
| --- | --- | --- |
| Inference Provider | OpenAI-compatible (Server) | Local Ollama or remote API |
| API URL | `https://inference.server-fadil.my.id/v1` | Your endpoint |
| Model | `gemma-4-E2B-it-Q4_K_M.gguf` | Base Gemma 4 E2B IT model (Q4_K_M quantization) |
| API Key (BYOK) | (none) | Optional: your own OpenAI/DeepSeek/Groq key |
| Blocking Mode | Full | Full / Selective / Hide |
| Confidence Threshold | 85% | Minimum confidence to trigger block |
| Domain Whitelist | (none) | One domain per line, never analyzed |
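The whitelist setting ("one domain per line, never analyzed") could be handled along these lines. This is a sketch with hypothetical helper names; treating subdomains of a listed domain as trusted is an assumption, not documented behaviour.

```typescript
// Splits the textarea value into normalized domains, ignoring blank lines.
function parseWhitelist(raw: string): string[] {
  return raw
    .split(/\r?\n/)
    .map((line) => line.trim().toLowerCase())
    .filter((line) => line.length > 0);
}

// True when the hostname is a listed domain or (by assumption) one of its subdomains.
function isWhitelisted(hostname: string, whitelist: readonly string[]): boolean {
  const host = hostname.toLowerCase();
  return whitelist.some(
    (domain) => host === domain || host.endsWith("." + domain)
  );
}
```

Matching on a leading dot keeps `notexample.com` from being trusted just because `example.com` is listed.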

📊 Stats & Export

Popup shows live counters:

┌─────────────────────────────────┐
│  PROTECTION STATS     this week │
│  ┌────────┬────────┬────────┐  │
│  │ 1,247  │   12   │   3    │  │
│  │Checked │Blocked │Week    │  │
│  └────────┴────────┴────────┘  │
│  📥 Export Blocked Sites CSV   │
└─────────────────────────────────┘

CSV columns: timestamp, url, reason, confidence, mode
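As an illustration of the export format, the listed columns can be serialized with standard CSV quoting. The sketch below is not the extension's exporter: `BlockedEntry`, `csvField`, and `toCsv` are hypothetical names, and the field types are assumptions.

```typescript
// One row of the blocked-sites log; the types are assumptions.
interface BlockedEntry {
  timestamp: string;
  url: string;
  reason: string;
  confidence: number;
  mode: string;
}

// Column order as documented above.
const COLUMNS = ["timestamp", "url", "reason", "confidence", "mode"] as const;

// Quotes a field only when it contains a comma, quote, or line break,
// doubling any embedded quotes (the usual CSV convention).
function csvField(value: string | number): string {
  const text = String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Header row followed by one line per entry, joined with CRLF.
function toCsv(entries: readonly BlockedEntry[]): string {
  const rows = entries.map((entry) =>
    COLUMNS.map((column) => csvField(entry[column])).join(",")
  );
  return [COLUMNS.join(","), ...rows].join("\r\n");
}
```

Quoting matters here because blocked URLs and detection reasons frequently contain commas.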

Weekly counter auto-resets every Monday.
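One way to implement a Monday reset is to store the start of the week alongside the counters and compare it on every update. This is a sketch under assumptions: `weekStart`, `rollWeek`, and the `Stats` shape are hypothetical, and the week is taken to start at local midnight.

```typescript
// Local midnight of the Monday that starts the week containing `date`.
function weekStart(date: Date): Date {
  const start = new Date(date.getFullYear(), date.getMonth(), date.getDate());
  // getDay(): 0 = Sunday ... 6 = Saturday, so shift to make Monday 0.
  const daysSinceMonday = (start.getDay() + 6) % 7;
  start.setDate(start.getDate() - daysSinceMonday);
  return start;
}

// Assumed storage shape for the popup counters.
interface Stats {
  checked: number;
  blocked: number;
  week: number;        // weekly counter shown in the popup
  weekStartMs: number; // start of the week the counter belongs to
}

// Returns stats with the weekly counter zeroed if `now` falls in a different week.
function rollWeek(stats: Stats, now: Date): Stats {
  const currentStart = weekStart(now).getTime();
  if (stats.weekStartMs === currentStart) return stats;
  return { ...stats, week: 0, weekStartMs: currentStart };
}
```

Checking lazily on each update avoids needing a timer or alarm that fires exactly at midnight.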

🔧 Development

# Prerequisites: Node.js >= 18 and pnpm

# Setup
git clone https://gitlab.com/rayhanfadil/gemma_extension.git
cd gemma_extension
pnpm install

# Development
pnpm dev          # Hot-reload dev build
pnpm build        # Production → build/chrome-mv3-prod/

# Load in Chrome
# 1. chrome://extensions
# 2. Enable "Developer mode"
# 3. "Load unpacked" → select build/chrome-mv3-prod/

Requirements

Ollama (local) or OpenAI-compatible API (remote)
Gemma 4 E2B IT model (multimodal vision + text)
Chrome/Chromium 88+
llama.cpp server with --ctx-size 4096 (multimodal needs larger context)

Server Setup

The extension uses an inference server running llama.cpp with a multimodal Gemma 4 model.

Recommended systemd user service:

# ~/.config/systemd/user/llama-inference.service
[Unit]
Description=llama.cpp - Gemma 4 JudolGuard (inference, port 1234)
After=network-online.target
Wants=network-online.target

[Service]
ExecStart=/path/to/llama-server \
  -m /path/to/gemma-4-E2B-it-Q4_K_M.gguf \
  --mmproj /path/to/mmproj-BF16.gguf \
  --ctx-size 4096 \
  --flash-attn on \
  --cont-batching \
  --cache-type-k q8_0 \
  --cache-type-v q8_0 \
  --reasoning off \
  --parallel 4 \
  -ngl 999 \
  --host 0.0.0.0 \
  --port 1234 \
  --log-disable
Restart=always
RestartSec=5

[Install]
WantedBy=default.target

Key flags explained:

| Flag | Value | Why |
| --- | --- | --- |
| `-m` | `gemma-4-E2B-it-Q4_K_M.gguf` | Base Gemma 4 E2B IT, Q4_K_M (3.2 GB) |
| `--mmproj` | `mmproj-BF16.gguf` | Vision projection (multimodal support) |
| `--ctx-size` | 4096 | Multimodal vision needs large context per image |
| `--flash-attn` | `on` | ~2-3× attention speedup on AMD ROCm |
| `--cont-batching` | (no value) | Process parallel requests without queuing |
| `--cache-type-k/v` | `q8_0` | Q8_0 KV cache saves ~50% VRAM with minimal quality loss |
| `--reasoning` | `off` | Not needed for YES/NO classification |
| `--parallel` | 4 | Balance GPU attention across 4 concurrent slots (~65 tok/s each) |
| `-ngl` | 999 | Offload all layers to GPU |

Run it:

systemctl --user daemon-reload
systemctl --user start llama-inference.service
systemctl --user enable llama-inference.service   # auto-start at login (at boot only if lingering is enabled: loginctl enable-linger)

Model source: download from Hugging Face:

lmstudio-community/gemma-4-E2B-it-GGUF (base model)
Or convert the original safetensors with llama.cpp's convert_hf_to_gguf.py

📂 Project Structure

src/
├── background.ts        # Service worker: API calls, message routing
├── content.tsx          # Content script: AC scan, text redaction, image queue, blocking UI
├── image-queue.ts       # Image extraction, resize, parallel batch processing
├── keyword-detector.ts  # Aho-Corasick keyword detection (100+ keywords)
├── ahocorasick.ts       # Aho-Corasick automaton implementation
├── popup.tsx            # Popup UI: stats, whitelist, mode toggle
├── options.tsx          # Settings page: connection, BYOK, threshold
├── style.css            # Tailwind CSS
└── features/
    └── count-button.tsx

test-pages/
├── safe.html            # Test page: clean content
├── gambling.html        # Test page: gambling content
└── images/              # Test images (safe / gambling)
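The extraction and resize work attributed to image-queue.ts above ("skip SVG, 1×1 trackers", "resize to 512px") can be illustrated with two pure helpers. Both names are hypothetical, and two details are assumptions: that 512px bounds the longest side, and that SVGs are recognized by file extension or data-URL prefix.

```typescript
// Decides whether an <img> is worth sending to the model.
// Assumption: SVGs are detected from the URL, trackers from a 1×1 (or smaller) size.
function isScannable(src: string, width: number, height: number): boolean {
  if (width <= 1 && height <= 1) return false;
  if (/\.svg(\?|#|$)/i.test(src)) return false;
  if (src.toLowerCase().startsWith("data:image/svg+xml")) return false;
  return true;
}

// Scales dimensions down so the longest side is at most `maxSide`,
// preserving aspect ratio. Images already small enough are left unchanged.
function fitWithin(
  width: number,
  height: number,
  maxSide = 512
): { width: number; height: number } {
  const longest = Math.max(width, height);
  if (longest <= maxSide) return { width, height };
  const scale = maxSide / longest;
  return {
    width: Math.max(1, Math.round(width * scale)),
    height: Math.max(1, Math.round(height * scale))
  };
}
```

In a browser, the resulting dimensions would typically size a canvas that the image is drawn onto before being exported as base64.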

🌐 Related

Inference server: Fadil3/gemma — llama.cpp server setup for Gemma 4
Model: fadiil/judol-guard-gemma4-e2b-gguf — Fine-tuned GGUF model
Dataset: JudolGuard training dataset (coming soon)

📄 License

MIT — Fadil3
