
[toongod] add support #8963

Open
nthduy wants to merge 7771 commits into mikf:master from nthduy:feat/toongod-extractor

Conversation

Contributor

@nthduy nthduy commented Jan 30, 2026

Add support for the https://www.toongod.org/ webtoon site.

Implements chapter and webtoon extractors with Cloudflare bypass support:

Features:

  • Chapter extractor: extracts all images from chapter pages
  • Webtoon extractor: lists all chapters from webtoon series pages
  • FlareSolverr integration with session management for efficient Cloudflare bypass
  • Browser cookie fallback support

Cloudflare Protection:
The site is behind Cloudflare protection. Two bypass methods are supported:

  1. FlareSolverr (recommended): automatic challenge solving with session reuse
    • Configure: {"extractor": {"toongod": {"flaresolverr-url": "http://localhost:8191/v1"}}}
    • Performance: ~0.5s per request (sessions reuse cookies after the first challenge)
  2. Browser cookies: manual cookie export
    • Use: gallery-dl --cookies cookies.txt <url>
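
The session reuse described above can be sketched roughly as follows. This is not the code from this PR, only a minimal illustration assuming FlareSolverr's `sessions.create`, `request.get` and `sessions.destroy` commands on its `/v1` endpoint. The helper names (`build_command`, `send_command`, `fetch_page`, `fetch_all`) are hypothetical and the timeouts are arbitrary.

```python
import json
import urllib.request

# Endpoint taken from the PR description; adjust to your FlareSolverr instance.
FLARESOLVERR_URL = "http://localhost:8191/v1"


def build_command(cmd, **params):
    """Build the JSON body for one FlareSolverr command (hypothetical helper)."""
    payload = {"cmd": cmd}
    payload.update(params)
    return payload


def send_command(payload, endpoint=FLARESOLVERR_URL, timeout=70):
    """POST a command to FlareSolverr and return the decoded JSON reply."""
    request = urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def fetch_page(url, session_id, send=send_command):
    """Fetch 'url' through an existing session so solved cookies are reused."""
    reply = send(build_command(
        "request.get", url=url, session=session_id, maxTimeout=60000))
    if reply.get("status") != "ok":
        raise RuntimeError(reply.get("message", "FlareSolverr request failed"))
    return reply["solution"]["response"]


def fetch_all(urls, send=send_command):
    """Create one session, fetch every URL with it, then destroy the session."""
    session_id = send(build_command("sessions.create"))["session"]
    try:
        return [fetch_page(url, session_id, send) for url in urls]
    finally:
        send(build_command("sessions.destroy", session=session_id))
```

With a FlareSolverr instance running, `fetch_all(["https://www.toongod.org/"])` would solve the challenge once for the session and return the page HTML. The `send` parameter exists only so the network call can be swapped out in tests.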

mikf and others added 30 commits December 9, 2025 18:55
add 'post' & 'user' extractors
* [pornpics] add category and listing extractors

Add support for:
- Category pages like /ass/, /milf/, /blonde/ etc.
- Listing pages like /popular/, /recent/, /rating/, /likes/, /views/, /comments/

Category pages use JSON pagination like tags/search.
Listing pages don't support JSON pagination and use different HTML structure.

* [pornpics] simplify category pattern via class ordering

- Move PornpicsCategoryExtractor after PornpicsListingExtractor
  so it acts as catch-all, eliminating need for negative lookahead
- Use list comprehension in PornpicsListingExtractor.galleries()

* update docs/supportedsites
* [fitnakedgirls] add extractor

Add support for fitnakedgirls.com:
- Photo galleries (/photos/gallery/)
- Category pages (/photos/gallery/category/)
- Tag pages (/photos/tag/)
- Video posts (/videos/)
- Blog posts (/fitblog/)

Handles both newer (wp-block-image) and older (size-large) templates.

* simplify & fix

- use '_extract_title' method
- move '_pagination' into base class
- update 'FitnakedgirlsTagExtractor' pattern

* update docs/supportedsites

---------

Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>
Add support for nudostar.com forum (XenForo-based forum site).
This is separate from the existing nudostar.py which handles nudostar.tv.

Supports:
- Thread extraction with pagination
- Individual post extraction
- Authentication via xf_user cookie or username/password
- Internal attachments (both linked and embedded images)
- External image host URLs (queued for recursive processing)
- fix website_token extraction
- send website_token as 'X-Website-Token' header
- intercept ytdl logging messages and signal error when it
  emits an error message
- remove "ERROR:" etc from ytdl logging messages
mikf and others added 11 commits January 28, 2026 19:37
for example '?order=asc&group=j0fsj3oem3&tlang=en'
fix '400 Bad Request' errors when retrieving
more than the first batch of posts.
* Make sure that `img_id`, `audio_id` and `cover_id` fields are always available.
    The values are set to '' where they are not applicable.
    Having `img_id` is necessary for the default `archive_fmt`, the other fields are handled for consistency.
* Allow downloading more than one cover.
    The previous behavior is kept as-is, but setting the "covers" option to "all" now grabs all available covers.
* Add support for downloading subtitles
    Allows filtering subtitles by source type (ASR, MT) and language.
* Ensure archive uniqueness for covers and subtitles.
* Update the URL test pattern to include the `image` extension.
    Although TikTok may serve the covers with JPEG content, the file extension can be `.image`.
    The test before 0c14b16 failed because the asserted URL did not match all cover types; the pattern now in use requires that file extension.
* Add support for "creator_caption" subtitles in "LC" format.
    These subtitles have the keys "Format" set to "creator_caption" and "Source" to "LC".
* Add "LC" (Local Captions) as a subtitle source type in the documentation
* Code deduplication and renaming subtitle metadata
    Changed the item type from singular `subtitle` to `subtitles`.
    Removed the wrong descriptor `cover` from the subtitles fallback title.
* Refactor subtitle filtering
    The filter is now prepared in `_init` to prevent parsing the same config parameter for every item.
    The `_extract_subtitles` function will still extract if either filter (source or language) matches.
* Generate a `file_id` for subtitles
    Subtitles have multiple fields that determine the unique file, so these are simply concatenated.
    This is similar to the cover types, only with more variations.
* Added tests for subtitles
* fix docs entries
* fix '"covers": "all"'
* simplify some code
* Fix fallback title for subtitles
    Added the missing "f" to the f-string and added "subtitle" to the title.
    The resulting title will look like "TikTok video subtitle #1234567"
Add extractor for toongod.org webtoon site with Cloudflare bypass
support using FlareSolverr proxy.
- Fix line length issues (max 79 chars)
- Fix continuation line indentation
nthduy force-pushed the feat/toongod-extractor branch 5 times, most recently from 6186e4a to bf23ef8, on January 31, 2026 14:39
nthduy force-pushed the feat/toongod-extractor branch from bf23ef8 to f654cbc on January 31, 2026 14:45
Fix folder naming issue where series names included junk suffixes like
"Manhwa Afahbb" by extracting titles from breadcrumb navigation instead
of URL slugs or H1 tags.

- Extract series name from breadcrumb links (always clean)
- Fallback to H1 tag with cleaning if breadcrumb fails
- Remove "Manhwa", "Webtoon", "Manhua" suffixes
- Remove encoded ID patterns (e.g., "Afahbb", "Aeaabb")

Before: "Perfect Half Manhwa Afahbb"
After: "Perfect Half"
Contributor Author

nthduy commented Feb 1, 2026

Pushed a new commit to handle an edge case I discovered.

Problem: Some manhwa on ToonGod have strange slugs that break the original H1/slug-based extraction. For example:

The issue is that ToonGod's chapter URLs contain these suffixes (/webtoon/perfect-half-manhwa-afahbb/chapter-1/), and the chapter extractor converted the slug to title case as a fallback.

Solution: I changed the approach to extract the series name from the breadcrumb navigation, since I noticed it is always clean and consistent. It falls back to cleaning the H1 tag if the breadcrumb lookup fails.
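
As a rough illustration of the problem and the fallback cleaning, here is a minimal sketch. It is not the extractor code from this PR. The function names are hypothetical, and the encoded-ID pattern is an assumption inferred only from the two examples in the commit message ("Afahbb", "Aeaabb"); the real rule may differ.

```python
import re

# Assumption: encoded IDs look like the two known examples, i.e. a capital
# "A", three lowercase letters, then "bb". The real rule may be broader.
_ID_SUFFIX = re.compile(r"\s+A[a-z]{3}bb$")
# Type suffixes named in the commit message.
_TYPE_SUFFIX = re.compile(r"\s+(?:Manhwa|Webtoon|Manhua)$", re.IGNORECASE)


def title_from_slug(slug):
    """Old fallback: title-case the URL slug, which keeps the junk suffixes."""
    return slug.replace("-", " ").title()


def clean_title(title):
    """Strip a trailing encoded ID, then a trailing type suffix."""
    title = title.strip()
    title = _ID_SUFFIX.sub("", title)
    title = _TYPE_SUFFIX.sub("", title)
    return title.strip()


def series_title(breadcrumb_title, h1_title):
    """Prefer the breadcrumb text; otherwise clean the H1 text."""
    if breadcrumb_title:
        return breadcrumb_title.strip()
    return clean_title(h1_title)


print(title_from_slug("perfect-half-manhwa-afahbb"))  # Perfect Half Manhwa Afahbb
print(clean_title("Perfect Half Manhwa Afahbb"))      # Perfect Half
```

The ID suffix is removed before the type suffix because it is the outermost token in titles such as "Perfect Half Manhwa Afahbb".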

Tested on multiple series:

All tests pass, flake8 clean.

Thanks for your review!

nthduy force-pushed the feat/toongod-extractor branch from d9c75d2 to 1407564 on February 10, 2026 23:56
@Dragonatorul

Toongod also uses wpmadara underneath. The base class in #9246 would cover this site too.
