[options] add --stop-after to limit download size #9044
Open
NecRaul wants to merge 7876 commits into mikf:master
support using a 'User-Agent' header preset, e.g. "+firefox"
regular division is slightly faster than floor division and a float timestamp value is treated the same as an integer one
and fix a typo
mikf#8803 (comment) mikf#8803 (comment)
- add 'recursive' option, remove 'zip'
- recurse into subdirectories
- add 'path' metadata
- remove 'count' & 'num' metadata
- update default directory & archive format
allow remuxing bgm audio into a different format/container
fixes regression introduced in a28fbbc
add 'media-user' and 'media-item' extractors
TODO: 'media-category' extractor (?)
* bilibili: add support for live photo downloads
* fix: resolve flake8 linting errors (whitespace and line length)
* fix: resolve flake8 E302 and W293 linting errors
* fix: resolve flake8 W293 and E302 linting errors
* simplify syntax
* add 'livephoto' option
* add tests
implement generic access of
* list items (L[1] -> L.1)
* dict values (D[key] -> D.key)
* object attributes (O.attr -> O.attr)
in standard format strings
make stop condition more lenient
use '[^/?#]+' for names
don't strip URL parameters
fixes regression introduced in 56168fb
use 'concat' demuxer to combine frames for mkvmerge danbooru/danbooru#6103 danbooru/danbooru#6241
fixes regression introduced in d9917ec
- filter posts manually
- don't use lists for 'in' checks against constant values
Introduce a new --stop-after option to limit the total number of bytes downloaded.
DownloadJob now tracks cumulative downloaded size using a shared capacity dictionary. Extraction stops once the limit is reached.
- Add capacity parameter to DownloadJob and propagate it in main()
- Track _capacity["used"] and compare against _capacity["limit"]
- Parse --stop-after value and convert to bytes
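The shared-capacity idea in this commit message can be illustrated with a small standalone sketch. It is not the code from this PR: the DownloadJob and StopExtraction classes below are simplified stand-ins, and the child(), record() and run() names are hypothetical. The only assumption taken from the commit message is a dictionary with "used" and "limit" keys that every job shares.

```python
class StopExtraction(Exception):
    """Stand-in for gallery-dl's exception.StopExtraction."""


class DownloadJob:
    """Toy job that only counts bytes; not gallery-dl's real DownloadJob."""

    def __init__(self, name, capacity=None):
        self.name = name
        # Parent and child jobs hold a reference to the same dict object,
        # so an update made by one job is visible to all of them.
        self._capacity = capacity

    def child(self, name):
        # Hypothetical helper: child jobs inherit the shared dict.
        return DownloadJob(name, self._capacity)

    def record(self, size):
        """Add a finished download and stop once the limit is reached."""
        cap = self._capacity
        if cap is None:
            return  # no limit configured
        cap["used"] += size
        if cap["used"] >= cap["limit"]:
            raise StopExtraction(f"limit reached in job {self.name!r}")


def run(sizes_by_job, limit):
    """Simulate several child jobs drawing from one global byte budget."""
    capacity = {"limit": limit, "used": 0}
    parent = DownloadJob("parent", capacity)
    finished = []
    try:
        for name, sizes in sizes_by_job:
            job = parent.child(name)
            for size in sizes:
                finished.append((name, size))  # the file itself completes
                job.record(size)               # then the budget is checked
    except StopExtraction:
        pass
    return finished, capacity["used"]
```

In this sketch the file that crosses the limit still completes, so the total can overshoot the limit by up to one file. Whether the PR behaves the same way is not stated in the commit message.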
Add text.format_bytes() to convert integers to human-readable byte strings (e.g., 2.50M). This is conceptually the reverse of text.parse_bytes(). Currently only used in DownloadJob to log --stop-after limits, but could be moved or inlined where it is used if preferred.
Force-pushed from cc920f5 to d8adaec.
This PR introduces a new --stop-after option to stop extraction after a specified total number of downloaded bytes. The limit is shared across all DownloadJob instances so nested extractors and child jobs respect the same global cap.

Changes
- --stop-after SIZE CLI argument
- exception.StopExtraction
- text.format_bytes() helper for human-readable logging of downloaded bytes

Notes
text.format_bytes() is intended as the inverse of text.parse_bytes() and is currently only used for logging, but may be useful elsewhere. If this helper is not desired, it could be inlined or moved to where it is currently used.
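As a rough illustration of the formatting helper described in the notes, here is a minimal standalone sketch. It is not the implementation from this PR: the suffix set, the 1024 step, and the two-decimal output are assumptions chosen to reproduce the "2.50M" example, and parse_size() is a hypothetical stand-in for the parsing direction rather than gallery-dl's text.parse_bytes().

```python
def format_bytes(size, suffixes="BKMGTPEZY"):
    """Render a byte count as a short string such as '2.50M' (1024 steps)."""
    size = float(size)
    for suffix in suffixes:
        # Stop at the first unit that fits, or at the largest known unit.
        if abs(size) < 1024.0 or suffix == suffixes[-1]:
            break
        size /= 1024.0
    return f"{size:.2f}{suffix}"


def parse_size(text, suffixes="BKMGTPEZY"):
    """Hypothetical inverse: turn '2.50M' or '500' into a byte count."""
    text = text.strip()
    suffix = text[-1].upper()
    if suffix in suffixes:
        # The suffix position gives the power of 1024 to multiply by.
        return round(float(text[:-1]) * 1024 ** suffixes.index(suffix))
    return round(float(text))


print(format_bytes(2621440))   # 2.50M
print(parse_size("2.50M"))     # 2621440
```

Because the formatted string keeps only two decimals, the round trip is exact only for values that fit that precision, which is acceptable for log output.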