gallery-dl

Author	SHA1	Message	Date
Mike Fährmann	3cbbefd4ed	support 'filter' option for post processors (#1460 )	3 years ago
Mike Fährmann	adf4d661b3	use '_extractor' info in UrlJobs	3 years ago
Mike Fährmann	b50b8e6cf4	refactor applying 'parent-…' options	3 years ago
Mike Fährmann	7ab8374385	add 'parent-skip' option (#1399 )	3 years ago
Mike Fährmann	c693db5b1a	add '"skip": "terminate"' option Stops not only the current extractor/job, but all parent extractors/jobs as well.	3 years ago
Mike Fährmann	c5ca7905ce	add 'noop()' and 'identity()' functions	3 years ago
Mike Fährmann	5b4da4b4bf	reorder config access in Job constructor (#1111)	3 years ago
Mike Fährmann	b4ed7cb961	fix 'category-transfer' (#1111 ) broken since commit `055c32e0`	3 years ago
Mike Fährmann	a86ffb04bb	add 'output.fallback' option to enable/disable fallback URLs for -g/--get-urls	3 years ago
Mike Fährmann	a75e485461	add archive format to InfoJob output (#875 )	4 years ago
Mike Fährmann	bf241811dd	allow '_extractor' fields to be None or empty	4 years ago
Mike Fährmann	23641742a3	improve 'parent-directory' (#1364 ) Allow forwarding metadata from the top-level extractor to all children if 'parent-directory' is enabled for all extractors along the way. For example 'reddit' -> 'gfycat' -> 'redgifs'	4 years ago
Mike Fährmann	df94182e11	implement 'parent-metadata' option (#1364 ) experimental, might not work as expected, etc.	4 years ago
Mike Fährmann	b6719becf1	ensure '-s/--simulate' always prints filenames (#1360 ) by assuming a potentially wrong filename extension in cases where the correct one would only get known after a download started	4 years ago
Mike Fährmann	c963741860	add '-E/--extractor-info' command-line option (#875 )	4 years ago
Mike Fährmann	65ca923b4e	fix 'whitelist' option for BaseExtractor instances	4 years ago
Mike Fährmann	56a8968435	remove 'Message.Metadata' (#866 )	4 years ago
Mike Fährmann	46323ae6ff	initialize 'hooks' as empty tuple follow-up to `9c29fc4e` Prevent a "race" between initializing 'pathfmt' and 'hooks', and receiving a signal in between (e.g. ctrl+c), which would then crash in 'handle_finalize()'.	4 years ago
Mike Fährmann	9c29fc4e55	always initialize DownloadJob.hooks (fixes #1135 ) and not just when any (potential) post processors are defined	4 years ago
Mike Fährmann	9fffa9c343	rework post processor callbacks	4 years ago
Mike Fährmann	f99c6031e0	apply post processor blacklists/whitelists to basecategories (#1103)	4 years ago
Mike Fährmann	a3ca2f6080	update fallback URL handling remove Message.Urllist and use a '_fallback' field inside a kwdict	4 years ago
Mike Fährmann	fd20093c96	allow blacklist/whitelist to be empty lists/strings (#1051 )	4 years ago
Mike Fährmann	d5fa716d89	fix crash when using 'skip=false' and archive (fixes #1023 ) Separating the archive check from pathfmt.exists() in `b5243297` had some unintended side effects. It is also not possible to monkey-patch a dunder method like __contains__ because of the special method lookup that gets performed for them.	4 years ago
Mike Fährmann	231dd4c800	accumulate postprocessor objects (#994 ) Instead of one 'postprocessors' setting overwriting all others lower in the hierarchy, all postprocessors along the config path will now get collected into one big list. For example '--mtime-from-date' will therefore no longer cause other postprocessor settings in a config file to get ignored.	4 years ago
Mike Fährmann	3afd362e2e	add 'sleep-extractor' option (closes #964 ) (would have been nice if this were possible without code duplication)	4 years ago
Mike Fährmann	c78aa17506	add general 'blacklist' and 'whitelist' options (#492 , #844 )	4 years ago
Mike Fährmann	5912727b88	support format string replacement fields in archive paths (closes #985)	4 years ago
Mike Fährmann	b5243297ff	write skipped files to archive (closes #550 )	4 years ago
Mike Fährmann	3f73cc6855	allow 'parent-directory' to work recursively (fixes #905 )	4 years ago
Mike Fährmann	d5bfb0b38c	set pseudo extension for Metadata messages (#865 ) This prevents pathfmt.filename from potentially being empty.	4 years ago
Mike Fährmann	1b3870a4be	flush after writing JSON in DataJob() (#727 ) … and remove the dead handle_finalize() method, which is never called since DataJob() overrides run().	4 years ago
Mike Fährmann	7e8a747c56	improve output of '-K' for parent extractors 2 (#825 ) This is what `bb882b8` was supposed to be, but I managed to not include those changes in the first commit …	4 years ago
Mike Fährmann	ece73b5b2a	make 'path' and 'keywords' available in logging messages Wrap all loggers used by job, extractor, downloader, and postprocessor objects into a (custom) LoggerAdapter that provides access to the underlying job, extractor, pathfmt, and kwdict objects and their properties. __init__() signatures for all downloader and postprocessor classes have been changed to take the current Job object as their first argument, instead of the current extractor or pathfmt. (#574, #575)	4 years ago
Mike Fährmann	a1e739b96c	reuse connection adapters from parent extractors	4 years ago
Mike Fährmann	42f29c3e11	improve and simplify attribute access in DownloadJob.initialize()	4 years ago
Mike Fährmann	56f1c96168	implement 'parent-directory' option (#551 )	5 years ago
Mike Fährmann	37247dbaff	miscellaneous fixes	5 years ago
Mike Fährmann	0e9dc5c88e	fix AttributeError when accessing 'temppath' [ci skip]	5 years ago
Mike Fährmann	0b84068d84	remove temp files before downloading from fallback URLs otherwise the next call to download() with a fallback URL could see the partially downloaded "remains" from the previous, failed download attempt and "continue" it, writing the second half of a potentially different version of that file.	5 years ago
Mike Fährmann	2d4887b75b	improve KeywordJob output for "parent" extractors (closes #548 )	5 years ago
Mike Fährmann	2e2fc7f0ad	prevent infinite recursion when spawning extractors (closes #489 )	5 years ago
Mike Fährmann	1921c127a5	make OSErrors during file downloads nonfatal (closes #512 ) … except ENOSPC (No space left on device), since there is no reason to continue downloading in that case. All other errors that would prevent downloading data and writing it to disk get already raised during directory creation and are therefore not checked here.	5 years ago
Mike Fährmann	63e6993716	merge 'bypost' functionality into metadata postprocessor	5 years ago
Gio	c0b9ad678d	Separate metadata from handle_url into handle_metadata, commenting	5 years ago
Gio	6ed4fc07ff	Don't print intentional metadata skips to the console.	5 years ago
Gio	cfc70a97ab	Added an additional channel for downloading the metadata of an entire post or gallery.	5 years ago
Mike Fährmann	f5604492c3	update interface of config functions	5 years ago
Mike Fährmann	3fc1e12949	[postprocessor:metadata] filter private entries i.e. keys starting with an underscore	5 years ago
Mike Fährmann	9e88e7a344	[postprocessor:exec] improve (#421 , #413 ) - add 'final' option - include job status in pp finalization - improve and extend documentation	5 years ago
Mike Fährmann	5af291ba5c	include failed downloads and child extractors in exit status	5 years ago
Mike Fährmann	322c2e7ed4	renaming variables mostly 'keyword(s)' to 'kwdict'	5 years ago
Mike Fährmann	4409d00141	embed error messages in StopExtraction exceptions	5 years ago
Mike Fährmann	c887493a80	overhaul exception stuff	5 years ago
Mike Fährmann	389d2d7e38	implement 'cookies-update' option (#445 )	5 years ago
Mike Fährmann	03bc8adfc7	[postprocessor:exec] run after file moved to target location (#421)	5 years ago
Mike Fährmann	776e9e073f	close archive on job completion (#417 )	5 years ago
Mike Fährmann	9178b54eae	handle errors when opening download archive file (#417 )	5 years ago
Mike Fährmann	682105b8ee	prevent crash when loading unavailable downloader (#405 )	5 years ago
Mike Fährmann	5f8621b29d	improve output of active post processor modules	5 years ago
Mike Fährmann	0bb873757a	update PathFormat class - change 'has_extension' from a simple flag/bool to a field that contains the original filename extension - rename 'keywords' to 'kwdict' and some other stuff as well - inline 'adjust_path()' - put enumeration index before filename extension (#306)	5 years ago
Mike Fährmann	8dc42bb178	implement 'enumerate' for 'extractor.skip' (#306 ) [ci skip]	5 years ago
Mike Fährmann	20f7b07312	ensure postproc finalize() is called during C-c or crash (#355 )	5 years ago
Mike Fährmann	7b77ecc35a	fix paths for files without extension (#220 )	5 years ago
Mike Fährmann	62097284fe	add 'download' option (#220 )	5 years ago
Mike Fährmann	fe7805de7c	improve attribute access in DownloadJob.handle_url() Storing a value in a local variable and accessing it that way is faster than going through 'self' if it is accessed more than once.	5 years ago
Mike Fährmann	f2000a69aa	implement 'image-unique' and 'chapter-unique' options (#303 ) The default value for both is 'false', i.e. duplicate URLs are NOT ignored. The previous behavior was to always ignore duplicate URLs to make '--abort-on-skip' work properly when new images were added to the beginning of a collection while gallery-dl is running.	5 years ago
Mike Fährmann	ee4d7c3d89	update downloader.find() and related code Instead of replacing 'https' with 'http' for every URL in 'get_downloader()', this now only happens once during downloader initialization. Also unit tests.	5 years ago
Mike Fährmann	523ebc9b0b	Fix serialization of 'datetime' objects in '--write-metadata' Simplified universal serialization support in json.dump() can be achieved by passing 'default=str', which was already the case in DataJob.run() for -j/--dump-json, but not for the 'metadata' post-processor. This commit introduces util.dump_json() that (more or less) unifies the JSON output procedure of both --write-metadata and --dump-json. (#251, #252)	5 years ago
Mike Fährmann	b09a8184ca	move TestJob into test module; test _extractor values	6 years ago
Mike Fährmann	ae353ed3b0	provide "extractor" and "job" keys for logging output This allows for stuff like "{extractor.url}" and "{extractor.category}" in logging format strings. Accessing 'extractor' and 'job' in any way will return "None" if those fields aren't defined, i.e. in general logging messages.	6 years ago
Mike Fährmann	89ee8cd7e4	filter "private" kwdict entries	6 years ago
Mike Fährmann	61741d7333	provide type information for Queue messages Child extractors are now directly constructed with Extractor.from_url() if the extractor class is known beforehand, instead of using extractor.find() and searching through all possible extractor classes.	6 years ago
Mike Fährmann	277b52101a	add 'category-transfer' option [ci skip]	6 years ago
Mike Fährmann	5f38ac9609	[postprocessor:exec] add a better error message (#155 )	6 years ago
Mike Fährmann	0225d90078	add exception name and traceback for OSErrors	6 years ago
Mike Fährmann	fb53b5dd55	fix control+c during -j and range tests	6 years ago
Mike Fährmann	13cb270326	set target directory before postprocessor init (fixes #126 )	6 years ago
Mike Fährmann	b828473aa3	retry HTTP requests for more exception classes	6 years ago
Mike Fährmann	c47482b110	smaller changes, missing docs, etc. - make 'netrc' extractor-specific - rename 'downloader.enable' to 'enabled' - document 'downloader.ytdl.format' - consistent newlines in configuration.rst	6 years ago
Mike Fährmann	3c25fa2dad	update build_testresult_db.py script	6 years ago
Mike Fährmann	8ef84a6823	add option to enable/disable specific downloader modules ... and write URLs with no (active) downloader to unsupported-file	6 years ago
Mike Fährmann	d3d7f01543	add 'prepare()' step for post-processors This allows post-processors to modify the destination path before checking if a file already exists.	6 years ago
Mike Fährmann	6ed629f2b6	allow specifying number of skips before abort/exit (closes #115 ) In addition to 'abort' and 'exit', it is now possible to specify 'abort:N' and 'exit:N' (where N is any integer) as value for 'skip' to abort/exit after consecutively skipping N downloads.	6 years ago
Mike Fährmann	48a8717a7c	add 'output.num-to-str' option ... to convert any numeric values to string when outputting them as JSON (during '--dump-json' or otherwise)	6 years ago
Mike Fährmann	0514d6a0ae	make --filter and --range config-file options The functionality of --(chapter-)filter and --(chapter-)range are now also exposed as the following config-file options: - extractor.*.image-filter - extractor.*.image-range - extractor.*.chapter-filter - extractor.*.chapter-range TODO: update configuration.rst	6 years ago
Mike Fährmann	4a348990f4	adjust value resolution for retries/timeout/verify options This change introduces 'extractor.*.retries/timeout/verify' options as a general way to set these values for all HTTP requests. 'downloader.http.retries/timeout/verify' is a way to override these options for file downloads only and will fall back to 'extractor.*.…' values if they haven't been explicitly set. Also: downloader classes now take an extractor object as first argument instead of a requests.session.	6 years ago
Mike Fährmann	ca6ac4db6a	fix 'content' tests	6 years ago
Mike Fährmann	188876d814	implement youtube-dl downloader module URLs starting with 'ytdl:' will now be handled by youtube-dl. There is probably a lot to fix and improve, but the basic use case works. TODO: - format selection and ytdl options in general - better filename/path handling - ytdl support for "unsupported URLs" - ...	6 years ago
Mike Fährmann	8c8da11bb8	do not create directory structures when using '-s'	6 years ago
Mike Fährmann	41249f3ead	improve extractor.get_downloader()	6 years ago
Mike Fährmann	712b58a93b	[postprocessor] add black-/whitelist options Each post-processor config dict now supports a list of extractor categories for which it should/shouldn't be active for. For example: "postprocessors": [ {"name": "classify", "whitelist": ["tumblr", "deviantart"], ... } ]	6 years ago
Mike Fährmann	4313c95bc9	improve error message for OAuth2 authentication	6 years ago
Mike Fährmann	973cf98e88	fix download skip for files without extension	6 years ago
Mike Fährmann	2403c405e3	Merge branch 'postprocessor'	6 years ago
Mike Fährmann	baccf8a958	improve postprocessor handling - add pathfmt argument for __init__() - add finalization step - add option to keep or delete zipped files	6 years ago
Mike Fährmann	7646bdbcfd	improve postprocessor initialization code	6 years ago
Mike Fährmann	821535b458	adjust PathFormat class	6 years ago
Mike Fährmann	2df1a15fb8	add '-s/--simulate' to run data extraction without download Useful for quick testing (even though -g and -j kind of do the same) and to fill a download archive without actually downloading the files. -s does the same as the default behaviour, except downloading stuff. Maybe it should get a more fitting name, as it does actually write to disk (cache, archive)?	6 years ago
Mike Fährmann	76c32d58e5	[postprocessor] initial code	6 years ago
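
Note on commit 523ebc9b0b: the message says that passing 'default=str' to json.dump() gives simplified universal serialization, which is what lets 'datetime' objects be written as metadata. The following is a minimal sketch of that standard-library behavior only. The helper name dump_metadata() and the sample kwdict are hypothetical and do not reproduce gallery-dl's actual util.dump_json().

```python
import datetime
import json


def dump_metadata(obj):
    """Hypothetical helper: serialize obj to a JSON string.

    'default=str' is called for every value the encoder cannot
    handle natively (e.g. datetime) and its result is written instead.
    """
    return json.dumps(obj, default=str, sort_keys=True)


# Illustrative metadata dict; the keys are made up for this example.
kwdict = {"date": datetime.datetime(2019, 5, 1, 12, 30), "id": 123}

try:
    json.dumps(kwdict)  # no fallback: datetime is not JSON serializable
    plain_dump_failed = False
except TypeError:
    plain_dump_failed = True

result = dump_metadata(kwdict)
print(plain_dump_failed)
print(result)
```

With the fallback in place, the datetime value is emitted as its str() representation; without it, the encoder raises TypeError.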

215 Commits (3528974459cb2e05e88c916435cda8983f33faf5)
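
Note on commit d5fa716d89: the message points out that a dunder method like __contains__ cannot be monkey-patched on an object because of the special method lookup performed for such methods. A minimal sketch of that Python behavior follows. The Archive class here is a hypothetical stand-in, not gallery-dl's archive implementation.

```python
class Archive:
    """Hypothetical stand-in for a download archive."""

    def __init__(self):
        self.keys = {"a"}

    def __contains__(self, key):
        return key in self.keys


archive = Archive()

# Patching the instance only creates an ordinary instance attribute.
# The 'in' operator looks up __contains__ on type(archive), so the
# patched function is never consulted.
archive.__contains__ = lambda key: True
after_instance_patch = "b" in archive

# Patching the class does change the behavior of 'in'.
Archive.__contains__ = lambda self, key: True
after_class_patch = "b" in archive

print(after_instance_patch)
print(after_class_patch)
```

The first check still uses the original class-level method, while the second reflects the class-level patch, which affects every instance of the class.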