Very much a work in progress.
At the moment, it allows to
- wait and restart an extractor (#3338)
- change the exit code (#3630)
- change the log level of a logging message
based on the contents of a logging message
- share adapter & connection pool across sessions with the same
ssl options, ssl ciphers, and source address
- simplify browser emulation to just a list of headers and ciphers
Each entry in such a list can now also include a subcategory
'<category>:<subcategory>'
and it is possible to use '*' or an empty string as placeholder
'*:<subcategory>', ':<subcategory>', '<category>:*'
For example
"blacklist": "imgur,*:tag,gfycat:user" or
"blacklist": ["imgur", "*:tag", "gfycat:user"]
will filter all 'imgur' extractors, all extractors with a 'tag'
subcategory (e.g. https://danbooru.donmai.us/posts?tags=bonocho),
and all 'gfycat' user extractors.
regression from adf4d661
It would either stop at the first level (-g) or go infinitely deep (-G)
Going down to for example level 3 with -ggg didn't work.
This change makes it possible to specify just the name of a post processor
in the "postprocessors" list instead of a dict with all of its options.
The options for it will then be taken from inside the "postprocessor"
block similar to "extractor", "downloader", or "output" blocks.
This makes it possible to for example override the default settings for
--write-metadata by specifying a custom "metadata" block, or to set a
custom post processor block ("cbz") and then use it by referencing just
its name in "postprocessors" lists.
{
"postprocessor":
{
"metadata": {
"name": "metadata",
"event": "post",
"filename": "{tweet_id|post_id|id}.json"
},
"cbz": {
"name" : "zip",
"compression": "store",
"extension" : "cbz"
}
}
}
Allow forwarding metadata from the top-level extractor to all children
if 'parent-directory' is enabled for all extractors along the way.
For example 'reddit' -> 'gfycat' -> 'redgifs'