Mike Fährmann
193dca2ce1
update extractor test results
4 years ago
Mike Fährmann
93ce7466e2
[2chan] skip external links
4 years ago
Mike Fährmann
b9bfa4c675
update extractor test results
4 years ago
Mike Fährmann
71acbdabf4
[2chan] fix metadata extraction
5 years ago
Mike Fährmann
2a3bd4e3c7
rename extractor classes starting with a digit
5 years ago
Mike Fährmann
2e516a1e3e
store the full original URL in Extractor.url
6 years ago
Mike Fährmann
4b1880fa5e
propagate 'match' to base extractor constructor
6 years ago
Mike Fährmann
6284731107
simplify extractor constants
- single strings for URL patterns
- tuples instead of lists for 'directory_fmt' and 'test'
- single-tuple tests where applicable
6 years ago
Mike Fährmann
9e12e073ab
[2chan] fix extraction
6 years ago
Mike Fährmann
34873dbd90
set 'archive_fmt' values
These are going to be used to create a unique ID for each image.
7 years ago
Mike Fährmann
6f30cf4c64
change keyword names to valid Python identifiers
This commit mostly replaces all minus-signs ('-') in keyword names with
underscores ('_') to allow them to be used in filter-expressions. For
example 'gallery-id' got renamed to 'gallery_id'.
(It is theoretically possible to access any variable, regardless of its
name, with 'locals()["NAME"]', but that seems too convoluted when
just 'NAME' could be enough.)
7 years ago
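A minimal sketch of why the rename in 6f30cf4c64 matters for filter-expressions, assuming the expressions are evaluated as ordinary Python with the keywords available as local names. The evaluate() helper and the sample values are hypothetical and are not taken from the project's code; only the names 'gallery-id' and 'gallery_id' come from the commit message.

```python
# Hypothetical keyword dicts before and after the rename (values are made up).
old_keywords = {"gallery-id": 1234}
new_keywords = {"gallery_id": 1234}


def evaluate(expression, keywords):
    """Evaluate a filter expression with the keywords as local names.

    Assumption: this stands in for however the real filter is evaluated.
    """
    return eval(expression, {}, dict(keywords))


# With an underscore, the keyword is a valid identifier and can be used directly.
result_new = evaluate("gallery_id > 1000", new_keywords)

# With a minus sign, Python parses 'gallery-id' as the subtraction
# 'gallery - id', so the lookup of 'gallery' fails.
try:
    evaluate("gallery-id > 1000", old_keywords)
    old_error = None
except NameError as exc:
    old_error = exc

# The workaround mentioned in the commit message still works, but is clumsy.
result_workaround = evaluate('locals()["gallery-id"] > 1000', old_keywords)
```

Under these assumptions, the underscore form reads like a plain variable, while the old name is only reachable through the locals() lookup.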
Mike Fährmann
394241cd6f
[2chan] fix extraction
7 years ago
Mike Fährmann
30d3a5f9b2
support redirects on 4chan archives
7 years ago
Mike Fährmann
47692f28da
[2chan] add thread extractor
7 years ago