19 Commits (1406f7125f2e010b6b5409d4b12a2602af19289f)

Author         SHA1        Message                                                     Date
Mike Fährmann  b0cb4a1b9c  replace 'text.extract()' with 'text.extr()' where possible  2 years ago
Mike Fährmann  c6a9bab019  update extractor test results                               2 years ago
Vrihub         96fcff182c  generic extractor (#735)                                    3 years ago
Mike Fährmann  bd08ee2859  remove most 'yield Message.Version' statements              3 years ago
Mike Fährmann  2919d78bfc  update extractor test results                               4 years ago
Mike Fährmann  193dca2ce1  update extractor test results                               4 years ago
Mike Fährmann  93ce7466e2  [2chan] skip external links                                 4 years ago
Mike Fährmann  b9bfa4c675  update extractor test results                               4 years ago
Mike Fährmann  71acbdabf4  [2chan] fix metadata extraction                             5 years ago
Mike Fährmann  2a3bd4e3c7  rename extractor classes starting with a digit              5 years ago
Mike Fährmann  2e516a1e3e  store the full original URL in Extractor.url                6 years ago
Mike Fährmann  4b1880fa5e  propagate 'match' to base extractor constructor             6 years ago
Mike Fährmann  6284731107  simplify extractor constants                                6 years ago
Mike Fährmann  9e12e073ab  [2chan] fix extraction                                      6 years ago
Mike Fährmann  34873dbd90  set 'archive_fmt' values                                    7 years ago
Mike Fährmann  6f30cf4c64  change keyword names to valid Python identifiers            7 years ago
Mike Fährmann  394241cd6f  [2chan] fix extraction                                      7 years ago
Mike Fährmann  30d3a5f9b2  support redirects on 4chan archives                         7 years ago
Mike Fährmann  47692f28da  [2chan] add thread extractor                                7 years ago