Mike Fährmann
2007cb2f59
[tests] check extractor category values
8 months ago
Mike Fährmann
fc4e737f67
[wikimedia] include 'sha1' in default filenames
8 months ago
Mike Fährmann
44f2c15a04
[wikimedia] handle 'File:' paths
8 months ago
Mike Fährmann
93b4120e77
[gelbooru] support 'all' and empty tag ( #5076 )
8 months ago
Mike Fährmann
a416d4c3d5
[sankaku] support post URLs with alphanumeric IDs ( #5073 )
8 months ago
Mike Fährmann
ea553a1d55
[wikimedia] generalize ( #1443 )
...
- support mediawiki.org
- support mariowiki.com (#3660)
- combine code into a single extractor
(use prefix as subcategory)
- handle non-wiki instances
- unescape titles
8 months ago
Mike Fährmann
89066844f4
add 'config_instance' method
...
to allow for a more streamlined access to BaseExtractor instance options
8 months ago
Mike Fährmann
34a7afdbc1
merge #2340 : [wikimedia] add 'article' and 'category' extractors ( #1443 , #2906 )
8 months ago
Mike Fährmann
c3c1635ef3
[wikimedia] update
...
- rewrite using BaseExtractor
- support most Wiki* domains
- update docs/supportedsites
- add tests
8 months ago
Ailothaen
221f54309c
[wikimedia] Improved archive identifiers
8 months ago
Ailothaen
e33056adcd
[wikimedia] Add Wikipedia/Wikimedia extractor
8 months ago
Mike Fährmann
3d68eda4ab
[kemonoparty] add 'revision_hash' metadata ( #4706 , #4727 , #5013 )
...
A SHA1 hexdigest of other relevant metadata fields like
title, content, file and attachment URLs.
This value does NOT reflect which revisions are listed on the website.
Neither does 'edited' or any other metadata field (combinations).
8 months ago
Mike Fährmann
4d6ec6958d
[scripts] add 'push --force' to pull-request
8 months ago
Mike Fährmann
799a8206ad
merge #5061 : [webtoons] extract more metadata
...
- author_name
- comic_name
- episode_name
- username
8 months ago
Mike Fährmann
8ffa0cd3c8
[webtoons] small optimization
...
don't extract the entire 'author_area' and
avoid creating a second 'text.extract_from()' object
8 months ago
Mike Fährmann
59cf4b3884
merge #4444 : [2ch] add 'thread' and 'board' extractors ( #1009 , #3540 )
8 months ago
Mike Fährmann
90b382304a
[deviantart] fix KeyError: 'premium_folder_data' ( #5063 )
8 months ago
Mike Fährmann
4cedf378d5
[deviantart] fix AttributeError for URLs without username ( #5065 )
...
caused by 4f367145
8 months ago
Mike Fährmann
68196589c4
[2ch] update
...
- simplify extractor code
- more metadata
- add tests
8 months ago
hunter-gatherer8
6c4abc982e
[2ch] add 'thread' and 'board' extractors
...
- [2ch] add thread extractor
- [2ch] add board extractor
- [2ch] add new entry to supported sites
8 months ago
Mike Fährmann
69726fc82c
[tests] skip tests requiring auth when none is provided
8 months ago
blankie
bb446b1598
[webtoons] extract more metadata
8 months ago
Mike Fährmann
355b909f46
merge #5041 : [steamgriddb] add support ( #5033 )
8 months ago
Mike Fährmann
71e2c3e5a2
merge #5037 : [hatenablog] add support ( #5036 )
8 months ago
blankie
9f53daabb8
[hatenablog] implement additional suggestion
8 months ago
blankie
293f1559df
[hatenablog] implement suggestions
8 months ago
blankie
65f42442f5
[steamgriddb] implement another suggestion
8 months ago
blankie
8995fd5f01
[steamgriddb] implement suggestions
8 months ago
Mike Fährmann
b1c175fdd1
allow using an empty string as argument for -D/--directory
8 months ago
Mike Fährmann
b97af09e03
[tests] include URL in failure report
8 months ago
Mike Fährmann
58e0665fbc
[tests] load config from external file
8 months ago
Mike Fährmann
2dcfb012ea
[patreon] download 'm3u8' manifests with ytdl
8 months ago
Mike Fährmann
1c68b7df01
[patreon] fix KeyError ( #5048 )
8 months ago
Mike Fährmann
2191e29e14
[nijie] fix image URL for single image posts ( #5049 )
8 months ago
Mike Fährmann
bbf96753e2
[gelbooru] only log "Incomplete API response" for favorites ( #5045 )
8 months ago
Mike Fährmann
39904c9e4e
[deviantart:avatar] add 'formats' option ( #4995 )
8 months ago
Mike Fährmann
5c43098a1a
[twitter] revert to using 'media' timeline by default ( #4953 )
...
This reverts commit a94f944148.
8 months ago
Mike Fährmann
5f9a98cf0f
[deviantart:avatar] fix exception when 'comments' are enabled ( #4995 )
8 months ago
Mike Fährmann
887ade30a5
[batoto] support more mirror domains ( #5042 )
8 months ago
Mike Fährmann
0a382a5092
[batoto] improve 'manga_id' extraction ( #5042 )
8 months ago
blankie
0c88373a21
[docs] add steamgriddb to supportedsites.md
9 months ago
blankie
100966b122
[steamgriddb] fix linting error
9 months ago
blankie
2ccb7d3bd3
[steamgriddb] add support
9 months ago
Mike Fährmann
ec958a26bc
[fuskator] make metadata extraction non-fatal ( #5039 )
...
- prevent KeyErrors
- prevent HTTP redirect
- return file URLs as list
9 months ago
blankie
2cfe788f93
[hatenablog] fix extractor naming errors
9 months ago
blankie
be6949c55d
[hatenablog] fix linting error
9 months ago
blankie
61f3b2f820
[hatenablog] add support
9 months ago
Mike Fährmann
657ed93a22
[batoto] improve v2 manga URL pattern
...
and add tests
9 months ago
Mike Fährmann
50eef1b5cc
merge #5029 : [pixiv] update App API headers
9 months ago
Mike Fährmann
33f228756a
[mangadex] add 'list' extractor ( #5025 )
...
supports listing manga and chapters from list feed
9 months ago