189 Commits (d1f2ef3b7b348a753918d738e76a0bea7e1cb449)

Author SHA1 Message Date
Mike Fährmann 29ea54dc41
[patreon] use '"browser": "firefox"' by default (#1117)
4 years ago
Mike Fährmann cf5fa75d4c
add 'browser' option (#1117)
4 years ago
Mike Fährmann e1a12761d7
strip '/' from instance root URLs
4 years ago
Mike Fährmann d656892670
remove cloudflare.py
4 years ago
Mike Fährmann 88fae99811
remove 'generate_extractors()'
4 years ago
Mike Fährmann 745a114c61
[common] implement BaseExtractor class
4 years ago
Mike Fährmann 0d406c8daf
[common] restrict values used in 'generate_extractors()'
4 years ago
Mike Fährmann 8ca7f54750
rename '_request_…' variables
4 years ago
Mike Fährmann c57a918f4a
[e621] implement delay via '_request_interval_min'
4 years ago
Mike Fährmann 1e3dd7330e
merge SharedConfigMixin functionality into Extractor
4 years ago
Mike Fährmann 198c33ec36
also collect post processors from 'basecategory' entries
4 years ago
Mike Fährmann 1e313d5b84
implement 'sleep-request' option
4 years ago
Mike Fährmann 055c32e0f7
precompute extractor config paths
4 years ago
Mike Fährmann 231dd4c800
accumulate postprocessor objects (#994)
4 years ago
Mike Fährmann f6fd449b59
reduce wait time growth rate from exponential to linear
4 years ago
Mike Fährmann 2c9766b29f
fix UnboundLocalError in Extractor.request()
4 years ago
Mike Fährmann d6a271d2c7
add 'response' objects to 'HttpError's
4 years ago
Mike Fährmann 53cc498d9c
improve config lookup when there are multiple possible locations
4 years ago
Mike Fährmann 1ae1df0d27
update '--write-pages' (#737)
4 years ago
Mike Fährmann 15c3d29062
move dump_response() into a separate function (#737)
4 years ago
Mike Fährmann a363da4b43
include redirects and headers in --write-pages dumps (#737)
4 years ago
Mike Fährmann 3201fe3521
add global SENTINEL object
4 years ago
Mike Fährmann f8f95e68a7
improve '--write-pages' (#737)
4 years ago
Vrihub 4cc761c730
Implement --write-pages option (#736)
4 years ago
Mike Fährmann 5d7ca76885
retry Cloudflare challenges
4 years ago
Mike Fährmann d02f7c1118
improve Extractor.wait()
5 years ago
Mike Fährmann 2a4f227e08
warn about expired cookies
5 years ago
Mike Fährmann 56f1c96168
implement 'parent-directory' option (#551)
5 years ago
Mike Fährmann 2a9be48511
improve util.load/save_cookiestxt() and add tests
5 years ago
Mike Fährmann c1a6862863
implement functions to load/save cookies.txt files (closes #586)
5 years ago
Mike Fährmann bd5ce9855c
allow GalleryExtractors to set URL-independent extensions
5 years ago
Mike Fährmann 3811fd8a25
fix time formatting for Python 3.4 and 3.5
5 years ago
Mike Fährmann 569747a78d
implement extractor.wait()
5 years ago
Mike Fährmann ce54b8c04c
let extractors opt-out of cookie option usage
5 years ago
Mike Fährmann d3e44e899d
raise NotFoundErrors for 404 responses in GalleryExtractors
5 years ago
Mike Fährmann a4dd8b3dab
improve _check_cookies()
5 years ago
Mike Fährmann 15f9bb3d14
add option to disable pyOpenSSL usage (#508)
5 years ago
Mike Fährmann e17907ee2a
change default value of 'cookies-update' to 'true'
5 years ago
Mike Fährmann e2710702d4
fix Cloudflare bypass
5 years ago
Mike Fährmann ae09f87602
improve SharedConfigMixin config lookups
5 years ago
Mike Fährmann f5604492c3
update interface of config functions
5 years ago
Mike Fährmann d45fabb79d
match user profile handling on deviantart and newgrounds
5 years ago
Mike Fährmann 1a197d2195
store the original cookiejar as Extractor._cookiejar
5 years ago
Mike Fährmann de83ae4576
make 'method' argument of Extractor.request keyword-only
5 years ago
Mike Fährmann d44f790e81
adjust output for HTTP status related errors
5 years ago
Mike Fährmann 389d2d7e38
implement 'cookies-update' option (#445)
5 years ago
Mike Fährmann 1693d97bd3
update extractor class hierarchies
5 years ago
Mike Fährmann f4bc75e854
fix rate limit handling for OAuth APIs (#368)
5 years ago
Mike Fährmann 21991acc49
add 'ciphers' option; update default User-Agent
5 years ago
Mike Fährmann 84f4d3bc0b
replace urllib3's default cipher list with Firefox's (#342)
5 years ago
Mike Fährmann 09f37fde39
[reddit] move date-min/-max handling into Extractor class
5 years ago
Mike Fährmann 56c7a66a4a
detect Cloudflare CAPTCHAs and update cipher list
5 years ago
Mike Fährmann fdec59f8e2
replace extractor.request() 'expect' argument
5 years ago
Mike Fährmann 69205df68d
allow '-1' for infinite retries (#300)
5 years ago
Mike Fährmann f7b5c4c3e7
use values of 'retries' options correctly
5 years ago
Mike Fährmann 399e8e965a
also update urllib3's cipher list for versions >= 1.25
5 years ago
Mike Fährmann c02f12ce2f
avoid Cloudflare CAPTCHAs for OpenSSL < 1.1.1
5 years ago
Mike Fährmann 5fd94c6b83
import urllib3 from requests.packages
5 years ago
Mike Fährmann 35f343206c
update default SSL cipher list in urllib3 < 1.25
5 years ago
Mike Fährmann e25ebc4bff
don't disable certificate checks anymore
6 years ago
Mike Fährmann 49a6522c38
ensure consistent headers and params ordering
6 years ago
Mike Fährmann f612284d24
cache cfclearance cookies
6 years ago
Mike Fährmann 591a07f20c
small code changes and cleanups
6 years ago
Mike Fährmann 6dae6bee37
automatically detect and bypass cloudflare challenge pages
6 years ago
Mike Fährmann 4ca4631bad
simplify auto-disabling certificate verification
6 years ago
Mike Fährmann 09d872a2b1
generalize extractor creation code
6 years ago
Mike Fährmann 3595cd582f
use GalleryExtractor as common base class
6 years ago
Mike Fährmann 5530871b5a
change results of text.nameext_from_url()
6 years ago
Mike Fährmann 32edf4fc7b
add '_extractor' info to manga extractor results
6 years ago
Mike Fährmann 2e516a1e3e
store the full original URL in Extractor.url
6 years ago
Mike Fährmann 580baef72c
change Chapter and MangaExtractor classes
6 years ago
Mike Fährmann 4b1880fa5e
propagate 'match' to base extractor constructor
6 years ago
Mike Fährmann 9a9cd32461
implement alternative constructor for extractors
6 years ago
Mike Fährmann 6284731107
simplify extractor constants
6 years ago
Mike Fährmann bc0951d974
allow for simplified test data structures
6 years ago
Mike Fährmann 00dc37ccbf
replace AsynchronousMixin Extractor with a Mixin
6 years ago
Mike Fährmann 4d656a81ca
replace SharedConfigExtractor class with a Mixin
6 years ago
Mike Fährmann bfbbac4495
[tsumino] add login capabilities (#161)
6 years ago
Mike Fährmann dd358b4564
improve cookie handling during logins
6 years ago
Mike Fährmann 06cbf5f9c4
implement 'chapter-reverse' option (#149)
6 years ago
Mike Fährmann 9a98b6769d
use extractor.request for API calls (#130)
6 years ago
Mike Fährmann b828473aa3
retry HTTP requests for more exception classes
6 years ago
Mike Fährmann c47482b110
smaller changes, missing docs, etc.
6 years ago
Mike Fährmann 2fa28a2609
update default user-agent string (closes #122)
6 years ago
Mike Fährmann c9861ca812
adjust message for status_code based exceptions
6 years ago
Mike Fährmann 4a348990f4
adjust value resolution for retries/timeout/verify options
6 years ago
Mike Fährmann f647f5d9c3
use 'verify' option for regular HTTP requests
6 years ago
Mike Fährmann 68d6033a5d
use 'retries' and 'timeout' options for regular HTTP requests
6 years ago
Mike Fährmann 017188d268
improve extractor.request()
6 years ago
Mike Fährmann 2d17a9e07f
improve extractor.request()
7 years ago
Mike Fährmann 8704d850bf
add explicit proxy support (#76)
7 years ago
Mike Fährmann 179bcdd349
adjust archive-ids
7 years ago
Mike Fährmann 3cec533c28
Merge branch 'archive'
7 years ago
Mike Fährmann 5b3c34aa96
use generic chapter-extractor in more modules
7 years ago
Mike Fährmann 7a412f5c32
implement generic manga-chapter extractor
7 years ago
Mike Fährmann 84a52a9256
add DownloadArchive class
7 years ago
Mike Fährmann cc0c2cca57
[reddit] add extractor for reddit-hosted images (closes #68)
7 years ago
Mike Fährmann e6814aebe2
add 'extractor.*.user-agent' config option
7 years ago
Mike Fährmann baf8094868
improve Extractor.request()'s retry behavior
7 years ago
Mike Fährmann 16783e327f
[common] fix UnboundLocalError in Extractor.request()
7 years ago