Mike Fährmann
84f4d3bc0b
replace urllib3's default cipher list with Firefox's ( #342 )
...
Avoids Cloudflare CAPTCHAs on both Linux and Windows without
pyOpenSSL installed.
5 years ago
Mike Fährmann
feb98cf196
[twitter] improve 'content' formatting; add option ( #338 )
...
- include emoticons
- leave newlines intact
- remove pic.twitter.com/ links at the end
5 years ago
Mike Fährmann
1740086d8a
add 'repl' and 'sep' arguments to text.replace_html()
5 years ago
Mike Fährmann
8d1ae9b715
[tumblr] enable date-min/-max/-format options ( #337 )
5 years ago
Mike Fährmann
09f37fde39
[reddit] move date-min/-max handling into Extractor class
5 years ago
Mike Fährmann
fb875d1ab8
add warning about NSFW sites in supportedsites.rst ( #335 )
5 years ago
Mike Fährmann
7b77ecc35a
fix paths for files without extension ( #220 )
5 years ago
Mike Fährmann
c41ff9441e
improve find() for downloaders and postprocessors
5 years ago
Mike Fährmann
0151e250f5
[twitter] extract 'content' metadata ( closes #333 )
5 years ago
Mike Fährmann
16c582aaf9
implement 'mtime' post-processor ( #332 )
...
This can set a file's modification time according to a UNIX timestamp
or a datetime object from its metadata.
5 years ago
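The behavior described for 16c582aaf9 can be sketched as follows. This is a minimal illustration and not the project's actual post-processor: the function name `set_mtime` is hypothetical, and treating naive datetime objects as UTC is an assumption.

```python
import os
from datetime import datetime, timezone


def set_mtime(path, value):
    """Set the modification time of 'path' from a UNIX timestamp or a datetime.

    'set_mtime' is a hypothetical helper, not part of gallery-dl.
    """
    if isinstance(value, datetime):
        # Assumption: naive datetime objects are interpreted as UTC.
        if value.tzinfo is None:
            value = value.replace(tzinfo=timezone.utc)
        value = value.timestamp()
    mtime = float(value)
    # Leave the access time as it is and only change the modification time.
    atime = os.stat(path).st_atime
    os.utime(path, (atime, mtime))
    return mtime
```

Both `set_mtime(path, 1000000000)` and `set_mtime(path, datetime(2001, 9, 9, 1, 46, 40))` would then result in the same modification time.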
Mike Fährmann
62097284fe
add 'download' option ( #220 )
5 years ago
Mike Fährmann
fe7805de7c
improve attribute access in DownloadJob.handle_url()
...
Storing a value in a local variable and accessing it that way is faster
than going through 'self' if it is accessed more than once.
5 years ago
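The idea behind fe7805de7c, shown on a toy example. The class `Job` and its attributes are hypothetical and only stand in for the real DownloadJob; the point is that each `self.x` is an attribute lookup, while a local name is resolved faster.

```python
class Job:
    """Hypothetical stand-in for a job object; not gallery-dl code."""

    def __init__(self):
        self.seen = set()
        self.queue = []

    def handle_urls(self, urls):
        # Look up each attribute once and keep it in a local variable,
        # instead of resolving 'self.seen' / 'self.queue' on every iteration.
        seen = self.seen
        queue = self.queue
        for url in urls:
            if url not in seen:
                seen.add(url)
                queue.append(url)
        return queue
```

The local names refer to the same objects as the attributes, so mutating them through the local name is equivalent to mutating them through 'self'.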
Mike Fährmann
56c7a66a4a
detect Cloudflare CAPTCHAs and update cipher list
5 years ago
Mike Fährmann
a7b42b37a2
[35photo] fix extraction
5 years ago
Mike Fährmann
04b8d0894a
[newgrounds] improve metadata extraction
5 years ago
Mike Fährmann
12da6bd0c9
[simplyhentai] fix/improve extraction
5 years ago
Mike Fährmann
fdec59f8e2
replace extractor.request() 'expect' argument
...
with
- 'fatal': allow 4xx status codes
- 'notfound': raise NotFoundError on 404
5 years ago
Mike Fährmann
2ff73873f0
[erolord] add gallery extractor ( closes #326 )
5 years ago
Mike Fährmann
b4da8c5a97
[sexcom] add extractor for related pins ( #325 )
5 years ago
Mike Fährmann
69997e92db
[sexcom] skip unavailable pins ( #325 )
5 years ago
Mike Fährmann
8966930c5c
[downloader:http] try to import SSL exception class from OpenSSL ( #324 )
5 years ago
Mike Fährmann
bc6b0cfddc
[shopify] skip consecutive duplicate products
...
Not filtering duplicate URLs anymore caused the archive ID uniqueness
test to fail.
5 years ago
Mike Fährmann
b89f0d8d3c
update extractor result tests
5 years ago
Mike Fährmann
69205df68d
allow '-1' for infinite retries ( #300 )
5 years ago
Mike Fährmann
f7b5c4c3e7
use values of 'retries' options correctly
...
The RE-tries option now specifies exactly that: the maximum number of
times a failed HTTP request is re-tried. For example, a value of 2 will
now correctly stop after 3 attempts: the initial one + 2 re-tries.
The maximum wait-time now also caps at 30min and increases exponentially
for both extractor.request() and downloader.http.download().
5 years ago
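The retry semantics from f7b5c4c3e7 and 69205df68d could look roughly like this. It is a sketch under assumptions: the names `wait_times` and `request_with_retries`, the 1-second base delay, and the doubling factor are all hypothetical; only the "initial attempt + N re-tries", the 30-minute cap, the exponential growth, and '-1' meaning infinite come from the commit messages.

```python
import itertools

MAX_WAIT = 30 * 60  # cap of 30 minutes, in seconds


def wait_times(retries, base=1, cap=MAX_WAIT):
    """Yield the wait time before each re-try; retries=-1 never stops."""
    counter = itertools.count() if retries < 0 else range(retries)
    for n in counter:
        # Exponential growth (base, 2*base, 4*base, ...), capped at 'cap'.
        yield min(base * 2 ** n, cap)


def request_with_retries(send, retries=2, sleep=lambda seconds: None):
    """Call send() once, then re-try up to 'retries' times on OSError."""
    waits = wait_times(retries)
    while True:
        try:
            return send()
        except OSError:
            wait = next(waits, None)
            if wait is None:
                # All re-tries are used up: propagate the last error.
                raise
            sleep(wait)
```

With `retries=2`, `send` is called at most 3 times. Passing `time.sleep` as `sleep` would make the waits real; the default does nothing so the sketch can be run instantly.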
Mike Fährmann
6393b47db2
add '-A/--abort'; deprecate '--abort-on-skip'
5 years ago
Mike Fährmann
f2000a69aa
implement 'image-unique' and 'chapter-unique' options ( #303 )
...
The default value for both is 'false', i.e. duplicate URLs are NOT
ignored.
The previous behavior was to always ignore duplicate URLs to make
'--abort-on-skip' work properly when new images were added to the
beginning of a collection while gallery-dl is running.
5 years ago
Mike Fährmann
40da44b17f
Merge branch 'v1.9.0'
5 years ago
Mike Fährmann
9a216a6c6c
release version 1.8.7
5 years ago
Mike Fährmann
7a99e85943
[kissmanga] fix download URLs and file extensions
...
The current Blogspot image URLs hosted on Kissmanga end with an
"invalid" query parameter (/000.png&upx=...), which isn't recognized
as such by 'spliturl()' and 'parseurl()' and is therefore included in
the 'extension' field from 'text.nameext_from_url()'.
5 years ago
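The problem from 7a99e85943 can be reproduced with the standard library alone. The URL below is a made-up placeholder (host and 'upx' value are hypothetical), and `strip_bogus_query` is only one possible workaround, not necessarily what the commit does.

```python
import os.path
from urllib.parse import urlsplit

# Hypothetical URL: '&' is used without a preceding '?', so everything
# after it still counts as part of the path.
url = "https://example.org/images/000.png&upx=placeholder"

path = urlsplit(url).path
filename = path.rpartition("/")[2]
name, ext = os.path.splitext(filename)
# 'ext' now contains the bogus parameter in addition to '.png'


def strip_bogus_query(url):
    """Cut a URL at the first '&' if it has no regular '?' query part."""
    if "?" in url:
        return url
    return url.partition("&")[0]


clean_path = urlsplit(strip_bogus_query(url)).path
clean_name, clean_ext = os.path.splitext(clean_path.rpartition("/")[2])
```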
Mike Fährmann
055102431f
[hitomi] handle Game CG galleries with scenes ( fixes #321 )
5 years ago
Mike Fährmann
a9c89085fb
[instagram] implement login support ( #195 )
5 years ago
Mike Fährmann
f1b0c2bf5c
[downloader:ytdl] forward cookies to youtube-dl
...
to be able to download private videos from Twitter, Instagram, etc.
5 years ago
Mike Fährmann
7856e5e7dc
[deviantart] "fix" scraps extraction
5 years ago
Mike Fährmann
b1985d6579
test default format strings during extractor result tests
...
A missing value or an invalid "syntax" for a format replacement field
will raise an exception.
5 years ago
Mike Fährmann
082cb24acd
[pururin] fix extraction
...
Missing metadata information would lead to unnecessary exceptions.
5 years ago
Mike Fährmann
98554cbab8
[mangoxo] fix login
5 years ago
Mike Fährmann
108963d138
[imagefap] include Referer headers
5 years ago
Mike Fährmann
e314621366
[nsfwalbum] fix default directory_fmt ( #287 )
5 years ago
Mike Fährmann
95b1e4c3c0
implement R<old>/<new>/ format option ( #318 )
5 years ago
Mike Fährmann
18a1f8c6cd
[vanillarock] add post and tag extractors ( closes #254 )
5 years ago
Mike Fährmann
f0c5093812
[nsfwalbum] add album extractor ( closes #287 )
5 years ago
Mike Fährmann
15e4ddf46d
implement custom logging formatter
...
supports custom log message formats for each loglevel and, by
extension, custom ANSI codes and colors for errors and warnings
(#304 )
5 years ago
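A per-loglevel formatter as described in 15e4ddf46d can be built on top of the standard 'logging' module. This is a sketch only: the class name `LevelFormatter`, its constructor arguments, and the default format string are assumptions and do not reflect gallery-dl's actual implementation or configuration format.

```python
import logging


class LevelFormatter(logging.Formatter):
    """Pick a format string based on the level of each log record."""

    def __init__(self, formats,
                 default="[%(name)s][%(levelname)s] %(message)s"):
        super().__init__()
        # One regular Formatter per configured log level.
        self._formatters = {
            level: logging.Formatter(fmt)
            for level, fmt in formats.items()
        }
        self._default = logging.Formatter(default)

    def format(self, record):
        formatter = self._formatters.get(record.levelno, self._default)
        return formatter.format(record)


# Example: errors in bold red via ANSI escape codes, everything else default.
formatter = LevelFormatter({
    logging.ERROR: "\033[1;31m%(name)s: %(message)s\033[0m",
})
```

Attaching it with `handler.setFormatter(formatter)` applies the per-level formats to everything that handler emits.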
Mike Fährmann
61e413d85d
[hentaifoundry] stop disabling IPv6 addresses
...
The rogue address mentioned in a138d58 is no longer included in the DNS
results for www.hentai-foundry.com.
5 years ago
Mike Fährmann
76ae9957c2
[deviantart] force legacy version for single deviations
...
Let's see how long this works ...
DeviantArt is rolling out a new version of their website, including a
new internal and potentially usable API (rewrite incoming, yay).
The issue with the new layout is that it doesn't include the "old"
UUIDs for single deviations, i.e. mapping a numeric deviation ID to its
UUID counterpart is impossible with the new layout.
5 years ago
Mike Fährmann
70713f0f28
fix extractor result tests
5 years ago
Mike Fährmann
db3f52881a
add 'mtime' option
5 years ago
Mike Fährmann
ee4d7c3d89
update downloader.find() and related code
...
Instead of replacing 'https' with 'http' for every URL in
'get_downloader()', this now only happens once during downloader
initialization. Also adds unit tests.
5 years ago
Mike Fährmann
f4ba98771d
use Last-Modified header to set file modification time ( #236 , #277 )
5 years ago
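Turning a Last-Modified header into a timestamp, as f4ba98771d describes, needs only the standard library. The helper name `last_modified_timestamp` is hypothetical, and 'headers' is assumed to be any mapping with a `get()` method, such as the headers of an HTTP response object.

```python
from email.utils import parsedate_to_datetime


def last_modified_timestamp(headers):
    """Return the Last-Modified header as a UNIX timestamp, or None."""
    value = headers.get("Last-Modified")
    if not value:
        return None
    try:
        # HTTP dates use the same format as email dates,
        # e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
        return parsedate_to_datetime(value).timestamp()
    except (TypeError, ValueError):
        # Malformed header value: report it as missing.
        return None
```

The result can be passed to `os.utime(path, (timestamp, timestamp))` to set the modification time of a downloaded file.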
Mike Fährmann
179d112083
[downloader] overhaul http and text modules
...
Get rid of the modular structure and simplify/specialize those modules.
5 years ago