Mike Fährmann
e30ada162d
fix cookie tests
...
update _get_extractor():
- always return an Extractor instance with a _login_impl() method
- use Extractor.from_url()
5 years ago
Mike Fährmann
2316e0ed3d
fix strptime workaround from b0e85a4
...
Don't return a modified version of 'date_time' if strptime fails.
5 years ago
Mike Fährmann
6764847349
fix cookie tests
...
'cookies' is a CookieJar, not a dict,
and removing the call to '.keys()' doesn't have the same effect
5 years ago
Mike Fährmann
a5b060765d
improve code in tests
...
- use 'assertRaises' as context manager
- remove calls to .keys()
5 years ago
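The 'assertRaises' change in a5b060765d can be illustrated with a minimal sketch. The test class and the tested call (int() on invalid input) are hypothetical stand-ins, not taken from the project's test suite.

```python
import unittest


class TestParse(unittest.TestCase):  # hypothetical test case for illustration

    def test_invalid_int(self):
        # Context-manager form: only the code inside the 'with' block
        # is expected to raise the given exception
        with self.assertRaises(ValueError):
            int("not a number")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestParse)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Compared to passing a callable and its arguments to assertRaises(), the context-manager form keeps the tested expression readable as ordinary code.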
Mike Fährmann
b0e85a42e3
apply workaround from 4736912
in parse_datetime() itself
5 years ago
Mike Fährmann
4736912d4e
[pixiv] work around strptime limitations in Python < 3.7
...
"%z" doesn't allow a colon separator in older Python versions:
- "+0900" is OK
- "+09:00" raises an exception
5 years ago
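The "%z" limitation described in 4736912d4e can be worked around roughly as follows. This is a sketch, not the project's actual code: the function name, the default format string, and the regular expression are assumptions. Returning the unmodified input on failure follows the behavior described in 2316e0ed3d.

```python
import re
from datetime import datetime, timedelta


def parse_datetime_compat(date_string, fmt="%Y-%m-%dT%H:%M:%S%z"):
    """Parse 'date_string'; accept '+09:00'-style offsets on older Pythons.

    Hypothetical helper. Returns the original string if parsing fails.
    """
    # Rewrite a trailing "+HH:MM" or "-HH:MM" as "+HHMM" or "-HHMM",
    # which "%z" accepts in all Python 3 versions
    normalized = re.sub(r"([+-]\d{2}):(\d{2})$", r"\1\2", date_string)
    try:
        return datetime.strptime(normalized, fmt)
    except ValueError:
        # Do not leak the modified string to the caller
        return date_string


dt = parse_datetime_compat("2019-05-01T12:00:00+09:00")
print(dt.utcoffset() == timedelta(hours=9))  # True
```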
Mike Fährmann
d09864b581
implement text.parse_datetime()
5 years ago
Mike Fährmann
5582b06ae4
fix tests with 'urllist' messages
5 years ago
Mike Fährmann
5018781898
allow type tests by name
5 years ago
Mike Fährmann
6264a46212
use 'utcfromtimestamp()'
...
'fromtimestamp()' converts its results to the local timezone and causes
problems when running tests on a different machine.
5 years ago
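The timezone issue from 6264a46212 can be sketched as below. The function name is hypothetical. The commit itself uses 'utcfromtimestamp()'; this sketch swaps in the timezone-aware 'fromtimestamp(ts, tz=timezone.utc)' call, since 'utcfromtimestamp()' is deprecated as of Python 3.12, and then drops the tzinfo to get the same naive UTC result.

```python
from datetime import datetime, timezone


def parse_timestamp_utc(ts):
    """Convert a Unix timestamp to a naive datetime expressed in UTC."""
    # datetime.fromtimestamp(ts) without 'tz' would use the machine's
    # local timezone, so results would differ between machines
    return datetime.fromtimestamp(ts, tz=timezone.utc).replace(tzinfo=None)


print(parse_timestamp_utc(0))  # 1970-01-01 00:00:00
```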
Mike Fährmann
d670de0344
implement 'text.parse_timestamp()'
5 years ago
Mike Fährmann
21a7e395a7
implement convenience wrapper for text.extract functionality
6 years ago
Mike Fährmann
e25ebc4bff
don't disable certificate checks anymore
...
Executables generated with PyInstaller auto-include the root certificate
file and certificate checks now work out-of-the-box.
6 years ago
Mike Fährmann
d6ddb74cde
update test results
...
- deviantart: 'index' is now an integer
- flickr: image file with lower quality
- paheal: image server name changed
- rule34: post got deleted
6 years ago
Mike Fährmann
d9b94a585d
[mangoxo] add login support (#184)
...
A very recent change: It is now only possible to see more
than the first 5 images of an album if you are logged in.
6 years ago
Mike Fährmann
e730fc9045
[twitter] add login support (#214)
6 years ago
Mike Fährmann
790f15a56f
[photobucket] use HTTPS
6 years ago
Mike Fährmann
c70b21248d
[wikiart] add extractors (#179)
...
for
- artists: https://www.wikiart.org/en/thomas-cole
- artist-listings: https://www.wikiart.org/en/artists-by-century/12
- artwork-listings: https://www.wikiart.org/en/paintings-by-media/grisaille
6 years ago
Mike Fährmann
0c991a3155
add convenience targets to Makefile
6 years ago
Mike Fährmann
6277a739e4
[35photo] add user-, genre-, and image-extractors (#162)
6 years ago
Mike Fährmann
973a720a7a
[weibo] fix unit test URL patterns
6 years ago
Mike Fährmann
6f57d44ec2
[seaotterscans] remove extractor
...
http://seaotterscans.com/ now redirects to their MangaDex profile
6 years ago
Mike Fährmann
0887fb61f4
[komikcast] update test results
6 years ago
Mike Fährmann
a881537b91
more util.py tests
6 years ago
Mike Fährmann
976ccb267f
[myportfolio] combine gallery and user extractors
...
A URL alone isn't good enough to distinguish between a gallery and a
gallery-listing, so the new extractor decides what to do based on the
page's content.
6 years ago
Mike Fährmann
9c0e2f294b
[shopify] add generic collection and product extractors (#175)
...
with fashionnova.com as a default domain
6 years ago
Mike Fährmann
176b7253a1
update function signature for config.load()
6 years ago
Mike Fährmann
e687a6095e
[luscious] raise exception if album is not available
6 years ago
Mike Fährmann
b09a8184ca
move TestJob into test module; test _extractor values
6 years ago
Mike Fährmann
5530871b5a
change results of text.nameext_from_url()
...
Instead of getting a complete 'filename' from a URL and splitting that
into 'name' and 'extension', the new approach gets rid of the complete
version and renames 'name' to 'filename'. (Using anything other than
{extension} for a filename extension doesn't really work anyway)
Example: "https://example.org/path/filename.ext"
before:
- filename : filename.ext
- name : filename
- extension: ext
now:
- filename : filename
- extension: ext
6 years ago
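The new result shape described in 5530871b5a can be sketched as follows. The function below is a hypothetical stand-in written for illustration and is not the implementation of text.nameext_from_url().

```python
from urllib.parse import unquote, urlsplit


def split_name_and_extension(url):
    """Return 'filename' (without extension) and 'extension' for a URL.

    Hypothetical helper mirroring the "now" layout from the commit message.
    """
    # Use only the path component, so query strings are ignored
    path = urlsplit(url).path
    basename = unquote(path.rpartition("/")[2])
    name, dot, ext = basename.rpartition(".")
    if dot:
        return {"filename": name, "extension": ext}
    # No dot in the last path segment: no extension
    return {"filename": basename, "extension": ""}


print(split_name_and_extension("https://example.org/path/filename.ext"))
# {'filename': 'filename', 'extension': 'ext'}
```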
Mike Fährmann
148b8f15d0
update tests for util.py
6 years ago
Mike Fährmann
4b1880fa5e
propagate 'match' to base extractor constructor
6 years ago
Mike Fährmann
1f3422c28b
[mangahere] fix extraction
6 years ago
Mike Fährmann
84ae72b8d8
[ngomik] fix extraction
6 years ago
Mike Fährmann
9a9cd32461
implement alternative constructor for extractors
6 years ago
Mike Fährmann
abbd45d0f4
update handling of extractor URL patterns
...
When loading extractor classes during 'extractor.find(…)', their
'pattern' attribute will be replaced with a compiled version of itself.
6 years ago
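The pattern handling described in abbd45d0f4 can be sketched like this. The class, its pattern, and this 'find()' function are hypothetical and only illustrate the idea of replacing a string 'pattern' with its compiled version on first use. Passing the match object to the constructor mirrors 4b1880fa5e.

```python
import re


class ExampleExtractor:  # hypothetical extractor class
    pattern = r"(?:https?://)?example\.org/gallery/(\d+)"

    def __init__(self, match):
        self.match = match


def find(url, classes):
    """Return an instance of the first class whose pattern matches 'url'."""
    for cls in classes:
        if isinstance(cls.pattern, str):
            # Compile once and store the result on the class itself
            cls.pattern = re.compile(cls.pattern)
        match = cls.pattern.match(url)
        if match:
            return cls(match)
    return None


extractor = find("https://example.org/gallery/123", [ExampleExtractor])
print(extractor.match.group(1))  # 123
```

After the first lookup, later calls reuse the compiled pattern stored on the class instead of compiling the string again.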
Mike Fährmann
6284731107
simplify extractor constants
...
- single strings for URL patterns
- tuples instead of lists for 'directory_fmt' and 'test'
- single-tuple tests where applicable
6 years ago
Mike Fährmann
bc0951d974
allow for simplified test data structures
...
Instead of a strict list of (URL, RESULTS)-tuples, extractor result
tests can now be a single (URL, RESULTS)-tuple, if it's just one test,
and "only matching" tests can now be a simple string.
6 years ago
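The three accepted shapes from bc0951d974 could be normalized into one list roughly as follows. The function name is hypothetical, and treating RESULTS as a dict (or None) is an assumption made so that a single (URL, RESULTS)-tuple can be told apart from a sequence of entries.

```python
def normalize_tests(test):
    """Return a list of (URL, RESULTS) tuples for any accepted 'test' shape."""
    if test is None:
        return []
    if isinstance(test, str):
        # "only matching" test: a bare URL string without expected results
        return [(test, None)]
    if (len(test) == 2 and isinstance(test[0], str)
            and isinstance(test[1], (dict, type(None)))):
        # a single (URL, RESULTS) tuple
        return [tuple(test)]
    # a sequence of entries, each either a URL string or a (URL, RESULTS) pair
    return [(e, None) if isinstance(e, str) else tuple(e) for e in test]


print(normalize_tests("https://example.org/1"))
# [('https://example.org/1', None)]
```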
Mike Fährmann
347398f692
fix various tests
6 years ago
Mike Fährmann
e1d3e9a926
add 'ext_from_url' to text.py
6 years ago
Mike Fährmann
2d2953a5bf
add 'text.parse_float()' + cleanup in text.py
6 years ago
Mike Fährmann
0c32dc5858
[hentaifox] add extractor for search results (#160)
6 years ago
Mike Fährmann
217a0687ef
[behance] add 'collection' extractor (closes #157)
6 years ago
Mike Fährmann
b8fed34548
add generalized extractors for Mastodon instances (#144)
...
Extractors for Mastodon instances can now be dynamically generated,
based on the instance names in the 'extractor.mastodon.*' config path.
Example:
    {
        "extractor": {
            "mastodon": {
                "pawoo.net": { ... },
                "mastodon.xyz": { ... },
                "tabletop.social": { ... },
                ...
            }
        }
    }
Each entry requires an 'access-token' value, which can be generated with
'gallery-dl oauth:mastodon:<instance URL>'.
An 'access-token' (as well as a 'client-id' and 'client-secret') for
pawoo.net is always available, but can be overwritten as necessary.
6 years ago
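Generating extractor classes from the 'extractor.mastodon.*' config path, as b8fed34548 describes, might look roughly like this. Everything below (the base class, the attribute names, the class-naming scheme, and the function) is hypothetical and only illustrates building classes at runtime with type().

```python
class MastodonExtractor:  # hypothetical base class
    category = "mastodon"
    instance = None
    root = None
    access_token = None


def generate_extractors(config):
    """Create one extractor subclass per configured Mastodon instance."""
    instances = config.get("extractor", {}).get("mastodon", {})
    classes = []
    for instance, options in instances.items():
        if not isinstance(options, dict):
            continue  # skip plain option values that are not instance entries
        # "pawoo.net" becomes "MastodonPawooNetExtractor"
        name = "Mastodon{}Extractor".format(
            "".join(part.capitalize() for part in instance.split(".")))
        classes.append(type(name, (MastodonExtractor,), {
            "instance": instance,
            "root": "https://" + instance,
            "access_token": options.get("access-token"),
        }))
    return classes


config = {"extractor": {"mastodon": {"pawoo.net": {"access-token": "TOKEN"}}}}
for cls in generate_extractors(config):
    print(cls.__name__, cls.root)
```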
Mike Fährmann
66460337f1
[mangapark] fix extraction
6 years ago
Mike Fährmann
79c01ec7ae
implement J<separator>/ format option
...
J joins list elements by calling <separator>.join(list):
Example:
{f:J - /} -> "a - b - c" (if "f" is ["a", "b", "c"])
6 years ago
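The J<separator>/ option from 79c01ec7ae can be approximated with a string.Formatter subclass. This is a sketch of the idea and not the project's formatter: the class name is hypothetical, and passing any text after the "/" on as a regular format spec is an assumption.

```python
import string


class JoinFormatter(string.Formatter):  # hypothetical formatter class

    def format_field(self, value, format_spec):
        if format_spec.startswith("J"):
            # "J - /" -> separator " - ", remaining format spec ""
            separator, _, rest = format_spec[1:].partition("/")
            return super().format_field(separator.join(value), rest)
        return super().format_field(value, format_spec)


print(JoinFormatter().format("{f:J - /}", f=["a", "b", "c"]))  # a - b - c
```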
Mike Fährmann
9bbbadd93a
[hbrowse] use HTTPS
6 years ago
Mike Fährmann
98c6520384
[pinterest] update root URL of API calls
6 years ago
Mike Fährmann
751e535948
[nhentai] fix extraction (closes #156)
...
Use JSON embedded in webpage since API endpoints have been disabled
6 years ago
Mike Fährmann
1734a6c879
[reactor] detect "circular" redirects (#148)
6 years ago