gallery-dl

Commit Graph

Author	SHA1	Message	Date
Mike Fährmann	69a5e6ddb3	Merge branch 'master' into 1.4-dev	6 years ago
Mike Fährmann	3fe653d940	fix test_results for empty sets {} is an empty dict and doesn't support set operations	6 years ago
Mike Fährmann	d96b3474e5	[puremashiro] remove module site has been unreachable for a couple of weeks and now the DNS record is gone as well	6 years ago
Mike Fährmann	b44a296404	[gomanga] remove module site has been unreachable for a couple of weeks and the cloudflare status page shows host errors	6 years ago
Mike Fährmann	2395d870dd	[pinterest] unquote board and user names, better errors	6 years ago
Mike Fährmann	55d4d23860	[pinterest] use Pinterest's "Web" API (#83) no access tokens, no user credentials of any kind ...	7 years ago
Mike Fährmann	f471161920	Merge branch 'master' into 1.4-dev	7 years ago
Mike Fährmann	cc36f88586	rename safe_int to parse_int; move parse_* to text module	7 years ago
Mike Fährmann	10cc59f3b5	fix extractor names	7 years ago
Mike Fährmann	b1325d4d2c	fix extractor docstrings	7 years ago
Mike Fährmann	df7e18399e	[luscious] fix image order	7 years ago
Mike Fährmann	d10579edb5	[pinterest] improve PinterestAPI code; remove OAuth mentions on another note: access_tokens have been set to only allow for 10 requests per hour (from 200 yesterday)	7 years ago
Mike Fährmann	4bd182c107	[pinterest] implement `oauth:pinterest` (#83) Pinterest access tokens are rate limited at 200 requests per hour (or maybe per 2 or 3 hours?) so having just one access token for all users isn't going to work in the long run.	7 years ago
Mike Fährmann	dbe250f7e5	[pinterest] update access_token (#83)	7 years ago
Mike Fährmann	5c487300ee	improve 'parse_query()' and add tests - another irrelevant micro-optimization ! - use urllib.parse.parse_qsl directly instead of parse_qs, which just packs the results of parse_qsl in a different data structure - reduced memory requirements since no additional dict and lists are created	7 years ago
Mike Fährmann	4ffa94f634	remove 'shorten_path()' and 'shorten_filename()'	7 years ago
Mike Fährmann	27eab4e467	rewrite text tests and improve functions - test more edge cases - consistently return an empty string for invalid arguments - remove the ungreedy-flag in 'remove_html()'	7 years ago
Mike Fährmann	e3f2bd4087	add tests for 'text.clean_xml()' and improve it	7 years ago
Mike Fährmann	6d8b191ea7	improve 'parse_query()' and add tests - another irrelevant micro-optimization ! - use urllib.parse.parse_qsl directly instead of parse_qs, which just packs the results of parse_qsl in a different data structure - reduced memory requirements since no additional dict and lists are created	7 years ago
Mike Fährmann	51ea699083	add 'abort()' as function to filter expressions calling 'abort()' in a filter aborts the current extractor run in a cleaner way than using something like 1/0, which causes an error message to be printed	7 years ago
Mike Fährmann	48a83a89e9	[loveisover] remove module archive.loveisover.me was shut down on 2018-03-29; https://www.archiveteam.org/index.php?title=4chan#archive.loveisover.me	7 years ago
Mike Fährmann	564e12ca8f	replace 'imgyt' with 'imxto' https://img.yt/ wasn't available for a couple of days, but has now re-emerged as https://imx.to/ with a new web-interface. Links to older images still work (see tests).	7 years ago
Mike Fährmann	d11fcf4804	smaller changes and fixes - fix the cloudflare challenge result if the last decimal places are zero (JS's toFixed() removes trailing zeroes) - fix downloading of kissmanga chapter-pages hosted on blogspot (accessing blogspot with "kissmanga.com" as referrer yields a 401) - disable certificate validation for 'mangahere' tests - update flickr test result	7 years ago
Mike Fährmann	759ba26fb0	[luscious] proper image order for picture albums ... and (try) to start with the first image instead of somewhere in the middle of an album.	7 years ago
Mike Fährmann	0381ae5318	replace error handlers for stdout and co. Python3.5 and lower throw an UnicodeEncodeError when trying to print not-encodable characters when not using 'utf-8' as encoding. Setting their error handlers to 'replace' should help.	7 years ago
Mike Fährmann	64d7c85b55	[exhentai] improve metadata - add 'width', 'height' and 'size' (in bytes) for each image - change the former 'size' and 'size_units' into 'gallery_size'	7 years ago
Mike Fährmann	a112e3f2a0	[nijie] add doujin extractor adds support for "https://nijie.info/members_dojin.php?id=<artist_id>"	7 years ago
Mike Fährmann	299ae24996	[test] add a few downloader tests	7 years ago
Mike Fährmann	dd314279fb	[test] add unit tests for extractor module functions	7 years ago
Mike Fährmann	f5c6a2d7f5	[nhentai] use API to get gallery info	7 years ago
Mike Fährmann	8ef790de12	update .travis.yml - restrict builds to master branch and release tags - implement 'core' and 'results' test categories	7 years ago
Mike Fährmann	557cb94f81	[deviantart] use proper exponential backoff on API errors ... and use separate API credentials for unit tests.	7 years ago
Mike Fährmann	b69cc94f0e	[util] implement bencode()	7 years ago
Mike Fährmann	4d74749496	[tests] rework filters for extractor tests CI incompatible tests will now only be skipped if tests are run in a CI environment.	7 years ago
Mike Fährmann	32bbd12f08	update extractor tests	7 years ago
Mike Fährmann	749fbbfa6c	[mangadex] add chapter- and manga-extractor	7 years ago
Mike Fährmann	5008e105ee	update archive IDs ... to behave in a more straightforward way when dealing with bookmarks/favourites/etc. specific IDs are now grouped by their owner, album-id, ... to allow for duplicates when it would be expected.	7 years ago
Mike Fährmann	2fad0b1f1b	add 'U' conversion for format strings to unquote their content (#74)	7 years ago
Mike Fährmann	8f338347b6	[imagehosts] cleanup removed - chronos.to - unable to resolve hostname - coreimg.net - same - imgmaid.net - same - hosturimage.com - everything returns 404 - imageontime.org - redirects to some shady site - imgupload.yt - cloudflare error 522, host down - img4ever.net - read timeout	7 years ago
Mike Fährmann	e1e0668ca8	add option to set default replacement field value Missing or undefined keywords will now be replaced with the value set for 'keywords-default'. The default is Python's 'None', which is equivalent to setting this option to JSON's 'null'.	7 years ago
Mike Fährmann	ac3da8115e	[util] don't add text: URLs to list of downloaded URLs	7 years ago
Mike Fährmann	89440382ad	[tumblr] use separate API key for unit tests	7 years ago
Mike Fährmann	b50bdbf3d7	change config specifiers in input file format Instead of a dictionary/object, input file options are now specified by a 'key=value' pair starting with '-' for options only applying to the next URL or '-G' for Global options applying to all following URLs. See the docstring of parse_inputfile() for details. Example option specifiers: - filename = "{id}.{extension}" - extractor.pixiv.user.directory = ["Pixiv Users", "{user[id]}"] -spaces="are_optional" -G keywords = {"global": "option"}	7 years ago
Mike Fährmann	be3ea4425d	test archive-id creation and uniqueness	7 years ago
Mike Fährmann	b73b8b4f50	add OAuth unittests	7 years ago
Mike Fährmann	f5f2d29f56	[nijie] fix dojin extraction - correctly extract artist_id - set extension to "jpg" if it was empty and let filetype checks do the rest	7 years ago
Mike Fährmann	7a412f5c32	implement generic manga-chapter extractor	7 years ago
Mike Fährmann	aa38eab2be	allow not-defined fields in format strings ... and replace them with "None", for now	7 years ago
Mike Fährmann	619387cbb1	update extractor unittest results	7 years ago
Mike Fährmann	f94e3706a8	use logging module for error messages during downloads	7 years ago
Mike Fährmann	0dd48d644f	update test results nothing broke, but things got updated or changed	7 years ago
Mike Fährmann	1e93955170	[batoto] remove module Site officially shut down on 2018.01.18	7 years ago
Mike Fährmann	f10ffc0839	update extractor blacklist to also allow classes	7 years ago
Mike Fährmann	35e09869d1	[mangapark] fix image URLs and use HTTPS	7 years ago
Mike Fährmann	4edb25346e	[slideshare] support mobile URLs (closes #67)	7 years ago
Mike Fährmann	b33efc99a4	[idolcomplex] add support for idol.sankakucomplex.com	7 years ago
Mike Fährmann	1a70857a12	update extractor-unittest capabilities - "count" can now be a string defining a comparison in the form of '<operator> <value>', for example: '> 12' or '!= 1'. If its value is not a string, it is assumed to be a concrete integer as before. - "keyword" can now be a dictionary defining tests for individual keys. These tests can either be a type, a concrete value or a regex starting with "re:". Dictionaries can be stacked inside each other. Optional keys can be indicated with a "?" before its name. For example: "keyword:" { "image_id": int, "gallery_id", 123, "name": "re:pattern", "user": { "id": 321, }, "?optional": None, }	7 years ago
Mike Fährmann	28cd78aae0	[kissmanga] extend chapter-string regex (closes #58)	7 years ago
Mike Fährmann	fc7d165c97	[deviantart] add support for OAuth2 authentication Some user galleries [] require you to be either logged in or authenticated via OAuth2 to access their deviations. [] e.g. https://polinaegorussia.deviantart.com/gallery/ -------------- known issue: A deviantart 'refresh_token' can only be used once and gets updated whenever it is used to request a new 'access_token', so storing its initial value in a config file and reusing it again and again is not possible.	7 years ago
Mike Fährmann	0a9a07a6e1	[slideshare] improve metadata; flake8 - added 'views' and 'published' keywords - fixed longer titles and descriptions	7 years ago
Mike Fährmann	291369eab2	various smaller changes/additions	7 years ago
Mike Fährmann	300346ecdf	[mangazuki] remove extractors This site has been in "rebuild"-mode for a fairly long time and the current extractor code isn't going to work for the new version either.	7 years ago
Mike Fährmann	93482a1f88	implement 'util.advance()'	7 years ago
Mike Fährmann	a718c6c6cd	implement 'util.parse_bytes()'	7 years ago
Mike Fährmann	214972bc9a	[gelbooru] use manual extraction ... to compensate for their disabled API. (https://gelbooru.com/index.php?page=forum&s=view&id=3875) This also adds an extractor for image-pools.	7 years ago
Mike Fährmann	b14de6ffc2	[tumblr] small improvements - don't transform inline GIF URLs - set 'type' parameter for API calls if there is only one post type selected	7 years ago
Mike Fährmann	b8cdd42cab	[senmanga] fix extraction (again) this is basically a re-revert of `2ace5c7`	7 years ago
Mike Fährmann	6913eeaa40	[powermanga] replace manga extractor unit test My Hero Academia is gone	7 years ago
Mike Fährmann	f72318e593	[seiga] support more than 200 images Due to API restrictions and/or missing knowledge about and documentation of API usage, it was only possible to retrieve the latest 200 images of a niconico seiga user with said API. The new approach manually visits each HTML page and gets its information from there.	7 years ago
Mike Fährmann	2457b71633	skip tests on 5xx status codes	7 years ago
Mike Fährmann	305da540c3	[mangahere] fix metadata extraction	7 years ago
Mike Fährmann	035ef655f1	[imagefap] update unit tests old gallery/image has been deleted	7 years ago
Mike Fährmann	caf26412dd	add option to set alternate location of .part files (#29) Note: The path set for 'downloader.*.part-directory' needs to point to an already existing directory.	7 years ago
Mike Fährmann	27c026543f	re-enable download unit tests	7 years ago
Mike Fährmann	b0353aa02d	rewrite download modules (#29) - use '.part' files during file-download - implement continuation of incomplete downloads - check if file size matches the one reported by server	7 years ago
Mike Fährmann	6af921a952	[sankaku] rewrite/improve (fixes #44) - add wait-time between HTTP requests similar to exhentai - add 'wait-min' and 'wait-max' options - increase retry-count for HTTP requests to 10 - implement user authentication (non-authenticated users can only view images up to page 25) - implement 'skip()' functionality (only works up to page 50) - implement image-retrieval for pages >= 51 - fix issue with multiple tags	7 years ago
Mike Fährmann	75d3a1f72f	[deviantart] always download original images Deviation-objects returned by the DeviantArt API don't always contain the URL and metadata of the original image ([1]). Getting this information requires an additional API call [2], which is indicated by the 'is_downloadable' and 'download_filesize' metadata within a deviation-object. [1] https://myria-moon.deviantart.com/art/Aime-Moi-part-en-vadrouille-261986576 [2] https://www.deviantart.com/developers/http/v1/20160316/deviation_download/bed6982b88949bdb08b52cd6763fcafd	7 years ago
Mike Fährmann	8e6a767109	[util] restructure formatter for better exception propagation	7 years ago
Mike Fährmann	0386503c80	fix (sub)category-transfer for DownloadJob instances (#41) ... and extend "parent" parameters to TestJob- and DataJob-classes as well.	7 years ago
Mike Fährmann	41adb99e9c	[pawoo] fix extraction - changed access_token - use account-search instead of general search	7 years ago
Mike Fährmann	b319f4bab3	smaller code and text changes	7 years ago
Mike Fährmann	c1f0afe4c6	add custom string formatter class	7 years ago
Mike Fährmann	85a2b2ae59	[khinsider] fix extraction	7 years ago
Mike Fährmann	8e14714c2b	[imgspice] fix extraction	7 years ago
Mike Fährmann	a85f06d2d1	[foolslide] restructure; convert suitable values to int	7 years ago
Mike Fährmann	9fc1d0c901	implement and use 'util.safe_int()' same as Python's 'int()', except it doesn't raise any exceptions and accepts a default value	7 years ago
Mike Fährmann	a9e7145651	[hbrowse] extract hmanga metadata & general maintenance	7 years ago
Mike Fährmann	84d4450410	[fallenangels] extract manga metadata	7 years ago
Mike Fährmann	f32b1a0292	[imgyt] fix extraction	7 years ago
Mike Fährmann	31cd5b1c1d	[luscious] detect high-load responses	7 years ago
Mike Fährmann	81877bb5f6	add '-K' as shortcut for '--list-keywords'	7 years ago
Mike Fährmann	9b21d3f13c	add '--filter' command-line option This allows for image filtering via Python expressions by the same metadata that is also used to build filenames (--list-keywords). The usually shunned eval() function is used to evaluate filter-expressions, but it seemed quite appropriate in this case and shouldn't introduce any new security issues, as any attacker that could do > gallery-dl --filter "delete-everything()" ... could as well do > python -c "delete-everything()"	7 years ago
Mike Fährmann	31731cbefe	update unittests for util.py	7 years ago
Mike Fährmann	f98e3e8002	[luscious] fix tag extraction	7 years ago
Mike Fährmann	65997d835b	replace popular/ranking tests with older ones Metadata of several year old lists shouldn't change as much as it would for newer ones, which makes metadata-comparisons of the output of build_testresult_db.py easier.	7 years ago
Mike Fährmann	c0755a4d5e	[exhentai] revert login-method to its old version (#37) Additional cookies don't seem to help and have to be manually set anyway. The older method is more likely to succeed, so I'd rather use this one.	7 years ago
Mike Fährmann	3ee39ffd93	[exhentai] update login procedure (#37) This new version behaves pretty much exactly like a browser would and caches all cookies sent to it and not just "ipb_member_id" and "ipb_pass_hash".	7 years ago
Mike Fährmann	07214f4007	[booru] place subcategories into base classes	7 years ago
Mike Fährmann	47bcf53ec1	implement support for additional unit test result types - "pattern" matches all resulting URLs against the given regex - "count" allows to specify the amount of returned URLs	7 years ago
Mike Fährmann	c7ec103e15	[batoto] fix extraction of chapter URLs	7 years ago
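
Commit 9fc1d0c901 describes 'util.safe_int()' as behaving like Python's 'int()', except that it raises no exceptions and accepts a default value (cc36f88586 later renames it to parse_int). The following is a minimal sketch of that described behaviour, not the project's actual code; the default of 0 and the exact set of caught exceptions are assumptions made for illustration.

```python
def safe_int(value, default=0):
    """Convert 'value' to int like int() does, but never raise.

    Illustrative sketch only: the default of 0 and the caught
    exception types are assumptions, not taken from the repository.
    """
    try:
        return int(value)
    except (ValueError, TypeError, OverflowError):
        # ValueError: non-numeric strings such as "abc" or "7.5"
        # TypeError: unsupported types such as None
        # OverflowError: float infinity
        return default


if __name__ == "__main__":
    print(safe_int("42"))        # valid integer string
    print(safe_int("abc"))       # falls back to the default
    print(safe_int(None, -1))    # falls back to the given default
```

Returning a default instead of raising keeps metadata extraction going when a site delivers an empty or malformed numeric field.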

235 Commits (48a8717a7c988f49f457be31557fa72d94d9efb4)