You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
gallery-dl/docs/formatting.md

326 lines
12 KiB

This file contains invisible Unicode characters!

This file contains invisible Unicode characters that may be processed differently from what appears below. If your use case is intentional and legitimate, you can safely ignore this warning. Use the Escape button to reveal hidden characters.

# String Formatting
Format strings in gallery-dl follow the general rules of [`str.format()`](https://docs.python.org/3/library/string.html#format-string-syntax) ([PEP 3101](https://www.python.org/dev/peps/pep-3101/)) plus several extras.
The syntax for replacement fields is `{<field-name>!<conversion>:<format-specifiers>}`, where `!<conversion>` and `:<format-specifiers>` are both optional and can be used to specify how the value selected by `<field-name>` should be transformed.
## Field Names
Field names select the metadata value to use in a replacement field.
While simple names are usually enough, more complex forms like accessing values by attribute, element index, or slicing are also supported.
| | Example | Result |
| -------------------- | ------------------- | ---------------------- |
| Name | `{title}` | `Hello World` |
| Element Index | `{title[6]}` | `W` |
| Slicing | `{title[3:8]}` | `lo Wo` |
| Slicing (Bytes) | `{title_ja[b3:18]}` | `ロー・ワー` |
| Alternatives | `{empty\|title}` | `Hello World` |
| Attribute Access | `{extractor.url}` | `https://example.org/` |
| Element Access | `{user[name]}` | `John Doe` |
| | `{user['name']}` | `John Doe` |
All of these methods can be combined as needed.
For example `{title[24]|empty|extractor.url[15:-1]}` would result in `.org`.
## Conversions
Conversion specifiers allow to *convert* the value to a different form or type. Such a specifier must only consist of 1 character. gallery-dl supports the default three (`s`, `r`, `a`) as well as several others:
<table>
<thead>
<tr>
<th>Conversion</th>
<th>Description</th>
<th>Example</th>
<th>Result</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center"><code>l</code></td>
<td>Convert a string to lowercase</td>
<td><code>{foo!l}</code></td>
<td><code>foo bar</code></td>
</tr>
<tr>
<td align="center"><code>u</code></td>
<td>Convert a string to uppercase</td>
<td><code>{foo!u}</code></td>
<td><code>FOO BAR</code></td>
</tr>
<tr>
<td align="center"><code>c</code></td>
<td>Capitalize a string, i.e. convert the first character to uppercase and all others to lowercase</td>
<td><code>{foo!c}</code></td>
<td><code>Foo bar</code></td>
</tr>
<tr>
<td align="center"><code>C</code></td>
<td>Capitalize each word in a string</td>
<td><code>{foo!C}</code></td>
<td><code>Foo Bar</code></td>
</tr>
<tr>
<td align="center"><code>g</code></td>
<td>Slugify a value</td>
<td><code>{foo!g}</code></td>
<td><code>foo-bar</code></td>
</tr>
<tr>
<td align="center"><code>j</code></td>
<td>Serialize value to a JSON formatted string</td>
<td><code>{tags!j}</code></td>
<td><code>["sun", "tree", "water"]</code></td>
</tr>
<tr>
<td align="center"><code>t</code></td>
<td>Trim a string, i.e. remove leading and trailing whitespace characters</td>
<td><code>{bar!t}</code></td>
<td><code>FooBar</code></td>
</tr>
<tr>
<td align="center"><code>T</code></td>
<td>Convert a <code>datetime</code> object to a unix timestamp</td>
<td><code>{date!T}</code></td>
<td><code>1262304000</code></td>
</tr>
<tr>
<td align="center"><code>d</code></td>
<td>Convert a unix timestamp to a <code>datetime</code> object</td>
<td><code>{created!d}</code></td>
<td><code>2010-01-01 00:00:00</code></td>
</tr>
<tr>
<td align="center"><code>U</code></td>
<td>Convert HTML entities</td>
<td><code>{html!U}</code></td>
<td><code>&lt;p&gt;foo &amp; bar&lt;/p&gt;</code></td>
</tr>
<tr>
<td align="center"><code>H</code></td>
<td>Convert HTML entities &amp; remove HTML tags</td>
<td><code>{html!H}</code></td>
<td><code>foo &amp; bar</code></td>
</tr>
<tr>
<td align="center"><code>s</code></td>
<td>Convert value to <a href="https://docs.python.org/3/library/stdtypes.html#text-sequence-type-str" rel="nofollow"><code>str</code></a></td>
<td><code>{tags!s}</code></td>
<td><code>['sun', 'tree', 'water']</code></td>
</tr>
<tr>
<td align="center"><code>S</code></td>
<td>Convert value to <code>str</code> while providing a human-readable representation for lists</td>
<td><code>{tags!S}</code></td>
<td><code>sun, tree, water</code></td>
</tr>
<tr>
<td align="center"><code>r</code></td>
<td>Convert value to <code>str</code> using <a href="https://docs.python.org/3/library/functions.html#repr" rel="nofollow"><code>repr()</code></a></td>
<td></td>
<td></td>
</tr>
<tr>
<td align="center"><code>a</code></td>
<td>Convert value to <code>str</code> using <a href="https://docs.python.org/3/library/functions.html#ascii" rel="nofollow"><code>ascii()</code></a></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>
## Format Specifiers
Format specifiers can be used for advanced formatting by using the options provided by Python (see [Format Specification Mini-Language](https://docs.python.org/3/library/string.html#format-specification-mini-language)) like zero-filling a number (`{num:>03}`) or formatting a [`datetime`](https://docs.python.org/3/library/datetime.html#datetime.datetime) object (`{date:%Y%m%d}`), or with gallery-dl's extra formatting specifiers:
<table>
<thead>
<tr>
<th>Format Specifier</th>
<th>Description</th>
<th>Example</th>
<th>Result</th>
</tr>
</thead>
<tbody>
<tr>
<td rowspan="2"><code>?&lt;start&gt;/&lt;end&gt;/</code></td>
<td rowspan="2">Adds <code>&lt;start&gt;</code> and <code>&lt;end&gt;</code> to the actual value if it evaluates to <code>True</code>. Otherwise the whole replacement field becomes an empty string.</td>
<td><code>{foo:?[/]/}</code></td>
<td><code>[Foo&nbsp;Bar]</code></td>
</tr>
<tr>
<td><code>{empty:?[/]/}</code></td>
<td><code></code></td>
</tr>
<tr>
<td><code>[&lt;start&gt;:&lt;stop&gt;]</code></td>
<td>Applies a <a href="https://python-reference.readthedocs.io/en/latest/docs/brackets/slicing.html">Slicing</a> operation to the current value, similar to <a href="#field-names">Field Names</a></td>
<td><code>{foo:[1:-1]}</code></td>
<td><code>oo&nbsp;Ba</code></td>
</tr>
<tr>
<td><code>[b&lt;start&gt;:&lt;stop&gt;]</code></td>
<td>Same as above, but applies to the <a href="https://docs.python.org/3/library/stdtypes.html#bytes"><code>bytes()</code></a> representation of a string in <a href="https://docs.python.org/3/library/sys.html#sys.getfilesystemencoding">filesystem encoding</a></td>
<td><code>{foo_ja:[b3:-1]}</code></td>
<td><code>ー・バ</code></td>
</tr>
<tr>
<td rowspan="2"><code>L&lt;maxlen&gt;/&lt;repl&gt;/</code></td>
<td rowspan="2">Replaces the entire output with <code>&lt;repl&gt;</code> if its length exceeds <code>&lt;maxlen&gt;</code></td>
<td><code>{foo:L15/long/}</code></td>
<td><code>Foo&nbsp;Bar</code></td>
</tr>
<tr>
<td><code>{foo:L3/long/}</code></td>
<td><code>long</code></td>
</tr>
<tr>
<td rowspan="2"><code>X&lt;maxlen&gt;/&lt;ext&gt;/</code></td>
<td rowspan="2">Limit output to <code>&lt;maxlen&gt;</code> characters. Cut output and add <code>&lt;ext&gt;</code> to its end if its length exceeds <code>&lt;maxlen&gt;</code></td>
<td><code>{foo:X15/&nbsp;.../}</code></td>
<td><code>Foo&nbsp;Bar</code></td>
</tr>
<tr>
<td><code>{foo:L6/&nbsp;.../}</code></td>
<td><code>Fo&nbsp;...</code></td>
</tr>
<tr>
<td><code>J&lt;separator&gt;/</code></td>
<td>Concatenates elements of a list with <code>&lt;separator&gt;</code> using <a href="https://docs.python.org/3/library/stdtypes.html#str.join" rel="nofollow"><code>str.join()</code></a></td>
<td><code>{tags:J - /}</code></td>
<td><code>sun - tree - water</code></td>
</tr>
<tr>
<td><code>R&lt;old&gt;/&lt;new&gt;/</code></td>
<td>Replaces all occurrences of <code>&lt;old&gt;</code> with <code>&lt;new&gt;</code> using <a href="https://docs.python.org/3/library/stdtypes.html#str.replace" rel="nofollow"><code>str.replace()</code></a></td>
<td><code>{foo:Ro/()/}</code></td>
<td><code>F()()&nbsp;Bar</code></td>
</tr>
<tr>
<td><code>C&lt;conversion(s)&gt;/</code></td>
<td>Apply <a href="#conversions">Conversions</a> to the current value</td>
<td><code>{tags:CSgc/}</code></td>
<td><code>"Sun-tree-water"</code></td>
</tr>
<tr>
<td><code>S&lt;order&gt;/</code></td>
<td>Sort a list. <code>&lt;order&gt;</code> can be either <strong>a</strong>scending or <strong>d</strong>escending/<strong>r</strong>everse. (default: <strong>a</strong>)</td>
<td><code>{tags:Sd}</code></td>
<td><code>['water', 'tree', 'sun']</code></td>
</tr>
<tr>
<td><code>D&lt;format&gt;/</code></td>
<td>Parse a string value to a <code>datetime</code> object according to <a href="https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes"><code>&lt;format&gt;</code></a></td>
<td><code>{updated:D%b %d %Y %I:%M %p/}</code></td>
<td><code>2010-01-01 00:00:00</code></td>
</tr>
<tr>
<td><code>O&lt;offset&gt;/</code></td>
<td>Apply <code>&lt;offset&gt;</code> to a <code>datetime</code> object, either as <code>±HH:MM</code> or <code>local</code> for local UTC offset</td>
<td><code>{date:O-06:30/}</code></td>
<td><code>2009-12-31 17:30:00</code></td>
</tr>
</tbody>
</table>
All special format specifiers (`?`, `L`, `J`, `R`, `D`, `O`, etc)
can be chained and combined with one another,
but must always appear before any standard format specifiers:
For example `{foo:?//RF/B/Ro/e/> 10}` -> `   Bee Bar`
- `?//` - Tests if `foo` has a value
- `RF/B/` - Replaces `F` with `B`
- `Ro/e/` - Replaces `o` with `e`
- `> 10` - Left-fills the string with spaces until it is 10 characters long
## Global Replacement Fields
Replacement field names that are available in all format strings.
<table>
<thead>
<tr>
<th>Field Name</th>
<th>Description</th>
<th>Example</th>
<th>Result</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>_env</code></td>
<td>Environment variables</td>
<td><code>{_env[HOME]}</code></td>
<td><code>/home/john</code></td>
</tr>
<tr>
<td><code>_now</code></td>
<td>Current local date and time</td>
<td><code>{_now:%Y-%m}</code></td>
<td><code>2022-08</code></td>
</tr>
<tr>
<td rowspan="2"><code>_lit</code></td>
<td rowspan="2">String literals</td>
<td><code>{_lit[foo]}</code></td>
<td><code>foo</code></td>
</tr>
<tr>
<td><code>{'bar'}</code></td>
<td><code>bar</code></td>
</tr>
</tbody>
</table>
## Special Type Format Strings
Starting a format string with `\f<Type> ` allows to set a different format string type than the default. Available ones are:
<table>
<thead>
<tr>
<th>Type</th>
<th>Description</th>
<th width="32%">Usage</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center"><code>F</code></td>
<td>An <a href="https://docs.python.org/3/tutorial/inputoutput.html#formatted-string-literals">f-string</a> literal</td>
<td><code>\fF '{title.strip()}' by {artist.capitalize()}</code></td>
</tr>
<tr>
<td align="center"><code>E</code></td>
<td>An arbitrary Python expression</td>
<td><code>\fE title.upper().replace(' ', '-')</code></td>
</tr>
<tr>
<td align="center"><code>T</code></td>
<td>Path to a template file containing a regular format string</td>
<td><code>\fT ~/.templates/booru.txt</code></td>
</tr>
<tr>
<td align="center"><code>TF</code></td>
<td>Path to a template file containing an <a href="https://docs.python.org/3/tutorial/inputoutput.html#formatted-string-literals">f-string</a> literal</td>
<td><code>\fTF ~/.templates/fstr.txt</code></td>
</tr>
<tr>
<td align="center"><code>M</code></td>
<td>Path or name of a Python module
followed by the name of one of its functions.
This function gets called with the current metadata dict as
argument and should return a string.</td>
<td><code>\fM my_module:generate_text</code></td>
</tr>
</tbody>
</table>