Static website generator based on HTML element tree rewriting
MIT License
Release blog post: https://soupault.app/blog/soupault-4.10.0-release/
The delete_element widget has a new option: when_no_child.
For example, suppose you have a footnotes container in your template that looks like this: <div id="footnotes"> <hr class="footnotes-separator"> </div>. If a page has footnotes, the container would contain something like <p class="footnote">... as well. If not, it would only have the <hr> element in it.
Deleting it from pages that don't have any footnotes cannot be done with only_if_empty because the container has that auxiliary element in it.
However, with the new option you can make the widget delete the container
only if nothing inside it matches the selector of actual footnotes.
[widgets.clean-up-footnote-containers]
after = "footnotes"
widget = "delete_element"
selector = "div#footnotes"
when_no_child = "p.footnote"
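For contrast, here is a minimal sketch of the only_if_empty approach that the text above says does not work for this template. The <hr> separator keeps the container non-empty, so this widget would leave it in place on pages without footnotes. The widget name is hypothetical.

```toml
# Hypothetical widget name; shown only to contrast with when_no_child.
[widgets.delete-empty-footnote-containers]
after = "footnotes"
widget = "delete_element"
selector = "div#footnotes"
# Deletes the container only when it is empty,
# which is never the case here because of the <hr> separator.
only_if_empty = true
```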
Published by dmbaturin 7 months ago
Release blog post: https://soupault.app/blog/soupault-4.9.0-release
startup hook that runs before soupault processes any pages and can modify the global_data variable.
New Digest module offers functions for calculating cryptographic hash sums of strings. All those functions return hex digests.
Digest.md5(str)
Digest.sha1(str)
Digest.sha256(str)
Digest.sha512(str)
Digest.blake2s(str)
Digest.blake2b(str)
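As an illustration, the startup hook and the Digest module could be combined as in the sketch below. It assumes that global_data can be assigned to like an ordinary Lua table; the build_id key and the hashed string are hypothetical.

```toml
[hooks.startup]
lua_source = '''
-- Hypothetical key: store a hex digest so that plugins can read it later.
global_data["build_id"] = Digest.sha256("my-site build 1")
Log.debug("build id: " .. global_data["build_id"])
'''
```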
Other new functions:
Sys.basename_url(str) and Sys.dirname_url(str): aliases for Sys.basename_unix and Sys.dirname_unix, respectively.
Published by dmbaturin 9 months ago
Full announcement: https://soupault.app/blog/soupault-4.8.0-release (includes important information about future plans)
site_index variable is now available to the post-build hook.
index_entry variable (the complete site index entry for the current page) is now available to post-index, save and post-save hooks and to Lua index processors.
settings.ignore_path_regexes and settings.ignore_directories.
HTML.inner_text(): returns the text nodes from inside a node, stripped of all HTML tags.
<style> tags and similar no longer cause issues
<body> tag inserted in the page (#58, report by Delan Azabani).
Published by dmbaturin about 1 year ago
Release blog post: https://soupault.app/blog/soupault-4.7.0-release
max_items option in index views allows limiting the number of displayed items.
settings.page_character_encoding option for correctly loading pages in encodings other than ASCII and UTF-8.
post-build hook that runs when all pages are processed and soupault is about to terminate.
index_first = true mode.
"page_included checks for %s: regex=%b, page=%b, section=%b"
CSV.from_string(str): parses CSV data and returns it as a list (i.e., an int-indexed table) of lists.
CSV.unsafe_from_string(str): like CSV.from_string but returns nil on errors instead of raising an exception.
CSV.to_list_of_tables(csv_data): converts CSV data with a header returned by CSV.from_string into a list of string-indexed tables for easy rendering.
HTML.swap(l, r): swaps two elements in an element tree.
HTML.wrap(node, elem): wraps node in elem.
global_data hash table for sharing data between plugins.
soupault_pass plugin environment variable (0 when index_first = false, 1 and 2 for the first and the second pass respectively when it's true).
sort_strict = true and sort_by is unspecified.
soupault --init (s/ULRs/URLs/).
New state record now holds both the settings record and the TOML config data structure, plus the new global_data and soupault_pass variables, and can be easily extended to support new global state variables.
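The CSV functions described above might be used together as in this sketch. It is an inline plugin that parses a small hard-coded CSV string; the plugin name, the sample data, and the assumption that the resulting list is indexed from 1 are illustrative and not confirmed by these notes.

```toml
[plugins.csv-demo]
lua_source = '''
-- Sample data with a header row (hypothetical).
csv_text = "name,language\nsoupault,OCaml\n"
-- Parse into a list of lists, then convert to a list of string-indexed tables.
rows = CSV.from_string(csv_text)
records = CSV.to_list_of_tables(rows)
-- Assuming 1-based lists, this logs the "name" column of the first data row.
Log.debug("first record: " .. records[1]["name"])
'''

# A plugin only runs on pages when a widget refers to it.
[widgets.csv-demo]
widget = "csv-demo"
```

If the input may be malformed, CSV.unsafe_from_string could be swapped in and its result checked for nil before the conversion step.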
Official binaries are now available for Linux on ARM64.
Published by dmbaturin over 1 year ago
Release post: https://soupault.app/blog/soupault-4.6.0-release/
Sys.getenv(name, default_value) function (default_value is optional).
String.ends_with(string, suffix).
String.is_valid_utf8(string) and String.is_valid_ascii(string) functions.
Table.length(table): returns the number of items in a table.
Table.for_all(func, table): checks if boolean function func is true for all items in a table.
Table.for_any(func, table): checks if boolean function func is true for at least one item in a table.
Table.is_empty(t): returns true if t has no items in it.
Table.copy(t): returns a copy of the table t.
HTML.is_empty(e): returns true if e has zero child nodes.
HTML.is_root(e): returns true if e has no parent node.
HTML.is_document(e): returns true if e is a soup (document) node rather than an element or a text.
Value.is_html(v): returns true if v is an HTML document or node.
Published by dmbaturin over 1 year ago
Full announcement: https://soupault.app/blog/soupault-4.5.0-release
--no-caching option allows the user to disable caching even if settings.caching is true in the config.
HTML.prepend_root(node, child) function for inserting new nodes in HTML documents before all existing nodes.
soupault --version correctly prints a trailing newline again.
Published by dmbaturin over 1 year ago
Support for caching the output of page preprocessors and commands used by preprocess_element widgets.
[settings]
# Caching is off by default so you need to enable it
caching = true
# Change the cache directory name if you wish
cache_dir = ".soupault-cache"
You can force soupault to clear the cache and rebuild everything by running soupault --force.
Published by dmbaturin almost 2 years ago
relative_links widget now handles links to pages and files at the same level correctly and adds ./ to them.
relative_links widget now always prepends ./ to links at the same level or deeper to make the output deterministic.
Published by dmbaturin almost 2 years ago
String.starts_with(str, prefix)
Sys.split_path(path_str) for splitting native file paths (uses / on UNIX-like systems, \ on Windows).
Sys.split_path_unix (aka Sys.split_path_url) for splitting paths using the /-convention regardless of the OS (safe for URLs).
--help message about the --config option now correctly mentions that it takes a path.
--profile option is not given).
Published by dmbaturin about 2 years ago
profile option, like widgets, and can be limited to specific build profiles.
--config option for specifying a custom config path without the use of environment variables.
--version-number option that prints just the version number (for easy use from scripts).
soupault --init now takes --site-dir and --build-dir options into account when generating the config.
keep_extensions and default_extension options are now mentioned in configs generated by soupault --init.
[] is false, any non-empty list is true).
Published by dmbaturin about 2 years ago
See https://soupault.app/blog/soupault-4.1.0-release/ for details.
post-save hook makes it easier to post-process generated page files (e.g., run them through an HTML minifier).
Sys.get_program_output(cmd, input_string).
Sys.strip_extensions function for removing extensions from file names.
save hook code.
Published by dmbaturin over 2 years ago
Release announcement: https://soupault.app/blog/soupault-4.0.1-release/
soupault 4.0.1 fixes two bugs:
clean_urls = false, pages now have correct url index fields (with original extension removed) (#44, report by @laumann).
Published by dmbaturin over 2 years ago
Release announcement: https://soupault.app/blog/soupault-4.0.0-release
This release adds the long-promised system of page processing hooks, a way to write index processors in Lua (and create new pages from them to make taxonomies and paginated indices), and more.
Published by dmbaturin over 2 years ago
index_template in index views now have access to soupault config via soupault_config environment variables.
Value.repr().
Table.get_nested_value(table, {"key", "sub-key"}) and Table.get_nested_value_default(table, {"key", "sub-key"}, default).
Sys.join_url alias for Sys.join_path_unix.
String.url_encode(str) and String.url_decode(str).
--dump-index-json <FILE> option allows the user to produce a metadata dump without modifying soupault.toml.
soupault --debug to get an exception trace and report a bug.
When an index view had an index_template option and a file or lua_source option whose value was bad (unreadable file path in file or a non-string value in lua_source), soupault would silently ignore the Lua code loading error and use index_template. Now it correctly throws an error in that case, since when file or lua_source is present, index_template is supposed to be treated as a custom option passed to the Lua index processor.
Published by dmbaturin over 2 years ago
HTML.clone_page function (first introduced in 4.0.0-beta1, not present in any stable release) was renamed to HTML.clone_document for consistency with other function names.
include_subsections option is now supported by all widgets.
page_file variable is now available to all hooks.
action option that controls how the content is inserted (append_child, prepend_child, insert_before...).
settings.process_pages_first option that tells soupault to process certain pages before all others.
Published by dmbaturin over 2 years ago
Soupault 4.0.0 release will bring multiple new features, including:
Documentation and examples for the new features will be provided later. The goal of 4.0.0-beta1 is to make sure that old functionality is still working as before for everyone, save for a minor breaking change.
Index view option index_item_template does not have a default value anymore.
If you have an index view without any of index_template, index_item_template, or index_processor and it's working fine for you, you need to add its original default value to your view explicitly to make it work like before.
[index.views.some_view]
index_item_template = '''<div> <a href="{{url}}">{{title}}</a> </div>'''
It's now possible to mark certain index fields as required. If a required field is not present in a page,
soupault will display an error and stop.
Example:
[index.fields.title]
selector = ["h1#post-title", "h1"]
required = true
The first big feature in this release is the long-promised system of page processing hooks.
As of this release, there are the following hooks: pre-parse, pre-process, post-index, render, and save.
pre-parse: operates on the page text before it's parsed, must place the modified page source in the page_source variable.
pre-process: operates on the page element tree just after parsing, may modify the page variable and set target_dir and target_file variables.
post-index: operates on the page element tree after index data extraction, can add more fields and override fields in the index_entry variable.
render: takes over the rendering process, must put rendered page text in the page_source variable.
save: takes over the page output process.
For example, this is how you can do global variable substitution with a pre-parse hook:
[hooks.pre-parse]
lua_source = '''
soupault_release = soupault_config["custom_options"]["latest_soupault_version"]
Log.debug("running pre-parse hook")
page_source = Regex.replace_all(page_source, "\\$SOUPAULT_RELEASE\\$", soupault_release)
'''
It's now possible to write index generators in Lua. Lua code can be given inline inside an index view using the lua_source option or loaded from an external file (using the file option).
For example, this is a reimplementation of the built-in index_template option in Lua:
[index.views.blog]
index_selector = "div#blog-index"
index_template = """
{% for e in entries %}
<div class="entry">
<a href="{{e.url}}">{{e.title}}</a> (<time>{{e.date}}</time>)
</div>
{% endfor %}
"""
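lua_source = """
-- Pass the site index to the template under the name used in its for-loop
env = {}
env["entries"] = site_index
-- Render the template given in the index_template option and parse the result as HTML
rendered_entries = HTML.parse(String.render_template(config["index_template"], env))
-- Insert the rendered entries into the element selected by index_selector
container = HTML.select_one(page, config["index_selector"])
HTML.append_child(container, rendered_entries)
"""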
To support passing custom options to Lua index processors, index views now allow arbitrary options, like widget config sections.
The most important advantage of Lua index processors is that they can generate new pages, so it's now possible to generate taxonomies and paginated indices in a single soupault run and without any external tools.
There's a new index.index_first option.
[index]
index_first = true
When set to true, that option will make soupault perform a reduced first pass where it does the bare minimum of work required to produce the site index. It will read pages and run widgets specified in index.extract_after_widgets, but will not finish processing any pages and will not write them to disk.
Then it will perform a second pass to actually render the website. Every plugin running on every page can access that page's index entry via a new index_entry variable. This way you can avoid having to store index data externally and run soupault twice, even though a certain amount of work is still done twice behind the scenes.
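A minimal sketch of how a plugin might read index_entry when index_first is enabled. The plugin and widget names are hypothetical. The url field is used because these notes mention it as an index field, and the nil check covers cases where no index entry is available.

```toml
[index]
index_first = true

[plugins.show-index-entry]
lua_source = '''
-- index_entry is nil unless index.index_first = true.
if index_entry then
  Log.debug("index entry url: " .. index_entry["url"])
end
'''

# The plugin runs on pages once a widget refers to it.
[widgets.show-index-entry]
widget = "show-index-entry"
```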
Lua plugin code can now be inlined in soupault.toml:
[plugins.test-plugin]
lua_source = '''
Log.debug("Test plugin!")
Plugin.exit("Test plugin executed successfully")
'''
target_file (path to the output file, relative to the current working directory).
index_entry (the index entry of the page being processed if index.index_first = true, otherwise it's nil).
String.slugify_soft(string) replaces all whitespace with hyphens, but doesn't touch any other characters.
HTML.to_string(etree) and HTML.pretty_print(etree) return string representations of element trees, useful for save hooks.
HTML.create_document() creates an empty element tree.
HTML.clone_page(etree) makes a copy of a complete element tree.
HTML.append_root(etree, node) adds a node after the last element.
HTML.child_count(elem) returns the number of children of an element.
HTML.unwrap(elem) yanks the child elements out of a parent and inserts them in its former place.
Table.take(table, limit) removes up to limit items from a table and returns them.
Table.chunks(table, size) splits a table into chunks of up to size items.
Table.has_value(table, value) returns true if value is present in table.
Table.apply_to_values(func, table) applies function func to every value in table (a simpler version of Table.apply if you don't care about keys).
Table.keys(table) returns a list of all keys in table.
Sys.list_dir(path) returns a list of all files in path.
String.length is now Unicode-aware; the old implementation is still available as String.length_ascii.
String.truncate is now Unicode-aware; the old implementation is still available as String.truncate_ascii.
numeric index entry sorting method works correctly again.
index.sort_by is not set, entries are now sorted by their url field rather than displayed in arbitrary order.
Sys.list_dir correctly handles errors when the argument is not a directory.
running preprocessor "cmark --unsafe --smart"...).
Published by dmbaturin almost 3 years ago
New features:
persistent_data table allows sharing data between different runs of the same Lua plugin.
HTML.matches_selector, HTML.matches_any_of_selectors.
settings.soupault_version option allows specifying the minimum supported version; trying to build with an older version will result in an error message.
Bug fixes:
Published by dmbaturin about 3 years ago
Release announcement: https://soupault.app/blog/soupault-3.1.0-release/
Changes:
ignore_heading_selectors option.
Table.iter_ordered and Table.iter_values_ordered
Sys.join_path_unix, Sys.basename_unix, Sys.dirname_unix
after = ["foo", "foo", "bar"] no longer causes a false positive dependency cycle detection.
Published by dmbaturin over 3 years ago
Release announcement: https://soupault.app/blog/soupault-3.0.0-release/
Highlights:
NO_COLOR).
title widget now uses head title selector to avoid issues with <title> elements in inline SVG.
Release announcement: https://soupault.app/blog/soupault-2.8.0-release
soupault --show-default-config option for displaying the default config (as it would be generated by soupault --init).
soupault --show-effective-config option for displaying the effective config (user-defined values from soupault.conf plus default values).
[settings] are available to Lua plugins.
The include_subsections option in index views works correctly now (#35, report by toastal).
soupault --init now includes all options, including pretty_print_html, plugin_discovery, and plugin_dirs (report by Crystal-RainSlide).