The CLI: quartobot
A Python CLI for pre-render and out-of-render work. quartobot resolve
runs as a Quarto pre-render hook and calls manubot.cite directly to
populate the bibliography before pandoc starts. scan, validate,
init, and use round out the surface for CI-lint and scaffolding.
    uv tool install git+https://github.com/quartobot/quartobot

quartobot depends on manubot as a Python library. See
Install for uvx, editable, and post-v0.1-tag
pip install paths.
Pre-render commands
quartobot scan
Walks .qmd, .md, .Rmd, and .ipynb files under a path, extracts
every cite key, classifies each one (manubot prefix, bare DOI, or
hand-curated), groups the results, and reports repetition counts and
cross-file duplicates with file:line locations. Pure read. No network.
Pure reporter, too — scan always exits 0 once it finishes; gating
lives in validate.
    $ quartobot scan .
    arxiv:
      2104.10729 (2x)
    doi:
      10.1038/s41586-024-12345
      10.1371/journal.pcbi.1007128 (3x)
    pmid:
      31479462
    (hand-curated):
      quarto2024

    5 unique key(s), 7 total occurrence(s) across 3 file(s).

    Duplicates:
      @doi:10.1371/journal.pcbi.1007128:
        intro.qmd:14
        methods.qmd:42
        notebook.ipynb:cell3:9

Prefixes are listed alphabetically; hand-curated keys appear last.
The scan is heuristic — it strips YAML/TOML frontmatter, fenced code
blocks (``` / ~~~), and inline code spans before searching, so
decoys like @fake:notacite inside backticks won’t surface. For
.ipynb files, only markdown cells are scanned; cell index appears
alongside line number (paper.ipynb:cell3:9). The authoritative
parse happens at render time inside pandoc citeproc.
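The strip-then-search heuristic described above can be sketched in a few lines. This is an illustration under stated assumptions, not quartobot's actual implementation: the regular expressions, the helper `_blank`, and the function `find_cite_keys` are all hypothetical names chosen for this example.

```python
import re

# Hypothetical patterns for illustration; quartobot's real rules may differ.
FRONTMATTER = re.compile(r"\A(---|\+\+\+)\n.*?\n\1[ \t]*\n", re.DOTALL)
FENCED = re.compile(r"^(```|~~~).*?^\1[ \t]*$", re.DOTALL | re.MULTILINE)
INLINE_CODE = re.compile(r"`[^`\n]*`")
CITE_KEY = re.compile(r"(?<![\w@])@([A-Za-z0-9_][\w:./-]*)")


def _blank(match):
    """Replace matched text with spaces but keep newlines, so line numbers survive."""
    return re.sub(r"[^\n]", " ", match.group(0))


def find_cite_keys(text):
    """Return (line_number, key) pairs found outside frontmatter and code."""
    for pattern in (FRONTMATTER, FENCED, INLINE_CODE):
        text = pattern.sub(_blank, text)
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for m in CITE_KEY.finditer(line):
            # Trailing sentence punctuation is not part of the key.
            hits.append((lineno, m.group(1).rstrip(".,;:")))
    return hits
```

Blanking the stripped regions instead of deleting them keeps the reported line numbers aligned with the original file, which is what makes file:line locations possible.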
Pass --no-recursive to scan only files directly under the given
path. Render outputs and tool caches (_site/, _book/, _freeze/,
.quarto/, .git/, .ipynb_checkpoints/, etc.) are skipped at any
depth.
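A minimal sketch of the file discovery described above, assuming a plain directory walk. The suffixes and directory names come from the text; the function name `iter_source_files` and its `recursive` parameter are hypothetical, and the real skip list is longer (the "etc." above).

```python
from pathlib import Path

# Suffixes and directory names are taken from the text; everything else is illustrative.
SCAN_SUFFIXES = {".qmd", ".md", ".Rmd", ".ipynb"}
SKIP_DIRS = {"_site", "_book", "_freeze", ".quarto", ".git", ".ipynb_checkpoints"}


def iter_source_files(root, recursive=True):
    """Yield scannable files under root, skipping render outputs and caches."""
    root = Path(root)
    candidates = root.rglob("*") if recursive else root.glob("*")
    for path in sorted(candidates):
        if not path.is_file() or path.suffix not in SCAN_SUFFIXES:
            continue
        # Skip the file if any directory between root and the file is excluded.
        if SKIP_DIRS.intersection(path.relative_to(root).parts[:-1]):
            continue
        yield path
```

Calling `iter_source_files(".", recursive=False)` corresponds to the `--no-recursive` behavior: only files directly under the given path are considered.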
Exit codes:
- 0: scan completed. Always. Repeated keys show up in the listing ((Nx) next to the identifier, plus a “Duplicates:” section when a key crosses file boundaries) but they don’t gate the exit.
- 2: bad arguments.
Wire validate into pre-commit / CI when you want a gate.
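The classify, group, and report steps of scan can be sketched as follows. This is a hedged illustration: `classify`, `summarize`, the `(file, line, key)` input shape, and the "bare DOI" label are assumptions made for the example, and `KNOWN_PREFIXES` lists only the prefixes that appear in the sample output above.

```python
import re
from collections import Counter, defaultdict

# Assumption: only the prefixes shown in the sample output; the real list is longer.
KNOWN_PREFIXES = ("arxiv", "doi", "pmid")
BARE_DOI = re.compile(r"^10\.\d{4,9}/\S+$")


def classify(key):
    """Return a group label for one cite key."""
    prefix, sep, _rest = key.partition(":")
    if sep and prefix in KNOWN_PREFIXES:
        return prefix
    if BARE_DOI.match(key):
        return "bare DOI"
    return "(hand-curated)"


def summarize(hits):
    """hits: (file, line, key) tuples. Returns (counts per group, cross-file duplicates)."""
    groups = defaultdict(Counter)
    locations = defaultdict(list)
    for file, line, key in hits:
        groups[classify(key)][key] += 1
        locations[key].append((file, line))
    # A key is a cross-file duplicate only if it appears in more than one file.
    cross_file = {
        key: [f"{file}:{line}" for file, line in locs]
        for key, locs in locations.items()
        if len({file for file, _ in locs}) > 1
    }
    return dict(groups), cross_file
```

Nothing in this sketch influences an exit status, which mirrors the split above: scan reports, validate gates.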
quartobot resolve
Pre-fetch persistent-identifier citations via manubot.cite and write
the resulting CSL JSON to disk. Designed to run as a Quarto
pre-render hook declared in _quarto.yml:
    project:
      pre-render: quartobot resolve --from-scan . --output references.json --id-mode citation-key

    $ quartobot resolve --from-scan . --output references.json
      ✓ doi:10.1371/journal.pcbi.1007128 → YuJbg3zO
      ✓ pmid:31479462 → r3UbYxrJ
      ✓ arxiv:2104.10729 → OCxCvqZo (cached)

    3 resolved (1 from cache). Wrote 3 entries to references.json.

Pass keys as arguments (quartobot resolve doi:10.x/y pmid:12345) or
use --from-scan PATH to resolve every persistent-identifier key in a
project. Hand-curated keys (no recognized prefix) are skipped — those
live in references.bib and pandoc citeproc handles them.
--id-mode citation-key writes the CSL id field as the user’s
prose key (doi:10.1371/...) so pandoc-citeproc matches [@doi:...]
in the source directly. Without it, manubot’s canonical short hash
(YuJbg3zO) goes in id and pandoc-citeproc silently fails to match
prose keys. The pre-render hook architecture depends on this flag.
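To see why the flag matters, here is a small sketch of the id rewrite and of the matching problem it avoids. The functions `set_csl_ids` and `unmatched` are hypothetical, and using `id_mode=None` to stand for "flag omitted" is an assumption of this example rather than quartobot's interface.

```python
def set_csl_ids(resolved, id_mode="citation-key"):
    """resolved: list of (prose_key, csl_item) pairs. Returns new CSL items."""
    items = []
    for prose_key, item in resolved:
        item = dict(item)  # shallow copy so the caller's data is not mutated
        if id_mode == "citation-key":
            item["id"] = prose_key  # id now equals what the author typed after '@'
        items.append(item)
    return items


def unmatched(cited_keys, csl_items):
    """Return cited keys that have no CSL entry with the same id."""
    ids = {item["id"] for item in csl_items}
    return sorted(set(cited_keys) - ids)


resolved = [(
    "doi:10.1371/journal.pcbi.1007128",
    {"id": "YuJbg3zO", "title": "Open collaborative writing with Manubot"},
)]
cited = ["doi:10.1371/journal.pcbi.1007128"]

print(unmatched(cited, set_csl_ids(resolved)))                # nothing left unmatched
print(unmatched(cited, set_csl_ids(resolved, id_mode=None)))  # the prose key is unmatched
```

With the short hash left in `id`, the cited key has no entry to match, which is the silent failure described above.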
The --cache option defaults to --output, so re-runs are idempotent:
the output file IS the cache. --dry-run reports what would be
resolved without making any network calls.
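The "output file is the cache" idea can be sketched as below. This assumes entries are keyed by their `id` (as with `--id-mode citation-key`); `resolve_with_cache` and its `fetch` callback are hypothetical stand-ins, with `fetch` taking the place of the network lookup.

```python
import json
from pathlib import Path


def resolve_with_cache(keys, output, fetch, dry_run=False):
    """Resolve keys into `output`, reusing entries already stored there.

    `fetch` is a stand-in for the network lookup: it takes a key and
    returns a CSL item as a dict. Returns the keys that were (or would be) fetched.
    """
    path = Path(output)
    cached = {}
    if path.exists():
        loaded = json.loads(path.read_text(encoding="utf-8"))
        cached = {item["id"]: item for item in loaded}
    # dict.fromkeys de-duplicates while preserving order.
    missing = [key for key in dict.fromkeys(keys) if key not in cached]
    if dry_run:
        return missing  # report only: no fetch, no write
    for key in missing:
        item = dict(fetch(key))
        item["id"] = key
        cached[key] = item
    path.write_text(json.dumps(list(cached.values()), indent=2), encoding="utf-8")
    return missing
```

A second call with the same keys finds everything in the file, fetches nothing, and writes back identical content, which is the idempotence the text describes.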
Pass --output - to stream the CSL JSON to stdout instead of a file —
the one-shot lookup shape for shell-tool agents and scripts that pipe
through jq:
    $ quartobot resolve --output - doi:10.1371/journal.pcbi.1007128 | jq '.[0].title'
    "Open collaborative writing with Manubot"

In stdout mode the summary line goes to stderr and no cache write
happens. Cache reads still work when --cache <path> is set
explicitly.
Exit codes:
- 0: every key resolved (cache hits count as success).
- 1: one or more keys failed (network error, Crossref 404, etc.).
- 2: bad arguments.
quartobot validate
Pre-flight / CI-lint surface. Static config checks against a Quarto project, with no network access. Run this in CI to catch the most common foot-guns before they reach a render.
    $ quartobot validate .
      ✓ _quarto.yml exists
      ✓ bibliography declared — 2 file(s): references.bib, references.json
      ✗ pre-render hook — `quartobot resolve` is invoked but `--id-mode citation-key` is missing. Without it, CSL `id`s are manubot's short hashes (`YuJbg3zO`), not the prose keys (`doi:10.1371/...`), and pandoc-citeproc silently fails to match any cites.
      ✓ references.json in bibliography — `references.json` listed in `bibliography:`
      ✓ no duplicate cite keys — 5 unique key(s) in 3 file(s)

    1 of 5 check(s) failed. Exit 1.

Checks run:
- _quarto.yml exists and parses as YAML.
- bibliography: is declared (as a string or list).
- project.pre-render calls quartobot resolve with --id-mode citation-key. The flag is load-bearing: without it, manubot’s canonical short hashes replace the user’s prose keys and pandoc-citeproc silently fails to match anything.
- references.json appears in bibliography:. This is the most common silent failure under the pre-render hook architecture, since without it pandoc citeproc never reads what quartobot resolve wrote.
- No cite key appears in more than one file. Same-key-twice in the same file is the normal academic-writing case (one source, several claims) and is not flagged. The check is intentionally narrow: cross-file duplication is the case the chunked-content pattern can produce by accident; same-file repetition is intent.
Citation-resolution checks (“does this DOI actually resolve at
Crossref?”) are out of scope here — they need network. Run
quartobot resolve --dry-run --from-scan . separately for that.
Exit codes: 0 if every check passes, 1 on any failure.
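Three of the static checks listed above can be sketched against an already-parsed `_quarto.yml`. This is an illustration, not the tool's code: `passes_id_mode`, `check_config`, and `exit_code` are hypothetical names, the config is passed in as a plain dict to avoid assuming a YAML library, and accepting the `--id-mode=citation-key` spelling is an assumption.

```python
import shlex


def passes_id_mode(command):
    """True if a `quartobot resolve` command line carries --id-mode citation-key."""
    tokens = shlex.split(command)
    if tokens[:2] != ["quartobot", "resolve"]:
        return False
    if "--id-mode=citation-key" in tokens:  # assumed alternative spelling
        return True
    return any(a == "--id-mode" and b == "citation-key"
               for a, b in zip(tokens, tokens[1:]))


def check_config(config):
    """config: parsed _quarto.yml as a dict. Returns a list of (ok, label) pairs."""
    bib = config.get("bibliography")
    bib_files = [bib] if isinstance(bib, str) else list(bib or [])

    hooks = (config.get("project") or {}).get("pre-render") or []
    if isinstance(hooks, str):  # pre-render may be a single string or a list
        hooks = [hooks]

    return [
        (bool(bib_files), "bibliography declared"),
        (any(passes_id_mode(h) for h in hooks), "pre-render hook"),
        ("references.json" in bib_files, "references.json in bibliography"),
    ]


def exit_code(results):
    """0 if every check passes, 1 on any failure."""
    return 0 if all(ok for ok, _ in results) else 1
```

Because every check reads only the config, the whole pass needs no network, which is why it is cheap enough to run on every commit.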
Scaffolding commands
quartobot init
Scaffold the citation pipeline into an existing (or empty) Quarto project:
    $ quartobot init
    Project type: manuscript

      + _quarto.yml [written]
      + references.bib [written]
      ~ .gitignore [appended] — added 7 line(s)

    Next steps:
      1. Confirm `quartobot` is on PATH: `quartobot --version`
         (install with `uv tool install git+https://github.com/quartobot/quartobot`)
      2. Add citations to your prose: @doi:..., @pmid:..., etc.
      3. quarto render

    To add the version banner + GitHub Actions CI, run `quartobot use github-ci` after this.

init writes only what the citation pipeline needs: _quarto.yml
wired with the quartobot resolve pre-render hook and a
bibliography: list, a seed references.bib, and a .gitignore
augment so references.json (regenerated each render) stays out of
the repo. Three files, nothing else.
Conservative — never overwrites existing files. If _quarto.yml
already exists, prints a YAML snippet to merge in manually instead of
touching it. .gitignore is the one file modified in place
(idempotent, appends only).
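The two write policies described above, never overwrite and append-only with idempotence, can be sketched like this. Both helpers are hypothetical, and the returned status strings simply echo the labels the CLI prints.

```python
from pathlib import Path


def write_if_absent(path, content):
    """Create path with content unless it already exists. Returns a status string."""
    path = Path(path)
    if path.exists():
        return "skipped-exists"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(content, encoding="utf-8")
    return "written"


def append_missing_lines(path, lines):
    """Append only the lines not already present. Returns how many were added."""
    path = Path(path)
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    present = {line.strip() for line in text.splitlines()}
    to_add = [line for line in lines if line.strip() not in present]
    if to_add:
        # Start on a fresh line if the file does not end with a newline.
        prefix = "" if text == "" or text.endswith("\n") else "\n"
        with path.open("a", encoding="utf-8") as fh:
            fh.write(prefix + "\n".join(to_add) + "\n")
    return len(to_add)
```

Running either helper a second time changes nothing on disk, so re-running the scaffold is safe.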
--project-type {auto,manuscript,book} controls what gets written;
auto detects from _quarto.yml, falling back to manuscript.
quartobot use github-ci
Scaffold the GitHub Actions render workflow, version banner, and
PR-preview cleanup — the manuscript-as-software CI machinery that used
to ride along with init. Opt-in, idempotent, scoped to one job.
    $ quartobot use github-ci
    Project type: manuscript

      + _version-banner.html.template [written]
      + _version-banner.html [written]
      + .github/workflows/render.yml [written]
      + .github/workflows/pr-closed.yml [written]

    # Add to your existing _quarto.yml so the version banner renders at
    # the top of the HTML output. PDF/DOCX outputs skip the include
    # automatically.

    format:
      html:
        include-before-body:
          - _version-banner.html

    Next steps:
      1. Commit the new files and push to GitHub.
      2. The render workflow fires on push to main and on PRs.
      3. After the first push, CI swaps the dev banner for a per-commit permalink + 'latest' link.

When _quarto.yml already declares the banner include, the
manual-merge snippet is suppressed. Re-running is safe: files
already on disk are left alone and report as skipped-exists.
use is a click group, designed to grow. github-ci is the first
inhabitant; future siblings (use jupyter-notebooks, use pre-commit,
use mcp, use joss-paper) are scoped but not yet shipped. The
naming convention follows R’s usethis package: one verb (use),
one role per subcommand.
Philosophy
The CLI calls manubot.cite (the resolver library) directly from a
Quarto pre-render hook and lets pandoc citeproc (the renderer) consume
the resulting CSL JSON. Every command is either pre-render (do work
ahead of quarto render so the render itself is faster and more
reliable) or out-of-render (init, scan, validate — work that
doesn’t touch render at all).
Opaque-by-default for the CI surface: a consumer’s .github/workflows/render.yml
is a thin caller pointing at the upstream reusable workflow.
quartobot detach (when it ships) is the escape hatch for consumers who want to
fork the pipeline. This is the opposite of r-lib/actions, which copies 150
lines into every consumer repo. quartobot’s default is friendlier; the
escape hatch matches their model for users who want it.