Skip to content

How to use quartobot in a Jupyter notebook manuscript

Quarto renders .ipynb natively. You can author a manuscript in JupyterLab, run analysis in code cells, and have the same @doi: citation flow as a .qmd manuscript. This page covers the bits specific to the notebook shape — where cite keys belong, how the pre-render hook reads them, and what quartobot scan reports.

Markdown cells only. A cite key like @doi:10.1371/journal.pcbi.1007128 inside a markdown cell renders exactly the way it does in a .qmd file. Cite keys appearing in code-cell strings or comments are ignored — by both Quarto’s renderer and quartobot scan. Strings in code aren’t prose, so they aren’t citations.

A two-cell paper.ipynb looks like this in JSON:

{
"cells": [
{
"cell_type": "markdown",
"source": [
"## Introduction\n",
"\n",
"The manubot pattern [@doi:10.1371/journal.pcbi.1007128]\n",
"treats a paper as software.\n"
]
},
{
"cell_type": "code",
"source": [
"import pandas as pd\n",
"# @doi:10.5555/ignored — comment, not a citation\n"
]
}
]
}

The first cell holds prose with one DOI citation. The second cell is analysis code. scan reads the first cell and ignores the second, including the DOI-looking string in the comment.

Same shape as a .qmd manuscript. Create the project, swap the entry point to a notebook (or point at an existing .ipynb), then add the citation pipeline:

Terminal window
quarto create project manuscript my-paper
cd my-paper
rm index.qmd
jupyter lab paper.ipynb # author your notebook
quartobot init

Because quarto create project already wrote _quarto.yml, init won’t touch it — it prints a YAML snippet to paste in. Copy that snippet (the pre-render: line plus the bibliography: list) into the existing _quarto.yml under the existing project: block. The hook then reads .ipynb markdown cells the same way it reads .qmd body text — same scanner, same cite-key extraction, same resolver underneath. The first-manuscript tutorial covers the full sequence end-to-end; it works identically with notebook source.

The relevant _quarto.yml lines after init:

project:
type: manuscript
pre-render: quartobot resolve --from-scan . --output references.json --id-mode citation-key
bibliography:
- references.bib
- references.json

Run quarto render. The hook resolves every cite key in every markdown cell, writes CSL JSON to references.json, and pandoc-citeproc formats the citations in the rendered output.

On the two-cell paper.ipynb from above:

$ quartobot scan paper.ipynb
doi:
10.1371/journal.pcbi.1007128
1 unique key(s), 1 total occurrence(s) across 1 file(s).

On a larger project with duplicates across files, the locations show which cell a cite key came from:

$ quartobot scan .
doi:
10.1371/journal.pcbi.1007128 (2x)
pmid:
31479462
2 unique key(s), 3 total occurrence(s) across 2 file(s).
Duplicates:
@doi:10.1371/journal.pcbi.1007128:
paper.ipynb:cell3:5
methods.qmd:14

The cellN index is 1-based and matches the order cells appear in the notebook JSON — the same order JupyterLab shows in the cell sidebar. :5 is the 1-based line within that cell. Cell 3 line 5 in JupyterLab is the same cell 3 line 5 that scan reports.

quartobot scan . walks the project recursively; quartobot scan paper.ipynb scans one file directly. Both work the same way on notebooks.

scan skips .ipynb_checkpoints/ at any depth — JupyterLab’s autosave directory is never useful to walk.

It also skips *.quarto_ipynb files. These are notebook copies Quarto generates during render from .qmd source; they’re staged artifacts, not authoring source, and quartobot init adds them to .gitignore. scan ignoring them keeps duplicate counts honest when a project mixes .qmd and .ipynb inputs.