Typst_of_jupyter: Better notebook PDFs
If you have followed my posts (unlikely, given the visitor counts of my website), you have seen one or two posts about Typst, the new text-based typesetting system. I’ve already shown in a previous post that Typst can be very useful as an intermediate language for generating PDFs.
Disappointed by the various ways of exporting Jupyter notebooks – printing from the browser, converting to LaTeX, pandoc, etc. – I was looking for an approach that allows archiving and sharing high-quality versions of Jupyter notebooks, preferably as PDF files. The idea of using Typst to generate such PDFs was obvious!
I started a prototype in Rust (github: dermesser/jupyter2typst), but quickly got tired of wrangling the notebook format, which is just a large JSON object, in Rust. I used tinyjson as the JSON library, and even though it is more flexible than serde_json, it just wasn’t very fun. And fun is what I’m looking for in projects for my free time.
But by now I had dipped my toes (and at least my lower legs, too) into OCaml, which seemed like a handy language for implementing such a basic JSON-wrangling format converter. And indeed, within one day on a train I had written more functionality than I probably could have written in Rust in a whole week of coding. That’s about 750 lines of formatted OCaml, which you can find for the time being on my Mercurial server at borgac.net/lbo/hg/typst_of_jupyter. You can clone it from that URL using hg clone.
The funny name, typst_of_jupyter, stems from OCaml’s convention for naming conversion functions, like Int.of_string or String.t_of_sexp. And although it’s not finished as of writing this, it already provides useful output. I wrote all of the code in VS Code using the OCaml plugin. (I also have the ocamllsp plugin for lspconfig in my neovim configuration, but it’s just not at the same level, unfortunately. Vim nerds, please do not send me your helpful advice.)
Basic components
As mentioned, a Jupyter notebook file consists of a JSON object with various metadata entries and a list of all the cells in your notebook. Each cell either contains markdown or code; in the former case it may also contain attachments, in the latter it will contain some output or errors as well. The format is described in detail (and incorrectly!) on the nbformat homepage.
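To make the structure concrete, here is a minimal sketch of how the cells could be modelled and classified in OCaml. It is not the code from typst_of_jupyter: the json type is a hand-rolled stand-in for a real JSON library, and the names cell, cell_of_json and cells_of_notebook are made up for illustration. The field names (cells, cell_type, source, outputs, attachments) are the ones used by nbformat version 4.

```ocaml
(* A tiny JSON value type, standing in for a real JSON library. *)
type json =
  | JNull
  | JBool of bool
  | JNumber of float
  | JString of string
  | JArray of json list
  | JObject of (string * json) list

(* Hypothetical cell model; not the types used by typst_of_jupyter. *)
type cell =
  | Markdown of { source : string; attachments : (string * json) list }
  | Code of { source : string; outputs : json list }
  | Other of string (* e.g. "raw" cells, kept only by name *)

(* Look up a key in a JSON object; None for missing keys or non-objects. *)
let member key = function
  | JObject fields -> List.assoc_opt key fields
  | _ -> None

(* nbformat stores "source" either as one string or as a list of lines. *)
let source_of_json = function
  | Some (JString s) -> s
  | Some (JArray lines) ->
      String.concat ""
        (List.filter_map (function JString s -> Some s | _ -> None) lines)
  | _ -> ""

(* Classify a single cell by its "cell_type" field. *)
let cell_of_json j =
  let source = source_of_json (member "source" j) in
  match member "cell_type" j with
  | Some (JString "markdown") ->
      let attachments =
        match member "attachments" j with Some (JObject a) -> a | _ -> []
      in
      Markdown { source; attachments }
  | Some (JString "code") ->
      let outputs =
        match member "outputs" j with Some (JArray o) -> o | _ -> []
      in
      Code { source; outputs }
  | Some (JString other) -> Other other
  | _ -> Other "unknown"

(* Extract all cells from the top-level notebook object. *)
let cells_of_notebook nb =
  match member "cells" nb with
  | Some (JArray cells) -> List.map cell_of_json cells
  | _ -> []

(* A hand-written two-cell notebook, used only as an example input. *)
let example_notebook =
  JObject
    [ ( "cells",
        JArray
          [ JObject
              [ ("cell_type", JString "markdown");
                ("source", JArray [ JString "# Title\n"; JString "Some text" ]) ];
            JObject
              [ ("cell_type", JString "code");
                ("source", JString "print(1)");
                ("outputs", JArray []) ] ] );
      ("nbformat", JNumber 4.) ]
```

Calling cells_of_notebook example_notebook yields one Markdown cell followed by one Code cell. Using a variant per cell type means the compiler will point out any cell kind that later code forgets to handle.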
In order to generate Typst markup, three major steps are necessary:
- Parse the JSON object and sort out the cells by type.
- For Markdown cells, parse the Markdown and convert it into Typst markup. This is considerably more complex than handling code cells. In addition, attachments need to be deserialized and stored in files where the Typst compiler can find them later.
- Generate the actual Typst markup, by rendering code cells in the desired style, and writing the converted markdown cells.
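The last two steps can be sketched roughly as follows. This is a deliberately naive illustration, not the converter from typst_of_jupyter: the cell type and all function names are hypothetical, only Markdown headings are translated (Typst writes them with = instead of #), and the default language tag of python is an assumption. A real converter needs a proper Markdown parser and must escape characters that are special in Typst; this sketch would also wrongly rewrite lines starting with # inside Markdown code blocks.

```ocaml
(* Hypothetical, simplified cell type: just the text of each cell. *)
type cell = Markdown of string | Code of string

(* Typst raw blocks are delimited by three backticks, as in Markdown. *)
let fence = String.make 3 '`'

(* Count the leading '#' characters of a line. *)
let heading_level line =
  let n = String.length line in
  let rec go i = if i < n && line.[i] = '#' then go (i + 1) else i in
  go 0

(* "## Title" becomes "== Title"; all other lines pass through unchanged. *)
let typst_of_markdown_line line =
  let level = heading_level line in
  if level > 0 && level < String.length line && line.[level] = ' ' then
    String.make level '=' ^ String.sub line level (String.length line - level)
  else line

(* Convert a whole Markdown cell line by line. *)
let typst_of_markdown source =
  String.split_on_char '\n' source
  |> List.map typst_of_markdown_line
  |> String.concat "\n"

(* Render one cell; code cells become a raw block tagged with [lang]. *)
let typst_of_cell ?(lang = "python") = function
  | Markdown source -> typst_of_markdown source
  | Code source -> String.concat "\n" [ fence ^ lang; source; fence ]

(* Render all cells, separated by blank lines. *)
let typst_of_cells cells =
  String.concat "\n\n" (List.map (fun c -> typst_of_cell c) cells) ^ "\n"

(* Print the Typst markup for a two-cell example to standard output. *)
let () = print_string (typst_of_cells [ Markdown "## Setup"; Code "x = 1" ])
```

The printed text can be saved to a .typ file and compiled with typst compile. Keeping the Markdown conversion separate from the cell rendering mirrors the split between the second and third step above.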
The resulting Typst file can then be compiled like any Typst document. The (preliminary) result without much refinement in terms of design looks like the following:
The next big step is supporting custom styles, using a templating engine, and cleaning up the conversion code a bit. Even so, I’d say the tool is already quite useful and, in my opinion, produces nicer results than the LaTeX exporter. It also doesn’t require a TeX Live installation or the like.