Typst_of_jupyter: Better notebook PDFs

Thu, Jul 27, 2023 tags: [ typst programming ocaml ]

If you have followed my posts (unlikely, given the visitor counts of my website), you have seen one or two posts about Typst, the new text-based typesetting system. I’ve already shown in a previous post that Typst can be very useful as an intermediate language for generating PDFs.

Disappointed by the various ways of exporting Jupyter notebooks – printing from the browser, converting to LaTeX, pandoc, etc. – I was looking for an approach that allows for archiving and sharing high-quality versions of Jupyter notebooks, preferably as PDF files. The idea to use Typst for generating such PDFs was obvious!

I started a prototype in Rust (github: dermesser/jupyter2typst), but quickly got tired of wrangling the notebook format, which is just a large JSON object, in Rust. I used tinyjson as the JSON library, and even though it is more flexible than serde_json, it just wasn’t very fun. And fun is what I’m looking for in projects for my free time.

But by then I had dipped my toes (and at least the lower legs too) into OCaml, which seemed like a handy language for implementing such a basic JSON-wrangling format converter. And indeed, within one day on a train I had written more functionality than I probably could have written in Rust in a whole week of coding. That’s about 750 lines of formatted OCaml, which you can find for the time being on my Mercurial server at borgac.net/lbo/hg/typst_of_jupyter. You can clone it from that URL using hg clone.

The funny name, typst_of_jupyter, stems from OCaml’s convention for naming conversion functions, like Int.of_string or String.t_of_sexp. And although it’s not finished as of writing this, it already provides useful output. I wrote all of the code in VS Code using the OCaml plugin. (I also have the ocamllsp plugin for lspconfig in my neovim configuration, but it’s just not at the same level, unfortunately. Vim nerds, please do not send me your helpful advice.)
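For readers who don't know OCaml, here is a small illustration of that target_of_source naming pattern, using only functions from the standard library. The last function, typst_of_heading, is a made-up example in the same style and is not taken from the project.

```ocaml
(* Standard-library conversions that follow the "target_of_source" pattern. *)
let n : int = int_of_string "42"
let s : string = string_of_int n

(* The module-qualified variant of the same idea: Module.of_source. *)
let x : float = Float.of_int n
let chars : char list = List.of_seq (String.to_seq "abc")

(* Hypothetical converter in the same style (not from typst_of_jupyter):
   builds a Typst heading, which is written as one '=' per level. *)
let typst_of_heading ~level title = String.make level '=' ^ " " ^ title
```

Read right to left, the name tells you the direction of the conversion: typst_of_jupyter takes a Jupyter notebook and gives you Typst.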

Basic components

As mentioned, a Jupyter notebook file consists of a JSON object with various metadata entries and a list of all the cells in your notebook. Each cell either contains markdown or code; in the former case it may also contain attachments, in the latter it will contain some output or errors as well. The format is described in detail (and incorrectly!) on the nbformat homepage.
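To make that structure more concrete, here is a minimal sketch of how such a notebook could be modelled in OCaml and turned into Typst markup. Everything in it is an assumption for illustration: the type and function names are hypothetical and not the ones used in typst_of_jupyter, the model is heavily simplified (only text outputs and errors, a single language field instead of the full metadata), and Markdown cells are passed through unchanged, which a real converter would not do. The only Typst feature it relies on is the built-in raw function with its block and lang arguments.

```ocaml
(* Hypothetical, simplified model of a notebook file. These types are
   illustrative only and are not taken from typst_of_jupyter. *)
type output =
  | Stream_output of string                              (* captured text output *)
  | Error_output of { ename : string; evalue : string }  (* exception name and message *)

type cell =
  | Markdown of { source : string; attachments : (string * string) list }
  | Code of { source : string; outputs : output list }

type notebook = { language : string; cells : cell list }

(* Escape text so it can sit inside a double-quoted Typst string literal. *)
let escape_typst_string s =
  let buf = Buffer.create (String.length s) in
  String.iter
    (function
      | '\\' -> Buffer.add_string buf "\\\\"
      | '"' -> Buffer.add_string buf "\\\""
      | '\n' -> Buffer.add_string buf "\\n"
      | c -> Buffer.add_char buf c)
    s;
  Buffer.contents buf

(* Outputs and errors are both rendered as plain raw blocks. *)
let typst_of_output = function
  | Stream_output text ->
      Printf.sprintf "#raw(\"%s\", block: true)" (escape_typst_string text)
  | Error_output { ename; evalue } ->
      Printf.sprintf "#raw(\"%s\", block: true)"
        (escape_typst_string (ename ^ ": " ^ evalue))

let typst_of_cell ~language = function
  (* Assumption: Markdown is emitted verbatim; real code would translate it. *)
  | Markdown { source; _ } -> source
  | Code { source; outputs } ->
      let code =
        Printf.sprintf "#raw(\"%s\", block: true, lang: \"%s\")"
          (escape_typst_string source)
          (escape_typst_string language)
      in
      String.concat "\n" (code :: List.map typst_of_output outputs)

(* Cells are separated by blank lines so each becomes its own paragraph. *)
let typst_of_notebook nb =
  String.concat "\n\n" (List.map (typst_of_cell ~language:(nb.language)) nb.cells)
```

The variant types mirror the "either markdown or code" distinction directly, so the compiler complains if a cell kind is forgotten in one of the conversion functions. That kind of check is a large part of why this sort of format conversion feels pleasant in OCaml.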

In order to generate Typst markup, three major steps are necessary:

The resulting Typst file can then be compiled like any Typst document. The (preliminary) result without much refinement in terms of design looks like the following:

[Figure: Sample document]

The next big step is supporting custom styles, using a templating engine, and cleaning up the conversion code a bit. At that point, I’d say the tool is already quite useful and produces nicer results than the LaTeX exporter, in my opinion. It also doesn’t require a texlive install or such.