Jupyter Notebook Format
RelationalText implements Jupyter notebook (.ipynb) import and export. Features are stored under the org.jupyter.facet namespace. The Jupyter cell model maps directly to the RelationalText block model: each cell becomes one block.
Package: relational-text/jupyter
Namespace: org.jupyter.facet
Functions
import { from, to } from 'relational-text/registry'
from('jupyter', input: JupyterNotebook | string): Document
Parse a Jupyter notebook JSON object (or a raw JSON string) into a Document.
import * as fs from 'fs'
import type { JupyterNotebook } from 'relational-text/jupyter'
const notebook: JupyterNotebook = JSON.parse(fs.readFileSync('notebook.ipynb', 'utf8'))
const doc = from('jupyter', notebook)
// Or pass a raw JSON string directly:
const doc2 = from('jupyter', fs.readFileSync('notebook.ipynb', 'utf8'))
Each cell in notebook.cells becomes one block:
- cell_type: 'code' → code block with { language, id? } attrs
- cell_type: 'markdown' or 'raw' → markdown block with { id? } attrs
The default language is resolved from notebook.metadata.kernelspec.language, then notebook.metadata.language_info.name, defaulting to 'python' if neither is present.
Cell source may be a string or an array of strings; array sources are joined before storage.
Cell outputs and execution counts are not imported — only the source text is stored.
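The language lookup order and the source normalization described above can be sketched as two small helpers. This is an illustration only: resolveDefaultLanguage and joinSource are hypothetical names, not part of the package, and joining with an empty string is an assumption based on the nbformat convention that array sources already carry their own line endings.

```typescript
// Minimal local type for the sketch; the package exports a fuller JupyterNotebook type.
interface NotebookMetaSketch {
  kernelspec?: { language?: string }
  language_info?: { name?: string }
}

// Hypothetical helper: mirrors the documented order
// kernelspec.language, then language_info.name, then 'python'.
function resolveDefaultLanguage(metadata?: NotebookMetaSketch): string {
  return metadata?.kernelspec?.language ?? metadata?.language_info?.name ?? 'python'
}

// Hypothetical helper: a cell source is either one string or an array of
// line strings. Assumption: array entries keep their trailing newlines,
// so they are joined with an empty separator.
function joinSource(source: string | string[]): string {
  return Array.isArray(source) ? source.join('') : source
}

console.log(resolveDefaultLanguage({ language_info: { name: 'julia' } })) // julia
console.log(resolveDefaultLanguage(undefined)) // python
```

If the library joins array sources differently, only joinSource would need to change.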
to('jupyter', doc: Document, language?: string): JupyterNotebook
Render a Document to a Jupyter notebook JSON object.
const notebook = to('jupyter', doc, 'python')
// notebook.nbformat === 4
// notebook.cells === [...]
- Automatically applies any registered lenses targeting org.jupyter.facet via lensGraph.autoTransform()
- Documents from other formats convert automatically through the lens graph
- The optional language parameter sets the fallback kernel language when no language attr is found on a code block
- markdown blocks become { cell_type: 'markdown', source: [text], metadata: {} }
- code blocks become { cell_type: 'code', source: [text], metadata: {}, execution_count: null, outputs: [] }
- The notebook metadata.kernelspec is populated from the first code cell's language (or the language parameter)
- Non-block nodes in the HIR are skipped
The returned object always has nbformat: 4 and nbformat_minor: 5.
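As a rough illustration of the export rules above, the following sketch builds the same notebook shape from a simplified block list. SketchBlock, toCell, and sketchNotebook are hypothetical names; the real to('jupyter', ...) works on a Document and its lens graph, which this sketch does not model. Setting kernelspec.name to the language string is an assumption taken from the example output on this page.

```typescript
// Hypothetical, simplified stand-ins for document blocks.
interface MarkdownBlockSketch { type: 'markdown'; text: string; id?: string }
interface CodeBlockSketch { type: 'code'; text: string; language?: string; id?: string }
type SketchBlock = MarkdownBlockSketch | CodeBlockSketch

interface OutCellSketch {
  cell_type: 'markdown' | 'code'
  id?: string
  source: string[]
  metadata: Record<string, unknown>
  execution_count?: null
  outputs?: unknown[]
}

// One block becomes one cell; the whole text is a single source entry.
function toCell(block: SketchBlock): OutCellSketch {
  const cell: OutCellSketch = { cell_type: block.type, source: [block.text], metadata: {} }
  if (block.id !== undefined) cell.id = block.id
  if (block.type === 'code') {
    // Code cells are always exported unexecuted.
    cell.execution_count = null
    cell.outputs = []
  }
  return cell
}

function sketchNotebook(blocks: SketchBlock[], fallbackLanguage = 'python') {
  // Kernel language: first code block's language, else the fallback parameter.
  const firstCode = blocks.find((b): b is CodeBlockSketch => b.type === 'code')
  const language = firstCode?.language ?? fallbackLanguage
  const displayName = language.charAt(0).toUpperCase() + language.slice(1)
  return {
    nbformat: 4,
    nbformat_minor: 5,
    // Assumption: name mirrors the language string, as in the example output.
    metadata: { kernelspec: { language, display_name: displayName, name: language } },
    cells: blocks.map(toCell),
  }
}

const sketch = sketchNotebook([
  { type: 'markdown', text: '# Hello' },
  { type: 'code', text: 'x <- 1', language: 'r' },
])
console.log(sketch.metadata.kernelspec.display_name) // R
```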
ensureJupyterLexicon(): void
Explicitly register the Jupyter lexicon (org.jupyter.facet#* types) and its lens to the RT hub. Called automatically by from('jupyter', ...) and to('jupyter', ...) on first use. Safe to call multiple times — subsequent calls are no-ops.
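The "safe to call multiple times" behavior is the usual idempotent-registration pattern. The sketch below shows that pattern in isolation; registeredLexicons, registrationCount, and ensureLexiconSketch are hypothetical stand-ins and say nothing about how the library stores its registry internally.

```typescript
// Hypothetical stand-in for a hub registry, keyed by namespace.
const registeredLexicons = new Set<string>()

// Counts how often the (simulated) registration work actually runs.
let registrationCount = 0

function ensureLexiconSketch(namespace: string): void {
  // Already registered: later calls are no-ops.
  if (registeredLexicons.has(namespace)) return
  registeredLexicons.add(namespace)
  registrationCount += 1 // stands in for registering the lexicon and its lens
}

ensureLexiconSketch('org.jupyter.facet')
ensureLexiconSketch('org.jupyter.facet')
console.log(registrationCount) // 1
```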
Exported Types
export interface JupyterCell {
cell_type: 'markdown' | 'code' | 'raw'
id?: string
source: string | string[]
metadata?: Record<string, unknown>
execution_count?: number | null
outputs?: unknown[]
}
export interface JupyterNotebook {
nbformat?: number
nbformat_minor?: number
metadata?: {
kernelspec?: { language?: string; display_name?: string; name?: string }
language_info?: { name?: string }
[key: string]: unknown
}
cells: JupyterCell[]
}
Feature Mapping
Block Elements
The Jupyter lexicon contains exactly two feature types:
| Cell type | Feature name | Attrs |
|---|---|---|
| markdown / raw | markdown | { id?: string } |
| code | code | { language: string, id?: string } |
Inline markup within cell source is not parsed — cell content is stored as opaque text in the document. The implicitBlockType in the lexicon is markdown, so bare text without a block marker defaults to a markdown cell.
Lens to RelationalText Hub
The jupyter-to-relationaltext.lens.json lens is marked invertible: false. It maps:
| Jupyter | RelationalText |
|---|---|
| markdown | paragraph |
| code | code-block |
The language attr on code is preserved through the lens onto the RT code-block feature.
Examples
Import from file
import { from } from 'relational-text/registry'
import { readFileSync } from 'fs'
const raw = readFileSync('analysis.ipynb', 'utf8')
const doc = from('jupyter', raw)
console.log(doc.text)
// Text content of all cells concatenated, separated by block markers
Export to notebook
import { from, to } from 'relational-text/registry'
const doc = from('jupyter', {
nbformat: 4,
nbformat_minor: 5,
metadata: {
kernelspec: { language: 'python', display_name: 'Python', name: 'python' },
language_info: { name: 'python' },
},
cells: [
{ cell_type: 'markdown', source: '# Hello\n\nA description.' },
{ cell_type: 'code', source: 'print("hello")', execution_count: null, outputs: [] },
],
})
const notebook = to('jupyter', doc, 'python')
// {
// nbformat: 4,
// nbformat_minor: 5,
// metadata: { kernelspec: { language: 'python', display_name: 'Python', name: 'python' }, ... },
// cells: [
// { cell_type: 'markdown', source: ['# Hello\n\nA description.'], metadata: {} },
// { cell_type: 'code', source: ['print("hello")'], metadata: {}, execution_count: null, outputs: [] },
// ]
// }
Cross-Format Conversion
import { from, to } from 'relational-text/registry'
// CommonMark blocks become notebook cells:
// paragraphs and headings → markdown, fenced code → code
const doc = from('markdown', '# Analysis\n\nThis is a paragraph.\n\n```python\nx = 1\n```')
const notebook = to('jupyter', doc, 'python')
Notes
- Cell content is opaque: from('jupyter', ...) stores each cell's source text verbatim. Inline Markdown within a markdown cell (headings, bold, links, etc.) is not parsed into facets. Use from('markdown', cell.source) separately if you need to parse individual cell content.
- Cell ID preservation: when a cell has an id field (nbformat 4.5+), it is stored in the id attr on the block feature and written back to cell.id on export.
- Raw cells: cell_type: 'raw' is treated identically to 'markdown'; both become markdown blocks. The distinction is not preserved through the hub.
- Outputs and execution counts: cell outputs and execution_count values are not stored in the Document. Exported notebooks always have execution_count: null and outputs: [] on code cells.
- Kernel language detection: from('jupyter', ...) reads the kernel language from metadata.kernelspec.language, then metadata.language_info.name, falling back to 'python'. This language is stored in the language attr of every code block.
- to('jupyter', ...) kernel metadata: the kernelspec in the output notebook is derived from the first code cell's language (or the language parameter). The display_name is the language string with its first letter capitalized (e.g., 'python' → 'Python').
- Lens invertibility: the lens is marked invertible: false because markdown and code carry structural meaning (cell type, language) that cannot be reliably recovered from generic RT paragraph and code-block features when converting back from the hub.
- Hub lenses: the RT↔CommonMark↔HTML hub lenses are registered on demand by the renderers (to('html', ...), to('markdown', ...)) when called.
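Several of the notes above describe invariants of an exported notebook: nbformat 4.5, no raw cells, and code cells with execution_count: null and empty outputs. A small check along these lines could be used in a test suite to confirm them. exportProblems and the two local interfaces are hypothetical and written only for this sketch.

```typescript
// Loose local shapes so the check can run on any notebook-like object.
interface CellLike { cell_type: string; execution_count?: number | null; outputs?: unknown[] }
interface NotebookLike { nbformat?: number; nbformat_minor?: number; cells: CellLike[] }

// Hypothetical check: returns a list of human-readable problems,
// empty when the notebook matches the documented export shape.
function exportProblems(notebook: NotebookLike): string[] {
  const problems: string[] = []
  if (notebook.nbformat !== 4 || notebook.nbformat_minor !== 5) {
    problems.push('expected nbformat 4 and nbformat_minor 5')
  }
  notebook.cells.forEach((cell, i) => {
    // Raw cells are imported as markdown, so none should come back out.
    if (cell.cell_type === 'raw') problems.push(`cell ${i}: unexpected raw cell`)
    if (cell.cell_type !== 'code') return
    if (cell.execution_count !== null) problems.push(`cell ${i}: execution_count is not null`)
    if (!Array.isArray(cell.outputs) || cell.outputs.length > 0) {
      problems.push(`cell ${i}: outputs is not an empty array`)
    }
  })
  return problems
}

console.log(exportProblems({
  nbformat: 4,
  nbformat_minor: 5,
  cells: [{ cell_type: 'code', execution_count: null, outputs: [] }],
})) // []
```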