Structured Text and Dast format

    Structured Text content is stored as a JSON object consisting of two mandatory keys:

    • document: the content, expressed as a unist tree;

    • schema: a string that specifies the unist dialect used inside the document itself.

    {
    "schema": "dast",
    "document": {
    "type": "root",
    "children": [...]
    }
    }
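    As a minimal sketch of the shape described above, the helpers below wrap a tree in the two mandatory keys and run a shallow check on the result. The names toStructuredText and isStructuredText are hypothetical and not part of any DatoCMS library; the check assumes the dast dialect and is not a full validation.

    ```javascript
    // Hypothetical helper: wraps a unist tree in the two mandatory keys.
    function toStructuredText(rootNode, schema = 'dast') {
      return { schema, document: rootNode };
    }

    // Hypothetical helper: shallow shape check only, not a full dast validation.
    function isStructuredText(value) {
      return (
        value !== null &&
        typeof value === 'object' &&
        typeof value.schema === 'string' &&
        value.document !== null &&
        typeof value.document === 'object' &&
        value.document.type === 'root' &&
        Array.isArray(value.document.children)
      );
    }

    const value = toStructuredText({ type: 'root', children: [] });
    console.log(isStructuredText(value));
    ```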

    Generally speaking, you want to set the schema key to the dast dialect (which stands for DatoCMS Abstract Syntax Tree), so that:

    • you can take advantage of the default Structured Text editor that DatoCMS offers;

    • you can enforce a number of additional validations to ensure the consistency of the document.

    If you would like to use a custom unist format rather than dast, please let us know!

    DatoCMS Abstract Syntax Tree (dast) specification

    The dast specification follows the conventions of the Unified collective, which offers a large ecosystem of utilities to parse, transform, manipulate, convert and serialize content of any kind.

    Unified is used as a foundation by several popular projects, such as rehype (HTML parser), remark (Markdown parser) and the MDX project. These projects can integrate with each other because they all describe their content in the same common JSON format, called unist.

    Just like HTML, a dast document is composed of nodes within nodes:

    • Each node has a type attribute

    • The top-level node in the dast specification must be of type root

    • Most nodes have a children attribute to specify the nodes they contain

    • Text is stored in leaf nodes of type span, which have no children attribute and hold the final text as a string in their value attribute

    • The specs define exactly which attributes and children each node permits.

    Here is an example:

    {
    "type": "root",
    "children": [
    {
    "type": "heading",
    "level": 1,
    "children": [
    {
    "type": "span",
    "marks": [],
    "value": "This is a title!"
    }
    ]
    },
    {
    "type": "paragraph",
    "children": [
    {
    "type": "span",
    "value": "This is a "
    },
    {
    "type": "span",
    "marks": ["strong"],
    "value": "paragraph!"
    }
    ]
    },
    {
    "type": "list",
    "style": "bulleted",
    "children": [
    {
    "type": "listItem",
    "children": [
    {
    "type": "paragraph",
    "children": [
    {
    "type": "span",
    "value": "And this is a list!"
    }
    ]
    }
    ]
    }
    ]
    }
    ]
    }
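    Because every node is either a span holding text or a parent holding children, a tree like the one above can be walked with a few lines of recursion. The following is only a sketch: toPlainText is a hypothetical helper name, and it ignores nodes such as code, which keep their text in a different attribute.

    ```javascript
    // Recursively concatenates the value of every span found under a node.
    function toPlainText(node) {
      if (node.type === 'span') return node.value;
      if (!Array.isArray(node.children)) return '';
      return node.children.map(toPlainText).join('');
    }

    const doc = {
      type: 'root',
      children: [
        {
          type: 'heading',
          level: 1,
          children: [{ type: 'span', marks: [], value: 'This is a title!' }],
        },
        {
          type: 'paragraph',
          children: [
            { type: 'span', value: 'This is a ' },
            { type: 'span', marks: ['strong'], value: 'paragraph!' },
          ],
        },
      ],
    };

    // One string per top-level node.
    const lines = doc.children.map(toPlainText);
    console.log(lines.join('\n'));
    ```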

    Working with dast documents

    You can take advantage of several utilities to work with nodes in a dast document. For example, you can build a tree with unist-builder and print it with unist-util-inspect:

    import u from 'unist-builder';
    import inspect from 'unist-util-inspect';
    const document =
    u('root', [
    u('heading', { level: 1}, [
    u('span', 'This is the title!')
    ]),
    u('paragraph', [
    u('span', 'And '),
    u('span', { marks: ['strong'] }, 'this'),
    u('span', ' is a paragraph!')
    ])
    ]);
    console.log(inspect(document));

    root[2]
    ├─0 heading[1]
    │   │ level: 1
    │   └─0 span "This is the title!"
    └─1 paragraph[3]
        ├─0 span "And "
        ├─1 span "this"
        │     marks: ["strong"]
        └─2 span " is a paragraph!"

    In addition to that, we built a set of specific tools to work with Structured Text and dast documents.

    JSON Schema for dast

    The latest dast format specification is always available at the following URL:

    https://site-api.datocms.com/docs/dast-schema.json

    Nodes

    root

    Every dast document MUST start with a root node.

    It allows the following children nodes: paragraph, heading, list, code, blockquote, block and thematicBreak.

    type  "root"  Required
    children  Array  Required
    {
    "type": "root",
    "children": [
    {
    "type": "heading",
    "level": 1,
    "children": [
    {
    "type": "span",
    "value": "Title"
    }
    ]
    },
    {
    "type": "paragraph",
    "children": [
    {
    "type": "span",
    "value": "A simple paragraph!"
    }
    ]
    }
    ]
    }

    paragraph

    A paragraph node represents a unit of textual content.

    It allows the following children nodes: span, link, itemLink and inlineItem.

    type  "paragraph"  Required
    children  Array  Required
    {
    "type": "paragraph",
    "children": [
    {
    "type": "span",
    "value": "A simple paragraph!"
    }
    ]
    }

    span

    A span node represents a text node. It might optionally contain decorators called marks. It is worth mentioning that you can use the \n newline character to express line breaks.

    It does not allow children nodes.

    type  "span"  Required
    value  string  Required
    marks  Array<string>  Optional

    Array of decorators for the current chunk of text. Valid marks are: strong, code, emphasis, underline, strikethrough and highlight.

    {
    "type": "span",
    "marks": ["highlight", "emphasis"],
    "value": "Some random text here, move on!"
    }
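    To illustrate how marks and the \n character might be handled on a frontend, here is a minimal sketch that turns a single span into HTML. The mapping from marks to HTML tags is an assumption made for this example (a frontend is free to pick different tags), and renderSpan and escapeHtml are hypothetical helper names.

    ```javascript
    // Assumed mapping from dast marks to HTML tags; not mandated by the spec.
    const MARK_TAGS = {
      strong: 'strong',
      code: 'code',
      emphasis: 'em',
      underline: 'u',
      strikethrough: 's',
      highlight: 'mark',
    };

    function escapeHtml(text) {
      return text
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;');
    }

    // Renders one span: newlines become <br>, then each known mark wraps the result.
    function renderSpan(span) {
      const text = escapeHtml(span.value).replace(/\n/g, '<br>');
      return (span.marks || []).reduce((html, mark) => {
        const tag = MARK_TAGS[mark];
        return tag ? `<${tag}>${html}</${tag}>` : html; // unknown marks are skipped
      }, text);
    }

    const html = renderSpan({
      type: 'span',
      marks: ['highlight', 'emphasis'],
      value: 'Some random text here, move on!',
    });
    console.log(html);
    // <em><mark>Some random text here, move on!</mark></em>
    ```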

    link

    A link node represents a normal hyperlink. It might optionally contain additional custom information under the meta key. You can also link to DatoCMS records using the itemLink node.

    It allows the following children nodes: span.

    type  "link"  Required
    url  string  Required

    The URL the link points to. It can be any string; no specific format is enforced.

    children  Array<object>  Required
    meta  Array<object>  Optional

    Array of tuples containing custom meta-information for the link.

    {
    "type": "link",
    "url": "https://www.datocms.com/",
    "meta": [
    { "id": "rel", "value": "nofollow" },
    { "id": "target", "value": "_blank" }
    ],
    "children": [
    {
    "type": "span",
    "value": "The best CMS in town"
    }
    ]
    }
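    Since meta is an array of id/value tuples rather than a plain object, consumers usually need to reshape it before use. The sketch below shows one way to do that; metaToAttributes is a hypothetical helper, and treating the tuples as HTML attributes next to an href is an assumption of this example.

    ```javascript
    // Converts the meta array of { id, value } tuples into a plain object.
    function metaToAttributes(meta = []) {
      return Object.fromEntries(meta.map(({ id, value }) => [id, value]));
    }

    const link = {
      type: 'link',
      url: 'https://www.datocms.com/',
      meta: [
        { id: 'rel', value: 'nofollow' },
        { id: 'target', value: '_blank' },
      ],
      children: [{ type: 'span', value: 'The best CMS in town' }],
    };

    // Assumed usage: the URL becomes href, the meta tuples become extra attributes.
    const attributes = { href: link.url, ...metaToAttributes(link.meta) };
    console.log(attributes);
    ```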

    itemLink

    An itemLink node is similar to a link node, but instead of linking a portion of text to a URL, it links the document to another record present in the same DatoCMS project.

    It might optionally contain additional custom information under the meta key.

    If you want to link to a DatoCMS record without having to specify some inner content, then please use the inlineItem node.

    It allows the following children nodes: span.

    type  "itemLink"  Required
    item  string  Required

    The linked DatoCMS record ID

    children  Array<object>  Required
    meta  Array<object>  Optional

    Array of tuples containing custom meta-information for the link.

    {
    "type": "itemLink",
    "item": "38945648",
    "meta": [
    { "id": "rel", "value": "nofollow" },
    { "id": "target", "value": "_blank" }
    ],
    "children": [
    {
    "type": "span",
    "value": "Matteo Giaccone"
    }
    ]
    }

    inlineItem

    An inlineItem, similarly to itemLink, links the document to another record but does not specify any inner content (children).

    It can be used in situations where it is up to the frontend to decide how to present the record (e.g. a widget, or an <a> tag pointing to the URL of the record with a text that is the title of the linked record).

    It does not allow children nodes.

    type  "inlineItem"  Required
    item  string  Required

    The DatoCMS record ID

    {
    "type": "inlineItem",
    "item": "74619345"
    }
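    Both itemLink and inlineItem only store a record ID, so a frontend typically has to gather those IDs before it can resolve the records. A minimal sketch of that step follows; collectLinkedRecordIds is a hypothetical helper, and how the records are then fetched is outside the scope of this example.

    ```javascript
    // Walks a node and collects the item ID of every itemLink and inlineItem.
    function collectLinkedRecordIds(node, ids = new Set()) {
      if (node.type === 'itemLink' || node.type === 'inlineItem') {
        ids.add(node.item);
      }
      (node.children || []).forEach((child) => collectLinkedRecordIds(child, ids));
      return ids;
    }

    const paragraph = {
      type: 'paragraph',
      children: [
        { type: 'span', value: 'Written by ' },
        {
          type: 'itemLink',
          item: '38945648',
          children: [{ type: 'span', value: 'Matteo Giaccone' }],
        },
        { type: 'span', value: ', see also ' },
        { type: 'inlineItem', item: '74619345' },
      ],
    };

    const recordIds = [...collectLinkedRecordIds(paragraph)];
    console.log(recordIds);
    ```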

    heading

    A heading node represents a heading of a section. Using the level attribute you can control the rank of the heading.

    It allows the following children nodes: span, link, itemLink and inlineItem.

    type  "heading"  Required
    level  number  Required
    children  Array  Required
    {
    "type": "heading",
    "level": 2,
    "children": [
    {
    "type": "span",
    "value": "An h2 heading!"
    }
    ]
    }
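    One possible use of the level attribute is building an outline of a document. The sketch below indents each heading by its rank; buildOutline and textOf are hypothetical helpers, and the two-space indentation per level is an arbitrary choice for this example.

    ```javascript
    // Concatenates the text of all spans under a node.
    function textOf(node) {
      if (node.type === 'span') return node.value;
      return (node.children || []).map(textOf).join('');
    }

    // Returns one line per heading, indented by two spaces per level below 1.
    function buildOutline(root) {
      return root.children
        .filter((node) => node.type === 'heading')
        .map((node) => `${'  '.repeat(node.level - 1)}${textOf(node)}`);
    }

    const outline = buildOutline({
      type: 'root',
      children: [
        { type: 'heading', level: 1, children: [{ type: 'span', value: 'Title' }] },
        { type: 'paragraph', children: [{ type: 'span', value: 'Intro' }] },
        { type: 'heading', level: 2, children: [{ type: 'span', value: 'An h2 heading!' }] },
      ],
    });
    console.log(outline.join('\n'));
    ```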

    list

    A list node represents a list of items. Unordered lists must have their style field set to bulleted, while ordered lists must have their style field set to numbered.

    It allows the following children nodes: listItem.

    type  "list"  Required
    style  enum  Required
    children  Array<object>  Required
    {
    "type": "list",
    "style": "bulleted",
    "children": [
    {
    "type": "listItem",
    "children": [
    {
    "type": "paragraph",
    "children": [
    {
    "type": "span",
    "value": "This is a list item!"
    }
    ]
    }
    ]
    }
    ]
    }

    listItem

    A listItem node represents an item in a list.

    It allows the following children nodes: paragraph and list.

    type  "listItem"  Required
    children  Array  Required
    {
    "type": "listItem",
    "children": [
    {
    "type": "paragraph",
    "children": [
    {
    "type": "span",
    "value": "This is a list item!"
    }
    ]
    }
    ]
    }
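    Because a listItem may contain another list, lists can nest to any depth. The following sketch renders a list node as indented plain-text lines to show how the style field and the nesting interact; renderList and textOf are hypothetical helpers, and the markers and indentation are choices made for this example.

    ```javascript
    // Concatenates the text of all spans under a node.
    function textOf(node) {
      if (node.type === 'span') return node.value;
      return (node.children || []).map(textOf).join('');
    }

    // Renders a list node as plain-text lines; nested lists indent by two spaces.
    function renderList(list, depth = 0) {
      const lines = [];
      list.children.forEach((item, index) => {
        const marker = list.style === 'numbered' ? `${index + 1}.` : '-';
        item.children.forEach((child) => {
          if (child.type === 'list') {
            lines.push(...renderList(child, depth + 1));
          } else {
            lines.push(`${'  '.repeat(depth)}${marker} ${textOf(child)}`);
          }
        });
      });
      return lines;
    }

    const paragraphOf = (value) => ({ type: 'paragraph', children: [{ type: 'span', value }] });

    const rendered = renderList({
      type: 'list',
      style: 'numbered',
      children: [
        {
          type: 'listItem',
          children: [
            paragraphOf('First'),
            {
              type: 'list',
              style: 'bulleted',
              children: [{ type: 'listItem', children: [paragraphOf('Nested')] }],
            },
          ],
        },
        { type: 'listItem', children: [paragraphOf('Second')] },
      ],
    });
    console.log(rendered.join('\n'));
    ```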

    code

    A code node represents a block of preformatted text, such as computer code.

    It does not allow children nodes.

    type  "code"  Required
    code  string  Required

    The marked up computer code

    language  string  Optional

    The language of computer code being marked up (e.g. "javascript")

    highlight  Array<number>  Optional

    A zero-based array of line numbers to highlight (e.g. [0, 1, 3])

    {
    "type": "code",
    "language": "javascript",
    "highlight": [1],
    "code": "function greetings() {\n console.log('Hi!');\n}"
    }
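    Since highlight uses zero-based line numbers, index 1 in the example above refers to the second line of the code. A minimal sketch of how a renderer might apply it follows; annotateCodeLines is a hypothetical helper, not part of any DatoCMS package.

    ```javascript
    // Splits the code into lines and flags the ones listed in highlight (zero-based).
    function annotateCodeLines(node) {
      const highlighted = new Set(node.highlight || []);
      return node.code.split('\n').map((text, index) => ({
        text,
        highlighted: highlighted.has(index),
      }));
    }

    const annotated = annotateCodeLines({
      type: 'code',
      language: 'javascript',
      highlight: [1],
      code: "function greetings() {\n console.log('Hi!');\n}",
    });

    annotated.forEach((line) => {
      console.log(`${line.highlighted ? '>' : ' '} ${line.text}`);
    });
    ```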

    blockquote

    A blockquote node is a container that represents text which is an extended quotation.

    It allows the following children nodes: paragraph.

    type  "blockquote"  Required
    children  Array<object>  Required
    attribution  string  Optional

    Attribution for the quote (e.g. "Mark Smith")

    {
    "type": "blockquote",
    "attribution": "Oscar Wilde",
    "children": [
    {
    "type": "paragraph",
    "children": [
    {
    "type": "span",
    "value": "Be yourself; everyone else is taken."
    }
    ]
    }
    ]
    }

    block

    Similarly to Modular Content fields, you can also embed block records into Structured Text. A block node stores a reference to a DatoCMS block record embedded inside the dast document.

    This type of node can only be put as a direct child of the root node.

    It does not allow children nodes.

    type  "block"  Required
    item  string  Required

    The DatoCMS block record ID

    {
    "type": "block",
    "item": "1238455312"
    }

    thematicBreak

    A thematicBreak node represents a thematic break between paragraph-level elements: for example, a change of scene in a story, or a shift of topic within a section.

    It does not allow children nodes.

    type  "thematicBreak"  Required
    {
    "type": "thematicBreak"
    }
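    The child rules listed for each node above can be collected into a lookup table and checked recursively. The sketch below is one way to do so: findChildViolations is a hypothetical helper, it only checks which child types are permitted (not attributes), and the JSON Schema mentioned earlier remains the authoritative definition.

    ```javascript
    // Allowed child types per node type, transcribed from the node descriptions above.
    const INLINE = ['span', 'link', 'itemLink', 'inlineItem'];
    const ALLOWED_CHILDREN = {
      root: ['paragraph', 'heading', 'list', 'code', 'blockquote', 'block', 'thematicBreak'],
      paragraph: INLINE,
      heading: INLINE,
      link: ['span'],
      itemLink: ['span'],
      list: ['listItem'],
      listItem: ['paragraph', 'list'],
      blockquote: ['paragraph'],
    };

    // Returns human-readable problems; an empty array means no violation was found.
    function findChildViolations(node, path = node.type) {
      const allowed = ALLOWED_CHILDREN[node.type];
      const children = node.children || [];
      if (!allowed) {
        // Node types missing from the table do not allow children at all.
        return children.length > 0 ? [`${path} must not have children`] : [];
      }
      return children.flatMap((child, index) => {
        const childPath = `${path} > ${child.type}[${index}]`;
        const own = allowed.includes(child.type) ? [] : [`${childPath} is not allowed here`];
        return own.concat(findChildViolations(child, childPath));
      });
    }

    // A block nested inside a paragraph breaks the "direct child of root" rule.
    const violations = findChildViolations({
      type: 'root',
      children: [
        { type: 'paragraph', children: [{ type: 'block', item: '1238455312' }] },
        { type: 'thematicBreak' },
      ],
    });
    console.log(violations);
    ```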