Structured Text and Dast format

    Structured Text content is stored as a JSON object consisting of two mandatory keys:

    • document: the content, expressed as a unist tree;

    • schema: a string that specifies the unist dialect used inside the document itself.

    {
    "schema": "dast",
    "document": {
    "type": "root",
    "children": [...]
    }
    }
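
    As a minimal sketch of the shape described above, the following check verifies that a value carries both mandatory keys. The helper name looksLikeStructuredText is hypothetical and not part of any DatoCMS package; it only inspects the top-level shape and does not validate the tree itself.

```javascript
// Hypothetical helper: checks only the top-level shape of a Structured Text value.
function looksLikeStructuredText(value) {
  return (
    typeof value === 'object' &&
    value !== null &&
    typeof value.schema === 'string' &&
    typeof value.document === 'object' &&
    value.document !== null
  );
}

const example = {
  schema: 'dast',
  document: { type: 'root', children: [] },
};

console.log(looksLikeStructuredText(example)); // true
console.log(looksLikeStructuredText({ document: example.document })); // false: no schema key
```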

    Generally speaking, you want to set the schema key to the dast dialect (which stands for DatoCMS Abstract Syntax Tree), so that:

    • you can take advantage of the default Structured Text editor that DatoCMS offers;

    • you can enforce a number of additional validations to ensure the consistency of the document.

    If you would like to use a custom unist format rather than dast, please let us know!

    DatoCMS Abstract Syntax Tree (dast) specification

    The dast specification follows the conventions of the Unified collective, which offers a large ecosystem of utilities to parse, transform, manipulate, convert and serialize content of any kind.

    Unified is the foundation of several popular libraries, such as rehype (HTML parser), remark (Markdown parser) and the MDX project. These projects can integrate with each other because they all describe the content they handle in the same common JSON format, called unist.

    Just like HTML, a dast document is composed of nodes within nodes:

    • Each node has a type attribute

    • The top-level node of a dast document must be of type root

    • Most nodes have a children attribute that lists the nodes they contain

    • Text is stored in leaf nodes of type span, which have no children attribute and hold the final text as a string in their value attribute

    • The specification defines exactly which attributes and children each node permits.

    Here is an example:

    {
    "type": "root",
    "children": [
    {
    "type": "heading",
    "level": 1,
    "children": [
    {
    "type": "span",
    "marks": [],
    "value": "This is a title!"
    }
    ]
    },
    {
    "type": "paragraph",
    "children": [
    {
    "type": "span",
    "value": "This is a "
    },
    {
    "type": "span",
    "marks": ["strong"],
    "value": "paragraph!"
    }
    ]
    },
    {
    "type": "list",
    "style": "bulleted",
    "children": [
    {
    "type": "listItem",
    "children": [
    {
    "type": "paragraph",
    "children": [
    {
    "type": "span",
    "value": "And this is a list!"
    }
    ]
    }
    ]
    }
    ]
    }
    ]
    }
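
    Because text only lives in span nodes, a short recursive function is enough to read a document back as plain text. The sketch below is one possible approach, not an official utility; the name collectText is hypothetical.

```javascript
// Recursively collects the text stored in span nodes.
function collectText(node) {
  if (node.type === 'span') {
    return node.value;
  }
  // Nodes without children (e.g. thematicBreak) contribute no text.
  return (node.children || []).map(collectText).join('');
}

const tree = {
  type: 'root',
  children: [
    {
      type: 'heading',
      level: 1,
      children: [{ type: 'span', marks: [], value: 'This is a title!' }],
    },
    {
      type: 'paragraph',
      children: [
        { type: 'span', value: 'This is a ' },
        { type: 'span', marks: ['strong'], value: 'paragraph!' },
      ],
    },
  ],
};

// One string per top-level node.
const lines = tree.children.map(collectText);
console.log(lines); // [ 'This is a title!', 'This is a paragraph!' ]
```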

    Working with dast documents

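    You can take advantage of several utilities to work with nodes in a dast document. For example, you can build a tree with unist-builder and print it with unist-util-inspect:

    // Default imports, as exposed by older releases of these packages.
    // Recent releases use named exports instead:
    // import { u } from 'unist-builder' and import { inspect } from 'unist-util-inspect'.
    import u from 'unist-builder';
    import inspect from 'unist-util-inspect';

    // Build the tree: u(type, [props], childrenOrValue)
    const document = u('root', [
      u('heading', { level: 1 }, [
        u('span', 'This is the title!')
      ]),
      u('paragraph', [
        u('span', 'And '),
        u('span', { marks: ['strong'] }, 'this'),
        u('span', ' is a paragraph!')
      ])
    ]);

    // Print a human-readable representation of the tree
    console.log(inspect(document));

    This prints:

    root[2]
    ├─0 heading[1]
    │   │ level: 1
    │   └─0 span "This is the title!"
    └─1 paragraph[3]
        ├─0 span "And "
        ├─1 span "this"
        │     marks: ["strong"]
        └─2 span " is a paragraph!"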

    In addition, we built a set of specific tools to work with Structured Text and dast documents.

    JSON Schema for dast

    The latest dast format specification is always available at the following URL:

    https://site-api.datocms.com/docs/dast-schema.json

    Nodes

    root

    Every dast document MUST start with a root node.

    It allows the following child nodes: paragraph, heading, list, code, blockquote, block and thematicBreak.

    type  Required  "root"
    children  Required  Array
    {
    "type": "root",
    "children": [
    {
    "type": "heading",
    "level": 1,
    "children": [
    {
    "type": "span",
    "value": "Title"
    }
    ]
    },
    {
    "type": "paragraph",
    "children": [
    {
    "type": "span",
    "value": "A simple paragraph!"
    }
    ]
    }
    ]
    }

    paragraph

    A paragraph node represents a unit of textual content.

    It allows the following child nodes: span, link, itemLink and inlineItem.

    type  Required  "paragraph"
    children  Required  Array
    style  Optional  string

    Custom style applied to the node. Styles can be configured using the Plugin SDK.

    {
    "type": "paragraph",
    "children": [
    {
    "type": "span",
    "value": "A simple paragraph!"
    }
    ]
    }

    span

    A span node represents a text node. It might optionally contain decorators called marks. It is worth mentioning that you can use the \n newline character to express line breaks.

    It does not allow child nodes.

    type  Required  "span"
    value  Required  string
    marks  Optional  Array

    Array of decorators for the current chunk of text. Default marks: strong, code, emphasis, underline, strikethrough and highlight. Additional custom marks can be defined via plugin.

    {
    "type": "span",
    "marks": ["highlight", "emphasis"],
    "value": "Some random text here, move on!"
    }
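
    To illustrate how marks and the \n newline character might be handled when rendering, here is a rough sketch that turns a span into an HTML string. The mapping from marks to tags is an assumption made for this example (the format itself does not prescribe one), and renderSpan and escapeHtml are hypothetical helper names.

```javascript
// Assumed mapping from the default marks to HTML tags; adjust it to your own markup.
const MARK_TAGS = new Map([
  ['strong', 'strong'],
  ['code', 'code'],
  ['emphasis', 'em'],
  ['underline', 'u'],
  ['strikethrough', 's'],
  ['highlight', 'mark'],
]);

function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

function renderSpan(span) {
  // Escape first, then turn newline characters into line breaks.
  const text = escapeHtml(span.value).replace(/\n/g, '<br>');
  // Wrap the text in one element per known mark; unknown (custom) marks are skipped.
  return (span.marks || []).reduce((html, mark) => {
    const tag = MARK_TAGS.get(mark);
    return tag ? `<${tag}>${html}</${tag}>` : html;
  }, text);
}

const rendered = renderSpan({
  type: 'span',
  marks: ['highlight', 'emphasis'],
  value: 'Some random text here, move on!',
});
console.log(rendered);
// <em><mark>Some random text here, move on!</mark></em>
```

    Custom marks defined via plugin would need their own entries in the mapping; in this sketch they are silently ignored.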

    link

    A link node represents a normal hyperlink. It can optionally carry additional custom information under the meta key. To link to a DatoCMS record instead, use the itemLink node.

    It allows the following child nodes: span.

    type  Required  "link"
    url  Required  string

    The actual URL where the link points to. Can be any string, no specific format is enforced.

    children  Required  Array<object>
    meta  Optional  Array<object>

    Array of { id, value } objects containing custom meta-information for the link.

    {
    "type": "link",
    "url": "https://www.datocms.com/",
    "meta": [
    { "id": "rel", "value": "nofollow" },
    { "id": "target", "value": "_blank" }
    ],
    "children": [
    {
    "type": "span",
    "value": "The best CMS in town"
    }
    ]
    }
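
    Since meta is stored as an array of id/value pairs, a renderer will often want it as a plain object of attributes. A minimal sketch, assuming every entry has the id and value keys shown in the example above; metaToAttributes is a hypothetical helper name.

```javascript
// Converts the meta array of a link (or itemLink) node into a plain object.
function metaToAttributes(node) {
  return Object.fromEntries(
    (node.meta || []).map(({ id, value }) => [id, value])
  );
}

const link = {
  type: 'link',
  url: 'https://www.datocms.com/',
  meta: [
    { id: 'rel', value: 'nofollow' },
    { id: 'target', value: '_blank' },
  ],
  children: [{ type: 'span', value: 'The best CMS in town' }],
};

const attributes = metaToAttributes(link);
console.log(attributes); // { rel: 'nofollow', target: '_blank' }
```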

    itemLink

    An itemLink node is similar to a link node, but instead of linking a portion of text to a URL, it links the document to another record present in the same DatoCMS project.

    It can optionally carry additional custom information under the meta key.

    If you want to link to a DatoCMS record without having to specify some inner content, then please use the inlineItem node.

    It allows the following child nodes: span.

    type  Required  "itemLink"
    item  Required  string

    The linked DatoCMS record ID

    children  Required  Array<object>
    meta  Optional  Array<object>

    Array of { id, value } objects containing custom meta-information for the link.

    {
    "type": "itemLink",
    "item": "38945648",
    "meta": [
    { "id": "rel", "value": "nofollow" },
    { "id": "target", "value": "_blank" }
    ],
    "children": [
    {
    "type": "span",
    "value": "Matteo Giaccone"
    }
    ]
    }

    inlineItem

    An inlineItem node, similarly to itemLink, links the document to another record but does not specify any inner content (children).

    It can be used in situations where it is up to the frontend to decide how to present the record (for example, a widget, or an <a> tag pointing to the URL of the record, with the record's title as its text).

    It does not allow child nodes.

    type  Required  "inlineItem"
    item  Required  string

    The DatoCMS record ID

    {
    "type": "inlineItem",
    "item": "74619345"
    }
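
    Both itemLink and inlineItem reference records through their item attribute, so a frontend typically needs to know which records a document points to before rendering it. The following sketch gathers those IDs with a plain recursive walk; collectLinkedRecordIds is a hypothetical name, and the sample tree simply reuses the IDs from the examples above.

```javascript
// Walks the tree and gathers the record IDs referenced by itemLink and inlineItem nodes.
function collectLinkedRecordIds(node, ids = new Set()) {
  if (node.type === 'itemLink' || node.type === 'inlineItem') {
    ids.add(node.item);
  }
  for (const child of node.children || []) {
    collectLinkedRecordIds(child, ids);
  }
  return ids;
}

const tree = {
  type: 'root',
  children: [
    {
      type: 'paragraph',
      children: [
        { type: 'span', value: 'Written by ' },
        {
          type: 'itemLink',
          item: '38945648',
          children: [{ type: 'span', value: 'Matteo Giaccone' }],
        },
        { type: 'span', value: ', see also ' },
        { type: 'inlineItem', item: '74619345' },
      ],
    },
  ],
};

const recordIds = [...collectLinkedRecordIds(tree)];
console.log(recordIds); // [ '38945648', '74619345' ]
```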

    heading

    A heading node represents a heading of a section. Using the level attribute you can control the rank of the heading.

    It allows the following child nodes: span, link, itemLink and inlineItem.

    type  Required  "heading"
    level  Required  number
    children  Required  Array
    style  Optional  string

    Custom style applied to the node. Styles can be configured using the Plugin SDK.

    {
    "type": "heading",
    "level": 2,
    "children": [
    {
    "type": "span",
    "value": "An h2 heading!"
    }
    ]
    }
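
    One common use of the level attribute is building a table of contents. Since heading nodes are only allowed as direct children of root, it is enough to filter the top-level children. This is a rough sketch with hypothetical helper names (headingOutline, textOf), not an official utility.

```javascript
// Concatenates the text of all span nodes found under a node.
function textOf(node) {
  if (node.type === 'span') {
    return node.value;
  }
  return (node.children || []).map(textOf).join('');
}

// Returns one { level, text } entry per heading found at the top level of the document.
function headingOutline(root) {
  return root.children
    .filter((node) => node.type === 'heading')
    .map((node) => ({ level: node.level, text: textOf(node) }));
}

const root = {
  type: 'root',
  children: [
    { type: 'heading', level: 1, children: [{ type: 'span', value: 'Title' }] },
    { type: 'paragraph', children: [{ type: 'span', value: 'A simple paragraph!' }] },
    {
      type: 'heading',
      level: 2,
      children: [
        { type: 'span', value: 'About ' },
        {
          type: 'link',
          url: 'https://www.datocms.com/',
          children: [{ type: 'span', value: 'DatoCMS' }],
        },
      ],
    },
  ],
};

const outline = headingOutline(root);
console.log(outline);
// [ { level: 1, text: 'Title' }, { level: 2, text: 'About DatoCMS' } ]
```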

    list

    A list node represents a list of items. Unordered lists must have their style field set to bulleted, while ordered lists must have it set to numbered.

    It allows the following child nodes: listItem.

    type  Required  "list"
    style  Required  enum
    children  Required  Array<object>
    {
    "type": "list",
    "style": "bulleted",
    "children": [
    {
    "type": "listItem",
    "children": [
    {
    "type": "paragraph",
    "children": [
    {
    "type": "span",
    "value": "This is a list item!"
    }
    ]
    }
    ]
    }
    ]
    }

    listItem

    A listItem node represents an item in a list.

    It allows the following child nodes: paragraph and list.

    type  Required  "listItem"
    children  Required  Array
    {
    "type": "listItem",
    "children": [
    {
    "type": "paragraph",
    "children": [
    {
    "type": "span",
    "value": "This is a list item!"
    }
    ]
    }
    ]
    }
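
    Because a listItem can contain both paragraph and list nodes, lists can be nested to any depth. The sketch below shows one way to walk that structure and print it as indented text; the output format and the names renderList and textOf are choices made for this example only.

```javascript
// Concatenates the text of all span nodes found under a node.
function textOf(node) {
  if (node.type === 'span') {
    return node.value;
  }
  return (node.children || []).map(textOf).join('');
}

// Renders a list node as an array of text lines, indenting nested lists by two spaces.
function renderList(list, depth = 0) {
  const lines = [];
  list.children.forEach((item, index) => {
    const marker = list.style === 'numbered' ? `${index + 1}.` : '-';
    for (const child of item.children) {
      if (child.type === 'list') {
        lines.push(...renderList(child, depth + 1));
      } else {
        lines.push(`${'  '.repeat(depth)}${marker} ${textOf(child)}`);
      }
    }
  });
  return lines;
}

// Small builders to keep the sample tree readable.
const paragraph = (text) => ({ type: 'paragraph', children: [{ type: 'span', value: text }] });
const listItem = (...children) => ({ type: 'listItem', children });

const list = {
  type: 'list',
  style: 'bulleted',
  children: [
    listItem(paragraph('Fruit'), {
      type: 'list',
      style: 'numbered',
      children: [listItem(paragraph('Apple')), listItem(paragraph('Pear'))],
    }),
    listItem(paragraph('Vegetables')),
  ],
};

const output = renderList(list).join('\n');
console.log(output);
// - Fruit
//   1. Apple
//   2. Pear
// - Vegetables
```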

    code

    A code node represents a block of preformatted text, such as computer code.

    It does not allow child nodes.

    type  Required  "code"
    code  Required  string

    The marked up computer code

    language  Optional  string

    The language of the computer code being marked up (e.g. "javascript")

    highlight  Optional  Array<number>

    An array of zero-based line numbers to highlight (e.g. [0, 1, 3])

    {
    "type": "code",
    "language": "javascript",
    "highlight": [1],
    "code": "function greetings() {\n console.log('Hi!');\n}"
    }
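
    Since the line numbers in highlight are zero-based, [1] in the example above refers to the second line of the code. A minimal sketch of how a renderer might resolve them; annotateLines is a hypothetical helper name.

```javascript
// Splits the code into lines and flags the ones listed in the zero-based highlight array.
function annotateLines(codeNode) {
  const highlighted = new Set(codeNode.highlight || []);
  return codeNode.code.split('\n').map((text, index) => ({
    text,
    highlighted: highlighted.has(index),
  }));
}

const codeNode = {
  type: 'code',
  language: 'javascript',
  highlight: [1],
  code: "function greetings() {\n console.log('Hi!');\n}",
};

const annotated = annotateLines(codeNode);
for (const line of annotated) {
  // Prefix highlighted lines with ">" so they stand out in the console.
  console.log(`${line.highlighted ? '>' : ' '} ${line.text}`);
}
```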

    blockquote

    A blockquote node is a container that represents text which is an extended quotation.

    It allows the following child nodes: paragraph.

    type  Required  "blockquote"
    children  Required  Array<object>
    attribution  Optional  string

    Attribution for the quote (e.g. "Mark Smith")

    {
    "type": "blockquote",
    "attribution": "Oscar Wilde",
    "children": [
    {
    "type": "paragraph",
    "children": [
    {
    "type": "span",
    "value": "Be yourself; everyone else is taken."
    }
    ]
    }
    ]
    }

    block

    Similarly to Modular Content fields, you can also embed block records into Structured Text. A block node stores a reference to a DatoCMS block record embedded inside the dast document.

    This type of node can only be put as a direct child of the root node.

    It does not allow child nodes.

    type  Required  "block"
    item  Required  string

    The DatoCMS block record ID

    {
    "type": "block",
    "item": "1238455312"
    }

    thematicBreak

    A thematicBreak node represents a thematic break between paragraph-level elements: for example, a change of scene in a story, or a shift of topic within a section.

    It does not allow child nodes.

    type  Required  "thematicBreak"
    {
    "type": "thematicBreak"
    }
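
    The parent/child rules listed for each node above can be collected into a single lookup table and checked recursively. The sketch below is only an illustration of those rules, not a replacement for the official JSON Schema: it checks nesting only (not required attributes), and the names ALLOWED_CHILDREN and findInvalidChildren are hypothetical.

```javascript
// Allowed child node types per parent type, as listed in the node reference above.
// Types missing from the map (span, inlineItem, code, block, thematicBreak) allow no children.
const INLINE = ['span', 'link', 'itemLink', 'inlineItem'];
const ALLOWED_CHILDREN = new Map([
  ['root', ['paragraph', 'heading', 'list', 'code', 'blockquote', 'block', 'thematicBreak']],
  ['paragraph', INLINE],
  ['heading', INLINE],
  ['link', ['span']],
  ['itemLink', ['span']],
  ['list', ['listItem']],
  ['listItem', ['paragraph', 'list']],
  ['blockquote', ['paragraph']],
]);

// Returns a list of human-readable errors, one per child found in a disallowed position.
function findInvalidChildren(node, path = node.type) {
  const allowed = ALLOWED_CHILDREN.get(node.type) || [];
  const errors = [];
  (node.children || []).forEach((child, index) => {
    const childPath = `${path}.children[${index}]`;
    if (!allowed.includes(child.type)) {
      errors.push(`${childPath}: "${child.type}" is not allowed inside "${node.type}"`);
    }
    errors.push(...findInvalidChildren(child, childPath));
  });
  return errors;
}

const valid = {
  type: 'root',
  children: [
    { type: 'paragraph', children: [{ type: 'span', value: 'A simple paragraph!' }] },
    { type: 'block', item: '1238455312' },
    { type: 'thematicBreak' },
  ],
};

// A block node is only allowed as a direct child of root, so this tree is invalid.
const invalid = {
  type: 'root',
  children: [
    { type: 'paragraph', children: [{ type: 'block', item: '1238455312' }] },
  ],
};

console.log(findInvalidChildren(valid)); // []
console.log(findInvalidChildren(invalid));
// [ 'root.children[0].children[0]: "block" is not allowed inside "paragraph"' ]
```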