# Structured Text and `dast` format

Structured Text content is stored as a JSON object consisting of two mandatory keys:

-   `document`: the content, expressed as a [`unist`](https://github.com/syntax-tree/unist) tree;
    
-   `schema`: a string that specifies the unist dialect used inside the `document` itself.
    

```json
{
  "schema": "dast",
  "document": {
    "type": "root",
    "children": [...]
  }
}
```
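As a minimal sketch of the shape described above, the following check confirms that a value has both mandatory keys and a `root` node at the top of `document`. The helper name `looksLikeStructuredText` is hypothetical and not part of any DatoCMS package; it inspects only the outer envelope, not the nodes inside it.

```javascript
// Hypothetical helper: checks only the two mandatory top-level keys.
function looksLikeStructuredText(value) {
  return (
    value !== null &&
    typeof value === 'object' &&
    value.schema === 'dast' &&
    value.document !== null &&
    typeof value.document === 'object' &&
    value.document.type === 'root' &&
    Array.isArray(value.document.children)
  );
}

const fieldValue = {
  schema: 'dast',
  document: { type: 'root', children: [] },
};

console.log(looksLikeStructuredText(fieldValue)); // true
// A value without the `schema` key is rejected.
console.log(looksLikeStructuredText({ document: { type: 'root', children: [] } })); // false
```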

Generally speaking, you want to set the `schema` key to the `dast` dialect (which stands for **D**atoCMS **A**bstract **S**yntax **T**ree), so that:

-   you can take advantage of the default **Structured Text** editor that DatoCMS offers;
    
-   you can enforce a number of additional validations to ensure consistency within the document.
    

If you would like to use a custom [unist](https://github.com/syntax-tree/unist) format rather than `dast`, please [let us know!](https://www.datocms.com/support.md?topics=feature-request)

### DatoCMS Abstract Syntax Tree (`dast`) specification

The `dast` specification follows the conventions of the [Unified](https://unifiedjs.com/) collective, which offers a large ecosystem of utilities to parse, transform, manipulate, convert, and serialize content of any kind.

Unified is the foundation of several popular libraries, such as [rehype](https://github.com/rehypejs/rehype) (HTML parser), [remark](https://github.com/remarkjs/remark) (Markdown parser) and the [MDX project](https://mdxjs.com/). These projects can integrate with each other because they all describe the content they process with the same JSON format, called [`unist`](https://github.com/syntax-tree/unist).

Just like HTML, a `dast` document is composed of nodes within nodes:

-   Each node has a `type` attribute that identifies what kind of node it is
    
-   The top-level node in the `dast` specification must be of type `root`
    
-   Most nodes have a `children` attribute to specify the nodes they contain
    
-   The leaves of the tree are nodes of type `span`, which do not offer a `children` attribute but store the final text as a string in their `value` attribute
    
-   The specs define exactly which attributes and children each node permits.
    

Let's look at an example:

```json
{
  "type": "root",
  "children": [
    {
      "type": "heading",
      "level": 1,
      "children": [
        {
          "type": "span",
          "marks": [],
          "value": "This is a title!"
        }
      ]
    },
    {
      "type": "paragraph",
      "children": [
        {
          "type": "span",
          "value": "This is a "
        },
        {
          "type": "span",
          "marks": ["strong"],
          "value": "paragraph!"
        }
      ]
    },
    {
      "type": "list",
      "style": "bulleted",
      "children": [
        {
          "type": "listItem",
          "children": [
            {
              "type": "paragraph",
              "children": [
                {
                  "type": "span",
                  "value": "And this is a list!"
                }
              ]
            }
          ]
        }
      ]
    }
  ]
}
```
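Because every node either carries `children` or, in the case of `span` leaves, a `value`, a short recursive function is enough to visit a whole document. The sketch below is dependency-free; `walk` is a hypothetical helper name, and the tree is a trimmed copy of the example above.

```javascript
// Depth-first walk: calls `visit` on a node, then on each of its descendants.
function walk(node, visit, depth = 0) {
  visit(node, depth);
  // Leaves such as `span` have no `children`, so fall back to an empty list.
  for (const child of node.children ?? []) {
    walk(child, visit, depth + 1);
  }
}

const tree = {
  type: 'root',
  children: [
    { type: 'heading', level: 1, children: [{ type: 'span', value: 'This is a title!' }] },
    {
      type: 'paragraph',
      children: [
        { type: 'span', value: 'This is a ' },
        { type: 'span', marks: ['strong'], value: 'paragraph!' },
      ],
    },
  ],
};

// Collect the text stored in every `span` leaf, in document order.
const spanValues = [];
walk(tree, (node) => {
  if (node.type === 'span') spanValues.push(node.value);
});

console.log(JSON.stringify(spanValues));
// ["This is a title!","This is a ","paragraph!"]
```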

### Working with `dast` documents

The package [`datocms-structured-text-utils`](https://github.com/datocms/structured-text/tree/main/packages/utils) offers JavaScript node definitions, TypeScript types and type guards, and many tree-manipulation utilities.

Additionally, you can take advantage of [several `unist` utilities](https://github.com/syntax-tree/unist#list-of-utilities) to work with nodes in a `dast` document. For example, you can compose and assemble a document with [`unist-builder`](https://github.com/syntax-tree/unist-builder), select nodes with a CSS-like syntax using [`unist-util-select`](https://github.com/syntax-tree/unist-util-select) or have a compact representation of the document via [`unist-util-inspect`](https://github.com/syntax-tree/unist-util-inspect):

```javascript
// Recent releases of both packages are ESM-only and use named exports.
import { u } from 'unist-builder';
import { inspect } from 'unist-util-inspect';

const document =
  u('root', [
    u('heading', { level: 1}, [
      u('span', 'This is the title!')
    ]),
    u('paragraph', [
      u('span', 'And '),
      u('span', { marks: ['strong'] }, 'this'),
      u('span', ' is a paragraph!')
    ])
  ]);

console.log(inspect(document));

// Output:
// root[2]
// ├─0 heading[1]
// │   │ level: 1
// │   └─0 span "This is the title!"
// └─1 paragraph[3]
//     ├─0 span "And "
//     ├─1 span "this"
//     │     marks: ["strong"]
//     └─2 span " is a paragraph!"
```

### Converting HTML to Structured Text and vice versa

These are the utilities contained within **datocms/structured-text**:

**Conversion utilities**

-   [`datocms-html-to-structured-text`](https://github.com/datocms/structured-text/tree/main/packages/html-to-structured-text) — Convert HTML/Markdown into Structured Text
    

**Rendering utilities**

-   [`datocms-structured-text-to-plain-text`](https://github.com/datocms/structured-text/tree/main/packages/to-plain-text) — Render Structured Text as plain text
    
-   [`datocms-structured-text-to-markdown`](https://github.com/datocms/structured-text/tree/main/packages/to-markdown) — Render Structured Text as Markdown
    
-   [`datocms-structured-text-to-html-string`](https://github.com/datocms/structured-text/tree/main/packages/to-html-string) — Render Structured Text as an HTML string
    
-   [`datocms-structured-text-to-dom-nodes`](https://github.com/datocms/structured-text/tree/main/packages/to-dom-nodes) — Transform Structured Text into a list of DOM nodes
    

**Framework components**

-   **React** → [`<StructuredText />`](https://github.com/datocms/react-datocms#structured-text)
    
-   **Vue** → [`<datocms-structured-text />`](https://github.com/datocms/vue-datocms#structured-text)
    
-   **Svelte / SvelteKit** → [`<StructuredText />`](https://github.com/datocms/datocms-svelte/tree/main/src/lib/components/StructuredText)
    
-   **Astro** → [`<StructuredText />`](https://github.com/datocms/astro-datocms/tree/main/src/StructuredText)
    

### JSON Schema for `dast`

The latest `dast` format specification is always available at the following URL:

[https://site-api.datocms.com/docs/dast-schema.json](https://site-api.datocms.com/docs/dast-schema.json)

### `root`

Every `dast` document MUST start with a `root` node.

It allows the following children nodes: [`paragraph`](https://www.datocms.com/docs/structured-text/dast.md#paragraph), [`heading`](https://www.datocms.com/docs/structured-text/dast.md#heading), [`list`](https://www.datocms.com/docs/structured-text/dast.md#list), [`code`](https://www.datocms.com/docs/structured-text/dast.md#code), [`blockquote`](https://www.datocms.com/docs/structured-text/dast.md#blockquote), [`block`](https://www.datocms.com/docs/structured-text/dast.md#block) and [`thematicBreak`](https://www.datocms.com/docs/structured-text/dast.md#thematicBreak).

```json
{
  "type": "root",
  "children": [
    {
      "type": "heading",
      "level": 1,
      "children": [
        {
          "type": "span",
          "value": "Title"
        }
      ]
    },
    {
      "type": "paragraph",
      "children": [
        {
          "type": "span",
          "value": "A simple paragraph!"
        }
      ]
    }
  ]
}
```

### `paragraph`

A `paragraph` node represents a unit of textual content.

It allows the following children nodes: [`span`](https://www.datocms.com/docs/structured-text/dast.md#span), [`link`](https://www.datocms.com/docs/structured-text/dast.md#link), [`itemLink`](https://www.datocms.com/docs/structured-text/dast.md#itemLink), [`inlineItem`](https://www.datocms.com/docs/structured-text/dast.md#inlineItem) and [`inlineBlock`](https://www.datocms.com/docs/structured-text/dast.md#inlineBlock).

```json
{
  "type": "paragraph",
  "children": [
    {
      "type": "span",
      "value": "A simple paragraph!"
    }
  ]
}
```

### `span`

A `span` node represents a text node. It might optionally contain decorators called `marks`. It is worth mentioning that you can use the `\n` newline character to express line breaks.

It does not allow children nodes.

```json
{
  "type": "span",
  "marks": ["highlight", "emphasis"],
  "value": "Some random text here, move on!"
}
```
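To illustrate how `marks` and `\n` might be consumed, here is a minimal sketch that turns a `span` into an HTML string. The mapping from mark names to HTML tags is an assumption made for this example and covers only the marks that appear on this page; `renderSpan` is a hypothetical helper, not the behavior of any DatoCMS renderer.

```javascript
// Assumed mapping from mark names to HTML tags (illustrative, not exhaustive).
const MARK_TAGS = new Map([
  ['strong', 'strong'],
  ['emphasis', 'em'],
  ['highlight', 'mark'],
]);

function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

function renderSpan(span) {
  // Newline characters in `value` express line breaks.
  const text = escapeHtml(span.value).replace(/\n/g, '<br>');
  // Wrap the text in one tag per mark; marks without a mapping are ignored.
  return (span.marks ?? []).reduce((html, mark) => {
    const tag = MARK_TAGS.get(mark);
    return tag ? `<${tag}>${html}</${tag}>` : html;
  }, text);
}

const rendered = renderSpan({
  type: 'span',
  marks: ['highlight', 'emphasis'],
  value: 'Some random text here,\nmove on!',
});

console.log(rendered);
// <em><mark>Some random text here,<br>move on!</mark></em>
```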

### `link`

A `link` node represents a normal hyperlink. It can optionally carry additional custom information under the `meta` key. You can also link to DatoCMS records using the [`itemLink`](https://www.datocms.com/docs/structured-text/dast.md#itemLink) node.

It allows the following children nodes: [`span`](https://www.datocms.com/docs/structured-text/dast.md#span).

```json
{
  "type": "link",
  "url": "https://www.datocms.com/",
  "meta": [
    { "id": "rel", "value": "nofollow" },
    { "id": "target", "value": "_blank" }
  ],
  "children": [
    {
      "type": "span",
      "value": "The best CMS in town"
    }
  ]
}
```
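Since `meta` is an array of `{ id, value }` pairs, a frontend will often want it as a plain object. The following sketch shows one way to do that; `metaToAttributes` is a hypothetical helper, and treating every `meta` entry as an HTML attribute is an assumption of this example.

```javascript
// Turns the `meta` array of { id, value } pairs into a plain object.
function metaToAttributes(meta = []) {
  return Object.fromEntries(meta.map(({ id, value }) => [id, value]));
}

const link = {
  type: 'link',
  url: 'https://www.datocms.com/',
  meta: [
    { id: 'rel', value: 'nofollow' },
    { id: 'target', value: '_blank' },
  ],
  children: [{ type: 'span', value: 'The best CMS in town' }],
};

// `href` is written last so that a stray "href" entry in `meta` cannot override the node's `url`.
const attributes = { ...metaToAttributes(link.meta), href: link.url };

console.log(JSON.stringify(attributes));
// {"rel":"nofollow","target":"_blank","href":"https://www.datocms.com/"}
```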

### `itemLink`

An `itemLink` node is similar to a [`link`](https://www.datocms.com/docs/structured-text/dast.md#link) node, but instead of linking a portion of text to a URL, it links the document to another record present in the same DatoCMS project.

It can optionally carry additional custom information under the `meta` key.

If you want to link to a DatoCMS record without having to specify some inner content, then please use the [`inlineItem`](https://www.datocms.com/docs/structured-text/dast.md#inlineItem) node.

It allows the following children nodes: [`span`](https://www.datocms.com/docs/structured-text/dast.md#span).

```json
{
  "type": "itemLink",
  "item": "38945648",
  "meta": [
    { "id": "rel", "value": "nofollow" },
    { "id": "target", "value": "_blank" }
  ],
  "children": [
    {
      "type": "span",
      "value": "Matteo Giaccone"
    }
  ]
}
```

### `inlineItem`

An `inlineItem`, similarly to [`itemLink`](https://www.datocms.com/docs/structured-text/dast.md#itemLink), links the document to another record but does not specify any inner content (children).

It can be used in situations where it is up to the frontend to decide how to present the record (e.g. a widget, or an `<a>` tag that points to the URL of the record and uses the record's title as its text).

It does not allow children nodes.

```json
{
  "type": "inlineItem",
  "item": "74619345"
}
```
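Because an `inlineItem` carries only a record ID, the frontend has to look the record up before it can render anything. The sketch below assumes the records have already been fetched and stored in a map keyed by ID; `recordsById`, the record fields (`title`, `slug`) and the `/docs/` URL pattern are all hypothetical.

```javascript
// Hypothetical lookup of records already fetched by the frontend, keyed by record ID.
const recordsById = new Map([
  ['74619345', { title: 'Getting started', slug: 'getting-started' }],
]);

function renderInlineItem(node, records) {
  const record = records.get(node.item);
  // The document may reference a record the frontend has not loaded.
  if (!record) return '';
  // A real renderer would also escape the title and slug.
  return `<a href="/docs/${record.slug}">${record.title}</a>`;
}

const html = renderInlineItem({ type: 'inlineItem', item: '74619345' }, recordsById);

console.log(html);
// <a href="/docs/getting-started">Getting started</a>
```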

### `inlineBlock`

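An `inlineBlock` node, like a [`block`](https://www.datocms.com/docs/structured-text/dast.md#block) node, stores a reference to a DatoCMS block record, but it is placed inline, as a child of a [`paragraph`](https://www.datocms.com/docs/structured-text/dast.md#paragraph) or [`heading`](https://www.datocms.com/docs/structured-text/dast.md#heading) node.

It does not allow children nodes.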

```json
{
  "type": "inlineBlock",
  "item": "1238455312"
}
```

### `heading`

A `heading` node represents a heading of a section. Using the `level` attribute you can control the rank of the heading.

It allows the following children nodes: [`span`](https://www.datocms.com/docs/structured-text/dast.md#span), [`link`](https://www.datocms.com/docs/structured-text/dast.md#link), [`itemLink`](https://www.datocms.com/docs/structured-text/dast.md#itemLink), [`inlineItem`](https://www.datocms.com/docs/structured-text/dast.md#inlineItem) and [`inlineBlock`](https://www.datocms.com/docs/structured-text/dast.md#inlineBlock).

```json
{
  "type": "heading",
  "level": 2,
  "children": [
    {
      "type": "span",
      "value": "An h2 heading!"
    }
  ]
}
```
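One common use of the `level` attribute is building a table of contents. Here is a minimal sketch; `tableOfContents` and `headingText` are hypothetical helpers. It only scans the direct children of `root`, since `heading` is listed among the children of `root` and of no other node.

```javascript
// Headings can contain spans and link-like nodes, so collect text recursively.
function headingText(node) {
  if (node.type === 'span') return node.value;
  return (node.children ?? []).map(headingText).join('');
}

function tableOfContents(root) {
  return root.children
    .filter((node) => node.type === 'heading')
    .map((node) => ({ level: node.level, text: headingText(node) }));
}

const doc = {
  type: 'root',
  children: [
    { type: 'heading', level: 1, children: [{ type: 'span', value: 'Title' }] },
    { type: 'paragraph', children: [{ type: 'span', value: 'A simple paragraph!' }] },
    {
      type: 'heading',
      level: 2,
      children: [
        { type: 'span', value: 'An h2 heading with a ' },
        { type: 'link', url: 'https://www.datocms.com/', children: [{ type: 'span', value: 'link' }] },
      ],
    },
  ],
};

// Indent each entry by two spaces per heading level below 1.
for (const entry of tableOfContents(doc)) {
  console.log(`${'  '.repeat(entry.level - 1)}- ${entry.text}`);
}
// - Title
//   - An h2 heading with a link
```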

### `list`

A `list` node represents a list of items. Unordered lists must have their `style` field set to `bulleted`, while ordered lists must have it set to `numbered`.

It allows the following children nodes: [`listItem`](https://www.datocms.com/docs/structured-text/dast.md#listItem).

```json
{
  "type": "list",
  "style": "bulleted",
  "children": [
    {
      "type": "listItem",
      "children": [
        {
          "type": "paragraph",
          "children": [
            {
              "type": "span",
              "value": "This is a list item!"
            }
          ]
        }
      ]
    }
  ]
}
```

### `listItem`

A `listItem` node represents an item in a list.

It allows the following children nodes: [`paragraph`](https://www.datocms.com/docs/structured-text/dast.md#paragraph) and [`list`](https://www.datocms.com/docs/structured-text/dast.md#list).

```json
{
  "type": "listItem",
  "children": [
    {
      "type": "paragraph",
      "children": [
        {
          "type": "span",
          "value": "This is a list item!"
        }
      ]
    }
  ]
}
```
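Since a `listItem` can contain both `paragraph` and `list` nodes, lists nest naturally. The sketch below renders a list as indented plain-text lines, switching the marker according to `style`; `renderList` and `inlineText` are hypothetical helpers, and the output format is just one possible choice.

```javascript
// Concatenates the text of a paragraph's inline children.
function inlineText(node) {
  if (node.type === 'span') return node.value;
  return (node.children ?? []).map(inlineText).join('');
}

// Returns one line of text per paragraph, indented by nesting depth.
function renderList(list, depth = 0) {
  const lines = [];
  list.children.forEach((item, index) => {
    const marker = list.style === 'numbered' ? `${index + 1}.` : '-';
    for (const child of item.children) {
      if (child.type === 'list') {
        // A nested list is rendered one level deeper.
        lines.push(...renderList(child, depth + 1));
      } else {
        lines.push(`${'  '.repeat(depth)}${marker} ${inlineText(child)}`);
      }
    }
  });
  return lines;
}

const paragraph = (text) => ({ type: 'paragraph', children: [{ type: 'span', value: text }] });

const list = {
  type: 'list',
  style: 'numbered',
  children: [
    {
      type: 'listItem',
      children: [
        paragraph('First'),
        { type: 'list', style: 'bulleted', children: [{ type: 'listItem', children: [paragraph('Nested')] }] },
      ],
    },
    { type: 'listItem', children: [paragraph('Second')] },
  ],
};

console.log(renderList(list).join('\n'));
// 1. First
//   - Nested
// 2. Second
```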

### `code`

A `code` node represents a block of preformatted text, such as computer code.

It does not allow children nodes.

```json
{
  "type": "code",
  "language": "javascript",
  "highlight": [1],
  "code": "function greetings() {\n  console.log('Hi!');\n}"
}
```
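A renderer typically splits `code` into lines and flags the ones listed in `highlight`. The sketch below assumes the entries in `highlight` are zero-based line indices, which this page does not state, so check that assumption against the JSON Schema before relying on it. `annotateLines` is a hypothetical helper.

```javascript
// Assumption: `highlight` holds zero-based line indices.
function annotateLines(codeNode) {
  const highlighted = new Set(codeNode.highlight ?? []);
  return codeNode.code.split('\n').map((text, index) => ({
    text,
    highlighted: highlighted.has(index),
  }));
}

const codeNode = {
  type: 'code',
  language: 'javascript',
  highlight: [1],
  code: "function greetings() {\n  console.log('Hi!');\n}",
};

// Prefix highlighted lines with ">" and the others with a space.
for (const line of annotateLines(codeNode)) {
  console.log(`${line.highlighted ? '>' : ' '} ${line.text}`);
}
//   function greetings() {
// >   console.log('Hi!');
//   }
```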

### `blockquote`

A `blockquote` node is a container that represents text which is an extended quotation.

It allows the following children nodes: [`paragraph`](https://www.datocms.com/docs/structured-text/dast.md#paragraph).

```json
{
  "type": "blockquote",
  "attribution": "Oscar Wilde",
  "children": [
    {
      "type": "paragraph",
      "children": [
        {
          "type": "span",
          "value": "Be yourself; everyone else is taken."
        }
      ]
    }
  ]
}
```

### `block`

Similarly to [Modular Content](https://www.datocms.com/docs/content-modelling/modular-content.md) fields, you can also embed block records into Structured Text. A `block` node stores a reference to a DatoCMS block record embedded inside the `dast` document.

This type of node can only be put as a direct child of the [`root`](https://www.datocms.com/docs/structured-text/dast.md#root) node.

It does not allow children nodes.

```json
{
  "type": "block",
  "item": "1238455312"
}
```
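Nodes that reference records (`block`, `inlineBlock`, `itemLink` and `inlineItem`) store only an ID in `item`, so a consumer may want to gather those IDs before rendering. A minimal sketch, where `collectItemIds` is a hypothetical helper that groups the IDs by node type:

```javascript
const REFERENCE_TYPES = new Set(['block', 'inlineBlock', 'itemLink', 'inlineItem']);

// Walks the tree and groups referenced record IDs by node type.
function collectItemIds(node, found = new Map()) {
  if (REFERENCE_TYPES.has(node.type)) {
    if (!found.has(node.type)) found.set(node.type, []);
    found.get(node.type).push(node.item);
  }
  for (const child of node.children ?? []) {
    collectItemIds(child, found);
  }
  return found;
}

const doc = {
  type: 'root',
  children: [
    {
      type: 'paragraph',
      children: [
        { type: 'span', value: 'See ' },
        { type: 'inlineItem', item: '74619345' },
      ],
    },
    { type: 'block', item: '1238455312' },
  ],
};

const ids = collectItemIds(doc);

console.log(JSON.stringify([...ids]));
// [["inlineItem",["74619345"]],["block",["1238455312"]]]
```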

### `thematicBreak`

A `thematicBreak` node represents a thematic break between paragraph-level elements: for example, a change of scene in a story, or a shift of topic within a section.

It does not allow children nodes.

```json
{
  "type": "thematicBreak"
}
```
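The "allows the following children nodes" rules from the node sections above can be collected into a lookup table and checked recursively. The sketch below covers only the parent/child rules listed on this page, not attributes or any other constraint in the JSON Schema; `findInvalidChildren` is a hypothetical helper.

```javascript
const INLINE = ['span', 'link', 'itemLink', 'inlineItem', 'inlineBlock'];

// Allowed child types per node type; types not listed here allow no children.
const ALLOWED_CHILDREN = new Map([
  ['root', ['paragraph', 'heading', 'list', 'code', 'blockquote', 'block', 'thematicBreak']],
  ['paragraph', INLINE],
  ['heading', INLINE],
  ['link', ['span']],
  ['itemLink', ['span']],
  ['list', ['listItem']],
  ['listItem', ['paragraph', 'list']],
  ['blockquote', ['paragraph']],
]);

// Returns a list of human-readable problems; an empty list means no rule was violated.
function findInvalidChildren(node, path = node.type) {
  const allowed = ALLOWED_CHILDREN.get(node.type) ?? [];
  const problems = [];
  (node.children ?? []).forEach((child, index) => {
    const childPath = `${path}.children[${index}]`;
    if (!allowed.includes(child.type)) {
      problems.push(`${childPath}: "${child.type}" is not allowed inside "${node.type}"`);
    }
    problems.push(...findInvalidChildren(child, childPath));
  });
  return problems;
}

// A `block` may only be a direct child of `root`, so this document is invalid.
const invalid = {
  type: 'root',
  children: [
    { type: 'paragraph', children: [{ type: 'block', item: '1238455312' }] },
  ],
};

console.log(findInvalidChildren(invalid).join('\n'));
// root.children[0].children[0]: "block" is not allowed inside "paragraph"
```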
