
What is YAML?

YAML is a computer data serialization language.

A YAML document represents a computer program's native data structure in a human-readable text form. A node in a YAML document can have one of three basic data types: scalar, sequence, or mapping.

On top of that, YAML can serialize all other data types and classes.

In addition to the indentation-based Block Style, there is a more compact Flow Style syntax.

One YAML File (or Stream) can consist of more than one Document.

Tutorial

The following examples will introduce you to YAML syntax elements step by step.

Invoice

Let's write an invoice.

It has a number, a name and an address, order items and more.

Mapping

The most common top-level data type is the mapping. A mapping maps keys to values.
Keys and values are separated by a colon followed by a space (: ).
Each key/value pair is on its own line.

invoice number: 314159
name: Santa Claus
address: North Pole

An alternative way to write it:

---
invoice number: 314159
name: Santa Claus
address: North Pole

The --- explicitly starts a Document.

It marks the following content as YAML, but it is optional.

It has some use cases, and it is needed when you have multiple Documents in one file.

Read more about it in the Document Chapter.
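To make the idea of a mapping concrete, here is a rough sketch of the native data structure a typical loader would produce for the invoice above, written as a Python dict. It assumes the loader resolves the unquoted 314159 to an integer, as most YAML loaders do. The variable name invoice is only illustrative.

```python
# Python equivalent of the three-line invoice mapping (illustrative only).
invoice = {
    "invoice number": 314159,  # plain scalar, assumed to resolve to an integer
    "name": "Santa Claus",     # plain scalar resolved to a string
    "address": "North Pole",
}

# Keys may contain spaces, just like in the YAML source.
print(invoice["invoice number"])
```

With or without the leading ---, the loaded data is the same.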

Nested Mappings

Now we replace the address string with another mapping. In that case the colon is followed by a linebreak. In Block Style, mapping values that are not scalars must always start on a new line.

Nested items must always be indented more than the parent node, by at least one space. The typical indentation is two spaces.

Tabs are forbidden as indentation.

invoice number: 314159
name: Santa Claus

address:
  street: Santa Claus Lane
  zip: 12345
  city: North Pole

Don't forget the indentation. If you write it like this:

invoice number: 314159
name: Santa Claus

address:
street: Santa Claus Lane
zip: 12345
city: North Pole

... then it will actually mean this:

invoice number: 314159
name: Santa Claus

address: null
street: Santa Claus Lane
zip: 12345
city: North Pole
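The difference is easier to see in the loaded data. The following sketch writes both results as Python dicts, assuming a loader that maps YAML null to None and plain integers to int. The names nested and flat are only illustrative.

```python
# Correctly indented: address is a nested mapping.
nested = {
    "invoice number": 314159,
    "name": "Santa Claus",
    "address": {
        "street": "Santa Claus Lane",
        "zip": 12345,
        "city": "North Pole",
    },
}

# Indentation forgotten: address is null (None), and street, zip and city
# become siblings of address at the top level.
flat = {
    "invoice number": 314159,
    "name": "Santa Claus",
    "address": None,
    "street": "Santa Claus Lane",
    "zip": 12345,
    "city": "North Pole",
}

print(nested["address"]["city"])
print(flat["address"])
```

Code that expects flat["address"] to be a mapping would fail, because the value is None.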

Sequence

A sequence is a list (or array) of scalars (or other sequences or mappings).
A sequence item starts with a hyphen followed by a space (- ).
Here is the list of YAML inventors:

- Oren Ben-Kiki
- Clark Evans
- Ingy döt Net

Now back to our invoice.
We map a list of scalars to the key order items.

The sequence must start on the next line:

invoice number: 314159
name: Santa Claus
address:
  street: Santa Claus Lane
  zip: 12345
  city: North Pole

order items:
  - Sled
  - Wrapping Paper

Because the - counts as indentation, you can also write it like this:

invoice number: 314159
name: Santa Claus
address:
  street: Santa Claus Lane
  zip: 12345
  city: North Pole

order items:
- Sled
- Wrapping Paper
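Both ways of indenting the sequence describe the same data. As a rough sketch, this is the structure a typical loader would hand to a Python program (assuming plain integers resolve to int; the variable name invoice is illustrative):

```python
# The invoice with a nested mapping and a sequence of scalars.
invoice = {
    "invoice number": 314159,
    "name": "Santa Claus",
    "address": {
        "street": "Santa Claus Lane",
        "zip": 12345,
        "city": "North Pole",
    },
    # A YAML sequence becomes a list; the item order is preserved.
    "order items": ["Sled", "Wrapping Paper"],
}

for item in invoice["order items"]:
    print(item)
```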

Nested Sequences

You can also nest sequences. The typical example is a List of Dice Rolls.

The nested sequence items can follow directly on the same line:

---
- - 2
  - 3
- - 3
  - 6

YAML also lets you write that more compactly, using Flow Style:

---
- [ 2, 3 ]
- [ 3, 6 ]

Read more about it in the Flow Style Chapter.
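Both spellings of the dice rolls describe a list of lists. The flow-style inner sequences shown above also happen to be valid JSON, so this sketch uses Python's standard json module to parse them. It follows JSON rules and is not a YAML parser; the variable names are illustrative.

```python
import json

# The block-style dice rolls, written as a Python list of lists.
dice_rolls = [[2, 3], [3, 6]]

# Each flow-style item from the example is also a valid JSON array.
parsed = [json.loads("[ 2, 3 ]"), json.loads("[ 3, 6 ]")]

print(parsed == dice_rolls)
```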

Aliases / Anchors

Let's add a billing address to the invoice.

In our case it is the same as the shipping address. We rename address to shipping address and add billing address:

invoice number: 314159
name: Santa Claus

shipping address:
  street: Santa Claus Lane
  zip: 12345
  city: North Pole
billing address:
  street: Santa Claus Lane
  zip: 12345
  city: North Pole

order items:
- Sled
- Wrapping Paper

That wastes a bit of space. If it's the same address, you don't need to repeat it. Use an Alias.

In the native data structure of a programming language, this would be a reference, pointer, or alias.

Before an Alias can be used, it has to be created with an Anchor:

invoice number: 314159
name: Santa Claus

shipping address: &address     #   Anchor
  street: Santa Claus Lane     # ┐
  zip: 12345                   # │ Anchor content
  city: North Pole             # ┘
billing address: *address      #   Alias

order items:
- Sled
- Wrapping Paper

When loaded into a native data structure, the shipping address and billing address point to the same data structure.
It depends on the capabilities of the programming language how this is implemented internally.
(Link to Alias chapter)
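In Python, for example, a loader that keeps aliases as references would produce something like the sketch below. This is an assumption about loader behaviour: some loaders copy the data instead, in which case the two addresses would be equal but not the same object. The variable names are illustrative.

```python
# The anchored mapping exists once ...
address = {
    "street": "Santa Claus Lane",
    "zip": 12345,
    "city": "North Pole",
}

# ... and both keys refer to that one object.
invoice = {
    "invoice number": 314159,
    "name": "Santa Claus",
    "shipping address": address,  # &address (Anchor)
    "billing address": address,   # *address (Alias)
    "order items": ["Sled", "Wrapping Paper"],
}

same_object = invoice["shipping address"] is invoice["billing address"]

# A change made through one key is visible through the other.
invoice["billing address"]["zip"] = 54321
print(same_object, invoice["shipping address"]["zip"])
```

This sharing is worth keeping in mind when a program modifies loaded data in place.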

Configuration Management

YAML is used in all kinds of applications as a configuration language.

Continuous Integration

One category is the configuration of Continuous Integration systems.

Here is a minimal example of a GitHub Actions workflow.

name: Linux
on: [push]    # Compact Flow Style Sequence
jobs:
  build:
    name: Run Tests
    runs-on: ubuntu-latest
    steps:
    - name: Say Hello
      run: echo hello

The value for steps is a list of mappings. A mapping can start directly on the same line as the -.

Usually a step has a name, which will be shown as the title when running the job, and a run, which is a shell command, or multiple commands.

Let's add a more realistic scenario, with one step to check out the code, and one with multiple commands.

If you use Double Quotes, which work like JSON strings, it looks like this:

steps:
# Plugin provided by GitHub to check out the code
- uses: actions/checkout@v2
# Run multiple commands
- name: Run Tests
  run: "./configure\nmake\nmake test\n"

One of the advantages of YAML here is that this can be formatted in a way that's easy to write and read with Block Scalars:

steps:
- uses: actions/checkout@v2
- name: Run Tests
  run: |      # Literal Block Scalar
    ./configure
    make
    make test

The Literal Block Scalar, as the name says, contains the literal content of the string. Tabs and similar characters are always literal. All trailing spaces will be kept.
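The double-quoted version and the Literal Block Scalar version of run describe the same string. As a small check, the sketch below parses the double-quoted form with Python's standard json module (the document notes that Double Quotes work like JSON strings) and compares it with the text of the block scalar. It assumes the default behaviour where the block scalar ends with a single linebreak.

```python
import json

# The double-quoted scalar, parsed with JSON rules. The raw string keeps
# the backslashes exactly as they appear in the YAML source.
quoted = json.loads(r'"./configure\nmake\nmake test\n"')

# The content of the Literal Block Scalar: three lines, each ending
# with a linebreak.
literal = "./configure\nmake\nmake test\n"

print(quoted == literal)
```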

Let's say you have a number of longer commands that you would like to break up into multiple lines for readability:

steps:
- uses: actions/checkout@v2
- name: Install dependencies
  run: >      # Folded Block Scalar
    apt-get update
    && apt-get install -y
    git tig vim jq tmux tmate git-subrepo cpanminus

    cpanm -n -l local
    YAML::PP YAML::XS ...

The Folded Block Scalar is like the Literal Block Scalar, but with special folding rules.

Consecutive lines starting at the same indentation level will be folded with spaces, and empty lines create a linebreak.
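The folding rule can be sketched in a few lines of Python. The helper fold_simple below is hypothetical and deliberately simplified: it only covers lines at the same indentation level separated by empty lines, and it ignores more-indented lines and the other details of the YAML specification. It assumes the default of one trailing linebreak.

```python
def fold_simple(lines):
    """Join consecutive non-empty lines with a space; each empty line
    becomes a linebreak. A simplified sketch, not a full YAML folder."""
    paragraphs = [[]]
    for line in lines:
        if line == "":
            paragraphs.append([])       # an empty line starts a new paragraph
        else:
            paragraphs[-1].append(line)
    return "\n".join(" ".join(p) for p in paragraphs) + "\n"


# The content lines of the Folded Block Scalar, indentation removed.
content = [
    "apt-get update",
    "&& apt-get install -y",
    "git tig vim jq tmux tmate git-subrepo cpanminus",
    "",
    "cpanm -n -l local",
    "YAML::PP YAML::XS ...",
]

print(fold_simple(content), end="")
```

The result is two lines: one apt-get command chain and one cpanm command.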

Read more about Block Scalars and all other ways of quoting in the Quoting Chapter.

Variables

YAML itself has no concept of "variables" or "functions".

Systems like GitHub Actions usually provide a way to access certain information and environment variables with a Templating Syntax.

We set up a "matrix" test to build the code with gcc and clang.

strategy:
  matrix:
    compiler: [gcc, clang]
steps:
- ...

The strategy.matrix entry will create two jobs instead of one, providing the compiler in a "context" item that we can pass as an environment variable to the step:

strategy:
  matrix:
    compiler: [gcc, clang]
steps:
- uses: actions/checkout@v2
- name: Run Tests
  env:
    CC: ${{ matrix.compiler }}
  run: |
    ./configure
    make
    make test

This sets the environment variable CC to gcc or clang, respectively.

The ${{ matrix.compiler }} syntax is not special YAML syntax.
It is a simple plain scalar that could also have been written in quotes:

env:
  CC: '${{ matrix.compiler }}'

It's the GitHub Actions application that recognizes such variables and replaces them with their content at runtime.
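To illustrate that division of labour, here is a toy sketch in Python. The YAML loader only delivers the plain string; a separate step in the application replaces the placeholder. The render helper, the PLACEHOLDER pattern and the flat context keys are all hypothetical and say nothing about how GitHub Actions is actually implemented.

```python
import re

# Matches ${{ name }} and captures the dotted name inside.
PLACEHOLDER = re.compile(r"\$\{\{\s*([\w.]+)\s*\}\}")


def render(text, context):
    """Replace each ${{ name }} in text with context[name] (toy helper)."""
    return PLACEHOLDER.sub(lambda m: str(context[m.group(1)]), text)


# What the YAML loader returns for CC: an ordinary string.
loaded_value = "${{ matrix.compiler }}"

# One environment per matrix entry, as in the two jobs described above.
for compiler in ["gcc", "clang"]:
    env = {"CC": render(loaded_value, {"matrix.compiler": compiler})}
    print(env)
```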

Such variables can look different, depending on the application.

For example, Ansible uses the Jinja2 templating engine, where variables look like this:

with_items: '{{ user.names }}'

It is important to add quotes here, because otherwise the { at the start would begin a Flow Style Mapping.

So it's clever that GitHub Actions chose the ${{ ... }} syntax, because the $ at the start is not special in YAML and doesn't need quotes.
