mirror of https://github.com/status-im/NimYAML.git
406 lines
15 KiB
Plaintext
406 lines
15 KiB
Plaintext
======================
|
|
Serialization Overview
|
|
======================
|
|
|
|
Introduction
|
|
============
|
|
|
|
NimYAML tries hard to make transforming YAML characters streams to native Nim
|
|
types and vice versa as easy as possible. In simple scenarios, you might not
|
|
need anything else than the two procs
|
|
`dump <yaml.serialization.html#dump,K,Stream,TagStyle,AnchorStyle,PresentationOptions>`_
|
|
and `load <yaml.serialization.html#load,,K>`_. On the other side, the process
|
|
should be as customizable as possible to allow the user to tightly control how
|
|
the generated YAML character stream will look and how a YAML character stream is
|
|
interpreted.
|
|
|
|
An important thing to remember in NimYAML is that unlike in interpreted
|
|
languages like Ruby, Nim cannot load a YAML character stream without knowing the
|
|
resulting type beforehand. For example, if you want to load this piece of YAML:
|
|
|
|
.. code-block:: yaml
|
|
|
|
%YAML 1.2
|
|
--- !nim:system:seq(nim:system:int8)
|
|
- 1
|
|
- 2
|
|
- 3
|
|
|
|
You would need to know that it will load a ``seq[int8]`` *at compile time*. This
|
|
is not really a problem because without knowing which type you will load, you
|
|
cannot do anything useful with the result afterwards in the code. But it may be
|
|
unfamiliar for programmers who are used to the YAML libraries of Python or Ruby.
|
|
|
|
Supported Types
|
|
===============
|
|
|
|
NimYAML supports a growing number of types of Nim's ``system`` module and
|
|
standard library, and it also supports user-defined object, tuple and enum types
|
|
out of the box. A complete list of explicitly supported types is available in
|
|
`Schema <schema.html>`_.
|
|
|
|
**Important**: NimYAML currently does not support polymorphism. This may be
|
|
added in the future.
|
|
|
|
This also means that NimYAML is generally able to work with object, tuple and
|
|
enum types defined in the standard library or a third-party library without
|
|
further configuration, given that all fields of the object are accessible at the
|
|
code point where NimYAML's facilities are invoked.
|
|
|
|
Scalar Types
|
|
------------
|
|
|
|
The following integer types are supported by NimYAML: ``int``, ``int8``,
|
|
``int16``, ``int32``, ``int64``, ``uint8``, ``uint16``, ``uint32``, ``uint64``.
|
|
Note that the ``int`` type has a variable size dependent on the target
|
|
operation system. To make sure that it round-trips properly between 32-bit and
|
|
64-bit operating systems, it will be converted to an ``int32`` during loading
|
|
and dumping. This will raise an exception for values outside of the range
|
|
``int32.low .. int32.high``! If you define the types you serialize yourself,
|
|
always consider using an integer type with explicit length. The same goes for
|
|
``uint``.
|
|
|
|
The floating point types ``float``, ``float32`` and ``float64`` are also
|
|
supported. There is currently no problem with ``float``, because it is always a
|
|
``float64``.
|
|
|
|
``string`` is supported and one of the few Nim types which directly map to a
|
|
standard YAML type. NimYAML is able to handle strings that are ``nil``, they
|
|
will be serialized with the special tag ``!nim:nil:string``. ``char`` is also
|
|
supported.
|
|
|
|
To support new scalar types, you must implement the ``constructObject()`` and
|
|
``representObject()`` procs on that type (see below).
|
|
|
|
Collection Types
|
|
----------------
|
|
|
|
NimYAML supports Nim's ``array``, ``set``, ``seq``, ``Table`` and
|
|
``OrderedTable`` types out of the box. Unlike the native YAML types ``!!seq``
|
|
and ``!!map``, Nim's collection types define the type of all their contained
|
|
items (or keys and values). So YAML objects with heterogeneous types in them
|
|
cannot be loaded to Nim collection types. For example, this sequence:
|
|
|
|
.. code-block:: yaml
|
|
|
|
%YAML 1.2
|
|
--- !!seq
|
|
- !!int 1
|
|
- !!string foo
|
|
|
|
Cannot be loaded to a Nim ``seq``. For this reason, you cannot load YAML's
|
|
native ``!!map`` and ``!!seq`` types directly into Nim types. However, you can
|
|
use variant object types to process heterogeneous value lists, see below.
|
|
|
|
Nim ``seq`` types may be ``nil``. This is handled by serializing them to an
|
|
empty scalar with the tag ``!nim:nil:seq``.
|
|
|
|
Reference Types
|
|
---------------
|
|
|
|
A reference to any supported non-reference type (including user defined types,
|
|
see below) is supported by NimYAML. A reference type will be treated like its
|
|
base type, but NimYAML is able to detect multiple references to the same object
|
|
and dump the structure properly with anchors and aliases in place. It is
|
|
possible to dump and load cyclic data structures without further configuration.
|
|
It is possible for reference types to hold a ``nil`` value, which will be mapped
|
|
to the ``!!null`` YAML scalar type.
|
|
|
|
``ptr`` types are not supported because it seems dangerous to automatically
|
|
allocate memory which the user must then manually deallocate.
|
|
|
|
User Defined Types
|
|
------------------
|
|
|
|
For an object or tuple type to be directly usable with NimYAML, the following
|
|
conditions must be met:
|
|
|
|
- Every type contained in the object/tuple must be supported
|
|
- All fields of an object type must be accessible from the code position where
|
|
you call NimYAML. If an object has non-public member fields, it can only be
|
|
processed in the module where it is defined.
|
|
- The object may not have a generic parameter
|
|
|
|
NimYAML will present enum types as YAML scalars, and tuple and object types as
|
|
YAML maps. Some of the conditions above may be loosened in future releases.
|
|
|
|
Variant Object Types
|
|
....................
|
|
|
|
A *variant object type* is an object type that contains one or more ``case``
|
|
clauses. NimYAML supports variant object types. Only the currently accessible
|
|
fields of a variant object type are dumped, and only those may be present when
|
|
loading.
|
|
|
|
The value of a discriminator field must be loaded before any value of a field
|
|
that depends on it. Therefore, a YAML mapping cannot be used to serialize
|
|
variant object types - the YAML specification explicitly states that the order
|
|
of key-value pairs in a mapping must not be used to convey content information.
|
|
So, any variant object type is serialized as a list of key-value pairs.
|
|
|
|
For example, this type:
|
|
|
|
.. code-block:: nim
|
|
type
|
|
AnimalKind = enum
|
|
akCat, akDog
|
|
|
|
Animal = object
|
|
name: string
|
|
case kind: AnimalKind
|
|
of akCat:
|
|
purringIntensity: int
|
|
of akDog:
|
|
barkometer: int
|
|
|
|
will be serialized as:
|
|
|
|
.. code-block:: yaml
|
|
%YAML 1.2
|
|
--- !nim:custom:Animal
|
|
- name: Bastet
|
|
- kind: akCat
|
|
- purringIntensity: 7
|
|
|
|
You can also use variant object types for processing heterogeneous data sets.
|
|
For example, if you have a YAML document which contains differently typed values
|
|
in the same list like this:
|
|
|
|
.. code-block:: yaml
|
|
%YAML 1.2
|
|
---
|
|
- 42
|
|
- this is a string
|
|
- !!null
|
|
|
|
You can define a variant object type that can hold all types that occur in this
|
|
list in order to load it:
|
|
|
|
.. code-block:: nim
|
|
import yaml
|
|
|
|
type
|
|
ContainerKind = enum
|
|
ckInt, ckString, ckNone
|
|
Container = object
|
|
case kind: ContainerKind
|
|
of ckInt:
|
|
intVal: int
|
|
of ckString:
|
|
strVal: string
|
|
of ckNone:
|
|
discard
|
|
|
|
markAsImplicit(Container)
|
|
|
|
var
|
|
list: seq[Container]
|
|
s = newFileStream("in.yaml")
|
|
load(s, list)
|
|
|
|
``markAsImplicit`` tells NimYAML that you want to use the type ``Container``
|
|
implicitly, i.e. its fields are not visible in YAML, and are set dependent on
|
|
the value type that gets loaded into it. The type ``Container`` must fullfil the
|
|
following requirements:
|
|
|
|
- It must contain exactly one ``case`` clause, and nothing else.
|
|
- Each branch of the ``case`` clause must contain exactly one field, with one
|
|
exception: There may be at most one branch that contains no field at all.
|
|
- It must not be a derived object type (this is currently not enforced)
|
|
|
|
When loading the sequence, NimYAML writes the value into the first field that
|
|
can hold the value's type. All complex values (i.e. non-scalar values) *must*
|
|
have a tag in the YAML source, because NimYAML would otherwise be unable to
|
|
determine their type. The type of scalar values will be guessed if no tag is
|
|
available, but be aware that ``42`` can fit in both ``int8`` and ``int16``, so
|
|
in the case you have fields for both types, you should annotate the value.
|
|
|
|
When dumping the sequence, NimYAML will always annotate a tag to each value it
|
|
outputs. This is to avoid possible ambiguity when loading. If a branch without
|
|
a field exists, it is represented as a ``!!null`` value.
|
|
|
|
Tags
|
|
====
|
|
|
|
NimYAML uses local tags to represent Nim types that do not map directly to a
|
|
YAML type. For example, ``int8`` is presented with the tag ``!nim:system:int8``.
|
|
Tags are mostly unnecessary when loading YAML data because the user already
|
|
defines the target Nim type which usually defines all types of the structure.
|
|
However, there is one case where a tag is necessary: A reference type with the
|
|
value ``nil`` is represented in YAML as a ``!!null`` scalar. This will be
|
|
automatically detected by type guessing, but if it is for example a reference to
|
|
a string with the value ``"~"``, it must be tagged with ``!!string``, because
|
|
otherwise, it would be loaded as ``nil``.
|
|
|
|
As you might have noticed in the example above, the YAML tag of a ``seq``
|
|
depends on its generic type parameter. The same applies to ``Table``. So, a
|
|
table that maps ``int8`` to string sequences would be presented with the tag
|
|
``!n!tables:Table(tag:nimyaml.org,2016:int8,tag:nimyaml.org,2016:system:seq(tag:yaml.org,2002:string))``.
|
|
These tags are generated on the fly based on the types you instantiate
|
|
``Table`` or ``seq`` with.
|
|
|
|
You may customize the tags used for your types by using the template
|
|
`setTagUri <yaml.html#setTagUri.t,typedesc,string>`_. It may not
|
|
be applied to scalar and collection types implemented by NimYAML, but you can
|
|
for example use it on a certain ``seq`` type:
|
|
|
|
.. code-block:: nim
|
|
|
|
setTagUri(seq[string], "!nim:my:seq")
|
|
|
|
Customizing Field Handling
|
|
==========================
|
|
|
|
NimYAML allows the user to specify special handling of certain object fields.
|
|
This configuration will be applied at compile time when NimYAML generates the
|
|
(de)serialization code for an object type. It is important that the
|
|
configuration happens before any YAML operations (e.g. ``load`` or ``dump``) are
|
|
executed on a variable of the object type.
|
|
|
|
Transient Fields
|
|
----------------
|
|
|
|
It may happen that certain fields of an object type are transient, i.e. they are
|
|
used in a way that makes (de)serializing them unnecessary. Such fields can be
|
|
marked as transient. This will cause them not to be serialized to YAML. They
|
|
will also not be accepted when loading the object.
|
|
|
|
Example:
|
|
|
|
.. code-block:: nim
|
|
|
|
type MyObject: object
|
|
storable: string
|
|
temporary: string
|
|
|
|
markAsTransient(MyObject, temporary)
|
|
|
|
Default Values
|
|
--------------
|
|
|
|
When you load YAML that has been written by a human, you might want to allow the
|
|
user to omit certain fields, which should then be filled with a default value.
|
|
You can do that like this:
|
|
|
|
.. code-block:: nim
|
|
|
|
type MyObject: object
|
|
required: string
|
|
optional: string
|
|
|
|
setDefaultValue(MyObject, optional, "default value")
|
|
|
|
Whenever ``MyObject`` now is loaded and the input stream does not contain the
|
|
field ``optional``, that field will be set to the value ``"default value"``.
|
|
|
|
Customize Serialization
|
|
=======================
|
|
|
|
It is possible to customize the serialization of a type. For this, you need to
|
|
implement two procs, ``constructObject̀`` and ``representObject``. If you only
|
|
need to process the type in one direction (loading or dumping), you can omit
|
|
the other proc.
|
|
|
|
constructObject
|
|
---------------
|
|
|
|
.. code-block:: nim
|
|
|
|
proc constructObject*(s: var YamlStream, c: ConstructionContext,
|
|
result: var MyObject)
|
|
{.raises: [YamlConstructionError, YamlStreamError.}
|
|
|
|
This proc should construct the type from a ``YamlStream``. Follow the following
|
|
guidelines when implementing a custom ``constructObject`` proc:
|
|
|
|
- For constructing a value from a YAML scalar, consider using the
|
|
``constructScalarItem`` template, which will automatically catch exceptions
|
|
and wrap them with a ``YamlConstructionError``, and also will assure that the
|
|
item you use for construction is a ``yamlScalar``. See below for an example.
|
|
- For constructing a value from a YAML sequence or map, you **must** use the
|
|
``constructChild`` proc for child values if you want to use their
|
|
``constructObject`` implementation. This will check their tag and anchor.
|
|
Always try to construct child values that way.
|
|
- For non-scalars, make sure that the last value you remove from the stream is
|
|
the object's ending event (``yamlEndMap`` or ``yamlEndSequence``)
|
|
- Use `peek <yaml.html#peek,YamlStream>`_ for inspecting the next event in
|
|
the ``YamlStream`` without removing it.
|
|
- Never write a ``constructObject`` proc for a ``ref`` type. ``ref`` types are
|
|
always handled by NimYAML itself. You can only customize the construction of
|
|
the underlying object.
|
|
|
|
The following example for constructing from a YAML scalar value is the actual
|
|
implementation of constructing ``int`` types:
|
|
|
|
.. code-block:: nim
|
|
|
|
proc constructObject*[T: int8|int16|int32|int64](
|
|
s: var YamlStream, c: ConstructionContext, result: var T)
|
|
{.raises: [YamlConstructionError, YamlStreamError].} =
|
|
var item: YamlStreamEvent
|
|
constructScalarItem(s, item, name(T)):
|
|
result = T(parseBiggestInt(item.scalarContent))
|
|
|
|
The following example for constructiong from a YAML non-scalar is the actual
|
|
implementation of constructing ``seq`` types:
|
|
|
|
.. code-block:: nim
|
|
|
|
proc constructObject*[T](s: var YamlStream, c: ConstructionContext,
|
|
result: var seq[T])
|
|
{.raises: [YamlConstructionError, YamlStreamError].} =
|
|
let event = s.next()
|
|
if event.kind != yamlStartSequence:
|
|
raise newException(YamlConstructionError, "Expected sequence start")
|
|
result = newSeq[T]()
|
|
while s.peek().kind != yamlEndSequence:
|
|
var item: T
|
|
constructChild(s, c, item)
|
|
result.add(item)
|
|
discard s.next()
|
|
|
|
representObject
|
|
---------------
|
|
|
|
.. code-block:: nim
|
|
|
|
proc representObject*(value: MyObject, ts: TagStyle = tsNone,
|
|
c: SerializationContext, tag: TagId): {.raises: [].}
|
|
|
|
This proc should push a list of tokens that represent the type into the
|
|
serialization context via ``c.put``. Follow the following guidelines when
|
|
implementing a custom ``representObject`` proc:
|
|
|
|
- You can use the helper template
|
|
`presentTag <yaml.html#presentTag,typedesc,TagStyle>`_ for outputting the
|
|
tag.
|
|
- Always output the first token with a ``yAnchorNone``. Anchors will be set
|
|
automatically by ``ref`` type handling.
|
|
- When outputting non-scalar types, you should use the ``representObject``
|
|
implementation of the child types, if possible.
|
|
- Always use the ``tag`` parameter as tag for the first token you generate.
|
|
- Never write a ``representObject`` proc for ``ref`` types.
|
|
|
|
The following example for representing to a YAML scalar is the actual
|
|
implementation of representing ``int`` types:
|
|
|
|
.. code-block:: nim
|
|
|
|
proc representObject*[T: int8|int16|int32|int64](value: T, ts: TagStyle,
|
|
c: SerializationContext, tag: TagId) {.raises: [].} =
|
|
## represents an integer value as YAML scalar
|
|
c.put(scalarEvent($value, tag, yAnchorNone))
|
|
|
|
The following example for representing to a YAML non-scalar is the actual
|
|
implementation of representing ``seq`` and ``set`` types:
|
|
|
|
.. code-block:: nim
|
|
|
|
proc representObject*[T](value: seq[T]|set[T], ts: TagStyle,
|
|
c: SerializationContext, tag: TagId) {.raises: [YamlStreamError].} =
|
|
## represents a Nim seq as YAML sequence
|
|
let childTagStyle = if ts == tsRootOnly: tsNone else: ts
|
|
c.put(startSeqEvent(tag))
|
|
for item in value:
|
|
representChild(item, childTagStyle, c)
|
|
c.put(endSeqEvent()) |