Deprecated DOM API, YamlNode now supported by serialization API

* implements #48
* also fixes #108
* updated docs
Felix Krause 2022-06-05 12:27:19 +02:00
parent 8993f928d4
commit 741fd18047
5 changed files with 130 additions and 145 deletions


@@ -1,3 +1,17 @@
+## 1.0.0 (upcoming)
+Features:
+* ``YamlNode`` can now be used with the serialization API (``load`` / ``dump``)
+and can be used to hold substructures that should not be deserialized to
+native types (#48).
+Bugfixes:
+* Raise a proper exception when a stream contains no documents but one is
+expected (#108)
+* Comments after a block scalar do not lead to a crash anymore (#106)
 ## 0.16.0
 Features:


@@ -25,8 +25,8 @@ Intermediate Representation
 ===========================
 The base of all YAML processing with NimYAML is the
-`YamlStream <yaml.stream.html#YamlStream>`_. This is basically an iterator over
-`YamlStreamEvent <yaml.stream.html#YamlStreamEvent>`_ objects. Every proc that
+`YamlStream <api/stream.html#YamlStream>`_. This is basically an iterator over
+`YamlStreamEvent <api/stream.html#YamlStreamEvent>`_ objects. Every proc that
 represents a single stage of the loading or dumping process will either take a
 ``YamlStream`` as input or return a ``YamlStream``. Procs that implement the
 whole process in one step hide the ``YamlStream`` from the user. Every proc that
@@ -45,14 +45,14 @@ Loading YAML
 ============
 If you want to load YAML character data directly into a native Nim variable, you
-can use `load <yaml.serialization.html#load,,K>`_. This is the easiest and
+can use `load <api/serialization.html#load,,K>`_. This is the easiest and
 recommended way to load YAML data. This section gives an overview about how
 ``load`` is implemented. It is absolutely possible to reimplement the loading
 step using the low-level API.
-For parsing, a `YamlParser <yaml.parser.html#YamlParser>`_ object is needed.
+For parsing, a `YamlParser <api/parser.html#YamlParser>`_ object is needed.
 This object stores some state while parsing that may be useful for error
-reporting to the user. The `parse <yaml.parser.html#parse,YamlParser,Stream>`_
+reporting to the user. The `parse <api/parser.html#parse,YamlParser,Stream>`_
 proc implements the YAML processing step of the same name. All syntax errors in
 the input character stream are processed by ``parse``, which will raise a
 ``YamlParserError`` if it encounters a syntax error.
@@ -68,13 +68,13 @@ Dumping YAML
 ============
 Dumping is preferredly done with
-`dump <yaml.serialization.html#dump,K,Stream,TagStyle,AnchorStyle,PresentationOptions>`_,
+`dump <api/serialization.html#dump,K,Stream,TagStyle,AnchorStyle,PresentationOptions>`_,
 which serializes a native Nim variable to a character stream. As with ``load``,
 the following paragraph describes how ``dump`` is implemented using the
 low-level API.
 A Nim value is transformed into a ``YamlStream`` with
-`represent <yaml.serialization.html#represent,T,TagStyle,AnchorStyle>`_.
+`represent <api/serialization.html#represent,T,TagStyle,AnchorStyle>`_.
 Depending on the ``AnchorStyle`` you specify, this will transform ``ref``
 variables with multiple instances into anchored elements and aliases (for
 ``asTidy`` and ``asAlways``) or write the same element into all places it
@@ -82,7 +82,7 @@ occurs (for ``asNone``). Be aware that if you use ``asNone``, the value you
 serialize might not round-trip.
 Transforming a ``YamlStream`` into YAML character data is done with
-`present <yaml.presenter.html#present,YamlStream,Stream,TagLibrary,PresentationOptions>`_.
+`present <api/presenter.html#present,YamlStream,Stream,TagLibrary,PresentationOptions>`_.
 You can choose from multiple presentation styles. ``psJson`` is not able to
 process some features of ``YamlStream`` s, the other styles support all features
 and are guaranteed to round-trip to the same ``YamlStream`` if you parse the
@@ -91,21 +91,19 @@ generated YAML character stream again.
 The Document Object Model
 =========================
-Much like XML, YAML also defines a *document object model*. If you cannot or do
-not want to load a YAML character stream to native Nim types, you can instead
-load it into a `YamlDocument <yaml.dom.html#YamlDocument>`_. This
-``YamlDocument`` can also be serialized into a YAML character stream. All tags
-will be preserved exactly as they are when transforming from and to a
-``YamlDocument``. The only important thing to remember is that when a value has
-no tag, it will get the non-specific tag ``"!"`` for quoted scalars and ``"?"``
-for all other nodes.
+Unlike XML, YAML does not define an official *document object model*. However,
+if you cannot or do not want to load a YAML input stream to native Nim types,
+you can load it into the predefined type `YamlNode <api/dom.html#YamlNode>`_.
+You can also use this type inside your native types to deserialize parts of the
+YAML input into it. Likewise, you can serialize a ``YamlNode`` into YAML. You
+can use this to preserve parts of YAML data you do not wish to or cannot fully
+deserialize.
-While tags are preserved, anchors will be resolved during loading and re-added
-during serialization. It is allowed for a ``YamlNode`` to occur multiple times
-within a ``YamlDocument``, in which case it will be serialized once and referred
-to afterwards via aliases.
+A ``YamlNode`` preserves its given tag and the tags of any child nodes. However,
+anchors will be resolved during loading and re-added during serialization. It is
+allowed for a ``YamlNode`` to occur multiple times within source/target root
+object, in which case it will be serialized once and referred to afterwards via
+aliases.
-The document object model is provided for completeness, but you are encouraged
-to use native Nim types as start- or endpoint instead. That may be significantly
-faster, as every ``YamlNode`` is allocated on the heap and subject to garbage
-collection.
+``YamlNode`` is allocated on the heap and using it will be slower and consume
+more memory than deserializing into native types.


@@ -95,3 +95,29 @@ suite "DOM":
            scalarEvent("a", anchor = "b".Anchor),
            scalarEvent("b", anchor="c".Anchor), aliasEvent("b".Anchor),
            endSeqEvent(), endDocEvent(), endStreamEvent())
+  test "Deserialize parts of the input into YamlNode":
+    let
+      input = "a: b\nc: [d, e]"
+    type Root = object
+      a: string
+      c: YamlNode
+    var result = loadAs[Root](input)
+    assert result.a == "b"
+    assert result.c.kind == ySequence
+    assert result.c.len == 2
+    assert result.c[0].kind == yScalar
+    assert result.c[0].content == "d"
+    assert result.c[1].kind == yScalar
+    assert result.c[1].content == "e"
+  test "Serialize value that contains a YamlNode":
+    type Root = object
+      a: string
+      c: YamlNode
+    let value = Root(
+      a: "b",
+      c: newYamlNode([newYamlNode("d"), newYamlNode("e")]))
+    var result = represent(value, tsNone, handles = @[])
+    ensure(result, startStreamEvent(), startDocEvent(), startMapEvent(),
+           scalarEvent("a"), scalarEvent("b"), scalarEvent("c"), startSeqEvent(),
+           scalarEvent("d"), scalarEvent("e"), endSeqEvent(), endMapEvent(),
+           endDocEvent(), endStreamEvent())


@@ -32,7 +32,7 @@ type
   YamlNodeKind* = enum
     yScalar, yMapping, ySequence
-  YamlNode* = ref YamlNodeObj not nil
+  YamlNode* = ref YamlNodeObj
     ## Represents a node in a ``YamlDocument``.
   YamlNodeObj* = object
@@ -44,6 +44,7 @@ type
       # compiler does not like Table[YamlNode, YamlNode]
   YamlDocument* = object
+    {.deprecated: "use YamlNode with serialization API instead".}
     ## Represents a YAML document.
     root*: YamlNode
@@ -135,8 +136,9 @@ proc newYamlNode*(fields: openarray[(YamlNode, YamlNode)],
 proc initYamlDoc*(root: YamlNode): YamlDocument =
   result = YamlDocument(root: root)
-proc composeNode(s: var YamlStream, c: ConstructionContext):
-    YamlNode {.raises: [YamlStreamError, YamlConstructionError].} =
+proc constructChild*(s: var YamlStream, c: ConstructionContext,
+                     result: var YamlNode)
+    {.raises: [YamlStreamError, YamlConstructionError].} =
   template addAnchor(c: ConstructionContext, target: Anchor) =
     if target != yAnchorNone:
       yAssert(not c.refs.hasKey(target))
@@ -144,17 +146,18 @@ proc composeNode(s: var YamlStream, c: ConstructionContext):
   var start: Event
   shallowCopy(start, s.next())
-  new(result)
-  try:
   case start.kind
   of yamlStartMap:
     result = YamlNode(tag: start.mapProperties.tag,
                       kind: yMapping,
                       fields: newTable[YamlNode, YamlNode]())
     while s.peek().kind != yamlEndMap:
-      let
-        key = composeNode(s, c)
-        value = composeNode(s, c)
+      var
+        key: YamlNode = nil
+        value: YamlNode = nil
+      constructChild(s, c, key)
+      constructChild(s, c, value)
       if result.fields.hasKeyOrPut(key, value):
         raise newException(YamlConstructionError,
             "Duplicate key: " & $key)
@@ -165,7 +168,9 @@ proc composeNode(s: var YamlStream, c: ConstructionContext):
                       kind: ySequence,
                       elems: newSeq[YamlNode]())
     while s.peek().kind != yamlEndSeq:
-      result.elems.add(composeNode(s, c))
+      var item: YamlNode = nil
+      constructChild(s, c, item)
+      result.elems.add(item)
     addAnchor(c, start.seqProperties.anchor)
     discard s.next()
   of yamlScalar:
@@ -174,47 +179,22 @@ proc composeNode(s: var YamlStream, c: ConstructionContext):
     shallowCopy(result.content, start.scalarContent)
     addAnchor(c, start.scalarProperties.anchor)
   of yamlAlias:
-    result = cast[YamlNode](c.refs[start.aliasTarget].p)
+    result = cast[YamlNode](c.refs.getOrDefault(start.aliasTarget).p)
   else: internalError("Malformed YamlStream")
-  except KeyError:
-    raise newException(YamlConstructionError,
-                       "Wrong tag library: TagId missing")
 proc compose*(s: var YamlStream): YamlDocument
-    {.raises: [YamlStreamError, YamlConstructionError].} =
-  var context = newConstructionContext()
-  var n: Event
-  shallowCopy(n, s.next())
-  yAssert n.kind == yamlStartDoc
-  result = YamlDocument(root: composeNode(s, context))
-  n = s.next()
-  yAssert n.kind == yamlEndDoc
+    {.raises: [YamlStreamError, YamlConstructionError],
+      deprecated: "use construct(s, root) instead".} =
+  construct(s, result.root)
 proc loadDom*(s: Stream | string): YamlDocument
-    {.raises: [IOError, OSError, YamlParserError, YamlConstructionError].} =
-  var
-    parser = initYamlParser()
-    events = parser.parse(s)
-    e: Event
-  try:
-    e = events.next()
-    yAssert(e.kind == yamlStartStream)
-    result = compose(events)
-    e = events.next()
-    if e.kind != yamlEndStream:
-      raise newYamlConstructionError(events, e.startPos, "stream contains multiple documents")
-  except YamlStreamError:
-    let ex = getCurrentException()
-    if ex.parent of YamlParserError:
-      raise (ref YamlParserError)(ex.parent)
-    elif ex.parent of IOError:
-      raise (ref IOError)(ex.parent)
-    elif ex.parent of OSError:
-      raise (ref OSError)(ex.parent)
-    else: internalError("Unexpected exception: " & ex.parent.repr)
+    {.raises: [IOError, OSError, YamlParserError, YamlConstructionError]
+      deprecated: "use loadAs[YamlNode](s) instead".} =
+  load(s, result.root)
 proc loadMultiDom*(s: Stream | string): seq[YamlDocument]
-    {.raises: [IOError, OSError, YamlParserError, YamlConstructionError].} =
+    {.raises: [IOError, OSError, YamlParserError, YamlConstructionError]
+      deprecated: "use loadMultiDoc[YamlNode](s, target) instead".} =
   var
     parser = initYamlParser(tagLib)
     events = parser.parse(s)
@@ -236,68 +216,33 @@ proc loadMultiDom*(s: Stream | string): seq[YamlDocument]
       raise (ref OSError)(ex.parent)
     else: internalError("Unexpected exception: " & ex.parent.repr)
-proc serializeNode(n: YamlNode, c: SerializationContext, a: AnchorStyle)
-    {.raises: [].}=
-  var anchor = yAnchorNone
-  let p = cast[pointer](n)
-  if a != asNone and c.refs.hasKey(p):
-    anchor = c.refs.getOrDefault(p).a
-    c.refs[p] = (anchor, true)
-    c.put(aliasEvent(anchor))
-    return
-  if a != asNone:
-    anchor = c.nextAnchorId.Anchor
-    c.refs[p] = (c.nextAnchorId.Anchor, false)
-    nextAnchor(c.nextAnchorId, len(c.nextAnchorId) - 1)
-  case n.kind
-  of yScalar: c.put(scalarEvent(n.content, n.tag, anchor))
+proc representChild*(value: YamlNodeObj, ts: TagStyle,
+                     c: SerializationContext) =
+  let childTagStyle = if ts == tsRootOnly: tsNone else: ts
+  case value.kind
+  of yScalar: c.put(scalarEvent(value.content, value.tag))
   of ySequence:
-    c.put(startSeqEvent(csBlock, (anchor, n.tag)))
-    for item in n.elems:
-      serializeNode(item, c, a)
+    c.put(startSeqEvent(tag = value.tag))
+    for item in value.elems: representChild(item, childTagStyle, c)
     c.put(endSeqEvent())
   of yMapping:
-    c.put(startMapEvent(csBlock, (anchor, n.tag)))
-    for key, value in n.fields.pairs:
-      serializeNode(key, c, a)
-      serializeNode(value, c, a)
+    c.put(startMapEvent(tag = value.tag))
+    for key, value in value.fields.pairs:
+      representChild(key, childTagStyle, c)
+      representChild(value, childTagStyle, c)
     c.put(endMapEvent())
-proc serialize*(doc: YamlDocument, a: AnchorStyle = asTidy):
-    YamlStream {.raises: [].} =
-  var
-    bys = newBufferYamlStream()
-    c = newSerializationContext(a, proc(e: Event) {.raises: [].} =
-      bys.put(e)
-    )
-  c.put(startStreamEvent())
-  c.put(startDocEvent())
-  serializeNode(doc.root, c, a)
-  c.put(endDocEvent())
-  c.put(endStreamEvent())
-  if a == asTidy:
-    var ctx = initAnchorContext()
-    for event in bys.mitems():
-      case event.kind
-      of yamlScalar: ctx.process(event.scalarProperties, c.refs)
-      of yamlStartMap: ctx.process(event.mapProperties, c.refs)
-      of yamlStartSeq: ctx.process(event.seqProperties, c.refs)
-      of yamlAlias:
-        event.aliasTarget = ctx.map(event.aliasTarget)
-      else: discard
-  result = bys
+proc serialize*(doc: YamlDocument, a: AnchorStyle = asTidy): YamlStream
+    {.deprecated: "use represent[YamlNode] instead".} =
+  result = represent(doc.root, tsAll, a = a, handles = @[])
 proc dumpDom*(doc: YamlDocument, target: Stream,
               anchorStyle: AnchorStyle = asTidy,
               options: PresentationOptions = defaultPresentationOptions)
-    {.raises: [YamlPresenterJsonError, YamlPresenterOutputError,
-               YamlStreamError].} =
+    {.deprecated: "use dump[YamlNode] instead".} =
   ## Dump a YamlDocument as YAML character stream.
-  var
-    events = serialize(doc,
-        if options.style == psJson: asNone else: anchorStyle)
-  present(events, target, options)
+  dump(doc.root, target, tsAll, anchorStyle = anchorStyle, options = options,
+       handles = @[])
 proc `[]`*(node: YamlNode, i: int): YamlNode =
   ## Get the node at index *i* from a sequence. *node* must be a *ySequence*.


@@ -1340,6 +1340,8 @@ proc load*[K](input: Stream | string, target: var K)
       events = parser.parse(input)
       e = events.next()
     yAssert(e.kind == yamlStartStream)
+    if events.peek().kind != yamlStartDoc:
+      raise constructionError(events, e.startPos, "stream contains no documents")
     construct(events, target)
     e = events.next()
     if e.kind != yamlEndStream: