diff --git a/CHANGELOG.md b/CHANGELOG.md
index 20d8d6a..7e2cd99 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,3 +1,17 @@
+## 1.0.0 (upcoming)
+
+Features:
+
+ * ``YamlNode`` can now be used with the serialization API (``load`` / ``dump``)
+   and can be used to hold substructures that should not be deserialized to
+   native types (#48).
+
+Bugfixes:
+
+ * Raise a proper exception when a stream contains no documents but one is
+   expected (#108)
+ * Comments after a block scalar do not lead to a crash anymore (#106)
+
 ## 0.16.0
 
 Features:
diff --git a/doc/api.txt b/doc/api.txt
index 9d38333..c56a05e 100644
--- a/doc/api.txt
+++ b/doc/api.txt
@@ -25,8 +25,8 @@ Intermediate Representation
 ===========================
 
 The base of all YAML processing with NimYAML is the
-`YamlStream `_. This is basically an iterator over
-`YamlStreamEvent `_ objects. Every proc that
+`YamlStream `_. This is basically an iterator over
+`YamlStreamEvent `_ objects. Every proc that
 represents a single stage of the loading or dumping process will either take a
 ``YamlStream`` as input or return a ``YamlStream``. Procs that implement the
 whole process in one step hide the ``YamlStream`` from the user. Every proc that
@@ -45,14 +45,14 @@ Loading YAML
 ============
 
 If you want to load YAML character data directly into a native Nim variable, you
-can use `load `_. This is the easiest and
+can use `load `_. This is the easiest and
 recommended way to load YAML data. This section gives an overview about how
 ``load`` is implemented. It is absolutely possible to reimplement the loading
 step using the low-level API.
 
-For parsing, a `YamlParser `_ object is needed.
+For parsing, a `YamlParser `_ object is needed.
 This object stores some state while parsing that may be useful for error
-reporting to the user. The `parse `_
+reporting to the user. The `parse `_
 proc implements the YAML processing step of the same name. All syntax errors in
 the input character stream are processed by ``parse``, which will raise a
 ``YamlParserError`` if it encounters a syntax error.
@@ -68,13 +68,13 @@ Dumping YAML
 ============
 
 Dumping is preferredly done with
-`dump `_,
+`dump `_,
 which serializes a native Nim variable to a character stream. As with ``load``,
 the following paragraph describes how ``dump`` is implemented using the
 low-level API.
 
 A Nim value is transformed into a ``YamlStream`` with
-`represent `_.
+`represent `_.
 Depending on the ``AnchorStyle`` you specify, this will transform ``ref``
 variables with multiple instances into anchored elements and aliases (for
 ``asTidy`` and ``asAlways``) or write the same element into all places it
@@ -82,7 +82,7 @@ occurs (for ``asNone``). Be aware that if you use ``asNone``, the value you
 serialize might not round-trip.
 
 Transforming a ``YamlStream`` into YAML character data is done with
-`present `_.
+`present `_.
 You can choose from multiple presentation styles. ``psJson`` is not able to
 process some features of ``YamlStream`` s, the other styles support all features
 and are guaranteed to round-trip to the same ``YamlStream`` if you parse the
@@ -91,21 +91,19 @@ generated YAML character stream again.
 The Document Object Model
 =========================
 
-Much like XML, YAML also defines a *document object model*. If you cannot or do
-not want to load a YAML character stream to native Nim types, you can instead
-load it into a `YamlDocument `_. This
-``YamlDocument`` can also be serialized into a YAML character stream. All tags
-will be preserved exactly as they are when transforming from and to a
-``YamlDocument``. The only important thing to remember is that when a value has
-no tag, it will get the non-specific tag ``"!"`` for quoted scalars and ``"?"``
-for all other nodes.
+Unlike XML, YAML does not define an official *document object model*. However,
+if you cannot or do not want to load a YAML input stream to native Nim types,
+you can load it into the predefined type `YamlNode `_.
+You can also use this type inside your native types to deserialize parts of the
+YAML input into it. Likewise, you can serialize a ``YamlNode`` into YAML. You
+can use this to preserve parts of YAML data you do not wish to or cannot fully
+deserialize.
 
-While tags are preserved, anchors will be resolved during loading and re-added
-during serialization. It is allowed for a ``YamlNode`` to occur multiple times
-within a ``YamlDocument``, in which case it will be serialized once and referred
-to afterwards via aliases.
+A ``YamlNode`` preserves its given tag and the tags of any child nodes. However,
+anchors will be resolved during loading and re-added during serialization. It is
+allowed for a ``YamlNode`` to occur multiple times within source/target root
+object, in which case it will be serialized once and referred to afterwards via
+aliases.
 
-The document object model is provided for completeness, but you are encouraged
-to use native Nim types as start- or endpoint instead. That may be significantly
-faster, as every ``YamlNode`` is allocated on the heap and subject to garbage
-collection.
\ No newline at end of file
+``YamlNode`` is allocated on the heap and using it will be slower and consume
+more memory than deserializing into native types.
\ No newline at end of file
diff --git a/test/tdom.nim b/test/tdom.nim
index d4d03b4..d73da8f 100644
--- a/test/tdom.nim
+++ b/test/tdom.nim
@@ -94,4 +94,30 @@ suite "DOM":
            startSeqEvent(anchor="a".Anchor),
            scalarEvent("a", anchor = "b".Anchor),
            scalarEvent("b", anchor="c".Anchor), aliasEvent("b".Anchor),
-           endSeqEvent(), endDocEvent(), endStreamEvent())
\ No newline at end of file
+           endSeqEvent(), endDocEvent(), endStreamEvent())
+  test "Deserialize parts of the input into YamlNode":
+    let
+      input = "a: b\nc: [d, e]"
+    type Root = object
+      a: string
+      c: YamlNode
+    var result = loadAs[Root](input)
+    assert result.a == "b"
+    assert result.c.kind == ySequence
+    assert result.c.len == 2
+    assert result.c[0].kind == yScalar
+    assert result.c[0].content == "d"
+    assert result.c[1].kind == yScalar
+    assert result.c[1].content == "e"
+  test "Serialize value that contains a YamlNode":
+    type Root = object
+      a: string
+      c: YamlNode
+    let value = Root(
+      a: "b",
+      c: newYamlNode([newYamlNode("d"), newYamlNode("e")]))
+    var result = represent(value, tsNone, handles = @[])
+    ensure(result, startStreamEvent(), startDocEvent(), startMapEvent(),
+        scalarEvent("a"), scalarEvent("b"), scalarEvent("c"), startSeqEvent(),
+        scalarEvent("d"), scalarEvent("e"), endSeqEvent(), endMapEvent(),
+        endDocEvent(), endStreamEvent())
\ No newline at end of file
diff --git a/yaml/dom.nim b/yaml/dom.nim
index 942dade..d733d2b 100644
--- a/yaml/dom.nim
+++ b/yaml/dom.nim
@@ -32,7 +32,7 @@ type
   YamlNodeKind* = enum
     yScalar, yMapping, ySequence
 
-  YamlNode* = ref YamlNodeObj not nil
+  YamlNode* = ref YamlNodeObj
     ## Represents a node in a ``YamlDocument``.
 
   YamlNodeObj* = object
@@ -44,6 +44,7 @@ type
       # compiler does not like Table[YamlNode, YamlNode]
 
   YamlDocument* = object
+    {.deprecated: "use YamlNode with serialization API instead".}
     ## Represents a YAML document.
     root*: YamlNode
 
@@ -135,8 +136,9 @@ proc newYamlNode*(fields: openarray[(YamlNode, YamlNode)],
 proc initYamlDoc*(root: YamlNode): YamlDocument =
   result = YamlDocument(root: root)
 
-proc composeNode(s: var YamlStream, c: ConstructionContext):
-    YamlNode {.raises: [YamlStreamError, YamlConstructionError].} =
+proc constructChild*(s: var YamlStream, c: ConstructionContext,
+                     result: var YamlNode)
+    {.raises: [YamlStreamError, YamlConstructionError].} =
   template addAnchor(c: ConstructionContext, target: Anchor) =
     if target != yAnchorNone:
       yAssert(not c.refs.hasKey(target))
@@ -144,77 +146,55 @@ proc composeNode(s: var YamlStream, c: ConstructionContext):
 
   var start: Event
   shallowCopy(start, s.next())
-  new(result)
-  try:
-    case start.kind
-    of yamlStartMap:
-      result = YamlNode(tag: start.mapProperties.tag,
-                        kind: yMapping,
-                        fields: newTable[YamlNode, YamlNode]())
-      while s.peek().kind != yamlEndMap:
-        let
-          key = composeNode(s, c)
-          value = composeNode(s, c)
-        if result.fields.hasKeyOrPut(key, value):
-          raise newException(YamlConstructionError,
-              "Duplicate key: " & $key)
-      discard s.next()
-      addAnchor(c, start.mapProperties.anchor)
-    of yamlStartSeq:
-      result = YamlNode(tag: start.seqProperties.tag,
-                        kind: ySequence,
-                        elems: newSeq[YamlNode]())
-      while s.peek().kind != yamlEndSeq:
-        result.elems.add(composeNode(s, c))
-      addAnchor(c, start.seqProperties.anchor)
-      discard s.next()
-    of yamlScalar:
-      result = YamlNode(tag: start.scalarProperties.tag,
-                        kind: yScalar)
-      shallowCopy(result.content, start.scalarContent)
-      addAnchor(c, start.scalarProperties.anchor)
-    of yamlAlias:
-      result = cast[YamlNode](c.refs[start.aliasTarget].p)
-    else: internalError("Malformed YamlStream")
-  except KeyError:
-    raise newException(YamlConstructionError,
-                       "Wrong tag library: TagId missing")
+
+  case start.kind
+  of yamlStartMap:
+    result = YamlNode(tag: start.mapProperties.tag,
+                      kind: yMapping,
+                      fields: newTable[YamlNode, YamlNode]())
+    while s.peek().kind != yamlEndMap:
+      var
+        key: YamlNode = nil
+        value: YamlNode = nil
+      constructChild(s, c, key)
+      constructChild(s, c, value)
+      if result.fields.hasKeyOrPut(key, value):
+        raise newException(YamlConstructionError,
+            "Duplicate key: " & $key)
+    discard s.next()
+    addAnchor(c, start.mapProperties.anchor)
+  of yamlStartSeq:
+    result = YamlNode(tag: start.seqProperties.tag,
+                      kind: ySequence,
+                      elems: newSeq[YamlNode]())
+    while s.peek().kind != yamlEndSeq:
+      var item: YamlNode = nil
+      constructChild(s, c, item)
+      result.elems.add(item)
+    addAnchor(c, start.seqProperties.anchor)
+    discard s.next()
+  of yamlScalar:
+    result = YamlNode(tag: start.scalarProperties.tag,
+                      kind: yScalar)
+    shallowCopy(result.content, start.scalarContent)
+    addAnchor(c, start.scalarProperties.anchor)
+  of yamlAlias:
+    result = cast[YamlNode](c.refs.getOrDefault(start.aliasTarget).p)
+  else: internalError("Malformed YamlStream")
 
 proc compose*(s: var YamlStream): YamlDocument
-    {.raises: [YamlStreamError, YamlConstructionError].} =
-  var context = newConstructionContext()
-  var n: Event
-  shallowCopy(n, s.next())
-  yAssert n.kind == yamlStartDoc
-  result = YamlDocument(root: composeNode(s, context))
-  n = s.next()
-  yAssert n.kind == yamlEndDoc
+    {.raises: [YamlStreamError, YamlConstructionError],
+      deprecated: "use construct(s, root) instead".} =
+  construct(s, result.root)
 
 proc loadDom*(s: Stream | string): YamlDocument
-    {.raises: [IOError, OSError, YamlParserError, YamlConstructionError].} =
-  var
-    parser = initYamlParser()
-    events = parser.parse(s)
-    e: Event
-  try:
-    e = events.next()
-    yAssert(e.kind == yamlStartStream)
-    result = compose(events)
-    e = events.next()
-    if e.kind != yamlEndStream:
-      raise newYamlConstructionError(events, e.startPos, "stream contains multiple documents")
-  except YamlStreamError:
-    let ex = getCurrentException()
-    if ex.parent of YamlParserError:
-      raise (ref YamlParserError)(ex.parent)
-    elif ex.parent of IOError:
-      raise (ref IOError)(ex.parent)
-    elif ex.parent of OSError:
-      raise (ref OSError)(ex.parent)
-    else: internalError("Unexpected exception: " & ex.parent.repr)
+    {.raises: [IOError, OSError, YamlParserError, YamlConstructionError]
+      deprecated: "use loadAs[YamlNode](s) instead".} =
+  load(s, result.root)
 
 proc loadMultiDom*(s: Stream | string): seq[YamlDocument]
-    {.raises: [IOError, OSError, YamlParserError, YamlConstructionError].} =
+    {.raises: [IOError, OSError, YamlParserError, YamlConstructionError]
+      deprecated: "use loadMultiDoc[YamlNode](s, target) instead".} =
   var
     parser = initYamlParser(tagLib)
     events = parser.parse(s)
@@ -236,68 +216,33 @@ proc loadMultiDom*(s: Stream | string): seq[YamlDocument]
       raise (ref OSError)(ex.parent)
     else: internalError("Unexpected exception: " & ex.parent.repr)
 
-proc serializeNode(n: YamlNode, c: SerializationContext, a: AnchorStyle)
-    {.raises: [].}=
-  var anchor = yAnchorNone
-  let p = cast[pointer](n)
-  if a != asNone and c.refs.hasKey(p):
-    anchor = c.refs.getOrDefault(p).a
-    c.refs[p] = (anchor, true)
-    c.put(aliasEvent(anchor))
-    return
-  if a != asNone:
-    anchor = c.nextAnchorId.Anchor
-    c.refs[p] = (c.nextAnchorId.Anchor, false)
-    nextAnchor(c.nextAnchorId, len(c.nextAnchorId) - 1)
-
-  case n.kind
-  of yScalar: c.put(scalarEvent(n.content, n.tag, anchor))
+proc representChild*(value: YamlNodeObj, ts: TagStyle,
+                     c: SerializationContext) =
+  let childTagStyle = if ts == tsRootOnly: tsNone else: ts
+  case value.kind
+  of yScalar: c.put(scalarEvent(value.content, value.tag))
   of ySequence:
-    c.put(startSeqEvent(csBlock, (anchor, n.tag)))
-    for item in n.elems:
-      serializeNode(item, c, a)
+    c.put(startSeqEvent(tag = value.tag))
+    for item in value.elems: representChild(item, childTagStyle, c)
     c.put(endSeqEvent())
   of yMapping:
-    c.put(startMapEvent(csBlock, (anchor, n.tag)))
-    for key, value in n.fields.pairs:
-      serializeNode(key, c, a)
-      serializeNode(value, c, a)
+    c.put(startMapEvent(tag = value.tag))
+    for key, value in value.fields.pairs:
+      representChild(key, childTagStyle, c)
+      representChild(value, childTagStyle, c)
     c.put(endMapEvent())
 
-proc serialize*(doc: YamlDocument, a: AnchorStyle = asTidy):
-    YamlStream {.raises: [].} =
-  var
-    bys = newBufferYamlStream()
-    c = newSerializationContext(a, proc(e: Event) {.raises: [].} =
-      bys.put(e)
-    )
-  c.put(startStreamEvent())
-  c.put(startDocEvent())
-  serializeNode(doc.root, c, a)
-  c.put(endDocEvent())
-  c.put(endStreamEvent())
-  if a == asTidy:
-    var ctx = initAnchorContext()
-    for event in bys.mitems():
-      case event.kind
-      of yamlScalar: ctx.process(event.scalarProperties, c.refs)
-      of yamlStartMap: ctx.process(event.mapProperties, c.refs)
-      of yamlStartSeq: ctx.process(event.seqProperties, c.refs)
-      of yamlAlias:
-        event.aliasTarget = ctx.map(event.aliasTarget)
-      else: discard
-  result = bys
+proc serialize*(doc: YamlDocument, a: AnchorStyle = asTidy): YamlStream
+    {.deprecated: "use represent[YamlNode] instead".} =
+  result = represent(doc.root, tsAll, a = a, handles = @[])
 
 proc dumpDom*(doc: YamlDocument, target: Stream,
               anchorStyle: AnchorStyle = asTidy,
               options: PresentationOptions = defaultPresentationOptions)
-    {.raises: [YamlPresenterJsonError, YamlPresenterOutputError,
-               YamlStreamError].} =
+    {.deprecated: "use dump[YamlNode] instead".} =
   ## Dump a YamlDocument as YAML character stream.
-  var
-    events = serialize(doc,
-        if options.style == psJson: asNone else: anchorStyle)
-  present(events, target, options)
+  dump(doc.root, target, tsAll, anchorStyle = anchorStyle, options = options,
+       handles = @[])
 
 proc `[]`*(node: YamlNode, i: int): YamlNode =
   ## Get the node at index *i* from a sequence. *node* must be a *ySequence*.
diff --git a/yaml/serialization.nim b/yaml/serialization.nim
index 0a75412..40d8a4e 100644
--- a/yaml/serialization.nim
+++ b/yaml/serialization.nim
@@ -1340,6 +1340,8 @@ proc load*[K](input: Stream | string, target: var K)
       events = parser.parse(input)
       e = events.next()
     yAssert(e.kind == yamlStartStream)
+    if events.peek().kind != yamlStartDoc:
+      raise constructionError(events, e.startPos, "stream contains no documents")
     construct(events, target)
     e = events.next()
     if e.kind != yamlEndStream: