376 lines
12 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# nim-toml-serialization
[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
[![License: Apache](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![TOML](assets/badge-TOML.svg)](https://github.com/toml-lang/toml/releases/tag/1.0.0-rc.2)
![Stability: experimental](https://img.shields.io/badge/stability-experimental-orange.svg)
![nimble](https://img.shields.io/badge/available%20on-nimble-yellow.svg?style=flat-square)
![Github action](https://github.com/status-im/nim-toml-serialization/workflows/CI/badge.svg)
Flexible TOML serialization [not] relying on run-time type information.
## Overview
nim-toml-serialization is a member of [nim-serialization](https://github.com/status-im/nim-serialization)
family and provides several operation modes:
- Decode into Nim data types without any intermediate steps using only a **subset** of TOML.
- Unlike typical lexer based parser, nim-toml-serialization is very efficient because
the parser convert text directly into Nim data types and using no intermediate `token`.
- Decode into Nim data types mixed with `TomlValueRef` to parse any valid TOML value.
- Using `TomlValueRef` can offer more flexibility but also require more memory.
If you can avoid using dotted key, there is no reason to use `TomlValueRef`.
- Decode into `TomlValueRef` from any valid TOML.
- Encode Nim data types into a **subset** of TOML.
- Encode `TomlValueRef` into full spec TOML.
- Both encoder and decoder support `keyed` mode.
- Allow skipping unknown fields using `TomlUnknownFields` flag.
- Skipping unknown fields also done efficiently, no token produced.
But skipped fields should contains valid TOML value or the parser will raise exception.
## Spec compliance
nim-toml-serialization implements [v1.0.0](https://github.com/toml-lang/toml/releases/tag/1.0.0)
TOML spec and pass these test suites:
- [iarna toml test suite](https://github.com/iarna/toml-spec-tests)
- [burntsushi toml test suite](https://github.com/BurntSushi/toml-test)
## Non standard features
- TOML key comparison according to the spec is case sensitive and this is the default mode
for both encoder/decoder. But nim-toml-serialization also support:
- Case insensitive key comparison.
- Nim ident sensitivity key comparison mode (only the first char is case sensitive).
TOML key supports Unicode chars but the comparison mentioned above only applied to ASCII chars.
- TOML inline table disallow newline inside the table.
nim-toml-serialization provide a switch to enable newline in inline table via `TomlInlineTableNewline`.
- TOML standard does not support xHH escape sequence, only uHHHH or UHHHHHHHH.
Use `TomlHexEscape` to enable this feature, otherwise it will raise exception.
- TOML standard requires time in HH:MM:SS format, `TomlHourMinute` flags will allow HH:MM format.
## Keyed mode
When decoding, only object, tuple or `TomlValueRef` allowed at top level.
All others Nim basic datatypes such as floats, ints, array, boolean must
be a value of a key.
nim-toml-serialization offers `keyed` mode decoding to overcome this limitation.
The parser can skip any non-matching key-value pair efficiently because
the parser produce no token but at the same time can validate the syntax correctly.
```toml
[server]
name = "TOML Server"
port = 8005
```
```Nim
var x = Toml.decode(rawToml, string, "server.name")
assert x == "TOML Server"
or
var y = Toml.decode(rawToml, string, "server.name", caseSensitivity)
```
where `caseSensitivity` is one of:
- TomlCaseSensitive
- TomlCaseInsensitive
- TomlCaseNim
Key must be valid Toml basic-key, quoted-key, or dotted-key.
Gotcha:
```toml
server = { ip = "127.0.0.1", port = 8005, name = "TOML Server" }
```
It maybe tempting to use keyed mode for above example like this:
```nim
var x = Toml.decode(rawToml, string, "server.name")
```
But it won't work because the grammar of TOML make it very difficult
to `exit` from inline-table-parser in a clean way.
## Decoder
```nim
type
NimServer = object
name: string
port: int
MixedServer = object
name: TomlValueRef
port: int
StringServer = object
name: string
port: string
# decode into native Nim
var nim_native = Toml.decode(rawtoml, NimServer)
# decode into mixed Nim + TomlValueRef
var nim_mixed = Toml.decode(rawtoml, MixedServer)
# decode any value into string
var nim_string = Toml.decode(rawtoml, StringServer)
# decode any valid TOML
var toml_value = Toml.decode(rawtoml, TomlValueRef)
```
## Parse inline table with newline
```toml
# this is a non standard toml
server = {
ip = "127.0.0.1",
port = 8005,
name = "TOML Server"
}
```
```Nim
# turn on newline in inline table mode
var x = Toml.decode(rawtoml, Server, flags = {TomlInlineTableNewline})
```
## Load and save
```Nim
var server = Toml.loadFile("filename.toml", Server)
var ip = Toml.loadFile("filename.toml", string, "server.ip")
Toml.saveFile("filename.toml", server)
Toml.saveFile("filename.toml", ip, "server.ip")
Toml.saveFile("filename.toml", server, flags = {TomlInlineTableNewline})
```
## TOML we can['t] do
- Date Time.
TOML date time format is described in [RFC 3339](https://tools.ietf.org/html/rfc3339).
When parsing TOML date time, use `string`, `TomlDateTime`, or `TomlValueRef`.
- Date.
You can parse TOML date using `string`, `TomlDate`, `TomlDateTime`, or `TomlValueRef`.
- Time.
You can parse TOML time using `string`, `TomlTime`, `TomlDateTime`, or `TomlValueRef`.
- Heterogenous array.
When parsing heterogenous array, use `string` or `TomlValueRef`.
- Floats.
Floats should be implemented as IEEE 754 binary64 values.
Standard TOML float are float64.
When parsing floats, use `string` or `TomlValueRef` or `SomeFloat`.
- Integers.
TOML integer is an 64 bit (signed long) range expected (9,223,372,036,854,775,808 to 9,223,372,036,854,775,807).
When parsing integers, use `string` or `SomeInteger`, or `TomlValueRef`.
- Array of tables.
Array of tables can be parsed via `TomlValueRef` or parsed as a field of object.
Parsing with keyed mode also works.
- Dotted key.
When parse into nim object, key must not a dotted key.
Dotted key is supported via `keyed` decoding or `TomlValueRef`.
## Option[T]
Option[T] works as usual.
## Bignum
TOML integer maxed at int64. But nim-toml-serialization can extend this to arbitrary precision bignum.
Parsing bignum is achieved via helper function `parseNumber`.
```Nim
# this is an example how to parse bignum with `parseNumber` and `stint`.
import stint, toml_serialization
proc readValue*(r: var TomlReader, value: var Uint256) =
var z: string
let (sign, base) = r.parseNumber(z)
if sign == Sign.Neg:
raiseTomlErr(r.lex, errNegateUint)
case base
of base10: value = parse(z, Uint256, 10)
of base16: value = parse(z, Uint256, 16)
of base8: value = parse(z, Uint256, 8)
of base2: value = parse(z, Uint256, 2)
var z = Toml.decode("bignum = 1234567890_1234567890", Uint256, "bignum")
assert $z == "12345678901234567890"
```
## Table
Decoding a table can be achieved via `parseTable` template.
To parse the value, you can use one of the helper functions or use `readValue`.
Table can be used to parse top level value, regular table, and inline table
in a manner similar to object.
No builtin `readValue` for table provided, you must overload it yourself depends on your need.
`Table` can be stdlib table, ordered table, table ref, or any table like data types.
```Nim
proc readValue*(r: var TomlReader, table: var Table[string, int]) =
parseTable(r, key):
table[key] = r.parseInt(int)
```
## Sets and list-like
Similar to `Table`, sets and list or array like data structure can be parsed using
`parseList` template. It come with two flavors, indexed and non indexed.
Builtin `readValue` for regular `seq` and `array` are implemented for you.
No builtin `readValue` for `set` or `set-like` provided, you must overload it yourself depends on your need.
```nim
type
HoldArray = object
data: array[3, int]
HoldSeq = object
data: seq[int]
WelderFlag = enum
TIG
MIG
MMA
Welder = object
flags: set[WelderFlag]
proc readValue*(r: var TomlReader, value: var HoldArray) =
# parseList with index, `i` can be any valid identifier
r.parseList(i):
value.data[i] = r.parseInt(int)
proc readValue*(r: var TomlReader, value: var HoldSeq) =
# parseList without index
r.parseList:
let lastPos = value.data.len
value.data.setLen(lastPos + 1)
readValue(r, value.data[lastPos])
proc readValue*(r: var TomlReader, value: var Welder) =
# populating set also okay
r.parseList:
value.flags.incl r.parseEnum(WelderFlag)
```
## Enums
There is no enums in TOML specification. The reader/decoder is able to parse both
`ordinal` or `string` representation of an enum. While on the other hand, the
writer/encoder only have `ordinal` builtin writer. But that is not a limitation,
you can always overload the `writeValue` to produce whatever representation of
enum you need.
`Ordinal` representation of an enum is TOML integer, and `string` representation
is TOML `basic string` or `literal string`. Both multi-line basic string(e.g. """TOML""") or
multi-line literal string(e.g. '''TOML''') are not allowed for enum value.
```toml
# fruits.toml
fruit1 = "Apple" # basic string
fruit2 = 1 # ordinal value
fruit3 = 'Orange' # literal string
```
```Nim
type
Fruits = enum
Apple
Banana
Orange
FruitBasket = object
fruit1: Fruits
fruit2: Fruits
fruit3: Fruits
var x = Toml.loadFile("fruits.toml", FruitBasket)
assert x.fruit1 == Apple
assert x.fruit2 == Banana
assert x.fruit3 == Orange
# write enum output as string
proc writeValue*(w: var TomlWriter, val: Fruits) =
w.writeValue $val
let z = FruitBasket(fruit1: Apple, fruit2: Banana, fruit3: Orange)
let res = Toml.encode(z)
assert res == "fruit1 = \"Apple\"\nfruit2 = \"Banana\"\nfruit3 = \"Orange\"\n"
```
## Helper functions
- `parseNumber(r: var TomlReader, value: var string): (Sign, NumberBase)`
- `parseDateTime(r: var TomlReader): TomlDateTime`
- `parseString(r: var TomlReader, value: var string): (bool, bool)`
- `parseAsString(r: var TomlReader): string`
- `parseFloat(r: var TomlReader, value: var string): Sign`
- `parseTime(r: var TomlReader): TomlTime`
- `parseDate(r: var TomlReader): TomlDate`
- `parseValue(r: var TomlReader): TomlValueRef`
- `parseEnum(r: var TomlReader, T: type enum): T`
- `parseInt(r: var TomlReader, T: type SomeInteger): T`
`parseAsString` can parse any valid TOML value into Nim string including mixed array or inline table.
`parseString` return a tuple:
- field 0:
- false: is a single line string.
- true: is a multi line string.
- field 1:
- false: is a basic string.
- true: is a literal string.
`Sign` can be one of:
- `Sign.None`
- `Sign.Pos`
- `Sign.Neg`
## Implementation specifics
TomlTime contains subsecond field. The spec says the precision is implementation specific.
In nim-toml-serialization the default is 6 digits precision.
Longer precision will be truncated by the parser.
You can override this using compiler switch `-d:subsecondPrecision=numDigits`.
## Installation
You can install the developement version of the library through nimble with the following command
```
nimble install https://github.com/status-im/nim-toml-serialization@#master
```
or install latest release version
```
nimble install toml_serialization
```
## License
Licensed and distributed under either of
* MIT license: [LICENSE-MIT](LICENSE-MIT) or http://opensource.org/licenses/MIT
or
* Apache License, Version 2.0, ([LICENSE-APACHEv2](LICENSE-APACHEv2) or http://www.apache.org/licenses/LICENSE-2.0)
at your option. This file may not be copied, modified, or distributed except according to those terms.
## Credits
A portion of toml decoder was taken from PMunch's [`parsetoml`](https://github.com/NimParsers/parsetoml)