update unicode section in toc.md
parent c1f75b87f3
commit 4e0b7cbced
11
docs/toc.md
@@ -89,13 +89,14 @@ can bring down the service, both lexer and parser are configurable to mitigate t
  - Input string:
  - Accepted encoding for input string are UTF-8.
  - Escaped unicode in quoted string take the form of UTF-16 BE:
- - Fixed 4 digit hex: e.g. `\u000A`
- - variable length: `\u{1F4A9}` with range (>= 0x0000 and <= 0xD7FF or >= 0xE000 and <= 0x10FFFF)
+ - Fixed length notation using 4 digit hex: e.g. `\u000A`.
+ - Variable length notation using curly braces `\u{1F4A9}` with range (0x0000..0xD7FF, 0xE000..0x10FFFF).
  - Escape sequences are only meaningful within a single-quoted string.
- In multiline string, unicode char must be encoded using UTF-8.
- - SurrogatePair: "\uD83D\uDCA9" is equal to "\u{1F4A9}"
+ - In multiline string, unicode char must be encoded using UTF-8.
+ - Surrogate pair using fixed length notation "\uD83D\uDCA9" is equal to variable length notation "\u{1F4A9}".
+ - Orphaned surrogate will result in error.

  - Output string:
  - Output string subject to output serialization format specification.
  - For example, output using json as serialization format will result in UTF-8 encoded string.
- - Or if the escape flag is set, it will use UTF-16 BE 4 digit hex fixed length similar to GraphQL escape sequence.
+ - If the escape flag is set, it will use UTF-16 BE 4 digit hex fixed length similar to GraphQL escape sequence.
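
The escape rules touched by this hunk can be illustrated with a small Python sketch. It is not the project's lexer. The function names `decode_unicode_escapes` and `escape_fixed_length` are hypothetical, the 1 to 6 hex digit limit inside the braces is an assumption, and so is the choice to escape every non-ASCII character when the escape flag is set. Other escapes, such as an escaped backslash, are left out for brevity.

```python
import re

# Either variable length \u{H...} (1-6 hex digits assumed) or fixed length \uHHHH.
_ESCAPE = re.compile(r"\\u\{([0-9A-Fa-f]{1,6})\}|\\u([0-9A-Fa-f]{4})")


def decode_unicode_escapes(text: str) -> str:
    """Decode \\uHHHH and \\u{H...} escapes; reject orphaned surrogates."""
    out = []
    pos = 0
    pending_high = None  # high surrogate still waiting for its low half
    for m in _ESCAPE.finditer(text):
        literal = text[pos:m.start()]
        if pending_high is not None and literal:
            raise ValueError("orphaned high surrogate")
        out.append(literal)
        pos = m.end()
        if m.group(1) is not None:
            # Variable length notation: surrogates and values past 0x10FFFF are out of range.
            if pending_high is not None:
                raise ValueError("orphaned high surrogate")
            cp = int(m.group(1), 16)
            if cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
                raise ValueError("code point out of range")
            out.append(chr(cp))
            continue
        cp = int(m.group(2), 16)
        if pending_high is not None:
            if not 0xDC00 <= cp <= 0xDFFF:
                raise ValueError("orphaned high surrogate")
            # Combine the two UTF-16 code units into one code point.
            out.append(chr(0x10000 + ((pending_high - 0xD800) << 10) + (cp - 0xDC00)))
            pending_high = None
        elif 0xD800 <= cp <= 0xDBFF:
            pending_high = cp
        elif 0xDC00 <= cp <= 0xDFFF:
            raise ValueError("orphaned low surrogate")
        else:
            out.append(chr(cp))
    if pending_high is not None:
        raise ValueError("orphaned high surrogate")
    out.append(text[pos:])
    return "".join(out)


def escape_fixed_length(text: str) -> str:
    """Escape non-ASCII characters as fixed length \\uHHHH (assumed flag behaviour)."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)
        elif cp <= 0xFFFF:
            out.append("\\u%04X" % cp)
        else:
            # Code points above 0xFFFF need a surrogate pair in fixed length notation.
            v = cp - 0x10000
            out.append("\\u%04X\\u%04X" % (0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)))
    return "".join(out)
```

Under these assumptions, `decode_unicode_escapes(r"\uD83D\uDCA9")` and `decode_unicode_escapes(r"\u{1F4A9}")` yield the same single character, while a lone `\uD83D` raises `ValueError`.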