diff --git a/docs.wrm/api/utils/hashing.wrm b/docs.wrm/api/utils/hashing.wrm index 88981753..1eb58e84 100644 --- a/docs.wrm/api/utils/hashing.wrm +++ b/docs.wrm/api/utils/hashing.wrm @@ -10,44 +10,44 @@ _subsection: Cryptographic Hashing The [Cryptographic Hash Functions](https://en.wikipedia.org/wiki/Cryptographic_hash_function) are a specific family of hash functions. -_property: utils.keccak256(aBytesLike) => string<[[datahexstring]]<32>> @ @SRC +_property: ethers.utils.keccak256(aBytesLike) => string<[[datahexstring]]<32>> @ @SRC Returns the [KECCAK256](https://en.wikipedia.org/wiki/SHA-3) digest of //aBytesLike//. -_property: utils.ripemd160(aBytesLike) => string<[[datahexstring]]<20>> @ @SRC +_property: ethers.utils.ripemd160(aBytesLike) => string<[[datahexstring]]<20>> @ @SRC Returns the [RIPEMD-160](https://en.m.wikipedia.org/wiki/RIPEMD) digest of //aBytesLike//. -_property: utils.sha256(aBytesLike) => string<[[datahexstring]]<32>> @ @SRC +_property: ethers.utils.sha256(aBytesLike) => string<[[datahexstring]]<32>> @ @SRC Returns the [SHA2-256](https://en.wikipedia.org/wiki/SHA-2) digest of //aBytesLike//. -_property: utils.sha512(aBytesLike) => string<[[datahexstring]]<64>> @ @SRC +_property: ethers.utils.sha512(aBytesLike) => string<[[datahexstring]]<64>> @ @SRC Returns the [SHA2-512](https://en.wikipedia.org/wiki/SHA-2) digest of //aBytesLike//. -_property: utils.computeHmac(algorithm, key, data) => string<[[datahexstring]]> @ @SRC +_property: ethers.utils.computeHmac(algorithm, key, data) => string<[[datahexstring]]> @ @SRC Returns the [HMAC](https://en.wikipedia.org/wiki/HMAC) of //data// with //key// using the [Algorithm](supported-algorithm) //algorithm//. _heading: HMAC Supported Algorithms @ @SRC -_property: utils.SupportedAlgorithms.sha256 => string +_property: ethers.utils.SupportedAlgorithm.sha256 => string Use the [SHA2-256](https://en.wikipedia.org/wiki/SHA-2) hash algorithm.
-_property: utils.SupportedAlgorithms.sha512 => string +_property: ethers.utils.SupportedAlgorithm.sha512 => string Use the [SHA2-512](https://en.wikipedia.org/wiki/SHA-2) hash algorithm. _subsection: Common Hashing Helpers -_property: utils.hashMessage(message) => string<[[datahexstring]]<32>> @ @SRC +_property: ethers.utils.hashMessage(message) => string<[[datahexstring]]<32>> @ @SRC Computes the Ethereum message digest of //message//. Ethereum messages are -converted to UTF-8 bytes and prefixed with ``\x19Ethereum Signed Message:`` +converted to UTF-8 bytes and prefixed with ``\\x19Ethereum Signed Message:`` and the length of //message//. -_property: utils.id(text) => string<[[datahexstring]]<32>> @ @SRC +_property: ethers.utils.id(text) => string<[[datahexstring]]<32>> @ @SRC The Ethereum Identity function computes the keccak256 hash of the //text// bytes. -_property: utils.namehash(name) => string<[[datahexstring]]<32>> @ @SRC -Returns the [ENS Namehash](https://docs.ens.domains/contract-api-reference/name-processing#hashing-names) of //name//. +_property: ethers.utils.namehash(name) => string<[[datahexstring]]<32>> @ @SRC +Returns the [ENS Namehash](https:/\/docs.ens.domains/contract-api-reference/name-processing#hashing-names) of //name//. _subsection: Solidity Hashing Algorithms @@ -56,15 +56,15 @@ When using the Solidity ``abi.encodePacked(...)`` function, a non-standard //tightly packed// version of encoding is used. These functions implement the tight packing algorithm. -_property: utils.solidityPack(arrayOfTypes, arrayOfValues) => string<[[datahexstring]]> @ @SRC +_property: ethers.utils.solidityPack(arrayOfTypes, arrayOfValues) => string<[[datahexstring]]> @ @SRC Returns the non-standard encoded //arrayOfValues// packed according to their respective type in //arrayOfTypes//.
-_property: utils.solidityKeccak256(arrayOfTypes, arrayOfValues) => string<[[datahexstring]]<32>> @ @SRC +_property: ethers.utils.solidityKeccak256(arrayOfTypes, arrayOfValues) => string<[[datahexstring]]<32>> @ @SRC Returns the KECCAK256 of the non-standard encoded //arrayOfValues// packed according to their respective type in //arrayOfTypes//. -_property: utils.soliditySha256(arrayOfTypes, arrayOfValues) => string<[[datahexstring]]<32>> @ @SRC +_property: ethers.utils.soliditySha256(arrayOfTypes, arrayOfValues) => string<[[datahexstring]]<32>> @ @SRC Returns the SHA2-256 of the non-standard encoded //arrayOfValues// packed according to their respective type in //arrayOfTypes//. diff --git a/docs.wrm/api/utils/strings.wrm b/docs.wrm/api/utils/strings.wrm index 45c6a431..d5c3ea94 100644 --- a/docs.wrm/api/utils/strings.wrm +++ b/docs.wrm/api/utils/strings.wrm @@ -21,22 +21,22 @@ _note: Note Strings that are 31 __//bytes//__ long may contain fewer than 31 __//characters//__, since UTF-8 requires multiple bytes to encode international characters. -_property: utils.parseBytes32String(aBytesLike) => string @ @SRC +_property: ethers.utils.parseBytes32String(aBytesLike) => string @ @SRC Returns the decoded string represented by the ``Bytes32`` encoded data. -_property: utils.formatBytes32String(text) => string<[[datahexstring]]<32>> @ @SRC +_property: ethers.utils.formatBytes32String(text) => string<[[datahexstring]]<32>> @ @SRC Returns a ``bytes32`` string representation of //text//. If the length of //text// exceeds 31 bytes, it will throw an error. _subsection: UTF-8 Strings @ -_property: utils.toUtf8Bytes(text [ , form = current ] ) => Uint8Array @ @SRC +_property: ethers.utils.toUtf8Bytes(text [ , form = current ] ) => Uint8Array @ @SRC Returns the UTF-8 bytes of //text//, optionally normalizing it using the [[unicode-normalization-form]] //form//. 
-_property: utils.toUtf8CodePoints(aBytesLike [ , form = current ] ) => Array @ @SRC -Returns the Array of codepoints of //aBytesLike//, optionally normalizing it using the +_property: ethers.utils.toUtf8CodePoints(text [ , form = current ] ) => Array @ @SRC +Returns the Array of codepoints of //text//, optionally normalized using the [[unicode-normalization-form]] //form//. _note: Note @@ -45,28 +45,29 @@ its codepoint, accounting for surrogate pairs. This should not be confused with ``string.split("")``, which destroys surrogate pairs, splitting between each UTF-16 codeunit instead. -_property: utils.toUtf8String(aBytesLike [ , ignoreErrors = false ] ) => string @ @SRC -Returns the string represented by the UTF-8 bytes of //aBytesLike//. This will -throw an error for invalid surrogates, overlong sequences or other UTF-8 issues, -unless //ignoreErrors// is specified. +_property: ethers.utils.toUtf8String(aBytesLike [ , onError = error ] ) => string @ @SRC +Returns the string represented by the UTF-8 bytes of //aBytesLike//. +The //onError// is a [Custom UTF-8 Error function](utf8error) and if not specified +it defaults to the [error](utf8error-error) function, which throws an error +on **any** UTF-8 error. -_heading: UnicodeNormalizationForm @ @SRC +_subsection: UnicodeNormalizationForm @ @SRC There are several [commonly used forms](https://en.wikipedia.org/wiki/Unicode_equivalence) when normalizing UTF-8 data, which allow strings to be compared or hashed in a stable way. -_property: utils.UnicodeNormalizationForm.current +_property: ethers.utils.UnicodeNormalizationForm.current Maintain the current normalization form. -_property: utils.UnicodeNormalizationForm.NFC +_property: ethers.utils.UnicodeNormalizationForm.NFC The Composed Normalization Form. This form uses single codepoints which represent the fully composed character. For example, the **é** is a single codepoint, ``0x00e9``.
-_property: utils.UnicodeNormalizationForm.NFD +_property: ethers.utils.UnicodeNormalizationForm.NFD The Decomposed Normalization Form. This form uses multiple codepoints (when necessary) to compose a character. @@ -75,7 +76,7 @@ is made up of two codepoints, ``"0x0065"`` (which is the letter ``"e"``) and ``"0x0301"`` which is a special diacritic UTF-8 codepoint which indicates the previous character should have an acute accent. -_property: utils.UnicodeNormalizationForm.NFKC +_property: ethers.utils.UnicodeNormalizationForm.NFKC The Composed Normalization Form with Canonical Equivalence. The Canonical representation folds characters which have the same syntactic representation but different semantic meaning. @@ -83,7 +84,7 @@ but different semantic meaning. For example, the Roman Numeral **I**, which has a UTF-8 codepoint ``"0x2160"``, is folded into the capital letter I, ``"0x0049"``. -_property: utils.UnicodeNormalizationForm.NFKD +_property: ethers.utils.UnicodeNormalizationForm.NFKD The Decomposed Normalization Form with Canonical Equivalence. See NFKC for an example. @@ -91,3 +92,93 @@ _note: Note Only certain specified characters are folded in Canonical Equivalence, and thus it should **not** be considered a method to achieve //any// level of security from [homoglyph attacks](https://en.wikipedia.org/wiki/IDN_homograph_attack). + +_subsection: Custom UTF-8 Error Handling @ + +When converting a string to its codepoints, there is the possibility +of invalid byte sequences.
Since certain situations may need specific +ways to handle UTF-8 errors, a custom error handling function can be used, +which has the signature: + +_property: errorFunction(reason, offset, bytes, output [ , badCodepoint ]) => number +The //reason// is one of the [UTF-8 Error Reasons](utf8error-reasons), //offset// is the index +into //bytes// where the error was first encountered, //output// is the list +of codepoints already processed (and may be modified) and in certain Error +Reasons, the //badCodepoint// indicates the currently computed codepoint, +but which would be rejected because its value is invalid. + +This function should return the number of bytes to skip past, keeping in +mind the value at //offset// will already be consumed. + +_heading: UTF-8 Error Reasons @ @SRC + +_property: ethers.utils.Utf8ErrorReason.BAD_PREFIX +A byte was encountered which is invalid to begin a UTF-8 byte +sequence with. + +_property: ethers.utils.Utf8ErrorReason.MISSING_CONTINUE +A UTF-8 sequence was begun, but did not have enough continuation +bytes for the sequence. For this error the //offset// is the index +at which a continuation byte was expected. + +_property: ethers.utils.Utf8ErrorReason.OUT_OF_RANGE +The computed codepoint is outside the range for valid UTF-8 +codepoints (i.e. the codepoint is greater than 0x10ffff). +This reason will pass the computed //badCodepoint// into +the custom error function. + +_property: ethers.utils.Utf8ErrorReason.OVERLONG +Due to the way UTF-8 allows variable length byte sequences +to be used, it is possible to have multiple representations +of the same character, which means +[overlong sequences](https://en.wikipedia.org/wiki/UTF-8#Overlong_encodings) +allow for a non-distinguished string to be formed, which can +impact security as multiple strings that are otherwise +equal can have different hashes.
+ +Generally, overlong sequences are an attempt to circumvent +some part of security, but in rare cases may be produced by +lazy libraries or used to encode the null terminating +character in a way that is safe to include in a ``char*``. + +This reason will pass the computed //badCodepoint// into the +custom error function, which is actually a valid codepoint, just +one that was arrived at through unsafe methods. + +_property: ethers.utils.Utf8ErrorReason.OVERRUN +The string does not have enough characters remaining for the +length of this sequence. + +_property: ethers.utils.Utf8ErrorReason.UNEXPECTED_CONTINUE +This error is similar to BAD_PREFIX, since a continuation byte +cannot begin a valid sequence, but many may wish to process this +differently. However, most developers would want to trap this +and perform the same operation as a BAD_PREFIX. + +_property: ethers.utils.Utf8ErrorReason.UTF16_SURROGATE +The computed codepoint represents a value reserved for +UTF-16 surrogate pairs. +This reason will pass the computed surrogate half +//badCodepoint// into the custom error function. + + + +_heading: Provided UTF-8 Error Handling Functions + +There are already several functions available for the most common +situations. + +_property: ethers.utils.Utf8ErrorFuncs.error @ @SRC +This will throw an error on **any** error with a UTF-8 sequence, including +invalid prefix bytes, overlong sequences and UTF-16 surrogate pairs. + +_property: ethers.utils.Utf8ErrorFuncs.ignore @ @SRC +This will drop all invalid sequences (by consuming invalid prefix bytes and +any following continuation bytes) from the final string as well as permit +overlong sequences to be converted to their equivalent string.
+ +_property: ethers.utils.Utf8ErrorFuncs.replace @ @SRC +This will replace all invalid sequences (by consuming invalid prefix bytes and +any following continuation bytes) with the +[UTF-8 Replacement Character](https:/\/en.wikipedia.org/wiki/Specials_%28Unicode_block%29#Replacement_character), +(i.e. U+FFFD). diff --git a/docs.wrm/cli/asm-help.txt b/docs.wrm/cli/asm-help.txt new file mode 100644 index 00000000..d3867638 --- /dev/null +++ b/docs.wrm/cli/asm-help.txt @@ -0,0 +1,11 @@ +Usage: + ethers-asm [ FILENAME ] [ OPTIONS ] + +OPTIONS + --disassemble Disassemble input bytecode + --define KEY=VALUE provide assembler defines + +OTHER OPTIONS + --debug Show stack traces for errors + --help Show this usage and exit + --version Show this version and exit diff --git a/docs.wrm/cli/asm.wrm b/docs.wrm/cli/asm.wrm new file mode 100644 index 00000000..3df0b881 --- /dev/null +++ b/docs.wrm/cli/asm.wrm @@ -0,0 +1,17 @@ +_title: Assembler + +_section: Assembler + +The assembler Command-Line utility allows you to assemble EVM asm files +into deployable bytecode and disassemble EVM bytecode into human-readable +mnemonics. + +_subsection: Help + +_code: asm-help.txt + +_subsection: Examples + +TODO examples + + diff --git a/docs.wrm/cli/ens.wrm b/docs.wrm/cli/ens.wrm index a36279bf..3c608eb4 100644 --- a/docs.wrm/cli/ens.wrm +++ b/docs.wrm/cli/ens.wrm @@ -6,7 +6,7 @@ _subsection: Help _code: ens-help.txt -_heading: Examples +_subsection: Examples TODO examples diff --git a/docs.wrm/cli/index.wrm b/docs.wrm/cli/index.wrm index c74fd834..3ab27183 100644 --- a/docs.wrm/cli/index.wrm +++ b/docs.wrm/cli/index.wrm @@ -4,6 +4,7 @@ _section: Command Line Interfaces _toc: ethers + asm ens typescript plugin