Merge pull request #68 from multiformats/feat/symbolic-multibase

fill out multibase table and treat multibases as symbols
Steven Allen 2018-01-31 22:43:36 +00:00 committed by GitHub
commit 9577c6681d
GPG Key ID: 4AEE18F83AFDEB23
2 changed files with 34 additions and 16 deletions

@@ -25,12 +25,12 @@
## Protocol Description - How does the protocol work?
-`multicodec` is a _self-describing multiformat_, it wraps other formats with a tiny bit of self-description. A multicodec identifier is both a varint and the code identifying the following data, this means that the most significant bit of every multicodec code is reserved to signal the continuation.
+`multicodec` is a _self-describing multiformat_, it wraps other formats with a tiny bit of self-description. A multicodec identifier may either be a varint (in a byte string) or a symbol (in a text string).
-This way, a chunk of data identified by multicodec will look like this:
+A chunk of data identified by multicodec will look like this:
```sh
-<multicodec-varint><encoded-data>
+<multicodec><encoded-data>
# To reduce the cognitive load, we sometimes might write the same line as:
<mcp><data>
```
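The binary form `<multicodec><encoded-data>` can be sketched as follows. This is a minimal illustration, not the project's reference implementation: it assumes the varint is an unsigned, 7-bits-per-byte encoding with the most significant bit of each byte used as the continuation flag, and it assumes the `0x` values in the table are integers to be varint-encoded. The helper names (`encode_varint`, `add_prefix`, `split_prefix`) are hypothetical.

```python
from typing import Tuple


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer, 7 bits per byte, low group first."""
    if n < 0:
        raise ValueError("varints are unsigned")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # MSB set: more bytes follow
        else:
            out.append(byte)         # MSB clear: last byte
            return bytes(out)


def decode_varint(buf: bytes) -> Tuple[int, int]:
    """Return (value, number of bytes consumed)."""
    value = 0
    shift = 0
    for i, byte in enumerate(buf):
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, i + 1
        shift += 7
    raise ValueError("truncated varint")


def add_prefix(code: int, data: bytes) -> bytes:
    """Build <multicodec><encoded-data> for a varint multicodec."""
    return encode_varint(code) + data


def split_prefix(buf: bytes) -> Tuple[int, bytes]:
    """Split a prefixed buffer back into (code, data)."""
    code, consumed = decode_varint(buf)
    return code, buf[consumed:]


RAW = 0x55  # "raw" in table.csv
packed = add_prefix(RAW, b"hello")
code, data = split_prefix(packed)
```

Codes below `0x80` fit in a single byte, so `raw` costs one byte of overhead; a code such as `0xd5` needs two bytes under this encoding.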
@@ -49,11 +49,12 @@ It is worth noting that multicodec-packed works very well in conjunction with [m
## MulticodecProtocol Tables
-Multicodec uses "protocol tables" to agree upon the mapping from one multicodec code (a single varint). These tables can be application specific, though -- like [with](https://github.com/multiformats/multihash) [other](https://github.com/multiformats/multibase) [multiformats](https://github.com/multiformats/multiaddr) -- we will keep a globally agreed upon table with common protocols and formats.
+Multicodec uses "protocol tables" to agree upon the mapping from one multicodec code. These tables can be application specific, though -- like [with](https://github.com/multiformats/multihash) [other](https://github.com/multiformats/multibase) [multiformats](https://github.com/multiformats/multiaddr) -- we will keep a globally agreed upon table with common protocols and formats.
## Multicodec table
-The full table can be found at [table.csv](/table.csv) inside this repo.
+The full table can be found at [table.csv](/table.csv) inside this repo. Codes
+prefixed with `0x` are varint multicodecs and all others are symbolic.
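The `0x` rule can be applied mechanically when reading the table. The sketch below is illustrative only: the sample rows are copied from this change, the function names are hypothetical, and treating a bare `0x` as "not yet assigned" and an empty code cell as a section heading are assumptions about the file's conventions rather than documented behaviour.

```python
import csv
import io

# A few rows copied from table.csv as it appears in this change.
SAMPLE = '''raw, raw binary, 0x55
bases encodings,,
base58btc, base58 bitcoin, "z"
base64, rfc4648, "m"
cbor, CBOR, 0x51
bson, Binary JSON, 0x
'''


def classify(code: str) -> str:
    """Classify a code cell as varint, symbol, unassigned, or section."""
    if code == "":
        return "section"      # assumption: heading rows have no code
    if code.startswith("0x"):
        # assumption: a bare "0x" means no value has been assigned yet
        return "varint" if len(code) > 2 else "unassigned"
    return "symbol"


def load(text: str):
    """Parse table rows into (name, kind, code) tuples."""
    rows = []
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    for row in reader:
        if len(row) != 3:
            continue          # skip blank or malformed lines
        name, _description, code = (cell.strip() for cell in row)
        rows.append((name, classify(code), code))
    return rows


table = load(SAMPLE)
```

`skipinitialspace=True` lets the CSV reader recognise the quoted symbols even though each comma is followed by a space.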
### Adding new multicodecs to the table
@@ -101,6 +102,12 @@ An Most Significant Bit unsigned varint, as defined by the [multiformats/unsigne
Yes, but we already have to agree on what protocols themselves are, so this is not so hard. The table even leaves some room for custom protocol paths, or you can use your own tables. The standard table is only for common things.
+> **Q. Why distinguish between bytes and text?**
+For completeness, we consider
+[multibase](https://github.com/multiformats/multibase) prefixes to be
+multicodecs. However multibase prefixes occur in *text*, and are therefore *symbols*. They may (or may not) have some underlying binary representation but that changes based on the text encoding used.
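A small illustration of why a symbol has no single binary representation: the same one-character multibase prefix produces different bytes under different text encodings. The choice of encodings and the payload string here are arbitrary examples, not part of the spec.

```python
symbol = "z"  # base58btc prefix from table.csv

# The same symbol, three different byte sequences.
as_utf8 = symbol.encode("utf-8")
as_utf16 = symbol.encode("utf-16-le")
as_utf32 = symbol.encode("utf-32-be")

# In text, the prefix is simply the first character of the string.
text = "z" + "payload"  # "payload" is a placeholder, not real base58 data
prefix, rest = text[0], text[1:]
```

Working at the level of characters keeps the prefix check independent of how the text is eventually stored or transmitted.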
## Maintainers
Captain: [@jbenet](https://github.com/jbenet).

@@ -4,17 +4,27 @@ miscellaneous,,
raw, raw binary, 0x55
bases encodings,,
-base1, unary, 0x01
-base2, binary (0 and 1), 0x00
-base8, octal, 0x07
-base10, decimal, 0x09
-base16, hexadecimal, 0x
-base32, rfc4648, 0x
-base32hex, rfc4648, 0x
-base58flickr, base58 flicker, 0x
-base58btc, base58 bitcoin, 0x
-base64, rfc4648, 0x
-base64url, rfc4648, 0x
+identity, raw binary, NUL
+base1, unary, "1"
+base2, binary (0 and 1), "0"
+base8, octal, "7"
+base10, decimal, "9"
+base16, hexadecimal, "f"
+base16-upper, hexadecimal, "F"
+base32, rfc4648, "b"
+base32-upper, rfc4648, "B"
+base32pad, rfc4648, "c"
+base32pad-upper, rfc4648, "C"
+base32hex, rfc4648, "v"
+base32hex-upper, rfc4648, "V"
+base32hexpad, rfc4648, "t"
+base32hexpad-upper, rfc4648, "T"
+base58flickr, base58 flicker, "Z"
+base58btc, base58 bitcoin, "z"
+base64, rfc4648, "m"
+base64pad, rfc4648, "M"
+base64url, rfc4648, "u"
+base64urlpad, rfc4648, "U"
serialization formats,,
cbor, CBOR, 0x51
@@ -35,6 +45,7 @@ multiaddr, , 0x32
multibase, , 0x33
multihashes,,
+identity, raw binary, 0x0
md4, , 0xd4
md5, , 0xd5
sha1, , 0x11

Rendered view of table.csv after this change (rows not already shown in the diff above):

codec, description, code
serialization formats,,
cbor, CBOR, 0x51
bson, Binary JSON, 0x
ubjson, Universal Binary JSON, 0x
protobuf, Protocol Buffers, 0x50

md5, , 0xd5
sha1, , 0x11
sha2-256, , 0x12
sha2-512, , 0x13
dbl-sha2-256, , 0x56
sha3-224, , 0x17
sha3-256, , 0x16