# multicodec
[![](https://img.shields.io/badge/made%20by-Protocol%20Labs-blue.svg?style=flat-square)](http://ipn.io)
[![](https://img.shields.io/badge/project-multiformats-blue.svg?style=flat-square)](https://github.com/multiformats/multiformats)
[![](https://img.shields.io/badge/freenode-%23ipfs-blue.svg?style=flat-square)](https://webchat.freenode.net/?channels=%23ipfs)
[![](https://img.shields.io/badge/readme%20style-standard-brightgreen.svg?style=flat-square)](https://github.com/RichardLitt/standard-readme)
> Compact self-describing codecs. Save space by using predefined multicodec tables.
## Table of Contents
- [Motivation](#motivation)
- [Protocol Description - How does the protocol work?](#protocol-description---how-does-the-protocol-work)
- [Multicodec tables](#multicodecprotocol-tables)
- [Multicodec table](#multicodec-table)
- [Implementations](#implementations)
- [FAQ](#faq)
- [Maintainers](#maintainers)
- [Contribute](#contribute)
- [License](#license)
## Motivation
[Multistreams](https://github.com/multiformats/multistream) are self-describing protocol/encoding streams. Multicodec instead uses an agreed-upon "protocol table". It is designed for use in short strings, such as keys or identifiers (e.g., [CID](https://github.com/ipld/cid)).
## Protocol Description - How does the protocol work?
`multicodec` is a _self-describing multiformat_: it wraps other formats with a tiny bit of self-description. A multicodec identifier is both a varint and the code identifying the data that follows. This means that the most significant bit of every byte of a multicodec code is reserved to signal continuation.

This way, a chunk of data identified by multicodec will look like this:
```sh
<multicodec-varint><encoded-data>
# To reduce the cognitive load, we sometimes might write the same line as:
<mcp><data>
```
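As a rough illustration, the layout above could be produced as follows. This is a minimal sketch rather than a reference implementation: the helper names `encode_varint` and `add_prefix` are hypothetical, and the codes `0x55` and `300` are arbitrary sample values chosen to show a one-byte and a two-byte prefix.

```python
def encode_varint(code: int) -> bytes:
    """Encode a non-negative integer as an unsigned varint (7 bits per byte)."""
    if code < 0:
        raise ValueError("multicodec codes are non-negative")
    out = bytearray()
    while True:
        low7 = code & 0x7F
        code >>= 7
        if code:
            out.append(low7 | 0x80)  # continuation bit set: more bytes follow
        else:
            out.append(low7)         # last byte: continuation bit clear
            return bytes(out)


def add_prefix(code: int, data: bytes) -> bytes:
    """Build <multicodec-varint><encoded-data> (hypothetical helper)."""
    return encode_varint(code) + data


# Codes below 128 fit in a single byte; larger ones need more bytes.
small = add_prefix(0x55, b"hello")  # b"\x55hello"
large = add_prefix(300, b"hello")   # b"\xac\x02hello"
```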
Another useful scenario is using multicodec-packed as part of the keys used to access data. For example:
```
# suppose we have a value and a key to retrieve it
"<key>" -> <value>
# we can use multicodec-packed with the key to know what codec the value is in
"<mcp><key>" -> <value>
```
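The key scheme above might be sketched like this. Everything here is an assumption made for illustration: the codes `CODEC_JSON` and `CODEC_TEXT` are invented for this example and are not taken from the standard table, and `pack_key` and `read_value` are hypothetical helpers. The sketch only handles single-byte codes.

```python
import json

# Assumed application-specific codes, invented for this sketch.
CODEC_JSON = 0x01
CODEC_TEXT = 0x02

# How to decode a stored value, chosen by the code found in the key prefix.
DECODERS = {
    CODEC_JSON: lambda raw: json.loads(raw.decode("utf-8")),
    CODEC_TEXT: lambda raw: raw.decode("utf-8"),
}


def pack_key(code: int, key: bytes) -> bytes:
    """Build "<mcp><key>"; one byte is enough because code < 128."""
    if not 0 <= code < 0x80:
        raise ValueError("this sketch only handles single-byte codes")
    return bytes([code]) + key


def read_value(store: dict, packed_key: bytes):
    """Fetch the value and decode it with the codec named by the key prefix."""
    code = packed_key[0]
    return DECODERS[code](store[packed_key])


store = {
    pack_key(CODEC_JSON, b"config"): b'{"retries": 3}',
    pack_key(CODEC_TEXT, b"motd"): b"hello",
}
```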
It is worth noting that multicodec-packed works very well in conjunction with [multihash](https://github.com/multiformats/multihash) and [multiaddr](https://github.com/multiformats/multiaddr), as you can prefix those values with a multicodec-packed to tell what they are.
## MulticodecProtocol Tables
Multicodec uses "protocol tables" to agree upon the mapping from a multicodec code (a single varint) to the codec or protocol it identifies. These tables can be application-specific, though, as [with](https://github.com/multiformats/multihash) [other](https://github.com/multiformats/multibase) [multiformats](https://github.com/multiformats/multiaddr), we will keep a globally agreed-upon table with common protocols and formats.
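
One way to picture a shared table combined with an application-specific one is sketched below. The entries are sample values assumed for illustration only (the table in this repo is authoritative), and `APP_TABLE`, `merge_tables`, and `codec_name` are hypothetical names.

```python
# Sample entries assumed for illustration; consult the table in this repo
# for the authoritative values.
SHARED_TABLE = {0x55: "raw", 0x70: "dag-pb", 0x71: "dag-cbor"}

# Hypothetical application-specific extension.
APP_TABLE = {0x300001: "my-app-format"}


def merge_tables(shared: dict, extension: dict) -> dict:
    """Combine two tables, refusing to reassign a code that is already taken."""
    clash = shared.keys() & extension.keys()
    if clash:
        taken = ", ".join(hex(code) for code in sorted(clash))
        raise ValueError(f"codes already assigned: {taken}")
    return {**shared, **extension}


def codec_name(table: dict, code: int) -> str:
    """Look up the name a code maps to, with a readable error if unknown."""
    try:
        return table[code]
    except KeyError:
        raise LookupError(f"unknown multicodec code {code:#x}") from None


table = merge_tables(SHARED_TABLE, APP_TABLE)
```

Both sides of a conversation would need to build the same merged table, since the code alone carries no description of what it means.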
## Multicodec table
The full table can be found at [table.csv](/table.csv) inside this repo.
### Adding new multicodecs to the table
The process to add a new multicodec to the table is the following:
1. Fork this repo
2. Update the table with the value you want to add
3. Submit a Pull Request
This ["first come, first assign"](https://github.com/multiformats/multicodec/pull/16#issuecomment-260146609) policy is a way to assign codes as they are most needed, without increasing the size of the table (and therefore the size of the multicodecs) too rapidly.
## Implementations
- [go](https://github.com/multiformats/go-multicodec/)
- [JavaScript](https://github.com/multiformats/js-multicodec)
- [Python](https://github.com/multiformats/py-multicodec)
- [Add yours today!](https://github.com/multiformats/multicodec/edit/master/table.csv)
## Multicodec Path, also known as [`multistream`](https://github.com/multiformats/multistream)
Multicodec defines a table for the most common data serialization formats. The table can be expanded over time or on a per-application basis. However, for two programs to talk to each other, they need to know beforehand which table or table extension is being used.

To enable self-describing data formats or streams that can be described dynamically, without the formal step of adding a binary-packed code to a table, we have [`multistream`](https://github.com/multiformats/multistream). With it, applications can adopt multiple data formats for their streams and build different protocols on top of them.
## FAQ
> **Q. I have questions on multicodec, not listed here.**

That's not a question. But, have you checked the proper [multicodec FAQ](./README.md#faq)? Maybe your question is answered there. This FAQ is only specifically for multicodec-packed.
> **Q. Why?**

Because [multistream](https://github.com/multiformats/multistream) is too long for identifiers. We needed something shorter.
> **Q. Why varints?**

So that there is no limit on the number of protocols. Implementation note: codes up to 127 fit in a single byte, so you do not need to implement varints until the standard multicodec table assigns codes greater than 127.
> **Q. What kind of varints?**

A most-significant-bit (MSB) unsigned varint, as defined by [multiformats/unsigned-varint](https://github.com/multiformats/unsigned-varint).
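
Reading such a varint back could look like the sketch below. It assumes the usual layout in which each byte carries 7 bits of the value, least significant group first, with the top bit set on every byte except the last. The names `decode_varint` and `split_prefix` are hypothetical, and the sketch does not enforce any length limit or minimal-encoding rule that the specification may impose.

```python
def decode_varint(buf: bytes) -> tuple[int, int]:
    """Return (value, bytes_consumed) for the unsigned varint at the start of buf."""
    value = 0
    for i, byte in enumerate(buf):
        value |= (byte & 0x7F) << (7 * i)  # low 7 bits carry the payload
        if (byte & 0x80) == 0:             # top bit clear: this is the last byte
            return value, i + 1
    raise ValueError("truncated varint")


def split_prefix(packed: bytes) -> tuple[int, bytes]:
    """Split <multicodec-varint><encoded-data> into (code, data)."""
    code, consumed = decode_varint(packed)
    return code, packed[consumed:]
```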
> **Q. Don't we have to agree on a table of protocols?**

Yes, but we already have to agree on what protocols themselves are, so this is not so hard. The table even leaves some room for custom protocol paths, or you can use your own tables. The standard table is only for common things.
## Maintainers
Captain: [@jbenet](https://github.com/jbenet).
## Contribute
Contributions welcome. Please check out [the issues](https://github.com/multiformats/multicodec/issues).
Check out our [contributing document](https://github.com/multiformats/multiformats/blob/master/contributing.md) for more information on how we work, and about contributing in general. Please be aware that all interactions related to multiformats are subject to the IPFS [Code of Conduct](https://github.com/ipfs/community/blob/master/code-of-conduct.md).
Small note: If editing the README, please conform to the [standard-readme](https://github.com/RichardLitt/standard-readme) specification.
## License
This repository is only for documents. All of these are licensed under the [CC-BY-SA 3.0](https://ipfs.io/ipfs/QmVreNvKsQmQZ83T86cWSjPu2vR3yZHGPm5jnxFuunEB9u) license © 2016 Protocol Labs Inc. Any code is under an [MIT](LICENSE) license © 2016 Protocol Labs Inc.