Decentralized Durability Engine
Bulat-Ziganshin f24ded0f76
Download files without padding (#218)
The initial goal of this patch was to allow downloading a file via the REST API at exactly the size at which it was uploaded, which required adding the fields Chunker.offset and Manifest.originalBytes to record that size. On top of that, we added more integrity checks to operations on Manifest and reorganized TestNode.nim to test the actual interaction between the node.store and node.retrieve operations.

Note that the wire format of Manifest was changed, so we need to recreate all BlockStores.

* Download without padding
* Fixed chunker tests
* Chunker: get rid of RabinChunker
* Verify offset in the chunker tests
* Use manifest.originalBytesPadded in StoreStream.size
* StoreStream: replace emptyBlock with zeroMem
* Manifest.bytes: compute how many bytes corresponding StoreStream(Manifest, pad) will return
* Manifest: verify originalBytes and originalLen on new/encode/decode
Also set originalBytes in each Manifest creation/update scenario
* Manifest: comments, split code into sections
* Reordered parameters to deal with int64 size in 32-bit builds
* TestNode.nim: combine Store and Retrieve tests
1. Instead of copy-pasting code from node.nim, the new test calls node.store() and node.retrieve() to check that they correctly store and then retrieve data
2. The new test compares only file contents; manifest contents are considered an implementation detail
3. The new test chunks at an odd chunkSize=BlockSize/1.618 to ensure that data is retrieved correctly even when buffer sizes mismatch
* TestNode.nim: code refactoring
* Manifest.add: one more test
* Manifest.verify: return Result instead of raising Defect
* Node.store: added blockSize parameter
2022-08-24 15:15:59 +03:00
.github/workflows ci: more comments added to `.github/workflows/ci.yml` 2022-07-06 19:03:10 -05:00
codex Download files without padding (#218) 2022-08-24 15:15:59 +03:00
metrics Adding metrics (#203) 2022-08-23 10:11:21 -06:00
tests Download files without padding (#218) 2022-08-24 15:15:59 +03:00
vendor Download files without padding (#218) 2022-08-24 15:15:59 +03:00
.editorconfig Project setup 2021-02-02 19:29:52 +01:00
.gitignore windows: ignore *.exe 2022-06-30 17:26:24 -05:00
.gitmodules [build] add github.com/arnetheduck/nim-sqlite3-abi to vendor 2022-08-08 02:12:43 -05:00
Makefile ci: update GitHub Actions CI workflow to use msys2/setup-msys2@v2 2022-07-06 19:03:10 -05:00
README.md Updating download endpoint curl (#212) 2022-08-19 21:20:25 -06:00
codecov.yml [ci] disable pull-request comments by codecov 2022-05-19 15:23:35 +02:00
codex.nim Change every dagger to codex (#102) 2022-05-19 13:56:03 -06:00
codex.nimble Update ethers to version 0.2.0 2022-07-20 13:43:20 +02:00
config.nims ci: update GitHub Actions CI workflow to use msys2/setup-msys2@v2 2022-07-06 19:03:10 -05:00
env.sh add env.sh shim to project root (#34) 2021-12-20 13:12:18 -06:00
nim.cfg Disable ObservableStores warning 2021-11-16 16:51:24 +01:00
nimble.lock Sync submodule dependencies and lock file (#134) 2022-07-19 09:31:32 -06:00

README.md

Codex Decentralized Durability Engine

The Codex project aims to create a decentralized durability engine that allows persisting data in p2p networks. In other words, it allows storing files and data with predictable durability guarantees for later retrieval.

WARNING: This project is under active development and is considered pre-alpha.

License: Apache / MIT. Stability: experimental.

Build and Run

To build the project, clone it and run:

make update && make exec

The executable is placed in the build directory under the project root.

Run the client with:

./build/codex

CLI Options

./build/codex --help
Usage:

codex [OPTIONS]... command

The following options are available:

     --log-level            Sets the log level [=LogLevel.INFO].
     --metrics              Enable the metrics server [=false].
     --metrics-address      Listening address of the metrics server [=127.0.0.1].
     --metrics-port         Listening HTTP port of the metrics server [=8008].
 -d, --data-dir             The directory where codex will store configuration and data..
 -l, --listen-port          Specifies one or more listening ports for the node to listen on. [=0].
 -i, --listen-ip            The public IP [=0.0.0.0].
     --udp-port             Specify the discovery (UDP) port [=8090].
     --net-privkey          Source of network (secp256k1) private key file (random|<path>) [=random].
 -b, --bootstrap-node       Specifies one or more bootstrap nodes to use when connecting to the network..
     --max-peers            The maximum number of peers to connect to [=160].
     --agent-string         Node agent string which is used as identifier in network [=Codex].
 -p, --api-port             The REST Api port [=8080].
 -c, --cache-size           The size in MiB of the block cache, 0 disables the cache [=100].
     --eth-provider         The URL of the JSON-RPC API of the Ethereum node [=ws://localhost:8545].
     --eth-account          The Ethereum account that is used for storage contracts [=EthAddress.default].
     --eth-deployment       The json file describing the contract deployment [=string.default].

Available sub-commands:

codex initNode

Example: running two Codex clients

./build/codex --data-dir="$(pwd)/Codex1" -i=127.0.0.1

This starts codex with a data directory named Codex1 under the current working directory and announces the node on the DHT under 127.0.0.1.

To run a second client that automatically discovers nodes on the network, we need the Signed Peer Record (SPR) of the first client, Client1. We can get it by querying the /info endpoint of the node's REST API.

curl http://127.0.0.1:8080/api/codex/v1/info

This should output information about Client1, including its PeerID, TCP/UDP addresses, data directory, and SPR:

{
  "id": "16Uiu2HAm92LGXYTuhtLaZzkFnsCx6FFJsNmswK6o9oPXFbSKHQEa",
  "addrs": [
    "/ip4/0.0.0.0/udp/8090",
    "/ip4/0.0.0.0/tcp/49336"
  ],
  "repo": "/repos/status-im/nim-codex/Codex1",
  "spr": "spr:CiUIAhIhAmqg5fVU2yxPStLdUOWgwrkWZMHW2MHf6i6l8IjA4tssEgIDARpICicAJQgCEiECaqDl9VTbLE9K0t1Q5aDCuRZkwdbYwd_qLqXwiMDi2ywQ5v2VlAYaCwoJBH8AAAGRAh-aGgoKCAR_AAABBts3KkcwRQIhAPOKl38CviplVbMVnA_9q3N1K_nk5oGuNp7DWeOqiJzzAiATQ2acPyQvPxLU9YS-TiVo4RUXndRcwMFMX2Yjhw8k3A"
}
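As a minimal sketch of pulling the SPR out of that response programmatically, the following Python snippet parses a JSON body with the field names shown above. The function name `extract_spr` and the shortened sample values are illustrative assumptions, not part of Codex.

```python
import json


def extract_spr(info_json: str) -> str:
    """Return the "spr" field of an /info response body.

    Assumes the body is a JSON object shaped like the output above;
    raises ValueError if no "spr:"-prefixed record is present.
    """
    info = json.loads(info_json)
    spr = info.get("spr", "") if isinstance(info, dict) else ""
    if not isinstance(spr, str) or not spr.startswith("spr:"):
        raise ValueError("response does not contain a Signed Peer Record")
    return spr


# Shortened, made-up sample using the same field names as the real output.
sample = (
    '{"id": "16Uiu2HAmExample", "addrs": ["/ip4/0.0.0.0/udp/8090"], '
    '"repo": "/tmp/Codex1", "spr": "spr:EXAMPLE"}'
)
print(extract_spr(sample))
```

The returned string can be passed unchanged to the --bootstrap-node option.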

Now, let's start a second client, Client2. Because the first client is already using the default REST API port (TCP :8080) and discovery port (UDP :8090), we have to specify new ports to avoid a collision. Additionally, we can specify the SPR from Client1 as the bootstrap node for discovery purposes, allowing Client2 to determine where content is located in the network.

./build/codex --data-dir="$(pwd)/Codex2" -i=127.0.0.1 --api-port=8081 --udp-port=8091 --bootstrap-node=spr:CiUIAhIhAmqg5fVU2yxPStLdUOWgwrkWZMHW2MHf6i6l8IjA4tssEgIDARpICicAJQgCEiECaqDl9VTbLE9K0t1Q5aDCuRZkwdbYwd_qLqXwiMDi2ywQ5v2VlAYaCwoJBH8AAAGRAh-aGgoKCAR_AAABBts3KkcwRQIhAPOKl38CviplVbMVnA_9q3N1K_nk5oGuNp7DWeOqiJzzAiATQ2acPyQvPxLU9YS-TiVo4RUXndRcwMFMX2Yjhw8k3A
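If the second client is launched from a script, the same command line can be assembled from its parts. The sketch below only builds the argument list using the options from the command above; the helper name `second_client_args`, the example data directory, and the placeholder SPR are assumptions for illustration.

```python
import shlex
from typing import List


def second_client_args(data_dir: str, spr: str,
                       api_port: int = 8081, udp_port: int = 8091) -> List[str]:
    """Build the argument list for a second local Codex client."""
    return [
        "./build/codex",
        f"--data-dir={data_dir}",
        "-i=127.0.0.1",
        f"--api-port={api_port}",    # must differ from Client1's REST API port
        f"--udp-port={udp_port}",    # must differ from Client1's discovery port
        f"--bootstrap-node={spr}",   # SPR taken from Client1's /info output
    ]


# "spr:EXAMPLE" is a placeholder, not a real Signed Peer Record.
args = second_client_args("/tmp/Codex2", "spr:EXAMPLE")
print(shlex.join(args))
```

The list can be handed to `subprocess.Popen(args)` as-is, which avoids shell quoting issues with the long SPR string.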

There are now two clients running. We could upload a file to Client1 and download that file (given its CID) using Client2, by using the clients' REST API.

Interacting with the client

The client exposes a REST API for interacting with it. The endpoints can be called with any HTTP client; the examples below use curl.

/api/codex/v1/connect/{peerId}

Connect to a peer identified by its peer id. Takes an optional addrs parameter with a list of valid multiaddresses. If addrs is absent, the peer will be discovered over the DHT.

Example:

curl "127.0.0.1:8080/api/codex/v1/connect/<peer id>?addrs=<multiaddress>"
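For callers that are not using curl, a minimal Python sketch of building the same URL follows. The helper name `connect_url` is hypothetical, and passing several multiaddresses by repeating the addrs query parameter is an assumption about how the list is encoded; the text above does not specify it.

```python
from urllib.parse import quote, urlencode


def connect_url(base: str, peer_id: str, addrs=()) -> str:
    """Build the connect URL for a peer, with optional multiaddresses."""
    url = f"{base}/api/codex/v1/connect/{quote(peer_id, safe='')}"
    if addrs:
        # Assumption: each multiaddress is sent as its own "addrs" parameter.
        # urlencode percent-encodes the "/" characters in the multiaddress.
        url += "?" + urlencode([("addrs", a) for a in addrs])
    return url


# Placeholder peer id and address, for illustration only.
print(connect_url("http://127.0.0.1:8080", "16Uiu2HAmExample",
                  ["/ip4/127.0.0.1/tcp/49336"]))
```

Omitting `addrs` yields the bare endpoint, in which case the peer is looked up over the DHT as described above.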

/api/codex/v1/download/{id}

Download data identified by a Cid.

Example:

curl -vvv "127.0.0.1:8080/api/codex/v1/download/<Cid of the content>" --output <name of output file>
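A rough Python equivalent of the curl call above, using only the standard library, might look like this. The helper names `download_url` and `download` and the sample Cid are placeholders; the sketch assumes the endpoint returns the raw file bytes as the response body.

```python
import shutil
import urllib.request
from urllib.parse import quote


def download_url(base: str, cid: str) -> str:
    """Build the download URL for a given Cid."""
    return f"{base}/api/codex/v1/download/{quote(cid, safe='')}"


def download(base: str, cid: str, out_path: str) -> None:
    """Stream the response body to out_path without loading it all into memory."""
    with urllib.request.urlopen(download_url(base, cid)) as resp, \
            open(out_path, "wb") as out:
        shutil.copyfileobj(resp, out)


# "zExampleCid" is a made-up placeholder, not a real Cid.
print(download_url("http://127.0.0.1:8080", "zExampleCid"))
```

Calling `download("http://127.0.0.1:8080", cid, "output.bin")` requires a running node; `urlopen` raises `urllib.error.HTTPError` on a non-success status.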

/api/codex/v1/upload

Upload a file. On success, returns the Cid of the uploaded file.

Example:

curl -vvv -H "content-type: application/octet-stream" -H Expect: -T "<path to file>" "127.0.0.1:8080/api/codex/v1/upload" -X POST
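The same request can be sketched in Python with the standard library. The helper names `upload_request` and `upload` are hypothetical, and reading the Cid from the plain-text response body is an assumption about the response format. Note that this simple version reads the whole file into memory.

```python
import urllib.request


def upload_request(base: str, payload: bytes) -> urllib.request.Request:
    """Build a POST request carrying the file bytes as an octet stream."""
    return urllib.request.Request(
        f"{base}/api/codex/v1/upload",
        data=payload,
        headers={"Content-Type": "application/octet-stream"},
        method="POST",
    )


def upload(base: str, path: str) -> str:
    """Upload the file at path and return the response body (assumed to be the Cid)."""
    with open(path, "rb") as f:
        req = upload_request(base, f.read())
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode().strip()


# Building the request does not contact the node.
req = upload_request("http://127.0.0.1:8080", b"hello")
print(req.get_method(), req.full_url)
```

Sending the request with `upload(...)` requires a running node; the returned string can then be used with the download endpoint on another client.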

/api/codex/v1/info

Get useful node info such as its peer id, address and SPR.

Example:

curl -vvv "127.0.0.1:8080/api/codex/v1/info"