Source: PerKeep.
See also Using PerKeep
Some early notes, as they may be relevant in the longer term to shape the support for small content, files, and dictionaries, and in the short term to improve the Codex protocol and data model.
There is a nice video about PerKeep; some relevant notes from it follow.
Layered Architecture
In the end, it is a blob store, so how they look at it is also interesting to us.
Blobs:
- 0 - 16MB,
- no file names,
- no mime-types,
- no metadata,
- no versions,
- just immutable blocks
Blobs are represented by blob-refs, self-describing identifiers similar to Multihashes:
They use SHA-224. There are no deletes.
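To make the blob-ref idea concrete, here is a minimal sketch of a content-addressed store. The `sha224-<hex>` text form and the `BlobStore` class are assumptions for illustration, not PerKeep's actual code; only the hash choice, the size limit, and the absence of deletes come from the notes above.

```python
import hashlib

MAX_BLOB_SIZE = 16 * 1024 * 1024  # 16 MB upper bound from the notes above


def blob_ref(data: bytes) -> str:
    """Return a self-describing identifier: hash name, dash, hex digest."""
    return "sha224-" + hashlib.sha224(data).hexdigest()


class BlobStore:
    """Hypothetical in-memory store keyed by blob-ref; it has no delete."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        if len(data) > MAX_BLOB_SIZE:
            raise ValueError("blob exceeds 16 MB")
        ref = blob_ref(data)
        # Storing the same bytes twice is a no-op: the ref is the content hash.
        self._blobs.setdefault(ref, data)
        return ref

    def get(self, ref: str) -> bytes:
        return self._blobs[ref]

    def __len__(self) -> int:
        return len(self._blobs)
```

Because the identifier is derived from the content, immutability comes for free: changing a single byte yields a different blob-ref, and therefore a different blob.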
Blobs are not just flat bytes. Blobs can also be JSON objects with certain known fields, e.g. files are represented by JSON objects:
Thus, the blob store keeps both data and metadata.
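As a rough illustration of a file represented as a JSON blob, the sketch below builds one that lists the blob-refs of its parts. The field names (`camliVersion`, `camliType`, `fileName`, `parts`) are assumptions modelled on PerKeep's file schema and should be checked against its schema documentation; `make_file_blob` is a hypothetical helper.

```python
import hashlib
import json


def blob_ref(data: bytes) -> str:
    return "sha224-" + hashlib.sha224(data).hexdigest()


def make_file_blob(file_name: str, chunks: list) -> bytes:
    """Build a JSON schema blob that describes a file by listing its parts."""
    schema = {
        "camliVersion": 1,      # assumed field name
        "camliType": "file",    # assumed field name
        "fileName": file_name,
        "parts": [{"blobRef": blob_ref(c), "size": len(c)} for c in chunks],
    }
    # Sorted keys keep the serialization, and so the blob-ref, deterministic.
    return json.dumps(schema, sort_keys=True).encode("utf-8")


chunks = [b"first part, ", b"second part"]
file_blob = make_file_blob("notes.txt", chunks)
file_ref = blob_ref(file_blob)  # the metadata is itself just another blob
```

The file name lives only in the schema blob, so the data blobs stay free of names and metadata, as listed above.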
5 TB video file or VM image? Merkle tree of "bytes" schema blobs. Data at leaves. Rolling checksum cut points (similar to rsync; how this really works is still to be checked). De-duplication within files & shifting files. Efficient seeks/pread.
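Until the real mechanism is checked, the general idea of rolling-checksum cut points can be sketched as follows. This is a plain rolling sum over a fixed window, not PerKeep's or rsync's actual checksum, and `WINDOW` and `MASK` are arbitrary assumed values. A cut is made wherever the low bits of the window sum match a pattern, so cut points depend on the content rather than on the byte offset.

```python
import random

WINDOW = 16   # bytes covered by the rolling sum (assumed value)
MASK = 0x3F   # cut when the low 6 bits are all ones: ~64-byte average chunks


def chunk(data: bytes) -> list:
    """Split data at content-defined cut points using a rolling sum."""
    chunks, start, rolling = [], 0, 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= WINDOW:
            rolling -= data[i - WINDOW]  # drop the byte leaving the window
        if i + 1 >= WINDOW and (rolling & MASK) == MASK:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing remainder
    return chunks


rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(20_000))
shifted = b"X" + original  # one byte inserted at the front

a, b = chunk(original), chunk(shifted)
shared = set(a) & set(b)
print(f"{len(shared)} of {len(a)} chunks are shared after the shift")
```

With fixed-size blocks, inserting one byte at the front would change every block. Here only the first chunk is affected, which is what makes de-duplication of shifted files possible.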
Files and Dictionaries:
Indexing
The role of the metadata layer is indexing, to speed up the search. Most importantly, it can be fully reconstructed from the BlobStore. As we will see on some slides later, on top of indexing we have something called Corpus, an optimized in-memory key-value store that makes the search even faster.
Below are some slides about indexing:
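The point that the index can be fully reconstructed from the blobs can be sketched like this. `build_index` is a hypothetical helper, and the `camliType` and `fileName` fields are assumed schema field names; the actual PerKeep indexer is considerably richer.

```python
import hashlib
import json


def blob_ref(data: bytes) -> str:
    return "sha224-" + hashlib.sha224(data).hexdigest()


def build_index(blobs: dict) -> dict:
    """Scan every blob and map file names to the refs of their schema blobs."""
    index = {}
    for ref, data in blobs.items():
        try:
            obj = json.loads(data)
        except ValueError:
            continue  # plain bytes, nothing to index
        if isinstance(obj, dict) and obj.get("camliType") == "file":
            index.setdefault(obj["fileName"], []).append(ref)
    return index


raw = b"raw video bytes"
schema = json.dumps(
    {"camliType": "file", "fileName": "clip.mp4",
     "parts": [{"blobRef": blob_ref(raw), "size": len(raw)}]}
).encode("utf-8")
blobs = {blob_ref(raw): raw, blob_ref(schema): schema}

index = build_index(blobs)  # a disposable cache: it can be dropped and rebuilt
```

Since the index holds nothing that is not derivable from the blobs, losing it costs only the time of a rescan.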
Handling mutations
This is handled by something called permanodes and needs further investigation.
A permanode is a signed unique object. Given a permanode, you can then add attributes (or mutations) to it:
There seems to be an error in the above slide: the text should be "Better title", I guess.
Every time you put an attribute on a permanode, you create a mutation, or claim, connected to the base permanode. A claim seems to be a blob in its own right.
Thus, in short, in PerKeep we seem to have mutations by appending.
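A minimal sketch of mutation by appending, under our current understanding: claims are only ever added, and the current state of a permanode is obtained by replaying them in date order. The `Claim` fields, the `current_attributes` helper, and the placeholder ref `sha224-aaaa` are all hypothetical, and signing is left out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    permanode: str   # ref of the permanode being mutated
    date: str        # ISO 8601 timestamp, used for ordering
    attribute: str
    value: str


def current_attributes(permanode: str, claims: list) -> dict:
    """Fold the claims for one permanode, oldest first; last write wins."""
    attrs = {}
    for claim in sorted(claims, key=lambda c: c.date):
        if claim.permanode == permanode:
            attrs[claim.attribute] = claim.value
    return attrs


log = []  # append-only: claims are added, never edited or removed
log.append(Claim("sha224-aaaa", "2024-01-01T10:00:00Z", "title", "Some title"))
log.append(Claim("sha224-aaaa", "2024-01-02T09:30:00Z", "title", "Better title"))

print(current_attributes("sha224-aaaa", log))  # {'title': 'Better title'}
```

The earlier claim is still in the log, so the history of the title is preserved even though the current view shows only the latest value.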
To be further investigated.












