Mirror of https://github.com/status-im/sourcecred.git, synced 2025-02-12 12:38:45 +00:00
Summary:
We store the relational view in `view.json.gz` instead of `view.json`, taking advantage of the isomorphic `pako` library for gzip encoding and decoding.

Sample space savings (note that post bodies are included; i.e., #747 has not been applied):

       SAVE     OLD (B)     NEW (B) REPO
      89.7%       25326        2617 sourcecred/example-github
      82.9%     3257576      555948 sourcecred/sourcecred
      85.2%    11287621     1665884 ipfs/js-ipfs
      88.0%    20953425     2520358 gitcoinco/web
      84.4%    38196825     5951459 ipfs/go-ipfs
      84.9%   205770642    31101452 tensorflow/tensorflow

<details>
<summary>Script to generate space savings output</summary>

```shell
savings() {
    printf '% 7s % 11s % 11s %s\n' 'SAVE' 'OLD (B)' 'NEW (B)' 'REPO'
    for repo; do
        file="${SOURCECRED_DIRECTORY}/data/${repo}/github/view.json.gz"
        if ! [ -f "${file}" ]; then
            printf >&2 'warn: no such file %s\n' "${file}"
            continue
        fi
        script="$(sed -e 's/^ *//' <<EOF
            repo = '${repo}'
            pre_size = $(<"${file}" gzip -dc | wc -c)
            post_size = $(<"${file}" wc -c)
            percentage = '%0.1f%%' % (100 * (1 - post_size / pre_size))
            p = '% 7s % 11d % 11d %s' % (percentage, pre_size, post_size, repo)
            print(p)
EOF
        )"
        python3 -c "${script}"
    done
}
```

</details>

Closes #750.

Test Plan:
Comparing the raw old version with the decompressed new version shows that they are identical:

```
$ <~/tmp/sourcecred/data/sourcecred/example-github/github/view.json \
>     shasum -a 256 -
63853b9d3f918274aafacf5198787e18185a61b9c95faf640a1e61f5d11fa19f  -
$ <~/tmp/sourcecred/data/sourcecred/example-github/github/view.json.gz \
>     gzip -dc | shasum -a 256
63853b9d3f918274aafacf5198787e18185a61b9c95faf640a1e61f5d11fa19f  -
```

Additionally, `yarn test --full` passes, and `yarn start` still loads data and runs PageRank properly.

wchargin-branch: gzip-relational-view
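The savings figures and the round-trip check described in the commit message can be reproduced in miniature. The following is only a sketch using Python's standard `gzip` module rather than `pako`, and the `sample_view` payload is a hypothetical stand-in, since the real `view.json` schema is not shown here. The percentage formula is the one used in the savings script.

```python
import gzip
import hashlib
import json


def savings_percentage(pre_size: int, post_size: int) -> str:
    """Format the space saved, using the same formula as the savings script."""
    return "%0.1f%%" % (100 * (1 - post_size / pre_size))


def roundtrip_is_lossless(raw: bytes) -> bool:
    """Compress and decompress, then compare SHA-256 digests of both versions."""
    restored = gzip.decompress(gzip.compress(raw))
    return hashlib.sha256(raw).hexdigest() == hashlib.sha256(restored).hexdigest()


# Hypothetical payload standing in for the relational view (assumption, not the real schema).
sample_view = json.dumps(
    {"issues": [{"number": n, "body": "example body text " * 20} for n in range(200)]}
).encode("utf-8")

compressed = gzip.compress(sample_view)
print(savings_percentage(len(sample_view), len(compressed)))
print(roundtrip_is_lossless(sample_view))

# Sizes taken from the sourcecred/example-github row of the table.
print(savings_percentage(25326, 2617))  # → 89.7%
```

Comparing digests rather than the byte strings themselves mirrors the `shasum -a 256` comparison in the test plan. In this sketch a direct equality check on the bytes would work equally well.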
Changelog
[Unreleased]
- Store GitHub data compressed at rest, reducing space usage by 6–8× (#750)
- Improve weight sliders display (#736)
- Separate bots from users in the UI (#720)
- Add a feedback link to the prototype (#715)
- Support combining multiple repositories into a single graph (#711)
- Normalize scores so that 1000 cred is split amongst users (#709)
- Stop persisting weights in local store (#706)
- Execute GraphQL queries with exponential backoff (#699)
- Introduce a simplified Git plugin that only tracks commits (#685)
- Rename cred explorer table columns (#680)
- Display version string in the app's footer
- Support hosting SourceCred instances at arbitrary gateways, not just the root of a domain (#643)
- Aggregate over connection types in the cred explorer (#502)
- Start tracking changes in CHANGELOG.md