# Changelog
## [Unreleased]
- Add GitHub reactions to the graph (#846)
- Detect references to commits (#833)
- Detect references in commit messages (#829)
- Add commit authorship to the graph (#826)
- Add `MentionsAuthor` edges to the graph (#808)
<!-- Please add new entries to the _top_ of this section. -->
## [0.1.0]
- Organize weight config by plugin (#773)
- Configure edge forward/backward weights separately (#749)
- Combine "load graph" and "run pagerank" into one button (#759)
- Store GitHub data compressed at rest, reducing space usage by 6–8× (#750)
- Improve weight sliders display (#736)
- Separate bots from users in the UI (#720)
- Add a feedback link to the prototype (#715)
- Support combining multiple repositories into a single graph (#711)
- Normalize scores so that 1000 cred is split amongst users (#709)
- Stop persisting weights in local store (#706)
- Execute GraphQL queries with exponential backoff (#699)
- Introduce a simplified Git plugin that only tracks commits (#685)
- Rename cred explorer table columns (#680)
- Display version string in the app's footer
- Support hosting SourceCred instances at arbitrary gateways, not just
  the root of a domain (#643)
- Aggregate over connection types in the cred explorer (#502)
- Start tracking changes in `CHANGELOG.md`