Commit Graph

56 Commits

Author SHA1 Message Date
Dandelion Mané 64c17f7dba
Change default alpha to 0.2 (#1391)
SourceCred is currently quite sensitive to inadvertent 'tight loops' in
the cred, where (e.g.) one user receives cred but doesn't have many out
edges, resulting in a feedback loop where that person gets
disproportionate cred. See [1] and [2] for some examples.

Per a [suggestion] from @mzargham, I'm going to bandaid this issue by
increasing the alpha parameter; I've increased it 4x from 0.05 to 0.2.
Subjectively, I think this improves the cred quality.

[1]: https://discourse.sourcecred.io/t/sneak-peek-sourcecred-discourse-plugin/171
[2]: https://discourse.sourcecred.io/t/preliminary-credsperiment-cred/219
[suggestion]: https://discourse.sourcecred.io/t/preliminary-credsperiment-cred/219/16?u=decentralion
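
To make the effect of alpha concrete, here is a minimal PageRank sketch in which `alpha` is the probability of teleporting to a uniformly random node. It is not SourceCred's implementation: the `pagerank` function, the three-node graph, and uniform teleportation are all assumptions chosen for illustration.

```javascript
// Minimal PageRank sketch (not SourceCred's implementation).
// `alpha` is the chance of jumping to a uniformly random node instead of
// following an out-edge. Every node is assumed to have at least one out-edge.
function pagerank(outEdges, alpha, iterations = 200) {
  const nodes = Object.keys(outEdges);
  const n = nodes.length;
  let score = Object.fromEntries(nodes.map((v) => [v, 1 / n]));
  for (let i = 0; i < iterations; i++) {
    // Each node starts the round with its teleportation share.
    const next = Object.fromEntries(nodes.map((v) => [v, alpha / n]));
    for (const v of nodes) {
      const targets = outEdges[v];
      for (const t of targets) {
        next[t] += ((1 - alpha) * score[v]) / targets.length;
      }
    }
    score = next;
  }
  return score;
}

// Hypothetical graph: `loop` only points at itself, so it hoards score.
const graph = {
  a: ["b", "loop"],
  b: ["a", "loop"],
  loop: ["loop"],
};
const low = pagerank(graph, 0.05);
const high = pagerank(graph, 0.2);
console.log(low.loop.toFixed(3), high.loop.toFixed(3));
```

In this toy graph the self-looping node keeps a smaller share of the total score at alpha 0.2 than at 0.05, which is the dampening the commit relies on.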
2019-09-30 10:49:25 -06:00
Dandelion Mané 6e2af1070f
Expose alpha in TimelineExplorer (#1390)
This commit modifies the TimelineExplorer so that the user can both see
the chosen alpha value, and change it. Alpha has a pretty profound
impact on the final scores, and I want to tweak it for CredSperiment
week two, so this is an important addition.

Test plan: Modify the alpha, re-run cred calculation, and observe that
the scores change. `yarn test` passes.
2019-09-30 10:33:15 -06:00
Dandelion Mané 54ece536d3
Integrate the identity plugin (#1385)
This commit integrates the identity plugin, which was created in #1384.
It does this by adding explicit identity fields to the project
configuration, which are then applied when loading the graph in
`api/load.js`.

The actual integration is quite straightforward.

Test plan: The underlying logic is thoroughly tested; I added one new
test case to verify that it is integrated properly. Since the project
compat has changed, I've updated all the snapshots. Prior to merging
this PR, I will produce one "integration test", using this code to do
identity resolution for a real project (i.e. on the SourceCred instance
itself).
2019-09-20 12:08:27 +02:00
Dandelion Mané 007568d3f0
Add `sourcecred discourse` command (#1374)
This adds a new command, `discourse`, which makes it convenient to load
Discourse servers as standalone SourceCred projects.

For example, you could load the official SourceCred discourse via the
following:

```sh
export SOURCECRED_DISCOURSE_KEY=....
yarn backend
node bin/sourcecred.js discourse https://discourse.sourcecred.io credbot
yarn start
```

I've updated the README with instructions for using the plugin.

Test plan: No automated testing because I see this tool as a temporary
placeholder until we get the SourceCred instances set up. I manually
tested the error cases (e.g. providing an invalid server url) as well as
success cases like the one above. I validated that the weights file
argument is being interpreted correctly (i.e. trying to load invalid
weights produces an expected error message, loading valid weights
results in those weights being present in the UI).
2019-09-19 12:32:49 +02:00
Dandelion Mané 093955dea1
scores command no longer assumes GitHub plugin (#1372)
Previously, the `sourcecred scores` command assumed that all users are
GitHub users, and assigned users an id based on their GitHub login.

Now, the command returns information on all users, regardless of which
plugin provided them. As such, we need to identify users differently.
Instead of a string id, they now have an array of address parts. That
array contains all of the parts of their corresponding node address.

For example, the GitHub user `@Beanow` would correspond to the address
array `["sourcecred", "github", "USERLIKE", "USER", "Beanow"]`

As a general convention, the first two components of any node's address
contain information about the plugin that owns that node. The first
component is the owner of the plugin, and the second is the name of the
plugin. Afterwards, the plugin may represent nodes in whatever manner it
sees fit.
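
As a rough illustration of that convention, the sketch below splits an address array into its plugin prefix and the plugin-defined remainder. The `describeAddress` helper is hypothetical and is not part of the `scores` command.

```javascript
// Hypothetical helper: split a node address array into the plugin prefix
// (owner + plugin name) and the plugin-defined remainder.
function describeAddress(parts) {
  if (parts.length < 2) {
    throw new Error("address needs at least an owner and a plugin name");
  }
  const [owner, plugin, ...rest] = parts;
  return {owner, plugin, rest};
}

const beanow = ["sourcecred", "github", "USERLIKE", "USER", "Beanow"];
const info = describeAddress(beanow);
console.log(`${info.owner}/${info.plugin}`, info.rest);
```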

Thanks to @Beanow and @vsoch for some feedback and discussion on this
design.

Test plan: Snapshots have been updated. `yarn test` passes.
2019-09-10 23:49:45 +02:00
Dandelion Mané 545b084146
Change TimelineCred filtering strategy (#1358)
This changes how TimelineCred filtering works. Instead of using the
filterTimelineCred module, which includes all nodes matching
filterPrefixes, we now take all nodes matching scorePrefixes and
additionally the top `k` nodes for every other type.

This ensures that we will have the top comments, pull requests, issues,
etc in the UI, without needing to take every single comment or PR or
issue.
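
A minimal sketch of this strategy, under assumptions: nodes are plain objects carrying an `address` array, a `type` string, and a `score`, and the `filterNodes` and `hasPrefix` names are made up for illustration rather than taken from the TimelineCred code.

```javascript
// Sketch of the filtering strategy (all names here are hypothetical).
// nodes: [{address: string[], type: string, score: number}]
function hasPrefix(address, prefix) {
  return prefix.every((part, i) => address[i] === part);
}

function filterNodes(nodes, scorePrefixes, k) {
  const kept = [];
  const othersByType = new Map();
  for (const node of nodes) {
    if (scorePrefixes.some((p) => hasPrefix(node.address, p))) {
      kept.push(node); // every node matching a score prefix is kept
    } else {
      if (!othersByType.has(node.type)) othersByType.set(node.type, []);
      othersByType.get(node.type).push(node);
    }
  }
  for (const group of othersByType.values()) {
    group.sort((a, b) => b.score - a.score);
    kept.push(...group.slice(0, k)); // only the top k of every other type
  }
  return kept;
}

const sample = [
  {address: ["sourcecred", "github", "USERLIKE", "USER", "a"], type: "USER", score: 5},
  {address: ["sourcecred", "github", "ISSUE", "1"], type: "ISSUE", score: 3},
  {address: ["sourcecred", "github", "ISSUE", "2"], type: "ISSUE", score: 9},
  {address: ["sourcecred", "github", "PULL", "7"], type: "PULL", score: 1},
];
const result = filterNodes(sample, [["sourcecred", "github", "USERLIKE"]], 1);
console.log(result.map((n) => n.address.join("/")));
```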

Concurrently, the UI is updated so that every type is included in the
filter dropdown.

CHANGELOG has been updated, since this is user facing.

Test plan: `yarn test` passes, snapshots are updated, and I also tested
the UI manually.
2019-09-08 00:32:10 +02:00
Dandelion Mané c62ddccfec
Release version 0.4.0 (#1271)
Test plan: `yarn test --full`
2019-08-07 20:12:11 +02:00
Dandelion Mané 26c0910a1f
TimelineExplorer: Enable changing selected type (#1268)
The code is mostly ported from the legacy app. However, we no longer
assume that we are showing every type for every plugin. Instead, the
types are manually selected. For now, we permit the GitHub user type,
and the GitHub repo type, as these are the two types that are included
in filtered timeline cred.

Test plan: Manual inspection is necessary, since this frontend is mostly
untested. I've done that inspection. Also, `yarn test` passes.
2019-08-07 17:54:04 +02:00
Dandelion Mané ca4fb2bc5d Remove deprecated commands and adapters
This commit removes the `pagerank` and `analyze` commands (both of which
never saw real usage), removes the outdated adapter-based `loadGraph`
method, and removes all traces of the analysis adapters.

It builds on work in #1233 and #1136.

Test plan: `yarn test --full` passes.
2019-07-23 01:31:18 +01:00
Dandelion Mané b4c2846ed0 Update CHANGELOG for #1233 2019-07-23 01:01:09 +01:00
Robin van Boven 7509a78f65 Add --weights as load option (#1224)
Includes a change to `cli/load` and `build_static_site.sh` to accept a `--weights WEIGHTS_FILE` argument.
This allows overriding the default weights at build-time using a `weights.json` that has the same format as previously generated in the frontend.

Test plan:
Adds an additional test-case as well for propagating the optional parameter.
The file I/O of loading and parsing a weights.json file was tested manually, since analysis/weights' fromJSON() is tested elsewhere, as is passing weight parameters.
2019-07-15 15:25:28 +01:00
Dandelion Mané 8e0bbcf597 Change version to 0.3.0 2019-07-11 21:53:11 +01:00
Dandelion Mané f5172f8098 CHANGELOG: We now support node 10 and 12 2019-07-11 06:38:27 +01:00
Dandelion Mané 8d6f62d4b3 Update CHANGELOG.md to mention Timeline Cred UI 2019-07-11 06:33:41 +01:00
Dandelion Mané 2d16afe891 Update CHANGELOG.md
Test plan: Visual inspection
2019-07-11 01:30:27 +01:00
Tyler Mace 1fbf8cd587 Quicker failure and description when invalid token supplied (#1161)
Fixes #1156

When users export a GitHub API token that has insufficient privileges
or has been revoked, we have been using a catch all error with retry
to handle it. This change adds a new error type for bad credentials
and does not retry.

Test plan:
There are no unit tests that cover this, however, you can test the
change by supplying a revoked token and attempting to load a GitHub
repo.
2019-05-30 22:18:30 +03:00
Dandelion Mané a831e05e5f
Add a WeightsFileManager (#1150)
This adds a WeightsFileManager component that allows the user to save or
load weights in the cred explorer. Clicking the download icon downloads
the weights, clicking the upload icon uploads them.

I also did a slight refactor to the FileUploader so that it no longer
always provides the file upload icon; instead, the instantiator passes
children which act as the upload clickable. Seemed more consistent.

Test plan: No tests added, but I manually tested that upload and
download both work.
2019-05-21 04:41:00 +03:00
Dandelion Mané d2559960bb explorer: tweak weights on a per-node basis (#1143)
This pull request adds a weight slider to every NodeRow in the explorer,
enabling the user to manually set a weight for that node. The weights are
multiplicative with the type level weights, so that they can be changed
independently (e.g. you can have a comment that is weighted 2x higher than
regular comments, but still have comments get a low weight in general).
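
The multiplicative relationship can be sketched as follows. The `effectiveWeight` function, the node shape, and the use of `Map`s are assumptions for illustration, not the actual `weights` module API.

```javascript
// Hypothetical weight lookup: a manual per-node weight multiplies the
// weight of the node's type. Anything left unset counts as 1x.
function effectiveWeight(node, typeWeights, manualWeights) {
  const typeWeight = typeWeights.get(node.type) ?? 1;
  const manualWeight = manualWeights.get(node.id) ?? 1;
  return typeWeight * manualWeight;
}

// Comments get a low weight in general, but one comment is boosted 2x.
const typeWeights = new Map([["COMMENT", 0.25]]);
const manualWeights = new Map([["comment-42", 2]]);
const boosted = effectiveWeight(
  {id: "comment-42", type: "COMMENT"},
  typeWeights,
  manualWeights
);
const regular = effectiveWeight(
  {id: "comment-43", type: "COMMENT"},
  typeWeights,
  manualWeights
);
console.log(boosted, regular);
```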

This pull coordinates a number of different changes across the codebase, all of
which are tested:

- Adding support for manual weights in the weights and
weightsToEdgeEvaluator modules.
- Modifying pagerankTable.TableRow so that it can show a slider in the second
column.
- Adding piping for manual weights into the PagerankTable shared props, and
into the explorer app
- Adding the slider to the NodeRow class that displays the current weight,
and can trigger the upstream weight change
- Ensuring that the runPagerank call in the explorer actually uses the manual
weights

At present, there is no way to save these weights (they are ephemeral in the
frontend) and so this is clearly a prototype/tech demo level feature rather
than being ready for real usage. Correspondingly, the CLI pagerank command always
uses an empty set of manual weights. I plan to remedy this in a follow-on pull
request.

Test plan: Run the included unit tests (yarn test) and also spin up the UI,
verify that it visually looks good in both Firefox and Chrome, and verify that
changing the weights and then re-running PageRank actually causes the cred of
the modified node to change.

Review plan: In addition to carefully reading the code, ensure that all of the
changes described a few paragraphs up are actually tested.

Merge plan: Squash and merge.

Thanks to @s-ben for proposing this feature in Discord, and to everyone
discussing its implications in this Discourse thread.
2019-05-18 19:21:17 +03:00
Brian Litwin 0f038305a2 Add CLI command to clear sourcecred data directory (#1111)
Resolves #1067

Adds the CLI commands:
`sourcecred clear --all` -- removes the $SOURCECRED_DIRECTORY
`sourcecred clear --cache` -- removes the cache directory
`sourcecred clear --help` -- provides usage info
`sourcecred clear` -- prompts the user to be more specific

Test plan:
The unit tests ensure that the command is properly wired into the
sourcecred CLI, including help text integration. However, just to be
safe, we can start by verifying that calling `sourcecred` without
arguments lists the `clear` command as a valid option, and that
calling `sourcecred help clear` prints help information. (Note: it's
necessary to run `yarn backend` before testing these changes)

The unit tests also ensure that the command removes the proper
directories, so there isn't really a need to manually test it,
although the reviewer may choose to do so to be safe.

Although out of scope for unit tests on this function, we can also do
integration tests, to make sure that running the clear command doesn't
leave the sourcecred directory in an invalid state from the perspective of the `load` command.

```sh
$ yarn backend;
$ node bin/sourcecred.js load sourcecred/example-github;
$ node bin/sourcecred.js clear --cache;
$ node bin/sourcecred.js load sourcecred/example-github;
$ node bin/sourcecred.js clear --all;
$ node bin/sourcecred.js load sourcecred/example-github;
```
The expected behavior of the above command block is that the load command never fails or throws an error.

@decentralion and I discussed the scenario where `rimraf` errors.
We decided that testing this scenario wasn't necessary, because
`rimraf` doesn't error if a directory doesn't exist, and
rimraf's maintainer suggests [monkey-patching the fs module]
to get rimraf to error in testing scenarios.

Thanks @decentralion for reviewing and pair-programming this with me.

[monkey-patching the fs module]: https://github.com/isaacs/rimraf/issues/31#issuecomment-29534796
2019-05-13 12:59:58 +03:00
Dandelion Mané 012c4f3eb7
Add `sourcecred pagerank` for backend pagerank (#1114)
This commit adds a new CLI command, `pagerank`, which runs PageRank on a
given repository. At present, the command only ever uses the default
weights, although I plan to make this configurable in the future. The
command then saves the resultant pagerank graph in the SourceCred
directory.

On its own, this command is not yet very compelling, as it doesn't
present any easily-consumed information (e.g. users' scores). However,
it is the first step for building other commands which do just that. My
intention is to make running this command the last step of `sourcecred
load`, so that future commands may assume the existence of pagerank
scores for any loaded repository.

Test plan: The new command is thoroughly tested; see
`cli/pagerank.test.js`. It also has nearly perfect code coverage (one
line missing, the dependency-injected real function for loading graphs).

Additionally, the following sequence of commands works:
```
$ yarn backend
$ node bin/sourcecred.js load sourcecred/pm
$ node bin/sourcecred.js pagerank sourcecred/pm
$ cat $SOURCECRED_DIRECTORY/data/sourcecred/pm/pagerankGraph.json
```

Material progress on #967.
2019-03-25 18:05:58 -07:00
Ana Noemi c48b2cd52e
Node and edge description tooltips (#1081)
* Show tooltips in weightConfig UI

* Updated to pass checks from prettier

* Updates unit tests to check WeightSlider descriptions

* Update CHANGELOG.md to reflect PR #1081
2019-03-07 18:49:27 +09:00
Dandelion Mané 996899ade3
Add CLI command: `sourcecred export-graph` (#1110)
* Add CLI command: `sourcecred export-graph`

This adds an `export-graph` command to the SourceCred CLI. It exports
the combined cred graphs for individual RepoIds, as was done for
[sourcecred/research#4].

Example usage:
```
$ node bin/sourcecred.js load sourcecred/mission
$ node bin/sourcecred.js export-graph sourcecred/mission >
  /tmp/mission_graph.json
```

Test plan:
The new command is thoroughly unit tested.

[sourcecred/research#4]: https://github.com/sourcecred/research/pull/4

* Address review feedback by @wchargin
2019-03-01 15:33:40 -07:00
Dandelion Mané a56c941b80
Enable loading private git repositories (#1085)
* Enable loading private git repositories

This commit enables loading private repositories, assuming that the user
has ssh-agent configured with keys to allow cloning the private
repository, and has provided a GitHub API token with permissions for the
repository in question.

I have not added automated testing. I don't think a cost-benefit
analysis favors adding such tests at this time:
- This code changes very infrequently, and so is unlikely to break
- If it does break, it will be pretty easy to catch and to fix
- the @sourcecred org is on a free plan, which doesn't allow private
repos, so setting up the test case is a bit of a pain

Test plan: `yarn test --full` passes, so I haven't broken existing Git
clone behavior. Locally, I am able to load private repositories.

* Remove unnecessary process import.
2019-02-11 14:36:14 -07:00
Ian Darrow 642a62437b Update WeightSlider.js to allow 0 weights (#1005)
This commit resolves #811, allowing users to set the weights of node/edge types to 0.

The WeightSlider now sets the weight to 0 when it's dragged to its minimum value.
The logic for converting between weights and sliders has also been made more robust,
and is more thoroughly tested.

In cases where we wanted to set the weight to 0 (e.g. backwards Reaction edges),
the default weight has been changed.
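
One way to picture the slider-to-weight conversion is the sketch below. The ±5 range, the base-2 scaling, and the function names are assumptions made for illustration; they are not copied from `WeightSlider.js`.

```javascript
// Sketch of a log-space slider whose minimum maps to a weight of exactly 0.
// The range and base-2 scaling are assumptions, not taken from WeightSlider.js.
const MIN_SLIDER = -5;
const MAX_SLIDER = 5;

function sliderToWeight(slider) {
  if (!(slider >= MIN_SLIDER && slider <= MAX_SLIDER)) {
    throw new Error(`slider out of range: ${slider}`);
  }
  // The far-left position is special-cased so that it means "no weight".
  return slider === MIN_SLIDER ? 0 : 2 ** slider;
}

function weightToSlider(weight) {
  if (!(weight >= 0)) {
    throw new Error(`invalid weight: ${weight}`);
  }
  // log2(0) is -Infinity, which clamps to the minimum slider position.
  return Math.max(MIN_SLIDER, Math.min(MAX_SLIDER, Math.log2(weight)));
}

console.log(sliderToWeight(MIN_SLIDER), sliderToWeight(0), weightToSlider(4));
```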

Test plan:
Loading the UI, check that the sliders still work as expected (dragging them changes the displayed weight, dragging to the far left sets weight to 0). Check that the weights are consumed as expected (setting weight for issues to 0 leads to no cred for issues). Check that the weights for backwards reaction edges now have 0 weight. `git grep "TODO(#811)"` returns no hits.
2019-02-10 13:41:00 -07:00
Brian Litwin 020200f21d
Changelog: add rocket and eyes reaction types (#1075)
Test Plan:
Make sure the pull request number is correct
2019-01-25 19:34:12 -05:00
Dandelion Mané 210b4bd071
Update the changelog (one-page-per-project) (#990)
Test plan: n/a
2018-11-01 16:54:41 -07:00
William Chargin f9bb75ef71
release: v0.2.0 (#952)
Test Plan:
Remove the SourceCred output directory, run `yarn backend`, and load
data for `sourcecred/example-github` and `sourcecred/sourcecred`. Then,
run `yarn start` and note that the cred explorer still works. Finally,
note that `yarn test --full` passes.

wchargin-branch: release-v0.2.0
2018-10-30 15:18:19 -07:00
William Chargin ea575cf5da
changelog: add Mirror module entry (#951)
Summary:
This points to #622 as the blanket issue, though really there was a long
series of pull requests worth of implementation.

Test Plan:
None.

wchargin-branch: changelog-mirror
2018-10-29 19:49:11 -07:00
Dandelion Mané 4a374d755e
Hyperlink Git commits to GitHub (#887)
This modifies the `nodeDescription` code for the Git plugin so that when
given a Git commit, it will hyperlink to that commit on GitHub. It does
this by looking up the corresponding `RepoId`s from the newly-added
`commitToRepoId` field in the `Repository` (#884).

Per a [suggestion in review], rather than hardcoding the GitHub url
logic in the Git plugin, we provide them via a `GitGateway`.

[suggestion in review]: https://github.com/sourcecred/sourcecred/pull/887#issuecomment-424059649

When no `RepoId` is found, it errors to console and does not include a
hyperlink. When multiple `RepoId`s are available, it chooses to link to
one arbitrarily. (In the future, we could amend this behavior to add
links to every valid repo). This behavior is tested.
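
The lookup-and-fallback behavior described above might look roughly like this. The `commitLink` function, the shape of the `commitToRepoId` data, and the gateway object are hypothetical stand-ins, not the plugin's real types.

```javascript
// Hypothetical sketch: choose a repo for a commit and build its URL through
// a gateway, so the URL logic stays out of the Git plugin itself.
const githubGateway = {
  commitUrl: (repoId, hash) =>
    `https://github.com/${repoId.owner}/${repoId.name}/commit/${hash}`,
};

function commitLink(hash, commitToRepoId, gateway) {
  // Assumed shape: {[hash]: Array<{owner, name}>}
  const repoIds = commitToRepoId[hash] || [];
  if (repoIds.length === 0) {
    console.error(`no RepoId found for commit ${hash}`);
    return null; // the caller renders the commit without a hyperlink
  }
  // With several candidates, pick one arbitrarily (here: the first).
  return gateway.commitUrl(repoIds[0], hash);
}

const lookup = {abc123: [{owner: "sourcecred", name: "sourcecred"}]};
console.log(commitLink("abc123", lookup, githubGateway));
```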

Test plan:
I ran the application on newly-generated data and verified that it sets
up commit hyperlinks appropriately. Also, see unit tests.
2018-09-27 20:32:43 -07:00
William Chargin c7ba89b807
license: relicense under MIT + Apache-2 (#896)
Summary:
All contributors to SourceCred have agreed to this more permissive
licensing option:

  - @decentralion: [link to comment][decentralion]
  - @wchargin: [link to comment][wchargin]
  - @claireandcode: [link to comment][claireandcode]

[decentralion]: https://github.com/sourcecred/sourcecred/issues/812#issuecomment-420817902
[wchargin]: https://github.com/sourcecred/sourcecred/issues/812#issuecomment-420819732
[claireandcode]: https://github.com/sourcecred/sourcecred/issues/812#issuecomment-424914639

Archive link to thread: <https://archive.fo/BH2v5>

Resolves #812.

Test Plan:
Note that the GitHub tree explorer correctly links from the README to
the individual license files.

wchargin-branch: license-dual-mit-apache2
2018-09-26 19:28:41 -07:00
Dandelion Mané 1e5f728e29
Cred explorer: display commit short hash + summary (#879)
This modifies how commits are displayed in the cred explorer. Rather
than printing the full hash, we now print a short hash followed by the
summary.
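
A minimal sketch of such a description follows. The seven-character short hash, the colon separator, and the `describeCommit` name are assumptions; the explorer's exact format may differ.

```javascript
// Hypothetical formatter: short hash followed by the first line of the message.
function describeCommit(hash, message, hashLength = 7) {
  const summary = message.split("\n", 1)[0];
  return `${hash.slice(0, hashLength)}: ${summary}`;
}

console.log(
  describeCommit("1e5f728e29aa", "Display short hash + summary\n\nLonger body")
);
```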

Test plan:
Snapshot is updated, also I tested it by running SourceCred on a real
repository.
2018-09-21 13:24:28 -07:00
Dandelion Mané cdceedef8d
Display urls in the cred explorer (#860)
This commit modifies the plugin adapter's `nodeDescription` method so
that it may return a React node.

This enables the GitHub plugin's `nodeDescription` method to include
hyperlinks directly to the referenced content on GitHub. This makes
examining e.g. comment cred much easier.

I've also made two other changes to the descriptions:
- Pull request diffs now color-encode the additions and deletions
- Descriptions for comments and reviews no longer include the authors

The Git plugin's behavior is unchanged.

Test plan:
I loaded a large repository in the cred explorer and verified that
exploring comments and pulls and issues is much easier. The descriptions
are as expected for every category of node. Snapshot tests updated.

Fixes #590.
2018-09-20 10:48:05 -07:00
Dandelion Mané 62d3c180ee
Add GitHub reactions to the graph (#846)
* Define Reaction edges

This adds support to `github/edges` for creating edges representing
GitHub reactions. These edges are not actually added to the graph.

Test plan: Unit tests

* Add GitHub reactions to the graph

This commit adds functional support for reactions in SourceCred.
Only thumbs-up, heart, and hooray reactions are supported for now, as
they are all unambiguously positive; adding support for negative
reactions like thumbs-down will require some more thought.

The reactions are added to the graph, and new edge types have been added
to the UI.

Test plan:
The `graphView` class has been updated to do invariant checking for the
reaction edges, including that the unsupported reaction types like
"THUMBS_DOWN" aren't added to the graph.

I've tested this feature by downloading data for a large repository
(ipfs/go-ipfs). The reaction edges appear and transfer cred reasonably.
The edge types are displayed in the weight config appropriately.

Builds on #839, #840, and #845.
2018-09-17 13:44:11 -07:00
Dandelion Mané aecf64b026
Detect references to commits (#833)
Now that #832 gave us logic to parse references to commits, we have the
RelationalView find and add these references. The actual change is
a simple extension of existing reference detection logic.

Test plan: Observe that the snapshots are updated with references to
commits from the example-github repository.

Progress on #815.
2018-09-14 11:56:16 -07:00
Dandelion Mané ab85c9785b
Detect references in commit messages (#829)
Now that the GitHub plugin knows about commit messages (#828), we can
parse those commit messages to find references to other GitHub entities.

Fixed a minor typing mistake along the way.

Test plan:
Observe that a number of references have been detected among the commits
in the example GitHub repository. We mistakenly find references to
wchargin because we don't have a proper tokenizer. (#481)

Progress on #815.
2018-09-13 15:46:39 -07:00
Dandelion Mané c68cb29769
Add commit authorship to the graph (#826)
In #824, we loaded every commit in the default branch's history into the
GitHub relational view, along with authorship info. This commit actually
uses that authorship info to create AUTHORS edges from the commit to the
user that authored it (whenever possible).

The implementation is quite simple: we just need to yield the commits
when we yield all the authored entities, so that we will process their
authors and add them to the graph. Also, I updated the invariant
declarations in `graphView.js`, and corrected a type signature so that the
new invariants would typecheck.

Test plan: The snapshot update shows that commits are being added to the
graph appropriately. Observe that commits which do not have a valid
GitHub user as their author do not correspond to edges in the graph.
See [example].

This is basically a solution to #815, but I'll defer closing that issue
until I've added a few more features, like reference detection.

[example]: 6bd1b4c0b7
2018-09-13 14:19:37 -07:00
Dandelion Mané 2a5c093286
Update CHANGELOG.md (#820)
It now mentions that we added `MentionsAuthor` edges to the GitHub
graph in #808.

Thanks @whyrusleeping for suggesting this heuristic.

Test plan: n/a
2018-09-12 20:30:35 -07:00
Dandelion Mané 508fbc5d72
Release 0.1.0 (#799)
Test plan: I ran `yarn test --full`. I also regenerated data from
scratch and manually tested the cred explorer.
2018-09-06 19:06:16 -07:00
Dandelion Mané 2779770af5
Organize weights by plugin (#773)
This commit adds PluginWeightConfig, which is responsible for
adding all the weights for an individual plugin. The top-level
WeightConfig now creates multiple PluginWeightConfigs. It also takes
responsibility for hiding the FallbackPlugin.

Test plan: The PluginWeightConfig is tested (and fairly simple). The
top-level WeightConfig is not yet tested (#604), so I manually tested
that the weights in the app still function.
2018-09-05 11:57:20 -07:00
Dandelion Mané c7e5a3b87d
Configure forward/backward edge weights separately (#749)
This commit introduces a new component, `EdgeTypeConfig`, which is
responsible for configuring the weights for a given edge type. The
config creates two `WeightSlider`s: one for the forward direction, and
one for the backward direction. The `DirectionalitySlider` is no longer
used, and is removed. This fixes #596.

So as to avoid confusion, we now describe every edge with variables, as
in 'α REFERENCES β', and clarify that the weight modifies how cred flows
from β to α. This necessitated the creation of an `EdgeWeightSlider`,
local to the `EdgeTypeConfig`, which sets up a `WeightSlider` with the
necessary greek characters.

The EdgeTypeConfig is tested, so this is continuing progress towards
solving #604.

Test plan: I manually verified that modifying edge weights has the
expected effect on cred scores. Also, some new unit tests are included.
2018-09-04 15:37:00 -07:00
Dandelion Mané 44407b5520
Combine loadGraph and runPagerank into one button (#759)
* StateTransitionMachine.loadGraph reports success

Step one towards #586. This will enable us to chain runPagerank after
loadGraph only if the load went through successfully.

Test plan: Unit tests included.

* Add StateTransitionMachine.loadGraphAndRunPagerank

This method combines `loadGraph` and `runPagerank` into one method
which internally chains the two methods. `runPagerank` is only called if
`loadGraph` was successful.

Progress on #586.

Test plan:
The new method has attached unit tests. I implemented the unit tests via
mocking, which seemed quite convenient as the method is basically a
wrapper for chaining two other function calls.
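
The chaining can be sketched as below. This is an illustration only: the `machine` object and a `loadGraph` that resolves to a boolean are assumptions about the shape of the state transition machine, and the fake stands in for the mocks mentioned above.

```javascript
// Sketch only: run the second async step only if the first one succeeded.
// The `machine` shape (a `loadGraph` that resolves to a boolean) is assumed.
async function loadGraphAndRunPagerank(machine) {
  const loaded = await machine.loadGraph();
  if (loaded) {
    await machine.runPagerank();
  }
  return loaded;
}

// Usage with a hand-written fake that simulates a failed load.
const calls = [];
const fakeMachine = {
  loadGraph: async () => {
    calls.push("loadGraph");
    return false;
  },
  runPagerank: async () => {
    calls.push("runPagerank");
  },
};
loadGraphAndRunPagerank(fakeMachine).then(() => console.log(calls));
```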

* Combine loadGraph and runPagerank into one button

Resolves #586. The new button is called "Analyze cred".

Test plan: Unit tests, also I tested it manually.
2018-09-03 14:34:14 -07:00
William Chargin 7f81337d74
Store GitHub data gzipped at rest (#751)
Summary:
We store the relational view in `view.json.gz` instead of `view.json`,
taking advantage of the isomorphic `pako` library for gzip encoding and
decoding.

Sample space savings (note that post bodies are included; i.e., #747 has
not been applied):

       SAVE     OLD (B)     NEW (B) REPO
      89.7%       25326        2617 sourcecred/example-github
      82.9%     3257576      555948 sourcecred/sourcecred
      85.2%    11287621     1665884 ipfs/js-ipfs
      88.0%    20953425     2520358 gitcoinco/web
      84.4%    38196825     5951459 ipfs/go-ipfs
      84.9%   205770642    31101452 tensorflow/tensorflow

<details>
<summary>Script to generate space savings output</summary>

```shell
savings() {
    printf '% 7s % 11s % 11s %s\n' 'SAVE' 'OLD (B)' 'NEW (B)' 'REPO'
    for repo; do
        file="${SOURCECRED_DIRECTORY}/data/${repo}/github/view.json.gz"
        if ! [ -f "${file}" ]; then
            printf >&2 'warn: no such file %s\n' "${file}"
            continue
        fi
        script="$(sed -e 's/^ *//' <<EOF
            repo = '${repo}'
            pre_size = $(<"${file}" gzip -dc | wc -c)
            post_size = $(<"${file}" wc -c)
            percentage = '%0.1f%%' % (100 * (1 - post_size / pre_size))
            p = '% 7s % 11d % 11d %s' % (percentage, pre_size, post_size, repo)
            print(p)
EOF
        )"
        python3 -c "${script}"
    done
}
```

</details>

Closes #750.

Test Plan:
Comparing the raw old version with the decompressed new version shows
that they are identical:

```
$ <~/tmp/sourcecred/data/sourcecred/example-github/github/view.json \
> shasum -a 256 -
63853b9d3f918274aafacf5198787e18185a61b9c95faf640a1e61f5d11fa19f  -
$ <~/tmp/sourcecred/data/sourcecred/example-github/github/view.json.gz \
> gzip -dc | shasum -a 256
63853b9d3f918274aafacf5198787e18185a61b9c95faf640a1e61f5d11fa19f  -
```

Additionally, `yarn test --full` passes, and `yarn start` still loads
data and runs PageRank properly.

wchargin-branch: gzip-relational-view
2018-09-01 10:42:30 -07:00
Dandelion Mané d8a16a4def
Better handling of log weights (#736)
This commit isolates all of the log-weight behavior in the weight
slider. That slider moves in log space, but the numbers printed and
passed around the WeightConfig code are now always in linear-space.

This should reduce confusion in the UI and for developers.

This commit contains two other improvements: (#588)
- Changes the (log space) range on the sliders from ±10 to ±5
- Change the order from slider, weight, name to name, slider, weight, so
that there is more visual separation between the name and the weight.

Test plan: Changes to the weight slider are tested. Changes to the
WeightConfig aren't (#604) so I manually tested the UI.
2018-08-30 19:21:59 -07:00
Dandelion Mané 9e78f26d0a
Separate bots and users in the UI (#720)
Fixes #696.

Test plan: This is basically a config change, so I manually tested it.
I ran SourceCred on gitcoinco/web, which has two bots,
and verified that the bots are correctly removed from the list of users.
Selecting "Bots" in the dropdown filter shows the two bots. Changing
the user weight does not affect the bots' scores, and changing the bot
weight does affect the bots' scores.
2018-08-29 15:14:42 -07:00
William Chargin d4202b2304
Add a configurable feedback URL to prototype (#715)
Summary:
We can now set, at build time, a URL to be displayed at the top of the
prototype, encouraging users to provide feedback. If the URL is not
provided, it defaults to the appropriate topic on the SourceCred
Discourse instance.

The result looks like this:

![Screenshot of the feedback URL in the prototype][screenshot]

[screenshot]: https://user-images.githubusercontent.com/4317806/44814824-a238b380-ab92-11e8-88c8-dfbae27ca496.png

Test Plan:
Unit tests added to `yarn sharness-full` and `yarn unit`.

You can run `yarn start` to see the message with the default URL, or
`SOURCECRED_FEEDBACK_URL=http://example.com/ yarn start` to specify a
custom URL.

wchargin-branch: feedback-url
2018-08-29 15:06:12 -07:00
William Chargin 761b5a0875
Allow combining repositories at load time (#711)
Summary:
As a first pass toward support for analyzing whole organizations, we
allow loading multiple repositories with `sourcecred load`, combining
them into a single relational view and a single Git graph at load time.

Test Plan:
Run

```
node bin/sourcecred.js \
    load \
    sourcecred/example-git \
    sourcecred/example-github \
    sourcecred/sourcecred \
    --output sourcecred/examples \
    ;
```

and select `sourcecred/examples` from the web view. Filter “Repository”
nodes, and note that there are three.

Note that loading a single repository without `--output` still works,
that loading a single repository with `--output` still works (respecting
the alias name), and loading not exactly one repository without
`--output` yields an appropriate error message.
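
Those rules for `--output` can be summarized in a small sketch. The `resolveOutput` function and its error text are hypothetical and do not reproduce the real CLI code.

```javascript
// Hypothetical validation mirroring the rules in this test plan.
function resolveOutput(repos, output) {
  if (output != null) {
    return output; // an explicit alias is always respected
  }
  if (repos.length !== 1) {
    throw new Error(
      "--output is required unless exactly one repository is given"
    );
  }
  return repos[0]; // a single repository is stored under its own name
}

console.log(resolveOutput(["sourcecred/example-git"]));
console.log(
  resolveOutput(
    ["sourcecred/example-git", "sourcecred/example-github"],
    "sourcecred/examples"
  )
);
```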

Note that `yarn sharness-full` still works.

wchargin-branch: load-combined
2018-08-29 14:52:26 -07:00
Dandelion Mané a5c909689a
Users have 1000 cred in aggregate (#709)
This commit changes the cred normalization algorithm so that the total
cred of all GitHub user nodes always sums to 1000. For rationale on the
change, see #705.

Fixes #705.

Note that this introduces a new way for PageRank to fail: if the
graph has no GitHub userlike nodes, then PageRank will throw an error
when it attempts to normalize. This will result in a message being
displayed to the user, and a more helpful error being printed to
console. If we need the cred explorer to display graphs that have no
userlike nodes, then we can modify the codepath so that it falls back to
normalizing based on all nodes instead of on the GitHub userlike nodes
specifically.
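
As a sketch of this normalization and its new failure mode, consider the following. The `normalizeToUsers` function, the `Map` of scores, and the string node names are assumptions for illustration only.

```javascript
// Sketch: rescale all scores so that userlike nodes sum to a fixed total.
// Function and parameter names are hypothetical.
function normalizeToUsers(scores, isUserlike, total = 1000) {
  let userSum = 0;
  for (const [node, score] of scores) {
    if (isUserlike(node)) userSum += score;
  }
  if (userSum === 0) {
    // The new failure mode: nothing to normalize against.
    throw new Error("cannot normalize: no userlike nodes with score");
  }
  const factor = total / userSum;
  return new Map([...scores].map(([node, score]) => [node, score * factor]));
}

const raw = new Map([
  ["user/a", 0.1],
  ["user/b", 0.3],
  ["repo/x", 0.6],
]);
const normalized = normalizeToUsers(raw, (n) => n.startsWith("user/"));
console.log([...normalized]);
```

Because only the user nodes define the scale, the repository node's score can exceed 1000 without affecting how users compare to one another.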

Test plan: There is an included unit test which verifies that the
new argument gets threaded through the state properly. But this is
mostly a config change, so it's best tested by actually inspecting
the cred explorer. I have done so, and can verify that the behavior is
as expected: users' cred now sums to 1000, and e.g. modifying
the weight on the repository node doesn't produce drastic changes to
cred scores.
2018-08-29 12:20:57 -07:00
Dandelion Mané 3e77f486f2
Stop persisting users' weight choices (#706)
Storing the user's weights in localStore enables a workflow where a
user chooses their preferred weights, and brings those weights with them
across projects and contexts. However, this is the wrong workflow:
actually, a project chooses its weights, and when a user visits a
particular project, they want to sync up with the project's choice.
Giving the user the ability to modify the weights and recalculate is
still important, so that they can propose improvements to the project
maintainer. But implicitly keeping their modified weights, and even
bringing them to other projects the user inspects, is
counter-productive.

This commit removes this dubious feature. (It's a feature we were likely
to drop anyway, as it conflicts with #703.) As an added bonus, this code
is untested, which means the feature is technical debt—so removing it
reduces our technical debt! It also removes at least one known bug.

Test plan: There are no tests. I manually verified that the frontend
still works, and that it no longer persists weights across refresh.
2018-08-29 11:46:48 -07:00
William Chargin 0c2908dbfb
Retry GitHub queries with exponential backoff (#699)
Summary:
This patch adds independent exponential backoff to each individual
GitHub GraphQL query. We remove the fixed `GITHUB_DELAY_MS` delay before
each query in favor of this solution, which requires no additional
configuration (thus resolving a TODO in the process).

We use the NPM module `retry` with its default settings: namely, a
maximum of 10 retries with factor-2 backoff starting at 1000ms.
Empirically, it seems very unlikely that we should require much more
than 2 retries for a query. (See Test Plan for more details.)
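
For readers unfamiliar with the technique, the following is a hand-rolled sketch of exponential backoff using the settings named above (up to 10 retries, factor 2, first delay 1000ms). It does not use the `retry` module and is not the code in this patch; `retryWithBackoff`, `backoffDelay`, and the injectable `sleep` option are names invented for the example.

```javascript
// Delay before retry number `attempt` (0-based): minTimeout * factor^attempt.
function backoffDelay(attempt, {factor = 2, minTimeout = 1000} = {}) {
  return minTimeout * Math.pow(factor, attempt);
}

const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Run `task` until it succeeds, waiting exponentially longer between
// attempts. After `retries` failed retries, rethrow the last error.
// `sleep` is injectable so that the schedule can be tested without waiting.
async function retryWithBackoff(task, options = {}) {
  const {retries = 10, factor = 2, minTimeout = 1000} = options;
  const sleep = options.sleep || defaultSleep;
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await task(attempt);
    } catch (e) {
      lastError = e;
      if (attempt < retries) {
        await sleep(backoffDelay(attempt, {factor, minTimeout}));
      }
    }
  }
  throw lastError;
}
```

With these settings, a query that fails twice waits 1000ms and then 2000ms before its third attempt. Because each query carries its own attempt counter, one slow query does not lengthen the delays of the others.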

This is both a short-term unblocker and a good kind of thing to have in
the long term.

Test Plan:
Note that `yarn test --full` passes, including `fetchGithubRepoTest.sh`.
Consider manual testing as follows.

Add `console.info` statements in `retryGithubFetch`, then load a large
repository like TensorFlow, and observe the output:

```shell
$ node bin/sourcecred.js load --plugin github tensorflow/tensorflow 2>&1 | ts -s '%.s'
0.252566 Fetching repo...
0.258422 Trying...
5.203014 Trying...
[snip]
1244.521197 Trying...
1254.848044 Will retry (n=1)...
1260.893334 Trying...
1271.547368 Trying...
1282.094735 Will retry (n=1)...
1283.349192 Will retry (n=2)...
1289.188728 Trying...
[snip]
1741.026869 Ensuring no more pages...
1742.139978 Creating view...
1752.023697 Stringifying...
1754.697116 Writing...
1754.697772 Done.
```

This took just under half an hour, with 264 queries total, of which:
  - 225 queries required 0 retries;
  - 38 queries required exactly 1 retry;
  - 1 query required exactly 2 retries; and
  - 0 queries required 3 or more retries.

wchargin-branch: github-backoff
2018-08-22 11:37:29 -07:00
Dandelion Mané 2d28bd5de4
Re-introduce a simplified git plugin (#685)
This commit re-introduces the git plugin, now that it has been radically
simplified as described in [1]. The new git plugin only has nodes for
commits and only has commit has-parent edges. As compared to the version
that was removed in #628, this plugin is far leaner. It doesn't bloat
the graph (for `sourcecred/sourcecred`, the git plugin data is just
164k), and as such doesn't incur much performance penalty.
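
To make the shape of the simplified data concrete, here is a minimal sketch of a graph with only commit nodes and has-parent edges. The input shape, the `buildCommitGraph` name, and the `HAS_PARENT` label are assumptions for illustration; the plugin's real data structures may differ.

```javascript
// Hypothetical sketch: build a graph whose only nodes are commits and
// whose only edges point from a commit to each of its parents.
// `commits` is a list of {hash, parents}, where `parents` is a list of hashes.
function buildCommitGraph(commits) {
  const nodes = commits.map((commit) => commit.hash);
  const edges = [];
  for (const commit of commits) {
    for (const parent of commit.parents) {
      edges.push({type: "HAS_PARENT", src: commit.hash, dst: parent});
    }
  }
  return {nodes, edges};
}

// A root commit, a child, and a merge of both.
const graph = buildCommitGraph([
  {hash: "c1", parents: []},
  {hash: "c2", parents: ["c1"]},
  {hash: "c3", parents: ["c2", "c1"]},
]);
```

The size of such a graph grows only with the number of commits and parent links, which is consistent with the small footprint reported above.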

Re-incorporating the git plugin also brings some tangible benefits. We
already had git nodes in the graph, as the GitHub plugin attaches them
to pull requests. Without any git plugin, these nodes are displayed as
"unknown nodes" with ugly descriptions. Also, including a git plugin,
even one that is very minimal, communicates to users that git is a
source of information to SourceCred, and that they can expect more from
it in the future.

Note that this commit breaks backcompat for existing repositories that
were locally loaded after #628. As such, it is best to
`rm -rf $SOURCECRED_DIRECTORY` and start with fresh data. Also, due to a
known bug in the WeightConfig, you should reset your browser's local
storage.

Test plan: After removing the SourceCred directory and the stale
localStorage, the cred explorer nicely displays git commits, and
connects them via has_parent edges. The NodeType filter allows filtering
to commits as expected, and the WeightConfig shows node and edge weights
for the Git plugin's nodes and edges.

[1]: https://github.com/sourcecred/sourcecred/issues/627#issuecomment-413435447
2018-08-16 13:20:41 -07:00