With some frequency we find ourselves needing to maintain maps whose
values are arrays that we append to. `MapUtil.pushValue` is a utility
method for these cases.
Existing usage in `aggregate.js` has been modified to use the new
function.
Test plan: Unit tests included.
Summary:
Per #800, each test file should start with a `describe` block listing
its file path under `src`. Currently, nine of our tests do not do so.
Of these, eight had a top-level describe block with the wrong name
(either not a filepath or an outdated filepath), while only one short
test was missing a top-level describe block altogether. This patch fixes
each file to use the correct format.
Test Plan:
Apply the Sharness test in #802, and note that it fails before this
patch but passes after it.
wchargin-branch: describe-fix
Previously, the WeightConfig (and the button that expanded it) were in
the credExplorer App. This was a little weird, as there's no reason to
play with the weights before you have some Pagerank results to
investigate; additionally, it risked confusing new users with a concept
that was not yet applicable.
Also, the implementation was wonky: the WeightConfig had responsibility
for expanding/hiding itself, which gave poor ability to position the
button and the WeightConfig separately.
Finally, the codepath was untested (vestiges of #604).
This commit fixes all three issues:
- The WeightConfig and button have moved into PagerankTable
- The WeightConfig is now a stateless component, and the parent takes
responsibility for deciding when to mount it
- Logic for showing/hiding the WeightConfig is now tested.
This commit implements a [suggestion] to make `credExplorer/App` a
single source of truth on the `WeightedTypes`. As such, both
`WeightConfig` and `PluginWeightConfig` have been refactored to be
(essentially) stateless components that are controlled from above. I say
essentially because `WeightConfig` still has its expanded state, but
that will go away soon.
Along the way, I've improved testing and added some new invariant
checking (e.g. that `PluginWeightConfig` was passed the correct set of
weights for its adapter). For the first time, there are now tests for
the `WeightConfig` itself! I'm not totally done with the weight
re-write, but this seems like a good time to close#604, as the whole
logical sequence for setting weights is now tested.
Test plan: There are new unit tests. Also, to be sure, I manually tested
the UI as well.
[suggestion]: https://github.com/sourcecred/sourcecred/pull/792#issuecomment-419234721
This commit refactors `credExplorer/App` so that instead of storing an
`EdgeEvaulator` in its state, it stores `WeightedTypes` instead. This
has a few benefits:
- It's trivial to generate the right default value for `WeightedTypes`,
so we no longer allow the variable to be nullable in the state. This
simplifies logic, removes an error case, and means that we don't require
the `WeightConfig` to mount before the app is usable.
- `WeightedTypes` are serializable and can be tested for equality, so
they are a better-behaved piece of state
- We put off the information-destroying transformation as long as
possible
- In the future, I think we may want to move the weights/types concept
into core, at which point the `WeightedTypes` will directly be consumed
by the `core/attribution` module.
Test plan: Unit tests are pretty thorough; to be safe, I tested the UI
myself.
This refactors PluginWeightConfig so that it uses the
`defaultWeightsForAdapter` method introduced in #787.
The refactor is mildly invasive, as we switch the state from being a
(mutable) `WeightedTypes` to having a (regular, read-only)
`WeightedTypes`. I think this is an improvement in consistency.
Test plan: Trivial refactor; unit tests+flow pass.
This commit creates a central `weights` module that defines all of the
weight-related types, and provides some utilities for dealing with them.
This way users of weight-concepts do not need to depend on a lot of
random modules just to get the relevant types. The utility methods are
implicitly defined a few places in the codebase: now we can avoid
re-writing them, and test them more thoroughly.
Test plan: Unit tests pass.
Currently, the `credExplorer` uses the `defaultStaticAdapters`, but it
imports these adapters in multiple places. If we decide to make the
adapters configurable (e.g. when we start supporting more plugins) this
will be a problem.
This change modifies the cred explorer so that the adapters always come
from a prop declaration on the app. Then the adapters are passed into
the `state` module's functional entry points, rather than letting
`state` depend on the default adapters directly.
This change is motivated by the fact that my WeightConfig cleanup can be
done more cleanly if the adapters are present as a prop on the App.
Test plan: Unit tests are updated. Also, `git grep
defaultStaticAdapters` reaveals that the adapters are only consumed
once.
This commit adds `weightsToEdgeEvaluator`, a function for converting
weighted node types into an `EdgeEvaluator`. This replaces the
`edgeWeights` module (which was untested, and an outmoded API).
Test plan: The new `weightsToEdgeEvaluator` method is well-tested.
Since `WeightConfig` is still not tested, I manually verified that it
still works as anticipated.
Summary:
Lots of tests need the output of `yarn backend`. Before this commit,
they tended to create it themselves. This was slow and wasteful, and
also could in principle have race conditions (though in practice usually
tended not to).
This commit updates tests to respect a `SOURCECRED_BIN` environment
variable indicating the path to an existing directory of backend
applications.
Closes#765.
Test Plan:
Running `yarn test --full` passes.
Prepending `echo run >>/tmp/log &&` to the `backend` script in
`package.json` and running `yarn test --full` results in a log file
containing only one line, indicating that the script really is run only
once.
wchargin-branch: deduplicate-backend
This is convenient for testing other code, where we may want to directly
use the fallback types. One test has been updated in this way.
I also changed the names for the fallback adapter's edges to be somewhat
more readable.
Test plan: Tests improved.
This commit adds PluginWeightConfig, which is responsible for
adding all the weights for an individual plugin. The top-level
WeightConfig now creates multiple PluginWeightConfigs. It also takes
responsibility for hiding the FallbackPlugin.
Test plan: The PluginWeightConfig is tested (and fairly simple). The
top-level WeightConfig is not yet tested (#604), so I manually tested
that the weights in the app still function.
`testUtil.configureEnzyme` now additionally asserts, after every test,
that `console.error` and `console.warn` were not called. Tests that
explicitly expect such calls can still be written by manually re-mocking
the relevant console method (and several examples already exist).
The code that explicitly specifies this for various enzyme test files
has been removed.
Test plan: `git grep "not.toHaveBeenCalled"` shows only unrelated usage.
`yarn test` passes. Adding a spurious console.warn to a passing test
causes it to fail.
Fixes#668
Summary:
This simplifies interfaces everywhere.
See also #216, which did the opposite of this as a temporary fix due to
a Babel/Webpack interaction that no longer exists as of #766.
Test Plan:
Note that `node bin/sourcecred.js load sourcecred/example-git` still
works (after `yarn backend`). Note that `yarn test` still works. These
demonstrate that the module works from both a Webpack context and a Node
context. Note that `git grep --name-only execDependencyGraph` yields
exactly those files touched in this commit. Note that `yarn test --full`
passes.
wchargin-branch: commonjs-execDependencyGraph
Currently, it's possible to lock the cred explorer UI via the following
sequence of actions:
1. Press the analyze cred button
2. While it is running, modify the weights
After following this repro, the UI is stuck: it will say "loading"
forever, and the analyze cred button is disabled.
The issue is that loadGraph and runPagerank do not apply their success
(or failure) state transitions if they are pre-empted by another state
change. If a repo change occurs, that's the right behavior: the repo
change puts the state back to `"READY_TO_LOAD_GRAPH"`, which means the
UI is ready to re-load, and showing the PageRank results for the wrong
repo would be very confusing.
However, if an edge evaluator change occurs while loadGraph or
runPagerank is happening, the state is left in the "LOADING" status
(which means the analyze cred button is disabled).
This commit fixes the issue via a refactor: per @wchargin's suggestion,
responsibility for the edge evaluator moves from the state module out to
`credExplorer/App.js`. This dramatically simplifies the state module, as
we no longer need a `Substate` concept: we can simplify the state into a
single sequence of states.
As of the refactor, the bug is impossible.
Test plan: Unit tests have been updated to maintain coverage on the
refactored code. I manually tested that the bug no longer repro's, and
that the cred explorer UI continues to function. I did not add a new
test to protect against regression, because in the new codepath, the bug
is basically impossible.
This commit modifies the edge type so that it has a
`defaultForwardWeight` and `defaultBackwardWeight`, and these defaults
are respected by the `WeightConfig`.
I came up with reasonable-seeming defaults for the Git and GitHub
plugins; these will undoubtably be more methodically tuned in the
future.
Test plan: `yarn test` passes, also opening the cred explorer now has
the specified default weights in the WeightConfig. (Note that the
forward/backward directions are reversed as described in #749.)
This commit introduces a new component, `EdgeTypeConfig`, which is
responsible for configuring the weights for a given edge type. The
config creates two `WeightSlider`s: one for the forward direction, and
one for the backward direction. The `DirectionalitySlider` is no longer
used, and is removed. This fixes#596.
So as to avoid confusion, we now describe every edge with variables, as
in 'α REFERENCES β', and clarify that the weight modifies how cred flows
from β to α. This necessitated the creation of an `EdgeWeightSlider`,
local to the `EdgeTypeConfig`, which sets up a `WeightSlider` with the
necessary greek characters.
The EdgeTypeConfig is tested, so this is continuing progress towards
solving #604.
Test plan: I manually verified that modifying edge weights has the
expected effect on cred scores. Also, some new unit tests are included.
This factors `NodeTypeConfig` out of the `WeightConfig` component. The
scope for a `NodeTypeConfig` is that it configures a single node type.
Right now it just renders a single `WeightSlider`, but I like factoring
out both for consistency with the `EdgeTypeConfig` (see #749) and
because I expect we may want to add more complexity later.
Test plan: The new component has some tests, also I manually tested the
frontend.
* StateTransitionMachine.loadGraph reports success
Step one towards #586. This will enable us to chain runPagerank after
loadGraph only if the load went through successfully.
Test plan: Unit tests included.
* Add StateTransitionMachine.loadGraphAndRunPagerank
This methods combines `loadGraph` and `runPagerank` into one method
which internally chains the two method. `runPagerank` is only called if
`loadGraph` was successful.
Progress on #586.
Test plan:
The new method has attached unit tests. I implemented the unit tests via
mocking, which seemed quite convenient as the method is basically a
wrapper for chaining two other function calls.
* Combine loadGraph and runPagerank into one button
Resolves#586. The new button is called "Analyze cred".
Test plan: Unit tests, also I tested it manually.
* StateTransitionMachine.loadGraph reports success
Step one towards #586. This will enable us to chain runPagerank after
loadGraph only if the load went through successfully.
Test plan: Unit tests included.
* Add StateTransitionMachine.loadGraphAndRunPagerank
This methods combines `loadGraph` and `runPagerank` into one method
which internally chains the two method. `runPagerank` is only called if
`loadGraph` was successful.
Progress on #586.
Test plan:
The new method has attached unit tests. I implemented the unit tests via
mocking, which seemed quite convenient as the method is basically a
wrapper for chaining two other function calls.
Step one towards #586. This will enable us to chain runPagerank after
loadGraph only if the load went through successfully.
Test plan: Unit tests included.
Fixes#732; see that issue for context.
Test plan:
The success case still works (verified that loading
sourcecred/sourcecred works).
I haven't tested the error case, as getting a real RATE_LIMIT_EXCEEDED
form GitHub is time-consuming, and has only happened once in practice.
I'm pretty confident the code works because it's a simple adaptation of
the code that catches other cases.
Summary:
This commit changes the CLI to use the code in `cli` instead of `oclif`.
A subsequent commit will remove the dependency on OClif altogether.
Resolves#580.
Test Plan:
Note that `yarn backend; node bin/sourcecred.js help` works. Note that
the documentation in the README is still correct.
wchargin-branch: cli-replace-oclif
Summary:
This ports the OClif version of `sourcecred load` to the sane CLI
system. The functionality is similar, but the interface has been
changed a bit (mostly simplifications):
- The `SOURCECRED_GITHUB_TOKEN` can only be set by an environment
variable, not by a command-line argument. This is standard practice
because it is more secure: (a) other users on the same system can
see the full command line arguments, but not the environment
variables, and (b) it’s easier to accidentally leak a command line
(e.g., in CI) than a full environment.
- The `SOURCECRED_DIRECTORY` can only be set by an environment
variable, not by a command-line argument. This is mostly just to
simplify the interface, and also because we don’t really have a good
name for the argument: we had previously used `-d`, which is
unclear, but `--sourcecred-directory` is a bit redundant, while
`--directory` is vague and `--sourcecred-directory` is redundant.
This is an easy way out, but we can put the flag for this back in if
it becomes a problem.
- The `--max-old-space-size` argument has been removed in favor of a
fixed value. It’s unlikely that users should need to change it.
If we’re blowing an 8GB heap, we should try to not do that instead
of increasing the heap.
- Loading zero repositories, but specifying an output directory, is
now valid. This is the right thing to do, but OClif got in our way
in the previous implementation.
Test Plan:
Unit tests added, with full coverage; run `yarn unit`.
To try it out, run `yarn backend`, then `node bin/cli.js load --help` to
get started.
I also manually tested that the following invocations work (i.e., they
complete successfully, and `yarn start` shows good data):
- `load sourcecred/sourcecred`
- `load sourcecred/example-git{,hub} --output sourcecred/examples`
These work even when invoked from a different directory.
wchargin-branch: cli-load
Summary:
This includes environment variables to specify the SourceCred directory
and the GitHub token. Parts of this may change once #638 is resolved.
Test Plan:
Unit tests included, with full coverage; run `yarn unit`.
wchargin-branch: cli-common
Summary:
This commit includes a minimal usage of an actual CLI application. It
provides the `help` command and no actual functionality.
Test Plan:
Unit tests added, with full coverage. To see it in action, first run
`yarn backend`, then run `node bin/cli.js help`.
wchargin-branch: cli-beginnings
Summary:
This commit introduces the notion of a `Command`, which is simply a
function that takes command-line arguments and interacts with the real
world. This infrastructure will enable us to write a well-tested CLI.
The `Command` interface is asynchronous because commands like `load`
need to block on promise resolution (for loading GitHub and Git data).
This is annoying for testing, but does not actually appear to be a
problem in practice.
Test Plan:
Unit tests added. See later commits for real-world usage.
wchargin-branch: cli-command-infrastructure
Summary:
Per #580, we aim to remove OClif. To do so, we move the old system to a
directory `oclif`, and will create the new system in the now-vacant
`cli` directory.
Test Plan:
Note that `yarn backend` still builds, that `node bin/sourcecred.js`
still has `help` and `load`, and that `git grep -wc cli` yields only
`yarn.lock:9`.
wchargin-branch: rename-cli-to-oclif
Our serialized RelationalView can get quite large - in the case of
TensorFlow it's over 190MB. This is a problem, as GitHub pages have a
hard cap of 100MB on hosted files.
As a temporary workaround, this commit introduces a method,
`compressByRemovingBody`, which strips away the bodies of every post. In
the longer term, we'll need a solution that scales with larger
repositories, e.g. sharding the relational view into smaller pieces.
Test plan: Unit tests were added. I've manually confirmed that the
newly-generated views are smaller (2.1MB vs 3.3MB), and that the
frontend continues to function.
Summary:
We store the relational view in `view.json.gz` instead of `view.json`,
taking advantage of the isomorphic `pako` library for gzip encoding and
decoding.
Sample space savings (note that post bodies are included; i.e., #747 has
not been applied):
SAVE OLD (B) NEW (B) REPO
89.7% 25326 2617 sourcecred/example-github
82.9% 3257576 555948 sourcecred/sourcecred
85.2% 11287621 1665884 ipfs/js-ipfs
88.0% 20953425 2520358 gitcoinco/web
84.4% 38196825 5951459 ipfs/go-ipfs
84.9% 205770642 31101452 tensorflow/tensorflow
<details>
<summary>Script to generate space savings output</summary>
```shell
savings() {
printf '% 7s % 11s % 11s %s\n' 'SAVE' 'OLD (B)' 'NEW (B)' 'REPO'
for repo; do
file="${SOURCECRED_DIRECTORY}/data/${repo}/github/view.json.gz"
if ! [ -f "${file}" ]; then
printf >&2 'warn: no such file %s\n' "${file}"
continue
fi
script="$(sed -e 's/^ *//' <<EOF
repo = '${repo}'
pre_size = $(<"${file}" gzip -dc | wc -c)
post_size = $(<"${file}" wc -c)
percentage = '%0.1f%%' % (100 * (1 - post_size / pre_size))
p = '% 7s % 11d % 11d %s' % (percentage, pre_size, post_size, repo)
print(p)
EOF
)"
python3 -c "${script}"
done
}
```
</details>
Closes#750.
Test Plan:
Comparing the raw old version with the decompressed new version shows
that they are identical:
```
$ <~/tmp/sourcecred/data/sourcecred/example-github/github/view.json \
> shasum -a 256 -
63853b9d3f918274aafacf5198787e18185a61b9c95faf640a1e61f5d11fa19f -
$ <~/tmp/sourcecred/data/sourcecred/example-github/github/view.json.gz \
> gzip -dc | shasum -a 256
63853b9d3f918274aafacf5198787e18185a61b9c95faf640a1e61f5d11fa19f -
```
Additionally, `yarn test --full` passes, and `yarn start` still loads
data and runs PageRank properly.
wchargin-branch: gzip-relational-view
The GitHub regex in urlIdParse.js incorrectly disallowed repo names with
underscores and dots. Fixes#721.
To mitigate errors like this in the future, code which uses regexes to
find owners and repos has been modified to all depend on the same regex
pattern.
Test plan:
Unit tests have been updated to include the failure case (they correctly
failed), and then code was updated so that the tests pass again.
Also, I manually verified that loading ipfs/js.ipfs.io no longer fails.
Paired with @wchargin
This commit isolates all of the log-weight behavior in the weight
slider. That slider moves in log space, but the numbers printed and
passed around the WeightConfig code are now always in linear-space.
This should reduce confusion in the UI and for developers.
This commit contains two other improvements: (#588)
- Changes the (log space) range on the sliders from ±10 to ±5
- Change the order from slider, weight, name to name, slider, weight, so
that there is more visual separation between the name and the weight.
Test plan: Changes to the weight slider are tested. Changes to the
WeightConfig aren't (#604) so I manually tested the UI.
PluginAdapters and Node/Edge types are increasingly fundamental to the
cred explorer. Prior to this commit, we had no canonical demo
adapters/types, and we would create ad-hoc and messy adapters whenever
we needed them. This creates unnecessary repetition and lowers test
quality.
This commit creates a canonical demo adapter (loosely themed based on
the wonderful game [Factorio]) and refactors most existing test cases to
use the demo adapters. In particular, the horrible mess of pagerankTable
adapters has been removed.
[Factorio]: https://www.factorio.com/
I left `aggregate.test.js` untouched because I would have needed to
materially re-write the tests to port them over. I added a comment so
that if we ever do re-write those tests, we'll use the new demo
adapters.
Test plan: `yarn test` passes.
This commit factors the weight sliders used for both node and edge
weights into a shared WeightSlider component, and factors out the
direction slider used for edge weights into a DirectionalitySlider.
Both of these components are tested. This is a step towards #604.
Test plan:
The specific behaviors of the sliders are well tested. Since the weight
config as a whole is not tested, I manually verified by messing with the
weights that node weights, edge weights, and edge directionality all
affects the cred distribution as anticipated.
Summary:
We currently load trees and then throw them away later, because we don’t
get useful signal from them. We should consider not doing that. This
will be faster.
Test Plan:
```
$ time node bin/sourcecred.js load tensorflow/tensorflow --plugin git
real 0m33.512s
user 0m35.196s
sys 0m12.489s
```
Also, `yarn test --full` passes.
wchargin-branch: git-deforestation
Adds a link titled "what is this?" that points to my gentle introduction
to cred. Also, move the feedback link to be next to it and get rid of
the prototype disclaimer.
Test plan: Visual inspection, also a test was updated.
Summary:
This fixes a bug where, if the `SOURCECRED_DIRECTORY` environment
variable is set to `foo` but the `-d bar` flag is passed, then the
repository registry will be written under `foo` but the plugin data will
be loaded under `bar`.
Test Plan:
```
$ rm -rf /tmp/good /tmp/bad
$ SOURCECRED_DIRECTORY=/tmp/bad >/dev/null \
> node bin/sourcecred.js load sourcecred/example-github -d /tmp/good
$ [ -d /tmp/bad ]; echo $?
$ find /tmp/good
/tmp/good
/tmp/good/cache
/tmp/good/cache/sourcecred
/tmp/good/cache/sourcecred/example-github
/tmp/good/cache/sourcecred/example-github/github
/tmp/good/cache/sourcecred/example-github/git
/tmp/good/repositoryRegistry.json
/tmp/good/data
/tmp/good/data/sourcecred
/tmp/good/data/sourcecred/example-github
/tmp/good/data/sourcecred/example-github/github
/tmp/good/data/sourcecred/example-github/github/view.json
/tmp/good/data/sourcecred/example-github/git
/tmp/good/data/sourcecred/example-github/git/graph.json
```
wchargin-branch: load-pass-context
Fixes#696.
Test plan: This is basically a config change, so I manually tested it.
I ran SourceCred on gitcoinco/web, which has two bots,
and verified that the bots are correctly removed from the list of users.
Selecting "Bots" in the dropdown filter shows the two bots. Changing
the user weight does not affect the bots' scores, and changing the bot
weight does affect the bots' scores.
Summary:
We can now set, at build time, a URL to be displayed at the top of the
prototype, encouraging users to provide feedback. If the URL is not
provided, it defaults to the appropriate topic on the SourceCred
Discourse instance.
The result looks like this:
![Screenshot of the feedback URL in the prototype][screenshot]
[screenshot]: https://user-images.githubusercontent.com/4317806/44814824-a238b380-ab92-11e8-88c8-dfbae27ca496.png
Test Plan:
Unit tests added to `yarn sharness-full` and `yarn unit`.
You can run `yarn start` to see the message with the default URL, or
`SOURCECRED_FEEDBACK_URL=http://example.com/ yarn start` to specify a
custom URL.
wchargin-branch: feedback-url
This commit adds a hardcoded list of known bots. Building on #713, it
categorizes those userlikes with the bot subtype. (Note that those users
may not be bots in the GitHub ontology - GitHub doesn't actually have a
clear record of which userlikes are bots.)
Progress towards #696.
Test plan:
Observe the single snapshot change, which demonstrates that @credbot is
now correctly categorized as a bot.