Commit Graph

1162 Commits

Author SHA1 Message Date
Dandelion Mané 6a40d962fb Mass update of dependencies
This commit deletes and regenerates the `yarn.lock` file, with the
effect that all dependencies are upaded to the latest version allowed by
our package.json. (For example, if we are pinned to `^2.4.0`, we might
now get `2.6.3`.)

This commit was generated via:
```Bash
$ rm yarn.lock
$ yarn
```

Test plan: `yarn test --full` passes, and I also manually tested the
frontend.
2019-07-05 17:33:28 +01:00
Dandelion Mané eadcca8999 Upgrade flow to to 0.102.0
This necessitated a number of type fixes:
- Upgraded the express flow-typed file to latest
- Added manual flow error suppression to where the express flow-typed
file is still using a deprecated utility type
- Removed type polymorphism support on map.merge (see context here[1]).
We weren't using the polymorphism anywhere so I figured it was simplest
to just remove it.
- Improve typing around jest mocks throughout the codebase.

Test plan: `yarn test --full` passes.

[1]: https://github.com/flow-typed/flow-typed/issues/2991
2019-07-05 17:21:56 +01:00
Dandelion Mané 230756ffec Upgrade jest from v23 to v24
Just some general housekeeping. `yarn test --full` passes without issue.
2019-07-04 18:57:50 +01:00
Dandelion Mané 6a13248b09 Upgrade prettier
This commit updates our prettier version from `1.13` to `1.18`. Looks
like software does get better over time! I like all of the changes.

Test plan: `yarn test` passes. I've manually inspected the diffs.
2019-07-04 20:33:42 +03:00
Dandelion Mané 29c9229c28 Update better-sqlite3 to v5
When we took a dep on better-sqlite3 in #836, we used a fork, because
better-sqlite3 did not yet support private in-memory databases via the
`:memory:` filepath. As of better-sqlite3 v5, this has been added to
mainline, so we no longer need the fork.

The v4->v5 transition involves some breaking changes. The only ones that
affected us were two field renames, from `lastUpdateROWID` to
`lastUpdateRowid`, and `returnsData` to `reader`.

Test plan:
After updating the field accesses, `yarn test --full` passes. For added
safety, I also blew away cache, loaded a nontrivial repository, and
verified that the full cred workflow still works.

cc @wchargin
2019-07-04 20:31:32 +03:00
William Chargin 99b2b12a14 tools: make `update_snapshots.sh` reentrant
Summary:
General cleanup to `update_snapshots.sh`, primarily such that it is free
of race conditions and does not clobber your build output directory. The
mechanism is the same as used in `yarn test`, via `SOURCECRED_BIN`.

Test Plan:
Remove your build directories (`rm -r ./bin ./build`), then run this
script (`./scripts/update_snapshots.sh`) and check that each subprocess
is properly invoked and all tests pass. Check that the temporary build
directory is cleaned up.

wchargin-branch: update-snapshots-cleanup
2019-07-04 13:45:10 +03:00
dependabot[bot] c3f2f4d963 Bump diff from 3.4.0 to 3.5.0
Bumps [diff](https://github.com/kpdecker/jsdiff) from 3.4.0 to 3.5.0.
- [Release notes](https://github.com/kpdecker/jsdiff/releases)
- [Changelog](https://github.com/kpdecker/jsdiff/blob/master/release-notes.md)
- [Commits](https://github.com/kpdecker/jsdiff/compare/v3.4.0...v3.5.0)

Signed-off-by: dependabot[bot] <support@github.com>
2019-07-04 13:44:45 +03:00
Dandelion Mané e47c9b3aba graph: add node and edge timestamps
This commit updates the Graph class so that both nodes and edges have
timestmaps. This is a big step for #862.

Test plan: `yarn test --full` passes.
2019-07-04 13:44:28 +03:00
Dandelion Mané 6c5e8b70d6 graph: add descriptions
This updates the graph `Node` type to include a string description.

The description should be a brief (ideally oneline) string giving
context on what the node is. All planned frontends will support
markdown, so linking to context (e.g. linking to the issue corresponding
to an ISSUE type node) is supported.

This commit updates the Git and GitHub plugins to use the new
description field.

Test plan: `yarn test --full` passes, and I've inspected snapshots and
made sure they look reasonable.
2019-07-04 13:44:28 +03:00
Dandelion Mané e7add05df5 Plugins create dangling edges
The GitHub plugin no longer adds a Node to the graph for Git commits.
Instead, it creates a dangling edge to it. This frees the GitHub plugin
from responsibility for setting the timestamp or other metadata for Git
nodes.

The Git plugin no longer adds a Commit Node to the graph immediately when
encountering a commit's parent hash. Instead, it creates an edge to the
parent, and then fills in the parent node once it is encountered in the
commit store.

Test plan: Load a real repository with merged pull requests
(e.g. sourcecred/research) into the explorer, and verify that GitHub
commit entities are still connected to Git commits, and that Git
commits are still connected to their parents.
2019-07-03 15:19:11 +03:00
Dandelion Mané 02a8e02922 graph: add support for dangling edges
This commit modifies the Graph class so that it permits dangling edges;
that is to say, edges whose src or dst are not present in the graph.
Dangling edges may be directly added to the graph, or existing edges may
become dangling if their src or dst is removed.

This change is prerequisite to #1136; if we require that nodes have
metadata, we should also make it possible to add edges to nodes that
don't yet exist, as the plugin creating an edge may not have access to
the full metadata needed to add the node.

To support this change, there is now an `isDanglingEdge` method on the
graph, which reports whether or not the edge is dangling. Also,
`Graph.edges` requires that the client make an explicit choice on
whether dangling edges are desired. This ensures that we do not
accidentally include dangling edges in a case where they are
inappropriate (e.g. creating a Markov chain) or accidentally discard
dangling edges when they are needed (e.g. when merging or serializing).

The Graph's invariant checker has been updated to reflect the new
semantics.

The Graph compat version has been bumped, since this is a break in
backwards compatibility.

Note that this commit does not change the behavior of any plugins; that
is to say, no plugins create dangling edges (yet).

Test plan: The advanced graph test case has been updated to include
dangling edges. The tests for Graph, PagerankGraph, and
GraphToMarkovChain have been updated. `yarn test --full` passes.
2019-07-03 15:19:11 +03:00
Dandelion Mané f934735afc
Graph: reify Object nodes (#1188)
This commit modifies the base `Graph` class so that nodes are now
represented by `Node` objects rather than `NodeAddressT`. The intention
is to start adding additional fields (e.g. description and timestamp) to
nodes, although that is not included in this commit.

See #1136 for rationale.

Test plan: The graph is very well tested, and this commit adds
additional tests and invariant checking. Some additional test code
needed update. `yarn test --full` passes, and the SourceCred UI works as
expected.
2019-06-14 03:22:31 +03:00
Dandelion Mané 7277867cc8 cleanup: PagerankGraph getters return undefined
PagerankGraph's `node` and `edge` getters returned null for unavailable
entries, rather than undefined. This is inconsistent with general JS
behavior, and with the base Graph. I've now cleaned it up.

Test plan: unit tests updated; `yarn test` passes.
2019-06-14 02:46:11 +03:00
Dandelion Mané bcf805b9c8 cleanup: remove test duplication
In a previous commit (#1182) I inadvertently duplicated some tests. They
have now been removed.

Test plan: `yarn test` passes.
2019-06-14 02:46:11 +03:00
Dandelion Mané 414fb9f89f
Add GitHub entity descriptions (#1186)
Every GitHub entity now has a `description` method which returns a short
markdown description. These will be added to the graph as part of #1136.

Test plan: Inspected snapshots, `yarn test --full`.
2019-06-13 23:58:17 +03:00
Dandelion Mané 31abd380dc
Add GitHub reaction timestamps (#1185)
This will allow timeline cred (#862) to do a better job of flowing cred
across reaction edges. (Very old reactions should not be moving a lot of
present-day cred.)

Test plan: Inspected snapshot changes.
2019-06-13 23:43:54 +03:00
Dandelion Mané b03f824e7d
GitHub entities expose `timestampMs` (#1184)
Every GitHub entity from `RelationalView` now has a `timestampMs`
method. This replaces the standalone `createdAt` method.

Test plan: Snapshots look good.
2019-06-13 23:32:22 +03:00
Dandelion Mané 1ec3945cdb
Load commit authored datetimes from GitHub (#1183)
It's an extension of #1152 induced by #1175.

It's a very simple change; I just changed the schema, and
`scripts/update_snapshots.sh` took care of the rest.

Test plan: Inspected snapshots and generated flow types.
2019-06-13 23:27:56 +03:00
Dandelion Mané 4029458098
Factor out distribution modules (#1182)
This pulls distribution related code out of `markovChain.js` into the new
`distribution.js` module, and from `graphToMarkovChain.js` into
`nodeDistribution.js`.

Since the `computeDelta` method is now exported, I've added some unit
tests.

Test plan: `yarn test` passes.
2019-06-13 23:24:37 +03:00
Dandelion Mané e47a5bd84e Remove `createdAt` from the `AnalysisAdapter`
As #1136 will be moving timestamps into the graph, we no longer need
`createdAt` method in the `AnalysisAdapter`. Actually, we no longer need
the adapter/loader distinction introduced in #1157. I haven't taken the
time to remove the `BackendAdapterLoader` concept because a) we may need
it later, and b) if we don't, I'll likely remove the `AnalysisAdapter`
concept entirely, in favor of having plugins directly save `graph.json`
files to a known location.

Test plan: `yarn test` passes.
2019-06-13 23:24:20 +03:00
Dandelion Mané 4b1763ebc6 Discard mentionsAuthorReference
I added `mentionsAuthorReference` based on an untested hypothesis that
they would be useful. With the passage of time, I've never seen any
evidence that they actually improve cred socres (their impact seems
negligible), and they add complexity.

In the future, "go-fishing" style heuristics like this should not merge
unless they are of clearly demonstrated value. Also, it would be better
to add stuff like this via a standalone plugin rather than in the core
GitHub logic.

Undoes #806.

Test plan: `yarn test`
2019-06-13 23:24:20 +03:00
Dandelion Mané 3179ba841b remove `sourcecred export-graph`
Now that the graph is saved by default as a part of load, users who need
the graph can grab it directly from the `$SOURCECRED_DIRECTORY`. If we
really need a command line util for grabbing it, we should rewrite that
command to just grab the graph from that spot rather than re-computing
it.

Test plan: `yarn test`
2019-06-13 22:40:07 +03:00
Dandelion Mané a348747aed Remove `timestampMap`
As of #1136, this will be redundant with raw information in the graph.

Test plan: `yarn test`
2019-06-13 22:40:07 +03:00
Dandelion Mané 67baacd862 cli load: save graph, not pagerank or timestampMap
As of the timeline cred work, I'm shifting emphasis away from raw
PageRank results, in favor of timeline pagerank results. As such,
there's no need to have load save the regular pagerank results on
creation.

As of #1136, there will be no need for timestampMap, as that data will
be present directly in the graph. As the timeline cred UI will depend on
the full graph for analysis, let's save the graph instead.

Test plan: `yarn test` and snapshot inspection.
2019-06-13 22:40:07 +03:00
Dandelion Mané e493af2307
Refactor graph-related test code (#1179)
This commit adds new helper methods for creating test nodes (`node` and
`partsNode`) and for creating test edges (`edge` and `partsEdge`) to
graphTestUtil.js.

This is very helpful in light of work related to #1136. I'm going to
change the concept of "node" from a raw address to an object, add fields
to that object, and add fields to the `Edge` type. If done naively, we
would need to change all the test code across the project for every one
of those changes.

By centralizing the creation of test nodes and edges behind the new
functions, we can update all the test code in a single place.

This change is trivial from a conceputal perspective, and very
broad-reaching from a code-touching perspective. It should be easy to
review, because if tests pass then the change is probably working as
intended. :)

Test plan: `yarn test`
2019-06-13 22:16:26 +03:00
Dandelion Mané e916bc91c8
Temporarily remove the odyssey plugin (#1178)
In #1132 and #1134, I started work on the Odyssey plugin. However,
before getting it to a state where it's usefully included in SourceCred,
I decided to pivot to focus on timeline cred first.

Now I'm merging significant refactors as a part of timeline cred
(#1136). As a side effect of this refactor, the Odyssey plugin should
undergo significant changes (OdysseyInstance is now basically redundant
with base Graph.) Rather than incrementally update unused code, I elect
to remove the plugin. This code should be revived on a side branch, and
then merged into master once we have a fully functioning prototype.

Test plan: `yarn test` passes.
2019-06-13 17:07:05 +03:00
Dandelion Mané 3c8fd0e701
Graph refactor: {inEdges, outEdges}->incidentEdges (#1173)
This commit refactors the Graph class so that rather than having
separate maps for inEdges and outEdges, there is a single incidentEdges
map, which contains objects with inEdges and outEdges.

This is motivated by a forthcoming big change as part of #1136; namely, to
allow storing dangling edges in the graph. Once we do so, we'll need a
consistent source of truth that enumerates all of the node addresses
which are accessible in the graph (either because they correspond to a
node in the graph, or because they are the src or dst of a dangling
edge). We could do this by adding another field to graph which tracks
this set, but by making this refactor, we can instead use the key set of
_incidentEdges as the source of truth for which node addresses are
present.

Besides being motivated by #1136, I think it's cleaner in general. Note
there are fewer ways for the graph to be inconsistent, as it's no longer
possible for inEdges and outEdges to have inconsistent sets of node
addresses.

The most complicated piece of this change was updating the automatic
invariant checker. It was no longer possible to test 3.1 and 4.1
separately, so they needed to be merged into a new invariant. Rather
than re-enumerate the invariants, I called the new one the 'Temporary
Invariant', because it is going to disappear in a subsequent commit.

Test plan: `yarn test` passes. Since Graph has extremely thorough
testing, this gives me great confidence in this commit. Note that no
observable behavior has changed.
2019-06-13 15:49:12 +03:00
Dandelion Mané f5a46f8b31
update_snapshots.sh updates generated flow types (#1176)
The generated GitHub GraphQL flow types are a kind of snapshot, and it
can be hard to remember/discover how to update them when making changes
to schema. Therefore, I've included them in the snapshot update script.

Test plan: I used it to update the types, and it worked as intended.
2019-06-04 02:48:51 +03:00
Dandelion Mané ccfaa25e7b
Add a GitHub Commit node type (#1175)
At present, the Git commit node type lives in a strange state of shared
responsibility between GitHub and Git. The Git plugin is nominally
responsible for it, but its render method tries to show a hyperlink to
GitHub -- which is awkward for many reasons, including that the same Git
commit could have multiple hyperlinks on GitHub.

This commit resolves that issue by separating the existing commit type
into two: the Git Commit type, which is owned by the Git plugin and
doesn't have hyperlinks or any fancy GitHub metadata, and the GitHub
Commit, which is owned by the GitHub plugin, corresponds to a unique
database id in GitHub, and has a corresponding GitHub url.

The two commits are connected by a CorrespondsToCommit edge type, which
links from the GitHub commit to the corresponding Git commit.

This is necessary for #1136, as if we want to make descriptions a part
of the graph payload, we need for descriptions to be unique for a given
address--and descriptions are only unique if we identifiy each GitHub
commit pointer as a separate address.

Test plan: The unit testing in this part of the codebase is light, so I
verified that the frontend work as expected for `sourcecred/sourcecred`
and `sourcecred/research`. The new node type and edge type appear
properly in the UI, the GitHub commits are connected to their Git
counterparts, etc.
2019-06-03 23:57:48 +03:00
Dandelion Mané 16edea6413
Remove the `graphView`s (#1171)
A long time ago, we made graph views for git and github. These are
interfaces over the graph which allow retrieving nodes' relations, e.g.
finding the parent address of a commit just using the graph.

These are fairly complex, and have seen almost no use at all. The one
thing they are used for is implementing invariant checking. The
invariant checking is nice in principle, but since we only apply it to
the example data, its of very limited value in practice.

Since I'm planning a significant Graph refactor (#1136), I'd rather delete this
code than continue to maintain it, since I think it's complexity/value
ratio is unfavorable.

Test plan: `yarn test`
2019-06-03 21:07:27 +03:00
Dandelion Mané fcbd024a83 Fix silently failing github token test
This is another minor silent test failure: the error message thrown by
loadIndividualPlugin when a GitHub token is not available is not quite
the error that was specified in the test.

There were two issues: that we were testing for the wrong error message,
and that the failure didn't fail the test. I fixed both issues (by
changing the message thrown to match the test, and by having the test
_return_ the expectation that the promise will reject, and by expecting
there to be one assertion.)

Test plan: `yarn unit cli/load` no longer shows any unhandled promise
rejection warnings. If the test is modified so that it checks for the
wrong string, it now properly fails rather than passing with an
unhandled rejection.
2019-06-02 11:27:36 +03:00
Dandelion Mané 7c4ff66907 Fix uncaught test failures
Prior to this commit, if you run `yarn unit cli/load`,
you would see a lot of unhandled promise rejection warnings related to
the fact that load calls a `saveTimestamps` function which expects the
SourceCred directory to contain the results of really loading SourceCred
plugins. However, when testing, these functions have been mocked, and so
saveTimestamps rejects. This rejection was not caught, and polluted the
test output without actually failing the tests.

This commit updates the tests so that saveTimestamps can be mocked (via
dependency injection) and we can both verify that it is invoked
correctly, and not pollute the test output with spurious warnings.

Test plan:
- `yarn test --full` passes
- `yarn unit cli/load` produces far fewer
UnhandledPromiseRejectionWarnings (there is still one unrelated one)
- loading sourcecred/research still works (as a canary)

Note: the PR which introduced this issue is #1162.
2019-06-02 11:27:36 +03:00
Dandelion Mané 1d32141f76
Fix `sourcecred load` for real repos (#1164)
As discussed in #1163, #1162 caused `sourcecred load` to start failing
for real repos (e.g. sourcecred/research). This commit merges a
somewhat hacky fix.

Test plan: `yarn test` passes, and `sourcecred load sourcecred/research`
works.
2019-05-31 20:16:58 +03:00
Tyler Mace 1fbf8cd587 Quicker failure and description when invalid token supplied (#1161)
Fixes #1156

When users export a GitHub API token that has insufficient privleges
or has been revoked, we have been using a catch all error with retry
to handle it. This change adds a new error type for bad credentials
and does not retry.

Test plan:
There are no unit tests that cover this, however, you can test the
change by supplying a revoked token and attempting to load a GitHub
repo.
2019-05-30 22:18:30 +03:00
Dandelion Mané ad2470e5c6
Aggregate timestamp information on sourcecred load (#1162)
This modifies `sourcecred load` so that it saves timestamp information
for all of the loaded plugins in a single aggregated map.

This is quite convenient, as it saves consumers of timestamp information
from needing to worry about the (rather hacky) implementation whereby
the data is fed from each adapter. Instead, consumers can just load the
timestamp map. This will also make it much easier to use timestamp info
in the research codebase.

Test plan: The timestampMap module has testing around generating the map
from the adapter and nodes, writing it, and reading it.

I haven't added any testing to the `load` CLI command. I think it would
be redundant as the updated snapshot test reveals that the map is
getting serialized properly.

Tests pass, and I have inspected the snapshot
2019-05-30 17:15:15 +03:00
Dandelion Mané 4dc97fcc57
PagerankGraph reuses existing distribution (#1160)
This commit modifies `PagerankGraph.runPagerank` so that rather than
always starting from a uniform distribution, it starts with the
PagerankGraph's existing score distribution. The PagerankGraph
initializes with a uniform score over nodes, so it has the exact same
behavior on the first time that runPagerank is called. On subsequent
calls, PageRank will likely converge a lot faster, because it's starting
from converged scores. (It should still be a lot faster in cases where
e.g. the user has tweaked the weights.)

In certain degerate cases, this could change the resultant scores.
Specifically, if there are isolated nodes in the graph and alpha=0, then
the isolated nodes' final scores depends on the initial score. In
general, I think this won't be an issue as we expect alpha > 0 in normal
usage.

Test plan: I added a unit test to verify this property, by checking that
running PageRank with maxIterations==0 on an already-converged graph
results in a still-converged graph. Also, existing tests pass.

I think we can now close #1020.
2019-05-29 20:29:28 +03:00
Dandelion Mané 14eee06799
Add a universal snapshot updater (#1159)
As SourceCred has evolved, we've grown more and more snapshot tests that
are not included in Jest. The GitHub plugin has two ad-hoc snapshot
tests, the Git plugin has one, and the sharness test suites have one.

This makes it difficult to keep track of where to update snapshots when
core changes are made. To fix this, I've added a script,
`scripts/update_snapshots.sh`, which updates snapshot tests across the
project.

Test plan: I removed existing snapshots across the codebase, ran the
snapshot tester, and they correctly regenerated.
2019-05-28 18:59:50 +03:00
Dandelion Mané e01247a642
Expose `createdAt` in `AnalysisAdapter` (#1157)
* Refactor Loader from AnalysisAdapter

At present, the only data the AnalysisAdapter provides is the Graph, so
the AnalysisAdapter has a `load` method which directly returns the
graph. I'm planning to add a `createdAt` getter to the adapter as well,
which also will depend on loading the data. To make this change
convenient, I'm starting by refactoring an AdapterLoader out, which
manages loading data from disk, so that once we have an AnalysisAdapter,
it already has all relevant data loaded. Then, it will be much easier to
add a `createdAt` method.

Test plan: Tests updated, flow passes.

* Add `createdAt` to the analysis adapter

A big step forward for time-varying cred. This will make `createdAt`
timestamps available for PageRank analysis.

Test plan: Added new unit tests. Inspect the snapshots. Run `yarn test`.
2019-05-28 18:03:44 +03:00
Dandelion Mané 61627531bb
Fix a build error induced by #1153 (#1154)
Thanks to @wchargin for [catching it].

[catching it]: https://github.com/sourcecred/sourcecred/issues/1151#issuecomment-494256526

Generated this change via:
```
$ yarn backend
$ (cd sharness; UPDATE_SNAPSHOT=1 ./test_load_example_github.t -l)
```

Test plan: `yarn test --full`, excluding the known (local-only) failure
described in #1151
2019-05-21 14:26:27 +03:00
Dandelion Mané da5bce255e
Add authorDate tracking to Git commits (#1153)
Modifies the Git plugin so that we now track commit author dates.
Similar to in #1152, they are encoded in MsSinceEpoch.

Test plan: `yarn test --full` passes, except for the pre-existing
failure discussed in #1151.

Thanks to @s-ben for a conversation which motivated these changes.
2019-05-21 06:56:57 +03:00
Dandelion Mané d51c0b6715
Add createdAt timestamp tracking to GitHub (#1152)
Updates github schema to include createdAt timestamps, and then updates
the RelationalView to provide those timestamps as MsSinceEpoch.

I added createdAt timestamps to Repos, Issues, Pulls, Reviews, and
Comments, as these correspond to GitHub graph nodes where I think
time-based filtering is relevant. I didn't add them to Users, Reactions,
or Commits. Reactions, because they correspond to edges not nodes. (We
could consider doing the time filtering on edges too, but I'd rather
keep it simple for now.) Commits, because they're owned by a different
plugin. Users, because... in a certain sense the user identity is
timeless, the time factoring is mostly so we can evaluate how users'
cred varies over time.

Anyway, it will be easy to add more fields later if we need them.

Test plan:
- Inspect snapshot changes
- Ran `yarn test --full`
  - Its only failure is pre-existing, per #1151

Thanks to @s-ben for some motivation and discussion about this change.
2019-05-21 06:16:16 +03:00
Dandelion Mané a831e05e5f
Add a WeightsFileManager (#1150)
This adds a WeightsFileManager component that allows the user to save or
load weights in the cred explorer. Clicking the download icon downloads
the weights, clicking the upload icon uploads them.

I also did a slight refactor to the FileUploader so that it no longer
always provides the file upload icon, instead the instantiator passes
children which act as the upload clickable. Seemed more consistent.

Test plan: No tests added, but I manually tested that upload and
download both work.
2019-05-21 04:41:00 +03:00
Dandelion Mané fb89559e44
Add a FileUploader component, with inspection test (#1149)
* Add FileUploader with inspection test

TODO: get it working

* Add a FileUploader component, with inspection test

This adds a FileUploader component, which allows the user to upload JSON
files. Rather than using automated testing, it has an inspection test.
The inspection test may be run by navigating to:
http://localhost:8080/test/FileUploader/

This commit also adds some basic utility functions for defining
inspection tests to `routeData.js`. We should improve support for
inspection tests in the future; see [#1148].

[#1148]: https://github.com/sourcecred/sourcecred/issues/1148

Test plan: Ran the included inspection test.
2019-05-20 17:12:57 +03:00
Dandelion Mané f6a4042342
Add JSON serialization support for weights (#1147)
Test plan: New unit tests added.
2019-05-20 14:12:12 +03:00
Dandelion Mané 03a8ab679b
Refactor the Weights module (#1146)
This commit refactors the `analysis/weights` module so that there's
a single top-level type called `Weights` which includes all of the
user-specified weights for running PageRank. This includes both manually
specified type weights, and per-node weights.

In contrast to the old weights module which had weighted types (i.e. the
type bundled with the weight), this only records the weight choice
(keyed by address). Similarly, the map is empty unless the user has
explicitly chosen a weight. This has a few advantages:

- For planned work on serializing weights, it's convenient that
there's a single unified object that bundles all the weights.

- For planned work on serializing weights, it's convenient that we only
record actual choices, not the default values (defaults are provided
within the WeightsToEdgeEvaluator). This way, we don't need to manually
filter out the default weights for serialization. (We want users who
have not made any choice about the type weights to automatically get new
type weightings we publish in sourcecred/sourcecred).

- Overall, the WeightConfig data piping has become a lot simpler.

While making these changes, I threw away the PluginWeightConfig
component, instead inlining it in the top-level WeightConfig. I also
moved responsibility for the WeightConfig out of the PagerankTable, and
into the containing App; the WeightConfig is created and wired by the
App, and then passed as a whole component into the PagerankTable via
props. This means that PagerankTable does not need to get a bunch of
extra props for piping the state changes into WeightConfig.

I also threw away some frontend test code for the WeightConfig
component. Over the course of the past few months, I've observed that
the value of a lot of the frontend-wiring-testing is pretty low (i.e.
they are not catching issues, because Flow catches them) and the cost of
maintaining the testing is high. I'm now inclined to believe that we
shouldn't be testing routine component wiring via explicit tests,
because it's very easy to verify that the behavior is correct by
interacting with the UI, and the tests are expensive to write and
maintain.

For "core logic" (e.g. behaviors of types and data structures,
algorithms, etc) I continue to believe testing is very valuable. One
heuristic I'll start applying is: "can I achieve high confidence that
this code is correct, without tests, using little effort by manually
inspecting the frontend?". If the answer is yes, I'm unlikely to write
tests.

Test plan: `yarn test` passes, and the UI continues to function,
including changing manual and type level weights and observing cred
changes.
2019-05-20 13:46:08 +03:00
Dandelion Mané 1ea3c1ec94 nit: remove fallbackDeclaration.js
In the previous commit, I failed to remove this file.
This is a cleanup.

Test plan: `yarn test`
2019-05-19 15:50:03 +03:00
Dandelion Mané 467b781788
Remove the fallback adapter (#1145)
Node and Edge types are increasingly important in SourceCred, as we use
them to decide what weights to provide, what description logic to use,
etc. In #631, we hit a bug around not finding a type matching a
particular node, and in #640 I responded by creating a
"FallbackAdapter", which matches any node, and a
"Static/DynamicAdapterSet", which are abstractions for finding adapters
in a way that guarantees the presence of the fallback adapter.

I think this solution is actually pretty brittle. It only works if we
are disciplined in only ever accessing node and edge types using a
context that included the FallbackAdapter, and it requires us to
centralize all of the "type-not-found" logic in one place (the fallback
declaration) irrespective of what the locally appropriate solution is.

Looking back at #631, the core issue was that we had a getter that
promised to always give a matching type and adapter, rather than
returning a possibly undefined adapter. We fixed this by attempting to
ensure that there always would be a matching adapter. I think it would
be better to have the methods honestly report that the adapter might be
null/undefined, and let the client code handle it appropriately.

This commit implements that change. It's motivated by me being
frustrated at the fallback adapter while doing other refactoring.

Test plan:
Unit tests have been updated and `yarn test` passes. Also, I
experimentally removed the Git plugin from the list of adapters for both
the backend and frontend, and verified that the frontend UI and the
pagerank and export-graph commands still worked.
2019-05-19 15:49:24 +03:00
Dandelion Mané 9d8d223653
Weights: Refactor around a new EdgeWeight type (#1144)
This refactors the weights module (and downstream consumers) so that
rather than tracking forwardWeight and backwardsWeight as separate
fields, we have an `EdgeWeight` type with a forwards and backwards
property. I feel this makes the code a little cleaner and wanted it a
few times while doing other refactors in this module.

Note that the graphToMarkovChain module has a distinct `EdgeWeight` type
which serves a similar purpose but happens to have different field
names. I've added a TODO to rename those fields for consistency.

Test plan: Since this is slight data structure re-organization, no new
tests are required; that `yarn test` passes is sufficient.
2019-05-19 14:17:46 +03:00
Dandelion Mané d2559960bb explorer: tweak weights on a per-node basis (#1143)
This pull request adds a weight slider to every NodeRow in the explorer,
enabling the user to manually set a weight for that node. The weights are
multiplicative with the type level weights, so that they can be changed
independently (e.g. you can have a comment that is weighted 2x higher than
regular comments, but still have comments get a low weight in general).

This pull coordinates a number of different changes across the codebase, all of
which are tested:

Adding support for manual weights in the weights and
weightsToEdgeEvaluator modules.
Modifying pagerankTable.TableRow so that it can show a slider in the second
column.
Adding piping for manual weights into the PagerankTable shared props, and
into the explorer app
Adding the slider to the NodeRow class that displays the current weight,
and can trigger the upstream weight change
Ensuring that the runPagerank call in the explorer actually uses the manual
weights
At present, there is no way to save these weights (they are ephemeral in the
frontend) and so this is clearly a prototype/tech demo level feature rather
than being ready for real usage. Correspondingly, CLI pagerank command always
uses an empty set of manual weights. I plan to remedy this in a follow-on pull
request.

Test plan: Run the included unit tests (yarn test) and also spin up the UI,
verify that it visually looks good in both Firefox and Chrome, and verify that
changing the weights and then re-running PageRank actually causes the cred of
the modified node to change.

Review plan: In addition to carefully reading the code, ensure that all of the
changes described a few paragraphs up are actually tested.

Merge plan: Squash and merge.

Thanks to @s-ben for proposing this feature in Discord, and to everyone
discussing its implications in this Discourse thread.
2019-05-18 19:21:17 +03:00
Dandelion Mané 2999d24593
Fix a conflict between #1140 and #1141 (#1142)
In #1140 I rename a field in PagerankGraph.ScoredNode, and in #1141 I
added a new test which referenced the field.

I merged them in quick succession, introducing a conflict. This PR fixes
it.

Test plan:
`yarn test` passes.
2019-05-17 17:11:22 +03:00