983 Commits

Author SHA1 Message Date
Dandelion Mané
aa7158dd95 add analysis/timeline/timelineCred
This adds a TimelineCred class which serves several functions:
- acts as a view on timeline cred data
  - (lets you get highest scoring nodes, etc)
- has an interface for computing timeline cred
- lets you serialize cred along with the graph and paramter settings
  that generated it in a single object

One upshot of this design is that now if we let the user provide weights
(or other config) on load time in the CLI, those weights will get
carried over to the frontend, since they are included along with the
cred results.

TimelineCred has 'Parameters' and 'Config'. The parameters are
user-specified and may change within a given instance. The config is
essentially codebase-level configuration around what types are used for
scoring, etc; I don't expect users to be changing this. To keep the
analysis module decoupled from the plugins module, I put a default
config in `src/plugins/defaultCredConfig`; I expect all users of
TimelineCred to use this config. (At least for a while!)

Test plan: I've added some tests to `TimelineCred`. Run `yarn test`. I
also have a yet-unmerged branch that builds a functioning cred display
UI using the `TimelineCred` class.

fixup tlc
2019-07-11 01:30:27 +01:00
Dandelion Mané
9bd1e88bc9 add analysis/timeline/filterTimelineCred
This adds the `filterTimelineCred` module, which dramatically reduces
the size of timeline cred by throwing away all nodes that are not a user
or repository. It also supports serialization / deserialization.

Test plan: unit tests included
2019-07-11 01:30:27 +01:00
Dandelion Mané
162f73c3e9 add analysis/timeline/distributionToCred
This module takes the timeline distributions created by
`timelinePagerank`, and re-normalizes the scores into cred. For details
on the algorithm, read comments and docstrings in the module.

Test plan: Unit tests added.
2019-07-11 01:30:27 +01:00
Dandelion Mané
87720c4868 Add analysis/timeline/timelinePagerank
As the name would suggest, this module allows computing timeline
PageRank on a graph. See documentation in the module for details on the
algorithm.

Test plan: The module has incomplete testing. The timelinePagerank
function calls out to iterators for getting the time-decayed node
weights and the time-decayed markov chain; these iterators are tested.
However, the wrapper logic that composes these pieces together,
calculates seed vectors, and runs PageRank is not tested. I felt it
would be a pain to test and settled for reviewing the code carefully,
and putting a cautionary note at the top of the function.
2019-07-11 01:30:27 +01:00
Dandelion Mané
cb236eff5d add analysis/weightEvaluator
This commit adds new weight evaluators for nodes and edges. Unlike the
previous evaluator, edges and nodes are handled as separate concerns,
rather than composing the node weights into the edge weights. I think
this separation is cleaner.

Both evaluators use only the address, not the full (Node or Edge)
object. Although we may want to give the edge evaluator access to the
full Edge later, if we decide we want node-type-differentiated edge
weights (e.g. if a hasParent edge has a different weight depending on
whether it is connected to an Issue or a Repository).

weightsToEdgeEvaluator has been refactored to use the new evaluators,
and has been given a deprecation notice.

Test plan: `yarn test`
2019-07-11 01:30:27 +01:00
Dandelion Mané
2335c5d844 add analysis/timeline/interval
This commit adds an `interval` module which defines intervals (time
ranges), and methods for slicing up a graph into its consistuent time
intervals. This is pre-requisite work for #862.

I've added a dep on d3-array.

Test plan: Unit tests added; run `yarn test`
2019-07-11 01:30:27 +01:00
dependabot[bot]
6cb1b336d5 Bump lodash.merge from 4.6.1 to 4.6.2 (#1213)
Bumps [lodash.merge](https://github.com/lodash/lodash) from 4.6.1 to 4.6.2.
- [Release notes](https://github.com/lodash/lodash/releases)
- [Commits](https://github.com/lodash/lodash/commits)

Signed-off-by: dependabot[bot] <support@github.com>
2019-07-11 01:03:01 +01:00
Tyler Mace
2d37dd77cc Removed incorrect information from README.md (#1211)
#1167 added some info to the README about how the user needs to have ssh keys setup. This was true at the time, but changed as a result of #1210. This commit fixes that up by removing the now-outdated information.
2019-07-09 23:29:30 +01:00
Tyler Mace
16e2b3964e
README & CONTRIBUTING updates (#1167)
* Quicker failure and description when invalid token supplied

Fixes #1156

When users export a GitHub API token that has insufficient privleges
or has been revoked, we have been using a catch all error with retry
to handle it. This change adds a new error type for bad credentials
and does not retry.

Test plan:
There are no unit tests that cover this, however, you can test the
change by supplying a revoked token and attempting to load a GitHub
repo.

* Minor: reunite comment with relevant code block

* Update Changelog: Fail quicker and with information when using invalid GH token (#1161)

* Documentation updates

Added note on Changelog update format, SSH key requirements, and formatted the console (CLI) blocks a little better.

* Minor formatting fixes

* Revert a line wrap commit as it is not appropriate on README.md

* Removed prompt tokens and changed non-visible formatting
2019-07-09 14:59:43 -07:00
Dandelion Mané
7d26c196f2 Disable the Git plugin
This commit disables the Git plugin by removing it from the default list
of plugins to load, or to display in the frontend.

Rationale: The git plugin doesn't currently add very much to cred
quality. Git commits have edges to their parent, which isn't a very
meaningful relationship for cred purposes. We'll want to re-enable the
Git plugin once we're ready to support e.g. file and directory level
cred tracking.

I've skipped a block of tests around the git analysisAdapter. (I intend
to deprecate the analysisAdapters, so skipping the tests seemed
preferrable to updating them). I also updated our sharness test for
catching test files without a proper describe block, so that it won't
error on skipped blocks.

Test plan: `yarn test --full` passes. Loading a new repository and
inspecting it in the frontend gives consistent results. There are no
references to Git plugin weights in the frontend, now that corresponding
nodes are not available.
2019-07-09 19:40:17 +01:00
Dandelion Mané
e454a44c71 blacklist dependabot
The dependabot bot has an inconsistent typename in GitHub's database.
We'll blacklist it so we can continue loading `sourcecred/sourcecred`.

Test plan: `node bin/sourcecred.js load sourcecred/sourcecred` fails
before this commit, and succeeds after.
2019-07-09 19:33:16 +01:00
Dandelion Mané
302850202a refactor: describe edge weights consistently
This commit resolves an inconsistency where we called edge weights
`toWeight` and `froWeight` in the core/attribution module, but
`forwards` and `backwards` in the analysis module.

I changed field names in the PagerankGraph JSON, so I bumped its compat.

Test plan: `yarn test --full` passes.
2019-07-09 13:29:22 +01:00
Dandelion Mané
e459a82fae Test node 12 and node 10
This commit changes CI to test against node 12 and 10 instead of node 8.

I test against node 12 by default (it will be LTS soon, and it has a
number of nice improvements compared to 10). We test node10 on the
nightly and post-merge, that way we will still discover quickly if we
have a problem with node 10, but it won't slow down CI for merges.

I'm just dropping explicit support for node 8 entirely, since node 8 is
end-of-life soon (Dec 19).

Test plan: I've locally verified that `yarn test --full` passes for both
node 10 and node 12.
2019-07-09 13:09:17 +01:00
Dandelion Mané
31ab767c03 verify graphToMarkovChain ordering is canonical
This commit just adds a test which verifies that when an
OrderedSparseMarkovChain is created by graphToMarkovChain, its nodeOrder
is the graph's canonical node order.

Test plan: `yarn test`
2019-07-09 13:08:23 +01:00
Dandelion Mané
df55b9c3c5 graph: canoncialize node and edge iteration order
This means that we no longer need to expose methods for extracting the
order from serialized JSON. We can always count on iterating over the
nodes and edges in sorted order.

Test plan: `yarn test`; tests updated.
2019-07-09 13:08:23 +01:00
Dandelion Mané
6738947083
Regenerate yarn.lock file (#1204)
I've regenerated the yarn.lock file (by removing it and then re-running
yarn). This picks up [a fix I authored][fix] for node-eval, which was
preventing `yarn test --full` from passing in node 10 and 12. With this
upgrde of transitive dependencies, we're now ready to officially support
newer versions of node.

Test plan: `yarn test` passes.
2019-07-06 15:02:04 +01:00
Dandelion Mané
b7eae8b3eb
Switch to flow v0.100.0 to try to fix CI issues (#1203)
Since upgrading to flow 0.102.0, we've been having CI issues where flow
fails as "out of retries". In my testing, downgrading flow seems to
resolve this, although it's hard to be certain as the issue strikes
sporadically.

Test plan: This commit definitely works locally; hopefully it will
consistently pass in CI as well.
2019-07-06 14:39:32 +01:00
Dandelion Mané
0ade04b66c
try reverting tmp upgrade (#1202)
After merging #1201, we started seeing build failures ([1], [2]) on
CircleCI. I can't reproduce them locally, but they seem related to the
`tmp` module, so let's try reverting the upgrade.

Test plan: `yarn test` passes locally. Let's see how it does on CI.

[1]: https://circleci.com/gh/sourcecred/sourcecred/1091
[2]: https://circleci.com/gh/sourcecred/sourcecred/1092
2019-07-06 00:22:04 +01:00
greenkeeper[bot]
4758cea2f8 Update dependencies to enable Greenkeeper 🌴 (#1201)
This commit enables Greenkeeper, along with an initial upgrade push for our dependencies.

I've reverted a number of upgrades, and also added them to the
greenkeeper ignore list. They mostly relate to babel/webpack stuff,
which I'm reluctant to dive into now (there have been major upgrades for
both babel and webpack) but we should address eventually. There are also
a few oddballs like whatwg-fetch and history.

Test plan: `yarn test --full` passes.
2019-07-05 20:10:02 +01:00
Dandelion Mané
125e8b0486 Tweak package.json
- Remove obsolete eslint react app config
- Pin webpack major version

Test plan: yarn && yarn test
2019-07-05 19:34:35 +01:00
Dandelion Mané
7a493b596d Update eslint and eslint configuration
This commit updates eslint from v4 to v6. In doing so, I've moved off of
the create-react-app base eslint config. We were on an old version (v2)
and it doesn't make sense to update to v4, as in v4 create-react-app
uses typescript. Also, it didn't make sense to stay on
create-react-app's v2 config, because then it had unmet peer dependency
constraints on old versions of eslint.

Instead, I've moved us to use the default rules for eslint,
eslint-plugin-react, and eslint-plugin-flowtype.

I also made some changes to the codebase to satisfy the new lint rules
that came with this change.

Test plan: `yarn test` passes.
2019-07-05 18:39:00 +01:00
Dandelion Mané
6a40d962fb Mass update of dependencies
This commit deletes and regenerates the `yarn.lock` file, with the
effect that all dependencies are upaded to the latest version allowed by
our package.json. (For example, if we are pinned to `^2.4.0`, we might
now get `2.6.3`.)

This commit was generated via:
```Bash
$ rm yarn.lock
$ yarn
```

Test plan: `yarn test --full` passes, and I also manually tested the
frontend.
2019-07-05 17:33:28 +01:00
Dandelion Mané
eadcca8999 Upgrade flow to to 0.102.0
This necessitated a number of type fixes:
- Upgraded the express flow-typed file to latest
- Added manual flow error suppression to where the express flow-typed
file is still using a deprecated utility type
- Removed type polymorphism support on map.merge (see context here[1]).
We weren't using the polymorphism anywhere so I figured it was simplest
to just remove it.
- Improve typing around jest mocks throughout the codebase.

Test plan: `yarn test --full` passes.

[1]: https://github.com/flow-typed/flow-typed/issues/2991
2019-07-05 17:21:56 +01:00
Dandelion Mané
230756ffec Upgrade jest from v23 to v24
Just some general housekeeping. `yarn test --full` passes without issue.
2019-07-04 18:57:50 +01:00
Dandelion Mané
6a13248b09 Upgrade prettier
This commit updates our prettier version from `1.13` to `1.18`. Looks
like software does get better over time! I like all of the changes.

Test plan: `yarn test` passes. I've manually inspected the diffs.
2019-07-04 20:33:42 +03:00
Dandelion Mané
29c9229c28 Update better-sqlite3 to v5
When we took a dep on better-sqlite3 in #836, we used a fork, because
better-sqlite3 did not yet support private in-memory databases via the
`:memory:` filepath. As of better-sqlite3 v5, this has been added to
mainline, so we no longer need the fork.

The v4->v5 transition involves some breaking changes. The only ones that
affected us were two field renames, from `lastUpdateROWID` to
`lastUpdateRowid`, and `returnsData` to `reader`.

Test plan:
After updating the field accesses, `yarn test --full` passes. For added
safety, I also blew away cache, loaded a nontrivial repository, and
verified that the full cred workflow still works.

cc @wchargin
2019-07-04 20:31:32 +03:00
William Chargin
99b2b12a14 tools: make update_snapshots.sh reentrant
Summary:
General cleanup to `update_snapshots.sh`, primarily such that it is free
of race conditions and does not clobber your build output directory. The
mechanism is the same as used in `yarn test`, via `SOURCECRED_BIN`.

Test Plan:
Remove your build directories (`rm -r ./bin ./build`), then run this
script (`./scripts/update_snapshots.sh`) and check that each subprocess
is properly invoked and all tests pass. Check that the temporary build
directory is cleaned up.

wchargin-branch: update-snapshots-cleanup
2019-07-04 13:45:10 +03:00
dependabot[bot]
c3f2f4d963 Bump diff from 3.4.0 to 3.5.0
Bumps [diff](https://github.com/kpdecker/jsdiff) from 3.4.0 to 3.5.0.
- [Release notes](https://github.com/kpdecker/jsdiff/releases)
- [Changelog](https://github.com/kpdecker/jsdiff/blob/master/release-notes.md)
- [Commits](https://github.com/kpdecker/jsdiff/compare/v3.4.0...v3.5.0)

Signed-off-by: dependabot[bot] <support@github.com>
2019-07-04 13:44:45 +03:00
Dandelion Mané
e47c9b3aba graph: add node and edge timestamps
This commit updates the Graph class so that both nodes and edges have
timestmaps. This is a big step for #862.

Test plan: `yarn test --full` passes.
2019-07-04 13:44:28 +03:00
Dandelion Mané
6c5e8b70d6 graph: add descriptions
This updates the graph `Node` type to include a string description.

The description should be a brief (ideally oneline) string giving
context on what the node is. All planned frontends will support
markdown, so linking to context (e.g. linking to the issue corresponding
to an ISSUE type node) is supported.

This commit updates the Git and GitHub plugins to use the new
description field.

Test plan: `yarn test --full` passes, and I've inspected snapshots and
made sure they look reasonable.
2019-07-04 13:44:28 +03:00
Dandelion Mané
e7add05df5 Plugins create dangling edges
The GitHub plugin no longer adds a Node to the graph for Git commits.
Instead, it creates a dangling edge to it. This frees the GitHub plugin
from responsibility for setting the timestamp or other metadata for Git
nodes.

The Git plugin no longer adds a Commit Node to the graph immediately when
encountering a commit's parent hash. Instead, it creates an edge to the
parent, and then fills in the parent node once it is encountered in the
commit store.

Test plan: Load a real repository with merged pull requests
(e.g. sourcecred/research) into the explorer, and verify that GitHub
commit entities are still connected to Git commits, and that Git
commits are still connected to their parents.
2019-07-03 15:19:11 +03:00
Dandelion Mané
02a8e02922 graph: add support for dangling edges
This commit modifies the Graph class so that it permits dangling edges;
that is to say, edges whose src or dst are not present in the graph.
Dangling edges may be directly added to the graph, or existing edges may
become dangling if their src or dst is removed.

This change is prerequisite to #1136; if we require that nodes have
metadata, we should also make it possible to add edges to nodes that
don't yet exist, as the plugin creating an edge may not have access to
the full metadata needed to add the node.

To support this change, there is now an `isDanglingEdge` method on the
graph, which reports whether or not the edge is dangling. Also,
`Graph.edges` requires that the client make an explicit choice on
whether dangling edges are desired. This ensures that we do not
accidentally include dangling edges in a case where they are
inappropriate (e.g. creating a Markov chain) or accidentally discard
dangling edges when they are needed (e.g. when merging or serializing).

The Graph's invariant checker has been updated to reflect the new
semantics.

The Graph compat version has been bumped, since this is a break in
backwards compatibility.

Note that this commit does not change the behavior of any plugins; that
is to say, no plugins create dangling edges (yet).

Test plan: The advanced graph test case has been updated to include
dangling edges. The tests for Graph, PagerankGraph, and
GraphToMarkovChain have been updated. `yarn test --full` passes.
2019-07-03 15:19:11 +03:00
Dandelion Mané
f934735afc
Graph: reify Object nodes (#1188)
This commit modifies the base `Graph` class so that nodes are now
represented by `Node` objects rather than `NodeAddressT`. The intention
is to start adding additional fields (e.g. description and timestamp) to
nodes, although that is not included in this commit.

See #1136 for rationale.

Test plan: The graph is very well tested, and this commit adds
additional tests and invariant checking. Some additional test code
needed update. `yarn test --full` passes, and the SourceCred UI works as
expected.
2019-06-14 03:22:31 +03:00
Dandelion Mané
7277867cc8 cleanup: PagerankGraph getters return undefined
PagerankGraph's `node` and `edge` getters returned null for unavailable
entries, rather than undefined. This is inconsistent with general JS
behavior, and with the base Graph. I've now cleaned it up.

Test plan: unit tests updated; `yarn test` passes.
2019-06-14 02:46:11 +03:00
Dandelion Mané
bcf805b9c8 cleanup: remove test duplication
In a previous commit (#1182) I inadvertently duplicated some tests. They
have now been removed.

Test plan: `yarn test` passes.
2019-06-14 02:46:11 +03:00
Dandelion Mané
414fb9f89f
Add GitHub entity descriptions (#1186)
Every GitHub entity now has a `description` method which returns a short
markdown description. These will be added to the graph as part of #1136.

Test plan: Inspected snapshots, `yarn test --full`.
2019-06-13 23:58:17 +03:00
Dandelion Mané
31abd380dc
Add GitHub reaction timestamps (#1185)
This will allow timeline cred (#862) to do a better job of flowing cred
across reaction edges. (Very old reactions should not be moving a lot of
present-day cred.)

Test plan: Inspected snapshot changes.
2019-06-13 23:43:54 +03:00
Dandelion Mané
b03f824e7d
GitHub entities expose timestampMs (#1184)
Every GitHub entity from `RelationalView` now has a `timestampMs`
method. This replaces the standalone `createdAt` method.

Test plan: Snapshots look good.
2019-06-13 23:32:22 +03:00
Dandelion Mané
1ec3945cdb
Load commit authored datetimes from GitHub (#1183)
It's an extension of #1152 induced by #1175.

It's a very simple change; I just changed the schema, and
`scripts/update_snapshots.sh` took care of the rest.

Test plan: Inspected snapshots and generated flow types.
2019-06-13 23:27:56 +03:00
Dandelion Mané
4029458098
Factor out distribution modules (#1182)
This pulls distribution related code out of `markovChain.js` into the new
`distribution.js` module, and from `graphToMarkovChain.js` into
`nodeDistribution.js`.

Since the `computeDelta` method is now exported, I've added some unit
tests.

Test plan: `yarn test` passes.
2019-06-13 23:24:37 +03:00
Dandelion Mané
e47a5bd84e Remove createdAt from the AnalysisAdapter
As #1136 will be moving timestamps into the graph, we no longer need
`createdAt` method in the `AnalysisAdapter`. Actually, we no longer need
the adapter/loader distinction introduced in #1157. I haven't taken the
time to remove the `BackendAdapterLoader` concept because a) we may need
it later, and b) if we don't, I'll likely remove the `AnalysisAdapter`
concept entirely, in favor of having plugins directly save `graph.json`
files to a known location.

Test plan: `yarn test` passes.
2019-06-13 23:24:20 +03:00
Dandelion Mané
4b1763ebc6 Discard mentionsAuthorReference
I added `mentionsAuthorReference` based on an untested hypothesis that
they would be useful. With the passage of time, I've never seen any
evidence that they actually improve cred socres (their impact seems
negligible), and they add complexity.

In the future, "go-fishing" style heuristics like this should not merge
unless they are of clearly demonstrated value. Also, it would be better
to add stuff like this via a standalone plugin rather than in the core
GitHub logic.

Undoes #806.

Test plan: `yarn test`
2019-06-13 23:24:20 +03:00
Dandelion Mané
3179ba841b remove sourcecred export-graph
Now that the graph is saved by default as a part of load, users who need
the graph can grab it directly from the `$SOURCECRED_DIRECTORY`. If we
really need a command line util for grabbing it, we should rewrite that
command to just grab the graph from that spot rather than re-computing
it.

Test plan: `yarn test`
2019-06-13 22:40:07 +03:00
Dandelion Mané
a348747aed Remove timestampMap
As of #1136, this will be redundant with raw information in the graph.

Test plan: `yarn test`
2019-06-13 22:40:07 +03:00
Dandelion Mané
67baacd862 cli load: save graph, not pagerank or timestampMap
As of the timeline cred work, I'm shifting emphasis away from raw
PageRank results, in favor of timeline pagerank results. As such,
there's no need to have load save the regular pagerank results on
creation.

As of #1136, there will be no need for timestampMap, as that data will
be present directly in the graph. As the timeline cred UI will depend on
the full graph for analysis, let's save the graph instead.

Test plan: `yarn test` and snapshot inspection.
2019-06-13 22:40:07 +03:00
Dandelion Mané
e493af2307
Refactor graph-related test code (#1179)
This commit adds new helper methods for creating test nodes (`node` and
`partsNode`) and for creating test edges (`edge` and `partsEdge`) to
graphTestUtil.js.

This is very helpful in light of work related to #1136. I'm going to
change the concept of "node" from a raw address to an object, add fields
to that object, and add fields to the `Edge` type. If done naively, we
would need to change all the test code across the project for every one
of those changes.

By centralizing the creation of test nodes and edges behind the new
functions, we can update all the test code in a single place.

This change is trivial from a conceputal perspective, and very
broad-reaching from a code-touching perspective. It should be easy to
review, because if tests pass then the change is probably working as
intended. :)

Test plan: `yarn test`
2019-06-13 22:16:26 +03:00
Dandelion Mané
e916bc91c8
Temporarily remove the odyssey plugin (#1178)
In #1132 and #1134, I started work on the Odyssey plugin. However,
before getting it to a state where it's usefully included in SourceCred,
I decided to pivot to focus on timeline cred first.

Now I'm merging significant refactors as a part of timeline cred
(#1136). As a side effect of this refactor, the Odyssey plugin should
undergo significant changes (OdysseyInstance is now basically redundant
with base Graph.) Rather than incrementally update unused code, I elect
to remove the plugin. This code should be revived on a side branch, and
then merged into master once we have a fully functioning prototype.

Test plan: `yarn test` passes.
2019-06-13 17:07:05 +03:00
Dandelion Mané
3c8fd0e701
Graph refactor: {inEdges, outEdges}->incidentEdges (#1173)
This commit refactors the Graph class so that rather than having
separate maps for inEdges and outEdges, there is a single incidentEdges
map, which contains objects with inEdges and outEdges.

This is motivated by a forthcoming big change as part of #1136; namely, to
allow storing dangling edges in the graph. Once we do so, we'll need a
consistent source of truth that enumerates all of the node addresses
which are accessible in the graph (either because they correspond to a
node in the graph, or because they are the src or dst of a dangling
edge). We could do this by adding another field to graph which tracks
this set, but by making this refactor, we can instead use the key set of
_incidentEdges as the source of truth for which node addresses are
present.

Besides being motivated by #1136, I think it's cleaner in general. Note
there are fewer ways for the graph to be inconsistent, as it's no longer
possible for inEdges and outEdges to have inconsistent sets of node
addresses.

The most complicated piece of this change was updating the automatic
invariant checker. It was no longer possible to test 3.1 and 4.1
separately, so they needed to be merged into a new invariant. Rather
than re-enumerate the invariants, I called the new one the 'Temporary
Invariant', because it is going to disappear in a subsequent commit.

Test plan: `yarn test` passes. Since Graph has extremely thorough
testing, this gives me great confidence in this commit. Note that no
observable behavior has changed.
2019-06-13 15:49:12 +03:00
Dandelion Mané
f5a46f8b31
update_snapshots.sh updates generated flow types (#1176)
The generated GitHub GraphQL flow types are a kind of snapshot, and it
can be hard to remember/discover how to update them when making changes
to schema. Therefore, I've included them in the snapshot update script.

Test plan: I used it to update the types, and it worked as intended.
2019-06-04 02:48:51 +03:00
Dandelion Mané
ccfaa25e7b
Add a GitHub Commit node type (#1175)
At present, the Git commit node type lives in a strange state of shared
responsibility between GitHub and Git. The Git plugin is nominally
responsible for it, but its render method tries to show a hyperlink to
GitHub -- which is awkward for many reasons, including that the same Git
commit could have multiple hyperlinks on GitHub.

This commit resolves that issue by separating the existing commit type
into two: the Git Commit type, which is owned by the Git plugin and
doesn't have hyperlinks or any fancy GitHub metadata, and the GitHub
Commit, which is owned by the GitHub plugin, corresponds to a unique
database id in GitHub, and has a corresponding GitHub url.

The two commits are connected by a CorrespondsToCommit edge type, which
links from the GitHub commit to the corresponding Git commit.

This is necessary for #1136, as if we want to make descriptions a part
of the graph payload, we need for descriptions to be unique for a given
address--and descriptions are only unique if we identifiy each GitHub
commit pointer as a separate address.

Test plan: The unit testing in this part of the codebase is light, so I
verified that the frontend work as expected for `sourcecred/sourcecred`
and `sourcecred/research`. The new node type and edge type appear
properly in the UI, the GitHub commits are connected to their Git
counterparts, etc.
2019-06-03 23:57:48 +03:00