895 Commits

Author SHA1 Message Date
Dandelion Mané
9d4ae8b901
Deleted users no longer break GitHub parser (#228)
When a GitHub user delete their account, all of their comments remain,
but with a `null` author. Previously, our code did not account for this
possibility. Now it does (by simply not adding an AUTHORSHIP edge).
Conveniently, our porcelain API already represents authors as a list, so
this doesn't require any change in porcelain API usage.

Test plan:
I did not add any unit tests, simply because
creating-and-deleting a GitHub user to make a repro seemed like a bit of
a pain. However, it is very unlikely that this bug will re-occur,
because the nullability of AuthorJSON is now enforced at the type level
inside graphql.js.

Also, running `node bin/sourcecred.js src-d go-git` now succeedsinstead
of failing.
2018-05-07 17:06:26 -07:00
William Chargin
0cae9d742d
Extract a common SOURCECRED_DIRECTORY flag (#227)
Summary:
This solves two problems:

 1. The “output directory” argument to `sourcecred graph` is also the
    input directory to other commands, like `sourcecred analyze`
    (hypothetically). In such cases, it would be nice for the flag to
    have the same name, but clearly `--output-directory` does not always
    make sense.

 2. In addition to storing graphs, we’ll need to store other kinds of
    data: settings, intermediate data for plugins to cache results, etc.
    We should store these under a single umbrella.

With both of these problems in mind, it makes sense to create a
`SOURCECRED_DIRECTORY` flag under which we store all relevant data.

Test Plan:
Run:
```shell
$ yarn backend
$ node bin/sourcecred.js help graph
$ node bin/sourcecred.js graph sourcecred example-github
$ node bin/sourcecred.js graph sourcecred example-github -d /tmp/sorcecrod
$ SOURCECRED_DIRECTORY=/tmp/srccrd node bin/sourcecred.js graph sourcecred example-github
$ for dir in /tmp/{sourcecred,sorcecrod,srccrd}; do find "${dir}"; done
```

wchargin-branch: graph-directory
2018-05-07 17:05:58 -07:00
William Chargin
498480db06
Factor out defaultStorageDirectory function (#226)
Test Plan:
Run `yarn backend && node bin/sourcecred.js help graph`.

wchargin-branch: defaultstoragedirectory
2018-05-07 16:23:53 -07:00
Dandelion Mané
61635a14a7
Remove redundant scripts (#225)
Our SourceCred CLI tool now ipmlements printCombinedGraph and
cloneAndPrintGitGraph, but with more principled implementations and
interfaces :)

Test plan:
`yarn travis --full` passes, so I didn't delete any needed test infra.
2018-05-07 16:15:00 -07:00
Dandelion Mané
55856d7a46
CLI commands error on unhandled promise rejection (#224)
Previously, if a CLI command had an unhandled promise rejection, this
would result in a spurious success and zero exit value.

This commit causes all of our CLI commands to instead fail if they have
an unhandled promise rejection.

Test plan: Previously, `sourcecred graph src-d go-git` would claim to
succeed, although it actually fails due to an unrelated bug. After this
change is applied, it correctly fails to retrieve the GitHub graph (and
hte combine step is never run).
2018-05-07 16:04:54 -07:00
Dandelion Mané
82bf739f35
Query GitHub for repository information (#222)
This commit pulls new information from GitHub about the url, name, and
owner of a GitHub repository. Part of #171.

Test plan: example-github-repo.json has been updated. `yarn travis
--full` passes. This should be sufficient.
2018-05-07 15:48:30 -07:00
Dandelion Mané
f3bfed3deb
Split GithubResponseJSON and RepositoryJSON (#219)
Currently, we generate a `RepositoryJSON` object via querying GitHub.
That `RepositoryJSON` object has a `repository` field... which is weird,
and suggests we got the names slightly wrong.

This commit renames the top-level response to `GithubResponseJSON`, and
factors the `repository` field out as `RepositoryJSON`. Correspondingly,
the `addData` and `addRepository` methods on the parser are now
distinct.

This is a precursor for #171.

Test plan: This is a simple refactor; the fact that yarn travis passes
should be sufficient.
2018-05-07 14:49:59 -07:00
William Chargin
1e0d846675
Change plugin graph label from "PASS" to "DONE" (#221)
Test Plan:
Run `node bin/sourcecred.js graph sourcecred example-github` and note
the new output:
```
Storing graphs into: /tmp/sourcecred/sourcecred/example-github

Starting tasks
  GO   create-git
  GO   create-github
 DONE  create-git
 DONE  create-github
  GO   combine
 DONE  combine

Full results
 DONE  create-git
 DONE  create-github
 DONE  combine

Overview
Final result:  SUCCESS
```

wchargin-branch: plugin-graph-label
2018-05-07 14:47:43 -07:00
William Chargin
4e8d5b574a
Make execDependencyGraph labels configurable (#220)
Summary:
The `"PASS"` label only makes sense for tests. This commit makes the
labels configurable, so that the verbiage can make more sense in other
contexts, too.

Test Plan:
Apply a patch like
```diff
diff --git a/config/travis.js b/config/travis.js
index af0996b..b0ab3b6 100644
--- a/config/travis.js
+++ b/config/travis.js
@@ -10,7 +10,11 @@ function main() {
     process.argv.includes("--full")
       ? "FULL"
       : "BASIC";
-  execDependencyGraph(makeTasks(mode)).then(({success}) => {
+  execDependencyGraph(makeTasks(mode), {
+    taskLaunchLabel: " YO ",
+    taskPassLabel: "WHEE",
+    taskFailLabel: "UHOH",
+  }).then(({success}) => {
     process.exitCode = success ? 0 : 1;
   });
 }
```
and note that `GITHUB_TOKEN=none yarn travis --full` exhibits all
desired messages.

wchargin-branch: configurable-labels
2018-05-07 14:39:59 -07:00
William Chargin
2aeeca9a13
Implement a command-line interface (#217)
Summary:
This commit implements the `sourcecred` command-line utility, which has
three subcommands:
  - `plugin-graph` creates one plugin’s graph;
  - `combine` combines multiple on-disk graphs; and
  - `graph` creates all plugins’ graphs and combines them.

As an implementation detail, the `into.sh` script is very convenient,
avoiding needing to do any pipe management in Node (which is Not Fun).
When we build for release, we may want to factor that differently.

Test Plan:
To see it all in action, run `yarn backend`, and then try:
```
$ export SOURCECRED_GITHUB_TOKEN="your_token_here"
$ node ./bin/sourcecred.js graph sourcecred sourcecred
Using output directory: /tmp/sourcecred/sourcecred

Starting tasks
  GO   create-git
  GO   create-github
 PASS  create-github
 PASS  create-git
  GO   combine
 PASS  combine

Full results
 PASS  create-git
 PASS  create-github
 PASS  combine

Overview
Final result:  SUCCESS

$ ls /tmp/sourcecred/sourcecred/
graph-github.json  graph-git.json  graph.json

$ jq '.nodes | length' /tmp/sourcecred/sourcecred/*.json
1000
7302
8302
```
The `node sourcecred.js graph` command takes 9.8s for me.

(The salient point of the last command is that the two small graphs have
node count adding up to the node count of the big graph. Incidentally,
we are [almost][1] at a nice round number of nodes in the GitHub graph.)

[1]: https://xkcd.com/1000/

wchargin-branch: cli
2018-05-07 12:23:09 -07:00
William Chargin
d7bfa02a54
Change execDependencyGraph export format (#216)
Summary:
To be honest, I have no idea what exactly this does or why it’s
necessary, but if we don’t do this then it is not possible to `import`
the exported member from a Webpack-bundled script. I’ve seen this
pattern before; one day I’ll actually figure out what it does. :-)

Test Plan:
Note that `yarn travis` (success) and `yarn travis --full` (failure; no
GitHub token) both have the expected behaviors.

wchargin-branch: execdependencygraph-export
2018-05-07 10:30:56 -07:00
Dandelion Mané
72ca52f579
Change package.json name to sourcecred (#215) 2018-05-05 00:04:42 -07:00
Dandelion Mané
fa4082c95b
Minimal toy oclif integration (#214)
This commit adds [oclif] as a command-line framework. It is successfully
integrated with webpack.

[oclif]: https://github.com/oclif/oclif

Usage:
`yarn backend` to build the cli.
`node bin/sourcecred.js` to launch the CLI and see usage
`node bin/sourcecred.js example` for one example command
`node bin/sourcecred.js goodbye` for another example command
2018-05-04 19:28:37 -07:00
William Chargin
e9dbdeca96
Target latest Node for backend applications (#213)
Summary:
Consequently, Babel won’t transform classes to their roughly equivalent
ES5 counterparts, etc.

Test Plan:
Create `src/classy.js` with `class X {}; console.log(X);`. Then, add a
build target for `classy: resolveApp("src/classy.js"),` in `paths.js`.
Use `yarn backend` and inspect the contents of `bin/classy.js`; in
particular, look at the definition of `X` (whatever the argument to
`console.log` is). Before this commit, the result will be a big
complicated mess. After this commit, it will be `class X {}`.

Note also that `yarn travis --full` passes, indicating that the two
manual tests, which call out to the utilities in `bin/`, still work.

wchargin-branch: target-node
2018-05-04 19:22:39 -07:00
William Chargin
b5e894bbb4
Fork babel-preset-react-app into config/ (#212)
Summary:
We want to change this configuration so that our compilation of backend
applications can target latest Node. This commit forks the current
configuration so that we can modify it easily.

Test Plan:
Both `yarn start` and `yarn travis` work. The generated backend
applications work, too.

wchargin-branch: fork-babel-config
2018-05-04 19:19:45 -07:00
Dandelion Mané
de5542de6a
Exclude node modules from backend build (#211)
Setup following directions from [webpack-node-externals]

[webpack-node-externals]: https://www.npmjs.com/package/webpack-node-externals

This unblocks #210.

Test plan: `yarn backend` still succeeds, and the binary scripts still
work. The resultant binaries are much smaller, as seen below (note build
time is the same).

before:
```
❯ yarn backend
yarn run v1.5.1
$ node scripts/backend.js
Building backend applications...
Compiled successfully.

File sizes after gzip:

  231.37 KB  bin/printCombinedGraph.js
  199.5 KB   bin/fetchAndPrintGithubRepo.js
  46.41 KB   bin/cloneAndPrintGitGraph.js
  21.48 KB   bin/createExampleRepo.js
  17.71 KB   bin/loadAndPrintGitRepository.js

Build completed; results in 'bin'.
Done in 4.46s.
```

after:
```
❯ yarn backend
yarn run v1.5.1
$ node scripts/backend.js
Building backend applications...
Compiled successfully.

File sizes after gzip:

  27.78 KB  bin/printCombinedGraph.js
  12.73 KB  bin/cloneAndPrintGitGraph.js
  12.41 KB  bin/fetchAndPrintGithubRepo.js
  6.03 KB   bin/loadAndPrintGitRepository.js
  5.52 KB   bin/createExampleRepo.js

Build completed; results in 'bin'.
Done in 4.28s.
```
2018-05-04 16:31:39 -07:00
William Chargin
d3443a3d4c
Extract execDependencyGraph core from CI script (#208)
Summary:
We’d like to use the same abstraction for creating multiple cred graphs
and then combining them together. This will enable us to do that.

Test Plan:
Run `yarn travis` to test the success case, and `yarn travis --full`
(without setting a `GITHUB_TOKEN`) to test the failure case.

wchargin-branch: execdepgraph
2018-05-04 15:47:26 -07:00
William Chargin
a642ed46b9
Expose Graph.mergeManyConservative (#209)
Summary:
This offers #205 to general users.

Test Plan:
Existing tests suffice.

wchargin-branch: merge-many-conservative
2018-05-04 15:42:39 -07:00
Dandelion Mané
e3469f157d
Add src/tools/bin/printCombinedGraph.js (#207)
`printCombinedGraph` loads and prints a cross-plugin combined
contribution graph for a given GitHub repository.

It is a simple executable wrapper around `src/tools/loadCombinedGraph`.

Example usage:
`node bin/printCombinedGraph.js sourcecred example-git $GITHUB_TOKEN`
2018-05-04 12:10:20 -07:00
Dandelion Mané
e66ed45cba
Add CLI for printing a fresh Git graph (#206)
`cloneAndPrintGitGraph` clones a git repository, and generates a Git
object graph for that repository.

This can be run as follows:
```
yarn backend;
node bin/cloneAndPrintGitGraph sourcecred example-git
```

This commit also adds two utility modules:
* `cloneAndLoadRepository` , which clones a Git repository to a tmpdir,
parses the `Repository` data out, and then cleans up.
* `cloneGitGraph`, which calls `cloneAndLoadRepository` and `createGraph`

Test plan: These don't fit well into our CI, because they require
network access to clone repositories from GitHub. I verified that the
functions work via the demo script above.
2018-05-04 11:35:14 -07:00
William Chargin
d3dcf1ef5a
Speed up Git graph creation (#205)
Summary:
Because of `mergeConservative`’s naive implementation, using it as a
reducer results in a roughly quadratic algorithm. Replacing this with a
mutative accumulation has the same semantics but goes much faster.

Test Plan:
For correctness: tests pass. For performance: apply the following patch
to collect timing data. Then run:
```shell
$ NODE_ENV=production yarn backend
$ node bin/loadAndPrintGitRepository.js . >/tmp/sourcecred-git
$ node bin/createGitGraph.js /tmp/sourcecred-git
```
to run against the current state of SourceCred. Before this patch, the
elapsed time is about 6m00s; after this patch, it is about 0m1.3s.
Specifically:
```
$ cat timing_before
[0] Git graph creation: 239593.958ms
[1] Git graph creation: 240380.557ms
[2] Git graph creation: 241687.042ms

$ cat timing_after
[0] Git graph creation: 1585.903ms
[1] Git graph creation: 1315.430ms
[2] Git graph creation: 1373.833ms
```

<details>
<summary>Patch to collect timing data</summary>

```diff
diff --git a/config/paths.js b/config/paths.js
index f875eee..1bc1469 100644
--- a/config/paths.js
+++ b/config/paths.js
@@ -64,5 +64,6 @@ module.exports = {
     loadAndPrintGitRepository: resolveApp(
       "src/plugins/git/bin/loadAndPrintRepository.js"
     ),
+    createGitGraph: resolveApp("src/plugins/git/bin/createGraph.js"),
   },
 };
diff --git a/src/plugins/git/bin/createGraph.js b/src/plugins/git/bin/createGraph.js
new file mode 100644
index 0000000..a35ca1b
--- /dev/null
+++ b/src/plugins/git/bin/createGraph.js
@@ -0,0 +1,25 @@
+// @flow
+
+import {readFileSync} from "fs";
+
+import {createGraph} from "../createGraph";
+
+function main() {
+  const filename = process.argv[2];
+  const raw = JSON.parse(readFileSync(filename).toString());
+  const results = [];
+  for (let i = 0; i < 3; i++) {
+    const timer = `[${i}] Git graph creation`;
+    console.time(timer);
+    results.push(createGraph(raw));
+    console.timeEnd(timer);
+  }
+  console.log(
+    "Checksum: " +
+      JSON.stringify(
+        results.map((graph) => graph.nodes().length ^ graph.edges().length)
+      )
+  );
+}
+
+main();
```

</details>

wchargin-branch: collect-gold-rings
2018-05-04 10:52:41 -07:00
Dandelion Mané
0bf4f73f86
Add fetchGithubGraph (#204)
fetchGithubGraph is a tiny module which is responsible for fetching
GitHub contribution data, and parsing it into a graph.

Test plan:
The function is trivial and does not merit explicit testing.
2018-05-04 10:21:21 -07:00
William Chargin
315f66cc4c
Add BECOMES edges in the Git graph (#203)
Summary:
If a commit causes a tree entry to change hash while keeping the same
name, we now add a BECOMES edge between the corresponding entries.

Test Plan:
Snapshot changes are readable enough to manually verify. Programmatic
tests also added.

wchargin-branch: graph-becomes-edges
2018-05-03 14:16:18 -07:00
William Chargin
e9ecb8c608
Find BECOMES edges for a high-level repository (#202)
Test Plan:
For the snapshot: verify that two of the BECOMES edges are the same as
those tested in `findBecomesEdgesForCommits` and that they have the
right commit hashes; then, verify that the remaining edge is correct.

wchargin-branch: high-level-becomes-repository
2018-05-03 14:09:13 -07:00
William Chargin
c572b7f880
Find BECOMES edges for high-level commits (#201)
Test Plan:
Unit tests included. I verified that the hashes in the snapshot are
correct.

wchargin-branch: high-level-becomes-commits
2018-05-03 13:46:03 -07:00
Dandelion Mané
a76d01ab75
Connect PRs and commits via MERGED_AS edges (#200)
This adds MERGED_AS edges which link from a PullRequest to a Commit. It
adds a corresponding `mergedCommitHash` method on the porcelain PR that
returns the hash of the merged commit (if available).

I would have preferred to return a porcelain wrapper over the commit,
but since we don't have a porcelain Git api, it seemed preferrable to
return the hash as a string. Returning a Node would both break
consistency in the porcelain api, and be problematic as the node does
not necessarily exist in the api. To ensure that the hash is available
without parsing Addresses, I used the edge payload. :)

Test plan:
Inspect the snapshot changes in the graph (they are fairly readable) and
the api testing in api.test.js.
2018-05-03 13:29:44 -07:00
Dandelion Mané
723efeb05f
Pull merge commit SHAs from GitHub (#198)
This commit adds a few fields to the PullRequest query fragment so that
we now retrieve merge commit shas. In cases where there is no merge
commit (ie the PR did not merge), the field is null. Observe that this
is the case for our example unmerged pull request.

Test plan: Inspect the changes to the demo data, and verify that they
are appropriate.
2018-05-03 12:41:20 -07:00
Dandelion Mané
9cbfa35a3a
Expose commitAddress from the Git plugin (#199)
For the GitHub plugin to create edges pointing to commits from the Git
plugin, it needs a way to create the appropriate address given the
commit's hash. This commit exposes that functionality by moving
`makeAddress` out of the "createGraph" module and into a new "address"
module, and using it to implement `commitAddress`.

Test plan: The code is so trivial that I don't think it merits testing.
2018-05-03 12:41:01 -07:00
Dandelion Mané
136cfa839c
Update example-github data (#197) 2018-05-03 12:03:19 -07:00
Dandelion Mané
ce11a1c4e3
Rename sourcecred/example-{repo,github} (#196)
Our repository containing example GitHub data has been renamed from
"sourcecred/example-repo" to "sourcecred/example-github". This commit
updates all snapshots and tests to reflect this rename.
2018-05-03 11:51:12 -07:00
Dandelion Mané
b4474e6bd1
Remove the repositoryName field from Addresses (#195)
See [#190] for context.

The change is almost entirely straightforward; the only "interesting"
decision I made was to move the repo owner and repo name into the string
id for the Artifact Plugin addresses, as the id would otherwise not be
unique.

[#190]: https://github.com/sourcecred/sourcecred/issues/190#issuecomment-386362870
2018-05-03 11:12:02 -07:00
William Chargin
082515e16a
Create nodes for submodule commits (#186)
Summary:
Previously, a tree entry had exactly one `HAS_CONTENTS` edge, unless the
tree entry corresponded to a submodule commit, in which case we had no
information at all. Now, submodule commit tree entries point to zero or
more `SUBMODULE_COMMIT` nodes. In the vast majority of cases, there will
be exactly one such node—however, it is possible that a particular tree
entry could correspond to multiple submodules (clone two identical
submodules with different URLs) or none at all (some manually edited
`.gitmodules` or other corruption).

Test Plan:
The snapshot changes are easy enough to read and verify (two new nodes
and five new edges). Additionally, the path-to-submodule `createGraph`
test has been updated.

wchargin-branch: graph-submodule-urls
2018-05-03 10:44:06 -07:00
William Chargin
7dbecfdac6
Load submodule URLs at each commit (#185)
Summary:
In Git, a tree may point to a commit directly. In our graph, we’d like
to represent “submodule commits” explicitly, because, a priori, we do
not know the repository to which the commit belongs. A submodule commit
node will store the hash of the referent commit, as well as the URL to
the subproject as listed in the .gitmodules file. In this commit, we
load the list of those URLs into the in-memory repository.

Shout-out to `git` for having an excellent command-line API:
the `--blob` argument to `git-config` is perfect for this situation.

Test Plan:
Snapshot updates are readable and sufficient.

wchargin-branch: load-submodule-urls
2018-05-03 10:39:03 -07:00
William Chargin
bbb05c9508
Store TreeEntry metadata in non-string form (#184)
Summary:
Prior to this commit, given a `Tree` node with an edge to a `TreeEntry`
node, there was no way to tell what the entry name was other than
parsing the ID (which should never be required). This adds appropriate
data to the payload of a `TreeEntry`, and also to the inclusion edge (so
that if you only have the edge, you don’t have to fetch the entry).

Test Plan:
Snapshot changes are readable.

wchargin-branch: treeentry-metadata
2018-05-03 10:33:25 -07:00
William Chargin
eba1872495
Build backend applications in CI (#193)
Summary:
This could catch failures in build configuration or with Webpack. It’s
unlikely to catch any logic errors, because no production code is run.
In any case, it’s fast enough; it finishes at about the same time as
`ci-test` and `check-pretty`.

Test Plan:
From the repository root, run `rm -r bin; yarn travis`, and note that
the `bin/` directory is regenerated.

wchargin-branch: ci-backend
2018-05-02 22:16:48 -07:00
William Chargin
25d0106a33
Run npm scripts with --silent in CI (#191)
Summary:
This prevents the boilerplate output of the form
```

> sourcecred-explorer@0.1.0 check-pretty /home/wchargin/git/sourcecred
> prettier --list-different '**/*.js'

```
(superfluous linebreaks included). In the case that a script fails, it
also omits the giant “this is most likely not a problem with npm” block.

The downside to this is that it suppresses any errors in npm-run-script
itself. For instance, `npm run wat` produces “missing script: wat”,
while `npm run --silent wat` just silently exits with 1. This does not
silence the actual scripts themselves, so things like lint errors or
test failures will still appear.

Test Plan:
Run `yarn travis` before and after this commit, and note that the
resulting build log is prettier after.

wchargin-branch: ci-silent
2018-05-02 19:10:37 -07:00
William Chargin
38f4121ce9
Implement a custom CI script (#189)
Summary:
This CI script accomplishes two tasks:
 1. It speeds up our build by parallelizing where possible.
 2. It opens the possibility for running Travis cron jobs.

Currently, this script by default does the same amount of work as our
current CI script. However, I’d like to move `yarn backend` into the
list of basic actions: a backend build failure should fail CI.

Note: this script is written to be executable directly by Node, so we
can’t use Flow types with the standard syntax. Instead, we use the
comment syntax: https://flow.org/en/docs/types/comments/

Test Plan:
The following should pass with useful output:
  - `npm run travis`
  - `GITHUB_TOKEN="your_github_token" npm run travis -- --full`

The following should fail with useful output:
  - `npm run travis -- --full` (fail)

To test different failure modes, it can be helpful to add
```js
    {id: "doomed", cmd: ["false"], deps: []},
    {id: "orphan", cmd: ["whoami"], deps: ["who", "are", "you"]},
```
to the list of `basicTasks` in `travis.js`.

To test performance:
```shell
$ time node ./config/travis.js >/dev/null 2>/dev/null

real    0m8.306s
user    0m20.336s
sys     0m1.364s

$ time bash -c \
>     'npm run check-pretty && npm run lint && npm run flow && CI=1 npm run test' \
>     >/dev/null 2>/dev/null

real    0m12.427s
user    0m13.752s
sys     0m0.804s
```
A 50% savings is not bad at all—and the raw time saved should only
improve from here on, as the individual steps start taking more time.

wchargin-branch: custom-ci
2018-05-02 16:10:03 -07:00
William Chargin
79dff9a083
Add options to not rebuild on shell script tests (#188)
Summary:
This can be useful for speed, but it can also be important for
correctness (at least theoretically): if we run both these scripts
concurrently, then we don’t want one of them to squash the `bin`
directory while the other is about to invoke an executable therein.

One might note that the diffs to the two files in this commit are
virtually identical, and indeed the files themselves are quite similar.
I’d prefer to keep the duplication for now; if we really need a Bash
snapshot testing framework, we can factor one out.

Test Plan:
Run each script with `--help`, with `--build` and `--no-build`, and with
and without `-u`.

wchargin-branch: optional-rebuild
2018-05-02 15:09:46 -07:00
William Chargin
ee03c58357
Exclude punctuation surrounding URL references (#183)
Summary:
To avoid confusion, we simultaneously remove unused capturing groups.
This is not strictly necessary, but it makes the code less brittle.

Test Plan:
The newly added test fails before the change to `findReferences.js`.

wchargin-branch: url-punctuation
2018-05-01 14:51:18 -07:00
Dandelion Mané
acf5000547
Create GitHub reference edges (#182)
This commit adds the `addReferenceEdges()` method to the GitHub parser,
which examines all of the posts in the parsed graph and adds References
edges when it detects references between posts. As an example, `Hey
@wchargin, take a look at #1337` would generate two references.

We currently parse the following kinds of references:
- Numeric references to issues/PRs.
- Explicit in-repository url references (to any entity)
- @-author references

We do not parse:
- Cross-repository urls
- Cross-repository shortform (e.g. `sourcecred/sourcecred#100`)

`Parser.parse` calls `addReferenceEdges()`, so no change is required by
consumers to have reference edges added to their graphs.

The GitHub porcelain API layer now includes methods for retreiving the
entities referenced by a post.

Test plan:
This commit is tested both via snapshot tests, and explicit testing at
api layer. (Actually, the creation of the porcelain API layer was
prompted by wanting a cleaner way to test this commit.) I recommend
inspecting the snapshot tests for sanity, but mostly relying on the
tested behavior in api.test.js.
2018-04-30 20:19:38 -07:00
Dandelion Mané
f358c33e2a
findReferences now finds url references to users (#181)
For example, `https://github.com/decentralion` is now a valid url
reference to a GitHub author.

Test plan: Check the added test case.
2018-04-30 19:23:58 -07:00
Dandelion Mané
0c0bbf58e2
Update example repo data (#180)
I added a lot of new comments that have url references to different
types of GitHub entities, e.g. to pull request review comments.

The commit was generated by running the example repo fetcher, and
running yarn test and updating snapshots.
2018-04-30 19:21:41 -07:00
Dandelion Mané
a1d072846d
Add PR reviews and comments to GitHub api (#179)
Also, a slight re-organization of the GitHub api test code.
2018-04-30 18:22:03 -07:00
William Chargin
16e8e399e6
Add commit parent edges in the Git graph (#178)
Test Plan:
To verify the snapshot change, either believe the programmatic tests, or
use the following script to verify that the right edges were added:
```bash
#!/bin/bash
set -eu
example_repo="$(mktemp -d)"
yarn backend >/dev/null 2>/dev/null
node bin/createExampleRepo.js "${example_repo}"
expected() {
    git -C "${example_repo}" log --format='%H %P' \
        | awk '{ if (NF > 1) { print $2 " " $1 } }' \
        | sort
}
actual() {
    git diff HEAD^..HEAD | grep -A 1 -F -e src -e dst \
        | sed -n 's/^+.*"id": "\(.\+\)".*/\1 /p' \
        | tr -d $'\n' | cat - <(echo) \
        | fold -s -w 82 | sed 's/ *$//' \
        | sort
}
diff -u <(expected) <(actual)
```

wchargin-branch: graph-commit-parents
2018-04-30 18:08:40 -07:00
William Chargin
56ddb5cf9b
Load commit parent hashes into memory (#177)
Test Plan:
Snapshot updated with `./src/plugins/git/loadRepositoryTest.sh -u`; unit
tests suffice.

wchargin-branch: load-commit-parents
2018-04-30 18:01:16 -07:00
William Chargin
d5f468ca68
Fix Git plugin NodePayload definition (#176)
Summary:
Flow didn’t catch this because all the payloads are `{}` anyway.

Test Plan:
Note that every node and edge payload is now listed exactly once in the
correct spot for each of `{Node,Edge}{Type,Payload}`.

wchargin-branch: git-nodepayload
2018-04-30 17:08:49 -07:00
Dandelion Mané
0609201af4
Remove Graph.{in,out}Edges (#174)
This method removes `Graph.inEdges` and `Graph.outEdges`. As a
replacement to these two functions, `graph.neighborhood` now takes an
optional `direction` flag, which can be set to `"IN" | "OUT" | "ANY"`.

This reduces the surface area of the Graph API, and means that the same
pattern can be used when requesting in or out neighbors as is used when
requesting all neighbors.

This change generates significant churn in the test files, and in some
cases the tests are less elegant / show historicity, as they were
written for the type signature of `{in,out}Edges`, which just returns an
array of edges, and now receive an array of neighbors. I think this is
acceptable, and it's not worth re-writing the test.

In many cases, replacing existing calls to `{in,out}Edges` in our actual
codebase resulted in cleaner code, as `neighborhood` successfully
abstracts over the common patterns that users of `{in,out}Edges` were
implementing.

As a fly-by refactor, I also changed the `neighborAddress` part of the
`neighborhood` return value to `neighbor`. It's a little less
descriptive, but it's more concise, and flow is there to help ensure
it's used correctly.

Test plan: Note that CI passes. Inspect the test changes, and verify
that they are appropriate transformations for consuming the new API.
2018-04-30 15:32:28 -07:00
William Chargin
5af5748ed7
Convert in-memory Git repos to cred graphs (#169)
Test Plan:
This snapshot test is too unwieldy to actually read—it’s 1000 lines of
opaque SHAs and thrice-stringified JSON objects—so it should be
interpreted as a regression test only. The programmatic tests should
suffice.

wchargin-branch: wip-git-create-graph
2018-04-30 15:23:37 -07:00
William Chargin
f3a440244e
Fix all lint errors, adding a lint CI step (#175)
Test Plan:
Run `yarn lint` and `yarn travis` and observe success. Add something
that triggers a lint warning, like `const zzz = 3;`; re-run and observe
failures.

wchargin-branch: lint
2018-04-30 14:52:28 -07:00
Dandelion Mané
22ca77ed05
Add safe type coercion for GitHub api (#173)
In general, methods in the porcelain GitHub api may return multiple
types; e.g. a reference could be to an Issue, PullRequest, Comment,
Author (or more). To make working with the api more convenient while
maintaining safety, this commit adds a static `asType` method to each
Entity class, which confirms that type coercion is safe, and errors if
not.

This commit also adds `issueOrPRByNumber`, a convenience method, to
api.test.js.

Test plan: Check the API usage and verify that it is reasonable.
2018-04-30 10:07:23 -07:00