Commit Graph

1256 Commits

Author SHA1 Message Date
William Chargin 9d1200275e
Implement exhaustive fetching for our GitHub query (#123)
Summary:
This commit completes the ad hoc pagination solution described in #117:
we implement pagination specifically for our current query against the
GitHub API. This is done in such a way that reasonable additions to the
query will not be hard to implement—for instance, if we want to fetch
a new kind of field, the marginal cost is at most a bit of extra
copy-and-paste and some modifications to tests. However, we should
certainly still plan to implement the fully automatic pagination system
described in #117.

Running on TensorBoard with the default page limits takes 30–33 seconds
over 7 queries, uses 103 GitHub API points (out of 5000 for any given
hour), and produces a JSON file that is 8.7MB (when pretty-printed).
This all seems quite reasonable to me.

Test Plan:
Extensive unit tests added. The snapshots are quite readable, by design.

For a real-world test, you can `yarn start` the artifact viewer and use
the GUI to fetch the data for tensorflow/tensorboard.

To demonstrate that the fetching process gives the same results
regardless of the page size, follow these steps:

 1. In `fetchGithubRepo.js`’s `postQuery` function, insert a new
    statement `console.error("Posting query...")` at the beginning. (It
    is important to print to stderr instead of stdout.)
 2. Run `yarn backend` and then invoke `fetchGithubRepo.js` on a repo
    large enough to require pagination, like SourceCred or TensorBoard.
    Pipe the result to `shasum -a 256` and note the SHA.
 3. In `github/graphql.js`, change the page size constants near the top
    of the file. Re-run step 2. The number of queries that are posted
    will vary significantly as the page size constants vary, but the
    checksum should remain the same.
 4. Repeat until satisfied. (I tried three sets of values: the standard
    values, the standard values but all divided by 10, and all 5s.)

wchargin-branch: ad-hoc-pagination
2018-04-06 07:29:55 -07:00
William Chargin e82b56e52c
Implement a function to merge query results (#122)
Summary:
Once we execute the root query, find continuations, embed the
continuations into queries, and execute the continuation query, we will
need to merge the continuations’ results back into the root results.
This commit adds a function `merge` that will be suitable for doing just
that.

Test Plan:
New unit tests added, with 100% coverage. Run `yarn test`.

wchargin-branch: merge-query-results
2018-04-05 02:27:03 -07:00
William Chargin 751172ea77
Create pagination continuations for GitHub query (#121)
Summary:
Per #117, this is a first step toward at writing a pagination API that
specifically targets our current GitHub query. For design details, see
new module docs on `src/plugins/github/graphql.js`.

This commit modifies the core GitHub query and thus the
`example-repo.json` snapshot: we now request `endCursor` fields for all
pagination info, and we request the `id` of the root `repository` field.
The former is obviously necessary. The latter is necessary for the
repository to be consistent with other nodes that offer connections as
fields: we require an ID on the node containing the connection so that
we can have random access to it in a continuation selector.

Test Plan:
Unit tests added. You can also try out the generated continuation
queries for yourself: apply the patch below, run `yarn backend`, and
then run the `fetchGithubRepo.js` script on `sourcecred/sourcecred`.
This will output a nicely formatted query that you can paste directly
into GitHub’s API explorer and execute. (Note that, because this patch
is not fully polished, the query must be run against a repository that
has a continuation for every node type: more pages of issues, PRs,
comments, reviews, and review comments. This is due to an
easy-but-annoying-to-fix bug in the patch, not in the code included in
this commit.)

<details>
<summary>Patch for generating a continuations query</summary>

```diff
diff --git a/src/plugins/github/fetchGithubRepo.js b/src/plugins/github/fetchGithubRepo.js
index 789a20e..418c736 100644
--- a/src/plugins/github/fetchGithubRepo.js
+++ b/src/plugins/github/fetchGithubRepo.js
@@ -6,8 +6,13 @@

 import fetch from "isomorphic-fetch";

-import {stringify, inlineLayout} from "../../graphql/queries";
-import {createQuery, createVariables} from "./graphql";
+import {stringify, inlineLayout, multilineLayout} from "../../graphql/queries";
+import {
+  continuationsFromQuery,
+  continuationQuery,
+  createQuery,
+  createVariables,
+} from "./graphql";

 /**
  * Scrape data from a GitHub repo using the GitHub API.
@@ -66,8 +71,13 @@ function postQuery(payload, token) {
       if (x.errors) {
         return Promise.reject(x);
       }
-      ensureNoMorePages(x);
-      return Promise.resolve(x);
+      console.log(
+        stringify.body(
+          continuationQuery(Array.from(continuationsFromQuery(x.data))),
+          multilineLayout("  ")
+        )
+      );
+      throw new Error("STOPSHIP");
     });
 }

diff --git a/src/plugins/github/graphql.js b/src/plugins/github/graphql.js
index 9ea2592..9ead42b 100644
--- a/src/plugins/github/graphql.js
+++ b/src/plugins/github/graphql.js
@@ -39,11 +39,11 @@ import {build} from "../../graphql/queries";
  *
  * [1]: https://developer.github.com/v4/guides/resource-limitations/#node-limit
  */
-export const PAGE_LIMIT = 100;
-const PAGE_SIZE_ISSUES = 100;
-const PAGE_SIZE_PRS = 100;
-const PAGE_SIZE_COMMENTS = 20;
-const PAGE_SIZE_REVIEWS = 10;
+export const PAGE_LIMIT = 10;
+const PAGE_SIZE_ISSUES = 10;
+const PAGE_SIZE_PRS = 10;
+const PAGE_SIZE_COMMENTS = 3;
+const PAGE_SIZE_REVIEWS = 1;
 const PAGE_SIZE_REVIEW_COMMENTS = 10;

 /**
@@ -340,6 +340,36 @@ function* continuationsFromReview(
   }
 }

+/**
+ * Combine continuations into a query.
+ */
+export function continuationQuery(
+  continuations: $ReadOnlyArray<Continuation>
+): Body {
+  const nonces: string[] = continuations.map((_, i) => `_n${String(i)}`);
+  const nonceToIndex = {};
+  nonces.forEach((n, i) => {
+    nonceToIndex[n] = i;
+  });
+  const b = build;
+  const query = b.query(
+    "Continuations",
+    [],
+    continuations.map((continuation, i) =>
+      b.alias(
+        nonces[i],
+        b.field(
+          "node",
+          {id: b.literal(continuation.enclosingNodeId)},
+          continuation.selections.slice()
+        )
+      )
+    )
+  );
+  const body = [query, ...createFragments()];
+  return body;
+}
+
 /**
  * These fragments are used to construct the root query, and also to
  * fetch more pages of specific entity types.
```
</details>

wchargin-branch: ad-hoc-pagination-continuations
2018-04-05 02:23:17 -07:00
William Chargin 7711f01b84
Extract paginatable fragments of GitHub query (#120)
Summary:
Any time that we pull fields off a connection object, we may need to
repeat the query for subsequent pages. Therefore, such fragments will be
shared across multiple queries, and also shared within a query if we
need to fetch—say—more issue comments on two or more distinct issues.
This is a perfect use case for fragments.

This commit refactors the GitHub query to be organized in terms of
fragments, without changing the format of the results.

(We also take this opportunity to factor the page limits into
constants.)

Test Plan:
After running `yarn backend`, the `fetchGithubRepoTest.sh` test passes.

wchargin-branch: extract-github-query-fragments
2018-04-05 02:19:29 -07:00
William Chargin fbb6ec28db
Extract GitHub GraphQL code to a module (#119)
Summary:
Per #117, we want to develop an ad hoc pagination API written
specifically against the query that we use to interact with GitHub.
The pagination logic should be separate from the logic to actually fetch
the repo, but should be colocated with the query itself, so this commit
extricates the query from `fetchGithubRepo.js` into a new module.

Test Plan:
Existing tests pass, including `fetchGithubRepoTest.sh`.

wchargin-branch: extract-github-graphql
2018-04-05 00:08:52 -07:00
William Chargin 806c5e5687
Update example data for @decentralion name change (#118)
Summary:
This was created by re-crawling the GitHub repo via `fetchGithubRepo`,
and then updating Jest snapshots.

Test Plan:
Note that `fetchGithubRepoTest.sh` passes, so the data is now up to
date.

Inspect the snapshot, and note that the only changes are to change login
names from `dandelionmane` to `decentralion`. To do so automatically:
```bash
set -eu
diff_contents() {
    git difftool HEAD^ HEAD --extcmd=diff --no-prompt
}
! diff_contents | grep '^<' | grep -vF '"dandelionmane"'
! diff_contents | grep '^>' | grep -vF '"decentralion"'
```

wchargin-branch: decentralion-data
2018-04-04 23:29:30 -07:00
William Chargin e6f401df30
Add field aliases to structured GraphQL queries (#116)
Summary:
For pagination, we’ll want to query against multiple entities of the
same type. GraphQL uses aliases to facilitate this. This commit adds
support for aliases to our GraphQL query DSL.

Test Plan:
Inspect snapshot changes, and note that `yarn flow` and `yarn test`
pass.

wchargin-branch: graphql-aliases
2018-03-26 20:54:16 -07:00
William Chargin 2f50aa7364
Rename getAll{Nodes,Edges} to get{Nodes,Edges} (#115)
Test Plan:
Standard `yarn flow` and `yarn test` suffice.

wchargin-branch: get-no-all
2018-03-26 20:25:06 -07:00
William Chargin f93c9a8e42
Allow editing artifact descriptions (#114)
Summary:
It looks like this:
![Screenshot](https://user-images.githubusercontent.com/4317806/37943962-a8352b94-312e-11e8-9523-855a34020709.png)

Test Plan:
Interaction tests included. Run `yarn test`.

wchargin-branch: artifact-descriptions
2018-03-26 20:11:30 -07:00
William Chargin 99f24c420a
Convert ArtifactList to ArtifactGraphEditor (#112)
Summary:
This component now maintains a graph of just artifacts and the edges
among them. It owns the state, and notifies its parents of changes with
a callback. We treat the graph objects as properly immutable, copying
them on each change. So far, descriptions are always the empty string.

Test Plan:
Interaction tests added. Run `yarn test`.

wchargin-branch: artifact-graph-editor
2018-03-26 19:49:21 -07:00
Dandelion Mané 823e7da374
Cleanup GitHub plugin Node/Edge types (#113)
Update GitHub plugin to respect two new conventions:
- Node/Edge types are exported as UPPER_CASE_CONSTANTS
- Edge types are always verbs, which can be read as $src verb $dst

Test plan:
Flow passes. Inspect snapshot changes.
2018-03-26 19:24:14 -07:00
William Chargin 458744e77f
Redefine artifact plugin node and edge semantics (#111)
Summary:
This commit revises our implementations of node and edge types, and
specifies the semantics for artifact plugin IDs: we create IDs by
slugifying an artifact name and then resolving collisions

Test Plan:
Unit tests added. Run `yarn flow` and `yarn test`.

wchargin-branch: artifact-plugin-node-edge-semantics
2018-03-26 19:17:42 -07:00
William Chargin e57a16efbd Allow removing nodes and edges from the graph (#110)
Summary:
Wherein we change the semantics to allow\* dangling edges. This is
necessary for plugins that want to update nodes, such as changing a
description or other noncritical field.

\* (It was technically possible before by abusing `merge`, but now you
can just do it.)

Paired with @dandelionmane.

Test Plan:
Extensive tests added. Run `yarn flow` and `yarn test`.

wchargin-branch: allow-removing-from-graph
2018-03-26 17:40:19 -07:00
William Chargin 26508051a4 Add a covariant `copy` method on `Graph` (#109)
Summary:
Clients of `Graph` that wish to treat the graph as immutable will
benefit from a `copy` method. We should provide it on `Graph` instead of
asking clients to reimplement it because it affords us the opportunity
to get the type signature right: in particular, copying should allow
upcasting of the type parameters, even though `Graph` itself is
invariant.

Paired with @dandelionmane.

Test Plan:
Unit tests added. Run `yarn flow` and `yarn test`. To check that
downcasting is not allowed, change the types in the new static test case
in `graph.test.js` to be contravariant instead of covariant, and note
that `yarn flow` fails.

wchargin-branch: graph-copy
2018-03-26 13:58:12 -07:00
William Chargin 007cf88172
Separate artifact settings from GitHub graph fetch (#108)
Summary:
We need to know the repo owner and name for purposes other than fetching
the GitHub graph: for instance, fetching the `artifacts.json` file that
describes the artifact subgraph. It makes sense that these should be
settings global to the application. This commit separates a settings
component and the original GitHub graph fetcher.

This invalidates localStorage; you can manually migrate.

Paired with @dandelionmane.

Test Plan:
Note that the data continues to be stored in localStorage and that it is
updated on each keypress. Note that the state is properly passed around:
if you change the repository name from `example-repo` to `sourcecred`,
e.g., and click “Fetch GitHub graph”, then the proper graph is fetched.

wchargin-branch: separate-artifact-settings
2018-03-26 13:26:44 -07:00
Dandelion Mané d310561b94
Generate Edge ids automatically (#107)
* Generate Edge ids automatically

Adds edgeID in graph, which creates a string id from src and dest.
Provided that the plugin only uses edgeID for generating edge ids of
that type, these ids will be unique.

Modify the GitHub plugin to use edgeID. This allows the code and
typesignature to be simpler, and will be more consistent with other
plugins.

Test plan:
Carefully inspect the snapshots.
2018-03-26 12:05:25 -07:00
Dandelion Mané 043c37f9c6
Add a toString and fromString method on addresses (#106)
* Add a toString and fromString method on addresses

The toString and fromString methods use json-stable-stringify,
and we've modified other address code to use these methods.
As such, a number of the snapshots have changed (ordering).

Test plan:
Verify the new included unit tests are comprehensive. Inspect the
snapshots.
2018-03-26 11:32:25 -07:00
Dandelion Mané b2c13ac891
[refactor] Move node/edge types to addresses (#105)
This pull request adds a string type as a field of the address, thus canonicalizing that all Nodes and Edges have a Type. This allows for simpler PluginAdapters, and simpler implementation of the GitHub plugin (as it no longer needs to invent its own mechanism for storing types).

I explored making the Address interface generic, with a Type parameter that is a subtype of string. Unfortunately, Flow type resolution seems to have an exponential performance degradation with many subtyped parameters, and adding the extra type parameters to Graph.merge resulted in my flow instance locking. Maybe we can explore adding the subtypes later.

In the GitHub plugin, we entirely do away with the NodeIDs, but we still have EdgeIDs. I plan to remove the need for EdgeIDs in a separate PR which will enforce uniqueness of (srcAddress, dstAddress, pluginName, type) tuples, so that explicitly generating IDs for edges will be unnecessary. In the mean time, I bifurcated the makeAddress function in the GitHub plugin to makeEdgeAddress and makeNodeAddress.

Test plan:
Flow and unit tests all pass. Inspect snapshots, and verify that all the changes are reasonable. Note that since we order by serialized address, adding the type field to address has changed the snapshot order in a few cases.

Close #103
2018-03-26 10:36:49 -07:00
William Chargin 8fdf758cb9
Standardize on Enzyme shallow rendering (#104)
Summary:
This commit moves our existing frontend tests to use Enzyme’s shallow
rendering API <http://airbnb.io/enzyme/docs/api/shallow.html>. The
benefit over also using `react-test-renderer` is simply consistency (the
two are functionally equivalent); the benefits over `mount` are that
subcomponents cannot contaminate the test state (i.e., you’re only
testing one component at a time), that the resulting snapshots are more
readable because the root props are not shown, and that the
implementation is more efficient. This is a follow-up to #102.

In a case where we actually need a full DOM tree, we should still feel
free to use `mount`, but we haven’t needed that yet.

Test Plan:
Verify that the new `ContributionList.test.js.snap` represents the same
data as the old one.

wchargin-branch: standardize-enzyme-shallow
2018-03-21 18:28:06 -07:00
William Chargin feac85ad2c
Use Enzyme to test ContributionList dynamics (#102)
Summary:
This is our first dynamic test of a React component! Enzyme looks pretty
easy to use to me, for both snapshot tests and interaction simulation.

In doing so, we catch a minor bug in the edge case where a contribution
is not owned by any plugin (`colSpan`, not `colspan`). This edge case
does not appear in the sample data, but it does appear in the test data,
even prior to this commit. The previous renderer, `react-test-renderer`,
appears not to surface this error. Furthermore, this bug did not cause
any user-visible errors except a `console.error`.

Test Plan:
Inspect the snapshot file to make sure that it is reasonable. (The
existing test case has its snapshot regenerated due to formatting
differences between the two renderers.)

To test that the browser error is fixed, render a contribution list on a
GitHub graph but with an empty adapter set. One way to do this is to comment out line 7 of
`standardAdapterSet.js`; alternately, you can use the React Dev Tools to
select the `ContributionList` node, then run
```js
$r.props.adapters.adapters = {};
$r.forceUpdate();
```
Note subsequently that there is no console error and that the `<td>`s in
question span three columns.

wchargin-branch: contributionlist-dynamic-test
2018-03-21 17:35:17 -07:00
William Chargin 5dd5de306c
Allow filtering by type in the contribution list (#101)
Summary:
Filter options are “all contributions” or a specific plugin/type
combination. This includes a snapshot test for the static state. I’ll
add an interaction test in a subsequent commit.

Test Plan:
`yarn start`, fetch graph, play with the filtering options.

wchargin-branch: filter-contributions
2018-03-20 18:50:25 -07:00
Dandelion Mané 39fd3fa354
Make GitHub capitalization consistent within code (#100)
* Make GitHub capitalization consistent within code

We now never capitalize the H in GitHub within variable or function
names. We still capitalize it in comments or user facing strings.

Test plan:
Unit tests, the fetchGithubRepoTest.sh, and
`git grep itHub` only shows comment lines and print statements.

* Fix William's klaxon
2018-03-20 18:32:05 -07:00
Dandelion Mané 41cdf2d855
Implement GitHub reference detection (#98)
This commit only adds logic for finding references in GitHub posts,
either by #-numeric reference, or explicit urls. Adding the reference
edges to the graph will occur in a followon commit.

Test plan: New unit tests are included
2018-03-20 18:10:03 -07:00
William Chargin 61624a5dcf
Extract Aphrodite-management test code to testUtil (#99)
Summary:
Any tests that render Aphrodite-styled React elements will need to do
this, so it’s nice to have the code in one place.

wchargin-branch: aphrodite-testutils
2018-03-20 17:35:26 -07:00
Dandelion Mané 5b420c6294
Add EdgeTypes type in githubPlugin (#97) 2018-03-20 15:35:22 -07:00
William Chargin f02e0610be
Use structured GraphQL query API in GitHub fetcher (#94)
Summary:
In addition to being nicer on the eyes, this enables the query to be
statically analyzed (e.g., by an auto-pagination API) and used by other
modules.

Test Plan:
Manually running
```shell
$ yarn backend
$ GITHUB_TOKEN="<your_token>" src/plugins/github/fetchGitHubRepoTest.sh
```
succeeds.

wchargin-branch: use-structured-graphql-queries
2018-03-20 15:29:31 -07:00
William Chargin ab619432e1
Begin work on contributions and adapters (#93)
Summary:
This commit begins to extend the artifact editor to display
contributions. To display contributions from arbitrary plugins, we need
to communicate with those plugins somehow. We do so via an adapter
interface that plugins implement; included in this commit is an
implementation of this interface for the GitHub plugin (partially: we
punt on rendering).

This includes a snapshot test. The snapshot format is designed to be
human-readable and -auditable so that it can serve as documentation.

Test Plan:
Run the application with `yarn start`. Then, fetch a graph and watch as
its contributions appear in the view.

wchargin-branch: contributions-and-adapters
2018-03-20 14:26:02 -07:00
William Chargin e9ca833448
Expose `NodeTypes` from the GitHub plugin (#96)
Summary:
This is useful for metaprogramming. For instance, suppose we have an
object like this:
```js
const stringifiers = {
  ISSUE: (stringifyIssue: (Node<IssueNodePayload>) => string),
  COMMENT: (stringifyComment: (Node<CommentPayload>) => string),
  ...
}
```
How do we type this? We might try
```js
{[type: NodeType]: (Node<NodePayload>) => string}
```
but this is not correct, because `Node<IssueNodePayload>` is a subtype of
`Node<NodePayload>`, and `(_) => K` is contravariant, not covariant. (In
other words, a function from `Node<IssueNodePayload>` is not as general
as a function from `Node<NodePayload>`.) We need to express a dependency
between the object key and the value. We instead write:
```js
type TypedNodeToStringifier = <T: $Values<NodeTypes>>(
  T
) => (node: Node<$ElementType<T, "payload">>) => string;
(stringifiers: $Exact<$ObjMap<NodeTypes, TypedNodeToStringifier>>);
```
This expresses exactly (heh) the right type.

Test Plan:
Note that removing any of the elements of `NodeTypes` yields a Flow
error, due to the static assertion following the type definition.

wchargin-branch: node-types
2018-03-20 13:27:08 -07:00
Dandelion Mané 559ed393a9
Update example-repo json and snapshots (#95)
I added some new issues to sourcecred/example-repo to test unicode
support and parsing of extremely long issues titles.

This commit merely updates our example-repo json and the corresponding
snapshots.

Test plan:
Run testFetchGithubRepo.sh
Run unit tests
2018-03-20 13:18:38 -07:00
Dandelion Mané 02754d2523
GitHub parser now recognizes pull request reviews (#91)
Also, since there are now two types of things that are being
"contained" (comments and pull request reviews), I factored out an
addContainment method to avoid repeating that code.

To make our handling of PullRequestReviewComments and regular Comments
consistent, I modified our query string so that we now request urls on
PullRequestReviewComments. Also, since I didn't notice until closely
inspecting the snapshot that we had been adding payloads with some
undefined properties, I added a test to verify that every property on
every node and edge payload is defined.

I regenerated the example-repo data to reflect the change to query
string.

Test plan:
Verify that the snapshot changes are appropriate
Run standard tests
Run `yarn backend`
Run `GITHUB_TOKEN={your_token} ./src/plugins/github/fetchGithubRepoTest.sh`
2018-03-20 11:46:46 -07:00
Dandelion Mané ab6e0d91a8
Don't run prettify on build (#92)
Running prettify on build takes a whole second!
2018-03-20 11:46:30 -07:00
William Chargin 55225fd53e Save graph fetcher credentials in local storage
Test Plan:
Make a request, then refresh, and note that the fields are populated.

Paired with @dandelionmane.

wchargin-branch: graph-fetcher-localstore
2018-03-19 20:06:52 -07:00
William Chargin 5d80e39473 Fetch and parse GitHub graphs on the frontend
Summary:
This is quick and dirty. No error handling yet. We’ll soon save
credentials and repository to local storage.

Paired with @dandelionmane.

Test Plan:
Run `yarn start`, then enter your API key and specify the
`sourcecred/example-repo` repo. Note that the resulting graph is shortly
logged to the console.

wchargin-branch: fetch-parse-github-frontend
2018-03-19 20:06:52 -07:00
William Chargin 0eed384850 Enable creating artifacts in the artifact list
Summary:
Paired with @dandelionmane.

wchargin-branch: create-artifacts
2018-03-19 20:06:52 -07:00
William Chargin 0c3aa9c7ba Create a view-only artifact viewer
Summary:
Paired with @dandelionmane.

wchargin-branch: create-artifact-viewer
2018-03-19 20:06:52 -07:00
William Chargin 5d042c0008 Use isomorphic-fetch instead of node-fetch
Summary:
Paired with @dandelionmane.

Test Plan:
```
$ CI=true yarn test
$ yarn backend
$ GITHUB_TOKEN="<your_token>" src/plugins/github/fetchGitHubRepoTest.sh
```

wchargin-branch: isomorphic-fetch
2018-03-19 20:06:52 -07:00
William Chargin d18cb945af Add style support to the artifacts app
Test Plan:
Note that the header, when rendered, is magenta.

wchargin-branch: stylish-artifacts
2018-03-19 20:06:52 -07:00
William Chargin bbecf00615
Repurpose React app as artifact editor (#89)
Summary:
We’ll now start creating the artifact plugin. A large part of this will
be the user interface, including a GUI. For now, our build system just
builds a single React app, so we’re cannibalizing the main explorer to
serve this purpose.

Paired with @dandelionmane.

Test Plan:
The following still work:
  - `yarn test`
  - `yarn start`
  - `yarn build; (cd build; python -m SimpleHTTPServer)`

wchargin-branch: repurpose-react-app-as-artifact-editor
2018-03-19 15:25:23 -07:00
William Chargin 8f8d9c4564 Strip down explorer app to a barebones React app (#88)
Summary:
We’re not deleting it because it works with the build system and has the
service worker stuff from create-react-app, but we’ll soon repurpose it.

Paired with @dandelionmane.

Test Plan:
The following still work:
  - `yarn test`
  - `yarn start`
  - `yarn build; (cd build; python -m SimpleHTTPServer)`

wchargin-branch: dismantle-explorer
2018-03-19 15:09:11 -07:00
William Chargin ca85fdf234 Reorganize `src/` directory (#87)
Test Plan:
Note that tests still pass, and all changes to snapshot files are
verbatim moves.

wchargin-branch: reorg
2018-03-19 14:31:50 -07:00
Dandelion Mané 30600004e4
Implement the GitHub graph parser (#81)
The GitHub parser transforms GraphQL api data from GitHub into our Graph
data structure. This commit focuses on properly parsing Issues, Pull
Requests, Comments, and Users.

Test Plan:
Run the unit tests. Inspect the snapshot results (particuarly those for
individual pull requests or issues, which are easier to parse) and
verify that the output is appropriate.
2018-03-19 12:26:32 -07:00
William Chargin 274007c90d
Configure Webpack for backend applications (#84)
Summary:
Running `yarn backend` will now bundle backend applications. They’ll be
placed into the new `bin/` directory. This enables us to use ES6 modules
with the standard syntax, Flow types, and all the other goodies that
we’ve come to expect. A backend build takes about 2.5s on my laptop.

Created by forking the prod configuration to a backend configuration and
trimming it down appropriately.

To test out the new changes, this commit changes `fetchGitHubRepo` and
its driver to use the ES6 module system and Flow types, both of which
are properly resolved.

Test Plan:
Run `yarn backend`. Then, you can directly run an entry point via
```
$ node bin/fetchAndPrintGitHubRepo.js sourcecred example-repo "${TOKEN}"
```
or invoke the standard test driver via
```shell
$ GITHUB_TOKEN="${TOKEN}" src/backend/fetchGitHubRepoTest.sh
```
where `${TOKEN}` is your GitHub authentication token.

wchargin-branch: webpack-backend
2018-03-18 22:43:23 -07:00
William Chargin 291dcb17c3
Add a GraphQL structured query format (#77)
Summary:
See motivation in #76. Feel free to look at the new snapshot file to
inspect the structured representation and also the stringified output.

This implementation is sufficient to encode our query against the
GitHub v4 API; see the test plan below.

Test Plan:
Unit tests added; run `yarn flow && yarn test`.

This code has full coverage except for lines 260, 315, and 380 of
`queries.js`; these lines check invariants that should never be
violated.

You can also use the following steps to verify that the sample query is
valid GraphQL that produces the same results as our hand-written query:

 1. Apply the following hacky patch:

    ```diff
    diff --git a/src/backend/graphql/queries.test.js b/src/backend/graphql/queries.test.js
    index 52bdec7..c04a636 100644
    --- a/src/backend/graphql/queries.test.js
    +++ b/src/backend/graphql/queries.test.js
    @@ -3,6 +3,18 @@
     import type {Body} from "./queries";
     import {build, stringify, multilineLayout, inlineLayout} from "./queries";

    +function emitGitHubQuery(layout, filename) {
    +  const fs = require("fs");
    +  const path = require("path");
    +  const result = stringify.body(usefulQuery(), layout);
    +  const outputFilepath = path.join(__dirname, "..", filename);
    +  const outputText = `module.exports = ${JSON.stringify(result)};\n`;
    +  fs.writeFileSync(outputFilepath, outputText);
    +  console.log(`Wrote output to ${outputFilepath}.`);
    +}
    +emitGitHubQuery(multilineLayout("  "), "githubQueryMultiline.js");
    +emitGitHubQuery(inlineLayout(), "githubQueryInline.js");
    +
     describe("queries", () => {
       describe("end-to-end-test cases", () => {
         const testCases = {
    ```

 2. Run `CI=true yarn test`, and verify that the following two files
    written to `src/backend/` contain appropriate contents. You can just
    eyeball them, or check that they match my results:
    https://gist.github.com/wchargin/f37b99fd4ec345c9d2541c2dc53ceda9

 3. In `fetchGitHubRepo.js`, change the definition of `const query` to

    ```js
    const query = require("./githubQueryMultiline.js");
    ```

    Run

    ```shell
    GITHUB_TOKEN="<your_token_here>" src/backend/fetchGitHubRepoTest.sh
    ```

    and verify that it exits successfully.

 4. Repeat for `require("./githubQueryInline.js")`.

wchargin-branch: graphql-structured-queries
2018-03-18 22:35:20 -07:00
William Chargin 1083540d21
Parameterize `Graph` over node and edge payloads (#83)
Summary:
Closes #82. This affords clients type-safety without needing to
verbosely annotate every node or edge passed into the graph functions.
It also enables graph algorithms to be more expressive in their types:
for instance, the merge function now clearly indicates from its type
that the first graph’s nodes are passed as the first argument to the
node reducer, and the second graph’s nodes to the second. Clients can
upgrade immediately by using `Graph<*, *>`.

Thankfully, Flow supports variance well enough for this all to be
possible without too much trouble.

Test Plan:
Existing unit tests pass statically and at runtime. I added a test case
to demonstrate that merging works covariantly.

To see some failures, change `string` to an incompatible type, like
`number`, in the definitions of `makeGraph` in test functions for
conservatively rejecting graphs with conflicting nodes/edges
(ll. 446, 462).

wchargin-branch: parameterize-graph
2018-03-17 11:36:53 -07:00
Dandelion Mané 894d6a2291
Allow adding explicitly typed nodes/edges (#80)
Summary:
Flow doesn’t allow us to specify variance annotations in generic
function parameters, and doesn’t allow coercing `Node<T>` to
`Node<mixed>`. This forces us to put `any`s in our code, which…works.

Paired with @dandelionmane.

Test Plan:
New unit tests trivially pass dynamically, and now pass statically
(failing before the changes to `graph.js`).

wchargin-branch: explicitly-typed-nodes-edges
2018-03-17 00:28:33 -07:00
Dandelion Mané 1e791782d5
Allow redundant adds to the Graph (#79)
Graph.addNode and Graph.addEdge now allow adding the same node or edge
multiple times, provided that the duplicate adds are trying to insert
identical content.

This came up while prototyping the GitHub plugin; rather than create
myriad subgraphs and merge them, I found it convenient to construct a
single graph and iteratively add nodes. Since the same node may be
discovered multiple times (most notably user identities), there was a
need for a "conservative add" abstraction that adds a node if it doesn't
exist yet, but errors only if multiple adds conflict.

Since this behavior is generic and highly conservative, it seemed
appropriate to include in the graph class itself.

Test Plan:
The unit tests have been updated to include the new behavior.
2018-03-17 00:28:17 -07:00
Dandelion Mané d2501947a6
Regenerate example-repo testdata (#78)
I moved sourcecred/tiny-example-repository to sourcecred/example-repo
as it's simpler to remember. I also unarchived it and added comments
to an issue, so that we can create a simple test for issue parsing.

This commit merely updates SourceCred to point to example-repo with
the regenerated canoncial output.
2018-03-16 11:22:17 -07:00
Dandelion Mané 0e57b42095 Fetch GitHub repos using the GitHub v4 API (#75)
Summary:
It’s a whole new world of GraphQL! Our parser is now just a GraphQL
query that asks for exactly what we want and dumps it to a file. The
data exposed by the v4 API is also in a much nicer format than that of
the v3 API, so this is pretty much a universal improvement.

Currently, we do not handle pagination. We require that the repository
in question have fewer than a fixed number of issues, and comments per
issue, and reviews per PR, and review comments per PR, and so on. If
this limit is exceeded, the script will fail-fast with a nice error
message. To fix this, we’ll need to write a general-purpose pagination
API that allows traversing cursors at any level of the query.

Paired with @wchargin.

Test Plan:
Run

    $ GITHUB_TOKEN="your_token_here" src/backend/fetchGitHubRepoTest.sh

and verify that it exits with 0. Note that if you change this script’s
repository from `tiny-example-repository` to `sourcecred`, the script
correctly fails and outputs a useful diff.

wchargin-branch: github-v4-graphql
2018-03-15 14:56:25 -07:00
Dandelion Mané ef7160d7d4
Update README.md
Guide contributors to participate via issues.
2018-03-06 19:09:46 -08:00
Dandelion Mané fb00c35823
Factor eventide graph demo data to a new module (#71)
* Factor evertide graph demo data to a new module

It would be helpful to make our standard tiny graph available to other
test and demo instances, outside of just graph.test.js. This way we can
use it as a test case for the Graph Explorer.
2018-03-05 20:58:47 -08:00