Commit Graph

1164 Commits

Author SHA1 Message Date
William Chargin 2e0b17cef7
mentionsAuthorReference: remove legacy GraphQL dep (#963)
Summary:
This test has data in the old format, and uses the `RelationalView`
method that automatically translates it. As we prepare to delete that
code, we upgrade the underlying format of this test data. The end code
is nicer to read, too (e.g., we don’t need the `connection` helper
function).

Recommend reviewing with `git show -b`.

Test Plan:
Running `yarn test` suffices.

wchargin-branch: mentionsAuthorReference-remove-legacy-graphql
2018-10-31 12:36:48 -07:00
William Chargin b77db72c1d
github: remove some deps on github/graphql.js (#962)
Summary:
A number of modules depended on the legacy `github/graphql.js` module
solely to get at the `Reactions` enum object. As of #961, that object is
exposed from the much lighter-weight `graphqlTypes.js`. This patch
switches over the relevant imports, reducing our dependencies on this
legacy module and its large bundle size.

Test Plan:
It suffices to run `yarn flow` and verify that the two values being
imported are identical.

wchargin-branch: github-use-generated-enums
2018-10-31 12:24:06 -07:00
William Chargin 2377f1980f
schema: generate runtime constants for enum values (#961)
Summary:
We have a `const Reactions` convenience enum in `github/graphql.js`.
That value is useful, but that module is slated to die. This commit
extends our Flow type generation script to include these values.
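
As a rough illustration of what generated runtime constants could look
like, here is a minimal sketch. The helpers `makeEnum` and `renderEnum`
and the exact output format are assumptions for illustration, not the
generator in this repository; the values listed are a subset of GitHub's
reaction names.

```javascript
// Hypothetical helper: build a frozen object mapping each enum value to
// itself, so that `Reactions.HEART === "HEART"`.
function makeEnum(values) {
  const result = {};
  for (const value of values) {
    result[value] = value;
  }
  return Object.freeze(result);
}

// Hypothetical helper: render the same object as source text, as a code
// generator might emit it into a generated types file.
function renderEnum(name, values) {
  const lines = values.map((v) => `  ${v}: ${JSON.stringify(v)},`);
  return [`export const ${name} = Object.freeze({`, ...lines, "});"].join("\n");
}

const REACTION_VALUES = [
  "THUMBS_UP",
  "THUMBS_DOWN",
  "LAUGH",
  "HOORAY",
  "CONFUSED",
  "HEART",
];
const Reactions = makeEnum(REACTION_VALUES);
const generatedSource = renderEnum("Reactions", REACTION_VALUES);
```

Freezing the object keeps callers from accidentally mutating what is
meant to be a constant.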

Test Plan:
Existing unit tests suffice.

wchargin-branch: schema-generate-enums
2018-10-31 12:22:58 -07:00
Dandelion Mané 9bccb7661d
Change the GitHub default TTL to one week (#960)
While we wait for explicit configurability, one week is a better
default for the many SourceCred demos I maintain.

Test plan: n/a
2018-10-31 12:21:08 -07:00
Dandelion Mané b86b0b32ec
Add `analysisAdapter` (#950)
For #704, we're adding plugin adapters that are specific to the needs of
the analysis module. They have a simple scope: they just provide a way
to get the declaration, and to load the corresponding graph.

Adapters for the `git` and `github` plugins have been implemented, along
with unit tests.

Test plan: `yarn test` suffices.
2018-10-31 11:36:30 -07:00
William Chargin 1db146ba70
test: fix `test_load_example_github` (#959)
Summary:
Fixes #955. Our test runner does run `yarn backend` before Sharness
tests, but it builds the backend applications into a temporary directory
rather than squashing the repository backend (which is good!). The test
script has learned about this directory.

Test Plan:
Run `rm -rf ./bin && yarn test --full`, which fails before this commit
and passes after it.

wchargin-branch: fix-test-load-example-github
2018-10-31 11:12:36 -07:00
William Chargin a653638e42
test: run Sharness verbosely on nightly jobs (#958)
Summary:
Tests that run only on nightly builds (`yarn test --full`) and fail only
on CI (not locally) are a bit more inconvenient to debug when they fail.
This patch makes the `yarn test --full` script print all the
intermediate output in Sharness tests, which can be helpful. We don’t do
this for `yarn test` simply because it generates a ton of spam even on
successful tests.

Test Plan:

    $ yarn test --full 2>&1 | wc -l
    1173

wchargin-branch: test-full-verbose
2018-10-31 11:06:48 -07:00
William Chargin 8706fa9771
git: don’t warn when rendering unknown commits (#957)
Summary:
Fixes #953. See that issue for context.

Test Plan:
Unit tests updated. To see the change in action, load the SourceCred
data and expand @decentralion’s commits-authored to find commits that
were merged into non-`master` branches. Note that these commits are
rendered correctly (in the same way that they were before this patch),
and that there is no console error (new as of this patch).

![Screenshot](https://user-images.githubusercontent.com/4317806/47805669-1f98b580-dcf5-11e8-8683-8ee91f7f478a.png)

wchargin-branch: git-no-warn-on-unknown-commit
2018-10-31 10:45:25 -07:00
William Chargin 1e87fdaa07
test: skip `test_load_example_github` on CircleCI (#956)
Summary:
This is a quick patch for #955, pending investigation and fix.

Test Plan:

```shell
$ cd sharness/
$ ./test_load_example_github.t --long
ok 1 - should load sourcecred/example-github
ok 2 # skip should update the snapshot (missing UPDATE_SNAPSHOT of !CIRCLECI,LOADED_GITHUB,UPDATE_SNAPSHOT)
ok 3 - should be identical to the snapshot
# passed all 3 test(s)
1..3
$ CIRCLECI=true ./test_load_example_github.t --long
ok 1 # skip should load sourcecred/example-github (missing !CIRCLECI of !CIRCLECI,EXPENSIVE,HAVE_GITHUB_TOKEN)
ok 2 # skip should update the snapshot (missing UPDATE_SNAPSHOT,LOADED_GITHUB,!CIRCLECI of !CIRCLECI,LOADED_GITHUB,UPDATE_SNAPSHOT)
ok 3 # skip should be identical to the snapshot (missing LOADED_GITHUB,!CIRCLECI of !CIRCLECI,LOADED_GITHUB)
# passed all 3 test(s)
1..3
```

Ref.: <https://circleci.com/docs/2.0/env-vars/#built-in-environment-variables>

wchargin-branch: test-skip-failing-circleci
2018-10-31 10:45:12 -07:00
William Chargin 7604b11617
env: fix commit date formatting (#954)
Summary:
Fun facts:

  - `new Date().getDay()` does not return the day of the month. It
    returns the day of the _week_ as an integer `0 ≤ n ≤ 6`.
  - `new Date().getDate()` returns the day of the month from 1 to 31.
  - `new Date().getMonth()` really does return the month, but _this_ one
    is zero-based!

All this to say, my implementation in #901 was a bit flawed.
Why didn’t I notice at the time? I wrote and tested the change on
2018-10-01, which was a Monday, so both `getDay()` and `getDate()` were
in fact `1`. As for me failing to notice that `getMonth()` was off by
one—well, sometimes I’m very dumb.
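
For reference, a minimal sketch of a formatter that uses the right
accessors. The function names `pad` and `formatTimestamp` are
hypothetical, and the use of local time is an assumption; this is not
necessarily the code in `config/env`.

```javascript
// Left-pad a number with zeros to the given width.
function pad(n, width) {
  return String(n).padStart(width, "0");
}

// Format a Date as YYYYMMDD-HHMM, using local-time accessors.
function formatTimestamp(date) {
  const year = pad(date.getFullYear(), 4);
  const month = pad(date.getMonth() + 1, 2); // getMonth() is zero-based
  const day = pad(date.getDate(), 2); // day of the month, not getDay()
  const hours = pad(date.getHours(), 2);
  const minutes = pad(date.getMinutes(), 2);
  return `${year}${month}${day}-${hours}${minutes}`;
}

// Month index 9 is October.
const example = formatTimestamp(new Date(2018, 9, 30, 15, 18));
```

Here `example` is `"20181030-1518"`, matching the shape of the
`commitTimestamp` field.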

Test Plan:

```shell
$ NODE_ENV=development node -e '
>     require("./config/env");
>     console.log(process.env.SOURCECRED_GIT_STATE);
> '
{"commitHash":"f9bb75ef71c5","commitTimestamp":"20181030-1518","dirty":true}
$ date -I
2018-10-30
```

wchargin-branch: env-fix-date-formatting
2018-10-30 17:24:39 -07:00
William Chargin f9bb75ef71
release: v0.2.0 (#952)
Test Plan:
Remove the SourceCred output directory, run `yarn backend`, and load
data for `sourcecred/example-github` and `sourcecred/sourcecred`. Then,
run `yarn start` and note that the cred explorer still works. Finally,
note that `yarn test --full` passes.

wchargin-branch: release-v0.2.0
2018-10-30 15:18:19 -07:00
William Chargin ea575cf5da
changelog: add Mirror module entry (#951)
Summary:
This points to #622 as the blanket issue, though really there was a long
series of pull requests worth of implementation.

Test Plan:
None.

wchargin-branch: changelog-mirror
2018-10-29 19:49:11 -07:00
Dandelion Mané 4fdb65b866
Snapshot the results of running `sourcecred load` (#949)
This will enable us to test code that needs to consume the results of
running `sourcecred load`, e.g. plugin adapter code.

If you need to update the snapshot, run

    (cd sharness; UPDATE_SNAPSHOT=1 ./test_load_example_github.t)

Test plan: `yarn sharness-full` passes.

Paired with @wchargin
2018-10-30 02:34:23 +00:00
Dandelion Mané c4e2ec8839
Rename `PluginAdapter` to `AppAdapter` (#948)
Now that we're planning to add adapters for the `analysis` module, we
should rename the `PluginAdapter` to make it clear that it's scoped for
`app`.

Test plan: `yarn test` suffices
2018-10-30 00:26:22 +00:00
Dandelion Mané cb30023a02
Factor out plugin declarations (#947)
The plugin adapters are specific to `app/` and have logic for fetching
data from the backend, producing React nodes for descriptions, et
cetera.

However, they also have information that is generic to the plugin
itself: its name, its node/edge prefixes, and its types.

This commit factors out the generic info into a `PluginDeclaration`,
which is a type (rather than an interface). Then, the plugin adapter has
a `declaration` method which returns the declaration.

Current users of the plugin adapters get additional mechanical
complexity because they need to call `.declaration().property` rather
than `.property()`, but this is not too significant.
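
A minimal sketch of the resulting shape, with Flow annotations omitted.
The field names (`name`, `nodePrefix`, `edgePrefix`, `nodeTypes`,
`edgeTypes`) and the `ExampleAdapter` class are assumptions for
illustration, not the actual definitions:

```javascript
// Hypothetical declaration: plain data describing a plugin. It has no
// methods, which is why a type (rather than an interface) is enough.
const exampleDeclaration = Object.freeze({
  name: "Example",
  nodePrefix: "example/node",
  edgePrefix: "example/edge",
  nodeTypes: ["ISSUE", "COMMENT"],
  edgeTypes: ["REFERENCES"],
});

// Hypothetical adapter: app-specific behavior lives on the adapter, and
// generic plugin information is reached through `declaration()`.
class ExampleAdapter {
  declaration() {
    return exampleDeclaration;
  }
}

const adapter = new ExampleAdapter();
// Previously a caller might have written `adapter.name()`; now it goes
// through the declaration.
const pluginName = adapter.declaration().name;
```

Because the declaration is plain data, code outside `app/` can consume
it without depending on any frontend logic.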

The main benefit is that #704 is unblocked, as the cli `analyze` command
will be able to get necessary information from the declaration. As an
added benefit, the organization of plugin code is somewhat improved.

Test plan: `yarn test` suffices, but I also ran `yarn start` and
checked the UI behavior to be safe.
2018-10-30 00:13:11 +00:00
Dandelion Mané 5f2cc56172
Move `{Node,Edge}Type` to src/analysis/types.js (#946)
Test plan: `yarn test`

Part of ongoing work on #704
2018-10-29 23:49:53 +00:00
Dandelion Mané 917b793aca
Move some files from core/attribution/ to analysis/ (#944)
The `core/attribution/` folder has some code that really is "core" in
that it deals with very basic concepts around converting graphs to
markov chains and running PageRank on them, and some code that is less
"core", like for normalizing scores and doing analysis on them.

To make progress on #704, we need an intermediary directory that has
analysis-related code that is e.g. aware of Node and Edge types, and
weights on those types, and can use them to run weight-informed
PageRank. That code shouldn't live in the app directory (since it is not
coupled to the frontend rendering), but also shouldn't live in core
(since "core" is basically finalized code with fully baked abstractions,
and per #710, this is not true of the node/edge type system).

Thus, I've decided to create the `analysis` directory. To get that
directory started, I've moved the non-core code in `core/attribution/`
to `analysis/`.

Test plan: `yarn test` passes, which is all we need, since this is a
straightforward file rename.
2018-10-29 22:54:15 +00:00
Dandelion Mané 542e2f9723
Add skeleton of `sourcecred analyze` (#942)
The `analyze` command is the first step towards #704 and #703. When
fully implemented, it will run PageRank for a loaded repository,
generating a complete graph and cred attribution.

For now, this just adds a scaffold. It does basic argument parsing, and
has help text, but the actual command is not yet implemented.

Test plan:
Unit tests verify that the analyze command is hooked into `sourcecred`
and `sourcecred help`, and that it responds to the `--help` command and
parses its arguments appropriately.
2018-10-29 22:27:06 +00:00
Dandelion Mané d3e79e3c4e
Add integrity lines to yarn lockfile (#941)
Generated by running `yarn` with version 1.10
2018-10-29 20:59:13 +00:00
William Chargin 08219f98bf
fetchGithubRepo: use Mirror pipeline (#937)
Summary:
As of this commit, `node ./bin/sourcecred.js load` uses the Mirror code,
and the legacy continuation-fetching code is not included in the
`sourcecred.js` bundle.

We do not yet perform the commit prefetching described in #923. The code
should be plenty fast for repositories that merge pull requests at least
occasionally.

Test Plan:
Running `yarn test --full` passes. Loading `sourcecred/sourcecred` works
and generates a reasonable credit attribution. Loading it again
completes immediately.

wchargin-branch: fetchGithubRepo-mirror
2018-10-28 12:03:06 -07:00
William Chargin e2c99c418b
relationalView: use new data format (#934)
Summary:
This makes significant progress toward #923. As of this commit, it is
possible to use the Mirror module for the whole loading pipeline. This
process may be slow for repositories that do not use pull requests at
all (more precisely, that have large connected commit subgraphs none of
whose nodes is the merge commit of a pull request; see #920 for details)
so it is not yet the default codepath.

Test Plan:
Existing unit tests should suffice. For extra testing, I’ve added a
script that fetches a repository both via the old continuations logic
and the new Mirror logic, then constructs relational views and checks
whether the data is the same. For `example-github`, the views are
identical. For `sourcecred`, they are not: the old continuations logic
erroneously omits two commits, which the Mirror logic includes.

You can run the test like this:

```
$ node ./bin/testContinuations.js \
> sourcecred sourcecred MDEwOlJlcG9zaXRvcnkxMjAxNDU1NzA= \
> /tmp/continuations.json /tmp/mirror.json \
> 2> >(jq . >&2)
{
  "child": "0d38dde23a6de831315f3643a7d2bc15e8df7678",
  "parent": "cb8ba0eaa1abc1f921e7165bb19e29b40723ce65",
  "type": "UNKNOWN_PARENT_OID"
}
{
  "child": "d152f48ce4c2ed1d046bf6ed4f139e7e393ea660",
  "parent": "de7a8723963d9cd0437ef34f5942a071b850c0e7",
  "type": "UNKNOWN_PARENT_OID"
}
Different. Saving to disk...
```

Use `diff -u <(jq . /tmp/continuations.json) <(jq . /tmp/mirror.json)`
to inspect the differences, and note that exactly the two missing
commits have been added and that there are no other changes. (The diff
is small: just 51 lines of nicely formatted JSON.) The full log is here:
<https://gist.github.com/wchargin/e159cac9dcf3cc3b1efbd54f59e24e0b>

I also generated the `sourcecred/sourcecred` cred attribution and viewed
it with `yarn start`, which seems to work fine.

wchargin-branch: relationalview-new-data-format
2018-10-23 16:42:49 -07:00
William Chargin 58e98124ac
relationalView: make snapshots order-invariant (#933)
Summary:
An upcoming commit will happen to change the order in which commits are
ingested. This is not an observable change, and should not cause a
snapshot failure.
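
One common way to get this property is to sort records by a stable key
before serializing them. The sketch below shows that idea only; the
`canonicalize` helper and the `oid` key are hypothetical, and the actual
test code may take a different approach.

```javascript
// Hypothetical helper: return a sorted copy of the records so that the
// order of ingestion does not affect the serialized snapshot.
function canonicalize(records, keyOf) {
  return records.slice().sort((a, b) => {
    const ka = keyOf(a);
    const kb = keyOf(b);
    return ka < kb ? -1 : ka > kb ? 1 : 0;
  });
}

// The same two commits, ingested in different orders.
const first = [{oid: "b2"}, {oid: "a1"}];
const second = [{oid: "a1"}, {oid: "b2"}];

const sameSnapshot =
  JSON.stringify(canonicalize(first, (c) => c.oid)) ===
  JSON.stringify(canonicalize(second, (c) => c.oid));
```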

Test Plan:
Inspection.

wchargin-branch: relationalview-snapshots-order-invariant
2018-10-23 10:50:18 -07:00
William Chargin 24242eccfe
chore: update flow-typed libdefs (#932)
Summary:
Generated with `flow-typed install --skip --overwrite`.

Test Plan:
`yarn flow` passes.

wchargin-branch: flow-typed-update
2018-10-22 10:05:11 -07:00
William Chargin 2687bc5115
flow-typed: update prettier libdef metadata (#931)
Summary:
Generated with `flow-typed install prettier@1.13.4 --overwrite`. The
changes in #925 have been merged upstream; this pulls in the updated
signature and version.

Test Plan:
`yarn flow` passes.

wchargin-branch: flow-typed-update-prettier
2018-10-22 10:02:51 -07:00
William Chargin 993de9303a
github: translate old format to structured format (#930)
Summary:
This implements the translation module described in #923. See that issue
for context.

Test Plan:
This is a mostly straightforward translation from one strongly typed
data structure to another, so Flow handles most of it.

As a check on the snapshot, run:

```
$ grep -e oid -e target -e mergeCommit \
> src/plugins/github/__snapshots__/translateContinuations.test.js.snap
      "target": Object {
        "oid": "6bd1b4c0b719c22c688a74863be07a699b7b9b34",
            "oid": "c430bd74455105f77215ece51945094ceeee6c86",
                "oid": "6d5b3aa31ebb68a06ceb46bbd6cf49b6ccd6f5e6",
                    "oid": "0a223346b4e6dec0127b1e6aa892c4ee0424b66a",
                        "oid": "ec91adb718a6045b492303f00d8e8beb957dc780",
                        "oid": "ecc889dc94cf6da17ae6eab5bb7b7155f577519d",
                            "oid": "ec91adb718a6045b492303f00d8e8beb957dc780",
        "mergeCommit": Object {
          "oid": "0a223346b4e6dec0127b1e6aa892c4ee0424b66a",
              "oid": "ec91adb718a6045b492303f00d8e8beb957dc780",
              "oid": "ecc889dc94cf6da17ae6eab5bb7b7155f577519d",
                  "oid": "ec91adb718a6045b492303f00d8e8beb957dc780",
        "mergeCommit": Object {
          "oid": "6d5b3aa31ebb68a06ceb46bbd6cf49b6ccd6f5e6",
              "oid": "0a223346b4e6dec0127b1e6aa892c4ee0424b66a",
                  "oid": "ec91adb718a6045b492303f00d8e8beb957dc780",
                  "oid": "ecc889dc94cf6da17ae6eab5bb7b7155f577519d",
                      "oid": "ec91adb718a6045b492303f00d8e8beb957dc780",
        "mergeCommit": null,
```

Cross-check this against [the example-github commits][commits] thus:

  - Note that commit `6bd1b4c` is the head commit, and is thus the root
    commit of the `target` chain.
  - Note that commits `0a22334` and `6d5b3aa`, which were merged via
    pull request, appear twice each: once in the history from head, and
    once as the merge commit of a pull request.
  - Note that commit `0a22334` has two parents at each occurrence.
  - Note that the unmerged pull request’s merge commit is `null`.

[commits]: https://github.com/sourcecred/example-github/commits/master

To run this on real-world data, apply the following patch:

```diff
diff --git a/src/plugins/github/fetchGithubRepo.js b/src/plugins/github/fetchGithubRepo.js
index 6ac201af..b14ca760 100644
--- a/src/plugins/github/fetchGithubRepo.js
+++ b/src/plugins/github/fetchGithubRepo.js
@@ -11,6 +11,7 @@ import {stringify, inlineLayout, type Body} from "../../graphql/queries";
 import {createQuery, createVariables, postQueryExhaustive} from "./graphql";
 import type {GithubResponseJSON} from "./graphql";
 import type {RepoId} from "../../core/repoId";
+import translateContinuations from "./translateContinuations";

 /**
  * Scrape data from a GitHub repo using the GitHub API.
@@ -44,6 +45,11 @@ export default function fetchGithubRepo(
     payload
   ).then((x: GithubResponseJSON) => {
     ensureNoMorePages(x);
+    console.warn("Translating continuations...");
+    for (const w of translateContinuations(x).warnings) {
+      console.warn(w);
+    }
+    console.warn("Done.");
     return x;
   });
 }
```

Then run:

```
$ yarn backend >/dev/null 2>/dev/null; echo $?
0
$ node ./bin/sourcecred.js load sourcecred/sourcecred --plugin github 2>&1 |
> ts -s '%.s'
55.015740 Translating continuations...
55.037217 { type: 'UNKNOWN_PARENT_OID',
55.037273   child: '0d38dde23a6de831315f3643a7d2bc15e8df7678',
55.037290   parent: 'cb8ba0eaa1abc1f921e7165bb19e29b40723ce65' }
55.037309 { type: 'UNKNOWN_PARENT_OID',
55.037336   child: 'd152f48ce4c2ed1d046bf6ed4f139e7e393ea660',
55.037359   parent: 'de7a8723963d9cd0437ef34f5942a071b850c0e7' }
55.037383 Done.
```

Note that the two commits in question were each merged into a non-master
branch, in #28 and #329 respectively. Note also that translating these
continuations took just 22 milliseconds.

wchargin-branch: github-translate-continuations
2018-10-22 10:01:49 -07:00
William Chargin 6499df6b6b
github: fix misc. errors in old GraphQL system (#929)
Summary:
This fixes the following issues:

  - Pull request reviews actually do not have reactions.
  - We must fetch the `id` of a `Ref`.
  - We must fetch the `id` of a `Commit`, `Tree`, `Blob`, or `Tag`, and
    should also fetch its `oid`.
  - Repository owners cannot be bots.
  - Commit and reaction authors cannot be bots, organizations, or
    `undefined`.

Test Plan:
Running `yarn test --full` passes, and the snapshot diff is clearly
correct.

wchargin-branch: github-fix-up-continuations
2018-10-22 09:50:24 -07:00
William Chargin 889febb7f6
github: add GraphQL schema and Flow types (#928)
Summary:
The included schema is forked from the one in `graphql/demo.js`.
Primitive types have been added, and the `parents` connection has been
added to commit objects per #920. (We do not include this in the demo
script because without prefetching it would take a long time to load.)

Test Plan:
Unit tests added; run `yarn unit`. Then run `yarn backend` and verify
that `node ./bin/generateGithubGraphqlFlowTypes.js` generates exactly
the same output as in the types file:

```
$ node ./bin/generateGithubGraphqlFlowTypes.js |
> diff -u - ./src/plugins/github/graphqlTypes.js
$ echo $?
0
```

Change the `graphqlTypes.js` file and verify that `yarn unit` fails.

As the build config has been changed, a `yarn test --full` is warranted.
It passes.

Finally, I have manually verified that the schema is consistent with the
documentation at <https://developer.github.com/v4/object/repository/>
and related pages.

wchargin-branch: github-schema-flow-types
2018-10-19 09:04:54 -07:00
William Chargin 04f7e9ea8c
graphql: generate Flow types from a schema (#927)
Summary:
Generating Flow types from a structured schema is both straightforward
and terribly useful. This commit implements it.
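
To give a flavor of the idea, here is a deliberately tiny sketch. The
schema shape used here (typenames mapping field names to primitive Flow
type names) and the `generateFlowTypes` function are assumptions; the
real schema module and generator are richer than this.

```javascript
// Hypothetical schema shape: each object type maps field names to the
// name of a primitive Flow type.
const schema = {
  Commit: {oid: "string", message: "string"},
  Issue: {number: "number", title: "string"},
};

// Emit one exact, read-only object type per schema type. Types and
// fields are sorted so that the output is deterministic.
function generateFlowTypes(schema) {
  const blocks = Object.keys(schema)
    .sort()
    .map((typename) => {
      const fields = schema[typename];
      const lines = Object.keys(fields)
        .sort()
        .map((field) => `  +${field}: ${fields[field]},`);
      return [`export type ${typename} = {|`, ...lines, "|};"].join("\n");
    });
  return blocks.join("\n\n") + "\n";
}

const output = generateFlowTypes(schema);
```

Deterministic output matters here because the generated text is meant
to be checked in and compared against a snapshot.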

Test Plan:
Inspect the snapshot for correctness manually. Then, copy it into a new
file, remove backslash-escapes, and verify that it passes Flow.
A subsequent commit will generate types for the actual GitHub schema.

wchargin-branch: graphql-generate-flow-types
2018-10-19 09:02:49 -07:00
William Chargin e787db53c4
schema: allow annotating primitive field types (#926)
Summary:
Prior to this change, primitive fields were un[i]typed. This commit
allows adding type annotations. Such annotations do not change the
semantics at all, but we will be able to use them to generate Flow types
corresponding to a schema.

This commit also strengthens the validation on schemata to catch some
errors that would previously have gone unnoticed at schema construction
time: e.g., a node reference to a type that does not exist.
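
A sketch of the kind of check involved, assuming a hypothetical schema
shape in which a field is either `{type: "PRIMITIVE"}` or
`{type: "NODE", elementType: "<typename>"}`. The `validateSchema`
function is illustrative and is not the module's actual API.

```javascript
// Hypothetical validator: throw at construction time if any node
// reference points at a type that the schema does not define.
function validateSchema(schema) {
  const hasType = (name) =>
    Object.prototype.hasOwnProperty.call(schema, name);
  for (const typename of Object.keys(schema)) {
    const fields = schema[typename];
    for (const fieldname of Object.keys(fields)) {
      const field = fields[fieldname];
      if (field.type === "NODE" && !hasType(field.elementType)) {
        throw new Error(
          `${typename}.${fieldname}: unknown type "${field.elementType}"`
        );
      }
    }
  }
  return schema;
}

// Valid: `Issue.author` refers to `User`, which exists.
const goodSchema = validateSchema({
  User: {login: {type: "PRIMITIVE"}},
  Issue: {author: {type: "NODE", elementType: "User"}},
});
```

Dropping the `User` entry from that schema would make `validateSchema`
throw immediately, rather than failing later when the schema is used.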

Test Plan:
Unit tests updated, retaining full coverage. The demo script continues
to work when loading `example-github` or `sourcecred`.

wchargin-branch: schema-annotated-primitives
2018-10-19 09:00:52 -07:00
William Chargin 4558d775a5
flow-typed: fix Prettier libdefs (#925)
Summary:
This patches in the following two pull requests from upstream:

  - https://github.com/flow-typed/flow-typed/pull/2856
  - https://github.com/flow-typed/flow-typed/pull/2860

The first exports types so that they can be used in client code.
The second fixes a broken type definition.

Test Plan:
See the test plans in those two PRs.

wchargin-branch: flow-typed-fix-prettier
2018-10-19 08:54:00 -07:00
William Chargin ee4900616b
github: fetch commit parent oids (#924)
Summary:
To ease the transition from manual continuation resolution to the Mirror
API, we update the old system to fetch commit parent OIDs, as described
in #923.

Test Plan:
To check that the continuations are wired correctly, apply the following
patch to force continuations to be followed at every step:

```diff
diff --git a/src/plugins/github/graphql.js b/src/plugins/github/graphql.js
index 05761ca..a21a364 100644
--- a/src/plugins/github/graphql.js
+++ b/src/plugins/github/graphql.js
@@ -53,7 +53,7 @@ const PAGE_SIZE_COMMENTS = 20;
 const PAGE_SIZE_REVIEWS = 5;
 const PAGE_SIZE_REVIEW_COMMENTS = 10;
 const PAGE_SIZE_COMMIT_HISTORY = 100;
-const PAGE_SIZE_COMMIT_PARENTS = 5;
+const PAGE_SIZE_COMMIT_PARENTS = 0;
 const PAGE_SIZE_REACTIONS = 5;

 /**
@@ -358,6 +358,7 @@ function* continuationsFromCommit(
 ) {
   const b = build;
   if (result.parents && result.parents.pageInfo.hasNextPage) {
+    console.warn(result.parents.pageInfo);
     yield {
       enclosingNodeType: "COMMIT",
       enclosingNodeId: nodeId,
@@ -366,7 +367,7 @@ function* continuationsFromCommit(
           b.field(
             "parents",
             {
-              first: b.literal(PAGE_LIMIT),
+              first: b.literal(1),
               after: b.literal(result.parents.pageInfo.endCursor),
             },
             [b.fragmentSpread("commitParents")]
```

(It is important that the initial page limit be `0` and not (say) `1`,
because the `defaultBranchRef` is likely to have just one parent; by
using `0`, we test its continuations as long as it has at least one
parent.)

Then run `./src/plugins/github/fetchGithubRepoTest.sh` and note that the
test passes and the output is

```
{ hasNextPage: true, endCursor: null }
{ hasNextPage: true, endCursor: null }
{ hasNextPage: true, endCursor: null }
{ hasNextPage: true, endCursor: null }
{ hasNextPage: true, endCursor: null }
{ hasNextPage: true, endCursor: null }
{ hasNextPage: true, endCursor: null }
{ hasNextPage: true, endCursor: 'MQ==' }
{ hasNextPage: true, endCursor: 'MQ==' }
```

Note that the test output (as updated in this commit) includes commits
with a unique parent, a commit with no parents, and a commit with two
parents.

I also ran `node ./bin/sourcecred.js load` on sourcecred/sourcecred and
ipfs/js-ipfs. Each worked, and the resulting credit attributions loaded
fine.

wchargin-branch: github-commit-parent-oids
2018-10-18 19:16:44 -07:00
William Chargin 7c5923959e
github: fetch dates from commits (#919)
Summary:
This has two benefits:

  - The dates on commits are data that we will probably want when we add
    timestamps to authorship edges to accommodate time-weighted cred.

  - Once the mirror module is integrated with the GitHub plugin, we’ll
    want to fetch dates on commits, because this is the only real-world
    test case for a nested field that contains a primitive field (as
    opposed to a node reference), so it’ll be nice to be continually
    exercising that somewhat-edge case.

Date strings are in commit-local time and do not depend on the time zone
of the requester (in contrast to [cursors]). For example, on SourceCred:

```shell
$ time node ./bin/fetchAndPrintGithubRepo.js \
> sourcecred sourcecred "${GITHUB_TOKEN}" |
> jq -rc '
>     .repository.defaultBranchRef.target.history.nodes[]
>     .author?.date[-6:]
> ' | sort | uniq -c
      1 +03:00
      6 -04:00
    717 -07:00
     58 -08:00
```

[cursors]: <https://github.com/sourcecred/sourcecred/pull/129#issuecomment-382970474>

Test Plan:
The snapshot contains 8 instances of `oid` and 8 instances of `date`,
which is good (each of these properties appears exactly once on each
commit, and nowhere else). Running `yarn test --full` passes.

wchargin-branch: github-commit-dates
2018-10-16 22:54:39 -07:00
William Chargin 1155c439b9
mirror: add support for shallow-nested fields (#918)
Summary:
This commit follows up on the previous two pull requests by drawing the
rest of the owl.

Resolves #915.

Test Plan:
Unit tests included.

To verify the snapshot change, open the snapshot file, and copy
everything from `query TestUpdate {` through the matching `}`, not
including the enclosing quotes. Strip all backslashes (Jest adds them).
Post the resulting query to GitHub and verify that it completes
successfully and that the result contains a commit with an `author`.

In other words, `xsel -ob | tr -d '\\' | ghquery | jq .` with [ghquery].

[ghquery]: https://gist.github.com/wchargin/630e03e66fa084b7b2297189615326d1

The demo entry point has also been updated. For an end-to-end test, you
can run the following command to see a commit with a `null` author (with
the current state of the repository) and a commit with a non-`null`
author:

```
$ node bin/mirror.js /tmp/mirror-example.db \
>     Repository MDEwOlJlcG9zaXRvcnkxMjMyNTUwMDY= \
>     3600 2>/dev/null |
> jq '(.defaultBranchRef.target, .pullRequests[0].mergeCommit) | {url, author}'
{
  "url": "6bd1b4c0b7",
  "author": {
    "date": "2018-09-12T19:48:21-07:00",
    "user": null
  }
}
{
  "url": "0a223346b4",
  "author": {
    "date": "2018-02-28T00:43:47-08:00",
    "user": {
      "id": "MDQ6VXNlcjE0MDAwMjM=",
      "__typename": "User",
      "url": "https://github.com/decentralion",
      "login": "decentralion"
    }
  }
}
```

You can also check that it is possible to fetch the whole SourceCred
repository (ID: `MDEwOlJlcG9zaXRvcnkxMjAxNDU1NzA=`).

wchargin-branch: mirror-shallow
2018-10-04 16:00:07 -07:00
William Chargin 49f0803a7a
mirror: reflect nested fields in schema info (#917)
Summary:
See #915 for context. This adds nested field data to the “useful info”
data structure added in #857.

Test Plan:
Unit tests for `_buildSchemaInfo` updated.

wchargin-branch: mirror-schemainfo-shallow
2018-10-04 15:30:04 -07:00
William Chargin 3d2206088c
schema: support shallow-nested object fields (#916)
Summary:
See #915 for context. This commit changes the `schema` module only.

I had a hard time picking names that clearly distinguish the top-level
field on the object and the subfields that it contains. @decentralion
and I independently came up with “nest” and “egg”. It’s a bit colorful,
but it’s certainly easy to remember which one is which, and it doesn’t
conflict with existing notions like “parent”/“child”.

Test Plan:
Unit tests expanded slightly, retaining full coverage.

wchargin-branch: schema-shallow
2018-10-04 15:28:58 -07:00
William Chargin d8d857fdd3
ci: remove Travis (#914)
Summary:
Closes #902. CircleCI continues to work fast and well, on both the
commit and nightly workflows. Travis continues to be slower. It is time.

I have disabled Travis for `sourcecred/sourcecred` via:
<https://travis-ci.org/profile/sourcecred>

This commit removes the vestigial configuration files and code.

Test Plan:
Running `yarn test` still works. Running `yarn test --full` still works,
and properly invokes `sharness-full` and the various extra tests.
Running `git grep -i travis` yields no results.

wchargin-branch: ci-remove-travis
2018-10-04 12:31:14 -07:00
William Chargin 48b68b221a
link: remove `styles` attribute from child (#911)
Summary:
By using `<a {...this.props}>{children}</a>`, we were forwarding the
Aphrodite selectors as `styles`. This caused the static HTML for the
page to include `<a styles="[object Object]">`, which is annoying.
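
The general shape of the fix is to destructure the component-only props
out before forwarding the rest. The sketch below shows this without
React; the `splitLinkProps` helper is hypothetical and is not the
component's actual code.

```javascript
// Hypothetical helper: separate the props consumed by the component
// itself from the attributes that are safe to forward to the `<a>`.
function splitLinkProps(props) {
  const {styles, children, ...domAttributes} = props;
  return {styles, children, domAttributes};
}

const split = splitLinkProps({
  href: "/prototype",
  styles: [{color: "blue"}],
  children: "Prototype",
});

// Stringifying an object yields the text that showed up in the static
// HTML when `styles` was forwarded as an attribute.
const stringified = String({color: "blue"});
```

Here `split.domAttributes` contains only `href`, and `stringified` is
`"[object Object]"`.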

Test Plan:
Unit tests extended: they fail before this change and pass after it.
Also clicked a router link and an external link in the application.

wchargin-branch: link-child-styles
2018-10-03 12:14:07 -07:00
William Chargin 16ed92549a
readme: replace Travis badge with CircleCI badge (#912)
Summary:
See #902 for context.

Test Plan:
Push to GitHub; observe badge in rendered README.

wchargin-branch: readme-circleci-badge
2018-10-03 11:43:10 -07:00
William Chargin 55950f5354
mirror: add an end-to-end `update` function (#909)
Summary:
This completes the minimum viable public API for the `Mirror` class. As
described on the docstring, the typical workflow for a client is:

  - Invoke the constructor to acquire a `Mirror` instance.
  - Invoke `registerObject` to register a root object of interest.
  - Invoke `update` to update all transitive dependencies.
  - Invoke `extract` to retrieve the data in structured form.

It is the third step that is added in this commit.

In this commit, we also include a command-line demo of the Mirror
module, which exercises exactly the workflow above with a hard-coded
GitHub schema. This can be used to test the module’s behavior with
real-world data. I plan to remove this entry point once we integrate
this module into SourceCred.

This commit makes progress toward #622.

Test Plan:
Unit tests included for the core functionality. The imperative shell
does not have automated tests. You can test it as follows.

First, run `yarn backend` to build `bin/mirror.js`. Then, run:

```shell
$ node bin/mirror.js /tmp/mirror-demo.db \
> Repository MDEwOlJlcG9zaXRvcnkxMjMyNTUwMDY= \
> 60
```

Here, the big base64 string is the ID for the sourcecred/example-github
repository. (This is listed in `graphql/demo.js`, alongside the ID for
the SourceCred repository itself.) The value 60 is a TTL in seconds. The
database filename is arbitrary.

This will print out a ton of output to stderr (all intermediate queries
and responses, for debugging purposes), and then the full contents of
the example repository to stdout.

Run the command again, and it should finish instantly. On my machine,
the main function runs faster than the Node startup time (50ms vs 60ms).

Then, re-run the command, changing the TTL from `60` to `1`. Note that
it sends off some queries and then prints the same repository.
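
The effect of the TTL on these runs can be sketched roughly as follows.
The `isStale` function and its millisecond timestamps are hypothetical;
this illustrates the idea only and is not the Mirror module's actual
logic.

```javascript
// Hypothetical check: an object needs refetching if it has never been
// fetched, or if its last update is older than the TTL.
function isStale(lastUpdateMs, nowMs, ttlSeconds) {
  if (lastUpdateMs == null) {
    return true;
  }
  return nowMs - lastUpdateMs > ttlSeconds * 1000;
}

const now = 1000000;
const fetchedThirtySecondsAgo = now - 30 * 1000;

// With a TTL of 60 seconds, data fetched 30 seconds ago is still fresh,
// so nothing needs to be queried.
const staleAtSixty = isStale(fetchedThirtySecondsAgo, now, 60);
// With a TTL of 1 second, the same data is stale and gets refetched.
const staleAtOne = isStale(fetchedThirtySecondsAgo, now, 1);
```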

It is safe to kill the program at any time, either while waiting for a
response from GitHub or while processing the results, because the mirror
module takes advantage of SQLite’s transaction-safety. Intermediate
updates will be persisted, so you’ll lose just a few seconds of
progress.

You can also of course dive into the generated database yourself to
explore the data. It’s good fun.

wchargin-branch: mirror-e2e-update
2018-10-02 21:06:01 -07:00
William Chargin 23e56f481a
mirror: paginate own-data node updates (#908)
Summary:
GitHub has an undocumented limit on the number of IDs that can be
provided to the top-level `nodes` connection. This is silly, because we
can just spread the IDs over multiple identical connections. This commit
implements the logic to do so.
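
In outline, the approach looks like the sketch below. The names `chunk`
and `buildNodesQuery` and the `nodes_<i>` alias scheme are assumptions
for illustration; the limit of 100 is the one reported by the error
message in the test plan.

```javascript
// Limit taken from GitHub's error message for the `nodes` connection.
const MAX_IDS_PER_CONNECTION = 100;

// Split a list into consecutive groups of at most `size` items.
function chunk(items, size) {
  const result = [];
  for (let i = 0; i < items.length; i += size) {
    result.push(items.slice(i, i + size));
  }
  return result;
}

// Build a query with one aliased `nodes` connection per group of IDs.
function buildNodesQuery(ids) {
  const fields = chunk(ids, MAX_IDS_PER_CONNECTION).map(
    (group, i) =>
      `nodes_${i}: nodes(ids: ${JSON.stringify(group)}) { __typename }`
  );
  return `{ ${fields.join(" ")} }`;
}

// 101 copies of the same ID, as in the shell helpers of the test plan.
const ids = new Array(101).fill("MDEwOlJlcG9zaXRvcnkxMjAxNDU1NzA=");
const groups = chunk(ids, MAX_IDS_PER_CONNECTION);
const query = buildNodesQuery(ids);
```

With 101 IDs this yields two connections, one with 100 IDs and one with
a single ID, each of which stays within the limit.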

Test Plan:
Create some queries that use `nodes(ids: ...)` to fetch varying numbers
of objects:

```shell
id="MDEwOlJlcG9zaXRvcnkxMjAxNDU1NzA="
nodes() {
    n="$1"
    ids="$(yes "$id" | head -n "$n" | jq -R . | jq -sc .)"
    printf 'nodes(ids: %s) { __typename }' "$ids"
}
query() {
    printf '{ '
    for num; do
        printf 'nodes_%s: %s ' "$num" "$(nodes "$num")"
    done
    printf '}'
}
```

Note that the query given by `query 101` results in an error…

```json
{
  "data": null,
  "errors": [
    {
      "message": "You may not provide more than 100 node ids; you provided 101.",
      "type": "ARGUMENT_LIMIT",
      "path": [
        "nodes_101"
      ],
      "locations": [
        {
          "line": 1,
          "column": 3
        }
      ]
    }
  ]
}
```

…but the query given by `query 98 99` happily returns 197 node entries.

wchargin-branch: mirror-paginate-own-data
2018-10-02 20:49:01 -07:00
William Chargin 1b09a7f61b
markdown: ignore references in HTML code elements (#907)
Summary:
Fixes #903. We already ignore Markdown code syntax (backticks), but
prior to this commit we treated the contents of all HTML elements,
including `<code>`, as normal text. As of this commit, `<code>` elements
are stripped entirely. Other HTML elements, like `<em>`, are unaffected.
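
The effect can be approximated with a small sketch. This is a regex-based simplification under stated assumptions, not the parser-based logic in the Markdown module: `stripHtmlCode` and `findNumericReferences` are hypothetical names, and the regex does not handle nested or malformed HTML.

```javascript
// Hypothetical helper: drop <code>...</code> elements, contents included.
function stripHtmlCode(text) {
  return text.replace(/<code\b[^>]*>[\s\S]*?<\/code>/gi, "");
}

// Hypothetical helper: find `#123`-style references outside <code>.
function findNumericReferences(text) {
  return stripHtmlCode(text).match(/#\d+/g) || [];
}

// Only the reference inside <em> survives.
console.log(findNumericReferences("see <code>#12</code> and <em>#34</em>"));
```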

Test Plan:
Unit tests added. Also, load data for `ipfs/js-ipfs-block-service`, and
observe in the UI that PR `#36` (Update aegir to version 9.0.0) no
longer has any outward references.

wchargin-branch: markdown-html-code
2018-10-02 20:34:49 -07:00
William Chargin 3e49466ad5
ci: add "commit" and "nightly" CircleCI workflows (#906)
Summary:
This should run `yarn test` on every commit, and `yarn test --full`
on `master` once per day at 15:00 PDT/14:00 PST.

Useful documentation links:

  - <https://circleci.com/docs/2.0/workflows/#scheduling-a-workflow>
  - <https://circleci.com/docs/2.0/reusing-config/#authoring-reusable-commands>

Test Plan:
I’ve pushed this branch and verified that the “commit” workflow executes
successfully on CircleCI. To test the cron workflow, we’ll have to wait
until the daily build.

wchargin-branch: circleci-workflows
2018-10-02 10:57:33 -07:00
William Chargin 163b2c1377
env: drop `--date=format:...` to support Git 2.1.4 (#901)
Summary:
Git only learned `--date=format:...` in Git 2.6.0. Some old Docker
images have older versions of Git. It’s not too much work to reimplement
this particular bit of functionality, so this commit does so.
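
One way to do this, sketched under assumptions: ask Git for a raw Unix timestamp (for example with the long-supported `%ct` placeholder) and format it in JavaScript. The function name and the `YYYYMMDD-HHmm` UTC layout below are illustrative choices, not necessarily the format that `config/env.js` produces.

```javascript
// Hypothetical formatter: turns Unix seconds into a UTC `YYYYMMDD-HHmm`
// string without relying on Git's `--date=format:...`.
function formatCommitTimestamp(unixSeconds) {
  const d = new Date(unixSeconds * 1000);
  const pad = (n, width = 2) => String(n).padStart(width, "0");
  return (
    pad(d.getUTCFullYear(), 4) +
    pad(d.getUTCMonth() + 1) +
    pad(d.getUTCDate()) +
    "-" +
    pad(d.getUTCHours()) +
    pad(d.getUTCMinutes())
  );
}

console.log(formatCommitTimestamp(0)); // 19700101-0000
```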

Test Plan:
Install Git 2.1.4 (as used on CircleCI): from the Git repository for
Git itself, run:

    git checkout v2.1.4
    make
    make install

Watch `yarn test --full` pass. Before this commit, `yarn unit` failed.

wchargin-branch: env-support-git-2.1.4
2018-10-01 18:07:52 -07:00
William Chargin 440b6c2567
env: pass parent PATH to Git invocations (#900)
Summary:
Our environment-stripping logic used in `config/env.js` to read the
current Git state included stripping the `PATH` environment variable.
This had the effect that the system Git executable would always be used
in preference to a user-installed version.
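
A minimal sketch of the fix, assuming a helper that builds the stripped environment (the name `gitEnv` is hypothetical, and the real logic in `config/env.js` may set other variables as well):

```javascript
// Hypothetical helper: start from an empty environment, but keep the
// parent's PATH so that a user-installed Git can be found first.
function gitEnv(parentEnv) {
  const env = {};
  if (parentEnv.PATH !== undefined) {
    env.PATH = parentEnv.PATH;
  }
  return env;
}

console.log(gitEnv({PATH: "/opt/git/bin:/usr/bin", HOME: "/home/user"}));
```

The result can be passed as the `env` option to `execFileSync` from Node's `child_process` module, as in the snippet in the test plan.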

Test Plan:
Run `/usr/bin/git --version`, and then install a different version of
Git. (For instance, check out an old tag, then `make && make install`
from the `git/git` repository.) Then, add

```js
  console.log(execFileSync("git", ["--version"], {env}).toString());
```

to `getGitState` in `config/env.js`, and run

    NODE_ENV=development node ./config/env.js

Note that this prints the version of the system Git before this change,
and the user Git after this change.

Alternately, local-install a version of Git earlier than 2.6.0, and note
that `yarn unit` now _fails_ because the `--date=format:…` syntax is not
known to such versions of Git. Prior to this commit, the tests would
pass as long as the system Git was recent enough.

wchargin-branch: env-git-path
2018-10-01 17:52:56 -07:00
William Chargin e0dcce220b
ci: add CircleCI config (#899)
Summary:
CircleCI seems fast.

This config file is copied from the one suggested by CircleCI, with the
following modifications: Node version changed from 7.10 to 8.12.0;
commented-out MongoDB dependency removed; capitalization error in header
comment fixed.

Test Plan:
Per the CircleCI instructions, merge this PR and then turn on the
project.

wchargin-branch: circleci-config
2018-09-28 16:58:54 -07:00
Dandelion Mané 4a374d755e
Hyperlink Git commits to GitHub (#887)
This modifies the `nodeDescription` code for the Git plugin so that when
given a Git commit, it will hyperlink to that commit on GitHub. It does
this by looking up the corresponding `RepoId`s from the newly-added
`commitToRepoId` field in the `Repository` (#884).

Per a [suggestion in review], rather than hardcoding the GitHub URL
logic in the Git plugin, we provide it via a `GitGateway`.

[suggestion in review]: https://github.com/sourcecred/sourcecred/pull/887#issuecomment-424059649

When no `RepoId` is found, it logs an error to the console and does not
include a hyperlink. When multiple `RepoId`s are available, it links to
one of them arbitrarily. (In the future, we could amend this behavior to
add links to every valid repo.) This behavior is tested.
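
The lookup described above can be sketched roughly as follows. The shapes here are assumptions for illustration: `commitUrl`, `commitLink`, the `{owner, name}` form of a `RepoId`, and the plain-object `commitToRepoId` map are hypothetical, not the plugin's actual types.

```javascript
// Hypothetical gateway: the only place that knows GitHub's URL layout.
const githubGateway = {
  commitUrl(repoId, hash) {
    return `https://github.com/${repoId.owner}/${repoId.name}/commit/${hash}`;
  },
};

// Returns a URL for the commit, or null when no RepoId is known.
function commitLink(hash, commitToRepoId, gateway) {
  const repoIds = commitToRepoId[hash] || [];
  if (repoIds.length === 0) {
    console.error(`No RepoId found for commit ${hash}`);
    return null;
  }
  // Several repos may contain the commit; pick one arbitrarily.
  return gateway.commitUrl(repoIds[0], hash);
}
```

Keeping the URL construction in the gateway means the Git plugin itself needs no knowledge of GitHub.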

Test plan:
I ran the application on newly-generated data and verified that it sets
up commit hyperlinks appropriately. Also, see unit tests.
2018-09-27 20:32:43 -07:00
William Chargin 65d811fb44
mirror: add helpers for full queries and updates (#898)
Summary:
An update step is now roughly as simple as:

    _updateData(postQuery(_queryFromPlan(_findOutdated())))

(with some option/config parameters thrown in).

This makes progress toward #622.

Test Plan:
Unit tests included. They are light, because these functions are light.
They still retain full coverage.

wchargin-branch: mirror-full-pipeline
2018-09-27 19:12:47 -07:00
William Chargin 9a4c91887b
mirror: include typenames in extracted data (#897)
Summary:
These typenames are often superfluous, but sometimes they are useful.
For instance, we might want to fetch the same data for `User`s, `Bot`s,
and `Organization`s, but still differentiate which kind of node we
fetched from an `Actor` union reference. Similarly, many timeline events
may have similar signatures (like, “issue closed” vs. “issue reopened”).
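
For instance, a consumer of extracted data might branch on the typename along these lines. This is a sketch only: `actorKind` is a hypothetical function, and it assumes the typename is exposed under a `__typename` key, as in GraphQL responses.

```javascript
// Hypothetical consumer: the fetched fields may be identical for all
// three types, so the typename is the only way to tell them apart.
function actorKind(node) {
  switch (node.__typename) {
    case "User":
      return "user";
    case "Bot":
      return "bot";
    case "Organization":
      return "organization";
    default:
      throw new Error(`Unexpected typename: ${node.__typename}`);
  }
}

console.log(actorKind({__typename: "Bot", login: "example-bot"})); // bot
```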

Test Plan:
Existing unit tests have been updated; run `yarn unit`.

wchargin-branch: mirror-extract-typenames
2018-09-26 21:24:30 -07:00
William Chargin c7ba89b807
license: relicense under MIT + Apache-2 (#896)
Summary:
All contributors to SourceCred have agreed to this more permissive
licensing option:

  - @decentralion: [link to comment][decentralion]
  - @wchargin: [link to comment][wchargin]
  - @claireandcode: [link to comment][claireandcode]

[decentralion]: https://github.com/sourcecred/sourcecred/issues/812#issuecomment-420817902
[wchargin]: https://github.com/sourcecred/sourcecred/issues/812#issuecomment-420819732
[claireandcode]: https://github.com/sourcecred/sourcecred/issues/812#issuecomment-424914639

Archive link to thread: <https://archive.fo/BH2v5>

Resolves #812.

Test Plan:
Note that the GitHub tree explorer correctly links from the README to
the individual license files.

wchargin-branch: license-dual-mit-apache2
2018-09-26 19:28:41 -07:00
William Chargin b74f1f3714
mirror: add public method `extract` (#894)
Summary:
The `extract` method lets you get data out of a mirror in a structured
format.

The mirror module now contains all the plumbing needed to provide
meaningful value. Remaining to be implemented are some internal
porcelain and a public method to perform an update step.

This makes progress toward #622.

Test Plan:
Comprehensive unit tests included, with full coverage; run `yarn unit`.

wchargin-branch: mirror-extract
2018-09-26 12:04:10 -07:00