31 Commits

Author SHA1 Message Date
William Chargin
65d811fb44
mirror: add helpers for full queries and updates (#898)
Summary:
An update step is now roughly as simple as:

    _updateData(postQuery(_queryFromPlan(_findOutdated())))

(with some option/config parameters thrown in).

This makes progress toward #622.

Test Plan:
Unit tests included. They are light, because these functions are light.
They still retain full coverage.

wchargin-branch: mirror-full-pipeline
2018-09-27 19:12:47 -07:00
William Chargin
9a4c91887b
mirror: include typenames in extracted data (#897)
Summary:
These typenames are often superfluous, but sometimes they are useful.
For instance, we might want to fetch the same data for `User`s, `Bot`s,
and `Organization`s, but still differentiate which kind of node we
fetched from an `Actor` union reference. Similarly, many timeline events
may have similar signatures (like, “issue closed” vs. “issue reopened”).

Test Plan:
Existing unit tests have been updated; run `yarn unit`.

wchargin-branch: mirror-extract-typenames
2018-09-26 21:24:30 -07:00
William Chargin
b74f1f3714
mirror: add public method extract (#894)
Summary:
The `extract` method lets you get data out of a mirror in a structured
format.

The mirror module now contains all the plumbing needed to provide
meaningful value. Remaining to be implemented are some internal
porcelain and a public method to perform an update step.

This makes progress toward #622.

Test Plan:
Comprehensive unit tests included, with full coverage; run `yarn unit`.

wchargin-branch: mirror-extract
2018-09-26 12:04:10 -07:00
William Chargin
90ace93f91
mirror: add helper to find unused table name (#895)
Summary:
Per <https://github.com/sourcecred/sourcecred/pull/894#discussion_r220406780>.

Test Plan:
Unit tests included; run `yarn unit`.

wchargin-branch: mirror-unused-table-name
2018-09-26 11:37:21 -07:00
William Chargin
a7f29cb057
mirror: add connection object types to query plan (#891)
Summary:
This reestablishes harmony in light of #882.

Test Plan:
Existing unit tests suffice; run `yarn unit`.

wchargin-branch: mirror-query-plan-connection-object-typename
2018-09-25 14:25:11 -07:00
William Chargin
892c498f9c
mirror: query and process own-data updates (#883)
Summary:
This commit adds internal functions to (a) emit a GraphQL query to fetch
data for own-data of an object, and (b) ingest the results of said query
back into the database.

The API and implementation differ from the connection-updating analogues
introduced in #878 in that the query for own data is independent of an
object’s ID: it depends only on the object’s type. This affords us more
flexibility in composing queries.

As described in an internal documentation comment, values are stored in
the database in JSON-stringified form: we cannot use the obvious raw SQL
values, because there is no native encoding of booleans (`0`/`1` is
conventional), and we need to distinguish them from other data types.
There are other ways to solve this problem. Notably:

 1. We could take inspiration from OCaml: encode stronger static types
    and a simpler runtime representation. That is, we could change the
    schema field types from simply “primitive” to the various specific
    primitive types. Then, when reading data out from the database, we
    could reinterpret the values appropriately.

 2. We could take advantage of the fact that we are not using all of
    SQLite’s data types. In particular, we do not store anything as a
    binary blob, so we could encode `false` as a length-0 zeroblob and
    `true` as a length-1 zeroblob, for instance. Again, when reading
    data out from the database, we would reinterpret the values—but in
    this approach we would not need an explicit schema from the user.

For now, we take the easiest and simplest approach just to get ourselves
off the ground. We can easily move to the second option described above
later.
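
A minimal sketch of the JSON-stringified encoding described above, not
the mirror's actual code: the helper names `encodePrimitive` and
`decodePrimitive` are hypothetical, and the database column is stood in
for by a plain array of strings.

```javascript
// Encode a primitive for storage in a single text column.
function encodePrimitive(value) {
  return JSON.stringify(value);
}

// Decode a stored primitive back to its original JavaScript value.
function decodePrimitive(stored) {
  return JSON.parse(stored);
}

const primitiveValues = [true, false, 1, 0, "1", "true", null];
const storedValues = primitiveValues.map(encodePrimitive);
const roundTripped = storedValues.map(decodePrimitive);

// Each value survives the round trip with its type intact, which a
// raw 0/1 encoding of booleans would not guarantee.
console.log(storedValues);
console.log(roundTripped.every((v, i) => v === primitiveValues[i])); // true
```

The cost of this approach is that every stored value is text, so SQL-side
comparisons on numbers would need to decode first.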

This commit makes progress toward #622.

Test Plan:
Unit tests included, with full coverage. While these tests check that
the GraphQL queries are as expected, they cannot check that they are
actually valid in production. To check this, follow the instructions in
the added snapshot test.

wchargin-branch: mirror-own-data-updates
2018-09-25 11:58:32 -07:00
William Chargin
3257df63fe
mirror: add helper to register nullable node results (#889)
Summary:
Per <https://github.com/sourcecred/sourcecred/pull/883/files#r219648511>.
It is somewhat unfortunate that this mixes a command with a query, but
the concession is acceptable in this instance, I think.

Test Plan:
Existing unit tests suffice, retaining full coverage.

wchargin-branch: mirror-register-node-field-results
2018-09-24 18:20:28 -07:00
William Chargin
28c4e497fb
mirror: factor out _makeSingleUpdateFunction (#888)
Summary:
This helpful utility is already used in some test code, and will shortly
be used in main code. @decentralion suggested factoring it out in
<https://github.com/sourcecred/sourcecred/pull/883#discussion_r219647781>.

Test Plan:
Unit tests included, with full coverage; run `yarn unit`.

wchargin-branch: mirror-make-update-helper
2018-09-24 18:06:14 -07:00
William Chargin
838092194b
mirror: add support for connections of union types (#882)
Summary:
Almost every GitHub connection has nodes of an object type, like `User`
or `IssueComment`. But a few have nodes of union type, including
`IssueTimelineItemConnection` (which we will likely want to query), and
those require special handling. This commit adds support for such
connections.

Analysis to determine which connections have non-object elements:
<https://gist.github.com/wchargin/647fa7ed8d6d17ae2e204bd098104407>

Test Plan:
Unit tests modified appropriately, retaining full coverage.

The easiest way to verify the snapshot is probably to copy the raw
contents (everything inside the quotes) into `/tmp/snapshot`, then run:

```shell
$ sed -e 's/\\//g' </tmp/snapshot >/tmp/query  # Jest adds backslashes
$ jq -csR '{query: ., variables: {}}' </tmp/query >/tmp/payload
$ ENDPOINT='https://api.github.com/graphql'
$ AUTH="Authorization: bearer ${SOURCECRED_GITHUB_TOKEN}"
$ curl "$ENDPOINT" -X POST -H "$AUTH" -d @- </tmp/payload >/tmp/result
```

and then execute the JQ program mentioned in the comment in the test
case, and verify that it prints `true`.

wchargin-branch: connection-of-union
2018-09-21 16:20:58 -07:00
William Chargin
09ed51ed6e
mirror: align test schema more with GitHub schema (#880)
Summary:
This commit changes the `Issue` type of the schema used in the `mirror`
tests so that its fields are a faithful subset of those in the actual
GitHub schema. The tests are self-contained, so this is not strictly
required.
However, it is convenient, because it means that we can snapshot a query
that can actually be posted to GitHub.

Test Plan:
Running `yarn unit mirror` suffices for the code change. The GitHub
schema docs at <https://developer.github.com/v4/object/issue/> indicate
that each of `id`, `url`, `author`, `repository`, `title`, and
`comments` is a valid field.

wchargin-branch: mirror-test-schema
2018-09-20 17:17:27 -07:00
William Chargin
6ae5c56624
mirror: query and process connection updates (#878)
Summary:
This commit adds internal functions to (a) emit a GraphQL query to fetch
data for a particular connection, and (b) ingest the results of said
query back into the database.

This commit makes progress toward #622.

Test Plan:
Unit tests included, with full coverage. While these tests check that
the GraphQL queries are as expected, they cannot check that they are
actually valid in production. To check this, follow the instructions in
the added snapshot test.

wchargin-branch: mirror-connection-updates
2018-09-20 15:46:03 -07:00
William Chargin
1dd8b7bcb7
mirror: add internal method _findOutdated (#875)
Summary:
This function finds all objects whose own data has not been updated
since a given time, and all connections whose entries have not been
updated since that time.

Note that this is scoped to the entirety of the database. In #622,
I discussed using a recursive common table expression to identify only
those transitive dependencies of the root. I think that this is overkill
for the `_findOutdated` method: you’ll usually want to update everything
in the database. Don’t worry—the cool recursive query will still be used
in the `extract` function. :-)
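
A rough sketch of the selection logic described above, using in-memory
arrays rather than SQL. The row shapes, the name `findOutdated`, and the
choice to treat never-fetched rows (`lastUpdate: null`) as outdated are
all assumptions made for illustration.

```javascript
// Hypothetical in-memory stand-ins for database rows. `lastUpdate` is
// epoch milliseconds, or null if the row has never been fetched.
function findOutdated(rows, since) {
  const isOutdated = (row) =>
    row.lastUpdate == null || row.lastUpdate < since;
  return {
    objects: rows.objects.filter(isOutdated).map((o) => o.id),
    connections: rows.connections
      .filter(isOutdated)
      .map((c) => ({objectId: c.objectId, fieldname: c.fieldname})),
  };
}

const mirrorRows = {
  objects: [
    {id: "issue:1", lastUpdate: 1000},
    {id: "issue:2", lastUpdate: 3000},
    {id: "user:1", lastUpdate: null},
  ],
  connections: [
    {objectId: "issue:1", fieldname: "comments", lastUpdate: 2500},
    {objectId: "issue:2", fieldname: "comments", lastUpdate: 500},
  ],
};

// Scoped to every row, not just dependencies of some root object.
const outdated = findOutdated(mirrorRows, 2000);
console.log(JSON.stringify(outdated));
```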

This commit makes progress toward #622.

Test Plan:
Unit tests added, with full coverage; run `yarn unit`.

wchargin-branch: mirror-findoutdated
2018-09-20 14:19:47 -07:00
William Chargin
ab5b6ecb68
mirror: add public method registerObject (#870)
Summary:
This function informs the GraphQL mirror of the existence of an object,
specified by its global ID and its concrete typename (“concrete” meaning
“object type”—like `User`, not `Actor`).

The function will be called extensively internally as more objects are
discovered while traversing the graph, but also needs to be exposed as a
public entry point: a client needs to call this function at least once
to register the root node of interest. A typical client workflow, once
all of #622 is implemented, might be:

 1. Issue a standalone GraphQL query to find the ID of a root node, like
    a GitHub repository: `repository(owner: "foo", name: "bar") { id }`.
 2. Call `registerObject` with the ID found in the previous step.
 3. Instruct the mirror to recursively update all dependencies.
 4. Extract data from the mirror.

As of this commit, steps (1) and (2) are possible.

This commit makes progress toward #622.

Test Plan:
Unit tests included, with full coverage; run `yarn unit`.

wchargin-branch: mirror-registerobject
2018-09-20 13:03:06 -07:00
William Chargin
e572551cd8
mirror: add internal method _createUpdate (#868)
Summary:
It’s useful to add this simple function now because the rest of the
commits required to implement #622 will want to use it extensively in
test code. Actual clients of the API will not need to use it, because
the concept of “updates” is an implementation detail: clients will
always provide simple timestamps.

Test Plan:
Unit tests included, with full coverage; run `yarn unit`.

wchargin-branch: mirror-createupdate
2018-09-20 11:12:21 -07:00
William Chargin
5debae414e
mirror: rename helper _primitivesTableName (#861)
Summary:
Some clients want to write

    const primitivesTableName = _primitivesTableName(typename);

which they cannot if the function is also called `primitivesTableName`,
due to ECMAScript shadowing semantics.
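
A minimal sketch of the failure mode, assuming the helper still had its
old name. The table-name format and the function `describeTable` are
invented for illustration.

```javascript
// Hypothetical module-level helper, under its old name.
function primitivesTableName(typename) {
  return "primitives_" + typename; // illustrative format only
}

function describeTable(typename) {
  // The local `const` shadows the outer function for the whole block,
  // and is still uninitialized while its initializer runs, so the call
  // on the right-hand side throws a ReferenceError.
  const primitivesTableName = primitivesTableName(typename);
  return primitivesTableName;
}

let shadowingError = null;
try {
  describeTable("Issue");
} catch (e) {
  shadowingError = e;
}
console.log(shadowingError.name); // ReferenceError
```

With the helper renamed to `_primitivesTableName`, the local binding no
longer collides and the same line works as written.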

Test Plan:
Running `yarn flow` suffices; running `yarn unit` really suffices.

wchargin-branch: mirror-rename-primitivestablename
2018-09-19 14:28:26 -07:00
William Chargin
1d18652459
graphql: improve coverage of queries module (#859)
Summary:
Each change provides real value, by either testing a plausible happy
path that simply was not tested previously, or by adding an
`empty`-assertion to a switch against a discriminated union type.

Test Plan:
For the snapshot change relating to the query formatter, note that
Prettier formats the changed portion of the snapshot in the same way, by
visiting <https://prettier.io/playground> and setting the parser to
"graphql". (Prettier in general agrees with the stringification defined
by this module, except for commas and spacing, for which we don’t bother
to generate impeccably pretty output.)

Run `yarn coverage` and note that the coverage of the whole `graphql`
package is 100% on all axes.

wchargin-branch: graphql-coverage
2018-09-18 17:57:16 -07:00
William Chargin
85efa811e0
mirror: use SchemaInfo in _initialize (#858)
Summary:
This simplifies and clarifies the code with no observable change.

Test Plan:
Existing unit tests suffice; run `yarn unit`.

wchargin-branch: mirror-use-schemainfo
2018-09-18 16:30:43 -07:00
William Chargin
e69ff57c58
mirror: precompute some useful schema info (#857)
Summary:
This is mostly useful not for computational efficiency, but for ease of
implementation: there end up being multiple places where we want to find
(say) the primitive fields on an object, and having to go through the
whole iterate-and-switch-and-push process repeatedly is annoying.
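
The iterate-and-switch-and-push process might look roughly like the
following when done once up front. The schema shape, the field kinds, and
the name `computeSchemaInfo` are assumptions for this sketch, not the
module's actual types.

```javascript
// Hypothetical schema shape: each object type maps field names to a
// record tagged "PRIMITIVE", "NODE", or "CONNECTION".
function computeSchemaInfo(schema) {
  const info = {};
  for (const typename of Object.keys(schema)) {
    const entry = {primitiveFields: [], nodeFields: [], connectionFields: []};
    const fields = schema[typename].fields;
    for (const fieldname of Object.keys(fields)) {
      const kind = fields[fieldname].kind;
      switch (kind) {
        case "PRIMITIVE":
          entry.primitiveFields.push(fieldname);
          break;
        case "NODE":
          entry.nodeFields.push(fieldname);
          break;
        case "CONNECTION":
          entry.connectionFields.push(fieldname);
          break;
        default:
          throw new Error(`${typename}.${fieldname}: unknown kind ${kind}`);
      }
    }
    info[typename] = entry;
  }
  return info;
}

const exampleSchema = {
  Issue: {
    fields: {
      url: {kind: "PRIMITIVE"},
      title: {kind: "PRIMITIVE"},
      author: {kind: "NODE"},
      comments: {kind: "CONNECTION"},
    },
  },
};

// Callers can now look up field lists directly instead of re-scanning.
const schemaInfo = computeSchemaInfo(exampleSchema);
console.log(schemaInfo.Issue.primitiveFields);
```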

Test Plan:
Unit tests included, with full coverage; run `yarn unit`.

wchargin-branch: mirror-schema-info
2018-09-18 16:24:38 -07:00
William Chargin
1b1a1e4d46
mirror: embed GraphQL schema into SQL (#849)
Summary:
This commit augments the `Mirror` constructor to turn the provided
GraphQL schema into a SQL schema, with which it initializes the backing
database. The schema is roughly as originally described in #622, with
some changes (primarily: we omit `WITHOUT ROWID`; we add indexes; we
store `total_count` on connections; and we use milliseconds instead of
seconds for epoch time).

Test Plan:
Unit tests included, with full coverage; run `yarn unit`.

wchargin-branch: mirror-sql-schema
2018-09-18 13:31:34 -07:00
William Chargin
a93ad80ebc
mirror: initialize a GraphQL database mirror (#847)
Summary:
This commit introduces the `Mirror` class that will be the centerpiece
of the persistent-loading API as described in #622. An instance of this
class represents a mirror of a remote GraphQL database, defined by a
particular schema. In this commit, we add the construction logic, which
includes a safety measure to ensure that the database is used within one
version of the code and schema.

Test Plan:
Unit tests included, with full coverage; run `yarn unit`.

wchargin-branch: mirror-class
2018-09-17 13:53:08 -07:00
William Chargin
e9279bee90
mirror: add a helper function for transactions (#844)
Summary:
In implementing #622, we’ll want to run lots of things inside of
transactions. This commit introduces a JavaScript API to do so more
easily, properly handling success and failure cases.
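
A sketch of what such a helper could look like, not the code added in
this commit. It assumes a database object exposing an `exec(sql)` method;
that interface, the name `inTransaction`, and the stub database are
hypothetical.

```javascript
// Run `fn` inside a transaction on `db`, committing on success and
// rolling back (then rethrowing) on failure.
function inTransaction(db, fn) {
  db.exec("BEGIN");
  let result;
  try {
    result = fn();
  } catch (e) {
    db.exec("ROLLBACK");
    throw e;
  }
  db.exec("COMMIT");
  return result;
}

// A stub database that only records the statements it receives.
function makeStubDb() {
  const log = [];
  return {log, exec: (sql) => { log.push(sql); }};
}

const okDb = makeStubDb();
const txResult = inTransaction(okDb, () => {
  okDb.exec("INSERT INTO t VALUES (1)");
  return 42;
});

const failDb = makeStubDb();
let txFailure = null;
try {
  inTransaction(failDb, () => {
    throw new Error("boom");
  });
} catch (e) {
  txFailure = e;
}
console.log(txResult, okDb.log, failDb.log, txFailure.message);
```

This sketch does not handle nested transactions or a callback that ends
the transaction itself; a real helper would need a policy for both.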

Test Plan:
Unit tests included, with full coverage; run `yarn unit`.

wchargin-branch: mirror-transaction-helper
2018-09-17 13:33:10 -07:00
William Chargin
f966ce300f
schema: make fields and clauses exact (#843)
Summary:
This affords more flexibility to clients, because an exact value can be
used in place of an inexact value, but not vice versa.

Test Plan:
Running `yarn flow` suffices.

wchargin-branch: schema-exact-type-fields
2018-09-17 12:07:52 -07:00
William Chargin
4675b84443
graphql: validate well-foundedness of unions (#835)
Summary:
GraphQL unions are required to be unions specifically of object types.
They cannot contain primitives or other union types as clauses. This is
good: it means that we don’t have to worry about unions that recursively
reference each other or themselves.

Unions are also required to have at least one clause, but we don’t
validate this because it’s not helpful for us. An empty union is
perfectly well-defined, if useless, and shouldn’t cause any problems.

Relevant portion of the spec:
<https://facebook.github.io/graphql/October2016/#sec-Union-type-validation>
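
A sketch of this validation under an assumed schema representation (a map
from typename to a record tagged `OBJECT` or `UNION`). The shape, the name
`validateUnions`, and the rejection of unknown clause names are
illustrative assumptions.

```javascript
// Check that every union clause refers to an object type.
function validateUnions(schema) {
  for (const typename of Object.keys(schema)) {
    const type = schema[typename];
    if (type.kind !== "UNION") continue;
    for (const clause of type.clauses) {
      const target = schema[clause];
      if (target == null) {
        throw new Error(`${typename}: unknown clause type "${clause}"`);
      }
      if (target.kind !== "OBJECT") {
        throw new Error(
          `${typename}: clause "${clause}" has kind ${target.kind}, not OBJECT`
        );
      }
    }
  }
}

const goodSchema = {
  User: {kind: "OBJECT"},
  Bot: {kind: "OBJECT"},
  Actor: {kind: "UNION", clauses: ["User", "Bot"]},
  Nothing: {kind: "UNION", clauses: []}, // empty unions are allowed
};
validateUnions(goodSchema); // does not throw

const badSchema = {
  User: {kind: "OBJECT"},
  Actor: {kind: "UNION", clauses: ["User"]},
  Nested: {kind: "UNION", clauses: ["Actor"]}, // union of a union
};
let unionError = null;
try {
  validateUnions(badSchema);
} catch (e) {
  unionError = e;
}
console.log(unionError.message);
```

Because clauses can only be object types, a single non-recursive pass is
enough; no cycle detection is needed.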

Test Plan:
Unit tests added, retaining full coverage; `yarn unit` suffices.

wchargin-branch: graphql-schema-union-validation
2018-09-13 18:11:26 -07:00
William Chargin
7da9ef3a94
graphql: add a schema module (#834)
Summary:
This commit introduces a module for declaratively specifying the schema
of a GraphQL database. See `buildGithubSchema` in `schema.test.js` for
an example of the API.

This makes progress toward #622, though the design has evolved some
since its original specification there.

Test Plan:
Unit tests added, with full coverage; `yarn unit` suffices.

wchargin-branch: graphql-schema
2018-09-13 18:02:14 -07:00
William Chargin
5fa20ec89e
test: use proper top-level describe blocks (#801)
Summary:
Per #800, each test file should start with a `describe` block listing
its file path under `src`. Currently, nine of our tests do not do so.
Of these, eight had a top-level describe block with the wrong name
(either not a filepath or an outdated filepath), while only one short
test was missing a top-level describe block altogether. This patch fixes
each file to use the correct format.

Test Plan:
Apply the Sharness test in #802, and note that it fails before this
patch but passes after it.

wchargin-branch: describe-fix
2018-09-06 20:39:46 -07:00
Dandelion Mané
4406c96c95
Create a Repo type and use throughout the project (#555)
Our data model is oriented around getting repos from GitHub, which are
variously represented as strings like "sourcecred/sourcecred", or
pairs of variables representing the owner and name, or objects with
owner and name properties. We also have a few different implementations
of repo validation, which are not applied consistently.

This commit changes all that. We now have a consistent Repo type which
is an object containing a string owner and string name. Thanks to a
clever suggestion by @wchargin, it is implemented as an opaque subtype
of an object containing those properties, so that the only valid way to
construct a Repo typed object is to use one of the functions that
consistently validates the repo.
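
A runtime-only sketch of the validating constructors, in plain
JavaScript. The function names and the validation pattern are assumptions
for illustration; the opacity itself is enforced by Flow's type system and
cannot be shown in untyped code.

```javascript
// Assumed validation rule for an owner or repository name.
const repoPartPattern = /^[A-Za-z0-9_.-]+$/;

// Construct a repo object from separately supplied owner and name.
function makeRepo(owner, name) {
  for (const part of [owner, name]) {
    if (typeof part !== "string" || !repoPartPattern.test(part)) {
      throw new Error(`Invalid repository part: ${JSON.stringify(part)}`);
    }
  }
  return Object.freeze({owner, name});
}

// Construct a repo object from a string like "owner/name".
function stringToRepo(s) {
  const pieces = s.split("/");
  if (pieces.length !== 2) {
    throw new Error(`Invalid repository string: ${JSON.stringify(s)}`);
  }
  return makeRepo(pieces[0], pieces[1]);
}

const exampleRepo = stringToRepo("sourcecred/sourcecred");
console.log(exampleRepo.owner, exampleRepo.name);

let repoError = null;
try {
  stringToRepo("not a repo");
} catch (e) {
  repoError = e;
}
console.log(repoError.message);
```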

As a fly-by fix, I noticed that there were some functions in the GitHub
query generation that didn't properly mark arguments as readOnly. I've
fixed these.

Test plan: No externally-observable behavior changes (except insofar as
there is a slight change in variable names in the GitHub graphql query,
which has also resulted in a snapshot diff). `yarn travis --full`
passes. `git grep repoOwner` presents no hits.
2018-07-27 21:30:50 -07:00
Dandelion Mané
24cf35da22
Change src/v3/ to src/ and remove v3 naming (#474)
Test plan:
`git grep -i v3` only shows incidental hits in longer strings
`yarn travis --full` passes
`yarn backend` works
`yarn build` works
`yarn start` works
`node bin/sourcecred.js start` works
`node bin/sourcecred.js load sourcecred example-github` works

Paired with @wchargin
2018-06-30 16:01:54 -07:00
Dandelion Mané
ba721a6fbb
Fork project to v1/ and v2/ in preparation for v3 (#327)
We want to reset some of our basic assumptions, and make `Graph` into a
pure graph implementation, rather than a hybrid graph and key-value
store.

This is a substantial rewrite, so we want to start from scratch in a v3/
directory and pull code into v3 as necessary. So that we can do this in
a relatively clean fashion, we're first moving the v1 and v2 code into
their own directories.

Paired with @wchargin

Test plan:  Travis, and `yarn backend`, `node bin/sourcecred.js start`.

Note that `yarn backend` and `node bin/sourcecred.js start` both use the
v1 versions. We'll migrate those (by changing paths.js) to v3 when
appropriate.
2018-06-01 17:17:44 -07:00
William Chargin
f3a440244e
Fix all lint errors, adding a lint CI step (#175)
Test Plan:
Run `yarn lint` and `yarn travis` and observe success. Add something
that triggers a lint warning, like `const zzz = 3;`; re-run and observe
failures.

wchargin-branch: lint
2018-04-30 14:52:28 -07:00
William Chargin
e6f401df30
Add field aliases to structured GraphQL queries (#116)
Summary:
For pagination, we’ll want to query against multiple entities of the
same type. GraphQL uses aliases to facilitate this. This commit adds
support for aliases to our GraphQL query DSL.
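
To illustrate why aliases are needed, here is a small sketch that renders
two selections of the same field with different arguments. The helper
`aliasedField` is hypothetical and is not the DSL added in this commit.

```javascript
// Render `alias: name(args) { selections }` as GraphQL text.
function aliasedField(alias, name, args, selections) {
  const argText = Object.keys(args)
    .map((k) => `${k}: ${JSON.stringify(args[k])}`)
    .join(", ");
  return `${alias}: ${name}(${argText}) { ${selections.join(" ")} }`;
}

// Without distinct aliases, two `repository` selections with different
// arguments would conflict within a single selection set.
const aliasedQuery = [
  "query {",
  "  " + aliasedField("first", "repository", {owner: "foo", name: "bar"}, ["id"]),
  "  " + aliasedField("second", "repository", {owner: "foo", name: "baz"}, ["id"]),
  "}",
].join("\n");
console.log(aliasedQuery);
```

The response then carries the results under the keys `first` and
`second` rather than under a single `repository` key.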

Test Plan:
Inspect snapshot changes, and note that `yarn flow` and `yarn test`
pass.

wchargin-branch: graphql-aliases
2018-03-26 20:54:16 -07:00
William Chargin
ca85fdf234
Reorganize src/ directory (#87)
Test Plan:
Note that tests still pass, and all changes to snapshot files are
verbatim moves.

wchargin-branch: reorg
2018-03-19 14:31:50 -07:00