Commit Graph

423 Commits

Author SHA1 Message Date
Dandelion Mané 24cf35da22
Change `src/v3/` to `src/` and remove v3 naming (#474)
Test plan:
`git grep -i v3` only shows incidental hits in longer strings
`yarn travis --full` passes
`yarn backend` works
`yarn build` works
`yarn start` works
`node bin/sourcecred.js start` works
`node bin/sourcecred.js load sourcecred example-github` works

Paired with @wchargin
2018-06-30 16:01:54 -07:00
William Chargin 23704da7a5
Demolish the bridge (#473)
Summary:
The bridge introduced in #448 has now served its purpose, and may be
deconstructed. This implements the first part of the last step of the
plan described in that pull request.

Paired with @decentralion.

Test Plan:
After `yarn backend && yarn build`:
  - `node bin/sourcecred.js start` works, and
  - `yarn start` works, and
  - `yarn travis --full` works.

wchargin-branch: demolish-bridge
2018-06-30 15:56:36 -07:00
Dandelion Mané efefc73e6b
Delete the `v1` and `v2` directories from #327 (#472)
Test plan:
`node bin/sourcecred.js load sourcecred example-github` works
`yarn start` works
`node bin/sourcecred.js start-v3` works
`yarn travis --full` passes

Paired with @wchargin
2018-06-30 15:38:47 -07:00
William Chargin 0300a805fa
Copy `start` to `start-v3` (#471)
Summary:
This could also be moved into the bridge directory, but this way is
marginally easier, and it doesn’t really matter in the end.

Test Plan:
`yarn backend` followed by `node bin/sourcecredV3.js start-v3` works.

wchargin-branch: start-v3
2018-06-30 15:28:13 -07:00
Dandelion Mané addaf4e2a8
Add load CLI command (#470)
The `load` command replaces `plugin-load`. By default, it loads data for
all plugins, and does so in parallel using execDependencyGraph. If
passed the optional `--plugin` flag, then it will load data just for
that plugin.

As an implementation detail, when loading all plugins, load calls itself
with the plugin flag set.
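
As a rough illustration of that fan-out, the sketch below builds one sub-invocation of `load` per plugin. The function name `allPluginLoadTasks`, the plugin list, and the `{id, cmd, deps}` task shape are assumptions made for this example; the actual interface of `execDependencyGraph` is not described here.

```js
// Hypothetical sketch: one `load --plugin <name>` invocation per plugin.
// The plugin list and the task shape are assumptions for illustration.
const PLUGINS = ["git", "github"];

function allPluginLoadTasks(repoOwner, repoName) {
  return PLUGINS.map((name) => ({
    id: `load-${name}`,
    cmd: [
      "node",
      "bin/sourcecred.js",
      "load",
      repoOwner,
      repoName,
      "--plugin",
      name,
    ],
    // No task depends on another, so a runner is free to start them all
    // at once.
    deps: [],
  }));
}
```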

Usage:
`node bin/sourcecred.js load repoOwner repoName`

Test plan:
Tested by hand; I blew away my SourceCred directory and then loaded the
example-github repository.
2018-06-30 15:16:35 -07:00
William Chargin bb75cc54cd
Move Express server API from V1 to bridge (#469)
Test Plan:
`yarn start` and `node bin/sourcecred.js start` both still work.

wchargin-branch: bridge-api
2018-06-30 15:11:52 -07:00
Dandelion Mané d627475119
Integrate PageRank to the v3 cred explorer! (#468)
This integrates the PageRank table from #466 into the v3 cred explorer
app, bringing the v3 frontend to better-than-parity with v1!

Test plan:
Some unit tests were included, and running `yarn start` and inspecting
the App reveals that it is working correctly. Loading a PageRank result
and then changing the repository no longer triggers a crash :).

Paired with @wchargin
2018-06-30 15:04:40 -07:00
William Chargin c10ce0060b
Update V1 example-github data (#467)
Summary:
Fixes #445. Created with:

```
$ src/v1/plugins/github/fetchGithubRepoTest.sh -u
$ CI=1 yarn test -u
```

Test Plan:
`yarn travis --full` passes.

wchargin-branch: fix-build
2018-06-30 14:13:18 -07:00
Dandelion Mané 65b5babac4
Recreate Pagerank Table (#466)
Ports #265 to the v3 branch, along with some tweaks:
- Only display log scores, and normalize them by adding 10 (so that most
are non-negative)
- Change colors to a soothing green
- Improve display, e.g. make overflowing node description text wrap
within the row

Implements most of the tests requested in #269

Test plan: Many unit tests added

Paired with @wchargin
2018-06-30 14:10:18 -07:00
William Chargin 72be58a5c0
Add “node types” listing to plugin adapter (#464)
Summary:
This enables plugins to specify different semantic types of nodes, along
with human-readable names. This will be used, for instance, in the cred
explorer, where users may filter to one of these node types.

Paired with @decentralion.

Test Plan:
Flow passes.

wchargin-branch: plugin-node-types
2018-06-29 23:28:29 -07:00
William Chargin 728a3cdf37
Add `name` function to plugin adapter (#463)
Summary:
This presents a human-readable name for a plugin. It’s not yet used
anywhere.

Paired with @decentralion.

Test Plan:
Flow passes.

wchargin-branch: plugin-name
2018-06-29 19:37:00 -07:00
Dandelion Mané dd063f5203
Add GitPluginAdapter and Git render module (#462)
Test plan:
Run the following commands:

```
node bin/sourcecredV3.js load-plugin-v3 sourcecred example-github --plugin=git
node bin/sourcecredV3.js load-plugin-v3 sourcecred example-github --plugin=github
yarn start
```

Then, navigate in-browser to the v3 cred explorer and load data for
`sourcecred/example-github`. The following messages are printed to
console:

```
GitHub: Loaded graph: 31 nodes, 73 edges.
Git: Loaded graph: 15 nodes, 19 edges.
Combined: Loaded graph: 44 nodes, 92 edges.
```

Paired with @wchargin
2018-06-29 18:38:28 -07:00
William Chargin 4767dce749
Implement the GitHub plugin adapter (#461)
Summary:
This enables grabbing the GitHub relational view from disk and
converting it to a graph on the client.

Paired with @decentralion.

Test Plan:
For testing, information about the GitHub graph is printed when you
click the “Load data” button in the UI. Do so.

wchargin-branch: github-plugin-adapter
2018-06-29 18:22:15 -07:00
William Chargin 95c206b346
Allow selecting data to load in V3 cred explorer (#460)
Summary:
Text input boxes for repository owner and name now appear. “Loading the
data” consists of logging the attempt to the console.

Test Plan:
Run `yarn start`, and note that the inputs are keyed against the same
local store key as their V1 equivalents. Note that clicking “Load data”
prints a message to the console.

Paired with @decentralion.

wchargin-branch: v3-load-data-ui
2018-06-29 18:04:20 -07:00
Dandelion Mané 0b3c91a7bd
Add support for detecting cross-repo references (#459)
In GitHub, you can make cross repo references. For example,
sourcecred/sourcecred#459 is one such reference. This commit adds
support for detecting those references and adding them to the GitHub
graph.
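
As a rough illustration of what detecting such a reference involves, the sketch below matches `owner/repo#number` strings with a regular expression. The function name `parseCrossRepoReferences` and the pattern itself are assumptions for this example, not the parser added in this commit.

```js
// Hypothetical sketch: find `owner/repo#123`-style references in text.
// The pattern is a simplification, not the plugin's actual regex.
function parseCrossRepoReferences(text) {
  const pattern =
    /(?:^|\s)([A-Za-z0-9-]+)\/([A-Za-z0-9._-]+)#(\d+)(?=\s|$|[.,;:!?)])/g;
  const results = [];
  let match;
  while ((match = pattern.exec(text)) !== null) {
    results.push({
      owner: match[1],
      repo: match[2],
      number: Number(match[3]),
    });
  }
  return results;
}
```

A bare `#459` is not matched by this sketch, since it has no `owner/repo` prefix.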

Test plan:
See attached unit tests.
2018-06-29 17:53:15 -07:00
Dandelion Mané ba4fa8e820
Add `loadGitData` and wire it into CLI (#458)
This commit adds `loadGitData`, which clones the git repository for a
given repo and saves the corresponding git graph. It also adds that
method to the `loadPlugin` command, so that the following command now
works:

```
$ node bin/sourcecredV3.js load-plugin-v3 sourcecred example-git --plugin=git
```

After running that command, the correct file is present:

```
$ du -sh tmp/sourcecred/data/sourcecred/example-git/git/graph.json
28K     /home/dandelion/tmp/sourcecred/data/sourcecred/example-git/git/graph.json
```

Running the command takes the following time per repository:

| repository               | time (s)  |
| :----------------------- | --------: |
| `sourcecred/example-git` | 1         |
| `sourcecred/sourcecred`  | 5         |
| `ipfs/js-ipfs`           | 18        |
| `ipfs/go-ipfs`           | ∞ (OOM)   |
2018-06-29 17:19:08 -07:00
William Chargin a5d19c80aa
Remove `babel-plugin-flow-react-proptypes` (#457)
Summary:
Pending the resolution of brigand/babel-plugin-flow-react-proptypes#201,
we’re removing this plugin from our build, because it results in
incorrect code generation. We’ll be happy to add it back if the bug is
fixed.

Test Plan:
Fingers crossed.

wchargin-branch: remove-bpfrpt
2018-06-29 17:11:04 -07:00
Dandelion Mané ffdfdca22a
Add `view.entity` (#456)
This method takes an arbitrary structured address and returns an entity
for it (if a matching entity exists).

Test plan: travis
2018-06-29 15:01:23 -07:00
Dandelion Mané fe64377194
Add the `attribution/pagerank` module (#455)
This module exposes a method, `pagerank`, which is a convenient entry
point for taking a `Graph` and returning a `PagerankResult`. This
obviates the need for `src/v1/app/credExplorer/basicPagerank.js`.

Test plan: Unit tests included.
2018-06-29 14:24:28 -07:00
Dandelion Mané 5c93085430
Require all `findStationaryDistribution` options (#453)
I'm planning to make a `pagerank.js` module that is a clean entry point
for all the graph-pagerank-related code, so it will be cleaner to expose all
the default options there.

Test plan: travis

Paired with @wchargin
2018-06-29 14:04:15 -07:00
Dandelion Mané a5608dd7c8
Fix off-by-1 error in PageRank iteration limit (#452)
If `findStationaryDistribution` is passed `0` as `maxIterations`, then
it should return the initial distribution.
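
A minimal sketch of the intended behavior, assuming a dense row-stochastic transition matrix. The function `iterate` and its signature are hypothetical and are not the project's `findStationaryDistribution`:

```js
// Hypothetical sketch of an iteration cap for a Markov chain.
// `transition[i][j]` is the probability of moving from state i to state j.
function iterate(transition, initial, maxIterations) {
  let distribution = initial.slice();
  // The limit is checked before each step, so a limit of 0 performs no
  // steps and the initial distribution is returned unchanged.
  for (let i = 0; i < maxIterations; i++) {
    const next = new Array(distribution.length).fill(0);
    for (let from = 0; from < distribution.length; from++) {
      for (let to = 0; to < distribution.length; to++) {
        next[to] += distribution[from] * transition[from][to];
      }
    }
    distribution = next;
  }
  return distribution;
}
```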

Test plan: see new unit test

Paired with @wchargin
2018-06-29 14:01:17 -07:00
Dandelion Mané 4afa542422
Implement detection of paired authorship (#451)
This commit enables paired authorship on GitHub authored entities.
If the entity has the string "Paired with" in the body, followed by a
username reference, that entity will be recorded as having dual
authorship, with the nominal author and the paired-with author being
treated identically in the relational view and the graph.

If there's a need to pair with more than one author, the "Paired with"
signifier may be repeated. The regex matcher is forgiving of
capitalizing the P or W, and an optional colon may be added immediately
after the word "with".
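
The matching rule described above can be sketched as follows. The function name `parsePairedWith`, the exact pattern, and the assumption that a username reference has the form `@name` after a single space are illustrative only, not the regex used in this commit:

```js
// Hypothetical sketch of the "Paired with" matcher described above.
// Accepts "Paired with", "paired With", etc., with an optional colon
// immediately after "with", followed by an @-username.
function parsePairedWith(body) {
  const pattern = /[Pp]aired [Ww]ith:? @([A-Za-z0-9-]+)/g;
  const usernames = [];
  let match;
  while ((match = pattern.exec(body)) !== null) {
    usernames.push(match[1]);
  }
  return usernames;
}
```

Repeating the signifier yields one entry per occurrence, in order of appearance.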

Note that the code assumes that every `TextContentEntity` is also an
`AuthoredEntity`. If that changes, it will cause a type error and we'll
need to refine the code somewhat.

As implemented, it is impossible for the same user to author a post
multiple times; if this is textually suggested (e.g. by a paired-with
reference to the post's nominal author), the extra paired-with
references are silently ignored. Also, having a paired-with reference
suppresses the basic reference (although it is possible to have a post
that is paired with someone, and additionally references them).

Test plan:
Tests have been updated, and the behavior of the parser is extensively
tested. For an end-to-end demonstration, I've also added a unit test in
the relational view that verifies that sourcecred/example-github#10 has
two authors. You can also see that the graph snapshot has updated to
include additional authorship edges (and that corresponding reference
edges have disappeared).

Closes #218
2018-06-29 13:55:51 -07:00
William Chargin d74d760f43
Add an entry point for the V3 app (#450)
Summary:
This implements Step 3 of the plan described in #448.

Test Plan:
Run `yarn start` and navigate to `/v3` (by clicking the nav link).

wchargin-branch: branch-v3
2018-06-29 13:55:45 -07:00
William Chargin 1d49ec87dc
Add a “version select” to the bridged app (#449)
Summary:
The bridge now lets you select any version of the app that you want, as
long as that’s V1, because that’s the only version that exists. We’ll
add a V3 version shortly.

This implements Step 2 of the plan described in #448.

Test Plan:
In `yarn start` and `node bin/sourcecred.js start`, note that navigating
to `/` redirects to `/v1`, and that the cred explorer works.

wchargin-branch: bridge-select
2018-06-29 13:13:57 -07:00
William Chargin ca5346b524
Create a bridge for the V1 and V3 apps (#448)
Summary:
Our build system doesn’t make it easy to have two separate React
applications, which we would like to have for the V1 and V3 branches.
Instead, we’ll implement a bridge to maintain compatibility.

The plan looks like this:

 1. Change the app from pointing to V1 to pointing to a bridge
 2. Move the router into the bridge and move the V1 app from the `/`
    route to the `/v1` route (e.g., `/v1/explorer`)
 3. Add a V3 app under the `/v3` route
 4. ???
 5. Delete the V1 app and remove it from the bridge
 6. Delete the bridge and move the V3 app from the `/v3` route to `/`

This commit implements Step 1.

Test Plan:
To verify that the bridge is in fact showing, apply

```diff
diff --git a/src/bridge/app/index.js b/src/bridge/app/index.js
index 379e289..72e784c 100644
--- a/src/bridge/app/index.js
+++ b/src/bridge/app/index.js
@@ -9,5 +9,11 @@ const root = document.getElementById("root");
 if (root == null) {
   throw new Error("Unable to find root element!");
 }
-ReactDOM.render(<V1App />, root);
+ReactDOM.render(
+  <React.Fragment>
+    <h1>Hello</h1>
+    <V1App />
+  </React.Fragment>,
+  root
+);
 registerServiceWorker();
```

and say “hello” back to the app.

wchargin-branch: bridge
2018-06-29 13:09:39 -07:00
William Chargin 4184e8594a
Save the GitHub relational store from the CLI (#447)
Summary:
This provides a command-line entry point `load-plugin-v3` (which will
become `load-plugin` eventually), which fetches the GitHub data via
GraphQL and saves the resulting `RelationalStore` to disk.

A change to the Babel config is needed to prevent runtime errors of the
form `_callee7` is not defined, where `_callee7` is a gensym that
appears exactly once in the source (in use position, not definition
position). I’m not sure exactly what is causing the error or why this
config change fixes it. But while this patch may be fragile, I don’t
think that it’s likely to subtly break anything, so I’m okay with
pushing it for now and dealing with any resulting breakage as it arises.

Paired with @decentralion.

Test Plan:
Run `yarn backend`, then run something like:

```
node bin/sourcecredV3.js load-plugin-v3 \
    sourcecred example-github --plugin github
```

Inspect results in `SOURCECRED_DIR/data/OWNER/NAME/github/view.json`,
where `SOURCECRED_DIR` is `/tmp/sourcecred` by default, and `OWNER` and
`NAME` are the repository owner and name.

This example repository takes about 1.1 seconds to run. The SourceCred
repository takes about 45 seconds.

wchargin-branch: cli-load-plugin
2018-06-29 12:12:37 -07:00
William Chargin 3835862f82
Create a V3 command-line entry point (#446)
Summary:
Due to oclif’s structure, this entry point shares its `commands`
directory with that of the V1 entry point. We’ll therefore add commands
like `start-v3` as we go.

Test Plan:
`yarn backend` works, and `node bin/sourcecredV3.js start` launches the
V1 server.

wchargin-branch: v3-cli
2018-06-29 11:47:24 -07:00
Dandelion Mané 3bf496b06f
Update example-github data (#445)
Generated via
```
$ src/v3/plugins/github/fetchGithubRepoTest.sh -u
```

Test plan: travis
2018-06-29 11:37:41 -07:00
Dandelion Mané baec3c15dd
Include Pull additions/deletions (#444)
This adds additions and deletions to the v3 Pull data model, and also
uses them in the pull descriptions.

It's basically a port of #340 to v3.

Test plan: Snapshots
2018-06-29 11:28:51 -07:00
Dandelion Mané 6356c5477f
Add RelationalView.{to,from}JSON (#443)
This adds methods for serializing the GitHub RelationalView.

We have not put in the work to ensure that these methods generate
canonical data. Getting the issues in a different order, or finding
references in a different order, can change the JSON output even if the
resulting repositories are equivalent.

@decentralion thinks it's not worth putting in the effort, since we may
switch to a SQL database soon anyway.

Test plan: travis

Paired with @wchargin
2018-06-28 18:39:31 -07:00
Dandelion Mané 64df5b09c3
Add `RelationalView.addData` (#442)
Now that we want to implement RelationalView de/serialization, we need a
way to construct one without adding data to it.

Now that we're allowing `addData` to be called explicitly, we also want
to make sure it's idempotent, which necessitated a small change to
reference handling. A new test verifies idempotency.

Test plan: travis

Paired with @wchargin
2018-06-28 18:31:58 -07:00
William Chargin 4ee1ed54c8
Transform Markdown AST to strip formatting (#441)
Summary:
This makes progress on #432. We’d like to look for GitHub references
only within each text node of the Markdown AST. But there are two
complications:

  - Text nodes split across formatting, and it’s valid for someone to
    write `*Paired* with @decentralion, but *tested* independently`, or
    `**Closes** #12345`, or something.

  - Sometimes contiguous blocks of text expand to multiple text nodes,
    because of how CommonMark approaches smart punctuation. For
    instance: the document `It's got "punctuation" and stuff!` has eight
    text nodes ([demo][1]).

In this commit, we introduce functions `deformat` and `coalesceText` to
solve these problems. (They go together because `coalesceText` is useful
for testing `deformat`.)
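
To illustrate the two transformations, here is a minimal sketch on a simplified tree of plain objects. It does not use the `commonmark` node API, and the node shapes, the set of formatting types, and the function bodies are all assumptions for this example rather than the code in this commit:

```js
// Hypothetical sketch. A node is either {type: "text", literal} or
// {type, children}; this is not the `commonmark` node representation.
const FORMATTING_TYPES = new Set(["emph", "strong"]);

// Replace each formatting node with its (recursively deformatted)
// children. Returns an array because a node may expand to several nodes.
function deformat(node) {
  if (node.type === "text") return [node];
  const children = [].concat(...node.children.map(deformat));
  return FORMATTING_TYPES.has(node.type)
    ? children
    : [{type: node.type, children}];
}

// Merge each run of adjacent text nodes into a single text node.
function coalesceText(node) {
  if (node.type === "text") return node;
  const children = [];
  for (const child of node.children.map(coalesceText)) {
    const last = children[children.length - 1];
    if (child.type === "text" && last != null && last.type === "text") {
      children[children.length - 1] = {
        type: "text",
        literal: last.literal + child.literal,
      };
    } else {
      children.push(child);
    }
  }
  return {type: node.type, children};
}
```

With these definitions, a paragraph for `**Closes** #12345` would first lose its `strong` wrapper and then collapse to a single text node, which a reference parser can scan in one pass.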

[1]: https://spec.commonmark.org/dingus/?text=It%27s%20got%20%22punctuation%22%20and%20stuff!

wchargin-branch: markdown-deformat
2018-06-28 17:30:59 -07:00
William Chargin 0cc2907e9e
Add dependency on `commonmark` (#440)
Summary:
We plan to use this to more intelligently extract references from GitHub
text content. See #432.

Test Plan:
In a Node shell, running

```js
const cm = require("commonmark");
var parser = new cm.Parser();
var ast = parser.parse("Hello\nworld");
var html = new cm.HtmlRenderer({softbreak: " "}).render(ast);
console.log(html);
```

prints `<p>Hello world</p>`.

wchargin-branch: commonmark
2018-06-28 17:01:31 -07:00
Dandelion Mané 607adeca29
GH: Add a `description` method for entities (#439)
This commit adds a `description` method that takes a GitHub entity, and
returns a description of that entity. Based on the work in #261.

In contrast to the implementation in #261:
- It won't crash on entities without an author (although we don't have a
test case for this; see #389).
- It handles multi-authors reasonably (although we can't test that, as
we haven't implemented multi-authorship yet; see #218).

Test plan:
Inspect snapshot to see some examples.
2018-06-28 16:53:52 -07:00
Dandelion Mané 40db3cdfa3
Add `RelationalView.match` (#438)
`match` implements pattern matching over `Entity`

Test plan:
Unit tests included.
2018-06-28 14:57:23 -07:00
Dandelion Mané e239fdfeeb
Export a clean `Entity` type from relationalView (#437)
Callers will want to write functions that are generic over `Entity`.
This makes those call signatures cleaner.

Test plan: travis
2018-06-28 14:52:24 -07:00
Dandelion Mané a8f54530bc
Add a GitHub `example` module (#436)
Currently, GitHub tests load example data with ad-hoc methods. This
makes it easy for the author of a new test file to forget to clone the
test data (and risk cross-test-file state pollution), or to forget to
apply the correct typing.

This commit factors out a shared `example` module which provides a safe
way to access the example data, along with some convenient helpers for
constructing a graph or relational view.

Test plan:
`yarn travis`

Fixes #430.
2018-06-28 14:52:16 -07:00
Dandelion Mané 529f7db374
Rename `demoData` folders to `example` (#435)
The Git and GitHub plugins have folders that contain small example data,
as used for tests and snapshots. These folders were called `demoData`,
which is misleading since the data isn't used for demos. The folders
themselves contained files called "example", like "example-github.json"
or "exampleRepo.js". Renaming the folders to `example` is cleaner.

Test plan:
`yarn travis --full` passes.
2018-06-28 14:20:31 -07:00
Dandelion Mané 38942d1f7b
Add references to the GitHub graph (#434)
This is a very simple extension of #431 to use the new reference
detection logic added in #429.

Test plan:
Inspect snapshot change for plausibility. Note that the snapshot adds
exactly 16 reference edges, which is the same as the number of
references in the reference snapshot test.
2018-06-28 14:03:45 -07:00
Dandelion Mané 1421148a6d
Refactor GH `createGraph` to use `RelationalView` (#428)
This commit modifies `github/createGraph` to use the `RelationalView`
class created in #424. The code is now much cleaner.

I also fixed some `any`s that were leaking in our test code (due to use
of runtime require for GitHub example data). These anys were discovered
by bumping into uncaught type errors. :)

This commit supersedes #413 and #419.

Test plan:
Observe that the graph snapshot was not changed.
2018-06-28 13:45:41 -07:00
Dandelion Mané c022b3f4d0
`RelationalView` tracks GitHub references (#431)
For every `TextContentEntity` (`Issue`, `Pull`, `Review`, `Comment`),
this commit adds a `references` method that iterates over the entities
that the text content entity references.

For every `ReferentEntity` (actually, every entity), this commit adds a
`referencedBy` method which iterates over the text content entities that
reference that referent entity.

This commit also adds `referentEntities` and `textContentEntities`
methods to the `RelationalView`, as they are used in both implementation
and test code.

Test plan:
The snapshot tests include every reference, in a format that is very
convenient for inspecting the ground truth on GitHub. For every
reference, it's easy to check that the reference actually exists by
copying the `from` url and pasting it into the browser. I've done this
and they check out. (It's not easy to prove that there are no missing
references, but I'm pretty confident that this code is working.)

Unit tests ensure that the `references` and `referencedBy`
methods are consistent.
2018-06-28 13:32:47 -07:00
Dandelion Mané 6235febdac
Add porcelain-style classes to `RelationalView` (#424)
Based on offline design discussion with @wchargin, we've decided to
upgrade the `RelationalView` to be *the* comprehensive source for GitHub
data inside SourceCred. The `RelationalView` will contain the full
dataset, including parsed relational information (such as
cross-references between GitHub entities). Then, we will project our
GitHub graph out of the `RelationalView`.

To that end, the `RelationalView` no longer exports raw data blobs.
Instead, it exports nice classes: `Repo`, `Issue`, `Pull`, `Review`,
and `Userlike`. These classes have convenient methods for accessing both
their own data and related entities, e.g. `repo.issues()` yields all
the issues in that repo.

This is effectively a port of #170 into the v3 API. The main difference
is that in v1, the Graph contained this data store, whereas in v3, we
will use this data store to generate the graph.

This supersedes #418.

Test plan:
The snapshot tests are quite readable.
2018-06-27 15:25:20 -07:00
William Chargin e7b28b81db
Convert V3 graphs to Markov chains (#427)
Summary:
This is based on the V1 file `basicPagerank.js`. The API is necessarily
changed for the new graph format, and we export additional utilities
compared to the previous version of the module (useful for testing and
serialization). We also improve the implementation to make it simpler
and easier to understand.

Test Plan:
Unit tests included.

wchargin-branch: v3-graph-markov-chain
2018-06-27 15:24:47 -07:00
William Chargin b9c67f447f
Expose `advancedGraph` test case (#426)
Summary:
We’d like to use this test case to generate a Markov chain, which
requires that it not be local to the `graph.js` tests.

Test Plan:
Existing unit tests suffice.

wchargin-branch: expose-advanced-graph
2018-06-27 15:18:47 -07:00
William Chargin faa2f8c9d0
Copy Markov chain code from V1 to V3 (#425)
Summary:
This code is independent of the graph abstraction, and so is mostly
copied. The only change is to the structure of the test code (we now
prefer to wrap everything in a big `describe` block with an absolute
path to the module under test).

Test Plan:
Unit tests included.

wchargin-branch: v3-markov-chain
2018-06-27 15:14:42 -07:00
Dandelion Mané 659fc51d9b
Rename `findReferences` to `parseReferences` (#429)
This code is about parsing references out of text, so `parseReferences`
is a better name.

The code that consumes this logic to find all the references in the
GitHub data shall be rightly called `findReferences`.

Test plan:
`yarn travis`
2018-06-27 13:21:25 -07:00
William Chargin 518d5b819c
Represent submodule commits as normal commits (#423)
Summary:
Closes #417. Submodule commits are dead; long live commits. The ontology
is now:

  - A tree includes tree entries.
  - A tree entry may have a blob as contents.
  - A tree entry may have a tree as contents.
  - A tree entry may have a commit as contents.

Test Plan:
Existing unit tests suffice, especially `#commits yields all commits`.

wchargin-branch: git-remove-submodule-commits
2018-06-27 12:01:07 -07:00
William Chargin 38c364c916
Allow Git commits to have zero or one tree (#422)
Summary:
Submodule commits need not have associated tree objects, in case the
repository to which they belong does not exist in our graph. We’d like
to represent submodule commits as actual commits, which necessitates
this change. See #417 for context.

Test Plan:
Existing unit tests suffice.

wchargin-branch: git-affine-trees
2018-06-27 11:47:39 -07:00
William Chargin dd83d7b4ab
Implement a Git graph view (#415)
Summary:
Similar in structure to the GitHub graph view.

Test Plan:
Unit tests added, with full coverage.

wchargin-branch: git-graph-view
2018-06-26 14:00:19 -07:00
William Chargin 0522894a8d
Create Git graph (#406)
Summary:
This commit adds logic to create the Git graph, modeled after the GitHub
graph creator in #405. In this commit, we do not include the
corresponding porcelain; a Git `GraphView` will be added subsequently.

Kudos to @decentralion for suggesting in #187 that I write the logic to
detect BECOMES edges against the high-level data structures. Due to that
decision, the logic and tests are copied directly from the V1 code
without change, because the high-level data structures are the same. The
new code is exactly the body of the `GraphCreator` class.

Test Plan:
Verify that the new snapshot is likely equivalent to the V1 snapshot,
using the heuristic that the two graphs have the same numbers of nodes
(59) and edges (84). (I have performed this check.)

wchargin-branch: git-v3-create-graph
2018-06-26 13:54:47 -07:00