Commit Graph

669 Commits

Author SHA1 Message Date
Dandelion Mané 24895b3c7d
Make `yarn test` more quiet (#1037)
This commit adds a new runOption for execDependencyGraph, namely
`printVerboseResults`. If this flag is true, then execDependencyGraph
will print a "Full Results" section along with the standard error and
standard out of every task, regardless of whether it failed or
succeeded. (Note, this is the existing behavior for all invocations
prior to this commit).

If the flag is not true, then execDependencyGraph will not print a full
results section, and stdout/stderr will be logged only for tasks that
fail.

This commit also modifies `yarn test` to use the new flag so that it
prints verbose results only when the `--full` option is provided. This is
consistent with our sharness behavior: we print the full sharness logs
only when `--full` was provided.

This fixes #1035, and ensures that running `yarn test` has a high
signal-to-noise ratio (i.e. it only shows an enumeration of top level tasks).
This improves the developer ergonomics of SourceCred by not having a
super commonly used and core script spam the user with mostly irrelevant
information.
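
As a rough illustration of the flag's effect, here is a minimal sketch. The `reportResults` helper and the shape of a task result are assumptions made for this example; they are not the actual `execDependencyGraph` internals.

```javascript
// Hypothetical sketch: `reportResults` and the result shape are assumed
// for illustration only.
function reportResults(results, options = {}) {
  const printVerboseResults = options.printVerboseResults === true;
  const lines = [];
  if (printVerboseResults) {
    lines.push("Full results");
  }
  for (const {id, success, stdout, stderr} of results) {
    // Verbose mode logs every task; otherwise only failed tasks are logged.
    if (printVerboseResults || !success) {
      lines.push(`${success ? "PASS" : "FAIL"}  ${id}`);
      if (stdout) lines.push(`  stdout: ${stdout}`);
      if (stderr) lines.push(`  stderr: ${stderr}`);
    }
  }
  return lines;
}

const results = [
  {id: "lint", success: true, stdout: "ok", stderr: ""},
  {id: "unit", success: false, stdout: "", stderr: "1 test failed"},
];
console.log(reportResults(results).join("\n"));
console.log(reportResults(results, {printVerboseResults: true}).join("\n"));
```

With the flag unset, only the failing `unit` task contributes output; with it set, both tasks do.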

Test plan:

Run `yarn test` when all tests are passing, and observe that the output
has much less noise:

```
yarn run v1.12.3
$ node ./config/test.js
tmpdir for backend output: /tmp/sourcecred-test-6337SZ9smvWsWvqE

Starting tasks
  GO   ensure-flow-typing
  GO   check-stopships
  GO   check-pretty
  GO   lint
  GO   flow
  GO   unit
  GO   backend
 PASS  check-stopships
 PASS  ensure-flow-typing
 PASS  flow
 PASS  backend
  GO   sharness
 PASS  sharness
 PASS  check-pretty
 PASS  lint
 PASS  unit

Overview
Final result:  SUCCESS
Done in 11.66s.
```

Run `yarn test` when there is a real failure (e.g. a unit test failure)
and observe that full details on the failure, including the output from
stdout/stderr, are still provided.

Run `yarn test --full` and observe that full, verbose logs are provided.
2019-01-05 18:16:29 -08:00
Brian Litwin 2b9cef66ed Add project title to explorer page (#1032)
Resolves #1027

Using `repoId.owner/repoId.name` for the project title
because that is how projects are identified on `PrototypePage`.

Created a `<ProjectDetail />` component inside `<App />`  that consumes a `RepoId`
and renders a title.

**Test Plan:**

Added two unit tests:

The first verifies that the parent `<App />` component
instantiates a `<ProjectDetail />` component with the correct props.
The current correct prop is a `RepoId` object.

The second test verifies that the `<ProjectDetail />` component renders
the title correctly given the `RepoId`, i.e. as a `<p>` element
with `repoId.owner/repoId.name` for text.

Visual tests verify that the title is above the Analyze Cred
button, and that clicking from one project to another renders
the appropriate title for separate projects.

Attaching a screenshot as a comment at #1032
for reference:

<img width="1253" alt="screenshot 2019-01-04 13 40 03" src="https://user-images.githubusercontent.com/26695477/50706562-34aeff00-102c-11e9-9c1c-6c1e3fa6c415.png">
2019-01-04 12:56:38 -08:00
Dandelion Mané 7c7fa2d83d
Graph: move invariant checker to bottom of class (#1026)
This moves the invariant checking code from the top of the Graph class
to the bottom. Most readers of this file will probably be more
interested in seeing the API, and reading the invariant checker first
is likely to be confusing and off-putting.

Test plan: `yarn test` suffices. No semantic change.
2019-01-03 14:15:40 -08:00
Dandelion Mané bbe773bb67
Elide assertValid & assertValidParts in production (#1017)
This commit substantially improves SourceCred's performance in
production.
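
The message does not say how the checks are elided. One common approach, sketched below, is to swap an expensive check for a no-op when `NODE_ENV` is `"production"`. The `makeInvariantChecker` helper is hypothetical and is not taken from the SourceCred source.

```javascript
// Hypothetical sketch: gating on NODE_ENV is an assumption about how the
// checks might be skipped, not the confirmed mechanism.
function makeInvariantChecker(check, nodeEnv = process.env.NODE_ENV) {
  if (nodeEnv === "production") {
    // In production, replace the expensive check with a no-op.
    return () => {};
  }
  return check;
}

let checksRun = 0;
const expensiveCheck = () => {
  checksRun++;
};

const devCheck = makeInvariantChecker(expensiveCheck, "development");
const prodCheck = makeInvariantChecker(expensiveCheck, "production");
devCheck(); // runs the check
prodCheck(); // does nothing
console.log(checksRun);
```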

Measurement methodology: I create a new tab in Chrome, navigate to my local
prototypes, and select go-ipfs. I then turn on profiling, and click the
analyze button, and then turn off profiling when analysis is done. I
then go to the "bottom-up" tab in the JS analysis box on the bottom and
sort by "Total Time".

__Before this commit:__

|        fn        | total time | time as % |
|:---------------- | ----------:| ---------:|
| assertValid      |      815ms |      8.6% |
| assertValidParts |      261ms |      2.7% |

__After this commit:__

|        fn        | total time | time as % |
|:---------------- | ----------:| ---------:|
| assertValid      |       21ms |      0.2% |
| assertValidParts |       23ms |      0.3% |

Test plan: `yarn test`, also performance measurement as described above.

Fixes #1011.
2018-12-01 17:33:44 -08:00
Dandelion Mané 973a72fe46
Update the blacklisted object ids (#1018)
This adds a blacklisted id for @greenkeeper, a bot which used to be a
user. This is a temporary fix until we solve #998.

Test plan: `yarn test` passes. Before this commit, attempting to load
`probot/probot` fails. After this commit, it succeeds.
2018-11-28 12:39:19 -08:00
Dandelion Mané 794b93e397
Improve performance of pagerank `decompose` (#1007)
When I implemented this function, I incorrectly assumed that
`lodash.sortBy` only calls subsequent accessor functions if there is a
tie from the first accessor. Actually, it calls it every time. We can
avoid lots of wasteful JSON serialization by just grabbing the exact
properties of interest.
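
To make the cost concrete, here is a simplified model of a multi-key sort that evaluates every accessor for every element up front. `sortByEager` is a stand-in written for this example, not lodash's implementation, and the connection objects are made up.

```javascript
// Simplified model (an assumption, not lodash's code): every accessor is
// evaluated once per element, whether or not earlier keys tie.
function sortByEager(items, accessors) {
  const decorated = items.map((value, index) => ({
    value,
    index,
    criteria: accessors.map((accessor) => accessor(value)),
  }));
  decorated.sort((a, b) => {
    for (let i = 0; i < a.criteria.length; i++) {
      if (a.criteria[i] < b.criteria[i]) return -1;
      if (a.criteria[i] > b.criteria[i]) return 1;
    }
    return a.index - b.index; // keep the sort stable
  });
  return decorated.map((d) => d.value);
}

let expensiveCalls = 0;
const connections = [
  {score: 3, payload: {id: "a"}},
  {score: 1, payload: {id: "b"}},
  {score: 2, payload: {id: "c"}},
];
const sorted = sortByEager(connections, [
  (c) => c.score,
  (c) => {
    expensiveCalls++; // counts how often the costly tie-breaker runs
    return JSON.stringify(c.payload);
  },
]);
console.log(sorted.map((c) => c.payload.id), expensiveCalls);
```

Even though no two scores tie, the serializing accessor runs once per element, which is why a cheaper tie-breaker helps.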

Test plan:

For correctness: `yarn test` suffices, as this functionality is already
tested.

For performance improvement: I ran the full load+analyze workflow, in
Chrome, on twbs/bootstrap. Before this change, decompose took 6.9s;
after this change, it takes 1.3s, for a 5.3x speedup.

Close #943.
2018-11-16 22:26:55 -08:00
William Chargin 80b458d719
core: allow repo ID registry to store metadata (#1003)
Summary:
Our registry was defined to simply be a list of IDs. This is
insufficiently flexible; we want to be able to annotate these IDs with,
e.g., last-updated times (#989). This commit wraps the entries in a
simple object, updating clients appropriately.
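
The difference in shape can be sketched as follows. The field name `repoId` and the `wrapRegistry` helper are assumptions for illustration. The commit itself does not migrate old data: per the test plan below, old-format registries are rejected.

```javascript
// Hypothetical shapes: field names are assumed, not taken from the source.
const oldRegistry = [{owner: "sourcecred", name: "example-github"}];

// Wrapping each ID in an object leaves room for metadata later on.
function wrapRegistry(registry) {
  return registry.map((repoId) => ({repoId}));
}

const newRegistry = wrapRegistry(oldRegistry);
console.log(JSON.stringify(newRegistry));
```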

Test Plan:

  - Run `node ./bin/sourcecred.js load sourcecred/example-github` with a
    repository registry in the old format, and note that it errors
    appropriately.
  - Run `yarn build` with a repository registry in the old format, and
    note that it errors (“Compat mismatch”).
  - Delete the old registry and re-run the `load` command. Note that it
    runs successfully and outputs a registry. Run `yarn build`; note
    that this works.
  - Load data for two repositories. Run `yarn start`. Note that the list
    of prototypes still works, and that you can navigate to and render
    attributions for individual project pages.
  - Verify that `yarn test --full` passes.

wchargin-branch: repo-id-registry-metadata
2018-11-09 17:28:39 -08:00
William Chargin 332e776317
deps: upgrade `flow-bin@^0.86.0` (#1002)
Summary:
There have been some breaking changes that require new type annotations,
which is a good thing: these prevent `any`-leakage.

Test Plan:
Run `yarn flow`.

wchargin-branch: flow-v0.86.0
2018-11-09 09:24:40 -08:00
William Chargin 0a6eca7d79
link: verify that routes have trailing slash (#1001)
Summary:
This serves as a regression test for #1000.
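
A check of this kind might look like the sketch below. The `routesMissingTrailingSlash` helper and the route list are hypothetical; the actual test lives in the repository.

```javascript
// Hypothetical sketch of a trailing-slash check over route paths.
function routesMissingTrailingSlash(paths) {
  return paths.filter((path) => !path.endsWith("/"));
}

const offenders = routesMissingTrailingSlash(["/", "/prototypes/", "/prototype"]);
console.log(offenders);
```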

Test Plan:
Note that `yarn unit` passes with this patch but fails if the change to
the code is reverted, or if the patch in #1000 is reverted. Note that
`yarn build` also passes but fails if the patch in #1000 is reverted.
Note also that `yarn test --full` passes.

wchargin-branch: link-verify-trailing-slash
2018-11-08 20:53:30 -08:00
William Chargin 24c1873dca
site: fix homepage link to prototypes page (#1000)
Summary:
Prior to this commit, clicking the in-copy link to the prototypes page
would raise a console error:

> Warning: [react-router] Location "/prototype" did not match any routes

Test Plan:
Run `yarn start` and click the link.

wchargin-branch: site-fix-homepage-prototype-link
2018-11-08 20:40:57 -08:00
William Chargin 897ba78d5c
site: fix prototypes page dimensions (#999)
Summary:
Prior to this commit, the prototypes page, which lists just a handful of
repositories, was rendered with a vertical scrollbar: you had to scroll
200px to see the version info. This is silly.

The `height: 100%` is necessary not to get it to fill up the whole page,
but to get it to _not_ fill up ~30 extra pixels. I have no idea why.

Test Plan:
Run `yarn start` and note that `/prototypes/` now renders without a
scrollbar, and with the version info in the bottom-right corner.

wchargin-branch: site-fix-prototypes-page-dimensions
2018-11-08 20:39:23 -08:00
Dandelion Mané 8666f9ac1a
cleanup: remove unused field on ScoredConnection (#994)
This resolves an outstanding TODO in pagerankNodeDecomposition to remove
the unused sourceScore field.

I have removed it, and it was indeed unused.

Test plan: `yarn test` passes.
2018-11-01 18:55:32 -07:00
William Chargin beccac822f
MapUtil: provide exact output from `toObject` (#993)
Summary:
The `MapUtil` map–object conversion functions used inexact objects for
both input and output. They are in fact stronger than that: they can
accept arbitrary inexact objects and return arbitrary exact outputs.
(Recall that exact objects are subtypes of their inexact counterparts,
so this is the maximally permissive combination.)

Test Plan:
Unit tests added. The “can return an exact object” test fails Flow
before this change. The other tests would have passed already.

wchargin-branch: maputil-exact-output
2018-11-01 18:55:14 -07:00
Dandelion Mané 252d8d5c99
Move repoIdRegistry to core (#992)
RepoIdRegistry is used across the project, but not in the explorer. So
it makes very little sense that it live in the explorer module. It's now
moved to core.

Test plan: `yarn test --full` passes
2018-11-01 18:11:48 -07:00
Dandelion Mané 6b8cb66013
Remove cred feedback url configurability (#991)
We added a configurable cred feedback url on the theory that we would
create a separate discourse post to collect feedback for each project in
particular.

We've now realized that no one is using this, so it's just vestigial
complexity now. So I'm removing the logic for configuring the feedback
url on a per-project basis.

Instead, we will always link to a Google form for collecting feedback.

Test plan: `yarn test --full` passes, and I manually checked the links.
2018-11-01 17:43:37 -07:00
Dandelion Mané 29065f44d6
Remove the repository select from explorer/ (#988)
Historically, a single cred explorer instance could load many different
repositories. This turned out to be an anti-feature: we'd rather have a
particular url hardlink to exploring the cred for a particular project.

This commit removes the repository select from the explorer, and instead
mandates that the explorer always has the RepoId passed down from above.
Besides providing a better UX, this also greatly simplifies the logic
for the explorer, since we no longer have an "initializing state" that
doesn't have any RepoId.

This builds on the work in #984, and swaps out the old "prototype" page
(which has been rendered non-functional by this change) for the new
"prototypes" page. Note that it stays at the same route, so links to
sourcecred.io/prototype will continue to function.

Test plan: Ran `yarn test --full`, and verified that `yarn start`
produces a working site.
2018-11-01 16:10:01 -07:00
William Chargin 738853cd02
homepage: render project-specific prototype pages (#984)
Summary:
Currently, we simply render a placeholder. Soon, we’ll remove the
repository selector dropdown from the cred explorer, and render
project-specific cred attributions.

Test Plan:
Run `yarn start`. Navigate to `/prototypes/` and observe:

![Screenshot of `/prototypes/`](https://user-images.githubusercontent.com/4317806/47877810-03227900-ddda-11e8-9a17-28398d83059f.png)

Note that the links point to URLs like
`/prototypes/sourcecred/example-github`:

![Screenshot of a project page](https://user-images.githubusercontent.com/4317806/47877888-35cc7180-ddda-11e8-95db-9f5099e146a8.png)

Then, check that `yarn test --full` passes.

wchargin-branch: homepage-project-pages
2018-11-01 15:19:52 -07:00
William Chargin 665bb67e33
homepage: add prototypes listing (#983)
Test Plan:
Apply the following patch:

```diff
diff --git a/src/homepage/routeData.js b/src/homepage/routeData.js
index 32d3eb65..aac7fc9a 100644
--- a/src/homepage/routeData.js
+++ b/src/homepage/routeData.js
@@ -38,7 +38,10 @@ const routeData /*: $ReadOnlyArray<RouteDatum> */ = [
     path: "/prototypes/",
     contents: {
       type: "PAGE",
-      component: () => require("./PrototypesPage").default([]),
+      component: () =>
+        require("./PrototypesPage").default([
+          {owner: "sourcecred", name: "example-github"},
+        ]),
     },
     title: "SourceCred prototypes",
     navTitle: null, // for now
```

Then, load <http://localhost:8080/prototypes/> and see that there is an
entry in the list, and that it links to
<http://localhost:8080/prototypes/sourcecred/example-github/>. Note that
clicking the link raises a console error because there is no such route.

wchargin-branch: homepage-prototypes-page
2018-11-01 13:30:36 -07:00
Dandelion Mané a9db2b0919
webpack: expose repo registry at build time (#981)
Summary:
We want to remove the repository selector dropdown on the cred explorer
homepage and instead render a separate web page for each project. To do
this, we need to know which pages to render statically. We choose to
ingest this information from the state of the repository registry at
build time.

This commit adds an environment variable `REPO_REGISTRY` whose contents
are the stringified version of the repository registry, or `null` if
SourceCred has been built for the backend. This variable is defined with
Webpack’s `DefinePlugin`, so any code bundled by Webpack can refer to it
via `process.env.REPO_REGISTRY` both on the server and in the browser.
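
Consuming code might read the variable roughly as sketched here. The `parseRepoRegistry` helper is hypothetical, and the registry is assumed to be a JSON array of `{owner, name}` objects.

```javascript
// Hypothetical helper: takes the raw value as an argument so that it can be
// exercised outside a Webpack build.
function parseRepoRegistry(raw) {
  if (raw == null) {
    return null; // variable not defined at all
  }
  // The string "null" (backend builds) parses to null as well.
  return JSON.parse(raw);
}

// Outside a Webpack build this is usually undefined, so the result is null.
console.log(parseRepoRegistry(process.env.REPO_REGISTRY));
console.log(
  parseRepoRegistry('[{"owner":"sourcecred","name":"example-github"}]')
);
```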

Paired with @wchargin.

Test Plan:
Sharness tests modified; running `yarn test --full` suffices.
2018-11-01 12:38:18 -07:00
Dandelion Mané 244c01d764
Move plugin choice from explorer to homepage (#979)
The explorer no longer ships with a set of default plugins. (This made
an inappropriate dependency from explorer/ to plugins/, and complicated
explorer's contract as a generic component.) Instead, the homepage
module is responsible for choosing the plugins to display on the
homepage.

Test plan: `yarn test --full` passes, and `yarn start` reveals a
functioning homepage and prototype.
2018-11-01 11:39:07 -07:00
Dandelion Mané 0ad1e0557f
Move `version.js` to core (#977)
Currently version is located in `homepage/`, which doesn't make much
sense, since it's versioning the whole project.

We move it to core.

Test plan: `yarn test --full`
2018-11-01 11:33:03 -07:00
William Chargin 64500f53cb
mirror: remove "demo" module (#978)
Summary:
This was used for ad hoc testing of the Mirror module before it was
integrated into SourceCred. We haven’t kept it up to date with schema
changes, and it is no longer needed: you can just run `sourcecred load`.

This was also the only untested code in the `graphql/` package, so it is
nice to remove it.

Test Plan:
Running `yarn test --full` passes.

wchargin-branch: remove-mirror-demo
2018-11-01 11:29:57 -07:00
William Chargin d19227c268
github: use blacklists to unblock twbs/bootstrap (#973)
Summary:
This adds object IDs to the GitHub GraphQL blacklist such that the
`twbs/bootstrap` repository can be loaded.

Ingesting the Mirror-extracted data into the RelationalView yields the
warnings

```
IssueComment[MDEyOklzc3VlQ29tbWVudDEwNTI4Mzk4Ng==].reactions: unexpected null value
IssueComment[MDEyOklzc3VlQ29tbWVudDI0NTQ3OTM3OA==].reactions: unexpected null value
IssueComment[MDEyOklzc3VlQ29tbWVudDMwNDE4NzIzMg==].reactions: unexpected null value
```

because we have nulled out these `Reaction`s in their enclosing
connections. This is expected.

Test Plan:
Run `yarn backend` and `node ./bin/sourcecred.js load twbs/bootstrap`.
Run `yarn start` and note that the cred attribution renders properly.

(Loading the GitHub data may take an hour or two. The resulting SQLite3
database is 172MB. Ingesting it into the `RelationalView` still takes
just a few seconds, and the cred attribution is rendered quickly.)

wchargin-branch: github-use-blacklists
2018-11-01 11:08:17 -07:00
William Chargin fe50ca83f6
mirror: allow blacklisting objects by ID (#972)
Summary:
This enables us to deal with GraphQL remotes that violate their contract
guarantees and provide a node of the wrong type. An instance in which
the GitHub GraphQL API does this is documented here:
<https://gist.github.com/wchargin/a2b8561b81bcc932c84e493d2485ea8a>

A Mirror cache is only valid for a fixed set of blacklisted IDs. This is
necessary to ensure consistency. If the set of blacklisted IDs changes,
simply remove the existing database and download again from scratch.
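
The consistency rule can be pictured with the sketch below, which compares the blacklist a cache was created with against the one currently requested. The `checkBlacklistUnchanged` helper is hypothetical; how Mirror actually stores and compares the set is not described here.

```javascript
// Hypothetical sketch: refuse to reuse a cache whose blacklist differs.
function checkBlacklistUnchanged(storedIds, requestedIds) {
  const stored = [...new Set(storedIds)].sort();
  const requested = [...new Set(requestedIds)].sort();
  const same =
    stored.length === requested.length &&
    stored.every((id, i) => id === requested[i]);
  if (!same) {
    throw new Error(
      "blacklist changed; remove the database and download again"
    );
  }
}

checkBlacklistUnchanged(["id-1", "id-2"], ["id-2", "id-1"]); // order is irrelevant
try {
  checkBlacklistUnchanged(["id-1"], ["id-1", "id-2"]);
} catch (e) {
  console.log(e.message);
}
```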

Test Plan:
Unit tests added, with full coverage.

wchargin-branch: mirror-blacklist
2018-11-01 11:04:49 -07:00
Dandelion Mané 69989256f6
Rename `app/` to `homepage/` (#974)
Now that we've moved the explorer out of app, it is more concisely
described as the homepage.

Test plan: Rename only. Run `yarn test`.
2018-11-01 10:19:51 -07:00
Dandelion Mané 604db14879
Pull `credExplorer` into its own top-level module (#971)
Currently, the cred explorer is a submodule of `app`. This is somewhat
confusing, as `app` is essentially our homepage, and the explorer is a
standalone React application which happens to get embedded in our
homepage. This commit pulls the explorer from `app/credExplorer/` into
`explorer/`, which is a better organization.

The `app/adapters` were actually only used by the cred explorer, so
those files have been moved to `explorer/adapters`. We should rename
them from "App Adapters" to "Explorer Adapters", but I didn't do that in
this commit so as to minimize the (already substantial) size of the
change.

Also, we should rename `app/` to `homepage/` in a subsequent commit.

I encountered a nasty Flow bug, which I fixed with help from @wchargin.
The result is extra annotations on the demo and fallback dynamic
adapters (so that the `static()` method is type annotated).

Test plan: This change is massive, but it's just a rename. `yarn test`
suffices.
2018-11-01 10:16:42 -07:00
Dandelion Mané c997f4e1ec
Collect web-related utilities in `webutil/` (#970)
I'm planning to pull `credExplorer` out of `app` and into its own
top-level module. This is a bit awkward, as `credExplorer` depends on
a lot of little modules that are currently collected in `app/`.

To resolve this, I pull all of these little utility modules into
`webutil/`. It's not a totally principled grouping, but it's quite
convenient and keeps these rarely changing modules out of the way.

Test plan: It's a file move, `yarn test` suffices.
2018-10-31 21:24:25 -07:00
Dandelion Mané b077bd8179
Move `weightsToEdgeEvaluator` to `analysis` (#969)
The logic for converting weights into an edge evaluator should not be
coupled to the frontend application.

Progress towards #967.

Test plan: Very straightforward rename; `yarn test` suffices.
2018-10-31 20:14:30 -07:00
Dandelion Mané 48a66c8118
Declare the fallback plugin in analysis (#968)
Now that the `analysis` module owns the Node and Edge types, it should
own the "fallback plugin" too. (Note that it's not actually a plugin,
though it somewhat acts like one.)

We now declare the fallback type in `analysis`, along with a fallback
analysis adapter. `app/adapters` then declares a fallback app adapter.

Test plan: `yarn test`

Progress towards #967.
2018-10-31 20:02:38 -07:00
Dandelion Mané f3ddb84cbd
Move weight logic to analysis (#966)
There's a folder called `app/credExplorer/weights` which contains the
type specification for weights (for PageRank configurability), and also
contains frontend code for specifying those weights. This commit creates
a `weights` module under `analysis` which will contain just the logic
for specifying and using the weights, without any frontend
consideration.

It's mostly a port of the existing logic in `credExplorer/weights`, with
the caveat that app adapter related concepts have been removed, in favor
of referencing the declaration instead.

We then remove the duplicated logic and re-route imports.

Test plan: `yarn test`
2018-10-31 13:27:03 -07:00
Dandelion Mané 86a5b532f8
Add the demo plugin (#965)
* Add the demo plugin

This ports the ad-hoc demo adapter defined in
`src/app/adapters/demoAdapters.js` into its own demo plugin.

This has the benefit that the demo plugin can now be depended on outside
the app module, e.g. for the analysis module as well. Correspondingly,
I've added a demo analysis adapter.

Test plan: `yarn test`. Note that no unit tests were added, as the demo
plugin is trivial.

* Delete `src/app/adapters/demoAdapters.js`

Now that we have an explicit demo plugin at `src/plugins/demo/`, we can
remove the legacy declaration of that plugin within the `app` module.

This commit deletes the old version, and re-writes all references to
point to the standalone plugin.

Test plan: `yarn test`
2018-10-31 13:23:46 -07:00
William Chargin 6b789d61d6
github: remove legacy continuations code (#964)
Summary:
It is time. (Replaced with #622.)

Test Plan:
Running `yarn flow` suffices. Running `yarn test --full` also passes.

wchargin-branch: remove-legacy-graphql
2018-10-31 12:45:59 -07:00
William Chargin 2e0b17cef7
mentionsAuthorReference: remove legacy GraphQL dep (#963)
Summary:
This test has data in the old format, and uses the `RelationalView`
method that automatically translates it. As we prepare to delete that
code, we upgrade the underlying format of this test data. The end code
is nicer to read, too (e.g., we don’t need the `connection` helper
function).

Recommend reviewing with `git show -b`.

Test Plan:
Running `yarn test` suffices.

wchargin-branch: mentionsAuthorReference-remove-legacy-graphql
2018-10-31 12:36:48 -07:00
William Chargin b77db72c1d
github: remove some deps on github/graphql.js (#962)
Summary:
A number of modules depended on the legacy `github/graphql.js` module
solely to get at the `Reactions` enum object. As of #961, that object is
exposed from the much lighter-weight `graphqlTypes.js`. This patch
switches over the relevant imports, reducing our dependencies on this
legacy module and its large bundle size.

Test Plan:
It suffices to run `yarn flow` and verify that the two values being
imported are identical.

wchargin-branch: github-use-generated-enums
2018-10-31 12:24:06 -07:00
William Chargin 2377f1980f
schema: generate runtime constants for enum values (#961)
Summary:
We have a `const Reactions` convenience enum in `github/graphql.js`.
That value is useful, but that module is slated to die. This commit
extends our Flow type generation script to include these values.
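
The kind of runtime constant involved might be built as in this sketch. The `enumConstants` helper and the particular reaction values listed are illustrative assumptions; the real generation script emits its output from the schema.

```javascript
// Hypothetical sketch: build a frozen name-to-value object for an enum.
function enumConstants(values) {
  const result = {};
  for (const value of values) {
    result[value] = value;
  }
  return Object.freeze(result);
}

const Reactions = enumConstants(["THUMBS_UP", "THUMBS_DOWN", "HEART"]);
console.log(Reactions.THUMBS_UP);
```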

Test Plan:
Existing unit tests suffice.

wchargin-branch: schema-generate-enums
2018-10-31 12:22:58 -07:00
Dandelion Mané 9bccb7661d
Change the GitHub default TTL to one week (#960)
While we wait for explicit configurability, one week is a better
default for the many SourceCred demos I maintain.

Test plan: n/a
2018-10-31 12:21:08 -07:00
Dandelion Mané b86b0b32ec
Add `analysisAdapter` (#950)
For #704, we're adding plugin adapters that are specific to the needs of
the analysis module. They have a simple scope: they just provide a way
to get the declaration, and to load the corresponding graph.

Adapters for the `git` and `github` plugins have been implemented, along
with unit tests.

Test plan: `yarn test` suffices.
2018-10-31 11:36:30 -07:00
William Chargin 8706fa9771
git: don’t warn when rendering unknown commits (#957)
Summary:
Fixes #953. See that issue for context.

Test Plan:
Unit tests updated. To see the change in action, load the SourceCred
data and expand @decentralion’s commits-authored to find commits that
were merged into non-`master` branches. Note that these commits are
rendered correctly (in the same way that they were before this patch),
and that there is no console error (new as of this patch).

![Screenshot](https://user-images.githubusercontent.com/4317806/47805669-1f98b580-dcf5-11e8-8683-8ee91f7f478a.png)

wchargin-branch: git-no-warn-on-unknown-commit
2018-10-31 10:45:25 -07:00
William Chargin f9bb75ef71
release: v0.2.0 (#952)
Test Plan:
Remove the SourceCred output directory, run `yarn backend`, and load
data for `sourcecred/example-github` and `sourcecred/sourcecred`. Then,
run `yarn start` and note that the cred explorer still works. Finally,
note that `yarn test --full` passes.

wchargin-branch: release-v0.2.0
2018-10-30 15:18:19 -07:00
Dandelion Mané c4e2ec8839
Rename `PluginAdapter` to `AppAdapter` (#948)
Now that we're planning to add adapters for the `analysis` module, we
should rename the `PluginAdapter` to make it clear that it's scoped for
`app`.

Test plan: `yarn test` suffices
2018-10-30 00:26:22 +00:00
Dandelion Mané cb30023a02
Factor out plugin declarations (#947)
The plugin adapters are specific to `app/` and have logic for fetching
data from the backend, producing React nodes for descriptions, et
cetera.

However, they also have information that is generic to the plugin
itself: its name, its node/edge prefixes, and its types.

This commit factors out the generic info into a `PluginDeclaration`,
which is a type (rather than an interface). Then, the plugin adapter has
a `declaration` method which returns the declaration.

Current users of the plugin adapters get additional mechanical
complexity because they need to call `.declaration().property` rather
than `.property()`, but this is not too significant.
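
The split can be sketched as follows. The property names (`name`, `nodePrefix`, `edgePrefix`, `nodeTypes`, `edgeTypes`) and the `ExampleAdapter` class are assumptions based on the description above, not the actual SourceCred types.

```javascript
// Hypothetical sketch: plain data for the plugin-generic info.
const exampleDeclaration = Object.freeze({
  name: "Example",
  nodePrefix: "example/node",
  edgePrefix: "example/edge",
  nodeTypes: [],
  edgeTypes: [],
});

// The adapter keeps its app-specific logic and exposes the declaration.
class ExampleAdapter {
  declaration() {
    return exampleDeclaration;
  }
}

// Code that needs only generic info can accept a declaration directly.
function describePlugin(declaration) {
  return `${declaration.name} (${declaration.nodePrefix})`;
}

const adapter = new ExampleAdapter();
console.log(adapter.declaration().name);
console.log(describePlugin(adapter.declaration()));
```

Because the declaration is plain data, a command-line tool could use it without loading any frontend code.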

The main benefit is that #704 is unblocked, as the cli `analyze` command
will be able to get necessary information from the declaration. As an
added benefit, the organization of plugin code is somewhat improved.

Test plan: `yarn test` suffices, but I also ran `yarn start` and
checked the UI behavior to be safe.
2018-10-30 00:13:11 +00:00
Dandelion Mané 5f2cc56172
Move `{Node,Edge}Type` to src/analysis/types.js (#946)
Test plan: `yarn test`

Part of ongoing work on #704
2018-10-29 23:49:53 +00:00
Dandelion Mané 917b793aca
Move some files from core/attribution/ to analysis/ (#944)
The `core/attribution/` folder has some code that really is "core" in
that it deals with very basic concepts around converting graphs to
markov chains and running PageRank on them, and some code that is less
"core", like for normalizing scores and doing analysis on them.

To make progress on #704, we need an intermediary directory that has
analysis-related code that is e.g. aware of Node and Edge types, and
weights on those types, and can use them to run weight-informed
PageRank. That code shouldn't live in the app directory (since it is not
coupled to the frontend rendering), but also shouldn't live in core
(since "core" is basically finalized code with fully baked abstractions,
and per #710, this is not true of the node/edge type system).

Thus, I've decided to create the `analysis` directory. To get that
directory started, I've moved the non-core code in `core/attribution/`
to `analysis/`.

Test plan: `yarn test` passes, which is all we need, since this is a
straightforward file rename.
2018-10-29 22:54:15 +00:00
Dandelion Mané 542e2f9723
Add skeleton of `sourcecred analyze` (#942)
The `analyze` command is the first step towards #704 and #703. When
fully implemented, it will run PageRank for a loaded repository,
generating a complete graph and cred attribution.

For now, this just adds a scaffold. It does basic argument parsing, and
has help text, but the actual command is not yet implemented.

Test plan:
Unit tests verify that the analyze command is hooked into `sourcecred`
and `sourcecred help`, and that it responds to the `--help` command and
parses its arguments appropriately.
2018-10-29 22:27:06 +00:00
William Chargin 08219f98bf
fetchGithubRepo: use Mirror pipeline (#937)
Summary:
As of this commit, `node ./bin/sourcecred.js load` uses the Mirror code,
and the legacy continuation-fetching code is not included in the
`sourcecred.js` bundle.

We do not yet perform the commit prefetching described in #923. The code
should be plenty fast for repositories that merge pull requests at least
occasionally.

Test Plan:
Running `yarn test --full` passes. Loading `sourcecred/sourcecred` works
and generates a reasonable credit attribution. Loading it again
completes immediately.

wchargin-branch: fetchGithubRepo-mirror
2018-10-28 12:03:06 -07:00
William Chargin e2c99c418b
relationalView: use new data format (#934)
Summary:
This makes significant progress toward #923. As of this commit, it is
possible to use the Mirror module for the whole loading pipeline. This
process may be slow for repositories that do not use pull requests at
all (more precisely, that have large connected commit subgraphs none of
whose nodes is the merge commit of a pull request; see #920 for details)
so it is not yet the default codepath.

Test Plan:
Existing unit tests should suffice. For extra testing, I’ve added a
script that fetches a repository both via the old continuations logic
and the new Mirror logic, then constructs relational views and checks
whether the data is the same. For `example-github`, the views are
identical. For `sourcecred`, they are not: the old continuations logic
erroneously omits two commits, which the Mirror logic includes.

You can run the test like this:

```
$ node ./bin/testContinuations.js \
> sourcecred sourcecred MDEwOlJlcG9zaXRvcnkxMjAxNDU1NzA= \
> /tmp/continuations.json /tmp/mirror.json \
> 2> >(jq . >&2)
{
  "child": "0d38dde23a6de831315f3643a7d2bc15e8df7678",
  "parent": "cb8ba0eaa1abc1f921e7165bb19e29b40723ce65",
  "type": "UNKNOWN_PARENT_OID"
}
{
  "child": "d152f48ce4c2ed1d046bf6ed4f139e7e393ea660",
  "parent": "de7a8723963d9cd0437ef34f5942a071b850c0e7",
  "type": "UNKNOWN_PARENT_OID"
}
Different. Saving to disk...
```

Use `diff -u <(jq . /tmp/continuations.json) <(jq . /tmp/mirror.json)`
to inspect the differences, and note that exactly the two missing
commits have been added and that there are no other changes. (The diff
is small: just 51 lines of nicely formatted JSON.) The full log is here:
<https://gist.github.com/wchargin/e159cac9dcf3cc3b1efbd54f59e24e0b>

I also generated the `sourcecred/sourcecred` cred attribution and viewed
it with `yarn start`, which seems to work fine.

wchargin-branch: relationalview-new-data-format
2018-10-23 16:42:49 -07:00
William Chargin 58e98124ac
relationalView: make snapshots order-invariant (#933)
Summary:
An upcoming commit will happen to change the order in which commits are
ingested. This is not an observable change, and should not cause a
snapshot failure.
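
The message does not say how the invariance is achieved. One generic way, sketched below, is to sort records by a stable key before snapshotting. The `orderInvariantSnapshot` helper and the shortened `oid` values are made up for this example.

```javascript
// Hypothetical sketch: sort by a stable key so ingestion order cannot
// affect the serialized snapshot.
function orderInvariantSnapshot(commits) {
  return [...commits].sort((a, b) =>
    a.oid < b.oid ? -1 : a.oid > b.oid ? 1 : 0
  );
}

const ingestedOneWay = [{oid: "6bd1b4c"}, {oid: "0a22334"}];
const ingestedAnotherWay = [{oid: "0a22334"}, {oid: "6bd1b4c"}];
console.log(
  JSON.stringify(orderInvariantSnapshot(ingestedOneWay)) ===
    JSON.stringify(orderInvariantSnapshot(ingestedAnotherWay))
);
```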

Test Plan:
Inspection.

wchargin-branch: relationalview-snapshots-order-invariant
2018-10-23 10:50:18 -07:00
William Chargin 993de9303a
github: translate old format to structured format (#930)
Summary:
This implements the translation module described in #923. See that issue
for context.

Test Plan:
This is a mostly straightforward translation from one strongly typed
data structure to another, so Flow handles most of it.

As a check on the snapshot, run:

```
$ grep -e oid -e target -e mergeCommit \
> src/plugins/github/__snapshots__/translateContinuations.test.js.snap
      "target": Object {
        "oid": "6bd1b4c0b719c22c688a74863be07a699b7b9b34",
            "oid": "c430bd74455105f77215ece51945094ceeee6c86",
                "oid": "6d5b3aa31ebb68a06ceb46bbd6cf49b6ccd6f5e6",
                    "oid": "0a223346b4e6dec0127b1e6aa892c4ee0424b66a",
                        "oid": "ec91adb718a6045b492303f00d8e8beb957dc780",
                        "oid": "ecc889dc94cf6da17ae6eab5bb7b7155f577519d",
                            "oid": "ec91adb718a6045b492303f00d8e8beb957dc780",
        "mergeCommit": Object {
          "oid": "0a223346b4e6dec0127b1e6aa892c4ee0424b66a",
              "oid": "ec91adb718a6045b492303f00d8e8beb957dc780",
              "oid": "ecc889dc94cf6da17ae6eab5bb7b7155f577519d",
                  "oid": "ec91adb718a6045b492303f00d8e8beb957dc780",
        "mergeCommit": Object {
          "oid": "6d5b3aa31ebb68a06ceb46bbd6cf49b6ccd6f5e6",
              "oid": "0a223346b4e6dec0127b1e6aa892c4ee0424b66a",
                  "oid": "ec91adb718a6045b492303f00d8e8beb957dc780",
                  "oid": "ecc889dc94cf6da17ae6eab5bb7b7155f577519d",
                      "oid": "ec91adb718a6045b492303f00d8e8beb957dc780",
        "mergeCommit": null,
```

Cross-check this against [the example-github commits][commits] thus:

  - Note that commit `6bd1b4c` is the head commit, and is thus the root
    commit of the `target` chain.
  - Note that commits `0a22334` and `6d5b3aa`, which were merged via
    pull request, appear twice each: once in the history from head, and
    once as the merge commit of a pull request.
  - Note that commit `0a22334` has two parents at each occurrence.
  - Note that the unmerged pull request’s merge commit is `null`.

[commits]: https://github.com/sourcecred/example-github/commits/master

To run this on real-world data, apply the following patch:

```diff
diff --git a/src/plugins/github/fetchGithubRepo.js b/src/plugins/github/fetchGithubRepo.js
index 6ac201af..b14ca760 100644
--- a/src/plugins/github/fetchGithubRepo.js
+++ b/src/plugins/github/fetchGithubRepo.js
@@ -11,6 +11,7 @@ import {stringify, inlineLayout, type Body} from "../../graphql/queries";
 import {createQuery, createVariables, postQueryExhaustive} from "./graphql";
 import type {GithubResponseJSON} from "./graphql";
 import type {RepoId} from "../../core/repoId";
+import translateContinuations from "./translateContinuations";

 /**
  * Scrape data from a GitHub repo using the GitHub API.
@@ -44,6 +45,11 @@ export default function fetchGithubRepo(
     payload
   ).then((x: GithubResponseJSON) => {
     ensureNoMorePages(x);
+    console.warn("Translating continuations...");
+    for (const w of translateContinuations(x).warnings) {
+      console.warn(w);
+    }
+    console.warn("Done.");
     return x;
   });
 }
```

Then run:

```
$ yarn backend >/dev/null 2>/dev/null; echo $?
0
$ node ./bin/sourcecred.js load sourcecred/sourcecred --plugin github 2>&1 |
> ts -s '%.s'
55.015740 Translating continuations...
55.037217 { type: 'UNKNOWN_PARENT_OID',
55.037273   child: '0d38dde23a6de831315f3643a7d2bc15e8df7678',
55.037290   parent: 'cb8ba0eaa1abc1f921e7165bb19e29b40723ce65' }
55.037309 { type: 'UNKNOWN_PARENT_OID',
55.037336   child: 'd152f48ce4c2ed1d046bf6ed4f139e7e393ea660',
55.037359   parent: 'de7a8723963d9cd0437ef34f5942a071b850c0e7' }
55.037383 Done.
```

Note that the two commits in question were each merged into a non-master
branch, in #28 and #329 respectively. Note also that translating these
continuations took just 22 milliseconds.
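
For readers unfamiliar with the warning shape, here is a minimal sketch
of the kind of check that could produce it. It assumes that
`UNKNOWN_PARENT_OID` means "a commit lists a parent whose OID is absent
from the fetched data"; the function name `findUnknownParents`, the
`Map` input shape, and the placeholder OIDs are hypothetical and are not
taken from `translateContinuations.js`.

```javascript
// Hypothetical sketch: `commits` maps each known commit OID to the list
// of its parent OIDs. Only the warning fields (type, child, parent)
// mirror the log output above; everything else is an assumption.
function findUnknownParents(commits) {
  const warnings = [];
  for (const [child, parents] of commits) {
    for (const parent of parents) {
      if (!commits.has(parent)) {
        warnings.push({type: "UNKNOWN_PARENT_OID", child, parent});
      }
    }
  }
  return warnings;
}

const commits = new Map([
  ["aaa111", []],
  ["bbb222", ["aaa111"]],
  ["ccc333", ["bbb222", "ddd444"]], // "ddd444" was never fetched
]);
for (const w of findUnknownParents(commits)) {
  console.warn(w);
}
```

Collecting warnings into a list, rather than throwing, lets a caller
log them and carry on.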

wchargin-branch: github-translate-continuations
2018-10-22 10:01:49 -07:00
William Chargin 6499df6b6b
github: fix misc. errors in old GraphQL system (#929)
Summary:
This fixes the following issues:

  - Pull request reviews actually do not have reactions.
  - We must fetch the `id` of a `Ref`.
  - We must fetch the `id` of a `Commit`, `Tree`, `Blob`, or `Tag`, and
    should also fetch its `oid`.
  - Repository owners cannot be bots.
  - Commit and reaction authors cannot be bots, organizations, or
    `undefined`.

Test Plan:
Running `yarn test --full` passes, and the snapshot diff is clearly
correct.

wchargin-branch: github-fix-up-continuations
2018-10-22 09:50:24 -07:00
William Chargin 889febb7f6
github: add GraphQL schema and Flow types (#928)
Summary:
The included schema is forked from the one in `graphql/demo.js`.
Primitive types have been added, and the `parents` connection has been
added to commit objects per #920. (We do not include this in the demo
script because without prefetching it would take a long time to load.)

Test Plan:
Unit tests added; run `yarn unit`. Then run `yarn backend` and verify
that `node ./bin/generateGithubGraphqlFlowTypes.js` generates exactly
the same output as in the types file:

```
$ node ./bin/generateGithubGraphqlFlowTypes.js |
> diff -u - ./src/plugins/github/graphqlTypes.js
$ echo $?
0
```

Change the `graphqlTypes.js` file and verify that `yarn unit` fails.
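
The idea behind this check can be sketched as follows: regenerate the
output and compare it byte-for-byte with the checked-in file. This is
an illustration under stated assumptions, not the actual unit test;
`generateTypes` is a hypothetical stand-in for the real generator, and
`isUpToDate` is a hypothetical helper name.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical stand-in for the real type generator.
function generateTypes() {
  return "// Autogenerated file. Do not edit.\nexport type Oid = string;\n";
}

// Returns true iff the file at `filename` holds exactly the generator's
// current output.
function isUpToDate(filename, generate) {
  const onDisk = fs.readFileSync(filename, "utf8");
  return onDisk === generate();
}

// Demonstrate against a temporary file rather than a real source file.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "generated-types-"));
const file = path.join(dir, "graphqlTypes.js");
fs.writeFileSync(file, generateTypes());
console.log(isUpToDate(file, generateTypes)); // freshly generated file
fs.appendFileSync(file, "// manual edit\n");
console.log(isUpToDate(file, generateTypes)); // file edited by hand
```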

As the build config has been changed, a `yarn test --full` is warranted.
It passes.

Finally, I have manually verified that the schema is consistent with the
documentation at <https://developer.github.com/v4/object/repository/>
and related pages.

wchargin-branch: github-schema-flow-types
2018-10-19 09:04:54 -07:00