243 Commits

Author SHA1 Message Date
William Chargin
3e70edb3be
Use typed arrays for PageRank (#267)
Summary:
This takes `AddressMap` access, and therefore JSON stringification, off
the critical path, resulting in a significant performance increase. The
resulting code is much faster than the original TFJS implementation. On
my laptop, we can run about 300 iterations of PageRank per second on a
graph with 10 000 nodes and 18 000 edges (namely, the SourceCred graph).

Paired with @decentralion.

Test Plan:
Run `yarn start` and note that the cred attribution for SourceCred is
roughly the same as before… but is created faster.

wchargin-branch: pagerank-typed-arrays
2018-05-11 13:22:36 -07:00
William Chargin
7e97ba6bf3
Rewrite basic PageRank without TFJS (#266)
Summary:
We’re not convinced that using TFJS at this time is worth it, for two
reasons. First, our matrix computations can be expressed using sparse
matrices, which will improve the performance by orders of magnitude.
Sparse matrices do not appear to be supported by TFJS (the layers API
makes some use of them, but it is not clear that they have much support
their, either). Second, having to deal with GPU memory and WebGL has
already been problematic. When WebGL PageRank is running, the machine is
mostly unusable, and other applications’ video output is negatively
affected (!).

This commit rewrites the internals of `basicPagerank.js` while retaining
its end-to-end public interface. We also add a test file with a trivial
test. The resulting code is not faster yet—in fact, it’s a fair amount
slower. But this is because our use of `AddressMap`s puts JSON
stringification on the critical path, which is obviously a bad idea. In
a subsequent commit, we will rewrite the internals again to use typed
arrays.

Paired with @decentralion.

Test Plan:
The new unit test is not sufficient. Instead, run `yarn start` and
re-run PageRank on SourceCred; note that the results are roughly
unchanged.

wchargin-branch: pagerank-without-tfjs
2018-05-11 13:11:14 -07:00
Dandelion Mané
2a52ff85f8 Cred Explorer: Modify color based on table depth
Test plan: Open the cred explorer, and expand some nodes.

Paired with @wchargin
2018-05-10 17:05:01 -07:00
Dandelion Mané
880b0099e9 Remove unnecessary reversals in sort routine
We were sorting low-to-high, and then reversing. We can just sort
high-to-low.

Test plan: No behavior change (confirm by interacting with the Cred
Explorer).

Paired with @wchargin
2018-05-10 17:05:01 -07:00
Dandelion Mané
8a4b9592b1 Sort within recursive neighbor enumeration
Test plan: Open the cred explorer, and use the + sign to expand the
neighbors. Observe that those neighbors are now sorted by cred.

Paired with @wchargin
2018-05-10 17:05:01 -07:00
Dandelion Mané
121b83717f Cred Explorer: Recursively show neighbor nodes
Test plan: Open the cred explorer, and try clicking the + signs. They
will expand a recursive table showing the neighbor nodes and their cred.

Paired with @wchargin
2018-05-10 17:05:01 -07:00
Dandelion Mané
4bea0133db Set expanded state, update via +/− buttons
Test plan: Try clicking on the buttons and see that they toggle between
plus and minus. They don't do anything else.

Paired with @wchargin
2018-05-10 17:05:01 -07:00
Dandelion Mané
6714d4b95e Reorder cred explorer columns
Now the node description is first.

Test plan: Observe that the behavior is unchanged, except for the
columnar order.

Paired with @wchargin
2018-05-10 17:05:01 -07:00
Dandelion Mané
0dae0c995f Factor out the non-recursive RecursiveTable
Test plan: Behavior is unchanged; manually verify.

Paired with @wchargin
2018-05-10 17:05:01 -07:00
Dandelion Mané
2f0f523065 s/typeFilter/topLevelFilter/g
Test plan: No change, it's just a variable rename.

Paired with @wchargin
2018-05-10 17:05:01 -07:00
Dandelion Mané
7fc31f6a26
Add PagerankTable for exploring PageRank results (#264)
`PagerankTable` is forked from `ContributionList`.

Test plan: I took it for a spin and it seemed OK. I'm not inclined to
write formal tests because we are in rapid prototyping mode stil.
2018-05-10 14:49:00 -07:00
William Chargin
bb2ec756a4
Save explorer’s repo settings in localStorage (#263)
Test Plan:
Load the cred explorer for the first time to see two empty boxes.
Refresh to see the same thing. Then, add some content to each box.
Refresh and see the same content.

wchargin-branch: cred-explorer-localstorage
2018-05-10 14:42:54 -07:00
Dandelion Mané
ed1f17f8ca
Add nodeDescription for GitHub nodes (#261)
`nodeDescription` gives a short, readable description of the content at
a given node.

Test plan: View the included snapshot test.
2018-05-10 14:08:06 -07:00
William Chargin
d21ad1312b
Add git/render.js with nodeDescription (#262)
Test Plan:
Unit tests added, with full coverage of reachable cases.

wchargin-branch: git-render
2018-05-10 14:02:19 -07:00
Dandelion Mané
2a88bbc091
Reorganize GitHub porcelain tests (#260)
This re-organizes the GitHub porcelain tests to be:
- organized by each method signature, rather than having blocks that
test many different methods on each wrapper
- make extensive use of snapshot tests for convenience

Test plan: Inspect the new unit tests, and the corresponding snapshots.
It should be relatively easy to do this because you can copy+paste the
urls to verify the properties.
2018-05-10 12:53:32 -07:00
William Chargin
47bec6cc10
Make ensure-flow.sh more precise and accurate (#259)
Summary:
This fixes two problems in the previous version:
  - A new JS file not checked into git, but with a `@flow` directive,
    would cause `ensure-flow` to fail, because one list of files was
    from `git grep` and the other was from `find`.
  - Only the hard-coded directories `src config scripts` were searched.

Now, we search all JS files checked into Git, except for some hard-coded
exceptions, namely `flow-typed`.

Test Plan:
  1. Add `foo.js`, not checked into Git. Note that `ensure-flow` passes.
  2. Add `@flow` to `foo.js`, and note that `ensure-flow` still passes.
  3. Remove `@flow` from `.eslintrc.js`, and note that `ensure-flow`
     fails and nicely prints the filename. (Note: this file is at the
     repository root.)
  4. Create a file `echo stuff >$'naughty\nfilename.js'`, and note that
     `ensure-flow` has the correct behavior in both positive and
     negative cases.

wchargin-branch: ensure-flow-improvements
2018-05-10 12:38:39 -07:00
William Chargin
1901d471f3
Add @flow to .eslintrc.js (#258)
Summary:
Even though it’s not really a source file, and it lives at the
repository root, it might as well have typing to make sure that we don’t
do anything really dumb.

Test Plan:
`yarn flow` still passes.

wchargin-branch: flow-eslintrc
2018-05-10 12:27:46 -07:00
Dandelion Mané
0d6742be39
Add parent() relationships for GitHub porcelain (#255)
Test plan: Unit tests were added.
(Note: I haven't tested the error case, when there are an invalid number
of parents. I think this is OK for now, but I'm disclosing this so
reviewers can easily take issue with it. I'm planning to re-organize the
test code to be by method rather than by wrapper type (so the wrappers
section doesn't keep being a kitchen sink) and will hopefully
put that test in.)
2018-05-10 11:41:19 -07:00
Dandelion Mané
04390e5609
Add CONTAINS edges from Repositories to Issues/PRs (#253)
Also updates the GitHub porcelain.
Existing observable behavior is unchanged, except that performance may
be improved for issueOrPrByNumber.

A bug that would afflict a multi-repository graph (namely, that calling
`repo.issues()` would get all issues for all repositories) is
pre-emptively removed. No test cases were added as we do not yet support
multi-repository graphs.

Test plan: existing unit test coverage is sufficient.
2018-05-10 11:34:53 -07:00
Dandelion Mané
9d24190c03
GH Porcelain: add Repository.from (#257)
I think the absence of this method when I added the `Repository` class
was a bug.

Test plan: There are new unit tests.
2018-05-10 11:24:58 -07:00
Dandelion Mané
5c44dd0373
GH Porcelain: Move authors top-level Porcelain (#254)
Currently, the `authors` method is attached at the Repository level.
This is incorrect; it actually finds all the authors in the graph, not
all the authors of that repository. This commit moves the method to the
correct class.

Test plan: This function is only used in test code. The tests still
pass.
2018-05-10 11:24:24 -07:00
William Chargin
61d3cb3f52
Implement basic PageRank analysis (#252)
Summary:
We don’t expect the results to be of good quality right now. Rather,
this gives us a starting point from which to iterate the algorithm.

The convergence criterion also needs to be adjusted. (In particular, it
should almost certainly not be a constant.)

Test Plan:
Run `yarn start`. Select a graph, like `sourcecred/example-github`. Open
the JS console and click “Run basic PageRank”. Watch the console.

wchargin-branch: basic-pagerank
2018-05-10 11:21:18 -07:00
William Chargin
8e4668cc91
Fetch generated graphs on the frontend (#251)
Summary:
This commit enables the cred explorer to fetch pre-generated graphs. The
form has poor UX, but gets the job done. (To do later: saving in
localStorage, allowing fetching the list of possible graphs, linking the
submit button so that it’s triggered by `Enter`, etc.).

Test Plan:
Set up with `node ./bin/sourcecred.js graph sourcecred sourcecred`, then
run `yarn start` and check the following cases:
  - success case: `sourcecred`/`sourcecred`
  - error case: `sarcecrod`/`sarcecrod` (nice console error)
  - error case: the repository owner or name is an empty string or has
    invalid characters, like `../../secret.txt` (nice console error)

wchargin-branch: fetch-graphs
2018-05-09 12:47:56 -07:00
Dandelion Mané
6ca4f77b6d
Add dependency on tfjs-core (#250) 2018-05-09 10:22:48 -07:00
William Chargin
cb1339a0a7
Start a production server from sourcecred start (#247)
Summary:
This commit changes `yarn start` to run a production version of the API
server, which serves static files from a pre-built directory and also
handles API requests.

Test Plan:
```shell
$ yarn build
$ yarn backend
$ node ./bin/sourcecred.js start -d /tmp/srccrd &
$ mkdir -p /tmp/srccrd
$ echo hello >/tmp/srccrd/world
$ curl -s localhost:4000/ | file -
/dev/stdin: HTML document, ASCII text, with very long lines, with no line terminators
$ curl -s localhost:4000/api/v1/data/world
hello
```

wchargin-branch: cli-start-prod-server
2018-05-08 22:00:36 -07:00
Dandelion Mané
ac8d0ff66c
Setup credExplorer app scaffold (#249)
Test plan: Run `yarn start`, and observe that the Cred Explorer is now
included in the nav bar, and can be selected, in which case a short
message is displayed.
2018-05-08 17:05:56 -07:00
Dandelion Mané
0c59435a2b
Move testUtil.js into src/app (#248)
testUtil contains some useful configuration endpoints for our frontend
testing. This commit moves it out of the artifact editor and up to
`src/app`.

Test plan: `yarn travis` passes.
2018-05-08 17:03:01 -07:00
Dandelion Mané
d9b4673dbd
Factor out github.porcelain.asEntity (#246)
@wchargin suggested that the entity-wrapping logic in porcelain
reference handling should be factored out as its own method. This was a
great suggestion; it will be very useful for plugin consumers, and it
also results in better test coverage.

Test plan: The new unit tests are nice. For your own safety, do not
question or quibble with the magnificent foo plugin.
2018-05-08 16:33:19 -07:00
William Chargin
9ea1f981aa
Proxy Webpack dev server through to an API server (#245)
Summary:
This way, our frontend can talk to a backend that can read from the
filesystem (among other things).

Paired with @decentralion.

Test Plan:
```
$ yarn backend
$ SOURCECRED_DIRECTORY=/tmp/srccrd yarn start
$ # verify that the browser looks good
$ mkdir /tmp/srccrd
$ echo hello >/tmp/srccrd/world
$ curl localhost:3000/api/v1/data/world
hello
$ curl localhost:4000/api/v1/data/world
hello
```

wchargin-branch: webpack-proxy
2018-05-08 16:09:37 -07:00
Dandelion Mané
62b9f70d00
Remove the old experiments directory (#244) 2018-05-08 15:32:37 -07:00
Dandelion Mané
3166c2a56c
Turn on flow for config/env.js (#243)
It was doing some clever array construction that added possible booleans
to the array, then filtered them out. To make the typing simpler for
Flow's inspection, we now only put string elements in the array.

Test plan: `yarn travis --full` passed, and the CLI still works.
2018-05-08 14:56:06 -07:00
William Chargin
2e8653a3ee
Turn on flow for scripts/start.js (#242)
Summary:
  - The value of `process.stdout.isTTY` is either `true` or `undefined`.
    Flow (reasonably) dislikes this, so we add an explicit check.
  - More `package.json` burnination.

Test Plan:
Note that `require("./package.json").proxy === undefined` in the Node
console, and that `yarn start` works.

wchargin-branch: flow--scripts-start
2018-05-08 14:51:55 -07:00
Dandelion Mané
824df7e916
Turn on flow for scripts/{backend,build,test}.js (#241)
- scripts/backend.js: We incorrectly set an environment variable to
a boolean, when in fact it must be a string. Fixed it to set a string
value "true", and updated usage in config/babel.js
- scripts/test.js: No changes
- scripts/build.js: Removed a call to printHostingInstructions, so that
we don't need to require the package.json.

Test plan:
`yarn travis --full` passes, and the SourceCred cli still works.
2018-05-08 14:35:56 -07:00
Dandelion Mané
1647c1abac
Fix flow errors in fetchAndPrintGithubRepo.js (#240)
Fixing the flow error corresponded to (correctly) documenting that the
GitHub token is mandatory, not optional.

Test plan: `yarn travis --full` still passes.
2018-05-08 14:24:11 -07:00
Dandelion Mané
d34503799c
Enable flow: sourcecred.js and editor/App.test.js (#239)
They were already correct from a typing perspective, so no other changes
needed.
2018-05-08 14:21:26 -07:00
Dandelion Mané
d221a933d8
Fix flow errors in paths.js (#238)
- Fix accidental string-to-NaN coercion in ensureSlash
- Don't dynamically require package.json; to determine public url, just
use the environment variable or "/"

Test plan: `yarn start` and travis still work
2018-05-08 14:21:19 -07:00
William Chargin
7d9a98128d
Extract a generic LocalStore module (#235)
Summary:
This way, different plugins can have `LocalStore`s with different cache
keys.

Test Plan:
Note that the app (`yarn start`) preserves the local store values from
before this change, and that updates continue to work.

wchargin-branch: generic-localstore
2018-05-08 13:40:48 -07:00
Dandelion Mané
372f8f9bd6 Setup routing within App.js
This commit modifies App.js to use routing, such that it's possible to
navigate between a home screen, and the Artifact Editor.

Test plan: Run `yarn start`, and navigate between the home screen and
artifact editor. Verify that the artifact editor still works as
expected.
2018-05-08 12:55:38 -07:00
Dandelion Mané
0149d74971 Add react-router-dom
This commit adds a npm and flow-typed dependency, with no functional
change.

Test plan: `yarn travis` passes.
2018-05-08 12:55:38 -07:00
Dandelion Mané
e1808d1126 Add src/app/App.js
This commit adds src/app/App.js, which proxies in the frontend from
src/plugins/artifact/editor/App.js. The observed behavior (run `yarn
start`; see Artifact Editor) is unchanged.

Test plan: Observe that `yarn start` has the same behavior, and travis
passes.
2018-05-08 12:55:38 -07:00
Dandelion Mané
63351e6149 Move app scaffolding to src/app
This commit executes a micro-refactor to move all top-level app setup
code out of src/plugins/artifact/editor and into src/app. The observed
behavior from `yarn start`, which is to show the artifact editor, is
unchanged.
2018-05-08 12:55:38 -07:00
Dandelion Mané
c2fb88b11a Turn on flow for index.js
Test plan: `yarn travis` passes
2018-05-08 12:55:38 -07:00
William Chargin
57682065fd
Add sourcecred start (#234)
Summary:
We need a way for our web applications to interact with data on the
filesystem. In this commit, we introduce a webserver that serves
statically from two directory trees: first, the result of a live-updated
Webpack build; second, the SourceCred data directory.

Test Plan:
Run `yarn backend` and `node ./bin/sourcecred.js start`. When ready,
navigate to the server’s root route in a web browser. Note that a nice
React app is displayed. Then, change something in that React app source.
Note that the server console displays Webpack’s update messages, and
that refreshing the page in the browser renders the new version of the
app. Finally, visit

    /__data__/graphs/sourcecred/example-github/graph.json

in the browser to see the graph for the example repository, assuming
that you had generated its graph previously.

wchargin-branch: start
2018-05-07 20:10:49 -07:00
William Chargin
18ddbfff3e
Add dependency on express (#233)
wchargin-branch: express
2018-05-07 20:05:52 -07:00
Dandelion Mané
93e2798f37
Ensure that flow is used in all js files (#232)
This script ensures that either //@flow or //@no-flow is present in
every js file. Every existing js file that would fail this check has
been given //@no-flow, we should work to remove all of these in the
future.

Test plan:
I verified that `yarn travis` fails before fixing the other js files,
and passes afterwards.
2018-05-07 20:02:19 -07:00
Dandelion Mané
ed1adc7b37
Rename src/plugins/github/{api,porcelain} (#231)
I also added a module-level docstring for the porcelain.
2018-05-07 19:18:01 -07:00
Dandelion Mané
9b3019434d
Create github.Porcelain: whole-graph porcelain (#230)
Now that we have repository nodes (#171), it makes sense that the Github
porcelain should provide a way to wrap the entire graph, and provide
easy access for the various repositories. This adds a `Porcelain` class
to fulfill that need.

The `Porcelain` is very straightforward: it takes in the whole graph,
and gives a way to get all the Repositories, or to request a particular
Repository by owner/name. In the odd case wherein a graph contains
multiple repository nodes with the same owner and name, an error is
thrown. Per standard JS map semantics (bleh), it can return undefined if
there is no matching repository.

Test plan:
See that the unit tests now use the standard behavior, and a test
verifies behavior for non-existant repositories. I don't have a test
case where there are multiple repo nodes, but that itself would be an
error, so throwing an error in that case is just defensive programming.
2018-05-07 17:56:33 -07:00
Dandelion Mané
f219636a56
Create "REPOSITORY" nodes in GitHub plugin graph (#229)
This commit creates a new node type in the GitHub graph: the REPOSITORY
node. The REPOSITORY node has the following payload properties:
- url (string)
- name (string)
- owner (string)

Things this commit does:
- Add new node type and payload type (RepositoryNodePayload)
- Update parser to instantiate the new node type
- Update api.js to have Repository wrap the new node type (thus
Repository is a GitHub entity)
- Update snapshots
- Update users of GitHub node types to ensure they are exhaustive

Things that will come in a followon commit:
- Add CONTAINS edges from the repository to all its PRs and Issues
- Update the Repository porcelain to use those edges, rather than
scanning the graph for every possible Issue/PR (eventually those might
belong to other Repositories)
- Create a GitHubGraph abstraction in the porcelain, which makes it easy
to find all of the Repositories in a graph

Note that retrieving the repository owner technically involved fetching
the whole owner representation (as a GitHub user). I could have chosen
to add that user to the graph, with a "OWNS" edge pointing to the
repository. For simplicity's sake, I've declined to do that, and instead
just parse the owner's name directly.

Test plan:
Added tests to verify that the Repository porcelain entity has the right
properties. Combined with the snapshot tests, that should be sufficient.
2018-05-07 17:28:47 -07:00
Dandelion Mané
9d4ae8b901
Deleted users no longer break GitHub parser (#228)
When a GitHub user delete their account, all of their comments remain,
but with a `null` author. Previously, our code did not account for this
possibility. Now it does (by simply not adding an AUTHORSHIP edge).
Conveniently, our porcelain API already represents authors as a list, so
this doesn't require any change in porcelain API usage.

Test plan:
I did not add any unit tests, simply because
creating-and-deleting a GitHub user to make a repro seemed like a bit of
a pain. However, it is very unlikely that this bug will re-occur,
because the nullability of AuthorJSON is now enforced at the type level
inside graphql.js.

Also, running `node bin/sourcecred.js src-d go-git` now succeedsinstead
of failing.
2018-05-07 17:06:26 -07:00
William Chargin
0cae9d742d
Extract a common SOURCECRED_DIRECTORY flag (#227)
Summary:
This solves two problems:

 1. The “output directory” argument to `sourcecred graph` is also the
    input directory to other commands, like `sourcecred analyze`
    (hypothetically). In such cases, it would be nice for the flag to
    have the same name, but clearly `--output-directory` does not always
    make sense.

 2. In addition to storing graphs, we’ll need to store other kinds of
    data: settings, intermediate data for plugins to cache results, etc.
    We should store these under a single umbrella.

With both of these problems in mind, it makes sense to create a
`SOURCECRED_DIRECTORY` flag under which we store all relevant data.

Test Plan:
Run:
```shell
$ yarn backend
$ node bin/sourcecred.js help graph
$ node bin/sourcecred.js graph sourcecred example-github
$ node bin/sourcecred.js graph sourcecred example-github -d /tmp/sorcecrod
$ SOURCECRED_DIRECTORY=/tmp/srccrd node bin/sourcecred.js graph sourcecred example-github
$ for dir in /tmp/{sourcecred,sorcecrod,srccrd}; do find "${dir}"; done
```

wchargin-branch: graph-directory
2018-05-07 17:05:58 -07:00