Commit Graph

246 Commits

Author SHA1 Message Date
William Chargin e5472752ac
Allow converting transition matrix to sparse chain (#272)
Summary:
This function is mostly useful for easily describing Markov chains in
test cases.

Test Plan:
Unit tests added. Run `yarn test`.

wchargin-branch: sparseMarkovChainFromTransitionMatrix
2018-05-11 21:22:10 -07:00
William Chargin 3bd449d1c3
Create a `markovChain.js` module (#271)
Summary:
For now, this module has just two types: `Distribution` and
`SparseMarkovChain`. We’ll gradually pull code from `basicPagerank` into
this module, adding unit tests along the way.

Test Plan:
None required.

wchargin-branch: markov-chain-module
2018-05-11 21:14:56 -07:00
William Chargin e9e001b894
Switch AddressMap implementation to nested maps (#278)
Summary:
See #66 for more context. This yields the following performance
improvements for me, on the SourceCred graph with 11 072 nodes and
20 250 edges:

  - Loading a graph from disk is improved overall from 1172 ms to 292 ms
    (4.0× improvement).

  - The full pipeline for basic PageRank, from button press to final
    render, is improved from 8.44 s to 4.39 s (1.9× improvement).

  - The PageRank preprocessing step, which involves turning the graph
    into a typed array Markov chain, is improved from 2430 ms to 573 ms
    (4.2× improvement).

  - The PageRank postprocessing step, which involves turning the typed
    array distribution into an `AddressMap` distribution, is improved
    from 83.53 ms to 4.81 ms (17× improvement).

  - The `PagerankTable` React component `render` method (constructing
    the virtual tree only, not diffing or embedding into the DOM) is
    improved from 1708 ms to 332 ms (5× improvement).

The core matrix computations of PageRank are unaffected, because they do
not use the `AddressMap` abstraction.

Test Plan:
Existing unit tests suffice.

wchargin-branch: fast-addressmap
2018-05-11 19:46:54 -07:00
William Chargin 3e70edb3be
Use typed arrays for PageRank (#267)
Summary:
This takes `AddressMap` access, and therefore JSON stringification, off
the critical path, resulting in a significant performance increase. The
resulting code is much faster than the original TFJS implementation. On
my laptop, we can run about 300 iterations of PageRank per second on a
graph with 10 000 nodes and 18 000 edges (namely, the SourceCred graph).

Paired with @decentralion.

Test Plan:
Run `yarn start` and note that the cred attribution for SourceCred is
roughly the same as before… but is created faster.

wchargin-branch: pagerank-typed-arrays
2018-05-11 13:22:36 -07:00
William Chargin 7e97ba6bf3
Rewrite basic PageRank without TFJS (#266)
Summary:
We’re not convinced that using TFJS at this time is worth it, for two
reasons. First, our matrix computations can be expressed using sparse
matrices, which will improve the performance by orders of magnitude.
Sparse matrices do not appear to be supported by TFJS (the layers API
makes some use of them, but it is not clear that they have much support
their, either). Second, having to deal with GPU memory and WebGL has
already been problematic. When WebGL PageRank is running, the machine is
mostly unusable, and other applications’ video output is negatively
affected (!).

This commit rewrites the internals of `basicPagerank.js` while retaining
its end-to-end public interface. We also add a test file with a trivial
test. The resulting code is not faster yet—in fact, it’s a fair amount
slower. But this is because our use of `AddressMap`s puts JSON
stringification on the critical path, which is obviously a bad idea. In
a subsequent commit, we will rewrite the internals again to use typed
arrays.

Paired with @decentralion.

Test Plan:
The new unit test is not sufficient. Instead, run `yarn start` and
re-run PageRank on SourceCred; note that the results are roughly
unchanged.

wchargin-branch: pagerank-without-tfjs
2018-05-11 13:11:14 -07:00
Dandelion Mané 2a52ff85f8 Cred Explorer: Modify color based on table depth
Test plan: Open the cred explorer, and expand some nodes.

Paired with @wchargin
2018-05-10 17:05:01 -07:00
Dandelion Mané 880b0099e9 Remove unnecessary reversals in sort routine
We were sorting low-to-high, and then reversing. We can just sort
high-to-low.

Test plan: No behavior change (confirm by interacting with the Cred
Explorer).

Paired with @wchargin
2018-05-10 17:05:01 -07:00
Dandelion Mané 8a4b9592b1 Sort within recursive neighbor enumeration
Test plan: Open the cred explorer, and use the + sign to expand the
neighbors. Observe that those neighbors are now sorted by cred.

Paired with @wchargin
2018-05-10 17:05:01 -07:00
Dandelion Mané 121b83717f Cred Explorer: Recursively show neighbor nodes
Test plan: Open the cred explorer, and try clicking the + signs. They
will expand a recursive table showing the neighbor nodes and their cred.

Paired with @wchargin
2018-05-10 17:05:01 -07:00
Dandelion Mané 4bea0133db Set expanded state, update via +/− buttons
Test plan: Try clicking on the buttons and see that they toggle between
plus and minus. They don't do anything else.

Paired with @wchargin
2018-05-10 17:05:01 -07:00
Dandelion Mané 6714d4b95e Reorder cred explorer columns
Now the node description is first.

Test plan: Observe that the behavior is unchanged, except for the
columnar order.

Paired with @wchargin
2018-05-10 17:05:01 -07:00
Dandelion Mané 0dae0c995f Factor out the non-recursive RecursiveTable
Test plan: Behavior is unchanged; manually verify.

Paired with @wchargin
2018-05-10 17:05:01 -07:00
Dandelion Mané 2f0f523065 s/typeFilter/topLevelFilter/g
Test plan: No change, it's just a variable rename.

Paired with @wchargin
2018-05-10 17:05:01 -07:00
Dandelion Mané 7fc31f6a26
Add `PagerankTable` for exploring PageRank results (#264)
`PagerankTable` is forked from `ContributionList`.

Test plan: I took it for a spin and it seemed OK. I'm not inclined to
write formal tests because we are in rapid prototyping mode stil.
2018-05-10 14:49:00 -07:00
William Chargin bb2ec756a4
Save explorer’s repo settings in localStorage (#263)
Test Plan:
Load the cred explorer for the first time to see two empty boxes.
Refresh to see the same thing. Then, add some content to each box.
Refresh and see the same content.

wchargin-branch: cred-explorer-localstorage
2018-05-10 14:42:54 -07:00
Dandelion Mané ed1f17f8ca
Add `nodeDescription` for GitHub nodes (#261)
`nodeDescription` gives a short, readable description of the content at
a given node.

Test plan: View the included snapshot test.
2018-05-10 14:08:06 -07:00
William Chargin d21ad1312b
Add `git/render.js` with `nodeDescription` (#262)
Test Plan:
Unit tests added, with full coverage of reachable cases.

wchargin-branch: git-render
2018-05-10 14:02:19 -07:00
Dandelion Mané 2a88bbc091
Reorganize GitHub porcelain tests (#260)
This re-organizes the GitHub porcelain tests to be:
- organized by each method signature, rather than having blocks that
test many different methods on each wrapper
- make extensive use of snapshot tests for convenience

Test plan: Inspect the new unit tests, and the corresponding snapshots.
It should be relatively easy to do this because you can copy+paste the
urls to verify the properties.
2018-05-10 12:53:32 -07:00
William Chargin 47bec6cc10
Make `ensure-flow.sh` more precise and accurate (#259)
Summary:
This fixes two problems in the previous version:
  - A new JS file not checked into git, but with a `@flow` directive,
    would cause `ensure-flow` to fail, because one list of files was
    from `git grep` and the other was from `find`.
  - Only the hard-coded directories `src config scripts` were searched.

Now, we search all JS files checked into Git, except for some hard-coded
exceptions, namely `flow-typed`.

Test Plan:
  1. Add `foo.js`, not checked into Git. Note that `ensure-flow` passes.
  2. Add `@flow` to `foo.js`, and note that `ensure-flow` still passes.
  3. Remove `@flow` from `.eslintrc.js`, and note that `ensure-flow`
     fails and nicely prints the filename. (Note: this file is at the
     repository root.)
  4. Create a file `echo stuff >$'naughty\nfilename.js'`, and note that
     `ensure-flow` has the correct behavior in both positive and
     negative cases.

wchargin-branch: ensure-flow-improvements
2018-05-10 12:38:39 -07:00
William Chargin 1901d471f3
Add `@flow` to `.eslintrc.js` (#258)
Summary:
Even though it’s not really a source file, and it lives at the
repository root, it might as well have typing to make sure that we don’t
do anything really dumb.

Test Plan:
`yarn flow` still passes.

wchargin-branch: flow-eslintrc
2018-05-10 12:27:46 -07:00
Dandelion Mané 0d6742be39
Add `parent()` relationships for GitHub porcelain (#255)
Test plan: Unit tests were added.
(Note: I haven't tested the error case, when there are an invalid number
of parents. I think this is OK for now, but I'm disclosing this so
reviewers can easily take issue with it. I'm planning to re-organize the
test code to be by method rather than by wrapper type (so the wrappers
section doesn't keep being a kitchen sink) and will hopefully
put that test in.)
2018-05-10 11:41:19 -07:00
Dandelion Mané 04390e5609
Add CONTAINS edges from Repositories to Issues/PRs (#253)
Also updates the GitHub porcelain.
Existing observable behavior is unchanged, except that performance may
be improved for issueOrPrByNumber.

A bug that would afflict a multi-repository graph (namely, that calling
`repo.issues()` would get all issues for all repositories) is
pre-emptively removed. No test cases were added as we do not yet support
multi-repository graphs.

Test plan: existing unit test coverage is sufficient.
2018-05-10 11:34:53 -07:00
Dandelion Mané 9d24190c03
GH Porcelain: add `Repository.from` (#257)
I think the absence of this method when I added the `Repository` class
was a bug.

Test plan: There are new unit tests.
2018-05-10 11:24:58 -07:00
Dandelion Mané 5c44dd0373
GH Porcelain: Move `authors` top-level Porcelain (#254)
Currently, the `authors` method is attached at the Repository level.
This is incorrect; it actually finds all the authors in the graph, not
all the authors of that repository. This commit moves the method to the
correct class.

Test plan: This function is only used in test code. The tests still
pass.
2018-05-10 11:24:24 -07:00
William Chargin 61d3cb3f52
Implement basic PageRank analysis (#252)
Summary:
We don’t expect the results to be of good quality right now. Rather,
this gives us a starting point from which to iterate the algorithm.

The convergence criterion also needs to be adjusted. (In particular, it
should almost certainly not be a constant.)

Test Plan:
Run `yarn start`. Select a graph, like `sourcecred/example-github`. Open
the JS console and click “Run basic PageRank”. Watch the console.

wchargin-branch: basic-pagerank
2018-05-10 11:21:18 -07:00
William Chargin 8e4668cc91
Fetch generated graphs on the frontend (#251)
Summary:
This commit enables the cred explorer to fetch pre-generated graphs. The
form has poor UX, but gets the job done. (To do later: saving in
localStorage, allowing fetching the list of possible graphs, linking the
submit button so that it’s triggered by `Enter`, etc.).

Test Plan:
Set up with `node ./bin/sourcecred.js graph sourcecred sourcecred`, then
run `yarn start` and check the following cases:
  - success case: `sourcecred`/`sourcecred`
  - error case: `sarcecrod`/`sarcecrod` (nice console error)
  - error case: the repository owner or name is an empty string or has
    invalid characters, like `../../secret.txt` (nice console error)

wchargin-branch: fetch-graphs
2018-05-09 12:47:56 -07:00
Dandelion Mané 6ca4f77b6d
Add dependency on tfjs-core (#250) 2018-05-09 10:22:48 -07:00
William Chargin cb1339a0a7
Start a production server from `sourcecred start` (#247)
Summary:
This commit changes `yarn start` to run a production version of the API
server, which serves static files from a pre-built directory and also
handles API requests.

Test Plan:
```shell
$ yarn build
$ yarn backend
$ node ./bin/sourcecred.js start -d /tmp/srccrd &
$ mkdir -p /tmp/srccrd
$ echo hello >/tmp/srccrd/world
$ curl -s localhost:4000/ | file -
/dev/stdin: HTML document, ASCII text, with very long lines, with no line terminators
$ curl -s localhost:4000/api/v1/data/world
hello
```

wchargin-branch: cli-start-prod-server
2018-05-08 22:00:36 -07:00
Dandelion Mané ac8d0ff66c
Setup credExplorer app scaffold (#249)
Test plan: Run `yarn start`, and observe that the Cred Explorer is now
included in the nav bar, and can be selected, in which case a short
message is displayed.
2018-05-08 17:05:56 -07:00
Dandelion Mané 0c59435a2b
Move testUtil.js into src/app (#248)
testUtil contains some useful configuration endpoints for our frontend
testing. This commit moves it out of the artifact editor and up to
`src/app`.

Test plan: `yarn travis` passes.
2018-05-08 17:03:01 -07:00
Dandelion Mané d9b4673dbd
Factor out `github.porcelain.asEntity` (#246)
@wchargin suggested that the entity-wrapping logic in porcelain
reference handling should be factored out as its own method. This was a
great suggestion; it will be very useful for plugin consumers, and it
also results in better test coverage.

Test plan: The new unit tests are nice. For your own safety, do not
question or quibble with the magnificent foo plugin.
2018-05-08 16:33:19 -07:00
William Chargin 9ea1f981aa
Proxy Webpack dev server through to an API server (#245)
Summary:
This way, our frontend can talk to a backend that can read from the
filesystem (among other things).

Paired with @decentralion.

Test Plan:
```
$ yarn backend
$ SOURCECRED_DIRECTORY=/tmp/srccrd yarn start
$ # verify that the browser looks good
$ mkdir /tmp/srccrd
$ echo hello >/tmp/srccrd/world
$ curl localhost:3000/api/v1/data/world
hello
$ curl localhost:4000/api/v1/data/world
hello
```

wchargin-branch: webpack-proxy
2018-05-08 16:09:37 -07:00
Dandelion Mané 62b9f70d00
Remove the old `experiments` directory (#244) 2018-05-08 15:32:37 -07:00
Dandelion Mané 3166c2a56c
Turn on flow for config/env.js (#243)
It was doing some clever array construction that added possible booleans
to the array, then filtered them out. To make the typing simpler for
Flow's inspection, we now only put string elements in the array.

Test plan: `yarn travis --full` passed, and the CLI still works.
2018-05-08 14:56:06 -07:00
William Chargin 2e8653a3ee
Turn on flow for scripts/start.js (#242)
Summary:
  - The value of `process.stdout.isTTY` is either `true` or `undefined`.
    Flow (reasonably) dislikes this, so we add an explicit check.
  - More `package.json` burnination.

Test Plan:
Note that `require("./package.json").proxy === undefined` in the Node
console, and that `yarn start` works.

wchargin-branch: flow--scripts-start
2018-05-08 14:51:55 -07:00
Dandelion Mané 824df7e916
Turn on flow for scripts/{backend,build,test}.js (#241)
- scripts/backend.js: We incorrectly set an environment variable to
a boolean, when in fact it must be a string. Fixed it to set a string
value "true", and updated usage in config/babel.js
- scripts/test.js: No changes
- scripts/build.js: Removed a call to printHostingInstructions, so that
we don't need to require the package.json.

Test plan:
`yarn travis --full` passes, and the SourceCred cli still works.
2018-05-08 14:35:56 -07:00
Dandelion Mané 1647c1abac
Fix flow errors in fetchAndPrintGithubRepo.js (#240)
Fixing the flow error corresponded to (correctly) documenting that the
GitHub token is mandatory, not optional.

Test plan: `yarn travis --full` still passes.
2018-05-08 14:24:11 -07:00
Dandelion Mané d34503799c
Enable flow: sourcecred.js and editor/App.test.js (#239)
They were already correct from a typing perspective, so no other changes
needed.
2018-05-08 14:21:26 -07:00
Dandelion Mané d221a933d8
Fix flow errors in paths.js (#238)
- Fix accidental string-to-NaN coercion in ensureSlash
- Don't dynamically require package.json; to determine public url, just
use the environment variable or "/"

Test plan: `yarn start` and travis still work
2018-05-08 14:21:19 -07:00
William Chargin 7d9a98128d
Extract a generic `LocalStore` module (#235)
Summary:
This way, different plugins can have `LocalStore`s with different cache
keys.

Test Plan:
Note that the app (`yarn start`) preserves the local store values from
before this change, and that updates continue to work.

wchargin-branch: generic-localstore
2018-05-08 13:40:48 -07:00
Dandelion Mané 372f8f9bd6 Setup routing within App.js
This commit modifies App.js to use routing, such that it's possible to
navigate between a home screen, and the Artifact Editor.

Test plan: Run `yarn start`, and navigate between the home screen and
artifact editor. Verify that the artifact editor still works as
expected.
2018-05-08 12:55:38 -07:00
Dandelion Mané 0149d74971 Add react-router-dom
This commit adds a npm and flow-typed dependency, with no functional
change.

Test plan: `yarn travis` passes.
2018-05-08 12:55:38 -07:00
Dandelion Mané e1808d1126 Add src/app/App.js
This commit adds src/app/App.js, which proxies in the frontend from
src/plugins/artifact/editor/App.js. The observed behavior (run `yarn
start`; see Artifact Editor) is unchanged.

Test plan: Observe that `yarn start` has the same behavior, and travis
passes.
2018-05-08 12:55:38 -07:00
Dandelion Mané 63351e6149 Move app scaffolding to src/app
This commit executes a micro-refactor to move all top-level app setup
code out of src/plugins/artifact/editor and into src/app. The observed
behavior from `yarn start`, which is to show the artifact editor, is
unchanged.
2018-05-08 12:55:38 -07:00
Dandelion Mané c2fb88b11a Turn on flow for index.js
Test plan: `yarn travis` passes
2018-05-08 12:55:38 -07:00
William Chargin 57682065fd
Add `sourcecred start` (#234)
Summary:
We need a way for our web applications to interact with data on the
filesystem. In this commit, we introduce a webserver that serves
statically from two directory trees: first, the result of a live-updated
Webpack build; second, the SourceCred data directory.

Test Plan:
Run `yarn backend` and `node ./bin/sourcecred.js start`. When ready,
navigate to the server’s root route in a web browser. Note that a nice
React app is displayed. Then, change something in that React app source.
Note that the server console displays Webpack’s update messages, and
that refreshing the page in the browser renders the new version of the
app. Finally, visit

    /__data__/graphs/sourcecred/example-github/graph.json

in the browser to see the graph for the example repository, assuming
that you had generated its graph previously.

wchargin-branch: start
2018-05-07 20:10:49 -07:00
William Chargin 18ddbfff3e
Add dependency on express (#233)
wchargin-branch: express
2018-05-07 20:05:52 -07:00
Dandelion Mané 93e2798f37
Ensure that flow is used in all js files (#232)
This script ensures that either //@flow or //@no-flow is present in
every js file. Every existing js file that would fail this check has
been given //@no-flow, we should work to remove all of these in the
future.

Test plan:
I verified that `yarn travis` fails before fixing the other js files,
and passes afterwards.
2018-05-07 20:02:19 -07:00
Dandelion Mané ed1adc7b37
Rename `src/plugins/github/{api,porcelain}` (#231)
I also added a module-level docstring for the porcelain.
2018-05-07 19:18:01 -07:00
Dandelion Mané 9b3019434d
Create `github.Porcelain`: whole-graph porcelain (#230)
Now that we have repository nodes (#171), it makes sense that the Github
porcelain should provide a way to wrap the entire graph, and provide
easy access for the various repositories. This adds a `Porcelain` class
to fulfill that need.

The `Porcelain` is very straightforward: it takes in the whole graph,
and gives a way to get all the Repositories, or to request a particular
Repository by owner/name. In the odd case wherein a graph contains
multiple repository nodes with the same owner and name, an error is
thrown. Per standard JS map semantics (bleh), it can return undefined if
there is no matching repository.

Test plan:
See that the unit tests now use the standard behavior, and a test
verifies behavior for non-existant repositories. I don't have a test
case where there are multiple repo nodes, but that itself would be an
error, so throwing an error in that case is just defensive programming.
2018-05-07 17:56:33 -07:00