Commit Graph

61 Commits

Author SHA1 Message Date
William Chargin 513820c177
deps: upgrade `flow-bin@^0.80.0` (#791)
Summary:
This upgrade didn’t require fixing any new errors, but Flow is a good
dependency to keep on top of.

Test Plan:
Running `yarn flow` suffices.

wchargin-branch: flow-v0.80.0
2018-09-06 13:37:29 -07:00
William Chargin 92514ad559
deps: remove some unused packages (#789)
Summary:
Mostly Webpack loaders that have become unused through various config
changes.

Test Plan:
Check that these packages are not used anywhere except as transitive
dependencies:

```shell
$ git show --format= package.json |
>     sed '1,4d' | grep '^-' | cut -d\" -f2 | git grep -cf -
yarn.lock:3
```

Also, `yarn && yarn test --full` works, and `yarn start` works, and
`yarn backend && node ./bin/sourcecred.js load sourcecred/example-git`
works.

wchargin-branch: remove-unused-deps
2018-09-06 12:08:45 -07:00
William Chargin c48597651e
backend: invoke Webpack directly in package script (#782)
Summary:
This commit removes the `config/backend.js` script and replaces it with
a direct invocation of Webpack. This enables us to use command-line
arguments to Webpack, like `--output-path`.

Test Plan:
Note that `rm -rf bin; yarn backend` still works, and that the resulting
applications work (`node bin/sourcecred.js load`). Note that `yarn test`
and `yarn test --full` still work.

wchargin-branch: backend-webpack-direct
2018-09-05 12:28:27 -07:00
William Chargin 7e66886a70
dep: remove `eslint-loader` (#777)
Summary:
As of #775, this is no longer used.

Test Plan:
A `git grep eslint-loader` shows no results, and `yarn test --full`
passes.

wchargin-branch: remove-eslint-loader
2018-09-05 11:33:47 -07:00
William Chargin ddc93826be
Rename `makeWebpackConfig` to `webpack.config.web` (#770)
Summary:
The distinction was useful while `makeWebpackConfig` was being developed
(between #562 and #570), but is now confusing: we have a web config and
a backend config, and it is clearer if we name them as such.

Test Plan:
All of `yarn start`, `yarn build`, and `yarn test --full` work.

wchargin-branch: webpack-config-web
2018-09-04 21:45:10 -07:00
William Chargin 86c1b2e068
webpack: empty build dir instead of removing it (#768)
Test Plan:
Run `mkdir /tmp/out; cd /tmp/out; python -m SimpleHTTPServer`. In
another shell, run `./scripts/build_static_site.sh --target /tmp/out`.
Then, `curl localhost:8000`. Before this commit, this would have yielded
an `OSError` because the cwd of the Python process had been removed.
As of this commit, it works fine.

Also, run `git grep -c rimraf` and note only `yarn.lock:15`.

wchargin-branch: webpack-empty-build-directory
2018-09-04 21:44:09 -07:00
William Chargin 0a08783424
Remove OClif entirely (#745)
Test Plan:
Note that `yarn backend; node bin/sourcecred.js help` still works.
Note that `git grep -i oclif` returns no results.
Rejoice.

wchargin-branch: remove-oclif
2018-09-02 16:16:00 -07:00
William Chargin e71264f5cc
Replace `oclif` with `cli` (#744)
Summary:
This commit changes the CLI to use the code in `cli` instead of `oclif`.
A subsequent commit will remove the dependency on OClif altogether.
Resolves #580.

Test Plan:
Note that `yarn backend; node bin/sourcecred.js help` works. Note that
the documentation in the README is still correct.

wchargin-branch: cli-replace-oclif
2018-09-02 16:11:56 -07:00
William Chargin 7f81337d74
Store GitHub data gzipped at rest (#751)
Summary:
We store the relational view in `view.json.gz` instead of `view.json`,
taking advantage of the isomorphic `pako` library for gzip encoding and
decoding.

Sample space savings (note that post bodies are included; i.e., #747 has
not been applied):

       SAVE     OLD (B)     NEW (B) REPO
      89.7%       25326        2617 sourcecred/example-github
      82.9%     3257576      555948 sourcecred/sourcecred
      85.2%    11287621     1665884 ipfs/js-ipfs
      88.0%    20953425     2520358 gitcoinco/web
      84.4%    38196825     5951459 ipfs/go-ipfs
      84.9%   205770642    31101452 tensorflow/tensorflow

<details>
<summary>Script to generate space savings output</summary>

```shell
savings() {
    printf '% 7s % 11s % 11s %s\n' 'SAVE' 'OLD (B)' 'NEW (B)' 'REPO'
    for repo; do
        file="${SOURCECRED_DIRECTORY}/data/${repo}/github/view.json.gz"
        if ! [ -f "${file}" ]; then
            printf >&2 'warn: no such file %s\n' "${file}"
            continue
        fi
        script="$(sed -e 's/^ *//' <<EOF
            repo = '${repo}'
            pre_size = $(<"${file}" gzip -dc | wc -c)
            post_size = $(<"${file}" wc -c)
            percentage = '%0.1f%%' % (100 * (1 - post_size / pre_size))
            p = '% 7s % 11d % 11d %s' % (percentage, pre_size, post_size, repo)
            print(p)
EOF
        )"
        python3 -c "${script}"
    done
}
```

</details>

Closes #750.

Test Plan:
Comparing the raw old version with the decompressed new version shows
that they are identical:

```
$ <~/tmp/sourcecred/data/sourcecred/example-github/github/view.json \
> shasum -a 256 -
63853b9d3f918274aafacf5198787e18185a61b9c95faf640a1e61f5d11fa19f  -
$ <~/tmp/sourcecred/data/sourcecred/example-github/github/view.json.gz \
> gzip -dc | shasum -a 256
63853b9d3f918274aafacf5198787e18185a61b9c95faf640a1e61f5d11fa19f  -
```

Additionally, `yarn test --full` passes, and `yarn start` still loads
data and runs PageRank properly.

wchargin-branch: gzip-relational-view
2018-09-01 10:42:30 -07:00
William Chargin 0c2908dbfb
Retry GitHub queries with exponential backoff (#699)
Summary:
This patch adds independent exponential backoff to each individual
GitHub GraphQL query. We remove the fixed `GITHUB_DELAY_MS` delay before
each query in favor of this solution, which requires no additional
configuration (thus resolving a TODO in the process).

We use the NPM module `retry` with its default settings: namely, a
maximum of 10 retries with factor-2 backoff starting at 1000ms.
Empirically, it seems very unlikely that we should require much more
than 2 retries for a query. (See Test Plan for more details.)

This is both a short-term unblocker and a good kind of thing to have in
the long term.

Test Plan:
Note that `yarn test --full` passes, including `fetchGithubRepoTest.sh`.
Consider manual testing as follows.

Add `console.info` statements in `retryGithubFetch`, then load a large
repository like TensorFlow, and observe the output:

```shell
$ node bin/sourcecred.js load --plugin github tensorflow/tensorflow 2>&1 | ts -s '%.s'
0.252566 Fetching repo...
0.258422 Trying...
5.203014 Trying...
[snip]
1244.521197 Trying...
1254.848044 Will retry (n=1)...
1260.893334 Trying...
1271.547368 Trying...
1282.094735 Will retry (n=1)...
1283.349192 Will retry (n=2)...
1289.188728 Trying...
[snip]
1741.026869 Ensuring no more pages...
1742.139978 Creating view...
1752.023697 Stringifying...
1754.697116 Writing...
1754.697772 Done.
```

This took just under half an hour, with 264 queries total, of which:
  - 225 queries required 0 retries;
  - 38 queries required exactly 1 retry;
  - 1 query required exactly 2 retries; and
  - 0 queries required 3 or more retries.

wchargin-branch: github-backoff
2018-08-22 11:37:29 -07:00
William Chargin ad0e98ac2c
Add `createRelativeHistory` history implementation (#666)
Summary:
See #643 and the module docstring on `createRelativeHistory.js` for
context and explanation.

This patch adds `history@^3.0.0` as an explicit dependency—previously,
we were depending on it only implicitly through `react-router` (which
was fine then, but is not now). The dependency is chosen to match the
version specified in `react-router`’s `package.json`.

Test Plan:
Extensive unit tests included, with full coverage; `yarn test` suffices.

wchargin-branch: createRelativeHistory
2018-08-15 12:01:27 -07:00
William Chargin 3eb2b6eec6
Add a favicon (#637)
Summary:
In addition to the obvious benefit of having a favicon, this gets rid of
a 404 Not Found error on our home page, tremendously boosting our hacker
cred.

Test Plan:
The favicon is displayed in both `yarn start` and the static site (as a
result of the build script). The added build test fails before this
change.

wchargin-branch: add-favicon
2018-08-10 13:15:49 -07:00
William Chargin 8f2d2cd5cd
Remove service workers entirely (#635)
Summary:
This is a follow-up to #514, wherein we disabled new service workers and
instructed any existing service workers to self-destruct. (See that PR
for the rationale.) This commit removes them from our codebase entirely,
enabling us to slim down our build process and our build output.

Test Plan:
Running `yarn start` still works. Building the static site and exploring
it works, too.

wchargin-branch: remove-sw
2018-08-10 12:49:45 -07:00
William Chargin baa0cbff1b
Add `sharness` for shell-based testing (#597)
Summary:
We will shortly want to perform testing of shell scripts; it makes the
most sense to do so via the shell. We could roll our own testing
framework, but it makes more sense to use an existing one. By choosing
Sharness, we’re in good company: `go-ipfs` and `go-multihash` use it as
well, and it’s derived from Git’s testing library. I like it a lot.

For now, we need a dummy test file; our test runner will fail if there
are no tests to run. As soon as we have a real test, we can remove this.

This commit was generated by following the “per-project installation”
instructions at https://github.com/chriscool/sharness, and by
additionally including that repository’s `COPYING` file as
`SHARNESS_LICENSE`, with a header prepended. I considered instead adding
Sharness as a submodule, which is supported and has clear advantages
(e.g., you can update the thing), but opted to avoid the complexity of
submodules for now.

Test Plan:
Create the following tests in the `sharness` directory:

```shell
$ cat sharness/good.t
#!/bin/sh
test_description='demo of passing tests'
. ./sharness.sh
test_expect_success "look at me go" true
test_expect_success EXPENSIVE "this may take a while" 'sleep 2'
test_done
# vim: ft=sh
$ cat sharness/bad.t
#!/bin/sh
test_description='demo of failing tests'
. ./sharness.sh
test_expect_success "I don't feel so good" false
test_done
# vim: ft=sh
```

Note that `yarn sharness` and `yarn test` fail appropriately. Note that
`yarn sharness-full` fails appropriately after taking two extra seconds,
and `yarn test --full` runs the latter. Each failure message should
print the name of the failing test case, not just the suite name, and
should indicate that the passing tests passed.

Then, remove `sharness/bad.t`, and note that the above commands all
pass, with the `--full` variants still taking longer.

Finally, remove `sharness/good.t`, and note that the above commands all
pass (and all pass quickly).

wchargin-branch: add-sharness
2018-08-06 12:56:25 -07:00
William Chargin 48275590ba
Remove `clean-webpack-plugin` (#578)
Summary:
As of #577, this is no longer needed.

Test Plan:
Running `yarn && yarn test --full` suffices.

wchargin-branch: remove-clean-webpack-plugin
2018-07-31 19:10:03 -07:00
William Chargin 480bdf1bc7
Refine the build directory cleaning logic (#577)
Summary:
We were asking the `clean-webpack-plugin` to remove the `build/`
directory in all cases. However, Webpack accepts a command-line
parameter `--output-path`. When such a parameter is passed, we would be
removing the wrong directory.

The proper behavior is to remove “whatever the actual output path is”.
Webpack exposes this information, but it appears that the
`clean-webpack-plugin` does not take advantage of it. Therefore, this
commit includes a small Webpack plugin to do the right thing.

Test Plan:
Test that the behavior is correct when no output directory is specified:
```
mkdir -p build && touch build/wat && yarn build && ! [ -e build/wat ]
```

Test that the behavior is correct with an explicit `--output-path`:
```
outdir="$(mktemp -d)" && touch "${outdir}/wat" && \
    yarn build --output-path "${outdir}" && \
    ! [ -e "${outdir}/wat" ]
```

Test that the plugin refuses to remove the root directory:

```
! yarn build --output-path . && \
    sed -i '/path: /d' config/makeWebpackConfig.js && ! yarn build
```

(Feel free to comment out the actual `rimraf.sync` line in the plugin
when testing this.)

wchargin-branch: clean-actual-build-directory
2018-07-31 15:27:32 -07:00
William Chargin 3b5ad594bd
package.json: reorganize test commands (#571)
Summary:
Running `yarn test` (equiv. `npm test` or `npm run test`) now runs all
checks. It takes the place of the former `yarn travis`. This is more in
line with the expectation of a top-level `test` command: if it passes,
your code is good.

The `unit` command now runs Jest once, not in watch mode. It takes the
place of the former `ci-test`. To run tests in watch mode, run any of
the following:

  - `yarn unit --watch`, or
  - `npm run unit -- --watch`, or
  - `npm unit -- --watch`.

This behavior is more consistent with the standard behavior of commands
like `make test`. It is also empirically what @wchargin and
@decentralion want most of the time.

Test Plan:
Verify that each of the scripts `test`, `unit`, and `coverage` passes.
Verify that each of the aforementioned `--watch` invocations works.
Verify that `.travis.yml` has the correct `script:` command.

wchargin-branch: reorganize-test-command
2018-07-31 10:53:10 -07:00
William Chargin 0128df8c18
new-webpack: replace old scripts in `package.json` (#569)
Test Plan:
Run `yarn start` and note that everything checks out.

Run `yarn build && (cd build/ && python -m SimpleHTTPServer)` and note
that everything checks out, except that the static assets are of course
not included in the build.

wchargin-branch: webpack-replace-scripts
2018-07-30 18:10:54 -07:00
William Chargin b45ef739fe
new-webpack: clean `build/` before prod (#568)
Summary:
In our current system, we build by invoking `scripts/build.js`, which
begins by removing the `build/` directory. This behavior is nice,
because it prevents cross-contamination between builds. In this commit,
we add a plugin to achieve the same result from directly within Webpack.

Test Plan:
Run

```
mkdir -p ./build
touch ./build/wat
NODE_ENV=production node ./node_modules/.bin/webpack \
    --config config/makeWebpackConfig.js
```

and ensure that `./build/wat` does not exist after the build completes.

wchargin-branch: webpack-clean-build
2018-07-30 18:01:47 -07:00
William Chargin 873eca6350
Upgrade Flow to v0.76.0 (#546)
Summary:
In addition to a routine libdef update, we also need to work around a
particularly nasty new bug in Flow, which requires `any`-casts that are
even more unsafe than usual. That said, I think that it’s worth that
cost to remain up to date with Flow, so that we can amortize future such
issues.

Test Plan:
Running `yarn travis --full` passes.

wchargin-branch: upgrade-flow-v0.76.0
2018-07-27 15:54:59 -07:00
Dandelion Mané ec5b76a83d
Upgrade oclif packages to latest (#542)
This required adding a [files property] to the package.json,
otherwise oclif started complaining.

Test plan: I manually tested both CLI commands, and they seem fine.

[files property]: https://docs.npmjs.com/files/package.json#files
2018-07-27 14:32:30 -07:00
Dandelion Mané deaba09d00
Upgrade react and react-dom to latest (#541)
Test plan: `yarn travis --full`, as well as thorough manual frontend
testing
2018-07-27 13:27:19 -07:00
Dandelion Mané 28a31cfcd6
Reorganize dependencies and devDependencies (#540)
This commit is a good faith effort to separate our dependencies (code
that SourceCred app or CLI require to run) from devDependencies (all
other deps) in our package.json.

We don't have any actual dependents, so it's hard to test this
distinction. Hence, it's a good faith effort.

Test plan:
`rm -r node_modules && yarn && yarn travis` works.
2018-07-27 12:38:40 -07:00
Dandelion Mané 6b13ab64a0
Remove unused tensorflowjs dependency (#539)
Test plan: Travis
2018-07-27 12:29:04 -07:00
Dandelion Mané 797fb6331d
Upgrade jest to 23.4.1 (#537)
Flow types regenerated via flow-typed install jest@23.4.1

Test plan: Travis
2018-07-27 12:28:04 -07:00
Dandelion Mané e06b88364e
Add jest-fetch-mock as dev dependency (#528)
Also add config/jest/setupJest.js so we can configure jest-fetch-mock

Test plan: I have verified that mocked fetch works as expected in a
downstream commit.
2018-07-26 15:08:14 -07:00
William Chargin df76975fae Add static-site-generator-webpack-plugin
wchargin-branch: add-ssgwp
2018-07-23 13:29:48 -07:00
William Chargin 04acce4a4c Remove dependency on react-router-dom
Test Plan:
Note that nothing breaks, because we don’t actually have any routing
right now.

wchargin-branch: remove-rrv4
2018-07-23 13:29:48 -07:00
William Chargin 767a64bbe7 Add dependency on react-router^3.2.1
Summary:
We’re switching from RRv4 to RRv3. We’ll remove RRv4, which is currently
unused, in the next commit.

wchargin-branch: add-rrv3
2018-07-23 13:29:48 -07:00
William Chargin a5d19c80aa Remove `babel-plugin-flow-react-proptypes` (#457)
Summary:
Pending the resolution of brigand/babel-plugin-flow-react-proptypes#201,
we’re removing this plugin from our build, because it results in
incorrect code generation. We’ll be happy to add it back if the bug is
fixed.

Test Plan:
Fingers crossed.

wchargin-branch: remove-bpfrpt
2018-06-29 17:11:04 -07:00
William Chargin 0cc2907e9e
Add dependency on `commonmark` (#440)
Summary:
We plan to use this to more intelligently extract references from GitHub
text content. See #432.

Test Plan:
In a Node shell, running

```js
const cm = require("commonmark");
var parser = new cm.Parser();
var ast = parser.parse("Hello\nworld");
var html = new cm.HtmlRenderer({softbreak: " "}).render(ast);
console.log(html);
```

prints `<p>Hello world</p>`.

wchargin-branch: commonmark
2018-06-28 17:01:31 -07:00
William Chargin 55d5c58e05
Add a `coverage` package script (#365)
Summary:
We’re not mandating anything about coverage right now, but by making it
easier to track coverage perhaps people will organically become more
motivated to write good tests.

Test Plan:
Run `yarn coverage`, and then open `coverage/lcov-report/index.html`.

wchargin-branch: coverage
2018-06-08 11:32:27 -07:00
Dandelion Mané 6585700c0c
Upgrade flow to 0.73 (#338)
It still passes :)

Test plan: Travis
2018-06-04 14:22:10 -07:00
Dandelion Mané 3ac051b16c
Upgrade prettier to 1.13 (#335)
Release notes: https://prettier.io/blog/2018/05/27/1.13.0.html
2018-06-04 13:02:17 -07:00
William Chargin ab10e1746c
Remove `lint-staged` pre-commit hook (#300)
Summary:
This just slows down commits by a few seconds. We `check-pretty` in
Travis, so this doesn’t actually catch anything—and, anecdotally, it has
never caught anything for me because I automatically run `prettier` on
save and also (almost) always run Travis before pushing.

Test Plan:
Run `git commit --amend --no-edit` and note that it is now fast!

wchargin-branch: no-lint-on-commit
2018-05-25 19:25:43 -07:00
Dandelion Mané 180c3454af
Upgrade babel-plugin-flow-react-proptypes to 23 (#294)
Will allow us to use opaque types in #292 without breaking the build
in #293.

Test plan:
If travis passes, we're good.
2018-05-21 11:11:21 -07:00
William Chargin f31d2c517d
Upgrade Flow to v0.72.0 (#285)
Summary:
A few changes were made to code that is correct (as far as I can tell),
but for which Flow can no longer infer a type parameter. The change is a
bit more annoying than it otherwise would be, because this particular
file is run directly via node and so must use Flow’s comment syntax for
type annotations, but Prettier breaks such comments in the cases that we
need. We work around this by rewriting the original code to avoid the
need for comments.

Test Plan:
In addition to standard CI, run `yarn build` and then run a server from
`build/`, to see that the production build produces a working bundle.
(That the app loads and renders is sufficient.)

wchargin-branch: upgrade-flow-v0.72.0
2018-05-15 17:09:29 -07:00
Dandelion Mané 6ca4f77b6d
Add dependency on tfjs-core (#250) 2018-05-09 10:22:48 -07:00
Dandelion Mané 0149d74971 Add react-router-dom
This commit adds a npm and flow-typed dependency, with no functional
change.

Test plan: `yarn travis` passes.
2018-05-08 12:55:38 -07:00
William Chargin 18ddbfff3e
Add dependency on express (#233)
wchargin-branch: express
2018-05-07 20:05:52 -07:00
Dandelion Mané 72ca52f579
Change package.json name to sourcecred (#215) 2018-05-05 00:04:42 -07:00
Dandelion Mané fa4082c95b
Minimal toy oclif integration (#214)
This commit adds [oclif] as a command-line framework. It is successfully
integrated with webpack.

[oclif]: https://github.com/oclif/oclif

Usage:
`yarn backend` to build the cli.
`node bin/sourcecred.js` to launch the CLI and see usage
`node bin/sourcecred.js example` for one example command
`node bin/sourcecred.js goodbye` for another example command
2018-05-04 19:28:37 -07:00
William Chargin e9dbdeca96
Target latest Node for backend applications (#213)
Summary:
Consequently, Babel won’t transform classes to their roughly equivalent
ES5 counterparts, etc.

Test Plan:
Create `src/classy.js` with `class X {}; console.log(X);`. Then, add a
build target for `classy: resolveApp("src/classy.js"),` in `paths.js`.
Use `yarn backend` and inspect the contents of `bin/classy.js`; in
particular, look at the definition of `X` (whatever the argument to
`console.log` is). Before this commit, the result will be a big
complicated mess. After this commit, it will be `class X {}`.

Note also that `yarn travis --full` passes, indicating that the two
manual tests, which call out to the utilities in `bin/`, still work.

wchargin-branch: target-node
2018-05-04 19:22:39 -07:00
William Chargin b5e894bbb4
Fork `babel-preset-react-app` into config/ (#212)
Summary:
We want to change this configuration so that our compilation of backend
applications can target latest Node. This commit forks the current
configuration so that we can modify it easily.

Test Plan:
Both `yarn start` and `yarn travis` work. The generated backend
applications work, too.

wchargin-branch: fork-babel-config
2018-05-04 19:19:45 -07:00
Dandelion Mané de5542de6a
Exclude node modules from backend build (#211)
Setup following directions from [webpack-node-externals]

[webpack-node-externals]: https://www.npmjs.com/package/webpack-node-externals

This unblocks #210.

Test plan: `yarn backend` still succeeds, and the binary scripts still
work. The resultant binaries are much smaller, as seen below (note build
time is the same).

before:
```
❯ yarn backend
yarn run v1.5.1
$ node scripts/backend.js
Building backend applications...
Compiled successfully.

File sizes after gzip:

  231.37 KB  bin/printCombinedGraph.js
  199.5 KB   bin/fetchAndPrintGithubRepo.js
  46.41 KB   bin/cloneAndPrintGitGraph.js
  21.48 KB   bin/createExampleRepo.js
  17.71 KB   bin/loadAndPrintGitRepository.js

Build completed; results in 'bin'.
Done in 4.46s.
```

after:
```
❯ yarn backend
yarn run v1.5.1
$ node scripts/backend.js
Building backend applications...
Compiled successfully.

File sizes after gzip:

  27.78 KB  bin/printCombinedGraph.js
  12.73 KB  bin/cloneAndPrintGitGraph.js
  12.41 KB  bin/fetchAndPrintGithubRepo.js
  6.03 KB   bin/loadAndPrintGitRepository.js
  5.52 KB   bin/createExampleRepo.js

Build completed; results in 'bin'.
Done in 4.28s.
```
2018-05-04 16:31:39 -07:00
William Chargin 38f4121ce9
Implement a custom CI script (#189)
Summary:
This CI script accomplishes two tasks:
 1. It speeds up our build by parallelizing where possible.
 2. It opens the possibility for running Travis cron jobs.

Currently, this script by default does the same amount of work as our
current CI script. However, I’d like to move `yarn backend` into the
list of basic actions: a backend build failure should fail CI.

Note: this script is written to be executable directly by Node, so we
can’t use Flow types with the standard syntax. Instead, we use the
comment syntax: https://flow.org/en/docs/types/comments/

Test Plan:
The following should pass with useful output:
  - `npm run travis`
  - `GITHUB_TOKEN="your_github_token" npm run travis -- --full`

The following should fail with useful output:
  - `npm run travis -- --full` (fail)

To test different failure modes, it can be helpful to add
```js
    {id: "doomed", cmd: ["false"], deps: []},
    {id: "orphan", cmd: ["whoami"], deps: ["who", "are", "you"]},
```
to the list of `basicTasks` in `travis.js`.

To test performance:
```shell
$ time node ./config/travis.js >/dev/null 2>/dev/null

real    0m8.306s
user    0m20.336s
sys     0m1.364s

$ time bash -c \
>     'npm run check-pretty && npm run lint && npm run flow && CI=1 npm run test' \
>     >/dev/null 2>/dev/null

real    0m12.427s
user    0m13.752s
sys     0m0.804s
```
A 50% savings is not bad at all—and the raw time saved should only
improve from here on, as the individual steps start taking more time.

wchargin-branch: custom-ci
2018-05-02 16:10:03 -07:00
William Chargin 5af5748ed7
Convert in-memory Git repos to cred graphs (#169)
Test Plan:
This snapshot test is too unwieldy to actually read—it’s 1000 lines of
opaque SHAs and thrice-stringified JSON objects—so it should be
interpreted as a regression test only. The programmatic tests should
suffice.

wchargin-branch: wip-git-create-graph
2018-04-30 15:23:37 -07:00
William Chargin f3a440244e
Fix all lint errors, adding a lint CI step (#175)
Test Plan:
Run `yarn lint` and `yarn travis` and observe success. Add something
that triggers a lint warning, like `const zzz = 3;`; re-run and observe
failures.

wchargin-branch: lint
2018-04-30 14:52:28 -07:00
Dandelion Mané 28e686c369
Remove `address.sortedByAddress` (#161)
Previously, the address module exported `sortedByAddress`, a utility
function that sorts an array of `Addressable`s. This function was only
used in test code.

This commit replaces it with generic usage of `lodash.sortBy`. This
reduces the API surface area of the module, and removes test-only code
from the exported api.

New dependency added: `lodash.sortby`
https://www.npmjs.com/package/lodash.sortby
2018-04-27 14:29:49 -07:00
William Chargin 418b745d7c
Load Git repositories into memory (#139)
Summary:
In this newly added module, we load the structural state of a git
repository into memory. We do not load into memory the contents of any
blobs, so this is not enough information to perform any analysis
requiring file diffing. However, it is sufficient to develop a notion of
“this file was changed in this commit”, by simply diffing the trees.

Test Plan:
Unit tests added; `yarn test` suffices. Reading these snapshots is
pretty easy, even though they’re filled with hashes:
  - First, read over the commit specifications on lines 69–83 of
    `loadRepository.test.js`, so you know what to expect.
  - In the snapshot file, keep handy the time-ordered list of commit
    SHAs at the bottom of the file, so that you know which commit SHA is
    which.
  - To verify that the large snapshot is correct: for each commit, read
    the corresponding tree object and make sure that the structure is
    correct.
  - To verify the small snapshot, just check that it’s the correct
    subset of the large snapshot.
  - If you want to verify that the SHA for a blob is correct, open a
    terminal and run `git hash-object -t blob --stdin`; then, enter the
    content of the blob and press `<C-d>`. The result is the blob SHA.

To run a sanity-check on a large repository: apply the following patch:

<details>
<summary>Patch to print out statistics about loaded repository</summary>

```diff
diff --git a/config/paths.js b/config/paths.js
index d2f25fb..8fa2023 100644
--- a/config/paths.js
+++ b/config/paths.js
@@ -62,5 +62,6 @@ module.exports = {
     fetchAndPrintGithubRepo: resolveApp(
       "src/plugins/github/bin/fetchAndPrintGithubRepo.js"
     ),
+    loadRepository: resolveApp("src/plugins/git/loadRepository.js"),
   },
 };
diff --git a/src/plugins/git/loadRepository.js b/src/plugins/git/loadRepository.js
index a76b66c..9380941 100644
--- a/src/plugins/git/loadRepository.js
+++ b/src/plugins/git/loadRepository.js
@@ -106,3 +106,7 @@ function findTrees(git: GitDriver, rootTrees: Set<Hash>): Tree[] {
   }
   return result;
 }
+
+const result = loadRepository(...process.argv.slice(2));
+console.log("commits", result.commits.size);
+console.log("trees", result.trees.size);
```
</details>

Then, run `yarn backend` and put the following script in `test.sh`:

<details>
<summary>Contents for `test.sh`</summary>

```shell
#!/bin/bash
set -eu

repo="$1"
ref="$2"

via_node() {
    node bin/loadRepository.js "${repo}" "${ref}"
}

via_git() (
    cd "${repo}"
    printf 'commits '
    git rev-list "${ref}" | wc -l
    printf 'trees '
    git rev-list "${ref}" |
        while read -r commit; do
            git rev-parse "${commit}^{tree}"
            git ls-tree -rt "${commit}" \
                | grep ' tree ' \
                | cut -f 1 | cut -d ' ' -f 3
        done | sort | uniq | wc -l
)

echo
printf 'Running directly via git...\n'
time a="$(via_git)"

echo
printf 'Running Node script...\n'
time b="$(via_node)"

diff -u <(cat <<<"${a}") <(cat <<<"${b}")
```
</details>

Finally, run `./test.sh /path/to/some/repo origin/master`, and verify
that it exits successfully (zero diff). Here are some timing results on
SourceCred and TensorBoard:

  - SourceCred: 0.973s via Node, 0.327s via git.
  - TensorBoard: 30.836s via Node, 6.895s via git.

For TensorFlow, running via git takes 7m33.995s. Running via Node fails
with an out-of-memory error after 39 minutes, with 10GB RAM and 4GB
swap. See details below.

<details>
<summary>
Full timing details, commit SHAs, and OOM error message
</summary>

```
+ ./test.sh /home/wchargin/git/sourcecred 01634aabcc

Running directly via git...

real	0m0.327s
user	0m0.016s
sys	0m0.052s

Running Node script...

real	0m0.973s
user	0m0.268s
sys	0m0.176s
+ ./test.sh /home/wchargin/git/tensorboard 7aa1ab9d60671056b8811b7099eec08650f2e4fd

Running directly via git...

real	0m6.895s
user	0m0.600s
sys	0m0.832s

Running Node script...

real	0m30.836s
user	0m3.216s
sys	0m10.588s
+ ./test.sh /home/wchargin/git/tensorflow 968addadfd4e4f5688eedc31f92a9066329ff6a7

Running directly via git...

real	7m33.995s
user	5m21.124s
sys	1m5.476s

Running Node script...
FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory
 1: node::Abort() [node]
 2: 0x121a2cc [node]
 3: v8::Utils::ReportOOMFailure(char const*, bool) [node]
 4: v8::internal::V8::FatalProcessOutOfMemory(char const*, bool) [node]
 5: v8::internal::Factory::NewFixedArray(int, v8::internal::PretenureFlag) [node]
 6: v8::internal::DeoptimizationInputData::New(v8::internal::Isolate*, int, v8::internal::PretenureFlag) [node]
 7: v8::internal::compiler::CodeGenerator::PopulateDeoptimizationData(v8::internal::Handle<v8::internal::Code>) [node]
 8: v8::internal::compiler::CodeGenerator::FinalizeCode() [node]
 9: v8::internal::compiler::PipelineImpl::FinalizeCode() [node]
10: v8::internal::compiler::PipelineCompilationJob::FinalizeJobImpl() [node]
11: v8::internal::Compiler::FinalizeCompilationJob(v8::internal::CompilationJob*) [node]
12: v8::internal::OptimizingCompileDispatcher::InstallOptimizedFunctions() [node]
13: v8::internal::Runtime_TryInstallOptimizedCode(int, v8::internal::Object**, v8::internal::Isolate*) [node]
14: 0x12dc8b08463d
```
</details>

wchargin-branch: load-git-repositories

# Please enter the commit message for your changes. Lines starting
# with '#' will be kept; you may remove them yourself if you want to.
# An empty message aborts the commit.
#
# Date:      Mon Apr 23 23:02:14 2018 -0700
#
# HEAD detached at origin/wchargin-load-git-repositories
# Changes to be committed:
#	modified:   package.json
#	new file:   src/plugins/git/__snapshots__/loadRepository.test.js.snap
#	new file:   src/plugins/git/loadRepository.js
#	new file:   src/plugins/git/loadRepository.test.js
#
# Untracked files:
#	out
#	runtests.sh
#	src/plugins/artifact/editor/ArtifactSetInput.js
#	src/plugins/git/repository.js
#	test.sh
#	todo
#
2018-04-24 13:57:10 -07:00