Summary:
This way, our frontend can talk to a backend that can read from the
filesystem (among other things).
Paired with @decentralion.
Test Plan:
```
$ yarn backend
$ SOURCECRED_DIRECTORY=/tmp/srccrd yarn start
$ # verify that the browser looks good
$ mkdir /tmp/srccrd
$ echo hello >/tmp/srccrd/world
$ curl localhost:3000/api/v1/data/world
hello
$ curl localhost:4000/api/v1/data/world
hello
```
wchargin-branch: webpack-proxy
Fixing the flow error corresponded to (correctly) documenting that the
GitHub token is mandatory, not optional.
Test plan: `yarn travis --full` still passes.
Summary:
This way, different plugins can have `LocalStore`s with different cache
keys.
Test Plan:
Note that the app (`yarn start`) preserves the local store values from
before this change, and that updates continue to work.
wchargin-branch: generic-localstore
This commit modifies App.js to use routing, such that it's possible to
navigate between a home screen, and the Artifact Editor.
Test plan: Run `yarn start`, and navigate between the home screen and
artifact editor. Verify that the artifact editor still works as
expected.
This commit adds src/app/App.js, which proxies in the frontend from
src/plugins/artifact/editor/App.js. The observed behavior (run `yarn
start`; see Artifact Editor) is unchanged.
Test plan: Observe that `yarn start` has the same behavior, and travis
passes.
This commit executes a micro-refactor to move all top-level app setup
code out of src/plugins/artifact/editor and into src/app. The observed
behavior from `yarn start`, which is to show the artifact editor, is
unchanged.
Summary:
We need a way for our web applications to interact with data on the
filesystem. In this commit, we introduce a webserver that serves
statically from two directory trees: first, the result of a live-updated
Webpack build; second, the SourceCred data directory.
Test Plan:
Run `yarn backend` and `node ./bin/sourcecred.js start`. When ready,
navigate to the server’s root route in a web browser. Note that a nice
React app is displayed. Then, change something in that React app source.
Note that the server console displays Webpack’s update messages, and
that refreshing the page in the browser renders the new version of the
app. Finally, visit
/__data__/graphs/sourcecred/example-github/graph.json
in the browser to see the graph for the example repository, assuming
that you had generated its graph previously.
wchargin-branch: start
This script ensures that either //@flow or //@no-flow is present in
every js file. Every existing js file that would fail this check has
been given //@no-flow, we should work to remove all of these in the
future.
Test plan:
I verified that `yarn travis` fails before fixing the other js files,
and passes afterwards.
Now that we have repository nodes (#171), it makes sense that the Github
porcelain should provide a way to wrap the entire graph, and provide
easy access for the various repositories. This adds a `Porcelain` class
to fulfill that need.
The `Porcelain` is very straightforward: it takes in the whole graph,
and gives a way to get all the Repositories, or to request a particular
Repository by owner/name. In the odd case wherein a graph contains
multiple repository nodes with the same owner and name, an error is
thrown. Per standard JS map semantics (bleh), it can return undefined if
there is no matching repository.
Test plan:
See that the unit tests now use the standard behavior, and a test
verifies behavior for non-existant repositories. I don't have a test
case where there are multiple repo nodes, but that itself would be an
error, so throwing an error in that case is just defensive programming.
This commit creates a new node type in the GitHub graph: the REPOSITORY
node. The REPOSITORY node has the following payload properties:
- url (string)
- name (string)
- owner (string)
Things this commit does:
- Add new node type and payload type (RepositoryNodePayload)
- Update parser to instantiate the new node type
- Update api.js to have Repository wrap the new node type (thus
Repository is a GitHub entity)
- Update snapshots
- Update users of GitHub node types to ensure they are exhaustive
Things that will come in a followon commit:
- Add CONTAINS edges from the repository to all its PRs and Issues
- Update the Repository porcelain to use those edges, rather than
scanning the graph for every possible Issue/PR (eventually those might
belong to other Repositories)
- Create a GitHubGraph abstraction in the porcelain, which makes it easy
to find all of the Repositories in a graph
Note that retrieving the repository owner technically involved fetching
the whole owner representation (as a GitHub user). I could have chosen
to add that user to the graph, with a "OWNS" edge pointing to the
repository. For simplicity's sake, I've declined to do that, and instead
just parse the owner's name directly.
Test plan:
Added tests to verify that the Repository porcelain entity has the right
properties. Combined with the snapshot tests, that should be sufficient.
When a GitHub user delete their account, all of their comments remain,
but with a `null` author. Previously, our code did not account for this
possibility. Now it does (by simply not adding an AUTHORSHIP edge).
Conveniently, our porcelain API already represents authors as a list, so
this doesn't require any change in porcelain API usage.
Test plan:
I did not add any unit tests, simply because
creating-and-deleting a GitHub user to make a repro seemed like a bit of
a pain. However, it is very unlikely that this bug will re-occur,
because the nullability of AuthorJSON is now enforced at the type level
inside graphql.js.
Also, running `node bin/sourcecred.js src-d go-git` now succeedsinstead
of failing.
Summary:
This solves two problems:
1. The “output directory” argument to `sourcecred graph` is also the
input directory to other commands, like `sourcecred analyze`
(hypothetically). In such cases, it would be nice for the flag to
have the same name, but clearly `--output-directory` does not always
make sense.
2. In addition to storing graphs, we’ll need to store other kinds of
data: settings, intermediate data for plugins to cache results, etc.
We should store these under a single umbrella.
With both of these problems in mind, it makes sense to create a
`SOURCECRED_DIRECTORY` flag under which we store all relevant data.
Test Plan:
Run:
```shell
$ yarn backend
$ node bin/sourcecred.js help graph
$ node bin/sourcecred.js graph sourcecred example-github
$ node bin/sourcecred.js graph sourcecred example-github -d /tmp/sorcecrod
$ SOURCECRED_DIRECTORY=/tmp/srccrd node bin/sourcecred.js graph sourcecred example-github
$ for dir in /tmp/{sourcecred,sorcecrod,srccrd}; do find "${dir}"; done
```
wchargin-branch: graph-directory
Our SourceCred CLI tool now ipmlements printCombinedGraph and
cloneAndPrintGitGraph, but with more principled implementations and
interfaces :)
Test plan:
`yarn travis --full` passes, so I didn't delete any needed test infra.
Previously, if a CLI command had an unhandled promise rejection, this
would result in a spurious success and zero exit value.
This commit causes all of our CLI commands to instead fail if they have
an unhandled promise rejection.
Test plan: Previously, `sourcecred graph src-d go-git` would claim to
succeed, although it actually fails due to an unrelated bug. After this
change is applied, it correctly fails to retrieve the GitHub graph (and
hte combine step is never run).
This commit pulls new information from GitHub about the url, name, and
owner of a GitHub repository. Part of #171.
Test plan: example-github-repo.json has been updated. `yarn travis
--full` passes. This should be sufficient.
Currently, we generate a `RepositoryJSON` object via querying GitHub.
That `RepositoryJSON` object has a `repository` field... which is weird,
and suggests we got the names slightly wrong.
This commit renames the top-level response to `GithubResponseJSON`, and
factors the `repository` field out as `RepositoryJSON`. Correspondingly,
the `addData` and `addRepository` methods on the parser are now
distinct.
This is a precursor for #171.
Test plan: This is a simple refactor; the fact that yarn travis passes
should be sufficient.
Test Plan:
Run `node bin/sourcecred.js graph sourcecred example-github` and note
the new output:
```
Storing graphs into: /tmp/sourcecred/sourcecred/example-github
Starting tasks
GO create-git
GO create-github
DONE create-git
DONE create-github
GO combine
DONE combine
Full results
DONE create-git
DONE create-github
DONE combine
Overview
Final result: SUCCESS
```
wchargin-branch: plugin-graph-label
Summary:
The `"PASS"` label only makes sense for tests. This commit makes the
labels configurable, so that the verbiage can make more sense in other
contexts, too.
Test Plan:
Apply a patch like
```diff
diff --git a/config/travis.js b/config/travis.js
index af0996b..b0ab3b6 100644
--- a/config/travis.js
+++ b/config/travis.js
@@ -10,7 +10,11 @@ function main() {
process.argv.includes("--full")
? "FULL"
: "BASIC";
- execDependencyGraph(makeTasks(mode)).then(({success}) => {
+ execDependencyGraph(makeTasks(mode), {
+ taskLaunchLabel: " YO ",
+ taskPassLabel: "WHEE",
+ taskFailLabel: "UHOH",
+ }).then(({success}) => {
process.exitCode = success ? 0 : 1;
});
}
```
and note that `GITHUB_TOKEN=none yarn travis --full` exhibits all
desired messages.
wchargin-branch: configurable-labels
Summary:
This commit implements the `sourcecred` command-line utility, which has
three subcommands:
- `plugin-graph` creates one plugin’s graph;
- `combine` combines multiple on-disk graphs; and
- `graph` creates all plugins’ graphs and combines them.
As an implementation detail, the `into.sh` script is very convenient,
avoiding needing to do any pipe management in Node (which is Not Fun).
When we build for release, we may want to factor that differently.
Test Plan:
To see it all in action, run `yarn backend`, and then try:
```
$ export SOURCECRED_GITHUB_TOKEN="your_token_here"
$ node ./bin/sourcecred.js graph sourcecred sourcecred
Using output directory: /tmp/sourcecred/sourcecred
Starting tasks
GO create-git
GO create-github
PASS create-github
PASS create-git
GO combine
PASS combine
Full results
PASS create-git
PASS create-github
PASS combine
Overview
Final result: SUCCESS
$ ls /tmp/sourcecred/sourcecred/
graph-github.json graph-git.json graph.json
$ jq '.nodes | length' /tmp/sourcecred/sourcecred/*.json
1000
7302
8302
```
The `node sourcecred.js graph` command takes 9.8s for me.
(The salient point of the last command is that the two small graphs have
node count adding up to the node count of the big graph. Incidentally,
we are [almost][1] at a nice round number of nodes in the GitHub graph.)
[1]: https://xkcd.com/1000/
wchargin-branch: cli
Summary:
To be honest, I have no idea what exactly this does or why it’s
necessary, but if we don’t do this then it is not possible to `import`
the exported member from a Webpack-bundled script. I’ve seen this
pattern before; one day I’ll actually figure out what it does. :-)
Test Plan:
Note that `yarn travis` (success) and `yarn travis --full` (failure; no
GitHub token) both have the expected behaviors.
wchargin-branch: execdependencygraph-export
This commit adds [oclif] as a command-line framework. It is successfully
integrated with webpack.
[oclif]: https://github.com/oclif/oclif
Usage:
`yarn backend` to build the cli.
`node bin/sourcecred.js` to launch the CLI and see usage
`node bin/sourcecred.js example` for one example command
`node bin/sourcecred.js goodbye` for another example command
Summary:
We’d like to use the same abstraction for creating multiple cred graphs
and then combining them together. This will enable us to do that.
Test Plan:
Run `yarn travis` to test the success case, and `yarn travis --full`
(without setting a `GITHUB_TOKEN`) to test the failure case.
wchargin-branch: execdepgraph
`printCombinedGraph` loads and prints a cross-plugin combined
contribution graph for a given GitHub repository.
It is a simple executable wrapper around `src/tools/loadCombinedGraph`.
Example usage:
`node bin/printCombinedGraph.js sourcecred example-git $GITHUB_TOKEN`
`cloneAndPrintGitGraph` clones a git repository, and generates a Git
object graph for that repository.
This can be run as follows:
```
yarn backend;
node bin/cloneAndPrintGitGraph sourcecred example-git
```
This commit also adds two utility modules:
* `cloneAndLoadRepository` , which clones a Git repository to a tmpdir,
parses the `Repository` data out, and then cleans up.
* `cloneGitGraph`, which calls `cloneAndLoadRepository` and `createGraph`
Test plan: These don't fit well into our CI, because they require
network access to clone repositories from GitHub. I verified that the
functions work via the demo script above.
fetchGithubGraph is a tiny module which is responsible for fetching
GitHub contribution data, and parsing it into a graph.
Test plan:
The function is trivial and does not merit explicit testing.
Summary:
If a commit causes a tree entry to change hash while keeping the same
name, we now add a BECOMES edge between the corresponding entries.
Test Plan:
Snapshot changes are readable enough to manually verify. Programmatic
tests also added.
wchargin-branch: graph-becomes-edges
Test Plan:
For the snapshot: verify that two of the BECOMES edges are the same as
those tested in `findBecomesEdgesForCommits` and that they have the
right commit hashes; then, verify that the remaining edge is correct.
wchargin-branch: high-level-becomes-repository
This adds MERGED_AS edges which link from a PullRequest to a Commit. It
adds a corresponding `mergedCommitHash` method on the porcelain PR that
returns the hash of the merged commit (if available).
I would have preferred to return a porcelain wrapper over the commit,
but since we don't have a porcelain Git api, it seemed preferrable to
return the hash as a string. Returning a Node would both break
consistency in the porcelain api, and be problematic as the node does
not necessarily exist in the api. To ensure that the hash is available
without parsing Addresses, I used the edge payload. :)
Test plan:
Inspect the snapshot changes in the graph (they are fairly readable) and
the api testing in api.test.js.
This commit adds a few fields to the PullRequest query fragment so that
we now retrieve merge commit shas. In cases where there is no merge
commit (ie the PR did not merge), the field is null. Observe that this
is the case for our example unmerged pull request.
Test plan: Inspect the changes to the demo data, and verify that they
are appropriate.
For the GitHub plugin to create edges pointing to commits from the Git
plugin, it needs a way to create the appropriate address given the
commit's hash. This commit exposes that functionality by moving
`makeAddress` out of the "createGraph" module and into a new "address"
module, and using it to implement `commitAddress`.
Test plan: The code is so trivial that I don't think it merits testing.
Our repository containing example GitHub data has been renamed from
"sourcecred/example-repo" to "sourcecred/example-github". This commit
updates all snapshots and tests to reflect this rename.
See [#190] for context.
The change is almost entirely straightforward; the only "interesting"
decision I made was to move the repo owner and repo name into the string
id for the Artifact Plugin addresses, as the id would otherwise not be
unique.
[#190]: https://github.com/sourcecred/sourcecred/issues/190#issuecomment-386362870
Summary:
Previously, a tree entry had exactly one `HAS_CONTENTS` edge, unless the
tree entry corresponded to a submodule commit, in which case we had no
information at all. Now, submodule commit tree entries point to zero or
more `SUBMODULE_COMMIT` nodes. In the vast majority of cases, there will
be exactly one such node—however, it is possible that a particular tree
entry could correspond to multiple submodules (clone two identical
submodules with different URLs) or none at all (some manually edited
`.gitmodules` or other corruption).
Test Plan:
The snapshot changes are easy enough to read and verify (two new nodes
and five new edges). Additionally, the path-to-submodule `createGraph`
test has been updated.
wchargin-branch: graph-submodule-urls
Summary:
In Git, a tree may point to a commit directly. In our graph, we’d like
to represent “submodule commits” explicitly, because, a priori, we do
not know the repository to which the commit belongs. A submodule commit
node will store the hash of the referent commit, as well as the URL to
the subproject as listed in the .gitmodules file. In this commit, we
load the list of those URLs into the in-memory repository.
Shout-out to `git` for having an excellent command-line API:
the `--blob` argument to `git-config` is perfect for this situation.
Test Plan:
Snapshot updates are readable and sufficient.
wchargin-branch: load-submodule-urls
Summary:
Prior to this commit, given a `Tree` node with an edge to a `TreeEntry`
node, there was no way to tell what the entry name was other than
parsing the ID (which should never be required). This adds appropriate
data to the payload of a `TreeEntry`, and also to the inclusion edge (so
that if you only have the edge, you don’t have to fetch the entry).
Test Plan:
Snapshot changes are readable.
wchargin-branch: treeentry-metadata
Summary:
This can be useful for speed, but it can also be important for
correctness (at least theoretically): if we run both these scripts
concurrently, then we don’t want one of them to squash the `bin`
directory while the other is about to invoke an executable therein.
One might note that the diffs to the two files in this commit are
virtually identical, and indeed the files themselves are quite similar.
I’d prefer to keep the duplication for now; if we really need a Bash
snapshot testing framework, we can factor one out.
Test Plan:
Run each script with `--help`, with `--build` and `--no-build`, and with
and without `-u`.
wchargin-branch: optional-rebuild
Summary:
To avoid confusion, we simultaneously remove unused capturing groups.
This is not strictly necessary, but it makes the code less brittle.
Test Plan:
The newly added test fails before the change to `findReferences.js`.
wchargin-branch: url-punctuation
This commit adds the `addReferenceEdges()` method to the GitHub parser,
which examines all of the posts in the parsed graph and adds References
edges when it detects references between posts. As an example, `Hey
@wchargin, take a look at #1337` would generate two references.
We currently parse the following kinds of references:
- Numeric references to issues/PRs.
- Explicit in-repository url references (to any entity)
- @-author references
We do not parse:
- Cross-repository urls
- Cross-repository shortform (e.g. `sourcecred/sourcecred#100`)
`Parser.parse` calls `addReferenceEdges()`, so no change is required by
consumers to have reference edges added to their graphs.
The GitHub porcelain API layer now includes methods for retreiving the
entities referenced by a post.
Test plan:
This commit is tested both via snapshot tests, and explicit testing at
api layer. (Actually, the creation of the porcelain API layer was
prompted by wanting a cleaner way to test this commit.) I recommend
inspecting the snapshot tests for sanity, but mostly relying on the
tested behavior in api.test.js.
I added a lot of new comments that have url references to different
types of GitHub entities, e.g. to pull request review comments.
The commit was generated by running the example repo fetcher, and
running yarn test and updating snapshots.