Summary:
This implements the following functions for the unified addresses:
- assertions: `assertValid`, `assertValidParts`
- injection: `fromParts`
- projection: `toParts`
(These functions depend on each other for testing, so we implement them
together.)
Test Plan:
Unit tests included. Run `yarn travis`.
wchargin-branch: address-assertion-injection-projection
Summary:
This commit implements all the code needed for the top-level
`makeAddressModule` function, without implementing any of the address
functions. This mostly comprises handling of errors in the module
options.
Test Plan:
Unit tests added. Run `yarn travis`.
wchargin-branch: address-error-handling
Summary:
We have `NodeAddress` and `EdgeAddress`, which are opaque aliases of
`string` each with separate associated functions. We really want to keep
this setup: having the address types be structurally distinct is very
nice. But currently the implementation is extremely repetitive. Core
functionality is implemented twice, once for nodes and once for edges.
The test code is even worse, as it is forced to include ugly, hacky,
parametric generalizations to test both implementations—which are really
the same code, anyway!
In this commit, we introduce a system to unify the _implementations_
while keeping the _APIs_ separate. That is, users still see separate
opaque types `NodeAddressT` and `EdgeAddressT`. Users now also see
separate modules `NodeAddress` and `EdgeAddress`, each of which
implements the same interface for its appropriate type. These modules
are each implemented by the same address module factory.
To get this to work, we clearly need to parameterize the module type
over the address type. The problem is getting this to work in a way that
interacts nicely with the opaque types. The trick is to let the factory
always return a transparent module at type `string`, but to then
specialize the type of the resulting module in the module in which the
underlying type of the opaque type is known.
This commit includes specifications for all functions that are in the
current version of the API, but includes only as much implementation
code as is needed to convince me that tests and Flow are actually
running (i.e., very little). I’ll send implementations in separate PRs
for easier review.
The preliminary modules created in this commit _are_ exported from the
graph module, even though they are incomplete. This is so that we can be
sure that nothing will catch fire in Flow-land when we try to export
them (which is plausible, given that we have nontrivial interactions
between opaque types and parametric polymorphism).
Test Plan:
Unit tests included. Run `yarn travis`.
wchargin-branch: address-unified-foundations
Summary:
In actual code, we almost always call `neighbors` with a specified
direction. Usually, you want to inspect some relation like “parents of
this commit” or “GitHub nodes referenced by this comment”, and so the
edge direction matters. In each of the above cases, forgetting to
include the direction would introduce a bug: you’d get parents _and
children_ of a commit, or GitHub nodes referenced by _or that refer to_
a comment. It’s easy to forget this when writing the code, so we prefer
to make an explicit direction required, and allow people to specify
`Direction.ANY` in the case that that is what they want.
(In other words: we want to make the common thing easy, the uncommon
thing possible, and the wrong thing impossible.)
A similar situation holds for filters. By forcing clients to think about
what kinds of edges they want to follow to what kinds of nodes, we
encourage them to write more robust code. As before, if clients do want
to consider all nodes or all edges, they can pass the appropriate empty
address (`nodeAddress([])` or `edgeAddress([]`), which is a prefix of
every address.
Therefore, we require that clients pass an `options` object to
`neighbors`, and we furthermore require that each of the three options
be present.
Paired with @decentralion, in spirit.
Test Plan:
None; this changes the API for a function that has no implementation or
clients.
wchargin-branch: neighbors-options-required
Summary:
This saves clients from having to pollute their global namespace with
`IN` and `OUT` (and, soon, `ANY`). Calls now look like:
```js
myNode.neighbors({direction: Direction.OUT});
```
Callers who prefer to pollute their namespaces are of course still
welcome to do so, at the cost of one `const {IN, OUT} = Direction`.
Test Plan:
New unit tests for frozenness included.
wchargin-branch: direction-enum
Summary:
These functions can be used for address filtering: finding all nodes
owned by a plugin, or all edges of a certain plugin-type, or similar.
Clients can implement this in terms of `toParts`, but when the
underlying type of the opaque type is known there exists a simpler and
more efficient implementation.
Test Plan:
Unit tests added. Run `yarn travis`.
wchargin-branch: address-prefix
Summary:
This fixes an organizational error on my part: the assertions were not
in `it`-blocks, so (a) no test cases were listed, and (b) assertion
failure would have been considered an error in the test suite itself.
(This wouldn’t have led to tests spuriously passing, though.)
Test Plan:
Note the new test case structure:
```
nodeToString
errors on
✓ null base input
✓ undefined base input
✓ wrong kind
works on
✓ the empty address
✓ the address with one empty component
✓ a normal address
edgeToString
errors on
✓ null base input
✓ undefined base input
✓ wrong kind
works on
✓ the empty address
✓ the address with one empty component
✓ a normal address
```
wchargin-branch: tostring-test-case-structure
On larger repos (e.g. `ipfs/js-ipfs` and `ipfs/go-ipfs`) our GitHub
query tends to fail with an opaque (possibly timeout) error message.
Based on a discussion with @mikeal, it sounds like this is probably
GitHub killing queries that "take too long". This commit reduces our
default page sizes so that we make more smaller queries. Empirically, it
seems to help, and it's very unlikely to break anything.
Test plan:
Run
```
yarn backend
node bin/sourcecred.js plugin-graph --plugin=github ipfs js-ipfs \
> graph.json
```
before and after this change. Before, it fails; after, it succeeds.
This commit implements the following edge related methods on graph:
- `Graph.addEdge(edge)`
- `Graph.hasEdge(address)`
- `Graph.edge(address)`
- `Graph.edges()`
We've decided to enforce an invariant that for every edge, its `src` and
`dst` nodes must be present in the graph. As such, `Graph.addEdge` may
error if this condition is not true for the proposed edge, and
`Graph.removeNode` may error if any edges depend on the node being
removed. This invariant is documented via comments, and checked by the
test code.
Test plan:
Extensive unit tests have been added. Run `yarn travis`.
Paired with @wchargin
Summary:
If you just print out an address, depending on output context the NUL
separators may just disappear, which is really bad. Using `stringify` is
better, but causes these characters to appear as `\u0000`, which is not
very pretty.
This commit adds pretty-printing functions that clearly display the
addresses. A node address is printed as:
nodeAddress(["array","of","parts"])
An edge address is printed as:
edgeAddress(["array","of","parts"])
An edge in the graph is printed as:
{address: edgeAddress(["one"]), src: nodeAddress(["two"]), dst: nodeAddress(["three"])}
Note that each of these is a valid JavaScript expression that evaluates
to the argument that was originally converted to string.
Paired with @decentralion.
Test Plan:
Unit tests added. Run `yarn travis`.
wchargin-branch: node-edge-tostring
Summary:
Now, a client calling `assertNodeAddress` can indicate the context to
clients, like `assertNodeAddress(foo.bar.node, "foo.bar.node")`.
Paired with @decentralion.
Test Plan:
Unit tests updated. Existing graph code does not need to be updated,
because the default values work fine.
wchargin-branch: assertion-context-parameters
Summary:
Using `throw err;` was not useful when the error’s message was an
object: `[object Object]` would be printed to the console, in lieu of
any useful information. We now additionally use `console.error` to print
the full object contents.
Test Plan:
Apply the following patch:
```diff
diff --git a/src/v1/cli/commands/combine.js b/src/v1/cli/commands/combine.js
index b60f91e..2fc4061 100644
--- a/src/v1/cli/commands/combine.js
+++ b/src/v1/cli/commands/combine.js
@@ -24,6 +24,7 @@ export default class CombineCommand extends Command {
" where each GRAPH is a JSON file generated by plugin-graph";
async run() {
+ Promise.reject({a: {b: {c: {d: {e: {f: "gee"}}}}}});
const {argv} = this.parse(CombineCommand);
combine(argv);
}
```
Then, run `yarn backend && node bin/sourcecred.js combine`, and note
that the full contents of the object are printed. (If the `util.inspect`
is replaced with a simple `console.error`, then the object’s
representation will be truncated, so the `util.inspect` is important.)
wchargin-branch: rejection-error-full-contents
Summary:
For now, this contains the logic to register an `unhandledRejection`
error. I’ve removed all instances of those handlers, and `require`d this
module at every top-level entry point. (The individual CLI commands had
the handler before, but didn’t need it; conversely, the top-level CLI
entry point did not have the handler, but should have.)
Test Plan:
To test that the CLI commands still error on unhandled rejections, apply
the following patch:
```diff
diff --git a/src/v1/cli/commands/combine.js b/src/v1/cli/commands/combine.js
index b60f91e..d55b965 100644
--- a/src/v1/cli/commands/combine.js
+++ b/src/v1/cli/commands/combine.js
@@ -24,6 +24,7 @@ export default class CombineCommand extends Command {
" where each GRAPH is a JSON file generated by plugin-graph";
async run() {
+ Promise.reject("wat");
const {argv} = this.parse(CombineCommand);
combine(argv);
}
```
Then run `yarn backend` and `node bin/sourcecred.js`, and note that the
rejection handler is triggered.
wchargin-branch: unify-entry
Right now, the cred explorer attempts to display every node in the
graph. As graphs easily grow to O(100k) nodes, this is not kind to the
browser.
This commit limits display to the first 100 entries. Since they are
sorted by percieved importance, and it's easy to filter by type (e.g. to
find all users), this limitation is fine in practice.
Test plan:
Run the cred explorer on a larger repo and observe that the performance
is enormously improved. No unit tests added, as the cred explorer is a
prototype which is basically untested (#269)
Summary:
A `RecursiveTable` shows the tree-view from a single node; a
`RecursiveTables` shows multiple `RecursiveTable`s, and is used both in
the recursion and at the top level. This is useful because we’ll want to
make changes to both entry points, for things like pagination. In
particular, this makes #342 nicer.
Test Plan:
Note that the behavior is unchanged. Then, apply
```diff
diff --git a/src/v1/app/credExplorer/pagerankTable.js b/src/v1/app/credExplorer/pagerankTable.js
index 624bd71..e343e95 100644
--- a/src/v1/app/credExplorer/pagerankTable.js
+++ b/src/v1/app/credExplorer/pagerankTable.js
@@ -220,6 +220,7 @@ class RecursiveTables extends React.Component<RecursiveTablesProps> {
const y = pagerankResult.get(b).probability;
return y - x;
})
+ .slice(0, 10)
.map((address) => (
<RecursiveTable
depth={depth}
```
and note that (a) the root view has only ten entries, and (b) any
expanded subview also has only ten entries.
wchargin-branch: recursive-tables
Our GitHub renderer tries to display the author name for a comments or
pull request reviews. However, as we've already discovered (#228), not
every GitHub post has an author available. This breaks attempts to demo
the v1 cred explorer on some interesting repos, e.g. ipfs/js-ipfs.
Since the v1 code is all deprecated, it's effectively a demo branch
pending a rewrite of the cred explorer to use the v3 graph. As such, I
didn't include unit tests.
Test plan:
Attempt to run the cred explorer on `ipfs/js-ipfs` prior to this PR.
Observe that it loudly fails. Attempt to run it after this PR, and
that it succeeds.
Consider a case where a user wants to construct many addresses sharing a
common prefix. For example, every GitHub entity may have an address
starting with the component sequence `["sourcecred", "github"]`.
Currently, users can implement this with `toParts`:
```js
const GITHUB_ADDRESS = Address.nodeAddress(["sourcecred", "github"]);
function createGithubAddress(ids: $ReadOnlyArray<string>): NodeAddress {
return nodeAddress([...Address.toParts(GITHUB_ADDRESS), ...ids]);
}
```
This is unfortunate from a performance standpoint, as we needlessly
perform string and array operations, when under the hood this is
basically a string concatenation.
This commit fixes this by adding functions `nodeAppend` and
`edgeAppend`, which take a node (resp. edge) and some components to
append, returning a new address. Consider how straightforward our
example case becomes:
```js
const GITHUB_NODE_PREFIX = Address.nodeAddress(["sourcecred", "github"]);
const ISSUE_NODE_PREFIX = Address.nodeAppend(GITHUB_NODE_PREFIX, "issue");
// There is no longer any need for an explicit creation function
const anIssueAddress = Address.nodeAppend(ISSUE_NODE_PREFIX, someIssueID);
```
Paired with @wchargin.
Test Plan:
Unit tests added; run `yarn travis`.
When the cron build fails, we want to make sure that we know about it.
This commit ensures that Travis emails @wchargin and @decentralion on
any build failure; our email filters will ensure that cron job failures
get noticed.
Based on documentation at: https://docs.travis-ci.com/user/notifications
This commit implements and tests the following methods on the `Graph`:
- `addNode`
- `removeNode`
- `hasNode`
- `nodes`
Test plan: Unit tests are included.
Paired with @wchargin
Summary:
Closes#336.
Test Plan:
Snapshots updated; changes are easily readable. Existing tests pass.
Running the cred explorer on the `sourcecred/sourcecred` graph shows
pull requests like `#124 (+73/−3): Update the README`, which is good.
wchargin-branch: pr-addition-deletion
This commit re-organizes our addressing code according to the following
conventions:
- All address code is implemented in `core/_address.js`, but clients
should not import that file directly.
- Public address-related methods will be made available via the
`Address` object in `core/graph.js`.
- The inaugural set of public address methods are:
- `nodeAddress`, for constructing a new `NodeAddress` from parts
- `edgeAddress`, for constructing a new `EdgeAddress` from parts
- `toParts`, for turning an address back into parts
- The types `NodeAddress` and `EdgeAddress`are also exported from
`graph.js`
Test plan: The tests cover both runtime and type properties.
Paired with @wchargin.
For more context and design discussions, see:
https://github.com/sourcecred/sourcecred/pull/334
This very minor commit adds the basic data structures to the `Graph`
that will act as underlying storage for nodes and edges. This is similar
to how nodes and edges are stored in `Graph` v1 and v2, except much
simpler, because we now have string keys, no data stored with nodes, and
minimal data stored with edges.
Test plan: The only observable change is that the graph can now be
constructed without error. I added a unit test to verify this behavior
:) as well as that the Graph constructor properly returns an object of
type `Graph` and not an `any`.
(Context: https://github.com/facebook/flow/issues/6400)
Paired with @wchargin
This commit implements Node/Edge addresses, and helper functions for
generating and manipulating them.
Test plan: Unit tests included.
Paired with @wchargin
The newly added `v3/src/core/graph.js` module is the heart of our `v3`
refactor, which aims at having our `Graph` class be just a graph[1] and
not a key-value store to boot.
This commit adds a skeleton of our intended `Graph` API. We've mostly
ported features from our existing `v1` `Graph` class, but with a few
changes:
- `NodeAddress` and `EdgeAddress` are both strings under the hood, which
allows for better performance and use of native JS Maps and Sets.
- Addresses are now hierarchical paths, so the functionality previously expressed
through `pluginName` and `type` can now be encoded into the path
components.
- `NodeAddress` and `EdgeAddress` are distinct types, and have different
serializations, so they can't get mixed up at runtime.
- It is no longer possible to bind arbitrary payloads to nodes or edges.
The expectation is that external data stores will be responsible for
that.
Test plan: No code was implemented; flow and lint are sufficient.
Paired with @wchargin
[1]: Technically, it's a [quiver]
[quiver]: https://en.wikipedia.org/wiki/Quiver_(mathematics)
We want to reset some of our basic assumptions, and make `Graph` into a
pure graph implementation, rather than a hybrid graph and key-value
store.
This is a substantial rewrite, so we want to start from scratch in a v3/
directory and pull code into v3 as necessary. So that we can do this in
a relatively clean fashion, we're first moving the v1 and v2 code into
their own directories.
Paired with @wchargin
Test plan: Travis, and `yarn backend`, `node bin/sourcecred.js start`.
Note that `yarn backend` and `node bin/sourcecred.js start` both use the
v1 versions. We'll migrate those (by changing paths.js) to v3 when
appropriate.
Summary:
This fixes a bug introduced in #317, which only occurred in the cron job
variant of the CI script (`yarn travis --full`): the two scripts run in
the cron job depend on `yarn backend` having previously written to the
`bin/` directory, but this is precisely what we wanted to prevent. To
fix this, we simply add an additional target for `yarn backend` during
the cron job. This is a little bit wasteful in that we compile the
backend applications twice, but it’s not a big deal because (a) it only
runs in cron jobs, so it won’t slow down normal builds, and (b) it only
takes about 5 seconds, anyway.
Test Plan:
Export a `GITHUB_TOKEN` and run `yarn travis --full`, which fails before
this change and passes after it.
wchargin-branch: cron-ci-overwrite-bin
Summary:
The storage format is similar to the V1 storage format. Nodes are sorted
by address and stored in an ordered list. Edges have `srcIndex` and
`dstIndex`, keyed against the node order. Each node stores the name of
its plugin and the result of invoking `toJSON` on its payload.
Test Plan:
Unit tests added; run `yarn travis`. The single added snapshot is easy
to read and verify.
wchargin-branch: v2-json
Summary:
A `Node` includes a ref and a payload. We should say that two nodes at
the same address are equal iff their payloads are equal, and two
payloads at the same address are equal iff their JSON contents are
equal. Importantly, `Node` equality does not depend on the `ref`’s
internal structure: this structure includes a reference to the enclosing
graph, and so using this notion of node equality caused `equals` to
incorrectly return `false` when the inputs were two logically equal
graphs with different internal structure. (It also caused `equals` to be
very slow, performing a deep equality check on the graphs for every
node.)
Test Plan:
A unit test has been strengthened. It fails before this patch, and
passes afterward.
wchargin-branch: v2-node-payload-json-equality
Summary:
Both the implementation and the tests here are pretty straightforward.
The only change from GraphV1 is that we should explicitly check that the
plugins are copied correctly, because graph equality does not check
this.
Test Plan:
New unit tests suffice.
wchargin-branch: copy-v2
Summary:
I dropped the `edgeDecomposition` and `neighborhoodDecomposition` test
cases, because (a) we don’t have a canonical “interesting graph” on
which to run them, compounding that (b) it’s not clear how much value
they add.
Test Plan:
New unit tests suffice.
wchargin-branch: mergeConservative-v2
The implementation is adapted from our previous implementation, but has
been refactored to be more appropriate for our generator-function
approach.
The unit tests comprehensively explore a simple example graph, testing
the following cases:
- Direction filtering (IN/OUT/ANY/unspecified)
- Node types
- Edge types
- Self-loop edges
- Dangling edges
- Removed edges don't appear
Test plan: Carefully inspect the added unit tests.
Paired with @wchargin
Summary:
Based on code originally paired with @decentralion.
Test Plan:
Unit tests added. Running `yarn travis` is sufficient.
wchargin-branch: v2-edges
Summary:
Previously, our CI script would run `yarn backend`, which has the
side-effect of erasing the `bin/` directory. By itself, this is not
great, but not awful. However, this frequently triggers a race condition
in Prettier, causing the `check-pretty` step of the build to fail. (More
details: https://github.com/prettier/prettier/issues/4468.)
This patch changes the CI script to build the backend scripts into a
temporary directory.
Test Plan:
Before applying this patch: `yarn backend` and then `yarn travis`. If
this consistently causes a Travis failure due to `check-pretty`, then
your machine can reproduce the race condition that we‛re trying to
eliminate. (Otherwise, you can try creating a bunch more Git history…
I’m not really sure what to say. It is a race condition, after all.)
Then, apply this patch, and repeat the above steps; note that the error
no longer occurs, and that the build output is to a temporary directory.
wchargin-branch: ci-preserve-bin
Given that we have reified the type `PluginFilter` as a consistent way
to filter nodes and edges by plugin and type, we should also have
re-usable logic for that end.
This commit adds `addressFilterer`, which takes a `PluginFilter` and
returns a function that checks whether a given address matches the
filter. Right now, only the `nodes` function uses it.
For added safety, `addressFilterer` does runtime validation that the
required `plugin` property was set.
Test plan: See included unit tests.
As of this commit, adding the comment `//$ExpectFlowError` in flow-typed
code asserts that the next line must cause a flow error. If it does, no
error or warning is generated. If it does not, then this produces a flow
warning, which is visible to developers running `yarn flow` and
additionally causes travis to fail.
Test plan:
- As committed, `yarn travis` passes.
- I added `//$ExpectFlowError` above some line of flow-checked code which does
not currently throw an error. Afterwards, `yarn travis` failed (and a
helpful message was displayed in console on running `yarn flow`)
- I added the following bad code into one of our files:
```javascript
//$ExpectFlowError
const foo: string = 3;
```
As expected, `yarn flow` and `yarn travis` both passed.
Summary:
It’s critical, even more so than usual, that the “base reference”
property of a `DelegateNodeReference` be a private property, because
this class is designed for inheritance. In ECMAScript 6, we can achieve
this by giving the property a `Symbol` key instead of a string key.
Unfortunately, Flow doesn’t know about `Symbol`s, so we need a few casts
through `any`, but they are localized to as small a scope as possible.
Test Plan:
Unit tests added. Note that they pass both before and after this change.
wchargin-branch: symbol-base-ref
Summary:
The use of `tee /dev/stderr` failed when running as a child process
under Node for some reason. (I haven’t been able to figure out why—it
works fine when run as a standalone script or when run as a child
process under Python.) This is also technically Linux-specific, so I’ve
changed it to use a process substitution. After looking around for a
bit, there doesn’t seem to be a way to do this in a way that is
portable, uses only POSIX shell features, and doesn’t create temporary
files all at the same time, so the script is now run under `bash`.
Test Plan:
Run `yarn travis` and note that the `ensure-flow.sh` output no longer
contains the line `tee: /dev/stderr: No such device or address`.
wchargin-branch: no-tee-devstderr
This commit adds the following methods to the `Graph`:
* `addNode`
* `removeNode`
* `ref`
* `node`
* `nodes`
* `equals`
The graph now supports adding nodes, with thorough testing. Other
methods were implemented as necessary to test that `addNode` was
implemented properly.
Also, we've made a slight change to spec: `nodes` (and other filter
options) accept a `PluginFilter` object, which, if present, must
specify a plugin and may specify a type.
I've taken the opportunity to re-write the graph test code. Instead of
having a complicated `graphDemoData` file that creates a graph with many
different nodes, I've created an `examplePlugin` which makes it trivial
to instantiate new simple Foo and Bar nodes on the fly. Then, test cases
can construct a small graph that is clearly appropriate for whatever
functionality they are testing.
Test plan: Unit tests were added, travis passes.
Test Plan:
Run `yarn flow stop; yarn travis | cat` and note the absence of the
really long line that has ~2500 bytes of “Server is initializing”.
wchargin-branch: quiet-flow-server
Currently, when filtering by type (e.g. in neighbors), we require only a
type string. This is a design flaw, as if two plugins both define an
"ISSUE" type, either plugin may unexpectedly receive the other plugin's
nodes or edges.
We fix the flaw by explicitly binding the plugin name and type field
together as `PluginType`.
Test plan: Flow passes, and so does the following invocation to check
our design doc:
```
cat <(printf '// @flow\n') \
<(awk -v RS='```[a-z]*' '(NR+1)%2' \
src/core2/address_payload_unification_design.md) \
| yarn -s flow check-contents
```
- Add the `plugins` method to graph, and have the constructor take them
- Add in demo `PluginHandlers` to the `graphDemoData`
- Add unit tests for `plugins` methods
Test plan: Travis
Paired with: @wchargin
Summary:
In fact, a constructor that only calls `super` is useful: it specifies
the API for constructing an object of a given class without needing to
traverse its prototype chain. (That constructors are inherited at all is
arguably a design flaw.)
Test Plan:
None.
wchargin-branch: yes-useful-constructor
Also, rename `NodeDelegateReference` to `DelegateNodeReference` in the
design doc.
Test plan:
Didn't add tests, as the implementation is trivial. Flow passes.
src/core2/graph.js has the types and Graph class scaffold as
described in address_payload_unification_design.md.
Also, we decided to modify the design doc to include a type parameter
for Edges (although we aer open to removing if it becomes cumbersome).
Test plan:
`yarn flow` passes, and the design doc still typechecks too.
Paired with @wchargin
Since this document was originally written, we've iterated on the design
a bit. @wchargin and I have re-reviewed the doc, and we think it's now
up-to-date with our plans.
Test plan: I ran the following command to bring the code into my
clipboard:
```
❯ awk -v RS='```[a-z]*' '(NR+1)%2' src/core2/address_payload_unification_design.md | xsel -ib
```
Test plan:
Pasting this into https://flow.org/try produced no errors.
Paired with @wchargin