1125 Commits

Author SHA1 Message Date
Dandelion Mané b86dcf742e
Make the Discourse plugin robust to errors (#1387)
Currently attempting to load the SourceCred discourse instance fails
with foreign key constraint errors.

Basically, we have a few weird situations:
- A post (which corresponds to the 'pseudo-topic' generated by creating
a new category) is picked up, but its topic is not detected, because
Discourse does not list these 'pseudo-topics' in the latest topic
endpoint. Attempting to add the post breaks the foreign key constraint.

- We have several likes which correspond to posts that don't exist.
Possibly they were deleted? I'm not sure.

Right now, the load process fails entirely when it hits these
exceptions, which is bad. It should print a warning instead, and
continue without the offending interactions. This commit effects that
change in behavior.
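
The warn-and-continue behavior described above can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: `addAllTolerantly`, the in-memory `topics` set, and `addPost` are hypothetical stand-ins for the real mirror and its database foreign key constraint.

```javascript
// Hypothetical helper: try to add each item; on failure, warn and skip it
// instead of aborting the whole load.
function addAllTolerantly(items, addItem, describe, warn = console.warn) {
  const added = [];
  for (const item of items) {
    try {
      addItem(item);
      added.push(item);
    } catch (e) {
      warn(
        `Warning: Encountered error '${e.message}' while adding ${describe(item)}.`
      );
    }
  }
  return added;
}

// Toy "database" that enforces a foreign key from posts to topics.
const topics = new Set([1, 2]);
const posts = [];
function addPost(post) {
  if (!topics.has(post.topicId)) {
    throw new Error("FOREIGN KEY constraint failed");
  }
  posts.push(post);
}

const incoming = [
  {id: 10, topicId: 1},
  {id: 11, topicId: 214}, // topic was never detected, so this add fails
  {id: 12, topicId: 2},
];
const warnings = [];
const kept = addAllTolerantly(
  incoming,
  addPost,
  (p) => `post ${p.id}`,
  (message) => warnings.push(message)
);
```

The offending post is dropped with a single warning, and the remaining posts are still loaded.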

Test plan:

Before this commit, loading the SourceCred discourse with a clean cache
fails. After building with this commit, loading the SourceCred discourse
with a clean cache works and prints the following warnings:

```
$ node bin/sourcecred.js discourse https://discourse.sourcecred.io credbot
  GO   load-discourse.sourcecred.io
  GO   discourse
  GO   discourse/topics
 DONE  discourse/topics: 3m 53s
  GO   discourse/posts
Warning: Encountered error 'FOREIGN KEY constraint failed' while adding
post https://discourse.sourcecred.io/t/214/1.
 DONE  discourse/posts: 2m 38s
  GO   discourse/likes
 DONE  discourse/likes: 50s
 DONE  discourse: 7m 21s
  GO   compute-cred
 DONE  compute-cred: 547ms
 DONE  load-discourse.sourcecred.io: 7m 22s
```

Also, unit tests have been added that verify the specific behavior
changes.
2019-09-20 11:21:53 +02:00
Robin van Boven d5d00aae5a Blacklist techtribe org, thumbsup reaction (#1386)
Fixes #1353

Tested manually by creating a docker image including the changes.
Running the dev-preview @passbolt command until completion.
(once hitting the github rate limit, once till #1354 happens)

No more problematic interactions show up during load.
2019-09-20 11:20:14 +02:00
Robin van Boven d6bbc939b2 Add more bots. (#1383)
Fixes #1381
2019-09-19 17:52:20 +02:00
Dandelion Mané 8f46d7d812
Fix bug when selecting "All users" in explorer (#1388)
This fixes a bug introduced in #1371, where selecting a type other than
"All users" and then trying to reselect "All users" would break the UI.

Test plan: Manual inspection; load an instance, try selecting a
different type, and then go back to "All users". It now works as
expected.
2019-09-19 14:01:17 +02:00
Dandelion Mané 007568d3f0
Add `sourcecred discourse` command (#1374)
This adds a new command, `discourse`, which makes it convenient to load
Discourse servers as standalone SourceCred projects.

For example, you could load the official SourceCred discourse via the
following:

```sh
export SOURCECRED_DISCOURSE_KEY=....
yarn backend
node bin/sourcecred.js discourse https://discourse.sourcecred.io credbot
yarn start
```

I've updated the README with instructions for using the plugin.

Test plan: No automated testing because I see this tool as a temporary
placeholder until we get the SourceCred instances set up. I manually
tested the error cases (e.g. providing an invalid server url) as well as
success cases like the one above. I validated that the weights file
argument is being interpreted correctly (i.e. trying to load invalid
weights produces an expected error message, loading valid weights
results in those weights being present in the UI).
2019-09-19 12:32:49 +02:00
Dandelion Mané 1449935651
GitHub plugin: Expose user addresses (#1382)
Allow getting the node address for a user, given the user's login. This
will be needed by the upcoming identity plugin.

If the login in question corresponds to a bot, then a bot address will
be returned. When we make the bot-set configurable (rather than
hardcoded), we'll need to change the signature of this function; I think
that's fine.

Test plan: Unit tests added. (Also, it's really simple.)
2019-09-18 14:50:52 +02:00
Dandelion Mané ac8ac7051f
add `Graph.contractNodes` (#1380)
This commit adds Graph.contractNodes, which allows collapsing certain
nodes in the graph into each other. This will enable the creation of a
SourceCred "identity" plugin, allowing identity resolution between users'
different accounts on different services.
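
As a rough sketch of what contracting nodes means, the example below collapses two account nodes into one identity node on a toy graph. Everything here is an assumption made for illustration: the plain `{nodes, edges}` object, the string ids, and the `{old, replacement}` contraction shape are hypothetical and are not the real `Graph` class or its API.

```javascript
// Toy node contraction on a plain-object graph (NOT the SourceCred Graph class).
// Each contraction maps a list of old node ids onto one replacement node id.
function contractNodes(graph, contractions) {
  const remap = new Map();
  for (const {old, replacement} of contractions) {
    for (const id of old) {
      remap.set(id, replacement);
    }
  }
  const rename = (id) => (remap.has(id) ? remap.get(id) : id);
  // Rename nodes, dropping duplicates; always include the replacements.
  const nodes = new Set(graph.nodes.map(rename));
  for (const {replacement} of contractions) {
    nodes.add(replacement);
  }
  // Re-point every edge at the contracted endpoints.
  const edges = graph.edges.map(({src, dst}) => ({
    src: rename(src),
    dst: rename(dst),
  }));
  return {nodes: [...nodes], edges};
}

const graph = {
  nodes: ["github/alice", "discourse/alice", "post/1"],
  edges: [
    {src: "github/alice", dst: "post/1"},
    {src: "discourse/alice", dst: "post/1"},
  ],
};
const contracted = contractNodes(graph, [
  {old: ["github/alice", "discourse/alice"], replacement: "identity/alice"},
]);
```

After contraction, both edges originate from the single `identity/alice` node.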

Test plan: Thorough unit tests have been added. `yarn test` passes.

Thanks to @wchargin for [review feedback][1] which significantly
improved this API.

[1]: https://github.com/sourcecred/sourcecred/pull/1380#discussion_r324958055
2019-09-18 13:59:49 +02:00
William Chargin ddf07c6714
Replace `PartialTimelineCredParams` with `$Shape` (#1379)
Summary:
Flow provides a utility type for this purpose; there’s no need to
implement, document, and keep it in sync ourselves:
<https://flow.org/en/docs/types/utilities/#toc-shape>

Test Plan:
As written, `yarn flow` passes. Changing the definition of `params` on
line 77 of `load.test.js` to add a key `foo: "wat"` or change the value
of `weights` to `{hmm: "hmm"}` yields appropriate type errors.

wchargin-branch: use-shape
2019-09-16 19:22:35 -07:00
William Chargin 3cb22565e5
mirror: update EAV primitives (#1342)
Summary:
This commit modifies `_updateOwnData` to write to both the old
type-specific primitives tables as well as the new EAV table. This
establishes the invariant that a node with non-null `last_update` will
always have primitive data (if its object type has primitive fields).

Test Plan:
Existing tests expanded. Commenting out each of the `updateEavPrimitive`
calls (independently) causes a test to fail. Note that every test that
queries an internal `primitives_*` table to inspect the database state
has been expanded to make an equivalent query against the `primitives`
table as well.

wchargin-branch: mirror-eav-update
2019-09-14 17:28:09 -07:00
William Chargin 463f3a073a
mirror: initialize EAV primitives at registration (#1341)
Summary:
This establishes the invariant that every object in the `objects` table
has all relevant rows in the `primitives` table, though those rows’
values are never yet set.

Test Plan:
Unit tests updated. Manually loading `sourcecred/example-github` and
running `.dump primitives` generates reasonable-looking output, with
lots of rows, including entries for nested fields and eggs. Verified
that the set of non-`id` columns on `Issue` equals the set of values for
the `fieldname` column of an `Issue` object, and likewise for `Commit`s,
thus covering each kind of field.

wchargin-branch: mirror-eav-init
2019-09-14 17:24:58 -07:00
William Chargin 0418dfe9dd
mirror: add `primitives` table for EAV migration (#1340)
Summary:
See #1313 for context. The plan is to set up dual-writes with `extract`
calls still reading from the old tables until the new ones are complete
and tested. The primary risk to production would be a fatal exception in
the new write paths, which seems like an acceptable risk.

Test Plan:
Unit tests pass.

wchargin-branch: mirror-eav-schema
2019-09-14 17:21:42 -07:00
William Chargin 976afb6665
mirror: test `registerObject` with nested fields (#1339)
Summary:
Prior to this commit, removing the `addLink.run({id, fieldname})` on
line 487 of `mirror.js` would cause test failures down the pipeline, but
not at the root cause. Such an error is now caught earlier.

Test Plan:
Comment out line 487 of `mirror.js` and observe that the newly added
test case fails, but the other `registerObject` test cases do not.

wchargin-branch: mirror-test-registerobject-nested
2019-09-14 17:16:24 -07:00
Dandelion Mané c58315fe4d
Hackily add support for mixed GitHub/Discourse projects (#1378)
For phase one of the CredSperiment, I need a SourceCred instance which combines GitHub and Discourse servers. I'll also need to be able to give it very specific configuration to collapse certain user identities together.

Shortly after launching the CredSperiment, I plan to come back and totally re-write SourceCred's command line interface and site building system, in a way that will throw away most of the existing codebase.

As such, I found it expedient to add rather hacky and untested support for loading combined GitHub/Discourse instances, so I can land the promised features. This PR does so by:

- adding sourcecred gen-project for constructing project.json files
- adding sourcecred load --project for loading a project.json file
- ensuring that load provides the right plugins based on the project that's in scope
- updating build_static_site so that it can use the new --project flag

Test plan:
I have done some end-to-end testing, but the overall commit stack lacks automated testing. This is a deliberate tradeoff: I'm planning to re-write this section of the codebase, and the testing ergonomics are not great, so I'd rather accept some technical debt, especially since I plan to pay it off soon.

See the pull request on GitHub for the individual constituent commits.
2019-09-12 17:35:21 +02:00
Dandelion Mané 7a0dd49b42
factor loadWeights into Common (#1377)
As suggested by @Beanow in [a review comment][1], this commit factors
loading weights from disk into a cli/common utility method.

The actual method is really generic, and we have a number of similar
constructions across the codebase (grep for `JSON.parse` to find them).
I considered factoring out a generic utility for loading and
deserializing JSON data from disk in general, but it didn't seem
valuable enough at this time.

Test plan: Unit tests added, existing tests pass.

[1]: https://github.com/sourcecred/sourcecred/pull/1374#discussion_r323149740
2019-09-12 15:55:05 +02:00
Dandelion Mané 0a0010f38e
Share default TimelineCredParameters (#1376)
At present, every place in the codebase that needs
TimelineCredParameters constructs them ad-hoc, meaning we don't have any
shared defaults across different consumers.

This commit adds a new type, `PartialTimelineCredParameters`, which
is basically `TimelineCredParameters` with every field marked optional.
Callers can then choose to override any fields where they want
non-default values. A new internal `partialParams` function promotes
these partial parameters to full parameters.

All the public interfaces for using params (namely,
`TimelineCred.compute` and `TimelineCred.reanalyze`) now accept optional
partial params. If the params are not specified, default values are
used; if partial params are provided, all the explicitly provided values
are used, and unspecified values are initialized to default values.
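
The promotion from partial to full parameters might look something like the sketch below. The default values shown are placeholders chosen for illustration, not SourceCred's real defaults, and the field set is an assumption (only `intervalDecay` and `weights` are named in this log).

```javascript
// Placeholder defaults; the real values live in the codebase.
const DEFAULT_PARAMS = Object.freeze({
  alpha: 0.05, // hypothetical field and value
  intervalDecay: 0.5, // hypothetical value
  weights: {}, // hypothetical value
});

// Promote partial params to full params: explicitly provided fields win,
// and everything else falls back to the shared defaults.
function partialParams(partial) {
  const full = {...DEFAULT_PARAMS};
  for (const [key, value] of Object.entries(partial || {})) {
    if (value !== undefined) {
      full[key] = value;
    }
  }
  return full;
}

const defaults = partialParams();
const overridden = partialParams({intervalDecay: 0.9});
```

Skipping `undefined` values keeps a caller who passes `{intervalDecay: undefined}` from accidentally erasing a default.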

Test plan: A simple unit test was added to ensure that weights overrides
work as intended. `git grep "intervalDecay: "` reveals that there are no
other explicit parameter constructions in the codebase. All existing
unit tests pass.
2019-09-12 15:21:13 +02:00
Dandelion Mané def1fef192
Factor TimelineCredParameters into new module (#1375)
The `timelineCred.js` file is a bit of a beast. One way to start
slimming it down is to pull the parameters into their own file. This is
especially helpful as I'm planning a follow-on PR that will colocate the
default parameter values with their declaration.

The naming of everything in the `/timeline/` subdirectory is a bit
wonky: it reflects that at the time of creation, "Timeline" designated
an experimental version of SourceCred. Now, it is becoming canonical,
but the cumbersome naming persists. I haven't made any effort to tackle
the name debt here.

Test plan: `yarn test` passes; since this is merely a code
reorganization, this gives me great confidence that the change is
correct. I also added a few small tests to the new module. Although the
behavior in question is already tested, I think setting up test files
liberally is a good practice, as the existence of the test file invites
the creation of more tests.
2019-09-12 15:12:17 +02:00
Dandelion Mané e1b9b07cac
group explorer types by plugin (#1373)
Now that we're adding support for the Discourse plugin, we'll start
having >1 plugin present in the frontend again. As such, we should
provide clear grouping of types in the frontend so that it's possible to
distinguish between a GitHub user and a Discourse user. This commit does
just that, by resurrecting code that we used when the GitHub and Git
plugins co-existed in the frontend.

Test plan: Launch the frontend and observe that node types in the filter
selection dropdown are grouped by the name of their plugin. Also,
clicking on the name of a plugin should filter to all nodes from that
plugin.
2019-09-11 02:28:42 +02:00
Dandelion Mané 093955dea1
scores command no longer assumes GitHub plugin (#1372)
Previously, the `sourcecred scores` command assumed that all users are
GitHub users, and assigned users an id based on their GitHub login.

Now, the command returns information on all users, regardless of which
plugin provided them. As such, we need to identify users differently.
Instead of a string id, they now have an array of address parts. That
array contains all of the parts of their corresponding node address.

For example, the GitHub user `@Beanow` would correspond to the address
array `["sourcecred", "github", "USERLIKE", "USER", "Beanow"]`

As a general convention, the first two components of any node's address
contain information about the plugin that owns that node. The first
component is the owner of the plugin, and the second is the name of the
plugin. Afterwards, the plugin may represent nodes in whatever manner it
sees fit.
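
A consumer of the scores output could rely on that convention along these lines. `pluginOf` is a hypothetical helper written for this example; only the address array for `@Beanow` comes from the text above.

```javascript
// Read the owning plugin from the first two components of an address array.
function pluginOf(addressParts) {
  if (addressParts.length < 2) {
    throw new Error("address is too short to identify a plugin");
  }
  const [owner, name] = addressParts;
  return {owner, name};
}

const beanow = ["sourcecred", "github", "USERLIKE", "USER", "Beanow"];
const plugin = pluginOf(beanow);
// Everything after the first two parts is plugin-specific.
const pluginSpecificParts = beanow.slice(2);
```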

Thanks to @Beanow and @vsoch for some feedback and discussion on this
design.

Test plan: Snapshots have been updated. `yarn test` passes.
2019-09-10 23:49:45 +02:00
Dandelion Mané b3ffd3758b
TimelineExplorer defaults to showing all users (#1371)
Now instead of always defaulting to GitHub users, it shows all
user-typed nodes. This will make SourceCred work non-hackily when there
is e.g. just a Discourse plugin in scope.

I also fixed an issue where it was loading the GitHub declaration in a
hardcoded way, instead of properly getting it from the TimelineCred's
plugin array.

Test plan: Manual UI inspection.
2019-09-10 22:50:39 +02:00
Dandelion Mané 8de57fdb7b
add TimelineCred.userNodes (#1369)
This is a convenience method that extracts cred for all the user-typed
nodes. It's basically an abstraction over calling `credSortedNodes` with
the right set of prefixes.

I foresee using it in at least two places (score retrieval in the CLI and
score display in the frontend) so I decided to make it a method.

Test plan: A very simple unit test was added. (It's a very simple
wrapper function.)
2019-09-10 20:02:28 +02:00
Dandelion Mané 1079f5ec86
timelineCred.credSortedNodes takes prefixes (#1368)
This lets us filter by a group of prefixes simultaneously, which enables
e.g. seeing all user node types at once.

I also tweaked the API to make it a bit more convenient: you can now
pass no arguments and get all nodes in sorted order.
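
The idea of filtering by several prefixes at once can be sketched as a standalone function. This is an illustration only: addresses are modeled as arrays of parts, the function is free-standing rather than a `TimelineCred` method, and the Discourse address shown is made up.

```javascript
// True if `prefix` is a leading sub-array of `address`.
function hasPrefix(address, prefix) {
  return (
    prefix.length <= address.length &&
    prefix.every((part, i) => address[i] === part)
  );
}

// Return nodes matching any of the prefixes, highest cred first.
// With no prefixes argument, return all nodes in sorted order.
function credSortedNodes(nodes, prefixes) {
  const matching =
    prefixes === undefined
      ? nodes
      : nodes.filter((n) => prefixes.some((p) => hasPrefix(n.address, p)));
  return [...matching].sort((a, b) => b.cred - a.cred);
}

const allNodes = [
  {address: ["sourcecred", "github", "USERLIKE", "USER", "Beanow"], cred: 3},
  {address: ["sourcecred", "github", "ISSUE", "1"], cred: 8},
  {address: ["sourcecred", "discourse", "user", "alice"], cred: 5}, // made-up address
];
const users = credSortedNodes(allNodes, [
  ["sourcecred", "github", "USERLIKE"],
  ["sourcecred", "discourse", "user"],
]);
const everything = credSortedNodes(allNodes);
```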

Test plan: Unit tests updated.
2019-09-10 19:44:03 +02:00
Dandelion Mané 65f22a0a74
Replace TimelineCredConfig with array of plugins (#1367)
The PluginDeclaration has all of the information we need to configure
TimelineCred: it knows all the node and edge types, as well as which
node types are user (or scoring) node types.

Therefore, we can replace the ad-hoc config object with a simple array
of plugin declarations. Since the plugins will be saved as part of the
TimelineCred, it means the UI can be configured to only show information for
plugins that are actually in scope.

Test plan: `yarn test` passes, and the prototype still works. Snapshots
updated.
2019-09-10 19:36:12 +02:00
Dandelion Mané dcf4010ff0
discourse: fix fetch failure on 410 (#1366)
When a post or topic is deleted, Discourse fetch will give status 410.
As with 404 and 403, we should just ignore the post and move on.
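
In sketch form, the status handling amounts to a small classification step. `classifyStatus` is a hypothetical helper, not the plugin's real function, and treating every non-2xx status outside the skip list as an error is an assumption.

```javascript
// Statuses for which the post or topic is simply skipped.
const SKIPPED_STATUSES = new Set([403, 404, 410]);

// Decide what to do with a Discourse response based on its HTTP status.
function classifyStatus(status) {
  if (status >= 200 && status < 300) {
    return "ok";
  }
  if (SKIPPED_STATUSES.has(status)) {
    return "skip"; // forbidden, missing, or deleted: ignore and move on
  }
  return "error"; // anything else is treated as a real failure here
}
```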

I took the opportunity to slightly refactor the fetch error handling
while I was there.

Test plan: Previously, doing a load on the SourceCred discourse instance
would fail due to a deleted topic. Now, it doesn't.
2019-09-10 19:13:13 +02:00
Dandelion Mané aecd2864bf
Let plugins specify user types (#1365)
This modifies the pluginDeclaration so that it can specify user node
types. This will allow us to replace the TimelineCredConfig type with a
plugin collection instead.

It's expected that the user types will also be present in the node
types, although this isn't validated anywhere at present.

Test plan: `yarn flow`.
2019-09-10 19:09:01 +02:00
Dandelion Mané dbb31a586c
Capitalize Discourse plugin name (#1364)
This ensures consistency with GitHub, and will allow us to use plugin
names in the UI.

Test plan: Not needed, trivial change.
2019-09-10 19:06:05 +02:00
Dandelion Mané e2e6c56650
Enable multiple scoring node types (#1361)
This updates the cred computation logic so that we can have multiple
"scoring node types".

Context: Currently, we designate a single node type (GitHub users) as
the scoring node type, and normalize so that all users have 1000 score
in total.

This commit updates the pipeline to admit using more than one prefix for
scoring, meaning that we could have GitHub users, Discourse users, and
more, and still have all users sum to 1000 score.
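
One way to picture the normalization is the sketch below. It assumes that every score is scaled by one common factor so that the scoring nodes sum to 1000; the slash-separated string addresses, the `normalizeScores` name, and the prefixes are all invented for the example.

```javascript
// Scale all scores by one factor so that nodes matching any scoring prefix
// sum to `total`. Addresses are plain strings here for simplicity.
function normalizeScores(scores, scoringPrefixes, total = 1000) {
  const isScoring = (address) =>
    scoringPrefixes.some((prefix) => address.startsWith(prefix));
  let scoringSum = 0;
  for (const [address, score] of scores) {
    if (isScoring(address)) {
      scoringSum += score;
    }
  }
  if (scoringSum === 0) {
    throw new Error("no score on any scoring node; cannot normalize");
  }
  const factor = total / scoringSum;
  return new Map([...scores].map(([address, score]) => [address, score * factor]));
}

const rawScores = new Map([
  ["github/user/alice", 1],
  ["discourse/user/bob", 3],
  ["github/issue/1", 4],
]);
const normalized = normalizeScores(rawScores, ["github/user/", "discourse/user/"]);
```

With two scoring prefixes, the GitHub user and the Discourse user together receive 1000 score, while non-user nodes are scaled by the same factor.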

We will still need to update the frontend so that it will have a user
pane which aggregates across all users.

Test plan: Unit tests updated. `yarn test` passes.
2019-09-10 19:05:46 +02:00
William Chargin 0d7db99d7f
Blacklist `@allcontributors` bot (#1363)
Summary:
This adds `MDM6Qm90NDY0NDczMjE=` (`@allcontributors`) to the blacklist
to enable loading the `aragon/aragon` repository. See #1362 and #996 for
context.

Test Plan:
Running `node ./bin/sourcecred.js load aragon/aragon` on a clean cache
now completes successfully.

wchargin-branch: blacklist-allcontributors
2019-09-10 08:55:16 -07:00
William Chargin fa6697719a
update_snapshots: fix Discourse key check (#1360)
Summary:
This was doing exactly the wrong thing, attempting to update snapshots
whenever the Discourse API token was _not_ present.

Test Plan:
Running `env -u DISCOURSE_TEST_API_KEY ./scripts/update_snapshots.sh`
now successfully updates non-Discourse snapshots, rather than emitting
an error, “Please set the DISCOURSE_TEST_API_KEY environment variable.”.

wchargin-branch: update-snapshots-discourse
2019-09-07 16:27:29 -07:00
William Chargin a6a291a3cc
test: fix `example-github-load` snapshot test (#1359)
Summary:
Generated with `./scripts/update_snapshots.sh` (with #1360 patched in).
This fixes failures introduced in #1358.

Test Plan:
Running `yarn test --full` now passes. Inspecting the diff (after piping
the old and new snapshots to `jq -S .`) shows that this includes only
additions, which seems appropriate given the precipitating change.

wchargin-branch: fix-1358-failures
2019-09-07 16:22:53 -07:00
Dandelion Mané 545b084146
Change TimelineCred filtering strategy (#1358)
This changes how TimelineCred filtering works. Instead of using the
filterTimelineCred module, which includes all nodes matching
filterPrefixes, we now take all nodes matching scorePrefixes and
additionally the top `k` nodes for every other type.

This ensures that we will have the top comments, pull requests, issues,
etc in the UI, without needing to take every single comment or PR or
issue.
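
The new strategy can be sketched as: keep every node under a score prefix, then add the `k` highest-cred nodes of each remaining type. The function name, the string addresses, and the example prefixes below are assumptions for illustration and do not mirror the real module.

```javascript
// Keep all scoring nodes plus the top-k nodes (by cred) of every other type.
function filterNodes(nodes, scorePrefixes, otherTypePrefixes, k) {
  const keep = new Set(
    nodes.filter((n) => scorePrefixes.some((p) => n.address.startsWith(p)))
  );
  for (const prefix of otherTypePrefixes) {
    nodes
      .filter((n) => n.address.startsWith(prefix)) // copy, so sort is safe
      .sort((a, b) => b.cred - a.cred)
      .slice(0, k)
      .forEach((n) => keep.add(n));
  }
  // Preserve the original ordering of the surviving nodes.
  return nodes.filter((n) => keep.has(n));
}

const credNodes = [
  {address: "github/user/alice", cred: 5},
  {address: "github/user/bob", cred: 1},
  {address: "github/issue/1", cred: 9},
  {address: "github/issue/2", cred: 2},
  {address: "github/issue/3", cred: 7},
  {address: "github/pull/1", cred: 4},
];
const filtered = filterNodes(
  credNodes,
  ["github/user/"],
  ["github/issue/", "github/pull/"],
  2
);
```

Here both users survive regardless of cred, while only the two highest-scored issues are kept.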

Concurrently, the UI is updated so that every type is included in the
filter dropdown.

CHANGELOG has been updated, since this is user facing.

Test plan: `yarn test` passes, snapshots are updated, and I also tested
the UI manually.
2019-09-08 00:32:10 +02:00
Dandelion Mané f31a92874b
hide `filterTimelineCred` (#1357)
TimelineCred computation is implemented as follows:
- Compute Distribution
- Filter it down to specified node types
- Wrap the filtered results into a TimelineCred

I want to change how the filtering works. The new filtering logic will
depend on logic we've already implemented in TimelineCred; therefore
filtering should be done on the TimelineCred object and not separately.
Specifically, I want to be able to filter down to the highest-scored
nodes by type (dependent on the type).

As a first step, I've refactored the interface to TimelineCred so that
the filtering is an implementation detail, i.e. the TimelineCred
constructor doesn't expect objects defined in `filterTimelineCred`.

Test plan: `yarn test` passes after a snapshot update.
2019-09-08 00:20:34 +02:00
Dandelion Mané 5996dd710a
timeline cred config is stored in JSON (#1356)
This modifies the TimelineCred serialization so that it includes the
CredConfig in the JSON. This means that it's easier to coordinate which
plugins and types are in scope, as the data itself can contain that
information.

Rather than define a new hand-rolled serializer, I just passed the
config directly through for stringification. Unit tests verify that this
still works (round-trip serialization is tested). As an added sanity
check, I generated a new small `cred.json`, and inspected the file via
`cat` to ensure that it's still legible text, and isn't interpreted as a
binary file due to the `NUL` bytes in node addresses.

Every client that previously depended on the `DEFAULT_CRED_CONFIG` now
properly gets its cred configuration from the JSON.

Test plan: Unit tests for serialization already exist. Generated a fresh
`cred.json` file and tested the frontend with it. Also,
`yarn test --full` passes.
2019-09-08 00:04:01 +02:00
Vanessasaurus ab0628a7ce adding small changes to use docker layers for cache (#1338)
Signed-off-by: Vanessa Sochat <vsochat@stanford.edu>
2019-09-06 18:16:28 +02:00
William Chargin 5bcec38e5b
Blacklist more problematic quasar interactions (#1335)
Summary:
Context: <https://github.com/sourcecred/sourcecred/issues/1256#issuecomment-526252852>

Without also blacklisting the reaction, we hit an invariant violation in
the relational view (reactions are expected to have exactly one author).

Test Plan:
Running `node ./bin/sourcecred.js load quasarframework/quasar-cli` now
completes successfully (in about 2 minutes 40 seconds). It does emit a
warning:

```
Issue[MDU6SXNzdWUzNDg0NjUzNDg=].reactions: unexpected null value
```

…because one of the reactions was blacklisted. But the relational view
handles this correctly, it seems: timeline cred is still computed and
renders without obvious error.

wchargin-branch: blacklist-more-quasar
2019-09-02 08:18:36 -07:00
William Chargin 7d3d24e0ec
mirror: guess typenames and warn on mismatch (#1337)
Summary:
The format of GitHub’s GraphQL object IDs is explicitly opaque, and so
we must not introspect them in any way that would influence our results.
But it seems reasonable to introspect these IDs solely for diagnostic
purposes, enabling us to proactively detect GitHub’s contract violations
while we still have useful information about the root cause.

This commit adds an optional `guessTypename` option to the Mirror
constructor, which accepts a function that attempts to guess an object’s
typename based on its ID. If the guess differs from what the server
claims, we continue on as before, but emit a console warning to help
diagnose the issue more quickly.
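
A guesser of this kind could be sketched as below. It relies on an observation, not a documented contract: the IDs quoted in this log happen to base64-decode to strings like `012:Organization43093820`. The regular expression and the function body are assumptions for illustration and are not necessarily what the GitHub plugin ships.

```javascript
// Guess a typename from an opaque GitHub object ID, for diagnostics only.
// Returns null when the decoded ID does not match the observed pattern.
function guessTypename(id) {
  const decoded = Buffer.from(id, "base64").toString("utf8");
  // Observed shape: <digits>:<Typename><digits...>
  const match = decoded.match(/^\d+:([A-Za-z]+)\d/);
  return match ? match[1] : null;
}

const reactionGuess = guessTypename("MDg6UmVhY3Rpb24zNDUxNjA2MQ==");
const organizationGuess = guessTypename("MDEyOk9yZ2FuaXphdGlvbjQzMDkzODIw");
```

Because the ID format is explicitly opaque, a guess like this should only ever feed warnings, never the computed results.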

Resolves #1336. See that issue for details.

Test Plan:
Unit tests for `mirror.js` updated, retaining full coverage. To test
manually, revert #1335, then load `quasarframework/quasar-cli`. Note
that it emits the following warning before failing:

> Warning: when setting Reaction["MDg6UmVhY3Rpb24zNDUxNjA2MQ=="].user:
> object "MDEyOk9yZ2FuaXphdGlvbjQzMDkzODIw" looks like it should have
> type "Organization", but the server claims that it has type "User"

Unit tests for the GitHub typename guesser added as well.

Running `yarn test --full` passes.

wchargin-branch: mirror-guess-typenames
2019-09-01 01:04:53 -07:00
Vanessasaurus 8f9e967496 Adding docker build (all but master) and deploy (master) to CircleCI (#1320)
Signed-off-by: Vanessa Sochat <vsochat@stanford.edu>
2019-08-28 23:17:50 +02:00
William Chargin e762d2b900
security: upgrade transitive `eslint-utils` to 1.4.2 (#1332)
Summary:
Upgrading past a security fix in that package. Generated by running
`yarn add eslint@^6.2.2 babel-eslint@^10.0.3`: `eslint` to update the
problematic transitive dependency, and `babel-eslint` to avoid
<https://github.com/eslint/eslint/issues/12117>.

Test Plan:
Running `yarn lint` yields no false positives, and does complain on true
positives. Running `yarn list --pattern eslint-utils` lists only v1.4.2.

wchargin-branch: eslint-utils-1.4.2
2019-08-28 07:51:41 -07:00
William Chargin ae8ab0d1bd
Check typesafety of `NullUtil.filterList` (#1328)
Summary:
The current implementation of `NullUtil.filterList` uses an `any`-cast.
This is fine as long as the definition is actually typesafe; we should
take at least a little care to ensure that it is. This commit adds a
typesafe version, commented out but still typechecked, and refines the
type around the `any`-cast to make the cast slightly more robust.

Test Plan:
Note that changing `$ReadOnlyArray<?T>` to `$ReadOnlyArray<?T | number>`
in the declaration of `filterList` caused no Flow error prior to this
commit, but now causes one.

wchargin-branch: filter-list-typecheck
2019-08-26 10:35:08 -07:00
William Chargin 909045a7ec
Rename `NullUtil.filter` to `NullUtil.filterList` (#1327)
Summary:
The old name is misleading. There _is_ a function called `filter` on
options, but its type is `(Option<T>, (T -> boolean)) -> Option<T>`:

  - Java: <https://docs.oracle.com/javase/8/docs/api/java/util/Optional.html#filter-java.util.function.Predicate->
  - Rust: <https://doc.rust-lang.org/std/option/enum.Option.html#method.filter>
  - Haskell: <https://hackage.haskell.org/package/base-4.12.0.0/docs/Control-Monad.html#v:mfilter>
  - OCaml (Core): <https://ocaml.janestreet.com/ocaml-core/latest/doc/base/Base/Option/index.html#val-filter>

This is even inconsistent with SourceCred’s own documentation:
<126332096f/src/util/null.js (L31)>

In general, a function called `foo` on options where `foo` also exists
on lists has the meaning, “interpret an `Option<T>` as a subsingleton
list, apply `foo` to the list, and reinterpret as an option”. To choose
the same name for a conflicting function is confusing.

The function that was wanted is really just a special case of `flatMap`.
For instance, in Java:

```
-> import java.util.stream.Collectors;

-> (List.of(Optional.of(3), Optional.empty(), Optional.of(2))
>>     .stream()
>>     .flatMap(Optional::stream)
>>     .collect(Collectors.toList()))
|  Expression value is: [3, 2]
|    assigned to temporary variable $2 of type List<? extends Object>
```

Yet some languages do provide it as a utility function: [`catMaybes`] in
Haskell, or [`List.filter_opt`] in OCaml (Core). For parallelism with
the latter, we define `NullUtil.filterList`.

[`List.filter_opt`]: https://ocaml.janestreet.com/ocaml-core/latest/doc/base/Base/List/#val-filter_opt
[`catMaybes`]: https://hackage.haskell.org/package/base-4.12.0.0/docs/Data-Maybe.html#v:catMaybes

Test Plan:
That `yarn flow` passes suffices.

wchargin-branch: filter-list
2019-08-26 10:16:44 -07:00
Dandelion Mané 12a3321ea7
Fix failing snapshot test (#1329)
PR #1325 introduced a failing snapshot test, which was promptly caught
by @wchargin. This commit fixes it by running
`./scripts/update_snapshots.sh`. Also, I bumped the project JSON version
number, which also should have happened in #1325.

Test plan: `yarn test --full` passes.
2019-08-26 18:23:11 +02:00
Dandelion Mané b4463f2ab7
cli load uses discourse key (#1326)
This commit modifies `cli/load` to appropriately load a Discourse key
from the environment, if it is available.

The mechanics are basically the same as with the GitHub token.

Test plan: Unit tests added. `yarn test` passes.
2019-08-26 13:40:19 +02:00
Dandelion Mané 243437f1cd
api: add support for loading Discourse servers (#1325)
This commit modifies the `Project` type so that it allows settings for a
Discourse server, and ensures that `api/load` will appropriately load
the server in question, and include it in the output graph.

Putting the full Discourse declaration directly into the Project type is
an unsustainable development practice—in general, adding plugins should
not require changing core data types. However, at the moment I'm punting
on polishing the plugin system, in favor of adding the Discourse plugin
quickly, so I just put it into Project alongside the repo ids.

In the future, I expect to refactor the plugins around a much cleaner
interface; it's just not a priority as yet. (Tracking: #1120.)

This commit also makes the GitHub token optional in `api/load`, since
now it's very plausible that a user will want to only load a Discourse
server, and therefore not require a GitHub token.

As of this commit, it's still impossible to load Discourse projects, as
the CLI always sets a null Discourse server; and in any case, the
frontend would not properly display the project in question, as any
Discourse types would get filtered out.

Test plan: Mocking unit tests have been added to `api/load.test.js` to
ensure that the Discourse graph is loaded and merged correctly.
2019-08-26 13:31:52 +02:00
Dandelion Mané 126332096f
NullUtil: add `filter` (#1324)
This adds a new method called `filter` to the `NullUtil` module.
`filter` enables you to filter all the null-like values out of an array
in a convenient typesafe way. (It's really just a wrapper around
`Array.filter((x) => x != null)` with a type signature.)
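
Stripped of its Flow type signature, the wrapper described above is roughly the following (a plain-JavaScript sketch, without the module's actual annotations):

```javascript
// Remove null and undefined entries from an array.
// `!= null` (loose inequality) matches both null and undefined,
// while keeping other falsy values such as 0, "", and false.
function filter(xs) {
  return xs.filter((x) => x != null);
}

const cleaned = filter([3, null, 0, undefined, "", false]);
```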

Test plan: Unit tests added (for both functionality and type safety).
2019-08-26 13:20:40 +02:00
Dandelion Mané ebdd2a05c5
add a loadDiscourse method (#1323)
This is the analogue to `github/loadGraph`, but for Discourse. It
basically pipes together the mechanisms for loading Discourse data and
creating a Discourse graph from them, resulting in a single endpoint for
consumption in the API.

In contrast to github, the method is called `loadDiscourse` and not
`loadGraph`, which seemed more appropriate to me. I haven't changed
the corresponding GitHub method's name. (I'm currently knowingly letting
conceptual debt accumulate around the plugin interface; I expect to do a
full refactor within the next few months.)

Test plan: This is the kind of "pipe together tested APIs involving IO"
code which I have decided not to write explicit tests for. However, it
is still protected by flow, and I have a branch (`discourse-plugin`)
which uses this code to do a full Discourse load.
2019-08-26 12:25:42 +02:00
Dandelion Mané 0e3ce1c531
mirror: output progress to taskReporter (#1322)
It's nice to get some sense of what is happening while waiting for a Discourse load.

Test plan: See attached unit tests.
2019-08-24 12:41:19 +02:00
Dandelion Mané 012f19eb48
discourse fetch: add rate limiting (#1321)
This implements rate limiting to the Discourse fetch logic, so that we
can actually load nontrivial servers without getting a 429 failure.

We could have used retry; I thought it was more polite to actually limit
the rate at which we make requests. However, to avoid seeing 429s in
practice, I left a bit of a buffer: we make only 55 requests per minute,
although 60 would be allowed.

If we want to improve Discourse loading time, we could boost up to the
full 60 request/min, but add in retries. (Or we could switch to retries
entirely.)
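
A sliding-window limiter along these lines would enforce the 55-requests-per-minute budget. This is a hand-rolled sketch, not the code in the Discourse fetcher (which may well use a library); the `RateLimiter` class and its `reserve` method are hypothetical, and the clock is passed in explicitly so the behavior is easy to check.

```javascript
// Allow at most `maxRequests` sends in any window of `windowMs` milliseconds.
class RateLimiter {
  constructor(maxRequests, windowMs) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
    this.sendTimes = []; // scheduled send times, in ascending order
  }

  // Reserve a slot for a request arriving at time `now` (in ms).
  // Returns how long the caller should wait before actually sending.
  reserve(now) {
    // Forget sends that have already left the window.
    while (
      this.sendTimes.length > 0 &&
      this.sendTimes[0] <= now - this.windowMs
    ) {
      this.sendTimes.shift();
    }
    let sendAt = now;
    if (this.sendTimes.length >= this.maxRequests) {
      // Wait until the send `maxRequests` slots back has left the window.
      const blocking = this.sendTimes[this.sendTimes.length - this.maxRequests];
      sendAt = blocking + this.windowMs;
    }
    this.sendTimes.push(sendAt);
    return sendAt - now;
  }
}

const limiter = new RateLimiter(55, 60 * 1000);
const delays = [];
for (let i = 0; i < 56; i++) {
  delays.push(limiter.reserve(0));
}
```

The first 55 requests go out immediately; the 56th is told to wait a full minute.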

Test plan: This logic is untested, however my full discourse-plugin
branch uses it to do full Discourse loads without issue.
2019-08-23 17:51:35 +02:00
Vanessasaurus 08408a9706 Adding Docker container with instructions for running sourcecred (#1288)
Adding docker container recipe and instructions in README for running sourcecred

Signed-off-by: Vanessa Sochat <vsochat@stanford.edu>

Test plan: @decentralion verified that the commands work on a fresh setup prior to merging.
2019-08-23 13:23:35 +02:00
William Chargin 51f37cbf5f
security: upgrade `lodash` to mitigate CVE-2019-10744 (#1306)
Summary:
Generated by manually deleting the three `lodash` paragraphs from the
lockfile and then re-running `yarn`.

Test Plan:
Prior to this commit, running `yarn audit` noted 3011 high-severity
vulnerabilities; now, it notes none. Running `yarn test --full` still
passes.

wchargin-branch: security-upgrade-lodash
2019-08-22 09:04:52 -07:00
William Chargin c162813a5e
Fix Prettier deprecations and typings post upgrade (#1307)
Summary:
In #1194, we upgraded Prettier from 1.13.4 to 1.18.2, but this upgrades
past <https://github.com/prettier/prettier/pull/5647>, which was first
released in Prettier 1.16.0. This commit fixes the uses of deprecated
code introduced as a result. It also upgrades the type definitions to
match, via `flow-typed install prettier@1.18.2`.

Addresses part of #1308.

Test Plan:
Prior to this commit, running `yarn unit` would print

```
    console.warn node_modules/prettier/index.js:7934
      { parser: "babylon" } is deprecated; we now treat it as { parser: "babel" }.
```

in two test cases; it no longer prints any such warnings. Furthermore,
running `git grep 'parser.*babylon'` no longer finds any matches.

wchargin-branch: prettier-deprecations
2019-08-22 09:00:25 -07:00
William Chargin 141a0a23d2
flow: add libdefs for `deep-freeze` (#1310)
Summary:
This dependency was added in #1249 without typedefs, and so is
implicitly `any`-typed.

Depends on #1309 to fix a bug that would otherwise be a true positive
type error.

Addresses part of #1308.

Generated with `flow-typed install deep-freeze@0.0.1`.

Test Plan:
Running `yarn flow` passes, but fails if you remove the `nodePrefix` or
`edgePrefix` attributes of the Discourse plugin declaration.

wchargin-branch: libdefs-deep-freeze
2019-08-22 08:57:16 -07:00