Commit Graph

1406 Commits

Author SHA1 Message Date
Dandelion Mané 8de57fdb7b
add TimelineCred.userNodes (#1369)
This is a convenience method that extracts cred for all the user-typed
nodes. It's basically an abstraction over calling `credSortedNodes` with
the right set of prefixes.

I forsee using it in at least two places (score retrieval in the CLI and
score display in the frontend) so I decided to make it a method.

Test plan: A very simple unit test was added. (It's a very simple
wrapper function.)
2019-09-10 20:02:28 +02:00
Dandelion Mané 1079f5ec86
timelineCred.credSortedNodes takes prefixes (#1368)
This lets us filter by a group of prefixes simultaneously, which enables
e.g. seeing all user node types at once.

I also tweaked the API to make it a bit more convenient, you can now
pass no arguments and get all nodes in sorted order.

Test plan: Unit tests updated.
2019-09-10 19:44:03 +02:00
Dandelion Mané 65f22a0a74
Replace TimelineCredConfig with array of plugins (#1367)
The PluginDeclaration has all of the information we need to configure
TimelineCred: it knows all the node and edge types, as well as which
node types are user (or scoring) node types.

Therefore, we can replace the ad-hoc config object with a simple array
of plugin declarations. Since the plugins will be saved as part of the
TimelineCred, it means the UI can configure to only show information for
plugins that are actually in scope.

Test plan: `yarn test` passes, and the prototype still works. Snapshots
updated.
2019-09-10 19:36:12 +02:00
Dandelion Mané dcf4010ff0
discourse: fix fetch failure on 410 (#1366)
When a post or topic is deleted, Discourse fetch will give status 410.
As with 404 and 403, we should just ignore the post and move on.

I took the opportunity to slightly refactor the fetch error handling
while I was there.

Test plan: Previously, doing a load on the SourceCred discourse instance
would fail due to a deleted topic. Now, it doesn't.
2019-09-10 19:13:13 +02:00
Dandelion Mané aecd2864bf
Let plugins specify user types (#1365)
This modifies the pluginDeclaration so that it can specifiy user node
types. This will allow us to replace the TimelineCredConfig type with a
plugin collection instead.

It's expected that the user types will also be present in the node
types, although this isn't validated anywhere at present.

Test plan: `yarn flow`.
2019-09-10 19:09:01 +02:00
Dandelion Mané dbb31a586c
Capitalize Discourse plugin name (#1364)
This ensures consistency with GitHub, and will allow us to use plugin
names in the UI.

Test plan: Not needed, trivial change.
2019-09-10 19:06:05 +02:00
Dandelion Mané e2e6c56650
Enable multiple scoring node types (#1361)
This updates the cred computation logic so that we can have multiple
"scoring node types".

Context: Currently, we designate a single node type (GitHub users) as
the scoring node type, and normalize so that all users have 1000 score
in total.

This commit updates the pipeline to admit using more than one prefix for
scoring, meaning that we could have GitHub users, Discourse users, and
more, and still have all users sum to 1000 score.

We will still need to update the frontend so that it will have a user
pane which aggregates across all users.

Test plan: Unit tests updated. `yarn test` passes.
2019-09-10 19:05:46 +02:00
William Chargin 0d7db99d7f
Blacklist `@allcontributors` bot (#1363)
Summary:
This adds `MDM6Qm90NDY0NDczMjE=` (`@allcontributors`) to the blacklist
to enable loading the `aragon/aragon` repository. See #1362 and #996 for
context.

Test Plan:
Running `node ./bin/sourcecred.js load aragon/aragon` on a clean cache
now completes successfully.

wchargin-branch: blacklist-allcontributors
2019-09-10 08:55:16 -07:00
William Chargin fa6697719a
update_snapshots: fix Discourse key check (#1360)
Summary:
This was doing exactly the wrong thing, attempting to update snapshots
whenever the Discourse API token was _not_ present.

Test Plan:
Running `env -u DISCOURSE_TEST_API_KEY ./scripts/update_snapshots.sh`
now successfully updates non-Discourse snapshots, rather than emitting
an error, “Please set the DISCOURSE_TEST_API_KEY environment variable.”.

wchargin-branch: update-snapshots-discourse
2019-09-07 16:27:29 -07:00
William Chargin a6a291a3cc
test: fix `example-github-load` snapshot test (#1359)
Summary:
Generated with `./scripts/update_snapshots.sh` (with #1360 patched in).
This fixes failures introduced in #1358.

Test Plan:
Running `yarn test --full` now passes. Inspecting the diff (after piping
the old and new snapshots to `jq -S .`) shows that this includes only
additions, which seems appropriate given the precipitating change.

wchargin-branch: fix-1358-failures
2019-09-07 16:22:53 -07:00
Dandelion Mané 545b084146
Change TimelineCred filtering strategy (#1358)
This changes how TimelineCred filtering works. Instead of using the
filterTimelineCred module, which includes all nodes matching
filterPrefixes, we now take all nodes matching scorePrefixes and
additionally the top `k` nodes for every other type.

This ensures that we will have the top comments, pull requests, issues,
etc in the UI, without needing to take every single comment or PR or
issue.

Concurrently, the UI is updated so that every type is included in the
filter dropdown.

CHANGELOG has been updated, since this is user facing.

Test plan: `yarn test` passes, snapshots are updated, and I also tested
the UI manually.
2019-09-08 00:32:10 +02:00
Dandelion Mané f31a92874b
hide `filterTimelineCred` (#1357)
TimelineCred computation is implemented as follows:
- Compute Distribution
- Filter it down to specified node types
- Wrap the filtered results into a TimelineCred

I want to change how the filtering works. The new filtering logic will
depend on logic we've already implemented in TimelineCred; therefore
filtering should be done on the TimelineCred object and not separately.
Specifically, I want to be able to filter down to the highest-scored
nodes by type (dependent on the type).

As a first step, I've refactored the interface to TimelineCred so that
the filtering is an implementation detail, i.e. the TimelineCred
constructor doesn't expect objects defined in `filterTimelineCred`.

Test plan: `yarn test` passes after a snapshot update.
2019-09-08 00:20:34 +02:00
Dandelion Mané 5996dd710a
timeline cred config is stored in JSON (#1356)
This modifies the TimelineCred serialization so that it includes the
CredConfig in the JSON. This means that it's easier to coordinate which
plugins and types are in scope, as the data itself can contain that
information.

Rather than define a new hand-rolled serializer, I just passed the
config directly through for stringification. Unit tests verify that this
still works (round-trip serialization is tested). As an added sanity
check, I generated a new small `cred.json`, and inspected the file via
`cat` to ensure that it's still legible text, and isn't interpreted as a
binary file due to the `NUL` bytes in node addresses.

Every client that previously depended on the `DEFAULT_CRED_CONFIG` now
properly gets its cred configuration from the JSON.

Test plan: Unit tests for serialization already exist. Generated a fresh
`cred.json` file and tested the frontend with it. Also,
`yarn test --full` passes.
2019-09-08 00:04:01 +02:00
Vanessasaurus ab0628a7ce adding small changes to use docker layers for cache (#1338)
Signed-off-by: Vanessa Sochat <vsochat@stanford.edu>
2019-09-06 18:16:28 +02:00
William Chargin 5bcec38e5b
Blacklist more problematic quasar interactions (#1335)
Blacklist more problematic quasar interactions

Summary:
Context: <https://github.com/sourcecred/sourcecred/issues/1256#issuecomment-526252852>

Without also blacklisting the reaction, we hit an invariant violation in
the relational view (reactions are expected to have exactly one author).

Test Plan:
Running `node ./bin/sourcecred.js load quasarframework/quasar-cli` now
completes successfully (in about 2 minutes 40 seconds). It does emit a
warning:

```
Issue[MDU6SXNzdWUzNDg0NjUzNDg=].reactions: unexpected null value
```

…because one of the reactions was blacklisted. But the relational view
handles this correctly, it seems: timeline cred is still computed and
renders without obvious error.

wchargin-branch: blacklist-more-quasar
2019-09-02 08:18:36 -07:00
William Chargin 7d3d24e0ec
mirror: guess typenames and warn on mismatch (#1337)
Summary:
The format of GitHub’s GraphQL object IDs is explicitly opaque, and so
we must not introspect them in any way that would influence our results.
But it seems reasonable to introspect these IDs solely for diagnostic
purposes, enabling us to proactively detect GitHub’s contract violations
while we still have useful information about the root cause.

This commit adds an optional `guessTypename` option to the Mirror
constructor, which accepts a function that attempts to guess an object’s
typename based on its ID. If the guess differs from what the server
claims, we continue on as before, but omit a console warning to help
diagnose the issue more quickly.

Resolves #1336. See that issue for details.

Test Plan:
Unit tests for `mirror.js` updated, retaining full coverage. To test
manually, revert #1335, then load `quasarframework/quasar-cli`. Note
that it emits the following warning before failing:

> Warning: when setting Reaction["MDg6UmVhY3Rpb24zNDUxNjA2MQ=="].user:
> object "MDEyOk9yZ2FuaXphdGlvbjQzMDkzODIw" looks like it should have
> type "Organization", but the server claims that it has type "User"

Unit tests for the GitHub typename guesser added as well.

Running `yarn test --full` passes.

wchargin-branch: mirror-guess-typenames
2019-09-01 01:04:53 -07:00
Vanessasaurus 8f9e967496 Adding docker build (all but master) and deploy (master) to CircleCI (#1320)
Signed-off-by: Vanessa Sochat <vsochat@stanford.edu>
2019-08-28 23:17:50 +02:00
William Chargin e762d2b900
security: upgrade transitive `eslint-utils` to 1.4.2 (#1332)
Summary:
Upgrading past a security fix in that package. Generated by running
`yarn add eslint@^6.2.2 babel-eslint@^10.0.3`: `eslint` to update the
problematic transitive dependency, and `babel-eslint` to avoid
<https://github.com/eslint/eslint/issues/12117>.

Test Plan:
Running `yarn lint` yields no false positives, and does complain on true
positives. Running `yarn list --pattern eslint-utils` lists only v1.4.2.

wchargin-branch: eslint-utils-1.4.2
2019-08-28 07:51:41 -07:00
William Chargin ae8ab0d1bd
Check typesafety of `NullUtil.filterList` (#1328)
Summary:
The current implementation of `NullUtil.filterList` uses an `any`-cast.
This is fine as long as the definition is actually typesafe; we should
take a least a little care to ensure that it is. This commit adds a
typesafe version, commented out but still typechecked, and refines the
type around the `any`-cast to make the cast slightly more robust.

Test Plan:
Note that changing `$ReadOnlyArray<?T>` to `$ReadOnlyArray<?T | number>`
in the declaration of `filterList` caused no Flow error prior to this
commit, but now causes one.

wchargin-branch: filter-list-typecheck
2019-08-26 10:35:08 -07:00
William Chargin 909045a7ec
Rename `NullUtil.filter` to `NullUtil.filterList` (#1327)
Summary:
The old name is misleading. There _is_ a function called `filter` on
options, but its type is `(Option<T>, (T -> boolean)) -> Option<T>`:

  - Java: <https://docs.oracle.com/javase/8/docs/api/java/util/Optional.html#filter-java.util.function.Predicate->
  - Rust: <https://doc.rust-lang.org/std/option/enum.Option.html#method.filter>
  - Haskell: <https://hackage.haskell.org/package/base-4.12.0.0/docs/Control-Monad.html#v:mfilter>
  - OCaml (Core): <https://ocaml.janestreet.com/ocaml-core/latest/doc/base/Base/Option/index.html#val-filter>

This is even inconsistent with SourceCred’s own documentation:
<126332096f/src/util/null.js (L31)>

In general, a function called `foo` on options where `foo` also exists
on lists has the meaning, “interpret an `Option<T>` as a subsingleton
list, apply `foo` to the list, and reinterpret as an option”. To choose
the same name a conflicting function is confusing.

The function that was wanted is really just a special case of `flatMap`.
For instance, in Java:

```
-> import java.util.stream.Collectors;

-> (List.of(Optional.of(3), Optional.empty(), Optional.of(2))
>>     .stream()
>>     .flatMap(Optional::stream)
>>     .collect(Collectors.toList()))
|  Expression value is: [3, 2]
|    assigned to temporary variable $2 of type List<? extends Object>
```

Yet some languages do provide it as a utility function: [`catMaybes`] in
Haskell, or [`List.filter_opt`] in OCaml (Core). For parallelism with
the latter, we define `NullUtil.filterList`.

[`List.filter_opt`]: https://ocaml.janestreet.com/ocaml-core/latest/doc/base/Base/List/#val-filter_opt
[`catMaybes`]: https://hackage.haskell.org/package/base-4.12.0.0/docs/Data-Maybe.html#v:catMaybes

Test Plan:
That `yarn flow` passes suffices.

wchargin-branch: filter-list
2019-08-26 10:16:44 -07:00
Dandelion Mané 12a3321ea7
Fix failing snapshot test (#1329)
PR #1325 introduced a failing snapshot test, which was promptly caught
by @wchargin. This commit fixes it by running
`./scripts/update_snapshots.sh`. Also, I bumped the project JSON version
number, which also should have happened in #1325.

Test plan: `yarn test --full` passes.
2019-08-26 18:23:11 +02:00
Dandelion Mané b4463f2ab7
cli load uses discourse key (#1326)
This commit modifies `cli/load` to appropriately load a Discourse key
from the environment, if it is available.

The mechanics are basically the same as with the GitHub token.

Test plan: Unit tests added. `yarn test` passes.
2019-08-26 13:40:19 +02:00
Dandelion Mané 243437f1cd
api: add support for loading Discourse servers (#1325)
This commit modifies the `Project` type so that it allows settings for a
Discourse server, and ensures that `api/load` will appropriately load
the server in question, and include it in the output graph.

Putting the full Discourse declaration directly into the Project type is
an unsustainable development practice—in general, adding plugins should
not require changing core data types. However, at the moment I'm punting
on polishing the plugin system, in favor of adding the Discourse plugin
quickly, so I just put it into Project alongside the repo ids.

In the future, I expect to refactor the plugins around a much cleaner
interface; it's just not a priority as yet. (Tracking: #1120.)

This commit also makes the GitHub token optional in `api/load`, since
now it's very plausible that a user will want to only load a Discourse
server, and therefore not require a GitHub token.

As of this commit, it's still impossible to load Discourse projects, as
the CLI always sets a null Discourse server; and in any case, the
frontend would not properly display the project in question, as any
Discourse types would get filtered out.

Test plan: Mocking unit tests have been added to `api/load.test.js` to
ensure that the Discourse graph is loaded and merged correctly.
2019-08-26 13:31:52 +02:00
Dandelion Mané 126332096f
NullUtil: add `filter` (#1324)
This adds a new method called `filter` to the `NullUtil` module.
`filter` enables you to filter all the null-like values out of an array
in a convenient typesafe way. (It's really just a wrapper around
`Array.filter((x) => x != null)` with a type signature.)

Test plan: Unit tests added (for both functionality and type safety).
2019-08-26 13:20:40 +02:00
Dandelion Mané ebdd2a05c5
add a loadDiscourse method (#1323)
This is the analogue to `github/loadGraph`, but for Discourse. It
basically pipes together the mechanisms for loading Discourse data and
creating a Discourse graph from them, resulting in a single endpoint for
consumption in the API.

In contrast to github, the method is called `loadDiscourse` and not
`loadGraph`, which seemed more appropriate to me. I haven't changed
the corresponding GitHub method's name. (I'm currently knowingly letting
conceputal debt accumulate around the plugin interface; I expect to do a
full refactor within the next few months.)

Test plan: This is the kind of "pipe together tested APIs involving IO"
code which I have decided not to write explicit tests for. However, it
is still protected by flow, and I have a branch (`discourse-plugin`)
which uses this code to do a full Discourse load.
2019-08-26 12:25:42 +02:00
Dandelion Mané 0e3ce1c531
mirror: output progress to taskReporter (#1322)
It's nice to get some sense of what is happening while waiting for a Discourse load.

Test plan: See attached unit tests.
2019-08-24 12:41:19 +02:00
Dandelion Mané 012f19eb48
discourse fetch: add rate limiting (#1321)
This implements rate limiting to the Discourse fetch logic, so that we
can actually load nontrivial servers without getting a 529 failure.

We could have used retry; I thought it was more polite to actually limit
the rate at which we make requests. However, to avoid seeing 529s in
practice, I left a bit of a buffer: we make only 55 requests per minute,
although 60 would be allowed.

If we want to improve Discourse loading time, we could boost up to the
full 60 request/min, but add in retries. (Or we could switch to retries
entirely.)

Test plan: This logic is untested, however my full discourse-plugin
branch uses it to do full Discourse loads without issue.
2019-08-23 17:51:35 +02:00
Vanessasaurus 08408a9706 Adding Docker container with instructions for running sourcecred (#1288)
Adding docker container recipe and instructions in README for running sourcecred

Signed-off-by: Vanessa Sochat <vsochat@stanford.edu>

Test plan: @decentralion verified that the commands work on a fresh setup prior to merging.
2019-08-23 13:23:35 +02:00
William Chargin 51f37cbf5f
security: upgrade `lodash` to mitigate CVE-2019-10744 (#1306)
Summary:
Generated by manually deleting the three `lodash` paragraphs from the
lockfile and then re-running `yarn`.

Test Plan:
Prior to this commit, running `yarn audit` noted 3011 high-severity
vulnerabilities; now, it notes none. Running `yarn test --full` still
passes.

wchargin-branch: security-upgrade-lodash
2019-08-22 09:04:52 -07:00
William Chargin c162813a5e
Fix Prettier deprecations and typings post upgrade (#1307)
Summary:
In #1194, we upgraded Prettier from 1.13.4 to 1.18.2, but this upgrades
past <https://github.com/prettier/prettier/pull/5647>, which was first
released in Prettier 1.16.0. This commit fixes the uses of deprecated
code introduced as a result. It also upgrades the type definitions to
match, via `flow-typed install prettier@1.18.2`.

Addresses part of #1308.

Test Plan:
Prior to this commit, running `yarn unit` would print

```
    console.warn node_modules/prettier/index.js:7934
      { parser: "babylon" } is deprecated; we now treat it as { parser: "babel" }.
```

in two test cases; it no longer prints any such warnings. Furthermore,
running `git grep 'parser.*babylon'` no longer finds any matches.

wchargin-branch: prettier-deprecations
2019-08-22 09:00:25 -07:00
William Chargin 141a0a23d2
flow: add libdefs for `deep-freeze` (#1310)
Summary:
This dependency was added in #1249 without typedefs, and so is
implicitly `any`-typed.

Depends on #1309 to fix a bug that would otherwise be a true positive
type error.

Addresses part of #1308.

Generated with `flow-typed install deep-freeze@0.0.1`.

Test Plan:
Running `yarn flow` passes, but fails if you remove the `nodePrefix` or
`edgePrefix` attributes of the Discourse plugin declaration.

wchargin-branch: libdefs-deep-freeze
2019-08-22 08:57:16 -07:00
William Chargin 0367c9e50c
Fix missing prefixes in Discourse plugin (#1309)
Summary:
A `PluginDeclaration` must have a `nodePrefix` and an `edgePrefix`, but
the Discourse plugin declaration was missing these. This was not caught
by Flow because `deep-freeze` was introduced in #1249 without type
definitions; see #1308.

Test Plan:
Apply the following patch:

```diff
diff --git a/src/plugins/discourse/declaration.js b/src/plugins/discourse/declaration.js
index 246a0a28..36ae5f13 100644
--- a/src/plugins/discourse/declaration.js
+++ b/src/plugins/discourse/declaration.js
@@ -1,6 +1,6 @@
 // @flow

-import deepFreeze from "deep-freeze";
+declare function deepFreeze<T>(x: T): T;
 import type {PluginDeclaration} from "../../analysis/pluginDeclaration";
 import type {NodeType, EdgeType} from "../../analysis/types";
 import {NodeAddress, EdgeAddress} from "../../core/graph";
```

Note that, with this patch, `yarn flow` fails before this change but
passes after it. Running `yarn unit` still passes.

wchargin-branch: discourse-plugin-prefixes
2019-08-22 08:52:52 -07:00
William Chargin ecd15ed3c4
flow: add libdefs that have clean updates (#1311)
Summary:
Generated by running `flow-typed install --skip --overwrite` and
reverting a minimal set of libdefs such that the change does not
introduce any Flow errors (except Prettier, which is covered by #1307).

Addresses parts of #1308.

Changes:

  - `chalk`: upgraded v1.x.x to v2.x.x
  - `flow-bin`: no-op; explicit Flow window widening
  - `isomorphic-fetch`: no-op; formatting change
  - `jest`: updates for Flow v0.104.x (explicit inexact objects), and
    also some functional additions
  - `object-assign`: no-op; explicit Flow window widening
  - `rimraf`: added new at v2.x.x

Test Plan:
Flow passes, by construction.

wchargin-branch: libdefs-clean
2019-08-22 08:49:40 -07:00
William Chargin 158dc0b831
flow: update libdefs for `express` (#1312)
Summary:
These can be updated cleanly after applying the SourceCred-specific
patch. I’ve modified the comment on that patch to be clear that it *is*
SourceCred-specific—after updating, I spent a while trying to find why
it was deleted from upstream, before eventually realizing that it never
existed upstream anyway.

Generated by running `flow-typed install express@4.16.3 --overwrite` and
then manually inserting the three “SourceCred-specific hack” comment
blocks.

Addresses part of #1308.

Test Plan:
Running `yarn flow` still passes (but warns if the hacks are removed).

wchargin-branch: libdefs-express
2019-08-22 08:45:52 -07:00
William Chargin c3f242d3af
libdefs: update libdefs for `enzyme` (#1314)
Summary:
These can be updated cleanly now that an upstream pull request has been
merged: <https://github.com/flow-typed/flow-typed/pull/3522>

Generated by running `flow-typed install enzyme@3.3.0 --overwrite`.

Addresses part of #1308.

Test Plan:
Running `yarn flow` still passes.

wchargin-branch: libdefs-enzyme
2019-08-22 08:36:04 -07:00
William Chargin acdcbc29ed
Fix 404s in timeline and legacy-mode interlinks (#1305)
Summary:
All links in SourceCred must use the `Link` component, providing either
an external URL `href={…}` or an internal route `to={…}`. Any uses of a
raw `<a>` element for internal routes will incur 404s when the
application is hosted on a non-root path, as is currently the case on
the production website.

The change to `FileUploader` is not strictly necessary, as the link has
no styled text and uses a `data:` URL, but there’s no reason not to.

Fixes #1304.

Test Plan:
Build the static site:

```
scripts/build_static_site.sh --target cred --project sourcecred/example-github
```

Then run `python3 -m http.server` from the repository root directory—not
the `cred/` subdirectory—and navigate to the timeline cred view:
<http://localhost:8000/cred/timeline/sourcecred/example-github/>

Observe that the “(legacy)” link now has the correct styling and
correctly navigates to the legacy mode page when clicked: prior to this
change, it would navigate to a URL without the proper `/cred/` path
prefix, yielding a 404. On the legacy page, verify that the “timeline
mode” link has the same properties.

Then, visit <http://localhost:8000/cred/test/FileUploader/> and verify
that the inspection test still passes.

Added a regression test to catch further such errors. Note that
reverting the code changes in this commit causes the test to fail, and
that running it with `--verbose` prints the problematic files.

wchargin-branch: fix-bad-routing-404s
2019-08-18 14:43:34 -07:00
William Chargin 0eeda22e0c
Remove console error from default timeline loader (#1303)
Summary:
This is firing on a production page load of the “prototype” link from
the homepage, and does not seem to actually be an error condition.

Test Plan:
Run `yarn start`, navigate to `/timeline/sourcecred/example-github/`,
and observe that the console error has disappeared.

wchargin-branch: defaultloader-console-error
2019-08-18 13:35:02 -07:00
William Chargin 9fc1482d9d
discourse: save superfluous query of `likes` table (#1302)
Summary:
When inserting a “like” action with `INSERT OR IGNORE` semantics, we
also learn whether the action had any effect. We can use this bit to
avoid a separate query checking whether the “like” already exists.

As mentioned here:
<https://github.com/sourcecred/sourcecred/pull/1298#discussion_r314994911>

Test Plan:
Running `yarn test` passes as is, and fails if you change `addLike` to
always return either `changed: true` or `changed: false`.

wchargin-branch: discourse-likes-one-query
2019-08-18 12:00:59 -07:00
William Chargin 5e26424d82
discourse: factor SQL parsing out of loops (#1301)
Summary:
Calling `db.prepare(sql)` parses the text in `sql` and compiles it to a
prepared statement. This takes time, both for the parsing and allocation
itself and for the context switch from JavaScript to C (SQLite).
Prepared statements are designed to be invoked multiple times with
different bound values. This commit factors prepared statement creation
out of loops so that each call to `update` prepares only a constant
number of statements.

In doing so, we naturally factor out some light JS abstractions over the
raw SQL: `addTopic((topic: Topic))`, rather than `addTopicStmt.run(…)`.

In principle, these could be factored out of `update` entirely to
properties set on the class at initialization time, but, as described in
a comment on the GraphQL mirror, we defer this optimization for now as
it introduces additional complexity.

Test Plan:
Running `yarn test --full` passes.

wchargin-branch: discourse-sql-cse
2019-08-18 11:58:44 -07:00
Dandelion Mané bf5c3a953b discourse: tweak default weights
This commit changes the Discourse default weights around, mostly
significantly moving many weights (e.g. LIKES) that have a 0 backward
weight to have a small positive backward weight instead, like 1/16. In
practice, this mitigates an issue where users with few outbound edges
act as "cred sinks" because the cred gets stuck in a loop between the
user and content they've authored.

Test plan: In local experimentation, I've found the new weights produce
more reasonable-seeming cred attribution.
2019-08-18 19:49:08 +02:00
Dandelion Mané 0d6e868324 discourse: distinguish the different AUTHORS edges
I've written the Discourse plugin with distinct edge types for post and
topic authorship; it allows us to have more precise control over how
cred flows (and mitigates the need for #968). However, I gave the two
types the same name, which is confusing in the weight config ui. Now
they are properly distinct.

Test plan: It's a simple string change. In (unpublished) commits with a
full Discourse integration, the new strings show up nicely in the UI.
2019-08-18 19:49:08 +02:00
Dandelion Mané cedc8e8fe5 discourse: fix post urls
The previous code incorrectly constructed a Discourse post url based on
the post's id, rather than its index within the containing topic. This
is now fixed.

Test plan: There isn't actually a snapshot diff, because the post with
id 2 is also the second post in its thread. I'm not too worried about
this, though: this kind of code changes infrequently, and it's pretty
obvious when it's wrong.
2019-08-18 19:49:08 +02:00
Dandelion Mané c082b8faf2
discourse: add likes edges to the graph (#1299)
A very simple commit; we add a type for likes edges, and add them to the
graph.

Test plan: Unit tests added; yarn test passes.
2019-08-18 19:38:32 +02:00
Dandelion Mané bf68a4c01d
discourse mirror updates likes (#1298)
The Discourse mirror class now keeps an up-to-date record of all of the
likes within an instance. It does this by iterating over every user in
the history, and requesting their likes. If at any point we hit a like
we've already seen, we move on to the next user. In the future, we can
improve this so we only query users we haven't checked in a while, or
users who were recently active.

Test plan: Tests verify that we correctly store all the likes, including
after partial updates, and that we don't issue unnecessary queries.
2019-08-18 19:31:52 +02:00
Dandelion Mané e7b1fbd681
mirror: allow fetching all usernames (#1293)
This is a minor change to the Discourse mirror so that it supports a
query to get all users from the server. It will be convenient for a
followon change which makes `update` search for every user's likes.

I also modified createGraph so that it uses the new method, which
results in code that is cleaner and slightly more efficient.

Test plan: Unit tests updated.
2019-08-18 16:27:11 +02:00
Dandelion Mané f3ae0a8415
discourse fetcher: add support for likes (#1294)
For the Discourse plugin, we really want to be able to add a full record
of all of the users' liked posts as edges in the graph. It's a really
high-signal way to move cred, that also gives individual users a lot of
agency and way to engage.

However: we need an API to get this data. Initial searches of API docs
were un-promising; fisrt, we would need to query potentially every post
to get its likes individually (makes it very expensive to find the likes
on old posts), and second, the likes did not come with timestamp
information. For a while, I thought we were at an impasse.

I then went fishing in the Discourse implementation for a solution (yay
open source!). Lots of the API is un-documented, since it's whatever
they happen to add to run Discourse. And it turns out there's a
`user_actions` API ([source]) which can provide all of a user's actions
in order, and having your content liked by someone else is considered an
action. Best of all, these actions come with timestamps.

The upshot is that instead of querying every post to get its likes, we
can query every user to get likes.  Iterating over all users can still
be slow, but it's far better than iterating over all posts; plus we can
implement caching so that we only infrequently check in on inactive
users.

I've added a `likesByUser` method to the Discourse fetch interface that
provides this information. I've also added a snapshot test for it (and
updated all of the snapshots). I also rolled in a slight refactor to
error handling in the fetcher.

The mirror doesn't yet use this information (will come later).

[source]: 82e07cb0f4/app/controllers/user_actions_controller.rb (L3)

Test plan: `yarn test` passes. Snapshots look good.
2019-08-18 16:20:33 +02:00
William Chargin 64f282fff2 discourse: consolidate parallel `MAX` queries (#1296)
Summary:
As mentioned in <https://github.com/sourcecred/sourcecred/pull/1266#discussion_r312345441>.

Test Plan:
Running `yarn test` passes.

wchargin-branch: discourse-single-max-query
2019-08-17 12:58:55 +02:00
Dandelion Mané b50ba67797
add discourse declaration and createGraph (#1292)
This commit adds the logic needed for creating a contribution graph
based on the Discourse data. We first have a declaration with
specifications for the node and edge types in the plugin. We also have a
`createGraph` module which creates a conformant graph from the Mirror
data. The graph creation is thoroughly tested.

Test plan: Inspect unit tests, run `yarn test`. I also have (yet
unpublished) code which loads the graph into the UI, and it appears
fine.
2019-08-17 04:20:27 +02:00
Dandelion Mané 69831d6961
mirror: make a data interface (#1291)
This is a quick fixup so that the coming createGraph module can be
properly tested.

Shout out to @Beanow for anticipating this need in a [review comment].

[review comment]: https://github.com/sourcecred/sourcecred/pull/1266#discussion_r314305108

Test plan: trivial refactor, run `yarn test`
2019-08-15 20:12:20 +02:00
Dandelion Mané 2f8e1c61e4
Add a Discourse API mirror (#1266)
The mirror wraps a SQLite database which will store all of the data we
download from Discourse.

On a call to `update`, it downloads new data from the server and stores
it. Then, when it is asked for information like the topics and posts, it
can just pull from its local copy. This means that we don't need to
re-download the content every time we load a Discourse instance, which
makes the load more performant, more robust to network failures, etc.

Thanks to @wchargin, whose work on the GraphQL mirror for GitHub (#622)
inspired this mirror.

Test plan: I've written unit tests that use a mock fetcher to validate
the update logic. I've also used this to do a full load of the real
SourceCred Discourse instance, and to create a corresponding graph
(using subsequent commits).

Progress towards #865.
2019-08-15 16:08:04 +02:00