This is an alternative fix for #1440 that incorporates my review
comments from #1443: it narrows the error handling to just 404s from
the server, and crashes on any other error.
@wchargin identified issues with the way we set up and reset the warning
mocks in discourse/mirror.test.js. During testing, we found issues where
an unexpected warning might not cause test failures, or an unexpected
warning could break subsequent tests.
This commit fixes both issues.
Test plan: Besides the fact that `yarn test` passes, we've found that
adding a single unexpected console.warn to a test will cause that test
(and only that test) to fail.
Paired with @wchargin
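The two failure modes described above (an unexpected warning going unnoticed, and a warning leaking into later tests) can be illustrated with a framework-free sketch. This is not the code from discourse/mirror.test.js; `withWarnRecorder` is a hypothetical helper, and the real tests use mocks from the test framework instead.

```javascript
// Hypothetical helper, for illustration only.
// Runs `body` with console.warn replaced by a recorder, and always restores
// the original afterwards, so a stray warning cannot affect later tests.
function withWarnRecorder(body, expectedCount = 0) {
  const original = console.warn;
  const calls = [];
  console.warn = (...args) => {
    calls.push(args);
  };
  try {
    body();
  } finally {
    // Restore even if `body` throws.
    console.warn = original;
  }
  // Any mismatch (including an unexpected warning) fails this test only.
  if (calls.length !== expectedCount) {
    throw new Error(
      `expected ${expectedCount} warning(s), saw ${calls.length}`
    );
  }
  return calls;
}
```

A test that expects one warning would call `withWarnRecorder(body, 1)`; a test that expects none relies on the default and fails if anything is logged.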
This fixes the non-recoverable error in #1440; namely SourceCred
crashing when the Discourse server returns 404 for a user's actions. I'm
not sure why this happens (maybe DB is in an inconsistent state?) but
missing the likes for a particular user is less frustrating than not
being able to load cred at all.
I've also added a unit test which verifies this behavior; I've confirmed
that before applying the fix, the test fails.
Test plan: `yarn test`
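A minimal sketch of the "tolerate 404, crash on anything else" policy, assuming a fetch-style response object with `status`, `ok`, and `json()`. The names `likesForUser` and `fetchImpl` are hypothetical and are not SourceCred's actual API.

```javascript
// Hypothetical sketch; not the actual Discourse fetcher.
async function likesForUser(fetchImpl, url) {
  const response = await fetchImpl(url);
  if (response.status === 404) {
    // Recoverable: treat this user's likes as unavailable and carry on.
    return null;
  }
  if (!response.ok) {
    // Anything else is unexpected, so fail loudly rather than hide it.
    throw new Error(`Unexpected status ${response.status} for ${url}`);
  }
  return response.json();
}
```

Callers then need to handle a `null` result explicitly, which keeps the "missing likes" case visible at the call site.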
Summary:
We’ve hitherto only run `yarn test` on each commit, to reduce latency.
This commit introduces an advisory (non-blocking) `yarn test --full`
run. Our GitHub branch protection rule is configured to only require
that the `test` task pass before blessing the PR, which is why the
Docker tag preview job doesn’t also block merging. In the case that a
commit is approved quickly and needs to be merged immediately, this
doesn’t get in your way. In all other cases, this can help prevent
breakages.
Test Plan:
Watch the CI run for this commit. Note that all jobs are running, but
only the `test` job is marked as required; see [screenshot][1].
[1]: https://user-images.githubusercontent.com/4317806/68623255-edce3900-0488-11ea-948f-a0cab5174a35.png
wchargin-branch: ci-advisory-full
Summary:
Generated with `./scripts/update_snapshots.sh`. This fixes failures
introduced in #1431.
Test Plan:
Running `yarn test --full` now passes. Inspecting the diff shows that
this only includes a compat version number change, which is appropriate.
wchargin-branch: fix-1431-failures
Summary:
Most changes due to <https://github.com/prettier/prettier/pull/6694>.
Generated with `yarn add prettier@1.19.1 && yarn prettify`.
Test Plan:
Running `yarn test` suffices.
wchargin-branch: prettier-v1.19.1
This removes all usage of and reference to the admin API key and username, relying instead on anonymous access to the Discourse API.
This enables anyone to deploy an instance with discourse support, and is much safer, since the admin API key isn't used for this purpose anymore. Once merged I would encourage revoking any admin API keys used in the past.
The only notable remaining reference to the Discourse username is in the project file, which goes from 0.3.0 to 0.3.1 in a backwards-compatible way here, simply ignoring the username if present. For #1426 I'm expecting a 0.4.0 version, so this is to prevent having to change project files twice.
Test plan: Updated the snapshots to their latest anonymous versions. Ran `yarn test` and loaded Discourse anonymously from the CLI numerous times.
Summary:
The Flow team fixed a lot of bugs related to object spreading recently.
Some of these enable us to simplify our code (`generateGraphqlFlowTypes`
and `mirror`). Some find new genuine errors. Others require suppressions
in place of a larger change.
Test Plan:
Running `yarn flow` now passes.
wchargin-branch: upgrade-flow-v0.111.0
This commit upgrades the legacy explorer to now properly include types
from all loaded plugins, rather than just the GitHub plugin. This makes
the legacy UI much more usable for inspecting SourceCred's own
(multi-plugin) cred.
Test plan: Manual inspection of the frontend. `yarn test` passes.
Part of https://discourse.sourcecred.io/t/fixup-legacy-explorer/316
By keeping the TimelineCred in state instead of the Graph, we can access
the plugin information (and potentially other config) from TimelineCred.
Note that the legacy app does still use old-style cred calculation (no
time weighting).
Test plan: `yarn test`. It's just a refactor.
Part of https://discourse.sourcecred.io/t/fixup-legacy-explorer/316
As suggested in #1420, heretofore the Discourse plugin wasn't actually
picking up mentions. The issue is that the (thoroughly tested) mention
detection logic assumed that mention urls took the form
`$SERVERURL/u/$USERNAME`, but actually they are encoded as a relative link,
as in `/u/$USERNAME`. As such, the logic was internally consistent but
never detected any actual mentions!
It's a good case study in the need for integration tests and not just
unit tests. I've updated the code so we do have a proper integration
test: references.test.js validates that a topic reference, post
reference, and user mention are all properly detected in the real output
from a Discourse topic.
Test plan: `yarn test` passes; inspect updated snapshots and tests.
Fixes #1420.
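To illustrate the bug, here is a minimal sketch of mention detection that accepts both the relative form (`/u/$USERNAME`) and the absolute form, by resolving each link against the server URL first. `parseMention` is a hypothetical name, and the exact matching rules in the plugin may differ.

```javascript
// Hypothetical helper; returns the mentioned username, or null.
function parseMention(href, serverUrl) {
  let url;
  try {
    // Relative hrefs such as "/u/alice" are resolved against serverUrl;
    // absolute hrefs are left as they are.
    url = new URL(href, serverUrl);
  } catch (e) {
    return null;
  }
  // Only count mentions of users on the same server.
  if (url.origin !== new URL(serverUrl).origin) {
    return null;
  }
  const match = url.pathname.match(/^\/u\/([^/]+)\/?$/);
  return match ? match[1] : null;
}
```

With a placeholder server such as `https://forum.example.com`, both `/u/alice` and `https://forum.example.com/u/alice` yield `alice`, while a link to another host yields `null`.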
I want to have the reference tests depend on real snapshotted data.
Therefore, I'm factoring the utilities for interacting with the
snapshot data out of fetch.test.js and into snapshotTestUtil.js.
Test plan: `yarn test` still passes.
I made a new [test post][1] which has references. The Discourse
snapshots now include it, so we can give a realistic test of reference
and mention detection.
This will allow us to verify whether #1420 is affecting us, and fix it
if so.
Test plan: Commit was generated by running the snapshot updater. Other
snapshots have been updated and look OK. `yarn test` passes.
[1]: https://sourcecred-test.discourse.group/t/a-post-with-references/21
Summary:
The functions `isSqlSafe` and `_nontransactionallyFindUnusedTableName`
are unused, because we no longer need to dynamically generate SQL, and
all operations are clearly safe by construction.
Test Plan:
That `yarn flow` passes suffices.
wchargin-branch: mirror-prune-helpers
Summary:
The Mirror module extraction code calculates the set of transitive
dependencies and stores these results in a temporary table to avoid
unnecessary marshalling between JavaScript and C. We originally chose
the temporary table name dynamically, guaranteeing that it was unused.
However, this is unnecessary:
- The temporary table namespace is unique to each database connection,
  so we need only consider possible conflicts in the same connection.
- A `Mirror` instance exercises exclusive ownership of its database
  connection, per its constructor docs, so we need only consider
  conflicts within this module.
- Temporary tables are only used in the `extract` method, so we need
  only consider conflicts in this method.
- The `extract` method makes no open calls nor recursive calls, and
  does not yield control back to the event loop, so only one stack
  frame can be in `extract` at any time.
- The `extract` method itself only creates the temporary table once.
Thus, the temporary table creation is safe. Furthermore, the failure
mode is simply that we raise an exception and fail cleanly; there is no
risk of data loss or corruption.
This patch replaces the dynamically generated table name with a fixed
name. On top of the work in #1313, this removes the last instance of SQL
queries that are not compile-time constant expressions.
Test Plan:
Running `yarn unit -f graphql/mirror` suffices.
wchargin-branch: mirror-fixed-temp-table
Summary:
The migration is complete; only EAV primitives remain, so they shall be
called simply “primitives”. See #1313 and adjacent commits for context.
Test Plan:
Running `git grep -iw eav` no longer returns any results.
wchargin-branch: mirror-eav-prune-names
Summary:
This logic now abstracts over only one implementation, and is no longer
needed.
Test Plan:
That `yarn unit -f graphql/mirror` passes is sufficient.
wchargin-branch: mirror-eav-prune-test-mux
Summary:
This data is now stored in the EAV `primitives` table; see issue #1313 and
adjacent commits for details.
We simultaneously lift the restriction that GraphQL type and field names
be SQL-safe identifiers, as it’s no longer necessary.
Test Plan:
Some test cases queried the legacy primitives tables to check properties
about the database state. These queries have of course been removed;
note that each such removed query was already accompanied by an
equivalent query against the EAV `primitives` table.
Note that `yarn test --full` still passes, and that when manually
loading `sourcecred/example-github` the cache no longer has any of the
legacy tables.
wchargin-branch: mirror-eav-prune-tables
Summary:
GitHub logins may not have underscores, because underscores are not
valid characters in DNS labels. We already have a good-enough regular
expression for validating GitHub usernames; this commit updates the
alias parser to use that.
Discourse usernames are more permissive than what is listed here, but we
leave that unchanged for now.
Test Plan:
Unit tests updated.
wchargin-branch: alias-no-underscore
Summary:
All the documentation and tests seem to be assuming that aliases must be
anchored: `github/torvalds`, but not `some github/torvalds stuff`.
JavaScript regular expressions aren’t anchored by default; this commit
adds explicit anchoring and adds tests.
Test Plan:
Unit tests added.
wchargin-branch: alias-anchor
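The effect of anchoring can be seen with a simplified pattern. The regular expressions below are illustrative assumptions, not the alias parser's actual expressions.

```javascript
// Simplified, hypothetical patterns for illustration only.
const unanchored = /github\/[A-Za-z0-9-]+/;
const anchored = /^github\/[A-Za-z0-9-]+$/;

// Without anchors, a match anywhere in the string counts.
function looksLikeAliasUnanchored(s) {
  return unanchored.test(s);
}

// With ^ and $, the whole string must be the alias.
function isAlias(s) {
  return anchored.test(s);
}

console.log(looksLikeAliasUnanchored("some github/torvalds stuff")); // true
console.log(isAlias("some github/torvalds stuff")); // false
console.log(isAlias("github/torvalds")); // true
```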
This commit modifies `discourse/createGraph` so that it finds all of the
same-server Discourse references in Discourse posts, and creates
appropriately typed references edges in response.
The unit tests have been updated with cases for both references that
should exist, and references that shouldn't (e.g. post index out of
bounds, or a reference to the wrong server).
Test plan: `yarn test --full` along with snapshot update.
This is progress towards [Discourse reference and mention detection][1].
[1]: https://discourse.sourcecred.io/t/discourse-reference-mention-detection/270
The `discourse/references` module now has a `linksToReferences` method
which extracts the parsed Discourse references from an array of
hyperlinks. The method is tested.
Test plan: Unit tests added; `yarn test` passes.
This is progress towards [Discourse reference and mention detection][1].
[1]: https://discourse.sourcecred.io/t/discourse-reference-mention-detection/270
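As a rough sketch of what turning hyperlinks into references can look like, the code below classifies same-server links as topic or post references. It assumes Discourse URLs of the form `/t/$SLUG/$TOPIC_ID` and `/t/$SLUG/$TOPIC_ID/$POST_INDEX`; the names `toReference` and `sketchLinksToReferences` and the returned object shapes are hypothetical, not the module's real API.

```javascript
// Hypothetical sketch; not the real `linksToReferences`.
function toReference(href, serverUrl) {
  let url;
  try {
    url = new URL(href, serverUrl);
  } catch (e) {
    return null;
  }
  // Ignore links that point at a different server.
  if (url.origin !== new URL(serverUrl).origin) {
    return null;
  }
  // Assumed URL shape: /t/<slug>/<topicId>[/<postIndex>]
  const match = url.pathname.match(/^\/t\/[^/]+\/(\d+)(?:\/(\d+))?\/?$/);
  if (match == null) {
    return null;
  }
  const topicId = Number(match[1]);
  if (match[2] == null) {
    return {type: "TOPIC", topicId};
  }
  return {type: "POST", topicId, postIndex: Number(match[2])};
}

// Keep only the links that parse as references.
function sketchLinksToReferences(links, serverUrl) {
  return links.map((l) => toReference(l, serverUrl)).filter((r) => r != null);
}
```

Whether a parsed post index actually exists in the topic is a separate check, which belongs wherever the graph is built.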
Summary:
The notes used to focus on the legacy implementation with a minor note
about the EAV implementation; this change flips that relationship.
Test Plan:
None.
wchargin-branch: mirror-eav-impl-notes
Summary:
This flips the switch for all production `Mirror` reads to use the
single `primitives` EAV table as their source of truth, rather than the
legacy type-specific primitives tables. For context and design
discussion, see issue #1313 and commits adjacent to this one.
Test Plan:
All relevant code paths are already tested (see test plans of commits
adjacent to this one). Running `yarn test --full` passes.
wchargin-branch: mirror-eav-flip
Summary:
This completes the end-to-end EAV mode pipeline, but does not yet set it
as default or use it in production.
A note about indentation: we take care to avoid reindenting the entire
block of `extract` test cases, which is over 900 lines long. As to the
implementation code, reindenting the legacy type-specific primitives
branch is not easily avoidable, but when we remove that branch we won’t
have to reindent the EAV mode branch: we can replace its `if` block with
two scope blocks (which is the right thing to do, anyway).
Test Plan:
We reuse existing tests, which suffice for full coverage in both
implementation branches. Note that these tests cover the case of object
types with no primitive fields (the `Feline` and `Socket` types), which
are more likely to fail in a broken EAV implementation than in a broken
type-specific primitives implementation due to deletion anomalies.
To check that all relevant calls to `mirror.extract(…)` have been
properly replaced with `extract(mirror, …)`, run
    yarn coverage -f graphql/mirror -t 'EAV primitives'
and note that the “else” path of the `if (fullOptions.useEavPrimitives)`
branch is not taken; then, run
    yarn coverage -f graphql/mirror -t 'legacy type-specific primitives'
and note that the “if” path of the same branch is not taken.
To check that the table hiding logic is working, invert the branch that
checks `if (fullOptions.useEavPrimitives)`, and note that every test
case using the table hiding logic fails (except for some of the error
handling test cases, which do not actually need to read primitive data).
Finally, `yarn test --full` passes after flipping the `useEavPrimitives`
default to `true`.
wchargin-branch: mirror-eav-extract
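The point about object types with no primitive fields can be illustrated with in-memory stand-ins for the tables. Everything below is an assumption made for illustration (the row shapes, the IDs, and both function names); it is not the `Mirror` schema or its extraction code.

```javascript
// Hypothetical in-memory stand-ins for database tables.
const objects = [
  {id: "repo:1", typename: "Repository"},
  {id: "cat:1", typename: "Feline"}, // a type with no primitive fields
];
const primitives = [
  {objectId: "repo:1", fieldname: "name", value: "example-github"},
  {objectId: "repo:1", fieldname: "url", value: "https://example.com/repo"},
];

// Broken: discovers objects only through their EAV rows, so an object
// with no primitive fields silently disappears.
function extractFromRowsOnly(primitives) {
  const result = new Map();
  for (const {objectId, fieldname, value} of primitives) {
    if (!result.has(objectId)) {
      result.set(objectId, {});
    }
    result.get(objectId)[fieldname] = value;
  }
  return result;
}

// Safer: start from the list of objects, then attach whatever rows exist.
function extractFromObjects(objects, primitives) {
  const result = new Map(objects.map((o) => [o.id, {}]));
  for (const {objectId, fieldname, value} of primitives) {
    const fields = result.get(objectId);
    if (fields != null) {
      fields[fieldname] = value;
    }
  }
  return result;
}
```

In this sketch, `extractFromRowsOnly(primitives)` has no entry for `cat:1`, whereas `extractFromObjects(objects, primitives)` maps it to an empty set of fields, which is why test cases with such types are valuable.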
This is a minor refactor to re-organize the createGraph function in the
Discourse plugin to use a class under the hood. Using a hidden class
makes sense because there is a fair bit of shared state that's needed
while creating the graph.
The proximate cause for this refactor is that adding reference edges will
bloat the `addPost` section of the function, which was already a little
too complex. Simply shoving in more complexity would make it unwieldy. So
I opted for this minor refactor. It's internal-only (no public APIs are
changed).
Test plan: `yarn test` passes. As noted, refactor is internal-only.
This is progress towards [Discourse reference and mention detection][1].
[1]: https://discourse.sourcecred.io/t/discourse-reference-mention-detection/270
This commit adds a `parseLinks` method to a new module,
`plugins/discourse/references`. `parseLinks` allows us to extract the
hyperlinks from `<a>` tags in "cooked" HTML.
I added `htmlparser2` as a dependency to parse the HTML. There were a
lot of options to choose from; I chose htmlparser2 because it has a lot
of usage, reasonable performance, and suits our needs. We use this
dependency in a lightweight and local way, so we can always change it
later if needed.
One thing which was a bit odd: I wasn't able to import it using
`import`, and needed a `require` statement instead.
Test plan: Unit tests added; `yarn test` passes.
This is progress towards [Discourse reference and mention detection][1].
[1]: https://discourse.sourcecred.io/t/discourse-reference-mention-detection/270
This modifies the Discourse fetcher and mirror so that we now keep post
contents around, thus enabling future reference detection (and other
things). The post contents are stored and provided as retrieved from the
API, which is in "cooked" HTML form.
Test plan: Unit tests and snapshots updated. Observe that the snapshots
now include Discourse post contents.
This is progress towards [Discourse reference and mention detection][1].
[1]: https://discourse.sourcecred.io/t/discourse-reference-mention-detection/270
We need one tiny change in test code, where Flow (correctly) detects an
error. I've added an error suppression comment: the Flow error is
genuine, but intentional, since we are testing an error condition.
Test plan: `yarn test`
In #1391, I updated the default alpha, but forgot to regenerate the load
snapshots. This caused a [nightly build failure][1]. This commit fixes
it.
Test plan: `yarn test --full` passes.
[1]: https://circleci.com/gh/sourcecred/sourcecred/2300