sourcecred

Commit Graph

Author	SHA1	Message	Date
Robin van Boven	821be0b46e	UniRef: add MappedReferenceDetector implementation (#1509 ) The current reference detection implementation internal to the GitHub plugin uses a map similar to this. This class being near to that makes it easy to adopt. It's also very simple to use for tests.	2020-01-07 13:26:53 +01:00
Robin van Boven	4bdc7a57b7	UniRef: Declare ReferenceDetector interface (#1508 ) The core declaration of the ReferenceDetector interface. Reason I'm adding an index.js file is to allow (core) classes that implement this interface to have separate files, while keeping redundancy out of the import statements.	2020-01-07 12:27:19 +01:00
Robin van Boven	b860f31a19	Use createProject to set default values for Project (#1492 ) Creation of new Project instances is spread out across the code. So whenever there's a change in its format, the PR is cluttered with adding a logical default value in many places. It means our default values might be inconsistent as well. For example #1385 adds many `identities: [],` lines. A similar situation would happen with the planned Initiatives plugin, adding many `initiatives: null,` lines. Using this function we can manage what default values to add from a central place. Avoiding noise and code churn.	2020-01-03 15:11:41 +01:00
Robin van Boven	b7b93d2a8d	Add unit tests for projectFromJSON's upgrade support (#1519 )	2020-01-03 14:57:46 +01:00
Robin van Boven	fb16530b8f	Update projectFromJSON & upgrade function signatures (#1518 ) This creates better flow type coverage for the upgrading from older Project types feature. Note projectFromJSON's function signature changes like this: - (Compatible<Project>) => Project + (Compatible<any>) => Project And that makes sense, because we use this function to validate an object we parsed from JSON at runtime. It could actually be anything. Added benefit is that it makes writing unit tests possible. Because now Flow will not throw a type error when we provide something other than Compatible<Project> as input, to test upgrading or validation functionality. Note that the underlying utility fromCompat already uses Compatible<any> for the same object.	2020-01-03 14:50:20 +01:00
Robin van Boven	89529ea4fe	Define previous Project type versions (#1517 ) These are important to accurately add types to function signatures of validating and upgrading logic.	2020-01-03 14:36:01 +01:00
Robin van Boven	8f32912270	Initiatives: define internal datatype Initiative (#1417 )	2019-12-23 00:43:36 +01:00
Robin van Boven	1b0eb483ce	Initiatives: create plugin declaration (#1416 )	2019-12-23 00:36:57 +01:00
Robin van Boven	f35d7e088f	Chore: update packages & bugfix new versions (#1487 ) * chore(package): yarn upgrade Updates all packages within version range. * Bugfix update stacktrace matching code The stacktrace has changed, most likely due to a babel plugin updating. It now seems based on the name of the `handlingErrors` argument instead of the variable name storing the anonymous function. * Bugfix update react-router patch version By updating the react packages, warnings were logged about unsafe componentWillMount usage. These warnings tripped a unit test. react-router was the cause of these, so this update avoids getting the warnings.	2019-12-10 20:52:55 +01:00
Robin van Boven	a41eb71949	GitHub: assume installation token length of 40 (#1486 ) In the documentation 16 characters were displayed. But testing showed we're typically seeing 40. Fixes #1474	2019-12-10 18:45:54 +01:00
Robin van Boven	b4a0cd5ec7	Discourse: remove update mode 1 (#1482 )	2019-12-09 13:00:47 +01:00
Robin van Boven	c209c40e08	Discourse: update error handling of fetch (#1481 ) - Have "topic" reflect actual method name. - Add missing 403 and 429 test for likes. - Preemptively change method used for headers, as .post will be obsolete after refactor.	2019-12-09 12:54:02 +01:00
Robin van Boven	32a1db3010	Discourse: default to update mode 2 (#1465 )	2019-12-03 20:11:28 +01:00
Robin van Boven	f30723b96d	Discourse: add update mode 2 (#1464 )	2019-12-02 20:37:19 +01:00
Robin van Boven	890489c0d2	Discourse: make MockFetcher API similar to real forum (#1473 ) This extends the MockFetcher in the tests to provide new semantics update mode 2 relies on. They're based on the below changes to the Fetcher: - add categoryId and bumpedMs to Topic data #1454 - make topicWithPosts fetch all posts #1455 - add categoryDefinitionTopicIds to fetcher #1456 - implement topicsBumpedSince in fetcher #1457 Particularly because of the addition of two new concepts (categories and category definition topics), the API of the MockFetcher got rather convoluted. This refactor makes it behave a lot more like you'd be familiar with within Discourse. Such as, creating a topic creates its opening post as a side effect. Instead of a post with an unknown topic ID creating a topic as a side effect. And creating a category creates its category definition topic as a side effect. Also, we're being a lot more explicit, using objects instead of positional arguments.	2019-12-02 20:29:18 +01:00
Robin van Boven	c521acc145	Discourse: scope mirror tests as being "mode 1" (#1463 ) This is to prepare for mode 2 being tested side-by-side. The normalizeMode1Topics function enforces bumpedMs is not updated for mode 1 tests. Additionally describe "update semantics" is redundant, as the mirror has no other function than update.	2019-12-02 20:24:49 +01:00
Robin van Boven	3ceb4fb7fa	GitHub: update token validation function (#1471 ) Previously an inline check was used for this. It only accepted the personal access token format. This adds installation tokens as requested in #1461. With more complex logic, we'd benefit from tests. Therefore it's a separate function with a test suite.	2019-11-29 11:53:07 +01:00
Robin van Boven	984c6bbe9f	Add transifex-integration bot (#1469 ) See `9d48a5fca6` as an example of the bot acting as a user.	2019-11-29 11:46:02 +01:00
Robin van Boven	f2e1775c20	Add github-actions bot (#1466 ) See https://github.com/sourcecred/sourcecred-action/pull/6 as an example of the bot acting as a user.	2019-11-29 11:40:37 +01:00
Robin van Boven	a9e89b9f32	Discourse: move update steps to separate functions (#1462 ) Makes no functional changes, it simply splits the update into separate functions so it can be switched out for another implementation.	2019-11-26 11:55:39 +01:00
Robin van Boven	1e643d012f	Discourse: add SyncHeads to the repository (#1460 ) This tracks the local state for new mirroring logic.	2019-11-26 11:47:16 +01:00
Robin van Boven	f6bc91ce5f	Discourse: add replaceTopicTransaction method to repository (#1459 ) Idempotent insert/replace of a Topic, including all its Posts. Note: this will insert new posts, update existing posts and delete old posts. As these are separate queries, we use a transaction here. This is to be used in the new update logic, which also fetches all posts of a topic when the topic is loaded. In particular this allows post editing, which is important for wikis such as those used for the initiative system.	2019-11-26 11:35:04 +01:00
Robin van Boven	7deb0a3205	Discourse: adds bumpedMsForTopic and topicsInCategories queries (#1458 ) bumpedMsForTopic For the given topic ID, retrieves the bumpedMs value. Returns null, when the topic wasn't found. Used by the new update code as a fallback value when making API calls that don't contain the bumpedMs field. topicsInCategories Finds the TopicIds of topics that have one of the categoryIds as their category. Useful to find out which topics a set of categories contains. For example to implement the `recheckTopicsInCategories` mirror option, or to locate topics for the initiative plugin.	2019-11-26 11:02:27 +01:00
Robin van Boven	564fd89b1e	Discourse: implement topicsBumpedSince in fetcher (#1457 )	2019-11-20 13:27:13 +01:00
Robin van Boven	51e3eb8c25	Discourse: add categoryDefinitionTopicIds to fetcher (#1456 )	2019-11-16 14:04:45 +01:00
Robin van Boven	623c362246	Discourse: make topicWithPosts fetch all posts (#1455 ) Previously it would only consider page 1. Now we're walking through all pages, as this is a much more effective way of discovering all posts.	2019-11-16 13:59:18 +01:00
Robin van Boven	23f1db6ce4	Discourse: add categoryId and bumpedMs to Topic data (#1454 ) As not all API calls return bumpedMs, make a new type to show the distinction.	2019-11-16 13:52:32 +01:00
Robin van Boven	e79cca6c6c	Discourse: add mirror options to 0.4.0 projects (#1451 ) N.b. this is an alternative to #1433, removing multi-server support for discourse.	2019-11-16 13:46:09 +01:00
Robin van Boven	8e693a942d	Discourse: CLI cleanup (#1448 ) - Remove username from help text. - Simplify projectId generation.	2019-11-15 14:19:08 +01:00
Robin van Boven	d6fb58bf2c	Discourse: split Mirror from MirrorRepository (#1432 )	2019-11-15 13:52:01 +01:00
Robin van Boven	28737cd4d2	Discourse: fetcher 404s for user actions as null (#1446 ) This is an alternative to solve #1440, taking my review comments from #1443, to narrow the error handling to just 404s from the server and crash on other errors.	2019-11-15 13:39:08 +01:00
Dandelion Mané	d34ef1cb42	Fix console warn issues in discourse mirror tests (#1444 ) @wchargin identified issues with the way we set up and reset the warning mocks in discourse/mirror.test.js. During testing, we found issues where an unexpected warning might not cause test failures, or an unexpected warning could break subsequent tests. This commit fixes both issues. Test plan: Besides the fact that `yarn test` passes, we've found that adding a single unexpected console.warn to a test will cause that test (and only that test) to fail. Paired with @wchargin	2019-11-11 19:20:53 -08:00
Dandelion Mané	aabeda2403	Make Discourse robust 404s on user actions (#1443 ) This fixes the non-recoverable error in #1440; namely SourceCred crashing when the Discourse server returns 404 for a user's actions. I'm not sure why this happens (maybe DB is in an inconsistent state?) but missing the likes for a particular user is less frustrating than not being able to load cred at all. I've also added a unit test which verifies this behavior; I've confirmed that before applying the fix, the test fails. Test plan: `yarn test`	2019-11-11 17:58:23 -08:00
William Chargin	c07a3fe208	deps: upgrade `prettier@1.19.1` (#1439 ) Summary: Most changes due to <https://github.com/prettier/prettier/pull/6694>. Generated with `yarn add prettier@1.19.1 && yarn prettify`. Test Plan: Running `yarn test` suffices. wchargin-branch: prettier-v1.19.1	2019-11-11 12:47:26 -08:00
Robin van Boven	64834c6874	Remove Discourse admin API key and user. (#1431 ) This removes all usage of and reference to the admin API key and username. Instead relying on anonymous access of the Discourse API. This enables anyone to deploy an instance with discourse support, and is much safer, since the admin API key isn't used for this purpose anymore. Once merged I would encourage revoking any admin API keys used in the past. The only notable remaining reference of the discourse username is in the project file. Which goes from 0.3.0 to 0.3.1 in a backwards-compatible way here, simply ignoring the username if present. For #1426 I'm expecting a 0.4.0 version, so this is to prevent having to change project files twice. Test plan: updated the snapshots to their latest anonymous versions. Ran yarn test and anonymous discourse loading from CLI numerous times.	2019-11-07 17:34:00 -08:00
William Chargin	ada9140663	Upgrade Flow to v0.111.0 (#1436 ) Summary: The Flow team fixed a lot of bugs related to object spreading recently. Some of these enable us to simplify our code (`generateGraphqlFlowTypes` and `mirror`). Some find new genuine errors. Others require suppressions in place of a larger change. Test Plan: Running `yarn flow` now passes. wchargin-branch: upgrade-flow-v0.111.0	2019-11-01 19:55:07 -07:00
Dandelion Mané	d47e6e28c0	legacy UI defaults to showing all users (#1430 ) This is basically a backport of #1371 to the legacy UI. Test plan: Manual inspection verifies it's doing the right thing. `yarn test` passes. Part of https://discourse.sourcecred.io/t/fixup-legacy-explorer/316	2019-10-28 23:55:53 -06:00
Dandelion Mané	dfc7ee8524	Show all plugins' types in legacy ui (#1429 ) This commit upgrades the legacy explorer to now properly include types from all loaded plugins, rather than just the GitHub plugin. This makes the legacy UI much more usable for inspecting SourceCred's own (multi-plugin) cred. Test plan: Manual inspection of the frontend. `yarn test` passes. Part of https://discourse.sourcecred.io/t/fixup-legacy-explorer/316	2019-10-28 23:51:33 -06:00
Dandelion Mané	3754cafb7d	legacy app state includes TimelineCred (#1428 ) By keeping the TimelineCred in state instead of the Graph, we can access the plugin information (and potentially other config) from TimelineCred. Note that the legacy app does still use old-style cred calculation (no time weighting). Test plan: `yarn test`. It's just a refactor. Part of https://discourse.sourcecred.io/t/fixup-legacy-explorer/316	2019-10-28 23:49:11 -06:00
Dandelion Mané	d896f73329	Discourse plugin now properly detects mentions (#1424 ) As suggested in #1420, heretofore the Discourse plugin wasn't actually picking up mentions. The issue is that the (thoroughly tested) mention detection logic assumed that mention urls took the form `$SERVERURL/u/$USERNAME`, but actually they are encoded as a relative link, as in `/u/$USERNAME`. As such, the logic was internally consistent but never detected any actual mentions! It's a good case study in the need for integration tests and not just unit tests. I've updated the code so we do have a proper integration test: references.test.js validates that a topic reference, post reference, and user mention are all properly detected in the real output from a Discourse topic. Test plan: `yarn test` passes; inspect updated snapshots and tests. Fixes #1420.	2019-10-25 15:01:39 -06:00
Dandelion Mané	4e0d884283	discourse: factor out snapshotTestUtil (#1423 ) I want to have the reference tests depend on real snapshotted data. Therefore, I'm factoring out the utilities for interacting with the snapshot data out of fetch.test.js and into snapshotTestUtil.js Test plan: `yarn test` still passes.	2019-10-25 14:58:36 -06:00
Dandelion Mané	eed115a995	Add to (and update) Discourse snapshots (#1422 ) I made a new [test post][1] which has references. The Discourse snapshots now include it, so we can give a realistic test of reference and mention detection. This will allow us to verify whether #1420 is affecting us, and fix it if so. Test plan: Commit was generated by running the snapshot updater. Other snapshots have been updated and look OK. `yarn test` passes. [1]: https://sourcecred-test.discourse.group/t/a-post-with-references/21	2019-10-24 11:28:16 -06:00
William Chargin	01bdb2e94a	mirror: remove unused helper functions (#1351 ) Summary: The functions `isSqlSafe` and `_nontransactionallyFindUnusedTableName` are unused, because we no longer need to dynamically generate SQL, and all operations are clearly safe by construction. Test Plan: That `yarn flow` passes suffices. wchargin-branch: mirror-prune-helpers	2019-10-19 18:14:40 -07:00
William Chargin	b0b911cec4	mirror: use fixed temp table for transitive deps (#1350 ) Summary: The Mirror module extraction code calculates the set of transitive dependencies and stores these results in a temporary table to avoid unnecessary marshalling between JavaScript and C. We originally chose the temporary table name dynamically, guaranteeing that it was unused. However, this is unnecessary: - The temporary table namespace is unique to each database connection, so we need only consider possible conflicts in the same connection. - A `Mirror` instance exercises exclusive ownership of its database connection, per its constructor docs, so we need only consider conflicts within this module. - Temporary tables are only used in the `extract` method, so we need only consider conflicts in this method. - The `extract` method makes no open calls nor recursive calls, and does not yield control back to the event loop, so only one stack frame can be in `extract` at any time. - The `extract` method itself only creates the temporary table once. Thus, the temporary table creation is safe. Furthermore, the failure mode is simply that we raise an exception and fail cleanly; there is no risk of data loss or corruption. This patch replaces the dynamically generated table name with a fixed name. On top of the work in #1313, this removes the last instance of SQL queries that are not compile-time constant expressions. Test Plan: Running `yarn unit -f graphql/mirror` suffices. wchargin-branch: mirror-fixed-temp-table	2019-10-19 18:12:59 -07:00
William Chargin	ebdd20b576	mirror: clean up references to “EAV” primitives (#1349 ) Summary: The migration is complete; only EAV primitives remain, so they shall be called simply “primitives”. See #1313 and adjacent commits for context. Test Plan: Running `git grep -iw eav` no longer returns any results. wchargin-branch: mirror-eav-prune-names	2019-10-19 18:09:24 -07:00
William Chargin	dbf22cdcfc	mirror: remove primitives test multiplexing logic (#1348 ) Summary: This logic now abstracts over only one implementation, and is no longer needed. Test Plan: That `yarn unit -f graphql/mirror` passes is sufficient. wchargin-branch: mirror-eav-prune-test-mux	2019-10-19 18:06:20 -07:00
William Chargin	0f52fb4c26	mirror: remove legacy tables (#1347 ) Summary: This data is now stored in EAV `primitives` table; see issue #1313 and adjacent commits for details. We simultaneously lift the restriction that GraphQL type and field names be SQL-safe identifiers, as it’s no longer necessary. Test Plan: Some test cases queried the legacy primitives tables to check properties about the database state. These queries have of course been removed; note that each such removed query was already accompanied by an equivalent query against the EAV `primitives` table. Note that `yarn test --full` still passes, and that when manually loading `sourcecred/example-github` the cache no longer has any of the legacy tables. wchargin-branch: mirror-eav-prune-tables	2019-10-19 18:02:22 -07:00
William Chargin	003efdffa7	mirror: remove legacy non-EAV `extract` (#1346 ) Test Plan: Existing tests suffice, retaining full coverage. wchargin-branch: mirror-eav-prune-extract	2019-10-19 17:58:17 -07:00
William Chargin	f577ae7c1e	identity: forbid underscores in GitHub logins (#1414 ) Summary: GitHub logins may not have underscores, because underscores are not valid characters in DNS labels. We already have a good-enough regular expression for validating GitHub usernames; this commit updates the alias parser to use that. Discourse usernames are more permissive than what is listed here, but we leave that unchanged for now. Test Plan: Unit tests updated. wchargin-branch: alias-no-underscore	2019-10-19 09:10:38 -07:00
William Chargin	28b25c2910	identity: require aliases to be anchored (#1413 ) Summary: All the documentation and tests seem to be assuming that aliases must be anchored: `github/torvalds`, but not `some github/torvalds stuff`. JavaScript regular expressions aren’t anchored by default; this commit adds explicit anchoring and adds tests. Test Plan: Unit tests added. wchargin-branch: alias-anchor	2019-10-19 09:06:09 -07:00
Dandelion Mané	b2943390dc	add discourse references to the graph (#1410 ) This commit modifies `discourse/createGraph` so that it finds all of the same-server Discourse references in Discourse posts, and creates appropriately typed references edges in response. The unit tests have been updated with cases for both references that should exist, and references that shouldn't (e.g. post index out of bounds, or a reference to the wrong server). Test plan: `yarn test --full` along with snapshot update. This is progress towards [Discourse reference and mention detection][1]. [1]: https://discourse.sourcecred.io/t/discourse-reference-mention-detection/270	2019-10-18 10:56:53 -06:00
Robin van Boven	e043347526	Support dashes in alias usernames. (#1412 )	2019-10-17 13:21:39 -06:00
Dandelion Mané	78c34b5a36	Parse Discourse references from hyperlinks (#1405 ) The `discourse/references` module now has a `linksToReferences` method which extracts the parsed Discourse references from an array of hyperlinks. The method is tested. Test plan: Unit tests added; `yarn test` passes. This is progress towards [Discourse reference and mention detection][1]. [1]: https://discourse.sourcecred.io/t/discourse-reference-mention-detection/270	2019-10-16 18:39:46 -06:00
Robin van Boven	00cc8b2a54	Expand the blacklist, found new type inconsistencies (#1407 ) - Bots being Users as a commit author - Orgs being Users on a reaction Repositories affected, check represents tested after patch: - [x] prettier/prettier - [x] lovell/sharp - [x] facebook/jest - [x] babel/babel-eslint - [x] recharts/recharts - [x] webpack-contrib/css-loader - [x] yannickcr/eslint-plugin-react - [x] vuejs/vuex - [x] chimurai/http-proxy-middleware - [x] sass/node-sass - [x] lodash/lodash - [x] vuejs/vue - [x] reacttraining/react-router - [x] axios/axios - [x] webpack/webpack-dev-middleware - [x] eslint/eslint - [x] webpack/webpack - [x] webpack/webpack-cli - [x] sinonjs/sinon - [x] neutrinojs/webpack-chain - [x] webpack/webpack-dev-server Found as part of https://github.com/teamopen-dev/sourcecred-stack-lookup Test after this patch: pending, it's a lot of data after the cache invalidated 😅	2019-10-15 08:37:24 -07:00
William Chargin	0380088af2	mirror: update implementation notes for EAV tables (#1345 ) Summary: The notes used to focus on the legacy implementation with a minor note about the EAV implementation; this change flips that relationship. Test Plan: None. wchargin-branch: mirror-eav-impl-notes	2019-10-12 11:36:16 -07:00
William Chargin	809fd23def	mirror: read from EAV tables by default (#1344 ) Summary: This flips the switch for all production `Mirror` reads to use the single `primitives` EAV table as their source of truth, rather than the legacy type-specific primitives tables. For context and design discussion, see issue #1313 and commits adjacent to this one. Test Plan: All relevant code paths are already tested (see test plans of commits adjacent to this one). Running `yarn test --full` passes. wchargin-branch: mirror-eav-flip	2019-10-12 11:28:55 -07:00
William Chargin	e5a77488de	mirror: add EAV reading to `extract`, behind flag (#1343 ) Summary: This completes the end-to-end EAV mode pipeline, but does not yet set it as default or use it in production. A note about indentation: we take care to avoid reindenting the entire block of `extract` test cases, which is over 900 lines long. As to the implementation code, reindenting the legacy type-specific primitives branch is not easily avoidable, but when we remove that branch we won’t have to reindent the EAV mode branch: we can replace its `if` block with two scope blocks (which is the right thing to do, anyway). Test Plan: We reuse existing tests, which suffice for full coverage in both implementation branches. Note that these tests cover the case of object types with no primitive fields (the `Feline` and `Socket` types), which are more likely to fail in a broken EAV implementation than in a broken type-specific primitives implementation due to deletion anomalies. To check that all relevant calls to `mirror.extract(…)` have been properly replaced with `extract(mirror, …)`, run yarn coverage -f graphql/mirror -t 'EAV primitives' and note that the “else” path of the `if (fullOptions.useEavPrimitives)` branch is not taken; then, run yarn coverage -f graphql/mirror -t 'legacy type-specific primitives' and note that the “if” path of the same branch is not taken. To check that the table hiding logic is working, invert the branch that checks `if (fullOptions.useEavPrimitives)`, and note that every test case using the table hiding logic fails (except for some of the error handling test cases, which do not actually need to read primitive data). Finally, `yarn test --full` passes after flipping the `useEavPrimitives` default to `true`. wchargin-branch: mirror-eav-extract	2019-10-12 11:23:35 -07:00
Dandelion Mané	e1a73ac368	refactor discourse createGraph (#1409 ) This is a minor refactor to re-organize the createGraph function in the Discourse plugin to use a class under the hood. Using a hidden class makes sense because there is a fair bit of shared state that's needed while creating the graph. The proximate cause for this refactor is that adding reference edges will bloat the `addPost` section of the function, which was already a little too complex. Simply shoving in more complexity would make it unwieldy. So I opted for this minor refactor. It's internal-only (no public APIs are changed). Test plan: `yarn test` passes. As noted, refactor is internal-only. This is progress towards [Discourse reference and mention detection][1]. [1]: https://discourse.sourcecred.io/t/discourse-reference-mention-detection/270	2019-10-11 13:46:49 -06:00
Dandelion Mané	d4804a7a68	Add edge types for Discourse references (#1406 ) Test plan: It's just a declaration change. `yarn flow` passes. This is progress towards [Discourse reference and mention detection][1]. [1]: https://discourse.sourcecred.io/t/discourse-reference-mention-detection/270	2019-10-11 13:46:35 -06:00
Dandelion Mané	eb008f40cc	discourse: factor out address module (#1404 ) This will make it possible to depend on addresses in the reference module. Test plan: `yarn test` passes. This is progress towards [Discourse reference and mention detection][1]. [1]: https://discourse.sourcecred.io/t/discourse-reference-mention-detection/270	2019-10-11 13:40:10 -06:00
Dandelion Mané	5e02a2caeb	Add logic for plucking hyperlinks from cooked html (#1403 ) This commit adds a `parseLinks` method to a new module, `plugins/discourse/references`. `parseLinks` allows us to extract the hyperlinks from `<a>` tags in "cooked" html. I added `htmlparser2` as a dependency to parse the html. There were a lot of options to choose from; I chose htmlparser2 because it has a lot of usage, reasonable performance, and suits our needs. We use this dependency in a lightweight and local way, so we can always change it later if needed. One thing which was a bit odd: I wasn't able to import it using `import`, and needed a `require` statement instead. Test plan: Unit tests added; `yarn test` passes. This is progress towards [Discourse reference and mention detection][1]. [1]: https://discourse.sourcecred.io/t/discourse-reference-mention-detection/270	2019-10-11 13:36:31 -06:00
Dandelion Mané	f82c1bfbbe	Add post contents to the Discourse mirror (#1402 ) This modifies the Discourse fetcher and mirror so that we now keep post contents around, thus enabling future reference detection (and other things). The post contents are stored and provided as retrieved from the API, which is in "cooked" HTML form. Test plan: Unit tests and snapshots updated. Observe that the snapshots now include Discourse post contents. This is progress towards [Discourse reference and mention detection][1]. [1]: https://discourse.sourcecred.io/t/discourse-reference-mention-detection/270	2019-10-11 13:31:01 -06:00
Dandelion Mané	026d3dc705	Upgrade flow to v109 (#1395 ) We need one tiny change in test code, where Flow (correctly) detects an error. I've added an error suppression comment because it is truly a Flow error, but is appropriate as we are testing an error condition. Test plan: `yarn test`	2019-10-03 10:41:51 -06:00
Dandelion Mané	64c17f7dba	Change default alpha to 0.2 (#1391 ) SourceCred is currently quite sensitive to inadvertent 'tight loops' in the cred, where (e.g.) one user receives cred but doesn't have many out edges, resulting in a feedback loop where that person gets disproportionate cred. See [1] and [2] for some examples. Per a [suggestion] from @mzargham, I'm going to bandaid this issue by increasing the alpha parameter; I've increased it 4x from 0.05 to 0.2. Subjectively, I think this improves the cred quality. [1]: https://discourse.sourcecred.io/t/sneak-peek-sourcecred-discourse-plugin/171 [2]: https://discourse.sourcecred.io/t/preliminary-credsperiment-cred/219 [suggestion]: https://discourse.sourcecred.io/t/preliminary-credsperiment-cred/219/16?u=decentralion	2019-09-30 10:49:25 -06:00
Dandelion Mané	6e2af1070f	Expose alpha in TimelineExplorer (#1390 ) This commit modifies the TimelineExplorer so that the user can both see the chosen alpha value, and change it. Alpha has a pretty profound impact on the final scores, and I want to tweak it for CredSperiment week two, so this is an important addition. Test plan: Modify the alpha, re-run cred calculation, and observe that the scores change. `yarn test` passes.	2019-09-30 10:33:15 -06:00
Dandelion Mané	54ece536d3	Integrate the identity plugin (#1385 ) This commit integrates the identity plugin, which was created in #1384. It does this by adding explicit identity fields to the project configuration, which are then applied when loading the graph in `api/load.js`. The actual integration is quite straightforward. Test plan: The underlying logic is thoroughly tested; I added one new test case to verify that it is integrated properly. Since the project compat has changed, I've updated all the snapshots. Prior to merging this PR, I will produce one "integration test", using this code to do identity resolution for a real project (i.e. on the SourceCred instance itself).	2019-09-20 12:08:27 +02:00
Dandelion Mané	9a9f211901	Add the identity plugin (#1384 ) This commit adds the new SourceCred identity plugin. As described in the README.md file: This folder contains the Identity plugin. Unlike most other plugins, the Identity plugin does not add any new contributions to the graph. Instead, it allows collapsing different user accounts together into a shared 'identity' node. To see why this is valuable, imagine that a contributor has an account on both GitHub and Discourse (potentially with a different username on each service). We would like to combine these two identities together, so that we can represent that user's combined cred properly. The Identity plugin enables this. Specifically, the instance maintainer can provide a (locally unique) username for the user, along with a list of aliases the user is known by, e.g. `github/username` and `discourse/other_username`. The aliases are simple string representations, that are intended to be easy to maintain by hand in a configuration file. Then, the identity plugin will provide a list of `NodeContraction`s that can be used by `Graph.contractNodes` to combine the user identities as described. The plugin is broken up into a few submodules: - `declaration.js` provides the PluginDeclaration. It has a single node type (the identity node). - `identity.js` declares the `Identity` type (a username and list of aliases), allows constructing identity nodes, and does some validation on the identity username. - `alias.js` implements the logic for parsing aliases like "github/decentralion" or "discourse/s_ben" into a node address. - `nodeContractions.js` provides logic for turning a list of Identities into a list of NodeContractions, suitable for use in `Graph.contractNodes`. The plugin is not yet integrated; that will come in a follow-on commit. Test plan: Unit tests added; `yarn test` passes.	2019-09-20 11:50:59 +02:00
Dandelion Mané	b86dcf742e	Make the Discourse plugin robust to errors (#1387 ) Currently attempting to load the SourceCred discourse instance fails with foreign key constraint errors. Basically, we have a few weird situations: - A post (which corresponds to the 'pseudo-topic' generated by creating a new category) is picked up, but its topic is not detected, because Discourse does not list these 'pseudo-topics' in the latest topic endpoint. Attempting to add the post breaks the foreign key constraint. - We have several likes which correspond to posts that don't exist. Possibly they were deleted? I'm not sure. Right now, the load process fails entirely when it hits these exceptions, which is bad. It should print a warning instead, and continue without the offending interactions. This commit effects that change in behavior. Test plan: Before this commit, loading the SourceCred discourse with a clean cache fails. After building with this commit, loading the SourceCred discourse with a clean cache works and prints the following warnings: ``` $ node bin/sourcecred.js discourse https://discourse.sourcecred.io credbot GO load-discourse.sourcecred.io GO discourse GO discourse/topics DONE discourse/topics: 3m 53s GO discourse/posts Warning: Encountered error 'FOREIGN KEY constraint failed' while adding post https://discourse.sourcecred.io/t/214/1. DONE discourse/posts: 2m 38s GO discourse/likes DONE discourse/likes: 50s DONE discourse: 7m 21s GO compute-cred DONE compute-cred: 547ms DONE load-discourse.sourcecred.io: 7m 22s ``` Also, unit tests have been added that verify the specific behavior changes.	2019-09-20 11:21:53 +02:00
Robin van Boven	d5d00aae5a	Blacklist techtribe org, thumbsup reaction (#1386 ) Fixes #1353 Tested manually by creating a docker image including the changes. Running the dev-preview @passbolt command until completion. (once hitting the github rate limit, once till #1354 happens) No more problematic interactions show up during load.	2019-09-20 11:20:14 +02:00
Robin van Boven	d6bbc939b2	Add more bots. (#1383 ) Fixes #1381	2019-09-19 17:52:20 +02:00
Dandelion Mané	8f46d7d812	Fix bug when selecting "All users" in explorer (#1388 ) This fixes a bug introduced in #1371, where selecting a type other than "All users" and then trying to reselect "All users" would break the UI. Test plan: Manual inspection; load an instance, try selecting a different type, and then go back to "All users". It now works as expected.	2019-09-19 14:01:17 +02:00
Dandelion Mané	007568d3f0	Add `sourcecred discourse` command (#1374 ) This adds a new command, `discourse`, which makes it convenient to load Discourse servers as standalone SourceCred projects. For example, you could load the official SourceCred discourse via the following: ```sh export SOURCECRED_DISCOURSE_KEY=.... yarn backend node bin/sourcecred.js discourse https://discourse.sourcecred.io credbot yarn start ``` I've updated the README with instructions for using the plugin. Test plan: No automated testing because I see this tool as a temporary placeholder until we get the SourceCred instances set up. I manually tested the error cases (e.g. providing an invalid server url) as well as success cases like the one above. I validated that the weights file argument is being interpreted correctly (i.e. trying to load invalid weights produces an expected error message, loading valid weights results in those weights being present in the UI).	2019-09-19 12:32:49 +02:00
Dandelion Mané	1449935651	GitHub plugin: Expose user addresses (#1382 ) Allow getting the node address for a user, given the user's login. This will be needed by the upcoming identity plugin. If the login in question corresponds to a bot, then a bot address will be returned. When we make the bot-set configurable (rather than hardcoded), we'll need to change the signature of this function; I think that's fine. Test plan: Unit tests added. (Also, it's really simple.)	2019-09-18 14:50:52 +02:00
Dandelion Mané	ac8ac7051f	add `Graph.contractNodes` (#1380 ) This commit adds Graph.contractNodes, which allows collapsing certain nodes in the graph into each other. This will enable the creation of a SourceCred "identity" plugin, allowing identity resolution between users' different accounts on different services. Test plan: Thorough unit tests have been added. `yarn test` passes. Thanks to @wchargin for [review feedback][1] which significantly improved this API. [1]: https://github.com/sourcecred/sourcecred/pull/1380#discussion_r324958055	2019-09-18 13:59:49 +02:00
William Chargin	ddf07c6714	Replace `PartialTimelineCredParams` with `$Shape` (#1379 ) Summary: Flow provides a utility type for this purpose; there’s no need to implement, document, and keep it in sync ourselves: <https://flow.org/en/docs/types/utilities/#toc-shape> Test Plan: As written, `yarn flow` passes. Changing the definition of `params` on line 77 of `load.test.js` to add a key `foo: "wat"` or change the value of `weights` to `{hmm: "hmm"}` yield appropriate type errors. wchargin-branch: use-shape	2019-09-16 19:22:35 -07:00
William Chargin	3cb22565e5	mirror: update EAV primitives (#1342 ) Summary: This commit modifies `_updateOwnData` to write to both the old type-specific primitives tables as well as the new EAV table. This establishes the invariant that a node with non-null `last_update` will always have primitive data (if its object type has primitive fields). Test Plan: Existing tests expanded. Commenting out each of the `updateEavPrimitive` calls (independently) causes a test to fail. Note that every test that queries an internal `primitives_*` table to inspect the database state has been expanded to make an equivalent query against the `primitives` table as well. wchargin-branch: mirror-eav-update	2019-09-14 17:28:09 -07:00
William Chargin	463f3a073a	mirror: initialize EAV primitives at registration (#1341 ) Summary: This establishes the invariant that every object in the `objects` table has all relevant rows in the `primitives` table, though those rows’ values are never yet set. Test Plan: Unit tests updated. Manually loading `sourcecred/example-github` and running `.dump primitives` generates reasonable-looking output, with lots of rows, including entries for nested fields and eggs. Verified that the set of non-`id` columns on `Issue` equals the set of values for the `fieldname` column of an `Issue` object, and likewise for `Commit`s, thus covering each kind of field. wchargin-branch: mirror-eav-init	2019-09-14 17:24:58 -07:00
William Chargin	0418dfe9dd	mirror: add `primitives` table for EAV migration (#1340 ) Summary: See #1313 for context. The plan is to set up dual-writes with `extract` calls still reading from the old tables until the new ones are complete and tested. The primary risk to production would be a fatal exception in the new write paths, which seems like an acceptable risk. Test Plan: Unit tests pass. wchargin-branch: mirror-eav-schema	2019-09-14 17:21:42 -07:00
William Chargin	976afb6665	mirror: test `registerObject` with nested fields (#1339 ) Summary: Prior to this commit, removing the `addLink.run({id, fieldname})` on line 487 of `mirror.js` would cause test failures down the pipeline, but not at the root cause. Such an error is now caught earlier. Test Plan: Comment out line 487 of `mirror.js` and observe that the newly added test case fails, but the other `registerObject` test cases do not. wchargin-branch: mirror-test-registerobject-nested	2019-09-14 17:16:24 -07:00
Dandelion Mané	c58315fe4d	Hackily add support for mixed GitHub/Discourse projects (#1378 ) For phase one of the CredSperiment, I need a SourceCred instance which combines GitHub and Discourse servers. I'll also need to be able to give it very specific configuration to collapse certain user identities together. Shortly after launching the CredSperiment, I plan to come back and totally re-write SourceCred's command line interface and site building system, in a way that will throw away most of the existing codebase. As such, I found it expedient to add rather hacky and untested support for loading combined GitHub/Discourse instances, so I can land the promised features. This PR does so by: - adding sourcecred gen-project for constructing project.json files - adding sourcecred load --project for loading a project.json file - ensuring that load provides the right plugins based on the project that's in scope - updating build_static_site so that it can use the new --project flag Test plan: I have done some end-to-end testing, but the overall commit stack lacks automated testing. This is a deliberate tradeoff: I'm planning to re-write this section of the codebase, and the testing ergonomics are not great, so I'd rather accept some technical debt, especially since I plan to pay it off soon. See the pull request on GitHub for the individual constituent commits.	2019-09-12 17:35:21 +02:00
Dandelion Mané	7a0dd49b42	factor loadWeights into Common (#1377 ) As suggested by @Beanow in [a review comment][1], this commit factors loading weights from disk into a cli/common utility method. The actual method is really generic, and we have a number of similar constructions across the codebase (grep for `JSON.parse` to find them). I considered factoring out a generic utility for loading and deserializing JSON data from disk in general, but it didn't seem valuable enough at this time. Test plan: Unit tests added, existing tests pass. [1]: https://github.com/sourcecred/sourcecred/pull/1374#discussion_r323149740	2019-09-12 15:55:05 +02:00
Dandelion Mané	0a0010f38e	Share default TimelineCredParameters (#1376 ) At present, every place in the codebase that needs TimelineCredParameters constructs them ad-hoc, meaning we don't have any shared defaults across different consumers. This commit adds a new type, `PartialTimelineCredParameters`, which is basically `TimelineCredParameters` with every field marked optional. Callers can then choose to override any fields where they want non-default values. A new internal `partialParams` function promotes these partial parameters to full parameters. All the public interfaces for using params (namely, `TimelineCred.compute` and `TimelineCred.reanalyze`) now accept optional partial params. If the params are not specified, default values are used; if partial params are provided, all the explicitly provided values are used, and unspecified values are initialized to default values. Test plan: A simple unit test was added to ensure that weights overrides work as intended. `git grep "intervalDecay: "` reveals that there are no other explicit parameter constructions in the codebase. All existing unit tests pass.	2019-09-12 15:21:13 +02:00
Dandelion Mané	def1fef192	Factor TimelineCredParameters into new module (#1375 ) The `timelineCred.js` file is a bit of a beast. One way to start slimming it down is to pull the parameters into their own file. This is especially helpful as I'm planning a follow-on PR that will colocate the default parameter values with their declaration. The naming of everything in the `/timeline/` subdirectory is a bit wonky: it reflects that at the time of creation, "Timeline" designated an experimental version of SourceCred. Now, it is becoming canonical, but the cumbersome naming persists. I haven't made any effort to tackle the name debt here. Test plan: `yarn test` passes; since this is merely a code reorganization, this gives me great confidence that the change is correct. I also added a few small tests to the new module. Although the behavior in question is already tested, I think setting up test files liberally is a good practice, as the existence of the test file invites the creation of more tests.	2019-09-12 15:12:17 +02:00
Dandelion Mané	e1b9b07cac	group explorer types by plugin (#1373 ) Now that we're adding support for the Discourse plugin, we'll start having >1 plugin present in the frontend again. As such, we should provide clear grouping of types in the frontend so that it's possible to distinguish between a GitHub user and a Discourse user. This commit does just that, by resurrecting code that we used when the GitHub and Git plugins co-existed in the frontend. Test plan: Launch the frontend and observe that node types in the filter selection dropdown are grouped by the name of their plugin. Also, clicking on the name of a plugin should filter to all nodes from that plugin.	2019-09-11 02:28:42 +02:00
Dandelion Mané	093955dea1	scores command no longer assumes GitHub plugin (#1372 ) Previously, the `sourcecred scores` command assumed that all users are GitHub users, and assigned users an id based on their GitHub login. Now, the command returns information on all users, regardless of which plugin provided them. As such, we need to identify users differently. Instead of a string id, they now have an array of address parts. That array contains all of the parts of their corresponding node address. For example, the GitHub user `@Beanow` would correspond to the address array `["sourcecred", "github", "USERLIKE", "USER", "Beanow"]` As a general convention, the first two components of any node's address contain information about the plugin that owns that node. The first component is the owner of the plugin, and the second is the name of the plugin. Afterwards, the plugin may represent nodes in whatever manner it sees fit. Thanks to @Beanow and @vsoch for some feedback and discussion on this design. Test plan: Snapshots have been updated. `yarn test` passes.	2019-09-10 23:49:45 +02:00
Dandelion Mané	b3ffd3758b	TimelineExplorer defaults to showing all users (#1371 ) Now instead of always defaulting to GitHub users, it shows all user-typed nodes. This will make SourceCred work non-hackily when there is e.g. just a Discourse plugin in scope. I also fixed an issue where it was loading the GitHub declaration in a hardcoded way, instead of properly getting it from the TimelineCred's plugin array. Test plan: Manual UI inspection.	2019-09-10 22:50:39 +02:00
Dandelion Mané	8de57fdb7b	add TimelineCred.userNodes (#1369 ) This is a convenience method that extracts cred for all the user-typed nodes. It's basically an abstraction over calling `credSortedNodes` with the right set of prefixes. I foresee using it in at least two places (score retrieval in the CLI and score display in the frontend) so I decided to make it a method. Test plan: A very simple unit test was added. (It's a very simple wrapper function.)	2019-09-10 20:02:28 +02:00
Dandelion Mané	1079f5ec86	timelineCred.credSortedNodes takes prefixes (#1368 ) This lets us filter by a group of prefixes simultaneously, which enables e.g. seeing all user node types at once. I also tweaked the API to make it a bit more convenient, you can now pass no arguments and get all nodes in sorted order. Test plan: Unit tests updated.	2019-09-10 19:44:03 +02:00
Dandelion Mané	65f22a0a74	Replace TimelineCredConfig with array of plugins (#1367 ) The PluginDeclaration has all of the information we need to configure TimelineCred: it knows all the node and edge types, as well as which node types are user (or scoring) node types. Therefore, we can replace the ad-hoc config object with a simple array of plugin declarations. Since the plugins will be saved as part of the TimelineCred, it means the UI can configure to only show information for plugins that are actually in scope. Test plan: `yarn test` passes, and the prototype still works. Snapshots updated.	2019-09-10 19:36:12 +02:00
Dandelion Mané	dcf4010ff0	discourse: fix fetch failure on 410 (#1366 ) When a post or topic is deleted, Discourse fetch will give status 410. As with 404 and 403, we should just ignore the post and move on. I took the opportunity to slightly refactor the fetch error handling while I was there. Test plan: Previously, doing a load on the SourceCred discourse instance would fail due to a deleted topic. Now, it doesn't.	2019-09-10 19:13:13 +02:00
Dandelion Mané	aecd2864bf	Let plugins specify user types (#1365 ) This modifies the pluginDeclaration so that it can specify user node types. This will allow us to replace the TimelineCredConfig type with a plugin collection instead. It's expected that the user types will also be present in the node types, although this isn't validated anywhere at present. Test plan: `yarn flow`.	2019-09-10 19:09:01 +02:00
Dandelion Mané	dbb31a586c	Capitalize Discourse plugin name (#1364 ) This ensures consistency with GitHub, and will allow us to use plugin names in the UI. Test plan: Not needed, trivial change.	2019-09-10 19:06:05 +02:00
Dandelion Mané	e2e6c56650	Enable multiple scoring node types (#1361 ) This updates the cred computation logic so that we can have multiple "scoring node types". Context: Currently, we designate a single node type (GitHub users) as the scoring node type, and normalize so that all users have 1000 score in total. This commit updates the pipeline to admit using more than one prefix for scoring, meaning that we could have GitHub users, Discourse users, and more, and still have all users sum to 1000 score. We will still need to update the frontend so that it will have a user pane which aggregates across all users. Test plan: Unit tests updated. `yarn test` passes.	2019-09-10 19:05:46 +02:00
William Chargin	0d7db99d7f	Blacklist `@allcontributors` bot (#1363 ) Summary: This adds `MDM6Qm90NDY0NDczMjE=` (`@allcontributors`) to the blacklist to enable loading the `aragon/aragon` repository. See #1362 and #996 for context. Test Plan: Running `node ./bin/sourcecred.js load aragon/aragon` on a clean cache now completes successfully. wchargin-branch: blacklist-allcontributors	2019-09-10 08:55:16 -07:00
Dandelion Mané	545b084146	Change TimelineCred filtering strategy (#1358 ) This changes how TimelineCred filtering works. Instead of using the filterTimelineCred module, which includes all nodes matching filterPrefixes, we now take all nodes matching scorePrefixes and additionally the top `k` nodes for every other type. This ensures that we will have the top comments, pull requests, issues, etc in the UI, without needing to take every single comment or PR or issue. Concurrently, the UI is updated so that every type is included in the filter dropdown. CHANGELOG has been updated, since this is user facing. Test plan: `yarn test` passes, snapshots are updated, and I also tested the UI manually.	2019-09-08 00:32:10 +02:00
Dandelion Mané	f31a92874b	hide `filterTimelineCred` (#1357 ) TimelineCred computation is implemented as follows: - Compute Distribution - Filter it down to specified node types - Wrap the filtered results into a TimelineCred I want to change how the filtering works. The new filtering logic will depend on logic we've already implemented in TimelineCred; therefore filtering should be done on the TimelineCred object and not separately. Specifically, I want to be able to filter down to the highest-scored nodes by type (dependent on the type). As a first step, I've refactored the interface to TimelineCred so that the filtering is an implementation detail, i.e. the TimelineCred constructor doesn't expect objects defined in `filterTimelineCred`. Test plan: `yarn test` passes after a snapshot update.	2019-09-08 00:20:34 +02:00
Dandelion Mané	5996dd710a	timeline cred config is stored in JSON (#1356 ) This modifies the TimelineCred serialization so that it includes the CredConfig in the JSON. This means that it's easier to coordinate which plugins and types are in scope, as the data itself can contain that information. Rather than define a new hand-rolled serializer, I just passed the config directly through for stringification. Unit tests verify that this still works (round-trip serialization is tested). As an added sanity check, I generated a new small `cred.json`, and inspected the file via `cat` to ensure that it's still legible text, and isn't interpreted as a binary file due to the `NUL` bytes in node addresses. Every client that previously depended on the `DEFAULT_CRED_CONFIG` now properly gets its cred configuration from the JSON. Test plan: Unit tests for serialization already exist. Generated a fresh `cred.json` file and tested the frontend with it. Also, `yarn test --full` passes.	2019-09-08 00:04:01 +02:00
William Chargin	5bcec38e5b	Blacklist more problematic quasar interactions (#1335 ) Summary: Context: <https://github.com/sourcecred/sourcecred/issues/1256#issuecomment-526252852> Without also blacklisting the reaction, we hit an invariant violation in the relational view (reactions are expected to have exactly one author). Test Plan: Running `node ./bin/sourcecred.js load quasarframework/quasar-cli` now completes successfully (in about 2 minutes 40 seconds). It does emit a warning: ``` Issue[MDU6SXNzdWUzNDg0NjUzNDg=].reactions: unexpected null value ``` …because one of the reactions was blacklisted. But the relational view handles this correctly, it seems: timeline cred is still computed and renders without obvious error. wchargin-branch: blacklist-more-quasar	2019-09-02 08:18:36 -07:00
William Chargin	7d3d24e0ec	mirror: guess typenames and warn on mismatch (#1337 ) Summary: The format of GitHub’s GraphQL object IDs is explicitly opaque, and so we must not introspect them in any way that would influence our results. But it seems reasonable to introspect these IDs solely for diagnostic purposes, enabling us to proactively detect GitHub’s contract violations while we still have useful information about the root cause. This commit adds an optional `guessTypename` option to the Mirror constructor, which accepts a function that attempts to guess an object’s typename based on its ID. If the guess differs from what the server claims, we continue on as before, but emit a console warning to help diagnose the issue more quickly. Resolves #1336. See that issue for details. Test Plan: Unit tests for `mirror.js` updated, retaining full coverage. To test manually, revert #1335, then load `quasarframework/quasar-cli`. Note that it emits the following warning before failing: > Warning: when setting Reaction["MDg6UmVhY3Rpb24zNDUxNjA2MQ=="].user: > object "MDEyOk9yZ2FuaXphdGlvbjQzMDkzODIw" looks like it should have > type "Organization", but the server claims that it has type "User" Unit tests for the GitHub typename guesser added as well. Running `yarn test --full` passes. wchargin-branch: mirror-guess-typenames	2019-09-01 01:04:53 -07:00
William Chargin	ae8ab0d1bd	Check typesafety of `NullUtil.filterList` (#1328 ) Summary: The current implementation of `NullUtil.filterList` uses an `any`-cast. This is fine as long as the definition is actually typesafe; we should take at least a little care to ensure that it is. This commit adds a typesafe version, commented out but still typechecked, and refines the type around the `any`-cast to make the cast slightly more robust. Test Plan: Note that changing `$ReadOnlyArray<?T>` to `$ReadOnlyArray<?T | number>` in the declaration of `filterList` caused no Flow error prior to this commit, but now causes one. wchargin-branch: filter-list-typecheck	2019-08-26 10:35:08 -07:00
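
As a rough illustration of the alias format described in the identity plugin commit (#1384), the sketch below parses strings such as `github/decentralion` into an address array. It is not the plugin's `alias.js`: the function name `parseAlias`, the set of accepted prefixes, and the Discourse address layout are assumptions made for this example. Only the GitHub layout follows the array quoted in the `scores` commit (#1372), and bot logins (#1382) are not handled.

```javascript
// Sketch only. `parseAlias` and KNOWN_PREFIXES are hypothetical names,
// not the identity plugin's actual API.
const KNOWN_PREFIXES = new Set(["github", "discourse"]);

function parseAlias(alias) {
  // An alias is "<prefix>/<name>"; split on the first slash only.
  const slash = alias.indexOf("/");
  if (slash === -1) {
    throw new Error(`Invalid alias (expected "prefix/name"): ${alias}`);
  }
  const prefix = alias.slice(0, slash);
  const name = alias.slice(slash + 1);
  if (!KNOWN_PREFIXES.has(prefix)) {
    throw new Error(`Unknown alias prefix: ${prefix}`);
  }
  if (name.length === 0) {
    throw new Error(`Alias has an empty name: ${alias}`);
  }
  if (prefix === "github") {
    // Layout taken from the address array quoted in #1372.
    return ["sourcecred", "github", "USERLIKE", "USER", name];
  }
  // Assumed layout: the commit messages do not show Discourse addresses.
  return ["sourcecred", "discourse", "user", name];
}

console.log(parseAlias("github/Beanow"));
```

By the convention stated in #1372, the first two components of the result identify the plugin owner and plugin name; everything after that is up to the plugin.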

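The defaulting behavior described in #1376 can be sketched as follows. This is a minimal illustration, not the project's code: the `DEFAULT_PARAMS` values and the empty `weights` object are placeholders, and only the field names `intervalDecay` and `weights` and the function name `partialParams` come from the commit messages above. Note that #1379 later replaced the hand-written partial type with Flow's `$Shape`.

```javascript
// Placeholder defaults; the real values live in the SourceCred codebase.
const DEFAULT_PARAMS = Object.freeze({
  intervalDecay: 0.5,
  weights: Object.freeze({}),
});

// Promote partial parameters to full parameters: explicitly provided
// values win, everything else falls back to the shared defaults.
function partialParams(partial) {
  const full = {...DEFAULT_PARAMS};
  for (const [key, value] of Object.entries(partial || {})) {
    // Treat an explicit `undefined` the same as an omitted field.
    if (value !== undefined) {
      full[key] = value;
    }
  }
  return full;
}

console.log(partialParams());
console.log(partialParams({intervalDecay: 0.9}));
```

Keeping the defaults in one frozen object means each caller states only what it wants to change, which is the point of the commit: no consumer needs to construct a full parameter set ad hoc.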

987 Commits