Commit Graph

1286 Commits

Author SHA1 Message Date
Dandelion Mané e97a1209d0
Remove TimelineCred.reduceSize (#1633)
TimelineCred has a `reduceSize` method which discards cred for most
nodes, keeping only cred for the top nodes of each type across all time.
I've wanted to remove this for a while, because it is a bad fit for the
kind of experimentation we're starting to do with showing the top nodes
for recent activity periods. Since the recent nodes haven't had much
time to accumulate cred, they are almost all discarded by reduceSize. As
an added inducement, I want to get rid of reduceSize for #1557 because
it requires type information.

`reduceSize` still serves one function, which is enabling the frontend
to load faster because it only loads a smaller amount of data which is
discoverable in the UI. However, it doesn't make sense to discard most
of the data just for a fast UI load--we can later make another data
structure which is tuned for the needs of the frontend, and have that
data structure include only summary statistics.

This will make the cred.json file much larger for large repos, e.g. I
expect that loading tensorflow/tensorflow would now go over the 100MB
hard cap for GitHub pages. However, right now we're prioritizing using
SourceCred for medium-small projects (e.g. SourceCred itself, and maybe
Maker), so this isn't a current concern.

Test plan: `yarn test --full` passes.
2020-02-05 10:46:17 -08:00
Dandelion Mané 378255e91d
Fix console errors from state.test.js (#1628)
This commit fixes an issue introduced in #1625, which caused the
`state.test.js` file to print some unhandled console errors when
running `yarn unit`.

First, this commit changes the file so that it properly errors if any
unexpected console errors are printed. Then it fixes the erroring tests.

Test plan: `yarn test` passes; `yarn unit --watch legacy/state.test.js`
no longer prints any error messages to console.
2020-02-05 10:45:35 -08:00
Robin van Boven 9cf412c437
Github: remove unused loaders (#1635) 2020-02-05 19:28:26 +01:00
Robin van Boven 3c971ebaef
Discourse: remove unused loaders (#1634) 2020-02-05 19:21:14 +01:00
Robin van Boven e48902cfc9
Remove deprecated load function (#1630)
One of several cleanup commits. See #1629.

The previously created loadContext (see #1622) function will be used
instead and renamed to be load.

We're removing significant amounts of test code as well. The individual
components that make up the replacement function already provide the
needed coverage, like `dataDirectory.test.js` and `loadContext.test.js`.
The integration provided by this function is also covered through the
sharness `test_load_example_github.t`.
2020-02-05 11:27:43 +01:00
Robin van Boven 361961c59f
Switch to LoadContext based loading (#1622)
This naming is temporary, as the old loading code is removed it
will be named load and replace the existing function.
2020-02-04 22:28:24 +01:00
Robin van Boven f3cfab636e
Backend: implement LoadContext (#1621)
Follows the general outline of #1586.
It uses a new trick of aliasing external module functions as
private properties. This makes the spyOn / mock tests more robust,
while fitting in the composition responsibility.
2020-02-04 22:14:21 +01:00
Robin van Boven 5fd43bf62d
Backend: implement PluginLoaders.contractPluginGraphs (#1619)
Note: this doesn't include a WeightedGraph.overrideWeights step.
Because overriding weights isn't related to plugins, this will be
handled as a separate feature, later in the load pipeline.
2020-02-04 21:08:51 +01:00
Robin van Boven ffd6c38e4c
Backend: implement PluginLoaders.createPluginGraphs (#1618)
Similar to CachedProject, we're using an opaque PluginGraphs return
type. Because only PluginLoaders can add the semantic of this being
all plugin graphs, and to use this semantic in future functions.
2020-02-04 21:02:47 +01:00
Robin van Boven 4c53558c65
Backend: implement PluginLoaders.updateMirror (#1617)
Note, the return type is a CachedProject. See #1586 for discussion.
Having this type allows us to create new functions with a semantic
of requiring the project is mirrored into cache. It is opaque,
because only the "all plugins" semantic which PluginsLoaders has
could know when mirroring of a Project has been completed.

Additionally MirrorEnv is not a strict type. We're expecting this
to be a subset of parameters. We'll use Flow to ensure we only use
the ones we need from it.
2020-02-04 20:06:50 +01:00
Dandelion Mané 1073374dc7
Frontend gets plugins from disk, not TimelineCred (#1625)
This commit modifies the frontend so that it now pulls plugin
declarations from disk, rather than from the TimelineCred. This will
allow us to decouple the TimelineCred from the PluginDeclarations, which
is another step towards #1557.

As proof that the frontend no longer gets plugins from the TimelineCred,
I removed the public `plugins()` method on TimelineCred.

Test plan: The frontend is somewhat sketchily tested. `yarn test`
passing is good, manual inspection of the frontend is also necessary;
I've done this.
2020-02-04 10:44:42 -08:00
Dandelion Mané aeaa945a27
api/load: save plugin declarations to disk (#1624)
This builds on #1623 and is another step towards separating cred
computation from plugin declarations, as described in #1557. Basically,
this will allow the frontend to get plugin declarations even if the
TimelineCred computation never saw them.

This commit modifies `api/load`, and adds a new facility to
`DataDirectory` for saving the PluginDeclarations (which will be used by
@Beanow's in-flight refactor of `api/load`).

Test plan: See included unit tests, also try loading a project and
inspect the newlys saved file.
2020-02-04 10:30:14 -08:00
Robin van Boven 806919a6cc
Backend: add ComputeFunction (#1620)
Similar to PluginLoaders, this accepts an interface like
TimelineCred.compute. During a load, we have the added
responsiblity of using the TaskReporter.

This lets us mock the concrete TimelineCred.compute, while testing
just the extra functionality. The responsiblity to compose this
with the concrete TimelineCred.compute lies with LoadContext.
2020-02-04 12:45:48 +01:00
Robin van Boven 492f2ff6b4
Github: add fetchGithubRepoFromCache (#1569)
Updating the mirror and using the mirror data should be separated.
As unified reference detection requires the mirror of all plugins are
updated. Upcoming refactors to the load system will solidify this.

While this refactor is ongoing, we will ignore the fetchGithubRepo
return value, using it as a mirror update only. And use this new
function to extract the data as needed in other loading steps.
2020-02-04 12:09:54 +01:00
William Chargin 115740e7fe
mirror: remove docs about legacy `extract` options (#1612)
Summary:
The `options` argument was introduced during the EAV table refactor and
dropped once that was complete, so we can remove the docs.

Test Plan:
None.

wchargin-branch: mirror-extract-no-options
2020-02-03 18:39:35 -08:00
William Chargin ba9cc43a39
mirror: add default typename-mismatch error message (#1610)
Summary:
Previously, `_nontransactionallyRegisterObject` differed from its
counterpart `registerObject` in two ways: the former does not enter a
transaction, but it also requires a function to format an error message.
The value provided by `registerObject` to the helper is a suitable
default, so we can use it to decouple the transaction semantics from the
error message.

Test Plan:
Existing tests suffice, retaining full coverage.

wchargin-branch: mirror-default-typename-message
2020-02-03 17:50:30 -08:00
Dandelion Mané 9539a18a67
Add to/fromJSON for plugin declarations (#1623)
This commit adds a simple method for (de-)serializing arrays of
PluginDeclarations. This will allow us to save PluginDeclarations for
consumption by the frontend, without having them bundled with cred
results in TimelineCred. Thus, we can simplify and clean up TimelineCred
as described in #1557.

Test plan: Inspect unit tests and snapshots; `yarn test` passes.
2020-02-03 16:16:53 -08:00
Robin van Boven 8838f856ab
Backend: implement PluginLoaders.declarations (#1616)
This is the first of several commits to create the PluginLoaders
abstraction. Using this allows us to define "for all plugins"
semantics, while keeping each underlying plugin interface flexible.
2020-02-04 00:02:54 +01:00
Robin van Boven ca63ea00fb
Github: switch to CacheProvider for fetchGithubRepo (#1614)
Delegating to a CacheProvider instance, will limit the number
of places where we need to handle filesystem details. It will
also allow a mock, or in-memory cache to be provided.
2020-02-03 23:46:03 +01:00
Robin van Boven 6904621646
Backend: implement MemoryCacheProvider (#1613)
While so far tests haven't required this (by mocking), there are
scripts like plugins/github/bin/fetchAndPrintGithubRepo.js which
create a tmp directory as a single-use cache. Here we'll be able
to use the MemoryCacheProvider instead.

In future PRs we'll switch to an API which only accepts CacheProviders.
So having MemoryCacheProvider in place will provide a good alternative
to creating tmp directories.
2020-02-03 23:36:58 +01:00
greenkeeper[bot] f22b6a539f
Update eslint-plugin-import to the latest version 🚀 (#1611)
* chore(package): update eslint-plugin-import to version 2.20.1

* chore(package): update lockfile yarn.lock
2020-02-03 12:56:12 -08:00
William Chargin 0aff3081ec
mirror: remove unnecessary shadowing (#1609)
Summary:
A common table expression shadows a (main or temporary) table name for
`SELECT` statements, but doing so is confusing and makes the code harder
to read.

Test Plan:
Existing unit tests suffice.

wchargin-branch: mirror-no-cte-shadow
2020-02-03 08:17:58 -08:00
Dandelion Mané 5de027e83d
Move weights out of TimelineCredParams (#1607)
This commit moves weights out of the "parameters" to TimelineCred. This
makes sense, because the Weights are now passed to TimelineCred via the
included WeightedGraph. As such, we now have the `api/load` options
include explicit Weights that are used as overrides, rather than having
them be included in the TimelineCred parameters.

Test plan: I've manually tested this commit by:
- Changing weights in the explorer, and verifying that the `recalculate
cred` button activates as expected, and the new weights are used
correctly in the resultant distribution.
- Verifying that downloading weights form the UI still works.
- Verified that uploading weights to the UI still works.
- Verifying that passing command-line weights files still works.

Also, `yarn test` passes.
2020-01-30 16:20:31 -08:00
Dandelion Mané eb47465421
Add WeightedGraph.overrideWeights (#1606)
In a few occasions in the codebase, we need the ability to take a
WeightedGraph and apply manual user overrides to its weights (keeping
the base weights wherever non-conflicting). It's actually a fairly
simple application of Weights.merge, but since it's of general utility
I'm adding it to the WeightedGraph API.

Test plan: I've added unit tests that validate its behavior; take a
look. `yarn test` passes.
2020-01-30 15:47:39 -08:00
Dandelion Mané f557af9020
Convert load pipeline to pass WeightedGraphs (#1605)
This commit changes `api/load` and downstream consumers to use
WeightedGraphs instead of regular Graphs. In addition to `api/load`, we
also modify the frontends and the timeline cred calculation module.

However, we don't yet _use_ the weights from the WeightedGraph. So as to
make this commit easier to review, it only changes the data type being
passed around; however in practice the consumers ignore the weights and
simply use the underlying graph. A followon commit will modify the
consumers so that they properly retrieve weights from within the
WeightedGraph.

This is a major step towards #1557.

Test plan:

`yarn test --full` passes; manual testing verifies that the frontend
still displays cred properly, and that modifying the weights and
re-calculating shows that the weights are being used properly.
2020-01-30 15:42:23 -08:00
Dandelion Mané b5be554d63
add api/loadWeightedGraph (#1604)
This adds a new module the api directory which loads a combined
WeightedGraph across all available plugins. This is intended as a key
piece of a future, less-tightly-coupled load pipeline which will produce
WeightedGraphs, as required by #1557.

Test plan:
The "clean" logic (combining graphs, applying transformations,
overriding weights) is tested explicitly. The "unclean" logic, which
involves directly generating graphs from Discourse/GitHub, are untested.
Arguably we could test with mocks, I'm dubious that doing so would add
real value. I think most of the potential issues (especially
refactoring-induced issues) would get caught by Flow. This is also one
of those "works perfectly or is totally broken" type situations. (Thus,
the likelihood of costly "subtle failures" is low.)
2020-01-30 15:22:57 -08:00
Dandelion Mané 1dd7e7a3c3
add `loadWeightedGraph` modules for plugins (#1602)
This commit adds `loadWeightedGraph` modules for both the GitHub and
Discourse plugins. They will replacing existing (and inconsistently
named) modules which load regular graphs. In addition to loading
the underlying graph, they set weights according to the plugins'
default type-level weights.

The soon-to-be-replaced modules have been marked deprecated.

Another small and vital step towards #1557.

Test plan: The functions that these functions replace are not tested,
because they are IO-heavy composition methods which are painful to test
themselves, and directly depend on well-tested behavior. For the same
reason, no unit tests have been added. Given the nature of the methods
in question, it's unlikely that they'll be sublty broken.
2020-01-30 15:17:32 -08:00
Dandelion Mané 4407c4f9fc
Weights.merge: add support for resolvers (#1597)
This commit adds support for resolvers to `Weights.merge`. The change is
documented and unit tested. Another step towards #1557.

Test plan: Inspect included tests; `yarn test` passes.
2020-01-30 10:37:05 -08:00
Dandelion Mané 566ecdd255
identity plugin provides a unified IdentitySpec (#1603)
This commit contains a slight refactor to the identity plugin so that it
provides a unified `IdentitySpec` type which wraps the list of
Identities with the metadata (currently a discourse server url) needed
to interpret those identities. This makes the API slightly nicer to use.

Test plan: Simple refactor; `yarn test` is sufficient.
2020-01-30 10:35:54 -08:00
greenkeeper[bot] 3d3c8c92b3
Update flow-bin to the latest version 🚀 (#1600)
* chore(package): update flow-bin to version 0.117.0

* chore(package): update lockfile yarn.lock

* Fixup flow error

From the [Flow 0.117.0 release notes](https://github.com/facebook/flow/releases/tag/v0.117.0)
> Removed uses of Symbol from libdefs in favor of symbol.

Test plan: `yarn flow`

Co-authored-by: Dandelion Mané <decentralion@dandelion.io>
2020-01-28 18:28:16 -08:00
Dandelion Mané 3f42d72467
Create a node for each Discourse like (#1587)
*Let's use the syntax `(node)` to represent some node, and `> edge >` to
represent some edge.*

In the past, for every like, we would create the following graph
structure:

`(user) > likes > (post)`

As of this commit, we instead create:

`(user) > createsLike > (like) > likes > (post)`

We make this change because we want to mint cred for likes. Arguably,
this is more robust than minting cred for activity: something being
liked signals that at least one person in the community found a post
valuable, so you can think of moving cred minting away from raw activity
and towards likes as a sort of implicit "cred review".

Create a node for each like is a somewhat hacky way to do it--in
principle, we should have a heuristic which increases the cred weight of
a post based on the number of likes it has received--but it is expedient
so we can prototype this quickly.

Obviously, this is not robust to Sibyll attacks. If we decide to adopt
this, in the medium term we can add some filtering logic so that e.g. a
user must be whitelisted for their likes to mint cred. (And, in a nice
recursive step, the whitelist can be auto-generated from the last week's
cred scores, so that e.g. every user with at least 50 cred can mint more
cred.) I think it's OK to put in a Sibyll-vulnerable mechanism here
because SourceCred is still being designed for high trust-level
communities, and the existing system of minting cred for raw activity is
also vulnerable to Sibyll and spam attacks.

Test plan: Unit tests updated; also @s-ben can report back on whether
this is useful to him in demo-ing SourceCred [on MakerDAO][1].

If we merge this, we should simultaneously explicitly set the weight to
like nodes to 0 in our cred instance, so that we can separate merging
this feature from actually changing our own cred (which should go
through a separate review).

[1]: https://forum.makerdao.com/t/possible-data-source-for-determining-compensation
2020-01-28 18:10:24 -08:00
Dandelion Mané 6561a61a63
Generate Weights from PluginDeclarations (#1595)
This adds a simple method, `weightsForDeclaration`, which generates
weights from a plugin declaration. This is a small but important piece
of #1557, as it allows us to create appropriate Weights cleanly from the
plugin.

Test plan: Unit tests added; `yarn test` passes.
2020-01-28 18:08:22 -08:00
Dandelion Mané 02b072699e
Add `identity/contractIdentities` (#1601)
This commit adds a `contractIdentities` method to the Identity plugin,
which allows contracting a WeightedGraph using the provided identities.
The method does not attempt to contract weights together, although as a
safety check it will error if weights have been explicitly provided for
any of the contracted nodes.

This PR replaces #1591; see that pull for some context on why this
method is defined on the identity plugin rather than as part of the
WeightedGraph module.

Test plan: For ease of testing, `contractIdentities` is a thin wrapper
around `nodeContractions` (which is already tested) and a new private
`_contractWeightedGraph` method (for which tests have been added). Since
`contractIdentities` is a trivial oneline composition, it does not need
any additional explicit testing.

`yarn test` passes.
2020-01-28 17:03:43 -08:00
Dandelion Mané 50a6a5f6a7
Add core/weightedGraph (#1590)
This module is a simple data type which contains a graph and associated
weights. It provides methods for JSON (de)serialization, constructing
new WeightedGraphs, and for merging them.

Test plan: See included unit tests. The unit tests are simple because
the data type and associated methods are quite simple; the underlying
functions for Graphs and Weights have more extensive testing.

Progress towards #1557.
2020-01-28 16:15:33 -08:00
Dandelion Mané 6415da7bc8
de-noise scripts/update_snapshots.sh (#1599)
scripts/update_snapshots.sh is intended as a general-purpose snapshot
updater for SourceCred. Currently, it includes updating Discourse
snapshots, but only if an obsolete Discourse API key is present.

Updating Discourse snapshots is very noisy, because the API responses
are not stable (they include the view count, which increments when
making API requests). Also, most times when we want to update our
snapshots, it's because we changed some core data structure, not because
we actually want new data from Discourse. Therefore, we should
disconnect the Discourse snapshot update process from the general
snapshot updating script.

Test plan:
Run `./scripts/update_snapshots.sh` and verify that it does not produce
Discourse update churn. Run
`./src/plugins/discourse/update_discourse_api_snapshots.sh` and verify
that it does update all the Discourse snapshots.
2020-01-27 13:27:37 -08:00
Dandelion Mané 3a962af0af
Stop setting weights on user node types (#1593)
As discussed in [this GitHub comment][1], it doesn't make sense for user
node types (or user nodes) to have non-zero weight. The reason is that
we use weights for minting cred. Minting cred to users in general
doesn't make sense (having more user accounts is not intrinsically
valuable to a project) and minting cred to specific users is
inappropriate (it means that users' cred is being determined by their
power to influence the weights, rather than because of the value of
their contributions).

This commit makes two changes:
- It sets the default weight for all user types to 0. This has no
implications for cred, since the user weights were already (implicitly)
discarded because users all have null timestamps.
- It filters user node types from the weight config, so the UI no longer
incorrectly suggests that user node weights can be meaningfully changed.

As a result of the second change, the identity plugin now displays in
the weight change UI but has no node or edge types associated. As a
followon commit, we may want to add a bit of filtering logic to clean
that up.

Test plan:
Setting the default weights to 0 for the user types has no effect on
cred, as can be manually ascertained by taking an existing cred
instance, changing the user type weights, and re-calculating.

Filtering the user node types from the WeightConfig is validated through
manual inspection testing.

I've found that frontend unit testing of changes like this has limited
value; since there aren't subtle edge cases to validate, and regressions
are unlikely, I don't think we need a unit test at this time. Therefore,
I haven't added formal tests.

[1]: https://github.com/sourcecred/sourcecred/pull/1591#discussion_r370951707
2020-01-26 15:32:28 -08:00
Dandelion Mané 1626c3c2e4
Remove github/loadGithubData (#1596)
The function is marked deprecated and has no usage in the codebase.

Test plan: `yarn test` passes.
2020-01-26 15:08:53 -08:00
Dandelion Mané 9784927402
Add `Weights.merge` (#1584)
This adds a `merge` method to the weights module, which allows combining
multiple weights together. If any of the weights overlap (i.e. the same
address is specified in multiple Weights), then an error will be thrown.

In the future, we will likely extend the API so that the client can
specify how to resolve cases where the same address is present in
multiple weights.

The method is thoroughly unit tested.

This is part of work on #1557 (we'll want to be able to merge
`WeightedGraph`s.)

Test plan: Run `yarn test`
2020-01-25 09:56:36 -08:00
Robin van Boven 325487a3ea
Backend: create DataDirectory class (#1564)
Currently the responsibility for the SourceCred directory
is spread out in different places. Some in core/project_io
some in api/load, some in the plugins.

This class is intended to centralize that IO using simple
interfaces we can depend on (and mock) instead.
2020-01-23 21:49:32 +01:00
Dandelion Mané 3b9e9f332b
Rename `Weights.defaultWeights` to `Weights.empty` (#1589)
`empty` is a more descriptive name for a `Weights` object that has no
weights set, rather than `defaultWeights`.

In every case where we were importing `defaultWeights` as a direct
symbol, I switched to importing the whole module, as usage of
`Weights.empty` makes it clear that the empty object returned will be an
empty weights (as opposed to an empty list or some other empty type).
This is as proposed in the reviews from #1538.

Test plan: It's just a rename and change in imports, so `yarn flow` would
catch any error. `yarn test` passes.

Note: this is a port of #1583, which merged to the wrong branch.
2020-01-23 12:08:25 -08:00
Robin van Boven c7f386933d
Github: derive mirror database name from repoId (#1581)
Currently, to produce a Github graph from a populated mirror
there is an unexpected dependency on a GithubToken. See #1580.

This is step 1 to remove the dependency. It will allow us to
locate the Database without a GithubToken.
2020-01-21 22:23:35 +01:00
Dandelion Mané f70dd7a00f
Move weights from analysis to core (#1582)
As part of #1557, I want to move the concept of weights into core, so
that plugins can produce a WeightedGraph rather than raw Graph. This
will allow us to do cred computation directly on the data we get from
the plugins, without recourse to plugin metadata.

Test plan: It's a simple file move; `yarn test` suffices.
2020-01-21 11:56:19 -08:00
Robin van Boven 4d86c538be
Use stable stringify for cred.json & graph.json in load (#1576)
The upcoming DataDirectory class will use stable stringify too.
But since that will affect the snapshots, make sure those are
updated before we switch to the new load implementation.
2020-01-20 21:44:16 +01:00
Dandelion Mané f5a44a5a4f
Enable eslint eqeqeq rule (#1579)
The [eqeqeq rule][1] allows us to ban potentially confusing usage of the
type-coercing `==` equality operator. I've set it to allow `== null`, and
to disallow any other use of `==`.

Alongside activating the rule, I've fixed 16 violations. Inspired by
[review of #1560][2].

Test plan: `yarn test` passes.

[1]: https://eslint.org/docs/rules/eqeqeq
[2]: https://github.com/sourcecred/sourcecred/pull/1560#discussion_r368331081
2020-01-19 20:20:15 -08:00
Brian Litwin 90c76a5023 mirror: add network request logging (#1562)
Logs network requests and responses into a table in the db. Also logs
an `UpdateId` that links the mutation to the network request/response
that generated it.

This is a breaking change to the Mirror database format; existing caches
will need to be regenerated.

Resolves #1450.

Test Plan:
Unit tests added, covering the core functionality and integration into
the public APIs. After loading a project and manually inspecting it in
an sqlite shell, the results in the `network_log` table looked
reasonable.
2020-01-19 17:58:49 -08:00
Brian Litwin dea45fb8c1 Update distributionToCred contract (#1560)
Resolves #1317

Updates timeline cred to handle the case where the scoring nodes' total
cred sums to zero in an interval. In practice, we've encountered this
circumstance when a github.io repository contains timestamps that
predates any User's contributions by several weeks, such as
sfosc.github.io.

Test Plan:

- Added a test case to handle this circumstance
- Updated a test case per discussion on #1317 to return a cred score
of 0 for all nodes in all intervals when there are no scoring nodes
passed to the function, so that we handle these cases consistently.

Also loaded sfosc.github.io and observed that the cred output appeared
to match expectations and didn't contain any `NaN` or `Infinity` values
as it did before.
2020-01-19 15:09:51 -08:00
William Chargin 74acd1fa80
lint: require camel case variables (#1563)
Summary:
We’ve had this policy unspoken since the beginning; making it explicit
as a lint rule makes it one less thing to think about. The few existing
offenders can almost all be changed with no value lost; one is an extern
that needs a lint suppression.

Test Plan:
That Flow and tests still pass suffices.

wchargin-branch: lint-camelcase
2020-01-17 08:55:57 -08:00
Dandelion Mané a36873810a
Refactor Weights to remove awareness of types (#1558)
I'm currently on a quest to separate cred computation away from any
plugin metadata (see #1557). This means we need a way to represent node
and edge weights without any explicit concept of 'types'.

This commit is a first step towards that. It removes the distinction
between 'type weights' and 'manual weights' in the weights data type.
Instead, we now just have node weights and edge weights. In contrast to
before, all weights are now interpreted as prefix matchers, e.g. a
single node or edge may match multiple weights; when this occurs, the
weights compose multiplicatively.

Since types were already identified by prefix, if a plugin wants to
assign a weight to a particular type, it may do so by specifying a
weight for that type's prefix. As before, it's possible to have a
type-level weight and a weight on a specific node, and compose them
multiplicatively.

As an added bonus, we could now sensibly have 'plugin-level' weights and
'type-level weights' and compose them multiplicatively. Thus, if we
realized that the Foo plugin is undervalued relative to the Bar plugin,
we could increase the Foo weight rather than needing to adjust all of
its types individually.

So as to keep the scope for this commit somewhat manageable, I modified
the underlying data type for Weights, but not any of the cred
computation interfaces. The weights pipeline still takes the plugin
declarations, and we still get the default type level weights from the
plugin's types. A future commit will modify the pipeline so that the
plugins provide default types alongside the Graph.

I deliberately did not provide an upgrade handler for the old style
weights JSON. This is sensible as the semantics are now different. In
the past, it was possible to specify a weight for a single node without
affecting the weights of other nodes whose addresses have the first
node's address as a prefix. Since this is no longer possible, there is
no universally "correct" way to handle the old weights files. In
practice, there are so few users that it is not a big deal either way.

Test plan:
This change has implications across the codebase and UI. In addition to
`yarn test --full` passing, I verified that:
- updating and recomputing works in the mainline UI
- updating and recomputing works in the legacy UI
- downloading weights from the UI and then explicitly loading them still
works
2020-01-13 20:00:29 -08:00
Robin van Boven 063c08d974
UniRef: add GithubReferenceDetector (#1543)
As a RelationalView is not designed for multiple repositories, we
should implement our own merging of mappings obtained from
RelationalViews.

fromRelationalViews is a factory function which does this for us.
And by accepting an array of RelationalViews it's more apparent
it should be used this way.
2020-01-13 14:13:30 +01:00
Robin van Boven f16119af43
UniRef: add urlReferenceMap to RelationalView (#1542)
RelationalView provides easy access to ReferentEntities, which we
can use for reference detection. The map produced by
urlReferenceMap can be used easily by a MappedReferenceDetector.

As discussed in #1532 we'll use RelationalView for this before
deprecating it.
2020-01-13 14:09:19 +01:00