a reputation protocol for open collaboration
William Chargin 2b301f9159
Use indexed edges in graph internals (#295)
Summary:
This is an implementation-only, API-preserving change to the `Graph`
class. Edges’ `src` and `dst` attributes are now internally represented
as integer indices into a fixed ordering of nodes, which may depend on
non-logical properties such as insertion order. The graph’s serialized
form also now stores edges with integer `src`/`dst` keys, but the node
ordering is canonicalized so that two graphs are logically equal if and
only if their serialized forms are equal. This change substantially
reduces the at-rest storage space for graphs: the `sourcecred/sourcecred`
graph drops from 39MB to 30MB.
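
As an illustration of the canonicalization described above, here is a minimal sketch. It is not the actual `Graph` serialization code: the function name `serializeGraph`, the use of plain string addresses, and the edge `id` field are all assumptions made for the example.

```javascript
// Hypothetical sketch, not SourceCred's real serializer.
// Assumes nodes are plain string addresses and each edge has a unique `id`.
function serializeGraph(nodes, edges) {
  // Canonicalize: sort addresses so that insertion order does not leak
  // into the serialized form.
  const sortedNodes = [...nodes].sort();
  const indexOf = new Map(sortedNodes.map((address, i) => [address, i]));

  const indexedEdges = edges
    .map((edge) => {
      const src = indexOf.get(edge.src);
      const dst = indexOf.get(edge.dst);
      if (src === undefined || dst === undefined) {
        throw new Error(`edge ${edge.id} refers to a missing node`);
      }
      // Store integer indices instead of full addresses.
      return {id: edge.id, src, dst};
    })
    .sort((a, b) => (a.id < b.id ? -1 : a.id > b.id ? 1 : 0));

  return JSON.stringify({nodes: sortedNodes, edges: indexedEdges});
}

// Two logically equal graphs built in different insertion orders.
const edgeList = [{id: "e1", src: "pull:7", dst: "commit:1"}];
const first = serializeGraph(["commit:1", "pull:7"], edgeList);
const second = serializeGraph(["pull:7", "commit:1"], edgeList);
console.log(first === second); // true
```

Because each edge carries two small integers rather than two full addresses, the serialized form shrinks as the number of edges per node grows.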

Currently, the graph will have to translate between integer indices and
full addresses at each client operation. This is not actually a big
performance regression, because it is just one more integer-index
dereference over the previous behavior, but it does indicate that the
optimization is not living up to its full potential. In subsequent
changes, the `NodeReference` class will be outfitted with facilities to
take advantage of the internal indexing; a long-term goal is that
roughly all operations should be able to be performed within the indexed
layer, and that translating between integers and addresses should only
happen at non-hot-path API boundaries.
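
The translation cost mentioned above can be pictured with a small sketch. The class `IndexedGraph` and its methods are hypothetical and much simpler than the real `Graph` class; the point is only that edges are stored as integer pairs and that addresses are recovered at the public API boundary.

```javascript
// Hypothetical sketch of an index-backed graph; names are illustrative only.
class IndexedGraph {
  constructor() {
    this._addresses = []; // index -> address, in insertion order
    this._indices = new Map(); // address -> index
    this._edges = []; // {srcIndex, dstIndex} pairs
  }

  // Look up the index for an address, assigning a new one if needed.
  _indexFor(address) {
    let index = this._indices.get(address);
    if (index === undefined) {
      index = this._addresses.length;
      this._addresses.push(address);
      this._indices.set(address, index);
    }
    return index;
  }

  addEdge(src, dst) {
    this._edges.push({
      srcIndex: this._indexFor(src),
      dstIndex: this._indexFor(dst),
    });
    return this;
  }

  // Public API boundary: the filter works on integers, and only the final
  // results are dereferenced back into addresses.
  neighbors(address) {
    const index = this._indices.get(address);
    if (index === undefined) {
      return [];
    }
    return this._edges
      .filter((edge) => edge.srcIndex === index)
      .map((edge) => this._addresses[edge.dstIndex]);
  }
}

const graph = new IndexedGraph().addEdge("a", "b").addEdge("a", "c");
console.log(graph.neighbors("a")); // [ 'b', 'c' ]
```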

This diff is considerably smaller and easier to read with `-w`.

Paired with @decentralion.

Test Plan:
I inspected the snapshots for general form, and manually verified that
the indices for one edge were correct (the MERGED_AS edge for the head
commit of the example repo). Other than that, existing unit tests mostly
suffice; one new unit test added.

wchargin-branch: graph-indexed-edges
2018-05-22 13:15:39 -07:00
config Upgrade Flow to v0.72.0 (#285) 2018-05-15 17:09:29 -07:00
flow-typed/npm Add react-router-dom 2018-05-08 12:55:38 -07:00
scripts Make `ensure-flow.sh` more precise and accurate (#259) 2018-05-10 12:38:39 -07:00
src Use indexed edges in graph internals (#295) 2018-05-22 13:15:39 -07:00
.eslintrc.js Add `@flow` to `.eslintrc.js` (#258) 2018-05-10 12:27:46 -07:00
.flowconfig Move package json to root (#37) 2018-02-26 22:32:23 -08:00
.gitignore Configure Webpack for backend applications (#84) 2018-03-18 22:43:23 -07:00
.prettierignore Only exclude top-level directories from Prettier (#154) 2018-04-26 19:47:58 -07:00
.prettierrc.json Move package json to root (#37) 2018-02-26 22:32:23 -08:00
.travis.yml Setup travis CI testing (#58) 2018-03-02 14:39:54 -08:00
LICENSE Add LICENSE 2018-02-03 17:58:49 -08:00
README.md Update the README (#124) 2018-04-09 08:49:54 +03:00
package.json Upgrade babel-plugin-flow-react-proptypes to 23 (#294) 2018-05-21 11:11:21 -07:00
yarn.lock Upgrade babel-plugin-flow-react-proptypes to 23 (#294) 2018-05-21 11:11:21 -07:00

README.md

SourceCred

Vision

Open source software is amazing, and so are the creators and contributors who share it. How amazing? It's difficult to tell, since we don't have good tools for recognizing those people. Many amazing open-source contributors labor in the shadows, going unappreciated for the work they do.

As the open economy develops, we need to go beyond commit streaks and follower counts. We need transparent, accurate, and fair tools for recognizing and rewarding open collaboration. SourceCred aims to do that.

SourceCred will enable projects to create and track "cred", which is a quantitative measure of how much value different contributors added to a project. We'll do this by providing a basic data structure—a cred graph—into which projects can add all kinds of information about the contributions that compose it. For example, a software project might include information about GitHub pull requests, function declarations and implementations, design documents, community support, documentation, and so forth. We'll also provide an algorithm (PageRank) which will ingest all of this information and produce a "cred attribution", which assigns a cred value to each contribution, and thus to the people who authored the contributions.
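
To make the idea concrete, here is a minimal sketch of PageRank run over a tiny cred graph. It is a textbook power iteration, not SourceCred's implementation; the function name `pageRank`, the damping factor of 0.85, the iteration count, and the node names are all assumptions chosen for the example.

```javascript
// Hypothetical sketch: plain power-iteration PageRank over a small graph.
// Assumes every edge endpoint appears in `nodes`.
function pageRank(nodes, edges, {damping = 0.85, iterations = 50} = {}) {
  const n = nodes.length;
  const index = new Map(nodes.map((name, i) => [name, i]));

  const outDegree = new Array(n).fill(0);
  for (const [src] of edges) {
    outDegree[index.get(src)]++;
  }

  let scores = new Array(n).fill(1 / n);
  for (let step = 0; step < iterations; step++) {
    // Score held by nodes with no outgoing edges is spread uniformly,
    // so the total score stays at 1.
    let dangling = 0;
    for (let i = 0; i < n; i++) {
      if (outDegree[i] === 0) {
        dangling += scores[i];
      }
    }
    const base = (1 - damping) / n + (damping * dangling) / n;
    const next = new Array(n).fill(base);
    // Each node passes its damped score evenly along its outgoing edges.
    for (const [src, dst] of edges) {
      const s = index.get(src);
      next[index.get(dst)] += (damping * scores[s]) / outDegree[s];
    }
    scores = next;
  }
  return new Map(nodes.map((name, i) => [name, scores[i]]));
}

// Edges point from a contribution toward what it credits.
const cred = pageRank(
  ["alice", "pr-1", "doc-1"],
  [
    ["pr-1", "alice"],
    ["doc-1", "alice"],
    ["doc-1", "pr-1"],
  ]
);
console.log(cred);
```

In this toy graph both contributions point at `alice`, so she ends up with the largest share of cred, while `doc-1`, which nothing points to, ends up with the smallest.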

Principles

SourceCred aims to be:

  1. Transparent

    If it's to be a legitimate and accepted way of tracking credit in projects, cred attribution can't be a black box. SourceCred will provide tools that make it easy to dive into the cred attribution and see exactly why contributions were valued the way they were.

  2. Community-controlled

    At the end of the day, the community of collaborators in a project will know best which contributions were important and deserve the most cred. No algorithm will do that perfectly on its own. To that end, we'll empower the community to modify the cred attribution, by adding human knowledge into the cred graph.

  3. Forkable

    Disputes about cred attribution are inevitable. Maybe a project you care about has a selfish maintainer who wants all the cred for themself :(. Not to worry—all of the cred data will be stored with the project, so you are empowered to solve cred disputes by forking the project.

Roadmap

SourceCred is currently in a very early stage. We are working full-time to develop an MVP, which will have the following basic features:

  • Create: The GitHub Plugin populates a project's GitHub data into a Contribution Graph. SourceCred uses this seed data to produce an initial, approximate cred attribution.

  • Read: The SourceCred Explorer enables users to examine the cred attribution and all of the contributions in the graph. This reveals why the algorithm behaved the way it did.

  • Update: The Artifact Plugin allows users to put their own knowledge into the system by adding new "Artifact Nodes" to the graph. An artifact node allows users to draw attention to contributions (or groups of contributions) that are particularly valuable. They can then merge this new information into the project repository, making it canonical.

Community

Please consider joining our community.