Summary:
This commit adds internal functions to (a) emit a GraphQL query that fetches the own data of an object, and (b) ingest the results of that query back into the database. The API and implementation differ from the connection-updating analogues introduced in #878 in that the query for own data is independent of an object’s ID: it depends only on the object’s type. This affords us more flexibility in composing queries.
As described in an internal documentation comment, values are stored in the database in JSON-stringified form: we cannot use the obvious raw SQL values, because SQLite has no native encoding of booleans (`0`/`1` is conventional), and we need to distinguish booleans from other data types.
There are other ways to solve this problem. Notably:
1. We could take inspiration from OCaml: encode stronger static types and a simpler runtime representation. That is, we could change the schema field types from simply “primitive” to the various specific primitive types. Then, when reading data out from the database, we could reinterpret the values appropriately.
2. We could take advantage of the fact that we are not using all of SQLite’s data types. In particular, we do not store anything as a binary blob, so we could encode `false` as a length-0 zeroblob and `true` as a length-1 zeroblob, for instance. Again, when reading data out from the database, we would reinterpret the values, but in this approach we would not need an explicit schema from the user.
For now, we take the simplest approach just to get ourselves off the ground. We can easily move to the second option described above later.
This commit makes progress toward #622.
Test Plan:
Unit tests are included, with full coverage. While these tests check that the GraphQL queries are as expected, they cannot check that the queries are actually valid in production. To check this, follow the instructions in the added snapshot test.
wchargin-branch: mirror-own-data-updates
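The storage trade-off described in the commit message above can be sketched with Python's standard `sqlite3` module. This is only an illustration, not the code added by the commit: the `primitives` table, its columns, and the helper functions are all hypothetical names. It shows a raw boolean collapsing to an integer, the JSON-stringified form preserving the type, and the zeroblob alternative from option 2.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table. The value column has no declared type, so SQLite
# applies no type conversion and stores each value as it was given.
conn.execute("CREATE TABLE primitives (name TEXT PRIMARY KEY, value)")

# The problem: SQLite has no boolean storage class, so True becomes integer 1.
conn.execute("INSERT INTO primitives VALUES ('raw', ?)", (True,))
raw = conn.execute(
    "SELECT value FROM primitives WHERE name = 'raw'"
).fetchone()[0]


def store_json(name, value):
    """Store a primitive in JSON-stringified form (the approach taken for now)."""
    conn.execute(
        "INSERT INTO primitives VALUES (?, ?)", (name, json.dumps(value))
    )


def load_json(name):
    """Read a JSON-stringified primitive back, recovering its original type."""
    (text,) = conn.execute(
        "SELECT value FROM primitives WHERE name = ?", (name,)
    ).fetchone()
    return json.loads(text)


samples = {"flag": True, "count": 1, "label": "1", "missing": None}
for name, value in samples.items():
    store_json(name, value)
round_tripped = {name: load_json(name) for name in samples}


def store_blob_bool(name, flag):
    """Option 2: encode false as a length-0 zeroblob, true as a length-1 zeroblob."""
    conn.execute(
        "INSERT INTO primitives VALUES (?, zeroblob(?))", (name, 1 if flag else 0)
    )


def load_blob_bool(name):
    """Decode a zeroblob-encoded boolean from its storage class and length."""
    kind, length = conn.execute(
        "SELECT typeof(value), length(value) FROM primitives WHERE name = ?",
        (name,),
    ).fetchone()
    if kind != "blob":
        raise ValueError(f"{name} is not a blob-encoded boolean")
    return length == 1


store_blob_bool("blob_true", True)
store_blob_bool("blob_false", False)
```

Under the JSON approach every value costs a parse on read, which is the price of not requiring a schema. The zeroblob variant keeps numbers and strings as native SQL values and only special-cases booleans.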
SourceCred
SourceCred creates reputation networks for open-source projects. Any open-source project can create its own cred, which is a reputational metric showing how much credit contributors deserve for helping the project. To compute cred, we organize a project’s contributions into a graph, whose edges connect contributions to each other and to contributors. We then run PageRank on that graph.
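As a rough illustration of the idea, here is a minimal power-iteration PageRank over a toy contribution graph. It is a sketch under stated assumptions, not SourceCred's implementation: the node names, the edge directions, the damping factor of 0.85, and the uniform handling of nodes without outgoing edges are all choices made for this example.

```python
def pagerank(edges, damping=0.85, iterations=100):
    """Power-iteration PageRank over a list of directed (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    count = len(nodes)
    score = {node: 1.0 / count for node in nodes}
    for _ in range(iterations):
        # Score held by nodes with no outgoing edges is shared uniformly,
        # together with the usual teleportation term.
        dangling = sum(score[node] for node in nodes if not out_links[node])
        base = (1.0 - damping) / count + damping * dangling / count
        updated = {node: base for node in nodes}
        for src in nodes:
            for dst in out_links[src]:
                updated[dst] += damping * score[src] / len(out_links[src])
        score = updated
    return score


# Hypothetical toy graph: contributions point at their authors and at the
# contributions they reference.
edges = [
    ("pull#2", "alice"),
    ("issue#1", "bob"),
    ("comment#3", "alice"),
    ("pull#2", "issue#1"),
    ("comment#3", "pull#2"),
]
scores = pagerank(edges)
for node, value in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{node}: {value:.3f}")
```

The scores sum to 1, so each value can be read as a node's share of the total. In this toy graph, nodes that nothing points to, such as `comment#3`, end up with the smallest share.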
To learn more about SourceCred’s vision and values, please check out our website and our forum. One good forum post to start with is A Gentle Introduction to Cred.
For an example of SourceCred in action, you can see SourceCred’s own prototype cred attribution.
Current Status
We have a prototype that can generate a cred attribution based on GitHub interactions (issues, pull requests, comments, references, etc.). We’re working on adding more information to the prototype, such as tracking modifications to individual files, source-code analysis, GitHub reactions, and more.
Running the Prototype
If you’d like to try it out, you can run a local copy of SourceCred as follows. First, set up the following prerequisites:
- Install Node (tested on v8.x.x).
- Install Yarn (tested on v1.7.0).
- Create a GitHub API token. No special permissions are required.
- For macOS users: Ensure that your environment provides GNU coreutils. See this comment for details about what, how, and why.
Then, run the following commands to clone and build SourceCred:
git clone https://github.com/sourcecred/sourcecred.git
cd sourcecred
yarn install
yarn backend
export SOURCECRED_GITHUB_TOKEN=YOUR_GITHUB_TOKEN
node bin/sourcecred.js load REPO_OWNER/REPO_NAME
# this loads sourcecred data for a particular repository
yarn start
# then navigate to localhost:8080 in your browser
For example, if you wanted to look at cred for ipfs/js-ipfs, you could run:
$ export SOURCECRED_GITHUB_TOKEN=0000000000000000000000000000000000000000
$ node bin/sourcecred.js load ipfs/js-ipfs
replacing the big string of zeros with your actual token.
You can also combine data from multiple repositories into a single graph. To do so, pass multiple repositories to the `load` command, and specify an “output name” for the repository. For instance, the invocation
node bin/sourcecred.js load ipfs/js-ipfs ipfs/go-ipfs --output ipfs/meta-ipfs
will create a graph called ipfs/meta-ipfs in the cred explorer, containing the combined contents of the js-ipfs and go-ipfs repositories.
Early Adopters
We’re looking for projects that want to be early adopters of SourceCred! If you maintain an open-source project and would like to start using SourceCred, please reach out to us on our Discord or our forum.
Contributing
We’d love to accept your contributions! You can reach out to us by posting on our forum, or chatting with us on Discord. We'd be happy to help you get started and show you around the codebase. Please also take a look at our contributing guide.
If you’re looking for a place to start, we’ve tagged some issues Contributions Welcome.
Acknowledgements
We’d like to thank Protocol Labs for funding and support of SourceCred. We’d also like to thank the many open-source communities that produced the software that SourceCred is built on top of, such as Git and Node.