Max Inden 0a8dbab87c
feat(perf): continuosly measure on single conn (iperf-style) (#276)
Our current throughput tests open a connection, open a stream,
up- or download 100MB and close the connection. 100 MB is not enough on the
given path (60ms, ~5gbit/s) to exit congestion controller's slow-start. See
https://github.com/libp2p/test-plans/issues/261 for details.

Instead of downloading 100MB multiple times, each on a new connection, establish
a single connection and continuously measure the throughput for a fixed
duration (20s).
2023-10-25 13:24:08 +02:00
..

libp2p performance benchmarking

This project includes the following components:

  • terraform/: a Terraform scripts to provision infrastructure
  • impl/: implementations of the libp2p perf protocol running on top of e.g. go-libp2p, rust-libp2p or Go's std-library https stack
  • runner/: a set of scripts building and running the above implementations on the above infrastructure, reporting the results in benchmark-results.json

Benchmark results can be visualized with https://observablehq.com/@libp2p-workspace/performance-dashboard.

Running via GitHub Action

  1. Create a pull request with your changes on https://github.com/libp2p/test-plans/.
  2. Trigger GitHub Action for branch on https://github.com/libp2p/test-plans/actions/workflows/perf.yml (see Run workflow button).
  3. Wait for action run to finish and to push a commit to your branch.
  4. Visualize results on https://observablehq.com/@libp2p-workspace/performance-dashboard.

Running manually

Prerequisites

Provision infrastructure

  1. Save your public SSH key as the file ./terraform/modules/short_lived/files/perf.pub; or generate a new key pair with make ssh-keygen and add it to your SSH agent with make ssh-add.
  2. cd terraform/configs/local
  3. terraform init
  4. terraform apply
  5. CLIENT_IP=$(terraform output -raw client_ip)
  6. SERVER_IP=$(terraform output -raw server_ip)

Notes

  • While running terraform you may encounter the following error:

      Error: collecting instance settings: reading EC2 Launch Template versions: couldn't find resource
      │
      │   with module.short_lived_server[0].aws_instance.perf,
      │   on ../../modules/short_lived/main.tf line 15, in resource "aws_instance" "perf":
      │   15: resource "aws_instance" "perf" {
    
  • This implies that you haven't deployed the long-lived infrastructure on your AWS account. To do so along with each short-lived deployment, you can set TF_VAR long_lived_enabled env variable to default to true. Terraform should then spin up the long-lived resources that are required for the short-lived resources to be created.

  • It's best to destroy the infrastructure after you're done with your testing, you can do that by running terraform destroy.

Build and run libp2p implementations

Given you have provisioned your infrastructure, you can now build and run the libp2p implementations on the AWS instances.

  1. cd runner
  2. npm ci
  3. npm run start -- --client-public-ip $CLIENT_IP --server-public-ip $SERVER_IP
    • Note: The default number of iterations that perf will run is 10; desired iterations can be set with the --iterations <value> option.

Deprovision infrastructure

  1. cd terraform/configs/local
  2. terraform destroy

Adding a new implementation or a new version

  1. Add the implementation to new subdirectory in impl/*.
    • For a new implementation, create a folder impl/<your-implementation-name>/ e.g. go-libp2p
    • For a new version of an existing implementation, create a folder impl/<your-implementation-name>/<your-implementation-version>.
    • In that folder include a Makefile that builds an executable and stores it next to the Makefile under the name perf.
    • Requirements for the executable:
      • Running as a libp2p-perf server:
        • The perf server must not exit as it will be closed by the test runner.
        • The executable must accept the command flag --run-server which indicates it's running as server.
      • Running as a libp2p-perf client
        • Given that perf is a client driven set of benchmarks, the performance will be measured by the client.
          • Input via command line
            • --server-address
            • --transport (see runner/versions.ts for possible variants)
            • --upload-bytes number of bytes to upload per stream.
            • --download-bytes number of bytes to download per stream.
          • Output
            • Logging MUST go to stderr.
            • Measurement output is printed to stdout as JSON.
            • The output schema is:
              interface Data {
                type: "intermediary" | "final";
                timeSeconds: number;
                uploadBytes: number;
                downloadBytes: number;
              }
              
            • Every second the client must print the current progress to stdout. See example below. Note the type: "intermediary".
              {
                "type": "intermediary",
                "timeSeconds": 1.004957645,
                "uploadBytes": 73039872,
                "downloadBytes": 0
              },
              
            • Before terminating the client must print a final summary. See example below. Note the type: "final". Also note that the measurement includes the time to (1) establish the connection, (2) upload the bytes and (3) download the bytes.
              {
                "type": "final",
                "timeSeconds": 60.127230659,
                "uploadBytes": 4382392320,
                "downloadBytes": 0
              }
              
  2. For a new implementation, in impl/Makefile include your implementation in the all target.
  3. For a new version, reference version in runner/src/versions.ts.