nomos-pocs/da/subnets/s2nquic
gusto 97c3076e57
DA: Feasability test for large number of udp connections (#5)
* Client and server setup for testing increasing number of connections over udp

* Add results and conclusions

* Move udp poc to the same directory as network poc

* Add readme for subnets pocs
2024-07-22 21:15:48 +03:00
results
src/bin
.gitignore
Cargo.toml
README.md
cert.csr
cert.pem
key.pem
san.cnf

README.md

Large number of direct UDP connections test

In Nomos DA, the Executor has to maintain persistent connections to 4096 DA Nodes (or more, for redundancy) to disperse the encoded blobs. To see whether such a number of connections is feasible, a test mimicking high-speed dispersal was conducted.

Test setup

Rudimentary client and server applications were created using the s2n-quic crate.

  • Client

    • Mimics the behaviour of a Zone Executor by sending a 1024-byte packet on each connection every second;
    • Opens 10 new UDP socket connections to the Server every second;
    • Tracks how many currently open connections there are;
    • Tracks the shortest, longest and average time it takes for the packet to be echoed back from the server.
  • Server

    • Mimics the behaviour of multiple DA Nodes by accepting any incoming UDP connections and echoing back any data that is received;
    • Tracks total number of currently open connections;
    • Tracks total number of received bytes.
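
The client-side latency tracking described above can be sketched with the standard library alone. This is a minimal illustration, not the PoC's actual code: the names `LatencyStats`, `record` and `summary` are hypothetical, and it assumes each round-trip time is measured by the caller (for example with `std::time::Instant` around the send and the echoed reply).

```rust
use std::time::Duration;

/// Running summary of echo round-trip times.
/// Hypothetical helper for illustration; not taken from the PoC sources.
#[derive(Debug, Default)]
pub struct LatencyStats {
    count: u32,
    total: Duration,
    shortest: Option<Duration>,
    longest: Option<Duration>,
}

impl LatencyStats {
    /// Record one round trip, measured by the caller
    /// (e.g. `Instant::now()` before sending, `elapsed()` once the echo arrives).
    pub fn record(&mut self, rtt: Duration) {
        self.count += 1;
        self.total += rtt;
        self.shortest = Some(self.shortest.map_or(rtt, |s| s.min(rtt)));
        self.longest = Some(self.longest.map_or(rtt, |l| l.max(rtt)));
    }

    /// Returns (shortest, longest, average), or `None` if nothing was recorded yet.
    pub fn summary(&self) -> Option<(Duration, Duration, Duration)> {
        let shortest = self.shortest?;
        let longest = self.longest?;
        // `count` is at least 1 here because `shortest` is only set in `record`.
        Some((shortest, longest, self.total / self.count))
    }
}
```

Recording round trips of 12 ms, 30 ms and 18 ms and then calling `summary()` gives a shortest time of 12 ms, a longest of 30 ms and an average of 20 ms.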

Hardware used for tests

  • Client

    1. (MAC, Kaunas) MacBook Air (Sonoma 14.3.1) 8GB
    2. (PC, Kaunas) i7-4770 CPU @ 3.40GHz (Linux 6.9.8-arch1-1) 32GB
    3. (DC Dedicated, Helsinki) AMD Ryzen 5 3600 6-Core (Linux 5.15.0-88-generic) 64GB
  • Server

    1. OVH VPS (Warsaw) vps2020-starter-1-2-20 vCores 1 (Linux 5.10.0-31-cloud-amd64) 2GB
  • Network hardware

    1. Router (Technicolor DGA0122)
    2. Switch (TP Link)
    3. 4G AP (Samsung A53)
  • Network topologies

    1. MAC Wifi > Router > Server
    2. MAC Wifi > 4G AP > Server
    3. MAC Eth > Router > Server
    4. PC > Switch > Router > Server
    5. PC > Router > Server
    6. DC Dedicated > Server

Results

  1. MAC Wifi > Router > Server (Client logs, Server logs)
    • Issues started appearing around 3349 active connections. Existing connections started failing, while new connections were still being created without issues.
  2. MAC Wifi > 4G AP > Server (Client logs, Server logs)
    • Issues started appearing around 502 active connections. No new connections could be created, most likely because of a hard limit on the AP (Samsung A53 Android phone).
  3. MAC Eth > Router > Server (Client logs, Server logs)
    • Issues started appearing around 5100 active connections. No new connections could be created because of the hard limit on open files on macOS.
  4. PC > Switch > Router > Server (Client logs, Server logs)
    • Issues started appearing around 1598 active connections. Suspected reason: physical limits of the cheap TP Link switch.
  5. PC > Router > Server (Client logs, Server logs)
    • Issues started appearing around 6267 active connections. Suspected reason: physical limits of the network card or suboptimal interface configuration.
  6. DC Dedicated > Server (Client logs, Server logs)
    • Issues started appearing around 7258 active connections. Suspected reason: DDoS protection on the OVH VPS Server side (got an email about that).
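
To relate these thresholds to the 4096 connections the Executor needs, the figures can be compared against that target. The sketch below only restates the numbers listed above; the names `OBSERVED`, `TARGET` and `topologies_reaching` are hypothetical and not part of the PoC.

```rust
/// Connection count at which issues first appeared, per topology
/// (figures copied from the results listed above).
const OBSERVED: [(&str, u32); 6] = [
    ("MAC Wifi > Router > Server", 3349),
    ("MAC Wifi > 4G AP > Server", 502),
    ("MAC Eth > Router > Server", 5100),
    ("PC > Switch > Router > Server", 1598),
    ("PC > Router > Server", 6267),
    ("DC Dedicated > Server", 7258),
];

/// Minimum number of DA Nodes the Executor has to stay connected to.
pub const TARGET: u32 = 4096;

/// Topologies whose first-issue threshold was at or above `target`.
pub fn topologies_reaching(target: u32) -> Vec<&'static str> {
    OBSERVED
        .iter()
        .filter(|(_, connections)| *connections >= target)
        .map(|(name, _)| *name)
        .collect()
}
```

With `TARGET` as the argument, three of the six setups qualify: MAC Eth > Router > Server, PC > Router > Server and DC Dedicated > Server.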

Conclusions

From the perspective of maintaining a large number of connections to a large number of remote hosts, running an Executor on consumer-level hardware might pose some challenges, but it is possible. A datacenter-level machine should be able to handle these connections without complex network configuration. Ideally, the Executor would have a couple of network interfaces to help spread the load.
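
For a rough sense of the load in this test, the payload rate and ramp-up time follow directly from the test parameters (1024 bytes per connection per second, 10 new connections per second). The sketch below is plain arithmetic under those parameters; the function names are hypothetical, and the figures cover application payload in one direction only, ignoring QUIC, UDP and IP overhead.

```rust
/// Payload sent on each connection every second (from the test setup).
const PACKET_BYTES: u64 = 1024;
/// New connections opened by the client every second (from the test setup).
const NEW_CONNECTIONS_PER_SECOND: u64 = 10;

/// Client-to-server payload in bytes per second for `connections` open connections.
pub fn payload_bytes_per_second(connections: u64) -> u64 {
    connections * PACKET_BYTES
}

/// The same payload rate expressed in megabits per second (10^6 bits).
pub fn payload_megabits_per_second(connections: u64) -> f64 {
    (payload_bytes_per_second(connections) * 8) as f64 / 1_000_000.0
}

/// Whole seconds needed to open `connections` connections at the test's rate,
/// rounded up.
pub fn ramp_up_seconds(connections: u64) -> u64 {
    (connections + NEW_CONNECTIONS_PER_SECOND - 1) / NEW_CONNECTIONS_PER_SECOND
}
```

For 4096 connections this gives 4,194,304 bytes per second (about 33.6 Mbit/s) of payload towards the server, with the same volume echoed back, and a ramp-up of 410 seconds.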

How to run

If the server is being deployed on a remote machine, a new certificate needs to be created with the updated IP address and hostname. Follow the "Certificate" section for that. Once the new certificate is created, recompile the client and the server.

Certificate

To generate a new key and certificate for remote testing:

openssl ecparam -name prime256v1 -genkey -noout -out key.pem
openssl req -new -key key.pem -out cert.csr -config san.cnf
openssl x509 -req -in cert.csr -signkey key.pem -out cert.pem -days 365 -extensions req_ext -extfile san.cnf

Cross compilation

To cross-compile for an x86 Linux target, use x86_64-unknown-linux-musl. gcc-10 is required for aws-lc-sys and, as of 2024-07-18, cross doesn't have this version when using Docker.

cross build --target x86_64-unknown-linux-musl --release