Large number of direct UDP connections test
In Nomos DA, the Executor has to maintain persistent connections to 4096 DA Nodes (or more, for redundancy) to disperse the encoded blobs. To see whether such a number of connections is feasible, a test mimicking high-speed dispersal was conducted.
Test setup
Rudimentary client and server applications were created using the s2n-quic crate.
- Client
  - Mimics the behaviour of a Zone Executor by sending a 1024-byte packet per connection every second;
  - Opens 10 new UDP socket connections to the Server every second;
  - Tracks how many connections are currently open;
  - Tracks the shortest, longest and average time it takes for a packet to be echoed back from the Server.
- Server
  - Mimics the behaviour of multiple DA Nodes by accepting any incoming UDP connection and echoing back any data it receives;
  - Tracks the total number of currently open connections;
  - Tracks the total number of received bytes.
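The client's round-trip bookkeeping (shortest, longest and average echo time) can be sketched as below. This is a minimal illustration using only the standard library, not the code used in the test; `LatencyStats` and its methods are hypothetical names, and the sketch assumes the caller measures each round trip itself, for example with `std::time::Instant`.

```rust
use std::time::Duration;

/// Running summary of echo round-trip times.
/// Hypothetical helper for illustration; not taken from the test client.
#[derive(Debug, Default)]
pub struct LatencyStats {
    count: u32,
    total: Duration,
    min: Option<Duration>,
    max: Option<Duration>,
}

impl LatencyStats {
    /// Record one round trip: the time between sending a packet
    /// and receiving its echo from the server.
    pub fn record(&mut self, rtt: Duration) {
        self.count += 1;
        self.total += rtt;
        self.min = Some(self.min.map_or(rtt, |m| m.min(rtt)));
        self.max = Some(self.max.map_or(rtt, |m| m.max(rtt)));
    }

    /// Shortest round trip seen so far, or None before the first sample.
    pub fn shortest(&self) -> Option<Duration> {
        self.min
    }

    /// Longest round trip seen so far, or None before the first sample.
    pub fn longest(&self) -> Option<Duration> {
        self.max
    }

    /// Mean round-trip time, or None before the first sample.
    pub fn average(&self) -> Option<Duration> {
        if self.count == 0 {
            None
        } else {
            Some(self.total / self.count)
        }
    }
}
```

A caller would take an `Instant::now()` before sending, call `record(start.elapsed())` once the echo arrives, and print the three values periodically. Keeping only a count and a running total avoids storing every sample, which matters when thousands of connections each report once per second.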
Hardware used for tests
- Client
  - (MAC, Kaunas) MacBook Air (Sonoma 14.3.1) 8GB
  - (PC, Kaunas) i7-4770 CPU @ 3.40GHz (Linux 6.9.8-arch1-1) 32GB
  - (DC Dedicated, Helsinki) AMD Ryzen 5 3600 6-Core (Linux 5.15.0-88-generic) 64GB
- Server
  - OVH VPS (Warsaw) vps2020-starter-1-2-20 vCores 1 (Linux 5.10.0-31-cloud-amd64) 2GB
- Network hardware
  - Router (Technicolor DGA0122)
  - Switch (TP Link)
  - 4G AP (Samsung A53)
- Network topologies
  - MAC Wifi > Router > Server
  - MAC Wifi > 4G AP > Server
  - MAC Eth > Router > Server
  - PC > Switch > Router > Server
  - PC > Router > Server
  - DC Dedicated > Server
Results
- MAC Wifi > Router > Server (Client logs, Server logs)
  - Issues started appearing at around 3349 active connections. Existing connections started failing, while new connections were still being created without issues.
- MAC Wifi > 4G AP > Server (Client logs, Server logs)
  - Issues started appearing at around 502 active connections. No new connections could be created, most likely because of a hard limit imposed by the AP (Samsung A53 Android phone).
- MAC Eth > Router > Server (Client logs, Server logs)
  - Issues started appearing at around 5100 active connections. No new connections could be created because of the hard limit on open files in macOS.
- PC > Switch > Router > Server (Client logs, Server logs)
  - Issues started appearing at around 1598 active connections. Suspected reason: physical limits of the cheap TP Link switch.
- PC > Router > Server (Client logs, Server logs)
  - Issues started appearing at around 6267 active connections. Suspected reason: physical limits of the network card or suboptimal interface configuration.
- DC Dedicated > Server (Client logs, Server logs)
  - Issues started appearing at around 7258 active connections. Suspected reason: DDoS protection on the OVH VPS server side (an email about it was received).
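To put these thresholds next to the 4096-connection requirement, the sketch below lists the topologies whose first issues appeared only above that target, and works out the load implied by the test parameters (one 1024-byte packet per connection per second, 10 new connections per second). The thresholds are copied from the results above; the constant and function names are hypothetical, and the payload figure ignores QUIC, UDP and IP overhead.

```rust
/// Active-connection count at which issues first appeared, per topology
/// (values copied from the results list).
pub const ISSUE_THRESHOLDS: [(&str, u32); 6] = [
    ("MAC Wifi > Router > Server", 3349),
    ("MAC Wifi > 4G AP > Server", 502),
    ("MAC Eth > Router > Server", 5100),
    ("PC > Switch > Router > Server", 1598),
    ("PC > Router > Server", 6267),
    ("DC Dedicated > Server", 7258),
];

/// One connection per DA Node, as stated in the introduction.
pub const TARGET_CONNECTIONS: u32 = 4096;
/// Packet size sent per connection every second by the test client.
pub const PACKET_BYTES: u32 = 1024;
/// Rate at which the test client opens new connections.
pub const NEW_CONNECTIONS_PER_SEC: u32 = 10;

/// Topologies whose first issues appeared only above the target.
pub fn topologies_past_target() -> Vec<&'static str> {
    ISSUE_THRESHOLDS
        .iter()
        .filter(|(_, threshold)| *threshold > TARGET_CONNECTIONS)
        .map(|(name, _)| *name)
        .collect()
}

/// One-way payload rate in bytes per second (protocol overhead excluded).
pub fn payload_bytes_per_sec(connections: u32) -> u64 {
    u64::from(connections) * u64::from(PACKET_BYTES)
}

/// Whole seconds needed to open `connections` at the client's ramp-up rate.
pub fn ramp_up_secs(connections: u32) -> u32 {
    (connections + NEW_CONNECTIONS_PER_SEC - 1) / NEW_CONNECTIONS_PER_SEC
}
```

With these numbers, `payload_bytes_per_sec(TARGET_CONNECTIONS)` is 4,194,304 bytes per second (4 MiB/s) in each direction, and `ramp_up_secs(TARGET_CONNECTIONS)` is 410, so each run needed roughly seven minutes to reach the target connection count.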
Conclusions
Running an Executor on consumer-grade hardware may pose some challenges, but it is possible from the perspective of maintaining a large number of connections to a large number of remote hosts: two of the five consumer setups (MAC Eth > Router and PC > Router) stayed healthy past 4096 connections. A datacenter-grade machine should be able to handle these connections without complex network configuration. Ideally, the Executor would have a couple of network interfaces to help spread the load.
How to run
If the server is deployed on a remote machine, a new certificate needs to be created with the updated IP address and hostname. Follow the "Certificate" section for that. Once the new certificate is created, recompile the client and the server.
Certificate
To generate a new key and certificate for remote testing:
openssl ecparam -name prime256v1 -genkey -noout -out key.pem
openssl req -new -key key.pem -out cert.csr -config san.cnf
openssl x509 -req -in cert.csr -signkey key.pem -out cert.pem -days 365 -extensions req_ext -extfile san.cnf
Cross compilation
To cross-compile for an x86 Linux target, use x86_64-unknown-linux-musl. gcc-10 is required for aws-lc-sys, and as of 2024-07-18, cross doesn't have this version when using Docker.
cross build --target x86_64-unknown-linux-musl --release