Commit Graph

461 Commits

Author SHA1 Message Date
Jakub Sokołowski 58cbfee30f
nimbus.prater: disable resyncing on all hosts
It just causes unnecessary alerts for an obsolete network.

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-11-27 11:45:16 +01:00
Jakub Sokołowski 24020d0962
all: reduce MTR report cycle from 10 to 1
We have received a complaint from InnovaHosting about them being hit by
about 150 ICMP `ttl1` packets/s on their routers, causing excess CPU usage.
https://client.innovahosting.net/viewticket.php?tid=532874&c=8gALx9vm

By using `tcpdump` I have identified that `mtr` by default pings the
target 10 times, which means that the default value of `-c`/`--report-cycles`
is 10, although this is not documented in the manual.

We can see this when calling `mtr github.com` and watching with `tcpdump`:
```
 > sudo tcpdump -v -i eno1 icmp and src 185.181.230.78 and dst github.com | grep 'ttl 1,'
tcpdump: listening on eno1, link-type EN10MB (Ethernet), snapshot length 262144 bytes
19:54:53.981243 IP (tos 0x0, ttl 1, id 37119, offset 0, flags [none], proto ICMP (1), length 64)
...(8 packets omitted)...
19:55:03.025460 IP (tos 0x0, ttl 1, id 38226, offset 0, flags [none], proto ICMP (1), length 64)
```
We don't need to run the test 10 times to get a result for our metric.
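
The packet count above can be checked by counting `ttl 1,` lines in the capture. This is a minimal sketch, assuming `tcpdump -v` output in the format shown above. The `count_ttl1` helper name is hypothetical and not part of this repository:

```shell
# Hypothetical helper: count ICMP probes sent with TTL 1
# in `tcpdump -v` output read from stdin.
count_ttl1() {
  # grep -c prints the number of matching lines. It exits non-zero when
  # nothing matches, so `|| true` keeps the function usable under `set -e`.
  grep -c 'ttl 1,' || true
}

# Example usage (needs root and network access, so it is left commented out;
# the interface name and target are taken from the capture above):
#   sudo timeout 15 tcpdump -l -v -i eno1 icmp and dst github.com | count_ttl1 &
#   mtr --report --report-cycles 1 github.com
```

If the reasoning above holds, a run with `--report-cycles 1` should yield a single TTL 1 probe instead of 10.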

Related to:
https://github.com/status-im/infra-role-bootstrap-linux/commit/ea22bdfe

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-11-20 21:07:17 +01:00
Jakub Sokołowski 6b800a5342
nimbus.fluffy: re-enable Consul healthchecks
It appears the RPC issue was resolved in:
https://github.com/status-im/nimbus-eth1/issues/1880

Most probably caused by DB size.

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-11-20 09:57:46 +01:00
Jakub Sokołowski 2cac3081a0
layouts: add script and generate TSVs of validators
Helps developers identify which host holds which validator.

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-11-17 15:01:57 +01:00
Jakub Sokołowski c29b23c6dc
nimbus.sepolia: open ports for waku.test fleet
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-11-14 11:56:16 +01:00
kdeme ce37186651
all: update SSH key for kim
This one is from a YubiKey.

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-11-07 18:48:05 +01:00
Jakub Sokołowski 10dd722e29
all: grant admin rights to kim
Necessary to run 'perf'.

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-11-07 16:19:24 +01:00
Jakub Sokołowski c1be589960
all: add debug tools like gdb and perf
Also allow use of 'perf' without root.

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-11-07 12:48:05 +01:00
Jakub Sokołowski 4df34ac3c1
nimbus.sepolia: enable payload builder for 4th node
Also drop unnecessary Nim build flags.

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-11-07 11:50:26 +01:00
Jakub Sokołowski 36f78a5970
nimbus.fluffy: disable Consul healthchecks
They are too flaky to be useful, see:
https://github.com/status-im/nimbus-eth1/issues/1880

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-11-02 12:51:01 +01:00
Jakub Sokołowski d2feb628c4
nimbus.fluffy: raise Consul alert threshold limits
This host constantly has issues with nodes and nobody cares.

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-11-02 10:51:57 +01:00
Jakub Sokołowski 7272d55105
nimbus.prater: drop chronos and erigon from linux-06
The host was overloaded and ran out of disk space on `/docker` volume.

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-11-02 10:25:40 +01:00
Jakub Sokołowski a6dc16830d
all: grant SSH access to ujscale, mumar@status
Necessary to look at full Nimbus Prater logs.

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-10-31 14:04:26 +01:00
Jakub Sokołowski bd9d7cc752
all: prevent SPAM Nimbus logs from reaching Logstash
Depends on:
https://github.com/status-im/infra-role-bootstrap-linux/commit/20609731
https://github.com/status-im/infra-role-bootstrap-linux/commit/98816e2a

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-10-24 16:13:01 +02:00
Jakub Sokołowski a86a65c4bc
nimbus.prater: disable log aggregation for the fleet
Zahary agreed that we need to start phasing out use of Prater.
This also helps us avoid paying extra for a 10 Gbps link for the aggr host.

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-10-23 13:02:39 +02:00
Jakub Sokołowski d2270feece
logs.nimbus: increase total_fields.limit to 1500
This is a special case since these logs are all custom JSON, so
increasing this is fine for now. I can't control what they put in logs.
https://discuss.elastic.co/t/approaches-to-deal-with-limit-of-total-fields-1000-in-index-has-been-exceeded/241039

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-10-12 18:56:21 +02:00
Jakub Sokołowski 63de71f759
all: remove SSH access for tanguy
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-10-12 18:00:59 +02:00
Jakub Sokołowski 6855fc016b
ih-eu-mda1: drop data center override
We now have Consul, logs, and metrics hosts in `ih-eu-mda1`.
https://github.com/status-im/infra-hq/issues/105

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-10-12 17:26:08 +02:00
Jakub Sokołowski 874771e109
nimbus.prater: fix port clash between Erigon and Geth
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-10-09 11:15:32 +02:00
Jakub Sokołowski c6fc550cf3
nimbus.holesky: fix Erigon DevP2P port offset
It needs to be 10, because a special flag `--p2p.allowed-ports` is used
to open multiple ports for multiple enabled Eth protocol versions.

For more information you can see:
https://github.com/status-im/infra-role-erigon/commit/eaef1e9f
https://github.com/ledgerwatch/erigon/issues/8330

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-10-02 19:42:16 +02:00
Alexis Pentori 458652e7f8
sepolia: Exposing ERA files
Signed-off-by: Alexis Pentori <alexis@status.im>
2023-10-02 13:42:51 +02:00
Jakub Sokołowski 5e12025aa6
all: grant admin to Dustin user
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-09-29 10:52:36 +02:00
Jakub Sokołowski adc1a061c4
nimbus.holesky: use the same ports for all EL node types
Otherwise we'd need some kind of weird logic to compile the list of URLs
used by the beacon node, and the node types are exclusive so this is fine.

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-09-28 14:43:28 +02:00
Jakub Sokołowski 7b45d24b43
nimbus.holesky: upgrade Geth to 1.13.2
Drop usage of master build.

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-09-28 11:54:00 +02:00
Jakub Sokołowski 2588a658cf
nimbus.holesky: use 2.49.3 Erigon release
https://github.com/ledgerwatch/erigon/releases/tag/v2.49.3

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-09-27 19:54:35 +02:00
Jakub Sokołowski 7db4374fc9
nimbus.holesky: drop index from BN and VC names
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-09-27 19:49:36 +02:00
Jakub Sokołowski ebb9cc82b3
nimbus.holesky: upgrade EL nodes to support new genesis
https://github.com/status-im/infra-nimbus/issues/152

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-09-27 19:49:35 +02:00
Jakub Sokołowski 5446b3fc0f
nimbus.holesky: open metrics ports for EL nodes
https://github.com/status-im/infra-nimbus/issues/152

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-09-16 11:38:40 +02:00
Jakub Sokołowski f200a1b4c5
nimbus.holesky: fleet config and validator layout
https://github.com/status-im/infra-nimbus/issues/152

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-09-15 16:25:38 +02:00
Alexis Pentori 73184446d6
nimbus.prater: removing ephemeral debug option
Signed-off-by: Alexis Pentori <alexis@status.im>
2023-09-15 14:37:39 +02:00
Alexis Pentori e8834f4a9e
nimbus.sepolia: removing ephemeral debug option
Signed-off-by: Alexis Pentori <alexis@status.im>
2023-09-15 14:37:39 +02:00
Daniil Sobol b9373b7889
all: grant SSH access to daniil@status.im
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-09-11 09:41:05 +02:00
Jakub Sokołowski a658d312a8
nimbus.prater: add stable node to macm1-01 host
https://github.com/status-im/infra-nimbus/issues/132

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-09-06 10:19:21 +02:00
Jakub Sokołowski 227206c82d
nimbus.prater: move validators to macm1-01
https://github.com/status-im/infra-nimbus/issues/132

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-09-06 10:18:44 +02:00
Jakub Sokołowski a823709dfe
add macm1-01.ih-eu-mda1.nimbus.prater host
Replacement for `macos-01.ms-eu-dublin.nimbus.prater`.

https://github.com/status-im/infra-nimbus/issues/132

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-08-18 21:46:25 +02:00
Jakub Sokołowski 2a99b6ab43
nimbus.prater: add Nethermind node on linux-04
https://github.com/status-im/infra-eth2/issues/11

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-08-17 13:26:08 +02:00
Jakub Sokołowski c662e92d51
nimbus.prater: configure Nethermind metrics endpoint
https://github.com/status-im/infra-eth2/issues/11

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-08-17 13:05:28 +02:00
Jakub Sokołowski 8b6a22110a
nimbus.prater: open ports for Nethermind EL node
https://github.com/status-im/infra-eth2/issues/11

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-08-11 14:21:19 +02:00
Jakub Sokołowski 1e24f891fe
nimbus.prater: add DNS discovery entry for nethermind
https://github.com/status-im/infra-eth2/issues/11

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-08-11 14:20:34 +02:00
Jakub Sokołowski 600c6b02df
nimbus.prater: add chronos node on linux-06 host
For Eugene, to track regressions in the chronos library.

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-08-08 15:54:11 +02:00
Jakub Sokołowski 8d5d8a3935
refactor handling of long libp2p branch name
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-08-08 15:22:03 +02:00
Jakub Sokołowski 5344827479
nimbus.prater: deploy Nethermind node on linux-02
Part of work to use Nethermind for eth2.prod fleet:
https://github.com/status-im/infra-eth2/issues/11

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-08-07 23:05:33 +02:00
Jakub Sokołowski 5586db729d
nimbus.prater: reduce max_headers_size to 128 KB
Probably the reason for elevated memory usage when using a Validator
Client with a large number of validators attached.

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-08-07 14:54:07 +02:00
Jakub Sokołowski 2c9bbe832b
nimbus.sepolia: open Geth Websocket ports for Vac
Requested by p1ge0nh8er for vacdev.misc host.

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-07-16 10:57:40 +02:00
Jakub Sokołowski f304db1cc0
nimbus.prater: bump Erigon memory limit to 15%
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-07-16 10:18:40 +02:00
Jakub Sokołowski 41025265e2
nimbus.prater: add 16 GB SWAP file, no SWAP partition
There were OOM killer logs on `linux-06` due to Erigon.

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-07-12 09:38:19 +02:00
Jakub Sokołowski 7dff81fb2f
add windows-01.ih-eu-mda1.nimbus.prater host
https://github.com/status-im/infra-nimbus/issues/132

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-07-07 12:17:33 +02:00
Jakub Sokołowski 4480d292be
nimbus.sepolia: debug flag for old attestation stability
As requested by Dustin.

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-07-06 08:32:55 +02:00
Jakub Sokołowski cec778f4f1
nimbus.prater: debug flag for old attestation stability
As requested by Dustin.

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-07-06 08:32:54 +02:00
Jakub Sokołowski 52e518d3c3
nimbus.sepolia: drop nim_commit=version-1-6 flag
It no longer has any effect.

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-07-06 08:32:53 +02:00