Jakub Sokołowski
55b31f42f5
all: do not send trace level logs to logstash
...
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-07-23 12:12:54 +02:00
Jakub Sokołowski
7ef357f9e9
requirements: bump certbot and postgres-ha
...
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-07-23 10:28:15 +02:00
Jakub Sokołowski
5591327ea3
store: lower staging retention to 75 GB to avoid alerts
...
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-07-17 09:26:42 +02:00
Jakub Sokołowski
d66bb10326
store-db: bump data volume from 300 to 310 GB
...
Otherwise we trigger the alert for less than 15% disk space left.
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-07-17 09:26:41 +02:00
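The effect of the bump above on the 15% free-space alert can be sketched with simple arithmetic. This is only an illustration: the function name is hypothetical, and the actual usage of the volume is not stated in the commit.

```python
def max_used_before_alert_gb(volume_gb: float, min_free_pct: float = 15.0) -> float:
    """Largest amount of used space (GB) that still leaves min_free_pct free."""
    return volume_gb * (100.0 - min_free_pct) / 100.0


# Old and new volume sizes taken from the commit subject.
for size_gb in (300, 310):
    print(size_gb, "GB volume: alert once usage exceeds",
          max_used_before_alert_gb(size_gb), "GB")
```

Under this assumption the extra 10 GB moves the alert point from 255 GB to 263.5 GB of used space.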
Jakub Sokołowski
770dad967e
store,boot: fix Docker tag name
...
We are in the middle of renaming fleets and this will make it more
robust.
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-07-16 16:55:32 +02:00
Jakub Sokołowski
749c281209
store-db: fix variable name for Postgres Alter System
...
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-07-16 13:42:51 +02:00
Jakub Sokołowski
f554fe7185
store-db: set autovacuum_work_mem to 10% of memory
...
We have seen host crashes caused by PostgreSQL using up all memory by
trying to run `autovacuum` workers.
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-07-16 13:42:50 +02:00
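A minimal sketch of how a "10% of memory" value for `autovacuum_work_mem` could be derived. The helper names and the 16 GiB host size are assumptions for illustration; the commit does not show how the role computes the value. PostgreSQL interprets a unit-less `autovacuum_work_mem` as kilobytes, so the sketch works in kB.

```python
def autovacuum_work_mem_kb(total_memory_kb: int, percent: int = 10) -> int:
    """Return the given percentage of total memory, in kB, rounded down."""
    return total_memory_kb * percent // 100


def alter_system_statement(value_kb: int) -> str:
    """Render the setting as an ALTER SYSTEM statement (illustrative only)."""
    return f"ALTER SYSTEM SET autovacuum_work_mem = '{value_kb}kB';"


# Hypothetical host with 16 GiB of RAM, expressed in kB.
total_kb = 16 * 1024 * 1024
print(alter_system_statement(autovacuum_work_mem_kb(total_kb)))
```

Each autovacuum worker can use up to this amount, so the total also depends on how many workers are allowed to run at once.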
Jakub Sokołowski
97544ad634
store: set retention policy using size instead of time
...
Using time causes the DB to be filled quickly.
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-07-09 11:16:38 +02:00
Jakub Sokołowski
2988df6c5b
store: bump data volume from 250 GB to 300 GB
...
The garbage doesn't stop flowing.
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-07-09 11:15:43 +02:00
Jakub Sokołowski
8bb033cf6c
flake: add flake.nix and lock
...
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-07-05 13:40:30 +02:00
Jakub Sokołowski
a0ad0410d9
ansible: apply roles.py fixes
...
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-07-05 11:44:42 +02:00
Jakub Sokołowski
45f83b0039
store-db: bump data volume from 150 GB to 250 GB
...
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-07-04 16:54:24 +02:00
Jakub Sokołowski
040b9d4949
rename shards fleet to status fleet
...
While also retaining the old domain names.
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-07-03 22:00:29 +02:00
Jakub Sokołowski
b1da421448
boot: uncomment setting for boot node key
...
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-07-03 01:19:08 +02:00
Ivan Folgueira Bande
062cb6d51a
set max_locks_per_transaction to 2160
...
We are using partitions in our Postgres DBs, with one
partition per hour (24 partitions per day).
The default max_locks_per_transaction value (64) can cause
"out of memory" and blocking issues in the DB because we usually
have more than 64 partitions.
With 2160 we aim to avoid that issue for a time retention policy
of 90 days (2160 == 90*24). We usually have a time retention policy
of 30 days in our Status fleets, so this just adds some extra margin.
2024-06-26 13:15:24 +02:00
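The sizing in the commit above can be checked with a short calculation. The function name is hypothetical; the hourly partitioning, the default of 64 and the 30- and 90-day retention periods come from the commit message.

```python
HOURS_PER_DAY = 24           # one partition per hour, per the commit message
DEFAULT_MAX_LOCKS = 64       # default max_locks_per_transaction cited in the commit


def partitions_for_retention(days: int, per_day: int = HOURS_PER_DAY) -> int:
    """Number of hourly partitions that exist for a given retention period."""
    return days * per_day


for days in (30, 90):
    count = partitions_for_retention(days)
    print(f"{days} days -> {count} partitions "
          f"(exceeds default of {DEFAULT_MAX_LOCKS}: {count > DEFAULT_MAX_LOCKS})")
```

With hourly partitions the default of 64 is already exceeded after less than three days of data, which is consistent with the issue the commit describes.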
Jakub Sokołowski
46c7a759b9
versions.tf: upgrade pass provider to 2.1.1
...
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-06-24 13:29:35 +02:00
Jakub Sokołowski
a7e9cb6e30
ansible/roles.py: fix pull call to handle up-to-date repo
...
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-06-24 08:50:45 +02:00
Jakub Sokołowski
9fce0e4211
ansible: add roles.py script to manage roles
...
https://github.com/status-im/infra-template/pull/5
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-06-13 17:17:56 +02:00
Jakub Sokołowski
6ae82c6c09
requirements: bump postgres-ha role
...
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-06-13 17:17:42 +02:00
Jakub Sokołowski
a8162303d8
store-db: double size of hosts to handle big queries
...
We've been experiencing extremely high average load reaching 30-40, most
probably due to unoptimized queries. Doubling hosts to at least allow
easier debugging of issues.
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-06-12 17:22:13 +02:00
Alexis Pentori
7178fc4d83
store: set logrotate frequency to hourly
...
Signed-off-by: Alexis Pentori <alexis@status.im>
2024-06-10 10:31:15 +02:00
Jakub Sokołowski
f26e7f2708
readme: update entries for all fleets
...
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-06-06 15:06:52 +02:00
Jakub Sokołowski
89c487dfcc
requirements: bump nim-waku to use new compose module
...
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-06-03 14:19:59 +02:00
Ivan Folgueira Bande
dc9b6f5a81
boot: set max msg size to 1024KiB to fit store nodes
...
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-06-03 10:21:08 +02:00
Jakub Sokołowski
9bbed44078
test: bump data volumes to 150 GB
...
Migrations introduced in `0.26.0` need more than twice the currently
used space to be performed. Stupid.
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-05-24 10:04:03 +02:00
Jakub Sokołowski
e1b4be4a24
store: un-comment nim_waku_node_key variable
...
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-05-23 18:10:15 +02:00
Jakub Sokołowski
aa3e653a53
store: lower sensitivity of consul healthchecks
...
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-05-12 10:30:30 +02:00
Jakub Sokołowski
e7b1cdcb85
lookup_plugins/bitwarden: ignore stderr
...
Otherwise we get weird JSON parsing errors:
```
An unhandled exception occurred while running the lookup plugin 'bitwarden'.
Error was a <class 'json.decoder.JSONDecodeError'>, original message:
Extra data: line 1 column 843 (char 842). Extra data: line 1 column 843 (char 842)
```
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-05-07 14:55:51 +02:00
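The "Extra data" error quoted above is what `json.loads` raises when non-JSON text follows a valid JSON document. The sketch below reproduces that with a stand-in child process, not the real Bitwarden CLI or the actual lookup plugin; the `CHILD` script and helper names are hypothetical.

```python
import json
import subprocess
import sys

# Hypothetical stand-in for a CLI: valid JSON on stdout, a warning on stderr.
CHILD = (
    "import sys;"
    "sys.stdout.write('{\"ok\": true}\\n'); sys.stdout.flush();"
    "sys.stderr.write('warning: something unrelated\\n')"
)


def run_child(merge_stderr: bool) -> str:
    """Run the child and return captured stdout, merging or discarding stderr."""
    stderr = subprocess.STDOUT if merge_stderr else subprocess.DEVNULL
    proc = subprocess.run(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        stderr=stderr,
        text=True,
        check=True,
    )
    return proc.stdout


def parse_child_output(merge_stderr: bool) -> dict:
    """Parse the captured output as JSON."""
    return json.loads(run_child(merge_stderr))


print(parse_child_output(merge_stderr=False))  # stderr ignored, JSON parses
try:
    parse_child_output(merge_stderr=True)      # warning text trails the JSON
except json.JSONDecodeError as err:
    print("parse failed:", err)
```

Keeping stderr out of the parsed stream means warnings from the tool cannot corrupt the JSON payload.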
Anton Iakimov
f39afef54d
boot: logrotate hourly due to lots of DBG logs
2024-04-24 16:01:04 +02:00
Jakub Sokołowski
883893f547
deploy new shards.staging fleet
...
https://github.com/status-im/infra-shards/issues/29
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-03-18 20:48:58 +01:00
Jakub Sokołowski
4ef143ed20
ansible/main: run DB setup before node setup
...
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-03-18 20:06:16 +01:00
Jakub Sokołowski
f116eef7ce
requirements: bump certbot to fix init
...
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-03-18 18:39:43 +01:00
Jakub Sokołowski
ae852ef9b1
versions: upgrade cloudflare provider, drop account_id
...
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-03-18 16:25:08 +01:00
Jakub Sokołowski
3f5c9ea4cb
store: drop temporary image lock for store-02.gc
...
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-03-15 19:15:59 +01:00
Jakub Sokołowski
3c60a6dcde
boot,store: go back to using proper deploy branches
...
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-03-15 14:27:38 +01:00
Jakub Sokołowski
74be1115c6
boot,store: use both new and old domain names
...
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-03-15 14:27:37 +01:00
Jakub Sokołowski
c87a3310ac
ansible/inventory: update to use status.im domain
...
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-03-14 22:57:09 +01:00
Jakub Sokołowski
81850e6466
requirements: use full names for all roles
...
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-03-14 21:56:06 +01:00
Jakub Sokołowski
01e2f7bc1e
drop statusim.net domain config in favor of status.im
...
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-03-14 21:56:05 +01:00
Jakub Sokołowski
dbb007ef4d
requirements: bump nim-waku role
...
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-03-14 17:19:09 +01:00
Jakub Sokołowski
717b37aa0c
node: expose config.toml using Nginx server
...
This can then be linked from the new https://fleets.waku.org/ .
https://github.com/status-im/infra-misc/issues/229
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-03-07 12:44:09 +01:00
Jakub Sokołowski
04be3c33d4
requirements: bump nim-waku role to use config file
...
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-03-04 22:15:17 +01:00
Jakub Sokołowski
bde743c656
boot,store: add /waku/2/rs/16/1 topic
...
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-03-04 22:09:38 +01:00
Jakub Sokołowski
1e025a18ff
boot,store: temporarily lock image at v0.24.0
...
Attempt to upgrade to 0.25.0 caused major connectivity issues.
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-02-29 20:11:22 +01:00
Jakub Sokołowski
352c55ff73
boot,store,db: add serial setting for playbook
...
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-02-29 13:36:08 +01:00
Jakub Sokołowski
6004610d63
boot,store: add cluster ID required for 0.25.0
...
16 is the value "reserved" for status fleets with static sharding
https://rfc.vac.dev/spec/51/#static-sharding
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-02-29 12:34:10 +01:00
Jakub Sokołowski
5169eb13a5
requirements: bump nim-waku to fix consul definition
...
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-02-23 12:34:41 +01:00
Alexis Pentori
968ebdf925
store-db: increasing data volume to 100GB
...
Signed-off-by: Alexis Pentori <alexis@status.im>
2024-02-02 09:47:17 +01:00
Jakub Sokołowski
1813cf46ca
store: set max-msg-size to 1024KiB
...
https://github.com/waku-org/nwaku/issues/2305
Signed-off-by: Jakub Sokołowski <jakub@status.im>
2024-02-01 13:41:18 +01:00
Anton Iakimov
abe3642480
nim-waku: add --ip-colocation-limit flag
...
https://github.com/status-im/infra-shards/issues/27
2024-01-24 15:25:34 +01:00