Commit Graph

382 Commits

Author SHA1 Message Date
gusto 07ae5a4df9
Add missing build and test flags (#301) 2023-08-10 14:20:19 +03:00
Daniel Sanchez 44367a15a5
Sim update overlay (#298)
* Update sim engine on timeout qc

* Remove rebuild method
2023-08-10 13:18:17 +02:00
Daniel Sanchez ef5935d572
Simulation prune tally (#299)
* Add pruning methods

* Call pruning on node step method

* Fix tally retain closure
2023-08-10 13:12:48 +02:00
gusto da60d8fc95
Parallel node init (#300) 2023-08-10 12:56:02 +03:00
Giacomo Pasini 78c6566d8a
Add libp2p backend to nomos-node (#285)
* Add support for libp2p backend in integration tests

* Add support for libp2p in nomos-node

* change default to waku

* add mutually exclusive features warning

* disable default features to avoid unification

* disable default features

* remove leftover cargo build

* Make sure we are subscribed to libp2p topic at startup

* unify imports

* typo in ci config

* Sequential build and test steps for features

* Add RandomBeaconState to libp2p carnot variant

---------

Co-authored-by: Gusto <bacvinka@gmail.com>
2023-08-09 07:42:08 +02:00
gusto d675585a0f
Test for partial message sending in simulation (#294)
* Test for partial message sending

* Test correctness, typos

* Fix the node capacity flushing

* Do not process double timeout qcs in simulations

* Discard older view messages in simulations messages

* Refactor committed_blocks to latest_committed_blocks

* Remove tally default

* Fix condition to root committee parenting

* Bring back pruning

* Clippy happy

---------

Co-authored-by: danielsanchezq <sanchez.quiros.daniel@gmail.com>
2023-08-08 18:00:08 +03:00
Daniel Sanchez 4bdc3ed15a
Add update committees structure for overlay (#295)
* Move roundrobin to leadership module

* Use references in leader selection

* Add membership traits to overlay and create membership module

* Implement committee membership for random beacon state

* Update flat overlay

* Create updateable membership traits and impls

* Update tree overlay

* Update overlay on consensus service

* Update overlay on simulations nomos node

* Update types on tests and modules

* Use chacha for shuffling

* Change to mut slice instead of inner cloning

* Use fisher yates shuffle from scratch

* Stylish and clippy happy
2023-08-08 10:34:02 +02:00
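
The shuffling bullets in the commit above (shuffle a mutable slice in place, Fisher-Yates written from scratch) can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `SplitMix64` is a hypothetical stand-in for the ChaCha-based generator the commit mentions, and `fisher_yates_shuffle` is an assumed name.

```rust
/// Tiny deterministic generator (SplitMix64), used here only as a
/// stand-in for a seeded ChaCha RNG. Not cryptographically secure.
pub struct SplitMix64(pub u64);

impl SplitMix64 {
    pub fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// In-place Fisher-Yates shuffle over a mutable slice, so the caller's
/// data is permuted without cloning it first.
pub fn fisher_yates_shuffle<T>(items: &mut [T], rng: &mut SplitMix64) {
    // Walk from the last index down to 1, swapping each element with a
    // randomly chosen element at or before it.
    for i in (1..items.len()).rev() {
        // Modulo introduces a slight bias; acceptable for a sketch.
        let j = (rng.next_u64() % (i as u64 + 1)) as usize;
        items.swap(i, j);
    }
}
```

Because the generator is seeded, two nodes that start from the same seed and the same member list end up with the same permutation, which is presumably the property an overlay needs when every node computes membership locally.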
gusto ef72c7a110
Add ci build steps for libp2p node version (#290)
* Add ci build steps for libp2p node version

* Update ci/Jenkinsfile.nightly.integration

Co-authored-by: Youngjoon Lee <taxihighway@gmail.com>

* Fix typos

* Use features in cargo check

* Feature and testcase matrix for integration tests

* Use jenkins matrix to separate steps for different features

---------

Co-authored-by: Youngjoon Lee <taxihighway@gmail.com>
2023-08-07 18:39:24 +03:00
Giacomo Pasini 4a3d677ea9
Small fixes for libp2p network backend (#280)
* Generate network events for self messages

Waku does that and it's kind of convenient not to handle ourselves
in a different way from the rest.

* Use bigger buffer + fmt

When receiving messages coming from libp2p IWANT requests, it's
common to receive a burst of packets which can cause subscribers
to lag. To account for that, let's increase the buffer in the
broadcast channel.

* Check if topic is being subscribed before self-notification (#292)

* fmt

---------

Co-authored-by: Youngjoon Lee <taxihighway@gmail.com>
2023-08-07 06:00:41 +02:00
Daniel Sanchez 30bf101576
Disable tree rebuild (#293) 2023-08-04 12:17:10 +02:00
Al Liu c16b794517
fix large msg sending logic (#274)
* fix large msg sending logic
2023-08-03 20:05:43 +08:00
Daniel Sanchez fa8e1025f5
Fix: genesis pruning in engine (#289)
* Do not prune genesis

* Fix retain condition
2023-08-03 12:25:33 +02:00
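
A retain condition that prunes old blocks while sparing genesis, as the commit above describes, might look like the sketch below. All names (`Block`, `GENESIS_VIEW`, `prune_older_blocks`) are hypothetical, and it assumes that genesis sits at view 0 and that blocks are keyed by id; the real engine types may differ.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified block: just an id and the view it belongs to.
#[derive(Debug, Clone, PartialEq)]
pub struct Block {
    pub id: u64,
    pub view: u64,
}

/// Assumption for this sketch: the genesis block lives at view 0.
pub const GENESIS_VIEW: u64 = 0;

/// Drop every block older than `threshold_view`, except genesis.
pub fn prune_older_blocks(blocks: &mut HashMap<u64, Block>, threshold_view: u64) {
    // `retain` keeps the entries for which the closure returns true.
    blocks.retain(|_, block| block.view == GENESIS_VIEW || block.view >= threshold_view);
}
```

With blocks at views 0 through 4 and a threshold of 3, the map keeps views 0, 3 and 4.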
Giacomo Pasini 083d061e46
Add dummy libp2p adapter for mempool (#286) 2023-08-02 17:00:52 +02:00
Giacomo Pasini f8422fc7a8
Add initial implementation of libp2p consensus adapter (#279)
* Add initial implementation of libp2p consensus adapter

Co-authored-by: Youngjoon Lee <taxihighway@gmail.com>

* fix

* Handle all message types received via gossipsub (#283)

* remove todo

Co-authored-by: Youngjoon Lee <taxihighway@gmail.com>

---------

Co-authored-by: Youngjoon Lee <taxihighway@gmail.com>
2023-08-02 14:07:44 +02:00
Giacomo Pasini c2ca46e6a8
Add command to retrieve libp2p network info (#281)
* Add command to retrieve libp2p network info

* fix field name

* fix
2023-08-01 17:04:18 +02:00
Daniel Sanchez fac42cd31d
Prune blocks in consensus engine (#275)
* Add prune older blocks call

* Prune older blocks in simulation app

* Use latest safe view for failure case

* Pick parent view from high qc

* Add safety assertion on pruning method
2023-08-01 16:47:47 +02:00
Daniel Sanchez ae59be5466
Refactor parent committee methods to return optional parent instead of empty one (#282) 2023-08-01 09:19:29 +02:00
Giacomo Pasini f21f1ea10a
Small additions to libp2p (#278)
* Add initial peer config to nomos-libp2p

* Use custom message id to avoid duplicates

* Expose reference to the inner swarm

* move closure into function
2023-08-01 03:26:42 +02:00
gusto 976b1f9577
Add node step time to report (#271)
* Skip step if previous node exceeded step time

* Fix clippy

* Passively keep track of step execution time in node
2023-07-26 10:21:05 +03:00
gusto ef0b0701f3
Simulations nodeid report (#272)
* Add node id to the node state report

* Update config with record settings example
2023-07-25 17:17:26 +03:00
gusto df97ea2543
Simulations network throughput (#270)
* Use step time from configuration

* PayloadSize trait for inmemory network interface

* Per node network capacity

* Initial carnot message sizes

* Divide provided kbps by step time

* Use std::mem::size_of for msg sizes
2023-07-25 16:03:05 +03:00
gusto 7d64915dd7
Get current view directly from the engine (#269) 2023-07-21 12:30:52 +03:00
gusto 8dd34f81b4
Simulation tally fix (#268)
* Do not remove entry cache when the threshold is reached

* Leader super majority change

* Update leader threshold test
2023-07-21 12:30:32 +03:00
Al Liu 6dee12704d
Change simulation `Node` trait `current_view` method return `View` (#259) 2023-07-20 17:31:55 +08:00
Daniel Sanchez 0866dfc8af
Process single events (#267) 2023-07-20 10:08:09 +02:00
Daniel Sanchez adf935830e
Do not remove entry cache when the threshold is reached (#266)
Co-authored-by: Gusto <bacvinka@gmail.com>
2023-07-19 18:04:08 +02:00
gusto 9f71dbb24c
Simulation network broadcast fix (#262)
* Replace network broadcast msg type with a dedicated channel

* Update tests with broadcast chan

* Replace threadrng with seedable smallrng

* Simplify the broadcast loop
2023-07-18 15:01:29 +03:00
gusto 94e384f609
Tree settings for sim overlay (#257)
* Tree settings for sim overlay

* WIP dyn node simulation

* Allow generic settings and states for dispatched nodes

* Update tests
2023-07-18 14:44:23 +03:00
gusto fe9f2ba006
Updated integration tests cases, increased nightly iteration count (#263) 2023-07-17 21:16:05 +03:00
Al Liu a59682be54
fix waku feature gate (#260) 2023-07-14 15:59:16 +08:00
Giacomo Pasini c29a641a9f
Refactor NetworkAdapter (#258)
* Rework NetworkAdapter API

The NetworkAdapter API failed to isolate the internals by
providing a way to send a message to a user-provided channel while
the stream listeners expected specific formats.
Unify network messages under the same enum and simplify sending/
broadcasting messages.

* remove redundant inlines

* use committee.id()

* fmt
2023-07-12 16:12:25 +02:00
Al Liu 9467351c10
Finish `View` wrapper (#254)
* finish View wrapper
2023-07-12 21:30:22 +08:00
Giacomo Pasini 4745b99996
use type alias for network adapter (#256) 2023-07-12 13:33:58 +02:00
Al Liu 7a776af530
Finish `BlockId` wrapper (#253)
* finish BlockId wrapper
2023-07-12 19:15:29 +08:00
Al Liu 2135676606
Finish `NodeId` type wrapper (#252)
* add NodeId wrapper
2023-07-11 23:16:49 +08:00
Giacomo Pasini da2dba2e51
Add unhappy path tests (#247)
* Make timeout configurable

Add a way to configure the consensus timeout at startup.

* Make leader threshold and timeout configurable in tests

* Add tests for the unhappy path

Add a test for the unhappy path by stopping a node.
The rest of the peers are sufficient to reach a quorum but the
offline node will fail to produce a block when it's its turn as a
leader, thus triggering the recovery procedure twice before the
test is considered complete.

* ignore clippy warning
2023-07-11 11:00:11 +02:00
Youngjoon Lee 2b9769b5b7
feat: add `nomos-libp2p` crate (for nomos-network backend) (#237)
* feat: add libp2p network backend skeleton

* use tokio runtime managed by Overwatch

* feat: add nomos-libp2p crate

* remove gossipsub_message_id_fn

* clippy

* use next() instead of select_next_some()

* rename send_command to execute_command

* const timeout

* disable authn / msg signing to start from a clean slate

* rename CommandSender to CommandResultSender

* add comments

* move node machinery to networkbackend

* fmt

* logs more network events

---------

Co-authored-by: Giacomo Pasini <g.pasini98@gmail.com>
2023-07-11 10:33:57 +02:00
Al Liu a0cb738b9f
Finish `Committee` wrapper (#251)
* Committee type wrapper

---------

Co-authored-by: danielsanchezq <sanchez.quiros.daniel@gmail.com>
2023-07-07 16:08:19 +08:00
Al Liu 95bfd24f6d
Finish `CommitteeId` wrapper (#249)
* CommitteeId type wrapper

* rewrite committee id logic

* remove unused convert

* use correct way to get committee id by member id

* cleanup and add comment

* Simplify child committees method

---------

Co-authored-by: danielsanchezq <sanchez.quiros.daniel@gmail.com>
2023-07-07 15:38:28 +08:00
gusto 37076aaeeb
Nightly tests (#240)
* Nightly notifications

* Publish discovered regression cases as a PR

* Nightly notifications

* Increase nightly tests runtime

* Extract discordnotify and params

* Groovy script fixes

* Remove unused iterations variable

* Cleanup jenkinsfiles and add verbosity
2023-07-06 16:30:43 +03:00
Youngjoon Lee 8d0360ab3c
fix(fuzz): fix precondition for receive_timeout_qc (#248) 2023-07-06 21:14:26 +09:00
Giacomo Pasini 3607ce7627
Unhappy path fixes (#245)
* Add unhappy path handlers for first view

Nodes resolving the first view were not instructed to handle
possible failures. In particular, no timeouts were configured
and nodes were not listening for timeouts from other nodes.
This commit fixes this by making them use the
same path as every other view.

* Endless timeouts

Keep signaling timeout until consensus is achieved.
This could be helpful in network partitioning where messages could
be lost.

* Ensure consistent committee hash

Sort committee members before hashing to ensure all nodes in the
network obtain a consistent hash

* Fix timeout_qc filtering condition

'>' was used instead of '=='

* Fix new view topic name

'votes' was used instead of 'new-view'

* Fix filtering condition for unhappy path tally

Rogue '!' at the beginning

* Add timeout tally

Filter timeouts to allow only timeouts for the current view
coming from root committee members.
We might want to try to unify this with happy and unhappy tally.

* Add debug logs

* clippy happy
2023-07-05 15:30:57 +02:00
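
The "Ensure consistent committee hash" note in the commit above comes down to sorting members before hashing, so the result no longer depends on the order in which a node happened to collect them. A minimal sketch, assuming a hypothetical 4-byte `NodeId` and using the standard library's `DefaultHasher` purely for illustration (its output is not guaranteed stable across Rust releases, so a real network would want a fixed, cryptographic hash):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical node identifier; the real type is likely wider.
pub type NodeId = [u8; 4];

/// Hash a committee so that member order does not affect the result.
pub fn committee_hash(members: &[NodeId]) -> u64 {
    // Sort a copy so the caller's slice is left untouched.
    let mut sorted = members.to_vec();
    sorted.sort_unstable();

    let mut hasher = DefaultHasher::new();
    sorted.hash(&mut hasher);
    hasher.finish()
}
```

Two nodes holding the same members in different orders then compute the same value, while the unsorted variant would generally disagree.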
Al Liu d0c6df23fc
Implement Rust version tree overlay (#238)
* implement tree overlay in Rust
2023-07-05 17:28:22 +08:00
Giacomo Pasini 90cf29bf86
Better tests (#232)
* add basic integration tests

* add a way to configure overlay threshold

* Save logs to file in case of failure

* Increase number of test nodes to 10

* fix tests

* use fraction instead of tuple

* fmt
2023-07-04 11:39:28 +02:00
Youngjoon Lee 5199ee12e9
fix: add a guard on the view for LeaderProof validation (#233) 2023-07-04 18:31:33 +09:00
Youngjoon Lee 98aa138b87
feat: enforce TimeoutQc to be constructed only by new() (#229) 2023-06-28 23:37:27 +09:00
Al Liu 224a3a53f5
fix tally_by condition (#234) 2023-06-28 17:56:16 +08:00
Al Liu 4a80690c1f
Add a cli arg to support redirect log (#230)
* add cli arg to support redirect log
2023-06-28 17:10:19 +08:00
Youngjoon Lee 8956a3547b
fix: remove `Qc.parent_view()` that has a bug (#223) 2023-06-27 23:56:36 +09:00
Giacomo Pasini b884e1ceca
Fix waku backend (#231)
* Fix order of network streams

When fetching a message from the network, we need to first listen
for incoming messages and then look at the storage. If we do this
in the opposite order, there's a brief moment where we've frozen
our view of stored messages and are not yet listening for incoming
ones, thus risking losing messages.

* Wait for waku db spurious delays

Sometimes waku takes some time (e.g. a few seconds) before making
a received message available through the archive query. Let's
account for this by making repeated calls if the first one is not
successful.

* Add initial network wait

We've observed in testing that even if waku reports that some
peers are connected it can't really deliver a message. To overcome
this limitation, we add a wait at the network service startup.
We know this is not ideal, but Waku will eventually be replaced and
we're looking for a quick fix.

* fmt
2023-06-27 16:29:01 +02:00
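
The ordering issue described under "Fix order of network streams" in the commit above can be illustrated with a small in-memory model. This is only a sketch: `MockNetwork` and `merge` are hypothetical names, a `std::sync::mpsc` channel stands in for the live stream, and a `Vec` stands in for the Waku archive.

```rust
use std::collections::HashSet;
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical stand-in for a backend with an archive and live listeners.
pub struct MockNetwork {
    stored: Vec<u64>,
    subscribers: Vec<Sender<u64>>,
}

impl MockNetwork {
    pub fn new() -> Self {
        MockNetwork { stored: Vec::new(), subscribers: Vec::new() }
    }

    /// Start listening; only messages published after this call arrive.
    pub fn subscribe(&mut self) -> Receiver<u64> {
        let (tx, rx) = channel();
        self.subscribers.push(tx);
        rx
    }

    /// Snapshot of everything stored so far.
    pub fn stored(&self) -> Vec<u64> {
        self.stored.clone()
    }

    /// Store a message and forward it to every live listener.
    pub fn publish(&mut self, msg: u64) {
        self.stored.push(msg);
        for tx in &self.subscribers {
            // A dropped receiver is not an error for this sketch.
            let _ = tx.send(msg);
        }
    }
}

/// Combine a storage snapshot with pending live messages, dropping the
/// duplicates that appear when a message is present in both.
pub fn merge(snapshot: Vec<u64>, live: &Receiver<u64>) -> Vec<u64> {
    let mut seen = HashSet::new();
    let mut out = Vec::new();
    for msg in snapshot.into_iter().chain(live.try_iter()) {
        if seen.insert(msg) {
            out.push(msg);
        }
    }
    out
}
```

Subscribing first and reading storage second means a message published between the two steps shows up in both sources and is deduplicated. Doing it the other way round leaves a window in which such a message is in neither the snapshot nor the live stream.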