Commit Graph

143 Commits

Author SHA1 Message Date
gusto 37076aaeeb
Nightly tests (#240)
* Nightly notifications

* Publish discovered regression cases as a PR

* Nightly notifications

* Increase nightly tests runtime

* Extract discordnotify and params

* Groovy script fixes

* Remove unused iterations variable

* Cleanup jenkinsfiles and add verbosity
2023-07-06 16:30:43 +03:00
Youngjoon Lee 8d0360ab3c
fix(fuzz): fix precondition for receive_timeout_qc (#248) 2023-07-06 21:14:26 +09:00
Giacomo Pasini 3607ce7627
Unhappy path fixes (#245)
* Add unhappy path handlers for first view

Nodes resolving the first view were not instructed to handle
possible failures. In particular, no timeouts were configured
and nodes were not listening for timeouts from other nodes.
This commit fixes this by making them use the
same path as every other view.

* Endless timeouts

Keep signaling timeout until consensus is achieved.
This could be helpful in network partitioning where messages could
be lost.

* Ensure consistent committee hash

Sort committee members before hashing to ensure all nodes in the
network obtain a consistent hash

* Fix timeout_qc filtering condition

'>' was used instead of '=='

* Fix new view topic name

'votes' was used instead of 'new-view'

* Fix filtering condition for unhappy path tally

Rogue '!' at the beginning

* Add timeout tally

Filter timeouts to allow only timeouts for the current view
coming from root committee members.
We might want to try to unify this with happy and unhappy tally.

* Add debug logs

* clippy happy
2023-07-05 15:30:57 +02:00
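
Two of the fixes described in #245 (sorting committee members before hashing, and filtering timeouts by view and root-committee membership) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `NodeId`, `View`, `Timeout`, `committee_hash` and `accept_timeout` are hypothetical names, and `DefaultHasher` merely stands in for whatever hash function the node actually uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

// Hypothetical stand-ins; the real consensus-engine types may differ.
type NodeId = [u8; 32];
type View = u64;

// Hypothetical timeout message carrying only the fields the filter needs.
struct Timeout {
    view: View,
    sender: NodeId,
}

/// Hash a committee independently of the order in which members are listed.
fn committee_hash(members: &[NodeId]) -> u64 {
    // Sorting first gives every node the same byte sequence to hash.
    let mut sorted = members.to_vec();
    sorted.sort_unstable();
    // DefaultHasher is a placeholder, not the hash used by the project.
    let mut hasher = DefaultHasher::new();
    for id in &sorted {
        id.hash(&mut hasher);
    }
    hasher.finish()
}

/// Accept a timeout only if it is for the current view and was sent
/// by a member of the root committee.
fn accept_timeout(
    timeout: &Timeout,
    current_view: View,
    root_committee: &HashSet<NodeId>,
) -> bool {
    timeout.view == current_view && root_committee.contains(&timeout.sender)
}
```

With this sketch, `committee_hash(&[a, b, c])` and `committee_hash(&[c, a, b])` produce the same value, and a timeout from a node outside the root committee or for a different view is rejected.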
Al Liu d0c6df23fc
Implement Rust version tree overlay (#238)
* implement tree overlay in Rust
2023-07-05 17:28:22 +08:00
Giacomo Pasini 90cf29bf86
Better tests (#232)
* add basic integration tests

* add a way to configure overlay threshold

* Save logs to file in case of failure

* Increase number of test nodes to 10

* fix tests

* use fraction instead of tuple

* fmt
2023-07-04 11:39:28 +02:00
Youngjoon Lee 5199ee12e9
fix: add a guard on the view for LeaderProof validation (#233) 2023-07-04 18:31:33 +09:00
Youngjoon Lee 98aa138b87
feat: enforce TimeoutQc to be constructed only by new() (#229) 2023-06-28 23:37:27 +09:00
Al Liu 224a3a53f5
fix tally_by condition (#234) 2023-06-28 17:56:16 +08:00
Al Liu 4a80690c1f
Add a cli arg to support redirect log (#230)
* add cli arg to support redirect log
2023-06-28 17:10:19 +08:00
Youngjoon Lee 8956a3547b
fix: remove `Qc.parent_view()` that has a bug (#223) 2023-06-27 23:56:36 +09:00
Giacomo Pasini b884e1ceca
Fix waku backend (#231)
* Fix order of network streams

When fetching a message from the network, we need to first listen
for incoming messages and then look at the storage. If we do this
in the opposite order, there's a brief moment where we've frozen
our view of stored messages and are not yet listening for incoming
ones, thus risking losing messages.

* Wait for waku db spurious delays

Sometimes waku takes some time (e.g. a few seconds) before making
a received message available through the archive query. Let's
account for this by making repeated calls if the first one is not
successful.

* Add initial network wait

We've observed in testing that even if waku reports that some
peers are connected it can't really deliver a message. To overcome
this limitation, we add a wait at the network service startup.
We know this is not ideal, but Waku will eventually be replaced and
we're looking for a quick fix.

* fmt
2023-06-27 16:29:01 +02:00
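
The "repeated calls if the first one is not successful" idea from #231 can be sketched as a small retry helper. This is a synchronous illustration under stated assumptions, not the project's code: `retry_query`, its parameters, and the use of `Option` to signal "not available yet" are all hypothetical, and the real backend is likely asynchronous.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Call `query` up to `max_attempts` times, sleeping `delay` between
/// attempts, and return the first successful result (hypothetical helper).
fn retry_query<T>(
    mut query: impl FnMut() -> Option<T>,
    max_attempts: u32,
    delay: Duration,
) -> Option<T> {
    for attempt in 0..max_attempts {
        if let Some(value) = query() {
            return Some(value);
        }
        // Do not sleep after the final failed attempt.
        if attempt + 1 < max_attempts {
            sleep(delay);
        }
    }
    None
}
```

A caller would wrap the archive query in a closure and pick `max_attempts` and `delay` to cover the few seconds of delay mentioned in the commit message.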
gusto 55648e3151
Simulation network broadcast (#225)
* Sim network broadcast method

* Remove list of all nodes in carnot node instance

* Use network broadcast method within nodes

* Add helper method for network payload
2023-06-27 14:20:59 +03:00
Giacomo Pasini 09370dcef8
Integrations tests (#221)
* make config struct fields public

* add basic integration tests

* Add libssl-dev dependency

* fix comments

---------

Co-authored-by: Gusto <bacvinka@gmail.com>
2023-06-27 13:05:09 +02:00
Youngjoon Lee 3eceed5d9a
test(fuzz): more randomness for receive_timeout_qc (#226) 2023-06-27 08:34:16 +09:00
Youngjoon Lee bdf31b77cb
docs(fuzz): update README (#227) 2023-06-26 23:46:24 +09:00
Youngjoon Lee 2c9d6b1401
test(fuzz): add more invariant checks (#224) 2023-06-26 20:34:12 +09:00
Youngjoon Lee feb428bf18
fix(fuzz): apply `ReceiveSafeBlock` & `ReceiveUnsafeBlock` only when parent exists (#222) 2023-06-26 20:20:55 +09:00
Youngjoon Lee deeb3eeba0
test(fuzz): add `approve_new_view` & `receive_safe_block_with_aggregated_qc` transition (#208) 2023-06-26 18:37:35 +09:00
Giacomo Pasini 8fad13b0cc
Separate nomos node into bin and lib sections (#209) 2023-06-22 16:58:47 +02:00
Al Liu 18bbf63a3f
update simulation app configuration (#217) 2023-06-22 19:50:34 +08:00
Youngjoon Lee ec3fb62baf
refactor(fuzz): modularize fuzz testing (#211) 2023-06-22 15:15:38 +09:00
Youngjoon Lee f4194bc728
test(fuzz): add `local_timeout` & `receive_timeout_qc` transitions (#207) 2023-06-22 10:58:26 +09:00
Youngjoon Lee 371ba17922
test: consensus-engine fuzz testing (happy path) (#186) 2023-06-21 22:35:32 +09:00
Alberto Soutullo 53a89c33a8
Fixed indent in initial_peers and added relayTopics that was missing (#213) 2023-06-21 14:16:45 +02:00
Alberto Soutullo 3f30bcc0a8
Changed mockpool-node to nomos-node in Dockerfile (#212)
* Changed mockpool-node to nomos-node in Dockerfile

* Deleted old yml and copying correct one
2023-06-21 12:02:56 +02:00
Al Liu 4ccc19f5a3
Support configurable records (#200)
* support configurable records
2023-06-21 16:31:36 +08:00
Al Liu c74b53be2e
Log message sending failure (#206)
* fix a todo in waku adapter
2023-06-21 16:30:47 +08:00
Giacomo Pasini 2fc10f94f8
Rename mockpool-node to nomos-node (#203) 2023-06-20 17:04:52 +02:00
Youngjoon Lee f02234f9d1
deps: bump waku-bindings to handle `ephemeral` field (#160) 2023-06-20 20:58:22 +09:00
Youngjoon Lee 6061ba3f3d
fix: prevent skipping one view when proposing a block in unhappy path (#199) 2023-06-20 00:09:33 +09:00
Youngjoon Lee 2db493c753
build: bump Rust to 1.70.0 to resolve clippy errors (#201) 2023-06-20 00:03:54 +09:00
Daniel Sanchez bed0b9448d
Simulation unhappy path (#193)
* Use elapsed time

* Added timeout

* Extract tally

* Missing elapsed time

* Fix new view leader behaviour

* Fix tests

* Fix timeout double check

* Fix logs

* TimeoutHandler nitpicks

* Clippy happy

* Fix timeout sub

* Modify discard messages comment
2023-06-19 15:27:14 +02:00
gusto faacd10172
Remove dummy node from simapp settings and app itself (#196) 2023-06-17 13:38:59 +03:00
Al Liu 40048fa47b
Fix #183: support structured log for sim app. (#184) 2023-06-16 19:26:48 +08:00
gusto 285f899c8a
Update and rename main.yml to master.yml (#194) 2023-06-15 12:48:36 +03:00
Al Liu 9afd6c007c
Simulation conf polish (#181)
* polish Duration ser/deser and support step_time in conf file
2023-06-15 17:42:42 +08:00
Al Liu b6789b994e
Fix #182: gracefully shutdown (#187)
* finish signal and cleanup
2023-06-15 17:40:29 +08:00
Youngjoon Lee f631d9b737
fix: prevent approving new views not bigger than the last timeout_qc (#165) 2023-06-15 18:31:52 +09:00
Giacomo Pasini 2c23bda702
Carnot info api (#177)
* add missing accessors

* add api to return info from consensus service

* move processing into its own function

* feat: add HTTP GET `/carnot/info` API (#178)

---------

Co-authored-by: Youngjoon Lee <taxihighway@gmail.com>
2023-06-15 11:29:50 +02:00
Al Liu e57cb7b3cf
remove RwLock from SimmulationRunnerInner (#189) 2023-06-15 11:16:28 +02:00
Al Liu 27d9f72035
Simulation happy path (#161)
* finish subscriber manager

* optimize subscribe on SimulationRunnerHandle

* fix comment

* replace std locks to parking_lot locks

* move producer initialization out simulate fn

* WIP

* optimize run fn

* update condition

* fix CI

* collect run times

* Add happy-path consensus engine

* tmp

* Fit types from spec (#124)

* Match types to spec

* Remove Output import

* Consensus engine rework (#126)

* rework

* fix test

* clippy happy

---------

Co-authored-by: Giacomo Pasini <Zeegomo@users.noreply.github.com>

* Adapt carnot network adapter interfaces and implementations

* Fix errors

* Update network with engine types

* Fit types yet again

* Remove leadership and old overlay
Create carnot event builder
Added some adjustments

* Add view to vote

* Fix serde derive in consensus-engine

* Add serde feature for engine in core

* Use view in tally

* Move carnot tally to consensus service

* Add new view msg

* Fit engine types in adapter

* Missing serde feature in consensus service

* Implement carnot event builder

* Implement event builder run main tasks

* Fill up view resolver

* Fix errors on network adapter implementations

* Clippy happy

* Extract event handling to independent methods in View

* Fix test

* Refactor carnot event builder (#135)

* refactor

* format

* Discriminate proposal messages (#136)

* Derive block id from wire format (#139)

* Derive block id from wire format

* Derive id on block creation

* Use compile time hash size

* Add leader role (#138)

* add leadership stub

* fix gather_new_views

* fmt

* actually build qc

* remove redundant fields

* add flat overlay (#143)

* add flat overlay

* fix

* sort imports

* fix tests

* Fix waku update

* rewrite data collection add different kind of subscribers

* fix fmt

* Unhappy tally (#137)

* Refactor tally module

* Implement tally for new view messages

* Assess pr comments

* Fix rebase

* simplify tally

---------

Co-authored-by: Giacomo Pasini <g.pasini98@gmail.com>

* fix gather_new_views

* working node

* fix unhappy path

* remove leftover

* Kickstart event building in sim app

* finish event builder

* fix comment

* add Tally

* gather enough new views then construct ProposalBlock event

* Revert "gather enough new views then construct ProposalBlock event"

This reverts commit 87da2bdd0c.

* WIP: CarnotNode

* WIP

* finish event handle

* dump state

* WIP

* finish message sending

* fix some compile errors

* make project compile

* update

* fix fmt and clippy

* optimize json ser/deser and add a config

* update Cargo.toml

* Implement leader proposing (#154)

* Implement leader proposing

* fix fmt

---------

Co-authored-by: al8n <scygliu1@gmail.com>

* fix ser/deser bugs

* fix subscriber bugs

* Fix proposing genesis

* Fix genesis retrieval in consensus-engine

* Bring back general block proposal event

* Fix leaf voting

* fix init node bugs

* add more tracing

* fix empty qc

* fix data race

* fix all panics

* cleanup

* propose new blocks

* fix comment

* do not approve for the same block

* no panics

* fix some comments

* use serde_with

* Bring back genesis on 0

* Fix genesis retrieval
Replace output enum
Vote for genesis proposal

* Genesis methods

* fix StandardQc::genesis()

* fix genesis block bug

* fix PR comment

* fix PR comment

* fix PR comment

* fix PR comment

* fix PR comment

* Fix tally
Fix proposing

* Remove public block building
Added raw method

* Missing fmt

* clippy happy

* fix io stream downcast

* optional stream-type arg, by default we do not run any subscriber

* fmt

* cleanup

* Remove from header block constructor

* Fix duplicated approve (#180)

* fix duplicated approve

* Success tally just on threshold

* Integrate random beacon on happy path

* Fix missing updating beacon

* Replicate consensus output

* Prune older non relevant messages from cache

* Remove view info just again

* Refactor Block deps

* Reverse wrong parent committee call in Consensus engine

* Remove useless event builder settings

* Remove blocks store from event builder

* Remove unnecessary carnot seed

* Remove duplicated proposals check

---------

Co-authored-by: Giacomo Pasini <g.pasini98@gmail.com>
Co-authored-by: Daniel Sanchez <sanchez.quiros.daniel@gmail.com>
Co-authored-by: Giacomo Pasini <Zeegomo@users.noreply.github.com>
2023-06-14 16:52:37 +02:00
Giacomo Pasini 75025e8cf0
Consensus service refactor (#176)
The consensus service was becoming a bit messy and difficult to maintain.
This PR moves the handling of individual events into their own functions and refactors the cancelable task system into a separate struct.
2023-06-13 12:11:55 +02:00
Jakub Sokołowski fe0361a5b8
ci: fix GOCACHE and GOPATH to avoid poisoning (#125)
We've had this issue before with some `go-waku` builds:
https://github.com/waku-org/go-waku/pull/512

Signed-off-by: Jakub Sokołowski <jakub@status.im>
2023-06-13 11:29:01 +03:00
Giacomo Pasini 9c81b72711
Random beacon (#167)
* move overlay to consensus engine

* Integrate random beacon in overlay

This commit integrates the random beacon as specified in the nomos
spec into the consensus engine library as part of the overlay.
In addition, it separates the overlay part responsible for leader
selection as an independent trait LeaderSelection, so as to share
the overall overlay structure among most constructions.

Furthermore, a leader proof has been added to a block to verify
that the proposer had valid rights to do so.

The current implementation hardcodes the leader selection update
in the consensus service, but we probably want to abstract it away
through something like the adapter pattern we use for other services
in the node.

* Move leader selection update to separate function

* Add generic support for leader selection in consensus service (#170)

* Add generic support for leader selection in consensus service

* fix

* use settings struct instead of tuple

* fix tests
2023-06-12 15:14:49 +02:00
Youngjoon Lee cfaa7cf772
fix: prevent skipping the current view when calculating latest_committed_block (#163) 2023-06-07 19:32:48 +09:00
Youngjoon Lee a055c2524a
fix: prevent skipping one view when proposing a block in unhappy path (#166) 2023-06-07 09:56:21 +09:00
Al Liu d864fecd07
chore: replace std locks to parking_lot locks in simulations (#141)
* replace std locks to parking_lot locks
2023-05-31 12:57:42 +08:00
Youngjoon Lee 2d60ce9921
fix: remove duplicate `tracing_subscriber` init for `nomos-node` bin (#157) 2023-05-29 22:40:22 +09:00
Youngjoon Lee 6229fc98bc
fix gather_timeout being pending (#156) 2023-05-27 09:32:39 +09:00
Giacomo Pasini 6c64720e39
Add root timeout handling (#153)
* Add handling of root timeout

* Use StandardQc instead of Qc for high_qc

Since high_qc is guaranteed to be a standard qc, let's just use
StandardQc as a type in all messages that use it.
2023-05-22 15:23:30 +02:00
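
The reasoning in #153 (use `StandardQc` directly wherever `high_qc` is guaranteed to be a standard QC) can be sketched at the type level. The field layouts below, the `AggregateQc` and `TimeoutMsg` names, and the `high_qc` accessor are assumptions for illustration only; just `Qc` and `StandardQc` are named in the commit message.

```rust
// Hypothetical stand-in for the view number type.
type View = u64;

#[derive(Debug, Clone, PartialEq)]
struct StandardQc {
    view: View,
}

// Hypothetical aggregated QC that embeds the highest standard QC it has seen.
#[derive(Debug, Clone, PartialEq)]
struct AggregateQc {
    view: View,
    high_qc: StandardQc,
}

#[derive(Debug, Clone, PartialEq)]
enum Qc {
    Standard(StandardQc),
    Aggregated(AggregateQc),
}

impl Qc {
    /// Whatever the variant, the high QC is always a StandardQc.
    fn high_qc(&self) -> &StandardQc {
        match self {
            Qc::Standard(qc) => qc,
            Qc::Aggregated(agg) => &agg.high_qc,
        }
    }
}

// Hypothetical message: typing the field as StandardQc means receivers
// never need to handle an aggregated QC in this position.
#[derive(Debug, Clone, PartialEq)]
struct TimeoutMsg {
    view: View,
    high_qc: StandardQc,
}
```

Typing the field as `StandardQc` rather than `Qc` moves the guarantee from a runtime check into the compiler, so a message with an aggregated `high_qc` cannot be constructed.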