Observation Runtime Plan
Why this work exists
TF is good at deployment plumbing. It is weak at continuous observation.
Today, the same problems are solved repeatedly with custom loops:
- TF block feed logic in Logos
- Cucumber manual-cluster polling loops
- ad hoc catch-up scans for wallet and chain state
- app-local state polling in expectations
That is the gap this work should close.
The goal is not a generic "distributed systems DSL". The goal is one reusable observation runtime that:
- continuously collects data from dynamic sources
- keeps typed materialized state
- exposes both current snapshot and delta/history views
- fits naturally in TF scenarios and Cucumber manual-cluster code
Constraints
TF constraints
- TF abstractions must stay universal and simple.
- TF must not know app semantics like blocks, wallets, leaders, jobs, or topics.
- TF must remain useful for simple apps such as openraft_kv, not only Logos.
App constraints
- Apps must be able to build richer abstractions on top of TF.
- Logos must be able to support:
- current block-feed replacement
- fork-aware chain state
- public-peer sync targets
- multi-wallet UTXO tracking
- Apps must be able to adopt this incrementally.
Migration constraints
- We do not want a flag-day rewrite.
- Existing loops can coexist with the new runtime until replacements are proven.
Non-goals
This work should not:
- put feed back onto the base Application trait
- build app-specific semantics into TF core
- replace filesystem blockchain snapshots used for startup/restore
- force every app to use continuous observation
- introduce a large public abstraction stack that nobody can explain
Core idea
Introduce one TF-level observation runtime.
That runtime owns:
- source refresh
- scheduling
- polling/ingestion
- bounded history
- latest snapshot caching
- delta publication
- freshness/error tracking
- lifecycle hooks for TF and Cucumber
Apps own:
- source types
- raw observation logic
- materialized state
- snapshot shape
- delta/event shape
- higher-level projections such as wallet state
Public TF surface
The TF public surface should stay small.
ObservedSource<S>
A named source instance.
Used for:
- local node clients
- public peer endpoints
- any other app-owned source type
SourceProvider<S>
Returns the current source set.
This must support dynamic source lists because:
- manual cluster nodes come and go
- Cucumber worlds may attach public peers
- node control may restart or replace sources during a run
Observer
App-owned observation logic.
It defines:
- Source
- State
- Snapshot
- Event
And it implements:
- init(...)
- poll(...)
- snapshot(...)
The important boundary is:
- TF owns the runtime
- app code owns materialization
ObservationRuntime
The engine that:
- starts the loop
- refreshes sources
- calls poll(...)
- stores history
- publishes deltas
- updates latest snapshot
- tracks last error and freshness
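To make that loop concrete, here is a synchronous sketch of a single runtime iteration. The trait shapes are simplified stand-ins for the async ones below, and the `tick` function and its tuple return type are illustrative only, not a proposed API:

```rust
// Synchronous sketch of one runtime iteration; the real runtime is async
// and long-running. Trait shapes are simplified stand-ins.
pub struct ObservedSource<S> {
    pub name: String,
    pub source: S,
}

pub trait SourceProvider<S> {
    fn sources(&self) -> Vec<ObservedSource<S>>;
}

pub trait Observer {
    type Source;
    type State;
    type Snapshot;
    type Event;
    fn poll(
        &self,
        sources: &[ObservedSource<Self::Source>],
        state: &mut Self::State,
    ) -> Result<Vec<Self::Event>, String>;
    fn snapshot(&self, state: &Self::State) -> Self::Snapshot;
}

// One iteration: refresh sources -> poll -> publish deltas and snapshot,
// tracking the last error instead of aborting the loop.
pub fn tick<P, O>(
    provider: &P,
    observer: &O,
    state: &mut O::State,
) -> (O::Snapshot, Vec<O::Event>, Option<String>)
where
    P: SourceProvider<O::Source>,
    O: Observer,
{
    let sources = provider.sources(); // dynamic: re-fetched every tick
    let (events, err) = match observer.poll(&sources, state) {
        Ok(events) => (events, None),
        Err(e) => (Vec::new(), Some(e)),
    };
    (observer.snapshot(state), events, err)
}
```

Note the error path: a failed poll still yields a snapshot of the previous state, which is what lets the handle expose both "latest known" and "last error" at once.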
ObservationHandle
The read-side interface for workloads, expectations, and Cucumber steps.
It should expose at least:
- latest snapshot
- delta subscription
- bounded history
- last error
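One possible read-side shape, sketched synchronously with a plain mutex; every name here is hypothetical, and the real handle would presumably wrap async channels rather than locks:

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};

// Hypothetical read-side handle; field and method names are illustrative only.
#[derive(Clone)]
pub struct ObservationHandle<Snapshot, Event> {
    inner: Arc<Mutex<HandleState<Snapshot, Event>>>,
}

struct HandleState<Snapshot, Event> {
    latest: Option<Snapshot>,
    history: VecDeque<Event>, // bounded by `cap`
    cap: usize,
    last_error: Option<String>,
}

impl<Snapshot: Clone, Event: Clone> ObservationHandle<Snapshot, Event> {
    pub fn new(cap: usize) -> Self {
        Self {
            inner: Arc::new(Mutex::new(HandleState {
                latest: None,
                history: VecDeque::new(),
                cap,
                last_error: None,
            })),
        }
    }

    // Runtime side: publish a new snapshot plus the events that produced it.
    pub fn publish(&self, snapshot: Snapshot, events: Vec<Event>) {
        let mut s = self.inner.lock().unwrap();
        s.latest = Some(snapshot);
        for e in events {
            if s.history.len() == s.cap {
                s.history.pop_front(); // drop oldest to keep history bounded
            }
            s.history.push_back(e);
        }
    }

    pub fn latest(&self) -> Option<Snapshot> {
        self.inner.lock().unwrap().latest.clone()
    }

    pub fn history(&self) -> Vec<Event> {
        self.inner.lock().unwrap().history.iter().cloned().collect()
    }

    pub fn last_error(&self) -> Option<String> {
        self.inner.lock().unwrap().last_error.clone()
    }
}
```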
Intended shape
```rust
pub struct ObservedSource<S> {
    pub name: String,
    pub source: S,
}

#[async_trait]
pub trait SourceProvider<S>: Send + Sync + 'static {
    async fn sources(&self) -> Vec<ObservedSource<S>>;
}

#[async_trait]
pub trait Observer: Send + Sync + 'static {
    type Source: Clone + Send + Sync + 'static;
    type State: Send + Sync + 'static;
    type Snapshot: Clone + Send + Sync + 'static;
    type Event: Clone + Send + Sync + 'static;

    async fn init(
        &self,
        sources: &[ObservedSource<Self::Source>],
    ) -> Result<Self::State, DynError>;

    async fn poll(
        &self,
        sources: &[ObservedSource<Self::Source>],
        state: &mut Self::State,
    ) -> Result<Vec<Self::Event>, DynError>;

    fn snapshot(&self, state: &Self::State) -> Self::Snapshot;
}
```
This is enough.
If more helper layers are needed, they should stay internal first.
How current use cases fit
openraft_kv
Use one simple observer.
- sources: node clients
- state: latest per-node Raft state
- snapshot: sorted node-state view
- events: optional deltas, possibly empty at first
This is the simplest proving case. It validates the runtime without dragging in Logos complexity.
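A sketch of what that observer could look like, synchronous and self-contained for illustration; RaftState and the source shape here are made-up stand-ins for the real openraft_kv client types:

```rust
use std::collections::BTreeMap;

// Stand-in types; the real source would be an openraft_kv node client.
#[derive(Clone, Debug, PartialEq)]
pub struct RaftState {
    pub term: u64,
    pub is_leader: bool,
}

pub struct ObservedSource<S> {
    pub name: String,
    pub source: S,
}

// Synchronous, simplified version of the Observer shape from "Intended shape".
pub struct KvObserver;

impl KvObserver {
    // state: latest per-node Raft state, keyed by source name
    pub fn init(&self) -> BTreeMap<String, RaftState> {
        BTreeMap::new()
    }

    pub fn poll(
        &self,
        sources: &[ObservedSource<RaftState>],
        state: &mut BTreeMap<String, RaftState>,
    ) -> Vec<(String, RaftState)> {
        let mut events = Vec::new();
        for src in sources {
            // In the real observer this would be an async client call.
            let observed = src.source.clone();
            if state.get(&src.name) != Some(&observed) {
                state.insert(src.name.clone(), observed.clone());
                events.push((src.name.clone(), observed)); // delta: node state changed
            }
        }
        events
    }

    // snapshot: sorted node-state view (BTreeMap iterates in key order)
    pub fn snapshot(&self, state: &BTreeMap<String, RaftState>) -> Vec<(String, RaftState)> {
        state.iter().map(|(k, v)| (k.clone(), v.clone())).collect()
    }
}
```

Emitting a delta only when a node's state actually changes is what makes "events: optional deltas, possibly empty at first" cheap to support.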
Logos block feed replacement
Use one shared chain observer.
- sources: local node clients
- state:
  - node heads
  - block graph
  - heights
  - seen headers
  - recent history
- snapshot:
  - current head/lib/graph summary
- events:
  - newly discovered blocks
This covers both existing Logos feed use cases:
- current snapshot consumers
- delta/subscription consumers
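The state half of that observer can be sketched as a small block-graph ingest step. The types here are illustrative, not the real Logos ones, but they show why one state structure can serve both consumer kinds: ingestion updates the graph for snapshot readers and emits a NewBlock delta only on first sight for subscribers:

```rust
use std::collections::{HashMap, HashSet};

// Illustrative chain-observer state; real Logos types would replace these.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct HeaderId(pub String);

#[derive(Default)]
pub struct ChainState {
    pub parents: HashMap<HeaderId, HeaderId>, // block graph: child -> parent
    pub heights: HashMap<HeaderId, u64>,
    pub seen: HashSet<HeaderId>,
    pub node_heads: HashMap<String, HeaderId>,
}

#[derive(Clone, Debug, PartialEq)]
pub enum ChainEvent {
    NewBlock { id: HeaderId, height: u64 },
}

impl ChainState {
    // Ingest one reported block; node heads are always updated, but a
    // delta is returned only on first sight of the block.
    pub fn ingest(
        &mut self,
        node: &str,
        id: HeaderId,
        parent: HeaderId,
        height: u64,
    ) -> Option<ChainEvent> {
        self.node_heads.insert(node.to_string(), id.clone());
        if !self.seen.insert(id.clone()) {
            return None; // already known: no event for subscribers
        }
        self.parents.insert(id.clone(), parent);
        self.heights.insert(id.clone(), height);
        Some(ChainEvent::NewBlock { id, height })
    }
}
```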
Cucumber manual-cluster sync
Use the same observer runtime with a different source set.
- sources:
  - local manual-cluster node clients
  - public peer endpoints
- state:
  - local consensus views
  - public consensus views
  - derived majority public target
- snapshot:
  - current local and public sync picture
This removes custom poll/sleep loops from steps.
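The step-local loops collapse into one shared wait-on-snapshot helper. This is a synchronous sketch with a hypothetical SyncSnapshot type; the real version would await the handle's delta subscription instead of sleeping:

```rust
use std::time::{Duration, Instant};

// Hypothetical sync-picture snapshot.
#[derive(Clone)]
pub struct SyncSnapshot {
    pub local_height: u64,
    pub public_target: u64,
}

// Generic replacement for ad hoc poll/sleep loops in Cucumber steps:
// wait until the latest snapshot satisfies a predicate, or time out.
pub fn wait_for<F>(
    latest: impl Fn() -> Option<SyncSnapshot>,
    pred: F,
    timeout: Duration,
) -> Result<SyncSnapshot, String>
where
    F: Fn(&SyncSnapshot) -> bool,
{
    let deadline = Instant::now() + timeout;
    loop {
        if let Some(snap) = latest() {
            if pred(&snap) {
                return Ok(snap);
            }
        }
        if Instant::now() >= deadline {
            return Err("timed out waiting for sync condition".into());
        }
        std::thread::sleep(Duration::from_millis(10));
    }
}
```

The point is not the loop itself but where it lives: one helper over the shared snapshot, instead of a bespoke loop per step.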
Multi-wallet fork-aware tracking
This should not be a TF concept.
It should be a Logos projection built on top of the shared chain observer.
- input: chain observer state
- output: per-header wallet state cache keyed by block header
- property: naturally fork-aware because it follows actual ancestry
That replaces repeated backward scans from tip with continuous maintained state.
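A minimal sketch of that projection, with a toy balance standing in for real UTXO state. The per-header cache is what makes it fork-aware: the state at a block is derived from its actual parent, so two forks off the same ancestor get independent entries with no backward scanning:

```rust
use std::collections::HashMap;

#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct HeaderId(pub String);

// Illustrative wallet state; real Logos state would track UTXOs per wallet.
#[derive(Clone, Debug, PartialEq, Default)]
pub struct WalletState {
    pub balance: i64,
}

// Per-header cache: wallet state at block B = state at parent(B) + B's delta.
pub struct WalletProjection {
    cache: HashMap<HeaderId, WalletState>,
}

impl WalletProjection {
    pub fn new(genesis: HeaderId) -> Self {
        let mut cache = HashMap::new();
        cache.insert(genesis, WalletState::default());
        Self { cache }
    }

    // Apply one block's effect on the wallet. The parent state must already
    // be cached, because the chain observer feeds blocks in ancestry order.
    pub fn apply_block(&mut self, id: HeaderId, parent: &HeaderId, delta: i64) {
        let mut state = self.cache[parent].clone();
        state.balance += delta;
        self.cache.insert(id, state);
    }

    pub fn at(&self, id: &HeaderId) -> Option<&WalletState> {
        self.cache.get(id)
    }
}
```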
Logos layering
Logos should not put every concern into one giant impl.
Recommended layering:
- Chain source adapter
  - local node reads
  - public peer reads
- Shared chain observer
  - catch-up
  - continuous ingestion
  - graph/history materialization
- Logos projections
  - head view
  - public sync target
  - fork graph queries
  - wallet state
  - tx inclusion helpers
TF provides the runtime. Logos provides the domain model built on top.
Adoption plan
Phase 1: add TF observation runtime
- add ObservedSource, SourceProvider, Observer, ObservationRuntime, ObservationHandle
- keep the public API small
- no app migrations yet
Phase 2: prove it on openraft_kv
- add one simple observer over /state
- migrate one expectation to use the observation handle
- validate local, compose, and k8s
Phase 3: add Logos shared chain observer
- implement it alongside current feed/loops
- do not remove existing consumers yet
- prove snapshot and delta outputs are useful
Phase 4: migrate one Logos consumer at a time
Suggested order:
- fork/head snapshot consumer
- tx inclusion consumer
- Cucumber sync-to-public-chain logic
- wallet/UTXO tracking
Phase 5: delete old loops and feed paths
- only after the new runtime has replaced real consumers cleanly
Validation gates
Each phase should have clear checks.
Runtime-level
- crate-level cargo check
- targeted tests for runtime lifecycle and history retention
- explicit tests for dynamic source refresh
App-level
- openraft_kv:
  - local failover
  - compose failover
  - k8s failover
- Logos:
  - one snapshot consumer migrated
  - one delta consumer migrated
- Cucumber:
  - one manual-cluster sync path migrated
Open questions
These should stay open until implementation forces a decision:
- whether ObservationHandle should expose full history directly or only cursor/subscription access
- how much error/freshness metadata belongs in the generic runtime vs app snapshot types
- whether multiple observers should share one scheduler/runtime instance or simply run independently first
Design guardrails
When implementing this work:
- keep TF public abstractions minimal
- keep app semantics out of TF core
- do not chase a generic testing DSL
- build from reusable blocks, not one-off mega impls
- keep migration incremental
- prefer simple, explainable runtime behavior over clever abstraction