Captures the "super FFI" roadmap (future-work.md) and a concrete design for item #1, typed host callbacks via {.ffiHost.} (design-host-callbacks.md), including the per-language host-consumption sketches and the non-blocking FFI-thread contract the generated wrappers must enforce. The design is grounded in the existing threading model: a host answering from its own thread cannot complete a chronos Future directly, so the completion path mirrors the request path in reverse — enqueue + ThreadSignalPtr signal, drained and completed on the FFI (event-loop) thread. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Design: typed host callbacks ({.ffiHost.})
Status: draft / in progress. Roadmap item #1 from future-work.md.
Goal
Let a Nim {.ffi.} handler call back into the host language for typed data
and await the result:
# Declared in the library, implemented by the host (no Nim body):
proc fetchProfile(userId: string): Future[Result[Profile, string]] {.ffiHost.}

proc myAppLogin(app: MyApp, req: LoginReq): Future[Result[Session, string]] {.ffi.} =
  let profile = (await fetchProfile(req.userId)).valueOr:
    return err("host fetch failed: " & error)
  return ok(openSession(profile))
This is the inverse of events (which are lib → host, fire-and-forget). It is the "a lower layer needs to read from a higher one" case from logos-delivery #3865.
Why it's not just "events backwards"
Events invoke a host FFICallBack synchronously on the FFI thread and
ignore any return value. A host call must return data, and the host may take
arbitrary time / answer on its own thread. The chronos Future the Nim handler
awaits can only be completed on the FFI (event-loop) thread. So the result
has to be marshaled back across the thread boundary — exactly the reverse of the
existing request path:
host → lib request : reqChannel.trySend + reqSignal.fireSync → FFI loop → processRequest → reply callback
lib → host call    : hostFn(token, req) … host works … <lib>_host_complete(token, result)
                     → completionQueue.push + completionSignal.fireSync → FFI loop → fut.complete(result)
The completion path reuses the same primitive (ThreadSignalPtr + an SPSC/MPSC
queue) that reqSignal/reqChannel already use (ffi/ffi_context.nim).
Moving parts
1. Host-function registry (per context)
A small registry mirroring FFIEventRegistry (ffi/ffi_events.nim): maps a wire
name ("fetch_profile") to a (FFIHostFn, userData). The host registers an
implementation at runtime; a nil/missing entry makes the imported proc resolve
to err("host fn '<name>' not registered") rather than crash (never-crash
policy).
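The registry behaviour described above can be sketched in C. This is a minimal illustration, not the planned Nim implementation: HostFnRegistry, HostFnEntry, REG_CAP, registry_register, registry_lookup and sample_fetch_profile are hypothetical names, and the fixed-size array is an assumption made to keep the sketch short. Only the FFIHostFn signature comes from the ABI section of this document.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Host-function signature, as declared in the ABI section of this design. */
typedef void (*FFIHostFn)(uint64_t token, const char *req, size_t reqLen, void *userData);

#define REG_CAP 16  /* hypothetical capacity; the real registry may grow */

typedef struct {
    const char *name;      /* wire name, e.g. "fetch_profile"; caller keeps it alive */
    FFIHostFn   fn;
    void       *userData;
} HostFnEntry;

typedef struct {
    HostFnEntry entries[REG_CAP];
    size_t      count;
} HostFnRegistry;

/* Returns 0 on success, -1 on bad arguments or a full table.
   Registering an existing name replaces the previous entry. */
static int registry_register(HostFnRegistry *r, const char *name,
                             FFIHostFn fn, void *userData) {
    if (r == NULL || name == NULL || fn == NULL) return -1;
    for (size_t i = 0; i < r->count; i++) {
        if (strcmp(r->entries[i].name, name) == 0) {
            r->entries[i].fn = fn;
            r->entries[i].userData = userData;
            return 0;
        }
    }
    if (r->count == REG_CAP) return -1;
    r->entries[r->count++] = (HostFnEntry){name, fn, userData};
    return 0;
}

/* Returns NULL when nothing is registered under `name`, so the caller can
   report an error instead of calling through a nil pointer. */
static const HostFnEntry *registry_lookup(const HostFnRegistry *r, const char *name) {
    if (r == NULL || name == NULL) return NULL;
    for (size_t i = 0; i < r->count; i++) {
        if (strcmp(r->entries[i].name, name) == 0) return &r->entries[i];
    }
    return NULL;
}

/* Placeholder host implementation used only to exercise the registry. */
static void sample_fetch_profile(uint64_t token, const char *req,
                                 size_t reqLen, void *userData) {
    (void)token; (void)req; (void)reqLen; (void)userData;
}
```

A NULL from registry_lookup is the point where the imported proc would resolve to err("host fn '<name>' not registered").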
2. In-flight completion table (per context)
token: uint64 → Completer, where Completer holds the pending chronos
Future and a slot for the raw result bytes. Tokens are monotonic per context.
Guarded by a lock; only the FFI thread completes futures.
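A rough C sketch of such a table follows, assuming a pthread mutex as the lock and a fixed number of slots. InFlightTable, Completer, INFLIGHT_CAP and the inflight_* functions are hypothetical names; the Completer here is only a stand-in, since the real one holds a chronos Future.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define INFLIGHT_CAP 64  /* hypothetical bound on concurrent host calls */

/* Stand-in for the Completer; the real one holds the pending Future. */
typedef struct {
    uint64_t token;   /* 0 marks a free slot; real tokens start at 1 */
    int      ret;     /* slot for the status delivered by the host */
} Completer;

typedef struct {
    pthread_mutex_t lock;
    uint64_t        nextToken;            /* monotonic per context */
    Completer       slots[INFLIGHT_CAP];
} InFlightTable;

static void inflight_init(InFlightTable *t) {
    pthread_mutex_init(&t->lock, NULL);
    t->nextToken = 1;
    for (size_t i = 0; i < INFLIGHT_CAP; i++) t->slots[i] = (Completer){0, 0};
}

/* Allocates a fresh token and parks a Completer under it.
   Returns 0 when the table is full. */
static uint64_t inflight_add(InFlightTable *t) {
    uint64_t token = 0;
    pthread_mutex_lock(&t->lock);
    for (size_t i = 0; i < INFLIGHT_CAP; i++) {
        if (t->slots[i].token == 0) {
            token = t->nextToken++;
            t->slots[i] = (Completer){token, 0};
            break;
        }
    }
    pthread_mutex_unlock(&t->lock);
    return token;
}

/* Removes the Completer for `token` and copies it to *out.
   Returns 1 if found, 0 for an unknown or already-taken token. */
static int inflight_take(InFlightTable *t, uint64_t token, Completer *out) {
    int found = 0;
    if (token == 0 || out == NULL) return 0;
    pthread_mutex_lock(&t->lock);
    for (size_t i = 0; i < INFLIGHT_CAP; i++) {
        if (t->slots[i].token == token) {
            *out = t->slots[i];
            t->slots[i].token = 0;   /* free the slot */
            found = 1;
            break;
        }
    }
    pthread_mutex_unlock(&t->lock);
    return found;
}
```

Because a taken slot is freed and tokens are never reused, a second completion for the same token simply fails the lookup.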
3. Completion bridge (FFI thread integration)
- New completionSignal: ThreadSignalPtr + completionQueue on FFIContext. <lib>_host_complete(...) (called from the host thread) pushes (token, ret, bytes) onto the queue and fires completionSignal.
- The FFI loop (ffiThreadBody) additionally waits on completionSignal; on wake it drains the queue and, for each entry, looks up the token and calls fut.complete(decodedResult) on the loop thread, satisfying chronos.
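The enqueue-then-signal shape of the bridge can be illustrated in C. This sketch makes several assumptions: a mutex-protected array stands in for completionQueue, a pthread condition variable stands in for ThreadSignalPtr, and the loop side blocks on the condition variable where the real FFI loop would await the signal without blocking. CompletionQueue, Completion, QUEUE_CAP and the queue_* and host_worker functions are hypothetical names.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define QUEUE_CAP 32  /* hypothetical bound */

typedef struct {
    uint64_t token;
    int      ret;
} Completion;

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  signal;             /* stand-in for completionSignal */
    Completion      items[QUEUE_CAP];
    size_t          count;
} CompletionQueue;

static void queue_init(CompletionQueue *q) {
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->signal, NULL);
    q->count = 0;
}

/* Host side: callable from any thread. It only enqueues and signals.
   Returns 0 on success, -1 when the queue is full. */
static int queue_push(CompletionQueue *q, uint64_t token, int ret) {
    int rc = -1;
    pthread_mutex_lock(&q->lock);
    if (q->count < QUEUE_CAP) {
        q->items[q->count++] = (Completion){token, ret};
        pthread_cond_signal(&q->signal);
        rc = 0;
    }
    pthread_mutex_unlock(&q->lock);
    return rc;
}

/* Loop side: waits until at least one completion is queued, then moves up
   to `cap` entries into `out` in FIFO order and returns how many it moved. */
static size_t queue_wait_and_drain(CompletionQueue *q, Completion *out, size_t cap) {
    pthread_mutex_lock(&q->lock);
    while (q->count == 0) pthread_cond_wait(&q->signal, &q->lock);
    size_t n = q->count < cap ? q->count : cap;
    for (size_t i = 0; i < n; i++) out[i] = q->items[i];
    for (size_t i = n; i < q->count; i++) q->items[i - n] = q->items[i];
    q->count -= n;
    pthread_mutex_unlock(&q->lock);
    return n;
}

/* Simulated host thread: finishes its work and reports token 7 as OK. */
static void *host_worker(void *arg) {
    queue_push((CompletionQueue *)arg, 7, 0);
    return NULL;
}
```

After the drain, the loop thread would look up each token in the in-flight table and complete the matching future there, which keeps every future completion on one thread.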
4. The {.ffiHost.} macro
From a bodyless proc <name>(args…): Future[Result[T, string]] {.ffiHost.},
emit a normal async Nim proc whose body:
- marshals args into a request buffer (native POD first; CBOR variant later),
- allocates a token + registers a Completer (Future) in the in-flight table,
- looks up the host fn for "<name>"; if absent → return err(...),
- invokes hostFn(token, reqMsg, reqLen, userData),
- return await completer.fut (decoded to Result[T, string]).
Note the same dual-proc spirit as {.ffi.}: in-process Nim callers could later
get a directly-injectable implementation, but the foreign path goes through the
registry.
5. ABI + codegen (per language)
Exported symbols (added to c.nim and the other generators):
typedef void (*FFIHostFn)(uint64_t token, const char *req, size_t reqLen, void *userData);
int <lib>_register_host_fn(void *ctx, const char *name, FFIHostFn fn, void *userData);
int <lib>_host_complete(void *ctx, uint64_t token, int ret, const char *msg, size_t len);
Each generator then emits an idiomatic wrapper: register a closure, and on
completion call <lib>_host_complete. (Out of scope for the first slice: C ABI
+ a C e2e test prove the mechanism first.)
Host consumption (per language)
The raw contract a host satisfies, and the rule that shapes the wrappers:
- Register fn under a name. When the Nim handler awaits the imported proc, the library invokes fn on the FFI thread with a token + marshaled args.
- fn must return immediately: it sits on the chronos event-loop thread, so it captures the token, kicks the real work onto the host's own executor, and returns.
- When the work finishes (any thread, any time later), the host calls <lib>_host_complete(ctx, token, ret, msg, len), which enqueues + signals the FFI loop to complete the awaited Future. The token is what decouples "invoked on the FFI thread" from "answered later on the host's thread."
The generated wrapper hides token, threading hop, and marshaling — the host dev
writes a normal function in the language's async idiom. The wrapper's trampoline
does three things: decode req → typed args, run the closure on the host
executor (never inline on the FFI thread), encode the result and call
<lib>_host_complete.
// Go — trampoline spawns a goroutine, then host_complete
node.SetFetchProfile(func(userID string) (Profile, error) { return db.Lookup(userID) })
// Swift — trampoline launches a Task
node.fetchProfile = { userID in try await db.lookup(userID) }
// Kotlin — JNI trampoline launches a coroutine
node.setFetchProfile { userID -> db.lookup(userID) }
// Rust — closure returning a future, driven on the host runtime
node.set_fetch_profile(|user_id| async move { db.lookup(&user_id).await });
The gotcha the wrappers exist to enforce: a binding that ran the closure
inline in FFIHostFn would stall the event loop (and deadlock if the closure
re-entered the library). Each language needs its own trampoline to hop onto its
executor — that's the real work of increment 5, not a shared shim.
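For the raw C ABI, a trampoline that respects this rule might look like the sketch below. It is written under several assumptions: mylib_host_complete is a local stub standing in for the exported <lib>_host_complete ("mylib" is a made-up prefix), ret == 0 is taken to mean success, the context pointer is assumed to arrive through userData, and a plain pthread plays the role of the host executor. Job, run_job, fetch_profile_host and the g_* variables are hypothetical.

```c
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stub for <lib>_host_complete: records the last completion so the sketch
   runs without the library. The real symbol enqueues and signals. */
static uint64_t g_lastToken;
static int      g_lastRet = -1;
static char     g_lastMsg[64];

static int mylib_host_complete(void *ctx, uint64_t token, int ret,
                               const char *msg, size_t len) {
    (void)ctx;
    if (len >= sizeof g_lastMsg) len = sizeof g_lastMsg - 1;
    memcpy(g_lastMsg, msg, len);
    g_lastMsg[len] = '\0';
    g_lastToken = token;
    g_lastRet = ret;
    return 0;
}

/* Everything the worker needs, copied off the FFI thread's buffers. */
typedef struct {
    void    *ctx;
    uint64_t token;
    size_t   reqLen;
    char     req[];   /* private copy of the request bytes */
} Job;

static pthread_t g_worker;  /* kept only so a caller can join in this demo */

/* Runs on the host's own thread: does the work, then answers by token. */
static void *run_job(void *arg) {
    Job *job = arg;
    char reply[64];
    int n = snprintf(reply, sizeof reply, "profile:%.*s",
                     (int)job->reqLen, job->req);
    if (n < 0) n = 0;
    if ((size_t)n >= sizeof reply) n = (int)sizeof reply - 1;
    mylib_host_complete(job->ctx, job->token, 0, reply, (size_t)n);
    free(job);
    return NULL;
}

/* FFIHostFn-shaped entry point: invoked on the FFI thread, so it only
   copies its inputs, hands them to another thread, and returns. */
static void fetch_profile_host(uint64_t token, const char *req,
                               size_t reqLen, void *userData) {
    Job *job = malloc(sizeof *job + reqLen);
    if (job == NULL) {
        mylib_host_complete(userData, token, 1, "oom", 3);
        return;
    }
    job->ctx = userData;
    job->token = token;
    job->reqLen = reqLen;
    if (reqLen > 0) memcpy(job->req, req, reqLen);
    if (pthread_create(&g_worker, NULL, run_job, job) != 0) {
        free(job);
        mylib_host_complete(userData, token, 1, "spawn failed", 12);
    }
}
```

The request bytes are copied before returning because nothing in the contract above says the buffer outlives the call; that lifetime is an assumption worth pinning down in the ABI.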
Threading / safety notes
- Futures completed only on the FFI thread (drain runs there). host_complete from any thread only enqueues + signals.
- A host_complete for an unknown/expired token is dropped with a debug log (a late/double completion must not crash), per the never-crash policy.
- Context teardown must fail every outstanding Completer (err("context shutting down")) and drain the queue so no future is abandoned (matches the existing in-flight pending drain in ffiThreadBody).
- Re-entrancy: an imported call happens inside a {.ffi.} handler already on the FFI thread; it must await (yield the loop) so the loop keeps draining. It must never block the thread waiting on the host.
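The drop and teardown rules above can be checked with a small single-threaded C model of the loop-side bookkeeping. It is only an illustration: Ctx, Pending, SlotState, MAX_PENDING and the ctx_* functions are hypothetical, ret == 0 is assumed to mean success, and a state field stands in for the chronos Future that the real Completer holds.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PENDING 8  /* hypothetical bound */

typedef enum { SLOT_FREE = 0, SLOT_PENDING, SLOT_OK, SLOT_FAILED } SlotState;

typedef struct {
    uint64_t    token;
    SlotState   state;
    const char *error;   /* set only when state == SLOT_FAILED */
} Pending;

/* Zero-initialise before use: every slot is then SLOT_FREE. */
typedef struct {
    Pending  slots[MAX_PENDING];
    uint64_t lastToken;
} Ctx;

/* Starts a host call and returns its token, or 0 when no slot is free. */
static uint64_t ctx_begin_call(Ctx *c) {
    for (size_t i = 0; i < MAX_PENDING; i++) {
        if (c->slots[i].state == SLOT_FREE) {
            c->slots[i] = (Pending){++c->lastToken, SLOT_PENDING, NULL};
            return c->slots[i].token;
        }
    }
    return 0;
}

/* Applies one drained completion. Returns 1 if a pending call was
   resolved, 0 if the completion was dropped (unknown, late or double).
   A real implementation would emit a debug log on the drop path. */
static int ctx_complete(Ctx *c, uint64_t token, int ret) {
    for (size_t i = 0; i < MAX_PENDING; i++) {
        Pending *p = &c->slots[i];
        if (p->state == SLOT_PENDING && p->token == token) {
            p->state = (ret == 0) ? SLOT_OK : SLOT_FAILED;
            p->error = (ret == 0) ? NULL : "host error";
            return 1;
        }
    }
    return 0;
}

/* Teardown: fails every outstanding call so nothing is left waiting.
   Returns the number of calls that were failed. */
static size_t ctx_teardown(Ctx *c) {
    size_t failed = 0;
    for (size_t i = 0; i < MAX_PENDING; i++) {
        if (c->slots[i].state == SLOT_PENDING) {
            c->slots[i].state = SLOT_FAILED;
            c->slots[i].error = "context shutting down";
            failed++;
        }
    }
    return failed;
}
```

A completion that arrives after teardown finds no pending slot and takes the same drop path as an unknown token, so both cases share one code path.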
Increments
1. Registry + in-flight table (pure data structures + unit tests) ← first
2. Completion bridge on FFIContext (signal + queue + loop drain + teardown)
3. {.ffiHost.} macro (native POD marshaling, string args/results first)
4. C ABI codegen + a C end-to-end test (Nim handler calls a C-provided host fn)
5. Idiomatic wrappers in the per-language generators
6. CBOR variant + structured ({.ffi.}-typed) args/results