Commit Graph

55 Commits

Author SHA1 Message Date
Jacek Sieka 17e00e642a
limit write queue length (#376)
To break a potential read/write deadlock, gossipsub uses an unbounded
queue for writes - when peers are too slow to process this queue, it may
end up growing without bounds causing high memory usage.

Here, we introduce a maximum write queue length after which the peer is
disconnected - the queue is generous enough that any "normal" usage
should be fine - writes that are `await`:ed are not affected, only
writes that are launched in an `asyncSpawn` task or similar.

* avoid unnecessary copy of message when there are no send observers
* release message memory earlier in gossipsub
* simplify pubsubpeer logging
2020-09-24 18:43:20 +02:00
Jacek Sieka 25bd0a18f4
small fixes (#374)
* add helper to read EOF marker after closing stream (else stream stay
alive until timeout/reset)
* don't assert on empty channel message
* don't loop when writing to chronos (no need)
2020-09-24 07:30:19 +02:00
Jacek Sieka 49a12e619d
channel close race and deadlock fixes (#368)
* channel close race and deadlock fixes

* remove send lock, write chunks in one go
* push some of half-closed implementation to BufferStream
* fix some hangs where LPChannel readers and writers would not always
wake up
* simplify lazy channels
* fix close happening more than once in some orderings
* reenable connection tracking tests
* close channels first on mplex close such that consumers can read bytes

A notable difference is that BufferedStream is no longer considered EOF
until someone has actually read the EOF marker.

* docs, simplification
2020-09-21 19:48:19 +02:00
Jacek Sieka 0db45462cd
mplex fixes (#362)
* remove almost-empty types module
* lock when writing message (that's the only place the lock matters, and
only when the message is > max msg size)
* logging updates (log in consistent order, makes reading logs easier)
* raise EOF from readExactly only if no bytes have been read (to signal
that _no_ bytes were lost)
2020-09-14 10:19:54 +02:00
Jacek Sieka 96d4c44fec
refactor bufferstream to use a queue (#346)
This change modifies how the backpressure algorithm in bufferstream
works - in particular, instead of working byte-by-byte, it will now work
seq-by-seq.

When data arrives, it usually does so in packets - in the current
bufferstream, the packet is read then split into bytes which are fed one
by one to the bufferstream. On the reading side, the bytes are popped of
the bufferstream, again byte by byte, to satisfy `readOnce` requests -
this introduces a lot of synchronization traffic because the checks for
full buffer and for async event handling must be done for every byte.

In this PR, a queue of length 1 is used instead - this means there will
at most exist one "packet" in `pushTo`, one in the queue and one in the
slush buffer that is used to store incomplete reads.

* avoid byte-by-byte copy to buffer, with synchronization in-between
* reuse AsyncQueue synchronization logic instead of rolling own
* avoid writeHandler callback - implement `write` method instead
* simplify EOF signalling by only setting EOF flag in queue reader (and
reset)
* remove BufferStream pipes (unused)
* fixes drainBuffer deadlock when drain is called from within read loop
and thus blocks draining
* fix lpchannel init order
2020-09-10 08:19:13 +02:00
Jacek Sieka 5b347adf58
logging fixes and small cleanups (#361)
In particular, allow longer multistream select reads
2020-09-09 19:12:08 +02:00
Jacek Sieka 82c179db9e
mplex fixes (#356)
* close the right connection when channel send fails
* don't crash on channel id that is not unique
2020-09-08 08:24:28 +02:00
Jacek Sieka c1856fda53
simplify and unify logging (#353)
* use short format for logging peerid
* log peerid:oid for connections
2020-09-06 10:31:47 +02:00
Eugene Kabanov 0b85192119
Remove asyncCheck from codebase. (#345)
* Remove asyncCheck from codebase.

* Replace all `discard` statements with new `asyncSpawn`.

* Bump `nim-chronos` requirement.
2020-09-04 18:30:45 +02:00
Jacek Sieka 397f9edfd4
simplify mplex (#327)
* less async
* less copying of data
* less redundant cleanup
2020-08-15 07:58:30 +02:00
Dmitriy Ryajov 2325692f55
Fix half closed (#324)
* don't call `close` in `remoteClose`

* make sure timeout are properly propagted

* fix tests

* adding remote close write test
2020-08-10 16:17:11 -06:00
Giovanni Petrantoni 5c986cf657
Fix build, add some raises (#315)
* Fix build, add some raises

* wip

* wip more raises

* missing exc object in mplex

* proper lifetime for subscribePeer

Co-authored-by: Dmitriy Ryajov <dryajov@gmail.com>
2020-08-05 19:30:57 -06:00
Dmitriy Ryajov cf2b42b914
Moving idle timeout to Connection to enable across all connection streams (#307)
* move idle timeout logic to connection

* more informative logs

* more informative logs
2020-08-04 07:22:05 -06:00
Dmitriy Ryajov 980764774e
pubsub timeouts tuning (#295)
* add finegrained timeouts to pubsub

* use 10 millis timeout in tests

* finalization

* revert timeouts

* use `atEof` for reads

* adjust timeouts and use atEof for reads

* use atEof for reads

* set isEof flag

* no backoff for pubsub streams

* temp timer increase, make macos finalize

* don't call `subscribePeer` in libp2p anymore

* more traces

* leak tests

* lower timeouts

* handle exceptions in control message

* don't use `cancelAndWait`

* handle exceptions in helpers

* wip

* don't send empty messages

* check for leaks properly

* don't use cancelAndWait

* don't await subscribption sends

* remove subscrivePeer calls from switch

* trying without the hooks again
2020-08-02 23:20:11 -06:00
Jacek Sieka e655a510cd
misc cleanups (#303) 2020-08-02 12:22:49 +02:00
Dmitriy Ryajov f7fdf31365
Pubsub lifetime (#284)
* lifecycle hooks

* tests

* move trace after closed check

* restore 1 second heartbeat

* await close event

* fix tests

* print direction string

* more trace logging

* add pubsub monitor

* add log scope

* adjust idle timeout

* add exc.msg to trace
2020-07-27 13:33:51 -06:00
Giovanni Petrantoni c3404f6eea
Handle cancellation in timeoutMonitor (#283)
* Handle cancellation in timeoutMonitor

* refactor lpchannel timeout as suggested by cheatfate
2020-07-21 09:03:41 -06:00
Dmitriy Ryajov 38eb36efae
don't use close event to stop timer (#280) 2020-07-18 11:00:44 -06:00
Dmitriy Ryajov ba071cafa6
Channel timeout (#278)
* add support for channel timeouts

* tests for channel timeout

* add timeouts to standard switch

* fix mplex init

* cleanup timer on stream close

* add comment for `isConnected`

* move cleanup event
2020-07-17 12:44:41 -06:00
Ștefan Talpalaru b8b0a2b4bc
CI: build binaries with TRACE & JSON logs (#268)
Also: remove unused imports.
2020-07-14 02:02:16 +02:00
Dmitriy Ryajov 181cf73ca7
Drain buffer (#264)
* drain lpchannel on reset

* move drainBuffer to bufferstream
2020-07-12 18:37:10 +02:00
Dmitriy Ryajov c788a6a3c0
Cleanup resources (#246)
* consolidate reading in lpstream

* remove debug echo

* tune log level

* add channel cleanup and cancelation handling

* cancelation handling

* cancelation handling

* cancelation handling

* cancelation handling

* cleanup and cancelation handling

* cancelation handling

* cancelation

* tests

* rename isConnected to connected

* remove testing trace

* comment out debug stacktraces

* explicit raises
2020-06-29 09:15:31 -06:00
Dmitriy Ryajov 5b28e8c488
Cleanup lpstream, Connection and BufferStream (#228)
* count published messages

* don't call `switch.dial` in `subscribeToPeer`

* don't use delegation in connection

* move connection out to own file

* don't breakout on reset

* make sure to call close on secured conn

* add lpstream tracing

* don't breackdown by conn id

* fix import

* remove unused lable

* reset  connection on exception

* add additional metrics for skipped messages

* check for nil in secure.close
2020-06-19 11:29:43 -06:00
Dmitriy Ryajov 6b196ad7b4
remove pubsub peer on disconnect (#212)
* remove pubsub peer on disconnect

* make sure lock is aquired

* add $

* count upgrades/dials/disconnects
2020-06-11 08:45:59 -06:00
Viktor Kirilov 1afec627c2
proper name for topics so that we can filter dynamically using chronicles (#210)
* proper name for topics so that we can filter dynamically using chronicles

* lowercase
2020-06-10 10:48:01 +02:00
Dmitriy Ryajov c7298f34f4 additional comments 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 27a7f8c948 move EOF flag after local close and comments 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov d83ce4c932 exceptions and resource cleanup 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 46daed9a38 wip 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 62da2a05c3 wip: rework with proper half-closed 2020-06-02 09:10:27 -06:00
Dmitriy Ryajov 7900fd9f61
Half closed (#174)
* call write until all is written out

* add comments to lpchannel fields

* add an eof flag to signal which end closed

* wip: rework with proper half-closed

* add eof and closed handling

* propagate closes to piped

* call parent close

* moving bufferstream trackers out

* move writeLock to bufferstream

* move writeLock out

* remove unused call

* wip

* rebasing master

* fix mplex tests

* wip

* fix bufferstream after backport

* wip

* rename to differentiate from chronos tracker

* close connection on chronos close

* make reset request asyncCheck

* fix channel cleanup

* misc

* don't use read

* fix backports

* make noise work again

* proper exception handling

* don't reraise just yet

* add convenience templates

* dont double wrap

* use async pragma

* fixes after backporting

* muxer owns connection

* remove on transport close cleanup

* revert back allread

* adding some todos

* read from stream

* inc count before closing

* rebasing master

* rebase master

* use correct exception type

* use try/finally insted of defer

* fix compile in trace mode

* reset channels on mplex close
2020-05-19 18:14:15 -06:00
Dmitriy Ryajov 681991ae48
reduce buffer size in lpchannel (#184) 2020-05-19 16:15:36 -06:00
Jacek Sieka 69abf5097d
handle a few exceptions (#170)
* handle a few exceptions

Some of these are maybe too aggressive, but in return, they'll log
their exception - more refactoring needed to sort this out - right now
we get crashes on unhandled exceptions of unknown origin

* during connection setup
* while closing channels
* while processing pubsubs

* catch exceptions that are raised and don't try to catch exceptions that are not raised

* propagate cancellederror

* one more

* more

* more

* make interop tests less fragile

* Raise expiration time in gossipsub fanout test for slow CI

Co-authored-by: Dmitriy Ryajov <dryajov@gmail.com>
Co-authored-by: Giovanni Petrantoni <giovanni@fragcolor.xyz>
2020-05-14 21:56:56 -06:00
Giovanni Petrantoni 005e088405
Properly track and close mplex handlers (#166)
* Properly track and close mplex handlers

* Avoid verbose warnings

* Fix tryAndWarn trace issue

* Handle LPEOF in lpchannel close
2020-05-12 14:45:32 +02:00
Jacek Sieka fbf640794b
move channel size constant 2020-05-09 08:48:26 +02:00
Jacek Sieka 1efada474c
remove readLoop in secure protocols (#162)
* remove readLoop in secure protocols, fix security issues

* fix Defect on remote sending 0-byte noise/secio message
* remove msglen from `write` (unused)
* simplify SecureConn data flow
* document some control-flow issues

* unify exception behaviour across noise and secio

* secio would not raise on mac/decryption errors

* fix compile error
2020-05-07 14:37:46 -06:00
Jacek Sieka 330da51819
removals (#159)
* remove unused stream methods
* reimplement some of them with proc's
* remove broken tests
* Error->Defect for defect
* warning fixes
2020-05-06 18:31:47 +02:00
Giovanni Petrantoni 303ec297da
Start removing allFutures (#125)
* Start removing allFutures

* More allfutures removal

* Complete allFutures removal except legacy and tests

* Introduce table values copies to prevent error

* Switch to allFinished

* Resolve TODOs in flood/gossip

* muxer handler, log and re-raise

* Add a common and flexible way to check multiple futures
2020-04-11 13:08:25 +09:00
Dmitriy Ryajov 5285f0d091
Fix/misc (#116)
* only check for payload size

* only subscribe if connection succeeded

* fix failing test

* check that the strem is active before openning

* msg type should not be > than 0x7

* fix tests

* check max against enum val
2020-03-29 08:28:48 -06:00
Dmitriy Ryajov 3effb95f10 close underlying bufferstream in lpchannel 2020-03-28 09:29:43 -06:00
Dmitriy Ryajov 2de98751ae fix: use exact types for mplex id 2020-03-24 15:41:40 -06:00
Giovanni Petrantoni 0a3e4a764b
Less verbose traces (#112)
* Make traces less verbose with shortHexDump utility

* Rename shortHexDump into shortLog

* Improve shortLog, add shortLog for crypto keys

* Add proper shortLog implementations in messages
2020-03-23 15:03:36 +09:00
Eugene Kabanov 381630f185
Fix and refactoring of some procedures which are able to return nil as result (#97)
* Fix do not return nil as result.

* Fix mplex test to properly raise.
2020-03-04 21:45:14 +02:00
Dmitriy Ryajov acdaeb8f5d working out synchronization issues 2020-02-16 11:31:35 -06:00
Dmitriy Ryajov f9ad113d11 don't send close message if remote closed 2020-02-16 11:31:35 -06:00
Dmitriy Ryajov f767614827 remove unnecesary async lock 2020-02-16 11:31:35 -06:00
Dmitriy Ryajov 7f8eb0272e cleanup and fix tests 2020-02-16 11:31:35 -06:00
Giovanni Petrantoni 23712ecf3b
Lazy channels (#78)
* Implemented lazy stream opening for mplex connections

* Properly fix newStream usage

* Make lazy channel open optional

* Add Lazy channel test

* Cleanup mplex test

* Move lazyness properly into LPChannel

* Connection writeLp back to proc
2020-02-11 12:30:36 -05:00
Dmitriy Ryajov 293a219dbe
Cleanup (#55)
* fix: don't allow replacing pubkey

* fix: several small improvements

* removing pubkey setter

* improove error handling

* remove the use of Option[T] if not needed

* don't use optional

* fix-ci: temporarily pin p2pd to a working tag

* fix example to comply with latest changes

* bumping p2pd again to a higher version
2019-12-10 14:50:35 -06:00
Dmitriy Ryajov 903e79ede1
Feat/conn cleanup (#41)
Backporting proper connection cleanup from #36 to align with latest chronos changes.

* add close event

* use proper varint encoding

* add proper channel cleanup in mplex

* add connection cleanup in secio

* tidy up

* add dollar operator

* fix tests

* don't close connections prematurely

* handle closing streams properly

* misc

* implement address filtering logic

* adding pipe tests

* don't use gcsafe if not needed

* misc

* proper connection cleanup and stream muxing

* re-enable pubsub tests
2019-12-03 22:44:54 -06:00