* Fix stability issues
why:
Handling malformed messages typically raises `RangeError` exceptions
when de-serialising RLP, or decoding message data. This is an
(incomplete) attempt to weed out some of them, driven by real live
tests.
remark:
With the new `snap` protocol, there might be different views on what
the messages really contain (currently the specs are more of a hint).
* Update RLP exception handling
* Undo effect-less patch
why:
problem occurred somewhere above the try/catch handler
* Using `checkedEnumAssign()` for RLP enum
* Provide dedicated `DisconnectionReason` enum type RLP reader
why:
Without this reader, a program communicating via RLPx will crash when
it receives a disconnect message with an out-of-bounds reason code.
Assigning an out-of-bounds value to an enum causes a `RangeError` defect
and consequently terminates the program. This `RangeError` is avoided
here and a catchable `MalformedRlpError` is raised instead.
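A minimal sketch of the checked-assignment idea, written in Python rather than Nim. `MalformedRlpError` mirrors the name used above; `checked_enum_assign` and the three enum members shown are illustrative stand-ins, not the nim-eth API.

```python
from enum import IntEnum


class MalformedRlpError(Exception):
    """Catchable error for wire data that does not decode cleanly."""


class DisconnectionReason(IntEnum):
    # Illustrative subset only; the real enum has more members.
    DISCONNECT_REQUESTED = 0x00
    TCP_ERROR = 0x01
    BREACH_OF_PROTOCOL = 0x02


def checked_enum_assign(enum_type, value):
    """Convert a raw integer to enum_type, or raise MalformedRlpError."""
    try:
        return enum_type(value)
    except ValueError as exc:
        # Out-of-range input becomes a catchable error instead of a crash.
        raise MalformedRlpError(
            f"{enum_type.__name__}: value {value} out of range") from exc
```

The caller can then catch `MalformedRlpError` and drop the peer instead of terminating.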
* Using default exception type in bespoke `read(DisconnectionReason)`
why:
This should not differ from the default enum parser. The particular
message is different and more targeted, here.
Note: The default RLP parser was not used because
`array[1,DisconnectionReason]` is currently not properly handled and
should give a similar error message as a `DisconnectionReason` error.
* De-clutter, custom read() was not needed
Co-authored-by: jordan <jordan@curd.mjh-it.com>
* Handle the decodeAuthMessage error case separately and log to trace
Garbage data on the TCP port (e.g. from port scanners) would
cause lots of error log messages, so log this to trace and get rid
of a (little) bit of exception usage in the process.
* Remove usage of result var in rlpxAccept and rlpxConnect
* Discv4: Add ENRRequest & ENRResponse msgs to avoid fails on these
Fix #499
These messages are not implemented yet, however; they are just ignored.
* disc: updateExternalIp()
New public proc that can be used to inform the discovery subsystem about
a changed external IP (as reported by UPnP/NAT-PMP in some other module).
why:
Rlp errors throw exceptions which cause the dispatcher loop to
terminate the current session immediately.
details:
The DisconnectionReasonList message requires a single entry list.
Observed and now accepted deviations are:
Geth: single byte number
bor (a Geth fork): blobbed single entry list containing a number
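The three accepted shapes can be sketched as follows, in Python rather than Nim. The function names are hypothetical, only short-form RLP headers are handled, and the exact wire bytes assumed for the bor variant (a byte string wrapping the encoded list) are an interpretation of the description above, not taken from the nim-eth source.

```python
def _scalar(byte: int) -> int:
    """Decode a one-byte RLP integer (0x80 is the empty string, i.e. zero)."""
    if byte < 0x80:
        return byte
    if byte == 0x80:
        return 0
    raise ValueError("reason code is not a single-byte integer")


def _single_entry_list(data: bytes) -> int:
    """Strictly accept an RLP list holding exactly one one-byte item."""
    if len(data) != 2 or data[0] != 0xC1:
        raise ValueError("expected a single-entry list")
    return _scalar(data[1])


def decode_disconnect_reason(payload: bytes) -> int:
    if not payload:
        raise ValueError("empty payload")
    first = payload[0]
    if first >= 0xC0:
        # Spec-conforming form: a list with one reason code.
        return _single_entry_list(payload)
    if len(payload) == 1:
        # Geth-style deviation: a bare single-byte number.
        return _scalar(first)
    if 0x81 <= first <= 0xB7 and len(payload) == 1 + (first - 0x80):
        # bor-style deviation (assumed): a blob whose content is the list.
        return _single_entry_list(payload[1:])
    raise ValueError("unrecognised disconnect payload")
```

Anything else, such as a list with several entries, is still rejected with a catchable error.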
why:
This is a legacy feature and its usage should peter out over time.
details:
Use -d:chunked_rlpx_enabled for enabling chunked RLPx message handling.
why:
For some reason, Nethermind insists on sending chunked messages to
the syncing peer. Unfortunately, for the test networks the Nethermind
nodes are the important ones as they speak eth/65 as well, while others
like Geth only support eth/66 which is not implemented here, yet.
- add eventLoop to control all incoming events
- change semantic of write to asynchronously block only when send buffer is full, and not when bytes do not fit into send window
- change handling of receive buffer, to start dropping packets if the reorder buffer and receive buffer are full. Old behaviour was to async block unless there is space which could lead to resource exhaustion attacks
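A minimal sketch of the drop-when-full policy described in the last item, in Python rather than Nim. It collapses the reorder buffer and the receive buffer into one bounded queue for brevity; the class and method names are hypothetical.

```python
from collections import deque


class BoundedReceiveQueue:
    """Accept packets until capacity is reached, then drop instead of waiting."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.packets = deque()
        self.dropped = 0

    def offer(self, packet: bytes) -> bool:
        """Return True if the packet was stored, False if it was dropped."""
        if len(self.packets) >= self.capacity:
            # Dropping keeps memory bounded; the sender is expected to retransmit.
            self.dropped += 1
            return False
        self.packets.append(packet)
        return True

    def take(self) -> bytes:
        """Hand the oldest packet to the reader, freeing one slot."""
        return self.packets.popleft()
```

Because `offer` never blocks, a peer that floods the socket cannot pile up pending work on the receiver.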
The ENR code used to be solely exception based, and these
exceptions were a left-over of that. They are useless as later
calls use Result anyhow.
Additionally, they cause quite the performance loss because they
are used in the "common path" for the toTypedRecord call, e.g.
when reading the fields of ip6, tcp6 and udp6.
Currently only setting `--styleCheck:hint` as there are some
dependency fixes required and the compiler seems to trip over the
findnode MessageKind, findnode Message field and the findNode
proc. Also over protocol.Protocol usage.
This removes the outdated copy of the SSZ code. It became incorrect
over time (e.g., empty SSZ list elements), and is no longer in use by
GitHub projects: https://github.com/search?q=extension%3Anim+eth%2Fssz
The canonical SSZ implementation resides at `nim-ssz-serialization`.
Compared to `nim-eth`, these changes were made meanwhile:
- `bitseqs` was extended with JSON serialization support,
and with the new functions `isZero` and `countOnes`.
- `bytes_reader` was renamed to `codec`, extended with a few additional
SSZ type conversions as well as support for `SingleMemberUnion`.
- The simplified merkle tree implementation in `merkle_tree.nim`
was removed. It was not used by other projects.
- `merkleization` was extended with support for `HashArray`, `HashList`
and `SingleMemberUnion`. The `isValidProof` functionality has been
moved to `nimbus-eth2` and replaced with the EF defined function
`is_valid_merkle_branch`. The test was also moved to `nimbus-eth2`.
There are no other GitHub projects using `isValidProof`:
https://github.com/search?q=extension%3Anim+isValidProof
Furthermore, a definition for `GeneralizedIndex` was added.
- `ssz_serialization` was moved one directory up, and improved with
bug fixes and `HashArray`, `HashList` and `SingleMemberUnion` support.
- `types` was extended with JSON serialization and new type support for
`Uint128`, `Uint256`, `HashArray`, `HashList` and `SingleMemberUnion`.
There is also a new `getBit` function for `BitList`.
* Add few missing top level raises Defect in uTP
- Add top level {.push raises: [Defect].}
- remove some local raises, including some unneeded
CatchableErrors.
- Don't export messageHandler (avoiding annoying naming collisions)
- export utp_router as those connection callbacks are in the API
* Add some missing copyright clauses
* Some indent and max line length cleanup
* Rename utp_discv5_protocol.nim to be more consistent
* Add separate datastructure to keep track of window
* Asynchronously block write until new space in snd buffer
* Introduce write loop
* Properly handle write cancellation
* Proper handling of sending fin packet
* Reset remote window after configured amount of time
Closes [nimbus-eth1#767](https://github.com/status-im/nimbus-eth1/issues/767).
Crashes occur when certain invalid RLPx messages are received from a peer.
Specifically, `msgId` out of range. Because any peer can easily trigger this
crash, we'd consider it a DOS vulnerability if Nimbus-eth1 was in general use.
We noticed when syncing to Goerli, there were some rare crashes with this
exception. It turned out one peer with custom code, perhaps malfunctioning,
was sending these messages if we were unlucky enough to connect to it.
`invokeThunk` is called from `dispatchMessages` and checks the range of
`msgId`. It correctly recognises that it's out of range, raises an exception
and produces a message. Job done.
Except the code in `dispatchMessages` treats all that as a warning instead of
an error, and continues to process the message. A bit lower down, `msgId` is used
again without a range check.
The trivial fix is to check array bounds before access.
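A minimal sketch of the bounds check, in Python rather than Nim; `dispatch`, `handlers` and `InvalidMessageId` are hypothetical names, not the identifiers used in `rlpx.nim`.

```python
class InvalidMessageId(Exception):
    """Catchable error for a message id outside the handler table."""


def dispatch(handlers, msg_id: int, payload: bytes):
    """Look up the handler only after validating msg_id."""
    # The explicit lower bound matters in Python, where a negative
    # index would otherwise silently wrap around.
    if not 0 <= msg_id < len(handlers):
        raise InvalidMessageId(
            f"message id {msg_id} not in 0 .. {len(handlers) - 1}")
    return handlers[msg_id](payload)
```

With 41 handlers, as in the trace below, an id of 45 now produces a catchable error rather than an index failure.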
--
ps. Here's the stack trace ("reraised" sections hidden):
```
WRN 2021-11-08 21:29:33.238+00:00 Error while handling RLPx message topics="rlpx" tid=2003472 file=rlpx.nim:607 peer=Node[<IP>:45456] msg=45 err="RLPx message with an invalid id 45 on a connection supporting eth,snap"
/home/jamie/Status/nimbus-eth1/nimbus/p2p/chain/chain_desc.nim(437) main
/home/jamie/Status/nimbus-eth1/nimbus/p2p/chain/chain_desc.nim(430) NimMain
/home/jamie/Status/nimbus-eth1/nimbus/nimbus.nim(258) process
/home/jamie/Status/nimbus-eth1/vendor/nim-chronos/chronos/asyncloop.nim(279) poll
/home/jamie/Status/nimbus-eth1/vendor/nim-chronos/chronos/asyncmacro2.nim(74) colonanonymous
/home/jamie/Status/nimbus-eth1/vendor/nim-eth/eth/p2p/rlpx.nim(1218) rlpxAccept
/home/jamie/Status/nimbus-eth1/vendor/nim-chronos/chronos/asyncmacro2.nim(101) postHelloSteps
/home/jamie/Status/nimbus-eth1/vendor/nim-chronos/chronos/asyncmacro2.nim(74) colonanonymous
/home/jamie/Status/nimbus-eth1/vendor/nim-eth/eth/p2p/rlpx.nim(985) postHelloSteps
/home/jamie/Status/nimbus-eth1/vendor/nim-chronos/chronos/asyncmacro2.nim(101) dispatchMessages
/home/jamie/Status/nimbus-eth1/vendor/nim-chronos/chronos/asyncmacro2.nim(77) colonanonymous
/home/jamie/Status/nimbus-eth1/vendor/nim-eth/eth/p2p/rlpx.nim(614) dispatchMessages
/home/jamie/Status/nimbus-eth1/vendor/nimbus-build-system/vendor/Nim/lib/system/chcks.nim(23) raiseIndexError2
/home/jamie/Status/nimbus-eth1/vendor/nimbus-build-system/vendor/Nim/lib/system/fatal.nim(49) sysFatal
[[reraised from: ... ]]
[[reraised from: ... ]]
[[reraised from: ... ]]
[[reraised from: ... ]]
Error: unhandled exception: index 45 not in 0 .. 40 [IndexError]
```
Signed-off-by: Jamie Lokier <jamie@shareable.org>
Separate the logging when the node is not reachable and
enrAutoUpdate is on or off to avoid confusion whether or not the
node might still become reachable.
* avoid allocation in `hash(ValidIpAddress)`
While casually browsing the profiler output, to my great surprise I
found an allocating string conversion function (gasp!) in the hash
function for ip addresses - this PR carefully excises this evil
construct from the codebase.
* bump nim version
queryRandom was only async for the `enrField` version.
However the basic queryRandom is also exported and thus gets
changed so it can be properly used as async proc.
Also added exports for the modules of which objects are used in
the discovery public API.
toBytes for NodeId wasn't selected by the compiler, but if it does
get selected, it will fail on the test cases due to the
countdown that is done in logDistance.
Set to toBytesBE properly now and do countup, that should make
it correct also for BE architecture.
Removed toBytes to avoid confusion and avoid this one being
selected ever. The only place toBytes for NodeId was used is in
sessions.nim makeKey func and there also the stint one
(thus native endianness) was selected in Nim 1.2.x.
Native endianness is fine there as it is only an internal
representation.
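Why the byte order matters can be sketched as follows, in Python rather than Nim. The sketch assumes 256-bit node ids held as integers; `log_distance` is a hypothetical stand-in for `logDistance`, not a copy of it.

```python
def log_distance(a: int, b: int, id_bytes: int = 32) -> int:
    """Bit length of a XOR b, computed over big-endian bytes."""
    a_be = a.to_bytes(id_bytes, "big")
    b_be = b.to_bytes(id_bytes, "big")
    leading_zeros = 0
    # Count up from index 0, which is the most significant byte in
    # big-endian order regardless of the host architecture.
    for x, y in zip(a_be, b_be):
        diff = x ^ y
        if diff == 0:
            leading_zeros += 8
        else:
            leading_zeros += 8 - diff.bit_length()
            break
    return id_bytes * 8 - leading_zeros
```

Feeding native-endian bytes into the same loop on a little-endian host would start counting at the least significant byte, which is the kind of mismatch the change above removes.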
* Refactor tests and move socket to separate file
* Move sockets handling to separate class
* Abstract over underlying transport
* Fix bug with receiving duplicated SYN packet
* Fix race condition in connect
* Modify outbuffer
Each element of outbuffer keeps the encoded packet, the number
of transmissions of the given packet, and whether the
packet needs to be re-sent.
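The per-element bookkeeping might look like the sketch below, in Python rather than Nim. The field and function names are hypothetical, and the selection rule in `next_to_send` is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class OutgoingPacket:
    packet_bytes: bytes         # the already encoded packet
    transmissions: int = 0      # how many times it has been put on the wire
    need_resend: bool = False   # set when a timeout or loss is detected


def next_to_send(out_buffer: List[OutgoingPacket]) -> Optional[OutgoingPacket]:
    """Pick the first packet that was never sent or is flagged for re-send."""
    for entry in out_buffer:
        if entry.transmissions == 0 or entry.need_resend:
            entry.transmissions += 1
            entry.need_resend = False
            return entry
    return None
```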
* Add initial handling of timeouts
* Add tests for syn re-sends