When `write` is called on a `StreamTransport`, the current sequence of
operations is:
* copy data to queue
* register for "write" event notification
* return unfinished future to `write` caller
* wait for "write" notification (in `poll`)
* perform one `send`
* wait for notification again if there's more data to write
* complete the future
In this PR, we introduce a fast path for writing:
* If the queue is empty, try to send as much data as possible
* If all data is sent, return completed future without `poll` round
* If there's more data to send than can be sent in one go, add the
rest to queue
* If the queue is not empty, enqueue as above
* When notified that write is possible, keep writing until OS buffer is
full before waiting for event again
The fast path provides significant performance benefits when there are
many small writes, such as when sending gossip to many peers, by
avoiding the poll loop and data copy on each send.
Also fixes an issue where the socket would not be removed from the
writer set if there were pending writes on close.