diff --git a/.obsidian/app.json b/.obsidian/app.json index f53f5bc..5f33603 100644 --- a/.obsidian/app.json +++ b/.obsidian/app.json @@ -1,5 +1,6 @@ { "newFileLocation": "folder", "newFileFolderPath": "10 Notes", - "attachmentFolderPath": "90 Extras/92 Assets" + "attachmentFolderPath": "90 Extras/92 Assets", + "alwaysUpdateLinks": true } \ No newline at end of file diff --git a/10 Notes/APIs and protocol optimisations.md b/10 Notes/APIs and protocol optimisations.md new file mode 100644 index 0000000..03d4681 --- /dev/null +++ b/10 Notes/APIs and protocol optimisations.md @@ -0,0 +1,324 @@ +--- +tags: + - bittorrent +--- +#bittorrent + +The public API of the [[libtorrent-rasterbar|libtorrent]] library is documented on the [libtorrent reference](https://www.libtorrent.org/reference.html) page. + +The library provides many utility functions for loading and processing torrent files and for handling magnet links, as well as low-level internal functions of the BitTorrent peer-exchange protocol. Skipping those, the core interface used by clients can be limited to *Session* control and the related *Settings*, *Torrent Handle* and *Resume Data*. 
+ +**Session** + +| *Session* types and functions | +| --------------------------- | +| `session_handle` | +| `session_proxy` | +| `session` | +| `session_params` | +| `write_session_params()` | +| `read_session_params()` | +| `write_session_params_buf()` | + +**Settings** + +| *Settings* types and functions | +| --------------------------- | +| `settings_pack` | +| `setting_by_name()` | +| `name_for_setting()` | +| `default_settings()` | +| `high_performance_seed()` | +| `min_memory_usage()` | +| `generate_fingerprint()` | + +**Torrent Handle** + +| *Torrent Handle* types and functions | +| --------------------------- | +| `block_info` | +| `partial_piece_info` | +| `torrent_handle` | +| `hash_value()` | + +**Resume Data** + +| *Resume Data* types and functions | +| --------------------------- | +| `read_resume_data()` | +| `write_resume_data_buf()` | +| `write_resume_data()` | +| `write_torrent_file_buf()` | +| `write_torrent_file()` | +| `write_torrent_flags_t` | + +How much effort would it be to provide a similar interface on top of the Codex client? It would be a substantial amount of work. Some parameters relate to low-level internals that can still be configured by the client (yet, among the open-source clients, only [[qBittorrent]] exposes some settings from the settings pack), and some other parameters might not be relevant to us, e.g. those related to trackers or to low-level internals of the peer-exchange protocol, such as `unchoke_interval`. + +The number of available settings is overwhelming, so a one-to-one analysis does not make sense in my opinion. Only after deciding which client we would like to support (if any) should we evaluate the details of the required interface. + +One approach worth considering is to first decide which protocol extensions are potentially attractive to Codex, starting with the items discussed in [[What BitTorrent has that Codex does not]]. 
Only once we know which of those extensions we want to support does it make sense to focus on the related technicalities in the *settings pack* and the *session* management. After identifying the settings that are relevant to Codex, we will need to make sure that the user interface adjusts itself accordingly when Codex integration is enabled. + +To wrap up this section, here are screenshots of the list of parameters from the [[qBittorrent]] client (the one with the most complete set of options): + +![[Pasted image 20241110195407.png]] + +![[Pasted image 20241110195544.png]] + +![[Pasted image 20241110200105.png]] + +![[Pasted image 20241110200233.png]] + +![[Pasted image 20241110200313.png]] +### Alerts + +libtorrent has quite an extensive set of `Alerts`: + +![[Pasted image 20241105084831.png]] + +### Python Bindings + +I also briefly checked whether the Python bindings provided by the library are complete. After building [[libtorrent Python bindings]], we can extract the published public API from Python like this: + +```bash +$ python +Python 3.12.3 (main, Sep 11 2024, 14:17:37) [GCC 13.2.0] on linux +Type "help", "copyright", "credits" or "license" for more information. +>>> import libtorrent +>>> print(dir(libtorrent)) +['__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__', '__version__', 'add_files', 'add_magnet_uri', ... +``` + +The Python bindings look pretty complete, and most of the core functions have direct equivalents in Python. 
Separating *core functions* from *type definitions* and *alerts*, we get: + +**86 core functions** + +| function name | +| --------------------------- | +| `add_files` | +| `add_magnet_uri` | +| `add_torrent_params` | +| `announce_entry` | +| `bdecode` | +| `bdecode_category` | +| `bencode` | +| `client_fingerprint` | +| `create_smart_ban_plugin` | +| `create_torrent` | +| `create_ut_metadata_plugin` | +| `create_ut_pex_plugin` | +| `default_settings` | +| `dht_lookup` | +| `dht_settings` | +| `dht_state` | +| `enc_level` | +| `enc_policy` | +| `error_category` | +| `error_code` | +| `file_entry` | +| `file_open_mode` | +| `file_slice` | +| `file_storage` | +| `find_metric_idx` | +| `fingerprint` | +| `generate_fingerprint` | +| `generic_category` | +| `get_bdecode_category` | +| `get_http_category` | +| `get_i2p_category` | +| `get_libtorrent_category` | +| `get_socks_category` | +| `get_upnp_category` | +| `high_performance_seed` | +| `http_category` | +| `i2p_category` | +| `identify_client` | +| `ip_filter` | +| `kind` | +| `libtorrent_category` | +| `load_torrent_buffer` | +| `load_torrent_file` | +| `load_torrent_parsed` | +| `make_magnet_uri` | +| `min_memory_usage` | +| `open_file_state` | +| `operation_name` | +| `parse_magnet_uri` | +| `parse_magnet_uri_dict` | +| `pe_settings` | +| `peer_class_type_filter` | +| `peer_id` | +| `portmap_protocol` | +| `portmap_transport` | +| `protocol_type` | +| `protocol_version` | +| `read_resume_data` | +| `read_session_params` | +| `session` | +| `session_params` | +| `session_stats_metrics` | +| `session_status` | +| `set_piece_hashes` | +| `sha1_hash` | +| `sha256_hash` | +| `socks_category` | +| `stats_channel` | +| `stats_metric` | +| `system_category` | +| `torrent_flags` | +| `torrent_handle` | +| `torrent_info` | +| `torrent_status` | +| `tracker_source` | +| `upnp_category` | +| `version` | +| `version_major` | +| `version_minor` | +| `write_flags` | +| `write_resume_data` | +| `write_resume_data_buf` | +| 
`write_session_params` | +| `write_session_params_buf` | +| `write_torrent_file` | +| `write_torrent_file_buf` | + +32 type definitions: + +| type name | +| ------------- | +| `add_piece_flags_t` | +| `add_torrent_params_flags_t` | +| `bandwidth_mixed_algo_t` | +| `choking_algorithm_t` | +| `create_torrent_flags_t` | +| `deadline_flags_t` | +| `deprecated_move_flags_t` | +| `event_t` | +| `file_flags_t` | +| `file_progress_flags_t` | +| `info_hash_t` | +| `io_buffer_mode_t` | +| `listen_on_flags_t` | +| `metric_type_t` | +| `mmap_write_mode_t` | +| `move_flags_t` | +| `operation_t` | +| `options_t` | +| `pause_flags_t` | +| `peer_class_type_filter_socket_type_t` | +| `performance_warning_t` | +| `proxy_type_t` | +| `reannounce_flags_t` | +| `reason_t` | +| `save_resume_flags_t` | +| `save_state_flags_t` | +| `seed_choking_algorithm_t` | +| `session_flags_t` | +| `socket_type_t` | +| `status_flags_t` | +| `storage_mode_t` | +| `suggest_mode_t` | + +**99 Alerts** + +| alert type name | +| ------------- | +| `alert` | +| `add_torrent_alert` | +| `alert_category` | +| `alerts_dropped_alert` | +| `anonymous_mode_alert` | +| `block_downloading_alert` | +| `block_finished_alert` | +| `block_timeout_alert` | +| `block_uploaded_alert` | +| `cache_flushed_alert` | +| `dht_announce_alert` | +| `dht_bootstrap_alert` | +| `dht_get_peers_alert` | +| `dht_get_peers_reply_alert` | +| `dht_immutable_item_alert` | +| `dht_live_nodes_alert` | +| `dht_log_alert` | +| `dht_mutable_item_alert` | +| `dht_outgoing_get_peers_alert` | +| `dht_pkt_alert` | +| `dht_put_alert` | +| `dht_reply_alert` | +| `dht_sample_infohashes_alert` | +| `dht_stats_alert` | +| `external_ip_alert` | +| `fastresume_rejected_alert` | +| `file_completed_alert` | +| `file_error_alert` | +| `file_prio_alert` | +| `file_progress_alert` | +| `file_rename_failed_alert` | +| `file_renamed_alert` | +| `hash_failed_alert` | +| `i2p_alert` | +| `incoming_connection_alert` | +| `invalid_request_alert` | +| 
`listen_failed_alert` | +| `listen_failed_alert_socket_type_t` | +| `listen_succeeded_alert` | +| `listen_succeeded_alert_socket_type_t` | +| `log_alert` | +| `lsd_error_alert` | +| `metadata_failed_alert` | +| `metadata_received_alert` | +| `oversized_file_alert` | +| `peer_alert` | +| `peer_ban_alert` | +| `peer_blocked_alert` | +| `peer_connect_alert` | +| `peer_disconnected_alert` | +| `peer_error_alert` | +| `peer_info_alert` | +| `peer_log_alert` | +| `peer_snubbed_alert` | +| `peer_unsnubbed_alert` | +| `performance_alert` | +| `picker_log_alert` | +| `piece_availability_alert` | +| `piece_finished_alert` | +| `piece_info_alert` | +| `portmap_alert` | +| `portmap_error_alert` | +| `portmap_log_alert` | +| `read_piece_alert` | +| `request_dropped_alert` | +| `save_resume_data_alert` | +| `save_resume_data_failed_alert` | +| `scrape_failed_alert` | +| `scrape_reply_alert` | +| `session_stats_alert` | +| `session_stats_header_alert` | +| `socks5_alert` | +| `state_changed_alert` | +| `state_update_alert` | +| `stats_alert` | +| `storage_moved_alert` | +| `storage_moved_failed_alert` | +| `torrent_added_alert` | +| `torrent_alert` | +| `torrent_checked_alert` | +| `torrent_conflict_alert` | +| `torrent_delete_failed_alert` | +| `torrent_deleted_alert` | +| `torrent_error_alert` | +| `torrent_finished_alert` | +| `torrent_log_alert` | +| `torrent_need_cert_alert` | +| `torrent_paused_alert` | +| `torrent_removed_alert` | +| `torrent_resumed_alert` | +| `tracker_alert` | +| `tracker_announce_alert` | +| `tracker_error_alert` | +| `tracker_list_alert` | +| `tracker_reply_alert` | +| `tracker_warning_alert` | +| `udp_error_alert` | +| `unwanted_block_alert` | +| `url_seed_alert` | + diff --git a/10 Notes/BEP3 - The BitTorrent Protocol Specification.md b/10 Notes/BEP3 - The BitTorrent Protocol Specification.md new file mode 100644 index 0000000..8d1f169 --- /dev/null +++ b/10 Notes/BEP3 - The BitTorrent Protocol Specification.md @@ -0,0 +1,445 @@ +--- +tags: + - 
bittorrent +related-to: + - "[[Learn BitTorrent]]" +--- +#bittorrent + +| related-to | [[Protocol v1 clarifications]], [[BitTorrent DHT clarifications]] | +| ---------- | ----------------------------------------------------------------- | + +[BEP3](http://bittorrent.org/beps/bep_0003.html) describes the original, or `v1` of the BitTorrent protocol. + +### ChatGPT clarifications + +### Question 1 + +Let’s look at the BitTorrent protocol spec BEP3. Here is a fragment: + +”The peer wire protocol consists of a handshake followed by a never-ending stream of length-prefixed messages. The handshake starts with character ninteen (decimal) followed by the string 'BitTorrent protocol'. The leading character is a length prefix, put there in the hope that other new protocols may do the same and thus be trivially distinguishable from each other.“ +#### Answer + +The BitTorrent protocol handshake described in BEP3 is designed to establish a connection between peers and ensure that both ends are compatible BitTorrent clients. Here’s a breakdown of the process: + +1. **Handshake Message Structure**: + + - The handshake message starts with a single byte, 19 in decimal (or 0x13 in hexadecimal), representing the length of the protocol identifier string. + + - This byte is followed by the string "BitTorrent protocol", which is 19 bytes long. Together, these identify the protocol to the receiving peer. + +2. **Purpose of the Length Prefix**: + + - The initial byte (19) acts as a length prefix, signifying that the following 19 bytes contain the protocol identifier. This approach was intended to allow future protocols to adopt a similar structure, making it easy to distinguish between BitTorrent and other protocols that might operate in the same space. + + - This allows peers to recognize BitTorrent-specific connections and helps avoid potential conflicts with other protocols that might emerge. + +3. 
**Follow-up After the Handshake**: + + - After this initial identifier, additional handshake data is exchanged, including reserved bytes for extensions, the info_hash (identifying the torrent), and the peer ID. Once the handshake completes, both peers start exchanging regular protocol messages for data transfer. + +In essence, this structured handshake is BitTorrent’s way of both signaling its protocol and laying the foundation for a secure and compatible connection between peers. + +### Question 1a + +Which peer starts? I understand that the same message is returned by the other peer, right? +#### Answer + +In the BitTorrent protocol, any peer can initiate a connection by sending the handshake message to another peer. There isn’t a designated “starting” peer since both peers are essentially equal. However, typically: + +1. **Initiation**: + + - When a peer (let’s call it Peer A) discovers another peer (Peer B) via a tracker or peer exchange (PEX), it can initiate a TCP connection and send the handshake message to Peer B. + +2. **Handshake Message Exchange**: + + - Peer B, upon receiving this handshake message, checks the protocol string and verifies the info_hash to ensure it matches the torrent both are sharing. + + - If the handshake is valid, Peer B responds with its own handshake message, mirroring Peer A’s structure. This confirms compatibility and readiness to exchange data blocks for the torrent. + +3. **Symmetry of Handshake**: + + - The handshake message structure is identical from both sides, so after the initial connection, both peers have exchanged the same type of message. This mutual exchange verifies the connection and establishes both peers as legitimate participants in the torrent. + +In practice, once this handshake is complete, either peer can begin sending subsequent messages as part of the data exchange, like interested, not interested, or request messages. 
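The handshake layout discussed in the answers above (the length byte 19, the string "BitTorrent protocol", 8 reserved bytes, the 20-byte info_hash and the 20-byte peer ID) can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function names `build_handshake` and `parse_handshake`, the peer ID prefix `-XX0001-` and the placeholder bytes hashed into `info_hash` are made up for the example, and the reserved bytes are simply left as zeros.

```python
import hashlib

PSTR = b"BitTorrent protocol"                # protocol identifier, 19 bytes
HANDSHAKE_LEN = 1 + len(PSTR) + 8 + 20 + 20  # 68 bytes in total


def build_handshake(info_hash: bytes, peer_id: bytes) -> bytes:
    """Serialize a v1 handshake; the 8 reserved bytes are left as zeros."""
    if len(info_hash) != 20 or len(peer_id) != 20:
        raise ValueError("info_hash and peer_id must be 20 bytes each")
    return bytes([len(PSTR)]) + PSTR + bytes(8) + info_hash + peer_id


def parse_handshake(data: bytes) -> dict:
    """Split a received handshake into its fields, rejecting other protocols."""
    if len(data) != HANDSHAKE_LEN:
        raise ValueError("handshake must be exactly 68 bytes")
    if data[0] != len(PSTR) or data[1:20] != PSTR:
        raise ValueError("not a BitTorrent v1 handshake")
    return {
        "reserved": data[20:28],
        "info_hash": data[28:48],
        "peer_id": data[48:68],
    }


# Hypothetical values, for illustration only: a real info_hash is the SHA1
# of the bencoded "info" dictionary taken from the metainfo file.
info_hash = hashlib.sha1(b"placeholder for the bencoded info dictionary").digest()
peer_a = build_handshake(info_hash, b"-XX0001-" + b"A" * 12)
peer_b = build_handshake(info_hash, b"-XX0001-" + b"B" * 12)

# Each side parses the other's handshake and compares the info_hash values.
same_torrent = parse_handshake(peer_a)["info_hash"] == parse_handshake(peer_b)["info_hash"]
print(len(peer_a), same_torrent)
```

Both sides send the same fixed-size structure, which is why a single pair of build/parse helpers is enough for the initiating and the responding peer.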
+ +### Question 2 + +Can you further explain this part of the protocol, ideally with some examples? + +“Next comes the 20 byte sha1 hash of the bencoded form of the info value from the metainfo file. (This is the same value which is announced as info_hash to the tracker, only here it's raw instead of quoted here). If both sides don't send the same value, they sever the connection. The one possible exception is if a downloader wants to do multiple downloads over a single port, they may wait for incoming connections to give a download hash first, and respond with the same one if it's in their list.” +#### Answer + +The section you’re asking about in the BitTorrent protocol handshake describes the info_hash field, which is critical for confirming that both peers are participating in the same torrent. Here’s a step-by-step breakdown of what’s happening: + +1. **What is `info_hash`?** + + - The info_hash is a 20-byte SHA1 hash that uniquely identifies the specific torrent. It’s derived from the “info” dictionary in the .torrent file, which contains metadata about the torrent’s content, such as file names, sizes, and piece information. + + - This hash is the same one that the peer announces to a tracker when joining the swarm, helping to ensure that all peers are part of the same torrent. + +2. **Handshake Process Using `info_hash`** + + - During the handshake, each peer includes this 20-byte info_hash in the message. When Peer A sends the handshake to Peer B, it includes its info_hash, and Peer B responds with a handshake containing its own info_hash. + + - Each peer then checks whether the info_hash from the other peer matches its own. If the hashes match, it confirms that both peers are sharing the same torrent and can proceed to data exchange. If the hashes don’t match, the peers terminate the connection because they are not part of the same torrent. + +3. 
**Why This Check is Necessary** + + - This verification prevents peers from mistakenly connecting to other torrents that may be using the same port or other shared resources. The info_hash acts as a unique identifier, so both peers can be sure they are trying to share the same files. + +4. **Handling Multiple Torrents on a Single Port (Exception)** + + - Sometimes, a peer may download multiple torrents but only has one open port. In this case, when another peer connects and provides an info_hash, the receiving peer may check if it has that info_hash in its list of active torrents. If so, it proceeds with the handshake as normal. If not, it severs the connection. + + - This flexibility allows the peer to manage multiple torrents efficiently without needing multiple ports, accommodating multiple incoming handshake requests. + +**Example Walkthrough** + +Imagine Peer A is downloading a torrent with the `info_hash` `abc123...` (20-byte SHA1 hash). Peer B, which is also downloading the same torrent, wants to connect: + +1. **Peer A** initiates the handshake to **Peer B**: + + - Peer A sends: `{19, "BitTorrent protocol", 8 reserved bytes, info_hash="abc123..."}`. + +2. **Peer B** receives the handshake, verifies the info_hash, and sends a handshake back: + + - Peer B sends: {19, "BitTorrent protocol", 8 reserved bytes, info_hash="abc123..."}. + +3. **Both Peers Compare `info_hash` Values**: + + - Peer A checks that the `info_hash` from Peer B matches `abc123...`, confirming they’re both in the same torrent swarm. + + - If the `info_hash` didn’t match, both peers would close the connection. + +This `info_hash` exchange ensures that peers can only connect if they are sharing the same torrent, helping to maintain the integrity and organization of different torrent swarms. + +### Question 2a + +What is “download hash” in this context? +#### Answer + +In this context, “download hash” refers to the `info_hash` of a specific torrent being downloaded. 
It’s another way of describing the unique identifier for a torrent. Here’s how it works: + +- The `info_hash` (or “download hash” in this case) is a SHA1 hash generated from the “info” dictionary in the .torrent file. + +- When multiple torrents are being downloaded on a single port, a peer waits to receive an incoming handshake from another peer with a specific `info_hash`. It can then check if that info_hash (download hash) matches any torrents it’s currently downloading. + +- If the received `info_hash` matches one in the peer’s list, it can proceed with the handshake; if not, it severs the connection. + +This approach allows peers to manage connections for multiple torrents simultaneously while ensuring each connection is specific to a single torrent. + +### Question 3 + +And also, please explain with example the last part of the handshake: + +“After the download hash comes the 20-byte peer id which is reported in tracker requests and contained in peer lists in tracker responses. If the receiving side's peer id doesn't match the one the initiating side expects, it severs the connection.” +#### Answer + +The final part of the BitTorrent handshake includes a 20-byte peer ID immediately following the `info_hash`. This peer ID uniquely identifies each peer in the network, allowing them to recognize one another and track each peer’s activity within the swarm. Here’s how this part of the handshake works and why it’s important: + +**Purpose of the Peer ID** + +1. **Identification**: Each BitTorrent client generates a unique peer ID for itself when it starts a new torrent session. This ID helps peers keep track of each other and manage connections. + +2. **Tracker Communication**: The peer ID is reported to the tracker when a client joins the torrent. The tracker then provides a list of other peers, each with their own peer IDs, to help them connect. + +3. 
**Verification During Handshake**: When peers establish a connection, they exchange these peer IDs to confirm each other’s identities. + +**Example Walkthrough** + +Imagine two peers, Peer A and Peer B, trying to connect within the same torrent: + +1. **Peer A Starts the Handshake**: + + - Peer A sends the handshake message: `{19, "BitTorrent protocol", 8 reserved bytes, info_hash="abc123...", peer_id="A12345..."}`. + + - This message includes Peer A’s own peer ID ("A12345..."), which is unique to Peer A. + +2. **Peer B Receives the Handshake:** + + - Peer B, which also has the same info_hash for this torrent, receives the handshake and extracts Peer A’s peer ID from it. + + - Peer B then sends back its own handshake with its peer ID: `{19, "BitTorrent protocol", 8 reserved bytes, info_hash="abc123...", peer_id="B67890..."}`. + +3. **Checking the Peer ID**: + + - Peer A receives Peer B’s handshake and verifies the peer ID "B67890..." from Peer B’s message. + + - Both peers now have each other’s peer IDs and confirm that they’re connected to the correct nodes within the torrent swarm. + +**Closing the Connection if IDs Don’t Match** + +If Peer A expects a specific peer ID but receives one that doesn’t match (for example, if Peer B is impersonating a different peer), Peer A will sever the connection. This is a consistency check against the peer ID that Peer A learned earlier (for example, from a tracker’s peer list); since peer IDs are self-assigned and not authenticated, it does not by itself establish trust. + +### Question 3a + +Ok, so what happens after the handshake? Following this: + +“That's it for handshaking, next comes an alternating stream of length prefixes and messages. Messages of length zero are keepalives, and ignored. Keepalives are generally sent once every two minutes, but note that timeouts can be done much more quickly when data is expected.” + +Peers start sending “peer messages”. Can you give some example flows? And also, what are those “length prefixes” mentioned above? 
+#### Answer + +After the handshake, BitTorrent peers begin exchanging a series of peer messages to coordinate downloading and uploading. These messages contain a length prefix to indicate the message’s size, allowing peers to read messages correctly from the data stream. Here’s how this works and what kind of messages typically flow between peers: + +**Understanding Length Prefixes** + +Each message in the peer protocol includes a length prefix at the beginning, which specifies the number of bytes that follow: + +- The length prefix is a 4-byte big-endian integer. + +- For example, a length prefix of `00 00 00 09` (hex) indicates that the next 9 bytes make up the message. + +A special case is the keep-alive message: it has a length prefix of zero and no body, which simply tells the peer that the connection is still active without performing any specific action. + +**Common Peer Message Types and Example Flows** + +Here are some typical message types and how they might flow between peers after the handshake: + + + +1. **Interested / Not Interested Messages** + + - **Purpose**: After connecting, a peer will send an interested message if it wants pieces from the other peer, or not interested if it doesn’t need anything at the moment. + + - **Flow Example**: + + - Peer A sends interested to Peer B if it’s looking for pieces. + + - Peer B responds with unchoke if it’s willing to upload to Peer A (choking controls upload permissions). + +2. **Have Message** + + - **Purpose**: A peer sends have messages to inform the other peer that it has successfully downloaded a specific piece. + + - **Flow Example**: + + - Peer A completes a piece and sends a have message with the piece index to Peer B. + + - Peer B can then decide to mark Peer A as a potential source for that piece. + +3. **Request Message** + + - **Purpose**: To request data, the peer sends a request message specifying the piece index, the byte offset within that piece, and the length of the requested block. 
+ + - **Flow Example:** + + - Peer A sends a request message to Peer B for a block of a piece. + + - Peer B, if it has unchoked Peer A, responds with a piece message containing the requested block. + +4. **Piece Message** + + - **Purpose**: The piece message contains the actual data being transferred: a block of a piece, identified by the piece index and the byte offset within that piece. + + - **Flow Example**: + + - Peer B sends a piece message to Peer A with the data after receiving a request from Peer A. + + - Peer A receives the block and, once all blocks of the piece have arrived, verifies the piece (using a hash check); it may then send requests for the next piece. + +5. **Choke / Unchoke Messages** + + - **Purpose**: Peers use choke and unchoke messages to control the upload permissions to other peers, helping manage bandwidth and prioritize uploads. + + - **Flow Example**: + + - If Peer B’s upload slots are full, it sends a choke message to Peer A, temporarily halting uploads. + + - When an upload slot frees up, Peer B sends an unchoke message to Peer A, allowing it to resume downloading. + +6. **Keep-Alive Message** + + - **Purpose**: To keep the connection open, a peer will send a keep-alive message (length prefix of zero) if no other messages have been sent within a two-minute interval. + + - **Flow Example**: + + - Peer A sends a keep-alive message to Peer B after a period of inactivity to signal that the connection is still active. + +**Example Peer Message Flow** + +Imagine Peer A and Peer B connecting to share a file in the torrent: + +1. **Handshake**: Both peers complete the handshake and verify they’re part of the same torrent. + +2. **Interest and Permission**: + + - Peer A sends an interested message to Peer B, indicating it wants to download pieces. + + - Peer B responds with an unchoke message, granting permission to download. + +3. **Data Requests**: + + - Peer A sends a request message asking for a piece. + + - Peer B responds with a piece message containing the data. + +4. 
**Notification of Completion**: + + - After receiving and verifying a piece, Peer A sends a have message to inform Peer B that it has that piece. + + - Periodic Keep-Alive: + + - If there is a lull in communication, Peer A and Peer B may exchange keep-alive messages to ensure the connection stays active. + +This cycle of requesting, receiving, and sharing updates continues until Peer A has downloaded all the necessary pieces of the file from Peer B and potentially other peers in the swarm. + +### Question 4 + +How do the peers know which peer has which pieces? Initially, it may happen that there is only one “seeder”, right? +#### Answer + +In the BitTorrent protocol, peers keep track of which pieces each peer in the swarm has through a system of bitfields and have messages: + +1. **Bitfield Message (Initial Knowledge)** + + - When a peer first connects to another peer, it sends a bitfield message if it has any pieces of the file. This message is a binary string where each bit represents a piece, indicating whether the peer has that specific piece (1 for presence, 0 for absence). + + - **Example**: If the torrent file has 8 pieces, a bitfield of 10100011 would mean that the peer has pieces 0, 2, 6, and 7. + + - If the peer doesn’t have any pieces (for example, if it just started downloading), it may skip sending a bitfield or send an empty bitfield, depending on the client. + +2. **Have Messages (Dynamic Updates)** + + - As peers download pieces, they send have messages to inform other connected peers of newly acquired pieces. + + - **Example**: If Peer A downloads piece 3, it sends a have message with the piece index (3) to each peer it’s connected to. This allows other peers to update their view of which pieces Peer A has. + +3. **Initial Seeding and Peer Awareness** + + - In a new swarm, there might only be one seeder (a peer that has the entire file). The seeder’s bitfield will show all bits set to 1, indicating it has every piece. 
+ + - Peers that connect to this seeder will quickly learn that it has all pieces and can request them in the order specified by the piece selection strategy (e.g., rarest first, random). + +4. **Piece Selection Strategy** + + - Once a peer knows which pieces each peer has, it uses a piece selection strategy (like rarest-first) to decide which pieces to request. This helps distribute pieces more evenly across the swarm and prevents all peers from downloading the same piece from the seeder. + +Through the initial bitfield exchange and subsequent have messages, peers dynamically update their knowledge of which pieces each peer in the swarm has, allowing them to efficiently download and share pieces among themselves. + +### Question 5 + +So, how does it all start? Let’s say Peer A found a tracker-less torrent file, and wants to download that file. As you can see, I am describing a tracker-less setup, which means that the torrent file will not have an `announce` field, but rather a `nodes` field. Thus, in this scenario we are using the DHT. I would like to see a complete exchange, with all the messages, for a file with 3 pieces and a swarm of 3 peers. Can you help? +#### Answer + +In a tracker-less BitTorrent setup, peers use the Distributed Hash Table (DHT) to find each other without a central tracker. Here’s a step-by-step example of how Peer A could begin downloading a file with 3 pieces from a swarm of 3 peers using DHT. + +##### Initial Setup + +- **File Info**: The torrent file describes a file split into 3 pieces. + +- **Swarm**: Three peers (A, B, and C) are in the swarm. + +- **DHT**: The torrent has no announce field but contains a nodes field, listing known DHT nodes. + +##### Steps in the Exchange + +**Step 1: Peer A Joins the DHT Network** + +1. **Bootstrapping**: + + - Peer A begins by connecting to DHT nodes listed in the nodes field. These bootstrap nodes help Peer A join the DHT network by introducing it to other nodes. + +2. 
**Finding Peers for the Torrent**: + + - Peer A uses the torrent’s info_hash (the SHA1 hash of the torrent’s “info” section) as the DHT lookup key. + + - Peer A sends get_peers queries to the nodes in its routing table that are closest to the info_hash. Each queried node replies either with peers for that info_hash or with nodes that are closer to it. By repeating get_peers against those closer nodes, Peer A eventually locates Peers B and C, who are also sharing this torrent. + +**Step 2: Peer A Connects to Peers B and C** + +1. **Peer-to-Peer Handshake**: + + - Peer A initiates a connection to Peer B by sending the BitTorrent handshake, including the protocol identifier, info_hash, and Peer A’s ID. + + - Peer B responds with its own handshake, verifying that both are part of the same torrent. + + - Peer A repeats this with Peer C. + +**Step 3: Exchanging Bitfields** + +1. **Bitfield Exchange**: + + - Peer B has all three pieces (is a seeder) and sends a bitfield 111 (indicating it has all pieces). + + - Peer C has only the first two pieces and sends a bitfield 110. + + - Peer A has no pieces yet and sends a bitfield 000 (or skips the bitfield message altogether, which BEP3 allows for a downloader that has nothing yet). + +2. **Updating Knowledge**: + + - After the bitfield exchange, Peer A knows which pieces each peer has: + + - Peer B can provide any piece (0, 1, or 2). + + - Peer C can only provide pieces 0 and 1. + +**Step 4: Piece Requests and Data Transfer** + +1. **Piece Selection Strategy**: + + - Peer A decides to request pieces from the rarest to the most common. It finds that piece 2 is rare (only available from Peer B). + +2. **Requesting Pieces**: + + - **Requesting Piece 2**: + + - Peer A sends a request message to Peer B for piece 2. + + - Peer B responds with a piece message containing piece 2. + + - **Requesting Piece 0**: + + - Peer A then sends a request message to Peer C for piece 0. + + - Peer C responds with a piece message containing piece 0. + + - **Requesting Piece 1**: + + - Finally, Peer A requests piece 1 from Peer C, who sends it over in a piece message. + +3. 
**Announcing Newly Downloaded Pieces:** + + - Each time Peer A completes a piece, it sends a have message to both Peer B and Peer C to inform them of its new pieces. + + - After completing all three pieces, Peer A has the full file and can start serving as a seeder. + +**Step 5: Ongoing Keep-Alives and Maintenance** + +1. **Keep-Alive Messages**: + + - If there is a period of inactivity, the peers send keep-alive messages every two minutes to maintain the connection. + +2. **Maintaining DHT Membership**: + + - Peer A periodically sends DHT ping requests to refresh its connection with other DHT nodes, ensuring it remains discoverable in the swarm. + +This complete exchange shows how Peer A starts with a DHT lookup to find peers, establishes connections, exchanges bitfields to learn which pieces each peer has, and requests pieces in a way that optimizes download efficiency. Through this process, Peer A downloads the full file and becomes part of the swarm, ready to upload pieces to other peers. + +### Question 5a + +In **Step 3 - Exchanging Bitfields** - peers are sending their bitfields to other peers. But how does it happen? How peers know that they should send their bitfields to other peers? This is not a broadcast, right? What is the trigger? +#### Answer + +In the BitTorrent protocol, the exchange of bitfields between peers is not automatic or broadcasted. Instead, bitfields are triggered by the completion of the handshake between two peers. Here’s how it happens: + +1. **Handshake Completion**: + + - When two peers complete the handshake, they confirm they’re part of the same torrent by verifying the info_hash. + + - Once the handshake is successful, both peers now recognize each other as legitimate participants in the torrent. + +2. **Bitfield Message Trigger**: + + - Immediately following the handshake, each peer sends a bitfield message to the other. This message provides a snapshot of which pieces the peer already possesses. 
+ + - If a peer has no pieces (such as a newly joined downloader), it might send an empty or zeroed bitfield, depending on the client implementation. + +3. **Purpose of Bitfield Exchange**: + + - This initial bitfield exchange is essential because it allows both peers to understand what pieces each other has, which guides future request and have messages. + + - After the initial exchange, any newly completed pieces are communicated with have messages, not a repeated bitfield. + +So, the handshake acts as the trigger for the bitfield message. This is a direct, one-to-one exchange and not a broadcast. Each peer sends its bitfield only to those peers it connects with individually, ensuring that the swarm doesn’t experience unnecessary network load from broadcasting. diff --git a/10 Notes/BEP5 - DHT Protocol.md b/10 Notes/BEP5 - DHT Protocol.md index aca954d..a3b063a 100644 --- a/10 Notes/BEP5 - DHT Protocol.md +++ b/10 Notes/BEP5 - DHT Protocol.md @@ -4,16 +4,18 @@ tags: - dht link: http://bittorrent.org/beps/bep_0005.html related-to: + - "[[BitTorrent DHT clarifications]]" - "[[Learn BitTorrent]]" + - "[[Protocol v1 clarifications]]" --- #bittorrent #dht -| link | http://bittorrent.org/beps/bep_0005.html | -| ---------- | ---------------------------------------- | -| related-to | [[Learn BitTorrent]] | +| link | http://bittorrent.org/beps/bep_0005.html | +| ---------- | --------------------------------------------------------------------------------------- | +| related-to | [[BitTorrent DHT clarifications]], [[Learn BitTorrent]], [[Protocol v1 clarifications]] | For the DHT protocol, there are four **queries**: -- `ping`: to check if another node (one from its DHT routing table) is online and reachable, +- [[ping]]: to check if another node (one from its DHT routing table) is online and reachable, - `find_node`: find the contact information for a node given its ID, - `get_peers`: get peers associated with a torrent [[Infohash|infohash]], - `announce_peer`: to 
*announce* that the peer, controlling the querying node, is downloading a torrent on a port. diff --git a/10 Notes/BEP52 - The BitTorrent Protocol Specification v2.md b/10 Notes/BEP52 - The BitTorrent Protocol Specification v2.md new file mode 100644 index 0000000..e9c293f --- /dev/null +++ b/10 Notes/BEP52 - The BitTorrent Protocol Specification v2.md @@ -0,0 +1,85 @@ +--- +tags: + - bittorrent +link: http://bittorrent.org/beps/bep_0052.html +related-to: + - "[[Learn BitTorrent]]" + - "[[libtorrent-rasterbar]]" +--- +#bittorrent + +| link | http://bittorrent.org/beps/bep_0052.html | +| ---------- | ---------------------------------------------- | +| related-to | [[Learn BitTorrent]], [[libtorrent-rasterbar]] | + + +Dated 10-Jan-2008, [BEP52](http://bittorrent.org/beps/bep_0052.html) describes `v2` of the BitTorrent protocol. Below I attempt to enumerate the most important changes. + +> I am not trying to be exhaustive, not very precise here, as we will basically focus on the version 2 of the protocol. If we want to support `v1` we will learn it better while working on it. + +### ChatGPT clarifications + +### Question 1 + +Ok, let’s now switch to BitTorrent v2 (BEP52: http://bittorrent.org/beps/bep_0052.html). + +Can you summarise the differences between BitTorrent protocol version 2 (BEP52) and version 1 (BEP3)? +#### Answer + +BitTorrent v2 (BEP52) introduces several key improvements over the original BitTorrent protocol (v1, as defined in BEP3), focusing on enhanced security, data integrity, and efficiency. Here’s a summary of the main differences: + +1. **Piece Hashing and Integrity**: + + - **V1 (BEP3)**: Uses SHA-1 hashes for each piece. The entire torrent is hashed as one, and the `info_hash` is an SHA-1 hash of the “info” dictionary. + + - **V2 (BEP52)**: Switches to SHA-256 hashes, providing stronger security. V2 introduces merkle trees for piece verification within each file, which allows smaller hashes and more flexible integrity checks. 
Each file has its own hash tree, improving the handling of multi-file torrents. + +2. **Improved Data Integrity with Merkle Trees**: + + - **V1**: Uses a single SHA-1 hash per piece, stored sequentially in the .torrent file, meaning each piece’s hash is validated individually but does not connect them hierarchically. + + - **V2**: Uses merkle trees for each file, storing SHA-256 hashes in a hierarchical tree structure. This allows peers to verify individual pieces more efficiently without needing all hashes, which is particularly helpful for partial downloads. + +3. **File-Based Hashing**: + + - **V1**: In multi-file torrents, only pieces are hashed, not individual files, meaning no direct file-level validation. + + - **V2**: Each file is hashed independently, creating a separate merkle tree for each. This enables file-level validation, so files can be verified independently, improving integrity and security for multi-file torrents. + +4. **Hybrid Torrents (Backward Compatibility)**: + + - **V1**: Not applicable; designed with no forward compatibility. + + - **V2**: Supports hybrid torrents, which are torrents compatible with both v1 and v2 clients. Hybrid torrents contain both SHA-1 (for v1) and SHA-256 (for v2) hashes, allowing legacy and v2 clients to participate in the same swarm. + +5. **Reduced Hash List Size for Large Files**: + + - **V1**: The .torrent file stores a linear list of hashes, which grows with the number of pieces. + + - **V2**: By using merkle trees, V2 stores only a root hash for each file, significantly reducing the size of the hash list for large files. Peers can request only the relevant parts of the tree they need for validation, which is more efficient. + +6. **Improvements in Piece Alignment and Flexibility**: + + - **V1**: Pieces are aligned across files, meaning a piece can contain parts of multiple files, which complicates file validation and partial downloads. 
+ + - **V2**: Pieces are aligned to individual files, meaning each piece belongs to a single file. This makes it easier to download and verify single files within a multi-file torrent. + +7. **Increased Security and Reduced Collision Risk**: + + - **V1**: Uses SHA-1, which is now considered weak and vulnerable to collision attacks. + + - **V2**: Uses SHA-256, which provides significantly stronger security against collisions, helping protect the integrity of torrents and reducing the risk of hash collisions. + +##### Summary Table + +| Feature | BitTorrent v1 (BEP3) | BitTorrent v2 (BEP52) | +| ---------------------- | ------------------------------ | -------------------------------- | +| Hash Algorithm | SHA-1 | SHA-256 | +| Piece Hashing | Linear hash list per piece | Merkle trees per file | +| File-Level Integrity | Not supported | Supported with file-based hashes | +| Backward Compatibility | None | Hybrid torrents supported | +| Hash List Efficiency | Linear growth with pieces | Reduced size with merkle trees | +| Piece Alignment | Spans multiple files | Aligned per file | +| Security | Weaker (SHA-1 vulnerabilities) | Stronger (SHA-256) | + +These improvements in BitTorrent v2 (BEP52) focus on enhancing data integrity, security, and efficiency, especially for larger torrents and multi-file torrents. Hybrid torrents further ease the transition, allowing v2 and legacy clients to coexist within the same swarms. \ No newline at end of file diff --git a/10 Notes/BitTorrent Clients Usage Statistics.md b/10 Notes/BitTorrent Clients Usage Statistics.md index 1f283d7..8fbdd95 100644 --- a/10 Notes/BitTorrent Clients Usage Statistics.md +++ b/10 Notes/BitTorrent Clients Usage Statistics.md @@ -14,6 +14,6 @@ As of 2024, the popularity of BitTorrent clients can be quantified through vario 1. **uTorrent and BitTorrent**: These two clients, owned by the same company, collectively have around **150 million active monthly users**. 
uTorrent alone is particularly popular and widely used worldwide, especially in countries like Russia, India, and the United States. 2. [[qBittorrent]]: This open-source client is another popular choice, especially among users looking for an ad-free experience. [[qBittorrent]] receives around **6 million visits monthly**, with most users coming from countries like China, the United States, and Russia. It’s particularly popular with younger users, with over 35% of its visitors aged 25-34 . -3. [[Deluge (BitTorrent)]]: Although smaller, Deluge remains popular with an estimated **272,800 monthly visitors**. The majority of its users are from the United States, Brazil, and France. It attracts a similar age demographic as other clients, with a strong preference among those aged 25-34 . +3. [[Deluge]]: Although smaller, Deluge remains popular with an estimated **272,800 monthly visitors**. The majority of its users are from the United States, Brazil, and France. It attracts a similar age demographic as other clients, with a strong preference among those aged 25-34 . These numbers highlight the reach and usage patterns of the most popular BitTorrent clients. While uTorrent and BitTorrent dominate in terms of sheer volume, [[qBittorrent]] has a dedicated user base that values its open-source and ad-free nature, and Deluge maintains a smaller but loyal following. diff --git a/10 Notes/BitTorrent DHT clarifications.md b/10 Notes/BitTorrent DHT clarifications.md new file mode 100644 index 0000000..e073d7c --- /dev/null +++ b/10 Notes/BitTorrent DHT clarifications.md @@ -0,0 +1,155 @@ +--- +tags: + - bittorrent + - dht +related-to: + - "[[Learn BitTorrent]]" +--- +#bittorrent + +| related-to | [[Learn BitTorrent]] | +| ---------- | -------------------- | + +### Question 1 + +The BitTorrent Enhancement proposal for DHTs (BEP-5) includes the following paragraph: + +“The return value for a query for peers includes an opaque value known as the "token." 
For a node to announce that its controlling peer is downloading a torrent, it must present the token received from the same queried node in a recent query for peers. When a node attempts to "announce" a torrent, the queried node checks the token against the querying node's IP address. This is to prevent malicious hosts from signing up other hosts for torrents. Since the token is merely returned by the querying node to the same node it received the token from, the implementation is not defined. Tokens must be accepted for a reasonable amount of time after they have been distributed. The BitTorrent implementation uses the SHA1 hash of the IP address concatenated onto a secret that changes every five minutes and tokens up to ten minutes old are accepted.” + +Can you explain it better or make it clearer? I really do not see how those Tokens are working… + +#### Answer + +The use of tokens in the BitTorrent DHT (Distributed Hash Table) serves as a security measure to verify that a node genuinely intends to join a torrent swarm. Here’s a clearer explanation: + +##### How Tokens Work + +1. **Token Issuance:** + - When a node (let’s call it Node A) queries another node (Node B) in the DHT for peers associated with a torrent, Node B responds with a list of peers. Along with the peer list, Node B includes an opaque “token” in its response. + + - This token is generated uniquely by Node B and is associated with the IP address of Node A at the time of the request. + +2. **Using the Token:** + + - If Node A later wants to announce itself as a peer for the torrent (effectively telling Node B, “I’m now part of this swarm”), it must send this previously received token back to Node B. + + - Node B checks if the token matches what it previously issued for Node A’s IP address. This step ensures that the token is only valid for the original requester, which is Node A in this case. + +3. 
**Security Check:** + + - This token mechanism prevents “malicious” nodes from misrepresenting other IP addresses as part of the torrent swarm. For example, without the token mechanism, an attacker could attempt to announce other IP addresses to the swarm, making it appear as though other nodes are participating when they’re not. + + - Since tokens are specific to the requesting IP and are time-sensitive, they are difficult to forge or reuse from a different IP address. + +4. **Token Expiry:** + + - Tokens have a limited validity, usually around five to ten minutes. BitTorrent’s implementation, as mentioned, uses a SHA1 hash of the requester’s IP address combined with a secret that changes every five minutes. This time-bound validity ensures that if a token is misused, it won’t be valid for long, adding another layer of security. + +In summary, the token is a temporary key used to confirm that a peer intending to announce itself actually performed a recent lookup from its current IP. This prevents unauthorized nodes from announcing other IPs, helping to keep the DHT swarm authentic and secure. + +### Question 2 + +OK, later, I read: + +“Upon inserting the first node into its routing table and when starting up thereafter, the node should attempt to find the closest nodes in the DHT to itself. It does this by issuing find_node messages to closer and closer nodes until it cannot find any closer. The routing table should be saved between invocations of the client software.” + +When calling `find_node` to “closer and closer” nodes, where are we starting from? +#### Answer + +In the DHT (Distributed Hash Table) protocol, the process of finding “closer and closer” nodes during a find_node search is centered on locating nodes that have identifiers (node IDs) numerically closer to the node initiating the search. Here’s how it works: + +1. 
**Starting Point - Known Nodes in the Routing Table:** + + - When a node (let’s call it Node A) joins the DHT, it typically starts with an initial set of known nodes. These nodes may come from a pre-existing list, prior DHT connections, or even cached nodes from a previous session if the client saves its routing table. + + - Node A then starts by querying these known nodes to find nodes that are progressively closer (in terms of XOR distance) to its own ID. + +2. **XOR Distance for Proximity:** + + - In the DHT, node IDs are compared using an XOR metric, where the “distance” between two nodes is the XOR result of their IDs. The closer the XOR result is to zero, the closer the node IDs are. + + - Node A issues find_node requests to its known nodes, asking for nodes that are closer to its own ID. Each find_node response provides a list of nodes that are closer to Node A’s ID than previous nodes. + +3. **Iterative Process - Narrowing Down:** + + - Node A repeats this process iteratively: with each step, it queries nodes that are progressively closer to its ID until it can no longer find any nodes that are closer. + + - This process effectively narrows down the search space in the DHT and helps Node A locate its own neighborhood in the distributed network. + +4. **Building the Routing Table:** + + - As Node A locates closer nodes, it adds them to its routing table, arranging them in buckets based on their XOR distance to its own ID. This routing table structure is then used for efficiently finding peers and performing lookups in the DHT. + +In summary, the search starts with known nodes and iteratively progresses toward nodes with IDs numerically closer to the querying node’s own ID, refining the list of nodes to build a complete, locally optimized view of the network. This method ensures that the node can quickly locate nearby nodes in the DHT structure. 
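The "closer and closer" search described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any real client implements it: the `network` dictionary stands in for real `find_node` queries over UDP, and `node_id`, `closest` and `iterative_find_node` are hypothetical helper names introduced only for illustration.

```python
import hashlib


def node_id(label: str) -> int:
    """Hypothetical helper: derive a 160-bit ID from a label so the example is repeatable."""
    return int.from_bytes(hashlib.sha1(label.encode()).digest(), "big")


def xor_distance(a: int, b: int) -> int:
    """Kademlia-style distance: XOR of the two IDs, read as an unsigned integer."""
    return a ^ b


def closest(candidates, target: int, k: int = 3):
    """Return up to k candidate IDs ordered by XOR distance to target (target excluded)."""
    others = {n for n in candidates if n != target}
    return sorted(others, key=lambda n: xor_distance(n, target))[:k]


def iterative_find_node(network, start_nodes, target: int, k: int = 3):
    """Query closer and closer nodes until the shortlist holds no unqueried node.

    `network` maps a node ID to the set of IDs that node knows about; reading
    it stands in for sending a real find_node query (an assumption of this sketch).
    """
    shortlist = closest(start_nodes, target, k)
    queried = set()
    while True:
        pending = [n for n in shortlist if n not in queried]
        if not pending:
            return shortlist
        for n in pending:
            queried.add(n)
            # Simulated find_node reply: the k nodes closest to target that n knows.
            reply = closest(network.get(n, set()), target, k)
            shortlist = closest(set(shortlist) | set(reply), target, k)


ids = [node_id(f"node-{i}") for i in range(20)]
me = node_id("node-A")
# Toy network in which every node knows every other node.
network = {n: set(ids) - {n} for n in ids}
neighbours = iterative_find_node(network, start_nodes=[ids[0]], target=me)
print([hex(n)[:10] for n in neighbours])
```

In this toy network every node knows all the others, so the search converges after one round. With sparse routing tables the same loop may stop at a local neighbourhood, which is why real implementations keep more candidates than the three used here.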
+ +### Question 3 + +But here we are talking about starting a fresh node, and adding the first node (itself), how does it know which ID to use as a starting point of this ”closer and closer” search to start subsequent nodes? Is it random from the whole ID space? +#### Answer + +When a fresh DHT node (with no pre-existing routing table) starts, it generally performs the following steps to join the network: + +1. **Choosing an Initial ID:** + + - Each DHT node generates its own unique ID when it first joins the network. This ID is typically selected randomly from the entire ID space. The ID does not directly influence which nodes it will connect to at first but serves as a unique identifier for locating its “closest” nodes in the network. + +2. **Starting with Bootstrap Nodes:** + + - Since the new node has no information about the existing network, it must begin by contacting bootstrap nodes. These are predefined nodes whose IP addresses are embedded in the client software or are provided by a trusted external source. + + - The bootstrap nodes help the fresh node “see” the network by responding with a list of other nodes that are relatively close to the new node in the ID space. + +3. **Executing the find_node Search:** + + - Using these bootstrap nodes as a starting point, the new node begins issuing find_node requests. The goal is to locate nodes that are progressively closer to its own ID by comparing XOR distances. + + - Each response provides a list of nodes with IDs closer to the new node’s own ID, allowing it to iteratively narrow down and populate its routing table. + +4. **Building a Routing Table:** + + - As the new node receives responses, it builds a routing table by adding nodes in “buckets” based on their proximity (XOR distance) to its own ID. This process enables the new node to gradually become part of the DHT by efficiently locating nearby peers in the network. 
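The steps above (pick a random ID, then sort discovered nodes into buckets by XOR distance) can be sketched as follows. This is a simplified illustration, assuming a fixed layout of one bucket per distance bit; BEP 5 itself describes buckets that start as a single range and split dynamically. `RoutingTable`, `bucket_index` and `random_node_id` are hypothetical names, not part of any library.

```python
import os

ID_BITS = 160  # node IDs share the 160-bit space of infohashes
K = 8          # bucket capacity given in BEP 5


def random_node_id() -> int:
    """Pick a node ID uniformly at random from the whole ID space."""
    return int.from_bytes(os.urandom(ID_BITS // 8), "big")


def bucket_index(own_id: int, other_id: int) -> int:
    """Position of the highest bit in which the two IDs differ (0 = nearest bucket)."""
    distance = own_id ^ other_id
    if distance == 0:
        raise ValueError("a node does not store itself in its routing table")
    return distance.bit_length() - 1


class RoutingTable:
    """Simplified table: one list of at most K node IDs per distance bit."""

    def __init__(self, own_id: int):
        self.own_id = own_id
        self.buckets = {}

    def add(self, other_id: int) -> bool:
        """Store other_id if its bucket has room; return whether it is in the table."""
        bucket = self.buckets.setdefault(bucket_index(self.own_id, other_id), [])
        if other_id in bucket:
            return True
        if len(bucket) >= K:
            return False  # a real client would ping stale entries before giving up
        bucket.append(other_id)
        return True


table = RoutingTable(random_node_id())
for _ in range(100):
    table.add(random_node_id())
print({index: len(nodes) for index, nodes in sorted(table.buckets.items())})
```

Running it shows that almost all random IDs land in the few highest-numbered buckets: half of the ID space differs from our own ID in the top bit, a quarter in the next one, and so on. Nodes in the far half of the space are therefore heavily sampled, while nearby nodes are rare and worth keeping.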
+ +By using bootstrap nodes and performing iterative find_node queries, the new node can establish its place in the DHT without needing prior knowledge of any specific starting ID. The randomly generated ID ensures that each node’s location within the DHT is unique and well-distributed across the ID space, which supports the overall balance and resilience of the network. + +### Question 4 + +It says in the context of KRPC protocol that: + +“The transaction ID should be encoded as a short string of binary numbers, typically 2 characters are enough as they cover 2^16 outstanding queries.” + +What is a “string of binary numbers”? Can you give an example? + +> This is an excellent example of how **not** to write specs. The term “string of binary numbers” is nonsense in best case, or an oxymoron in the worst. Binary is already a representation of a number. A number can `33` - here using *base-10* or *decimal* encoding. The same number written in *base-16*, or *hexadecimal* encoding is `11` and in *base-2* or binary encoding it is `00010001`. But what is a binary number? It is a shortcut to say *a number in binary encoding*. So what would be a string of binary numbers. Would “1110011, 11011111, 000111” qualify? +> This is one of many examples, where the language in the spec is unclear and ambiguous. + +> It is amusing to see how ChatGPT tries hard to make sense out of it. Admittedly doing quite a good job here… +#### Answer + +In the context of the KRPC (Kademlia RPC) protocol used by BitTorrent DHT, a “string of binary numbers” refers to a sequence of bytes (not just 0s and 1s) that represent a transaction ID. This ID is used to match requests with responses and is typically short, only 2 bytes long. Each byte can represent values from 0 to 255, so with two bytes (16 bits), you can represent 65,536 unique transaction IDs, which is sufficient for managing active requests in the network. 
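Read as "a short byte string", the transaction ID is easy to sketch. The snippet below is a minimal illustration, assuming a simple wrapping counter; `TransactionIds` is a hypothetical helper, and real clients are free to generate IDs differently since the value is opaque to the remote node.

```python
import struct


class TransactionIds:
    """Hand out 2-byte transaction IDs and match replies to outstanding queries."""

    def __init__(self):
        self._next = 0
        self.pending = {}  # transaction ID (bytes) -> name of the query sent

    def new(self, query_name: str) -> bytes:
        # "!H" packs an unsigned 16-bit integer in network byte order: 2 bytes.
        tid = struct.pack("!H", self._next)
        self._next = (self._next + 1) % 0x10000  # wrap after 2^16 IDs
        self.pending[tid] = query_name
        return tid

    def match(self, tid: bytes):
        """Return the query this reply answers, or None if the ID is unknown."""
        return self.pending.pop(tid, None)


tids = TransactionIds()
first = tids.new("ping")
second = tids.new("find_node")
print(first, second, tids.match(second), tids.match(second))
print(struct.pack("!H", 0x6161))  # two bytes that happen to be printable
```

The last line shows why the spec can speak of "characters": the two bytes `0x61 0x61` are the ASCII string `aa`, which is the transaction ID used in the BEP 5 message examples.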
+ +Example + +A “string of binary numbers” could look like this in hexadecimal (which is often used to represent binary data more compactly): + + • 0x01 0xAB or in binary: 00000001 10101011 + +Here, 0x01AB represents a 2-byte transaction ID. Since the KRPC protocol expects a transaction ID to be short and efficient, these IDs are kept at 2 bytes, allowing enough room to track multiple requests at once without using excessive bandwidth. + +In many programming languages, these IDs would simply be handled as byte arrays or short strings and treated as opaque binary data, meaning there’s no need for them to be human-readable. + +#### Follow-up question + +So, we need 4 characters, right? Why do they say that 2 characters are sufficient? + +#### Answer + +When the KRPC protocol documentation refers to using “2 characters” to encode the transaction ID, it typically means 2 **bytes**, not literal ASCII characters. Since each byte can represent 256 values (0-255), two bytes provide a range of 65,536 possible transaction IDs, which is often enough to manage outstanding queries. + +Here’s how this works in binary and hexadecimal formats: + +- Binary: `00000001 10101011` (16 bits total, where each “character” is actually 8 bits or 1 byte). + +- Hexadecimal: `0x01AB` (which is still 2 bytes or 16 bits, representing the same value as above). + +In hexadecimal, this range covers values from 0x0000 to 0xFFFF, making it possible to manage a high number of concurrent requests without needing more than 2 bytes. + diff --git a/10 Notes/Codex-BitTorrent Integration Intro.md b/10 Notes/Codex-BitTorrent Integration Intro.md index dc42136..11cf404 100644 --- a/10 Notes/Codex-BitTorrent Integration Intro.md +++ b/10 Notes/Codex-BitTorrent Integration Intro.md @@ -3,7 +3,7 @@ tags: - bittorrent --- -Let's gather some background data about [[Learn BitTorrent|BitTorrent]] clients. This is something that ChatGPT is quite good at so I compiled a bit of data you can easily check for yourself. 
If something looks funky, please feel free to adjust and clarify, but just for having quick check of where we are, it seems to be a reasonable overview. I also saw @Giuliano Mega immediately jumping into [[Deluge (BitTorrent)]] (also @Mark sees that as a good start), so for my personal conviction we are diving into the right thing I wanted to have some arguments on why - however short and imprecise it is not. +Let's gather some background data about [[Learn BitTorrent|BitTorrent]] clients. This is something that ChatGPT is quite good at so I compiled a bit of data you can easily check for yourself. If something looks funky, please feel free to adjust and clarify, but just for having quick check of where we are, it seems to be a reasonable overview. I also saw @Giuliano Mega immediately jumping into [[Deluge]] (also @Mark sees that as a good start), so for my personal conviction we are diving into the right thing I wanted to have some arguments on why - however short and imprecise it is not. The most popular BitTorrent clients as of recent years include: @@ -16,7 +16,7 @@ The most popular BitTorrent clients as of recent years include: 3. **BitTorrent** 1. The official client from BitTorrent Inc., it shares many similarities with µTorrent since the same company owns both. 2. It’s user-friendly and offers both free and paid versions with additional features. -4. **[[Deluge (BitTorrent)]]** +4. **[[Deluge]]** 1. An open-source client known for its extensive plugin support, making it highly customizable. 2. It has a simple interface and is available for Windows, macOS, and Linux. 5. 
**Transmission** diff --git a/10 Notes/Comparison of BitTorrent clients.md b/10 Notes/Comparison of BitTorrent clients.md index e2d544a..fa8aa87 100644 --- a/10 Notes/Comparison of BitTorrent clients.md +++ b/10 Notes/Comparison of BitTorrent clients.md @@ -4,14 +4,18 @@ tags: --- #bittorrent -On Wikipedia we can get a decent overview of [[Learn BitTorrent|BitTorrent]] clients: https://en.wikipedia.org/wiki/Comparison_of_BitTorrent_clients. - -> It should be noted here, that Wikipedia sources are often quite outdated. - -From this comparative, it does feel appropriate to further constraint our research to the following three clients: +Recall that we focus on the following three: - [[qBittorrent]] - [[Transmission]] -- [[Deluge (BitTorrent)]] +- [[Deluge]] -All clients have similar capabilities and provide a similar set of features. Looking at look-and-feel of the website, Transmission looks the most attractive. If I would be a regular, non-dirty-hacker user, I would be attracted by [[Transmission]]. Other clients feel to me more niche, yet I would put [[qBitTorrent]] on the second position. [[Deluge (BitTorrent)]], still potentially attractive, with significantly lower user-base, being quite feature complete, still looks a bit ”hackie”. Both [[Deluge (BitTorrent)]] and [[qBittorrent]] depend on [[libtorrent-rasterbar]], while [[Transmission]] is using its own BitTorrent protocol implementation. \ No newline at end of file +Below is a short summary. + +### Feature set + +- [[qBittorrent]] appears to be the most comprehensive. It has good support for BitTorrent version 2 features, supports magnet links (`v1`, `v2`, and `hybrid`), has good export capabilities, and provides access to an impressive set of options (both specific to qBittorrent and to [[libtorrent-rasterbar|libtorrent]]). It has the largest user base and looks reasonably good on Ubuntu. +- [[Transmission]] has the best support on macOS and looks really good there. It also provides versions based on Qt and GTK. 
Its user base is similar to that of [[qBittorrent]]. It exposes a limited number of settings beyond the most important protocol extensions. On the other hand, fewer options make it less overwhelming for a regular user. +- [[Deluge]] - a Python-based client with the lowest user base (300k, compared to around 6M for the other two clients). Feature-wise it can be placed somewhere between [[qBittorrent]] and [[Transmission]]. Because it is written in Python, it may feel easier to work with cross-platform. Yet, it feels more buggy than the other two. + +Both [[Deluge]] and [[qBittorrent]] depend on [[libtorrent-rasterbar]], while [[Transmission]] is using its own BitTorrent protocol implementation. \ No newline at end of file diff --git a/10 Notes/Deluge (BitTorrent).md b/10 Notes/Deluge (BitTorrent).md deleted file mode 100644 index 65f772c..0000000 --- a/10 Notes/Deluge (BitTorrent).md +++ /dev/null @@ -1,26 +0,0 @@ ---- -tags: - - bittorrent ---- -Deluge is a [[Learn BitTorrent|BitTorrent]] client. - -Official link: https://deluge-torrent.org. -Git: https://git.deluge-torrent.org/deluge (they are not on GitHub!) - -At [https://deluge-torrent.org/about/](https://deluge-torrent.org/about/) we can read that Deluge is able to run on headless machines with the user-interfaces being able to connect remotely from any platform. - -From @Giuliano Mega: - -I'm also finding that the python bindings are incomplete and expose only a subset of the API btw one way to approach the integration would be by going top down on how Deluge uses libtorrent and then looking at the minimum needed to get it running. The Deluge core is actually a lot simpler than I expected I think this may be less effort than trying to build the entire API from scratch on top of Codex e.g. 
if we can get enough to run the Deluge daemon on top of Codex, then all the rest (GTK UI, Web UI) sort of works… Other links: - -- simple experiment setup: [https://github.com/gmega/bt-experiment](https://github.com/gmega/bt-experiment) -- Deluge fork with instrumentation for metrics: [https://github.com/gmega/deluge](https://github.com/gmega/deluge) -- some notes on libtorrent: [https://hackmd.io/NhVe1A5HT92NALDufacuiA](https://hackmd.io/NhVe1A5HT92NALDufacuiA) -- how to setup a dev env with Deluge + libtorrent: [https://hackmd.io/ESDTgprbSPmViMxc5yKTiQ](https://hackmd.io/ESDTgprbSPmViMxc5yKTiQ) - -Related: - -- [https://github.com/codex-storage/nim-codex/issues/959](https://github.com/codex-storage/nim-codex/issues/959) - **Codex/BitTorrent integration** -- [https://github.com/codex-storage/nim-codex/issues/951](https://github.com/codex-storage/nim-codex/issues/951) - **Control BitTorrent** -- [Libtorrent and Deluge from sources](https://hackmd.io/ESDTgprbSPmViMxc5yKTiQ) -- [controlling Deluge using its RPC interface](https://github.com/gmega/bt-experiment/blob/c6af36b349f0211df69781233d387de229d68f62/experiment.py#L91) \ No newline at end of file diff --git a/10 Notes/Deluge.md b/10 Notes/Deluge.md new file mode 100644 index 0000000..57f8844 --- /dev/null +++ b/10 Notes/Deluge.md @@ -0,0 +1,135 @@ +--- +tags: + - bittorrent +link: https://deluge-torrent.org +source: https://git.deluge-torrent.org/deluge +related-to: + - "[[Learn BitTorrent]]" + - "[[libtorrent-rasterbar]]" +--- +#bittorrent + +| link | https://deluge-torrent.org | +| ---------- | ------------------------------------- | +| source | https://git.deluge-torrent.org/deluge | +| related-to | [[Learn BitTorrent]] | + +Deluge is a [[Learn BitTorrent|BitTorrent]] client. + +### Building + +I largely follow the instructions from [Setup tutorial for Deluge development](https://deluge.readthedocs.io/en/latest/devguide/tutorials/01-setup.html) with small changes. 
+ +First you may need to install: + +```bash +sudo apt install intltool closure-compiler +``` + +> In my case they were already on my system, probably installed with other deps. + +My changes come from the fact that I have installed `libtorrent` from sources. See [[libtorrent-rasterbar]] for the instructions on how to build libtorrent from sources. Then in [[libtorrent Python bindings]] I describe, as a follow-up, how to create the Python bindings for libtorrent. There I create a Python virtual environment but leave its location somewhat open. Now, to follow the instructions here, please make sure that the `.venv` folder is at the top level of your `deluge` directory. In principle, this directory can be put anywhere - and most probably it is better to put it outside of the `deluge` repo: here I follow the practice given in the above-mentioned [Setup tutorial for Deluge development](https://deluge.readthedocs.io/en/latest/devguide/tutorials/01-setup.html). + +Install required dependencies: + +```bash +sudo apt install python3-geoip python3-dbus python3-gi python3-gi-cairo gir1.2-gtk-3.0 gir1.2-ayatanaappindicator3-0.1 python3-pygame libnotify4 librsvg2-common xdg-utils +``` + +> Here, because we have built [[libtorrent-rasterbar|libtorrent]] from source, make sure you omit `python3-libtorrent` from the original instructions. + +Then run: + +```bash +$ source .venv/bin/activate +(deluge) $ uv pip install -e . 
+Resolved 22 packages in 1.15s + Built deluge @ file:///home/codex/code/deluge + Built rencode==1.0.6 +Prepared 21 packages in 5.05s +Installed 21 packages in 9ms + + attrs==24.2.0 + + automat==24.8.1 + + cffi==1.17.1 + + constantly==23.10.4 + + cryptography==43.0.3 + + deluge==2.1.1.dev127 (from file:///home/codex/code/deluge) + + hyperlink==21.0.0 + + idna==3.10 + + incremental==24.7.2 + + mako==1.3.6 + + markupsafe==3.0.2 + + pyasn1==0.6.1 + + pyasn1-modules==0.4.1 + + pycparser==2.22 + + pyopenssl==24.2.1 + + pyxdg==0.28 + + rencode==1.0.6 + + service-identity==24.2.0 + + twisted==24.10.0 + + typing-extensions==4.12.2 + + zope-interface==7.1.1 +``` + +Finally in `.venv/pyvenv.cfg` set: + +```bash +include-system-site-packages = true +``` + +Check if it runs: + +```bash +deluge-gtk +``` + +And this is what it looks like: + +![[Pasted image 20241107082113.png]] + +#### Preferences + +- Network: + ![[Pasted image 20241107095618.png]] + +- Plugins - something specific to Deluge, maybe it is a way to create a *Codex Plugin*. + This looks like a rich ecosystem: https://deluge-torrent.org/plugins/ + + ![[Pasted image 20241107095922.png]] + + +Optionally, you may prefer to leave `include-system-site-packages` set to `false`. +Following [these instructions](https://pygobject.gnome.org/getting_started.html#ubuntu-logo-ubuntu-debian-logo-debian): + +```bash +$ sudo apt install libgirepository1.0-dev gcc libcairo2-dev pkg-config python3-dev gir1.2-gtk-4.0 +$ uv pip install pycairo +$ uv pip install PyGObject +``` + +We should then be able to uninstall: + +```bash +$ sudo apt remove python3-gi python3-gi-cairo gir1.2-gtk-4.0 +``` + +Transmission does support version 2 torrent files but not version 2 magnet links (hybrid magnets are supported). +### Other notes + +At [https://deluge-torrent.org/about/](https://deluge-torrent.org/about/) we can read that Deluge is able to run on headless machines with the user-interfaces being able to connect remotely from any platform. 
+ +From @Giuliano Mega: + +I'm also finding that the python bindings are incomplete and expose only a subset of the API btw one way to approach the integration would be by going top down on how Deluge uses libtorrent and then looking at the minimum needed to get it running. The Deluge core is actually a lot simpler than I expected I think this may be less effort than trying to build the entire API from scratch on top of Codex e.g. if we can get enough to run the Deluge daemon on top of Codex, then all the rest (GTK UI, Web UI) sort of works… Other links: + +- simple experiment setup: [https://github.com/gmega/bt-experiment](https://github.com/gmega/bt-experiment) +- Deluge fork with instrumentation for metrics: [https://github.com/gmega/deluge](https://github.com/gmega/deluge) +- some notes on libtorrent: [https://hackmd.io/NhVe1A5HT92NALDufacuiA](https://hackmd.io/NhVe1A5HT92NALDufacuiA) +- how to setup a dev env with Deluge + libtorrent: [https://hackmd.io/ESDTgprbSPmViMxc5yKTiQ](https://hackmd.io/ESDTgprbSPmViMxc5yKTiQ) + +Related: + +- [https://github.com/codex-storage/nim-codex/issues/959](https://github.com/codex-storage/nim-codex/issues/959) - **Codex/BitTorrent integration** +- [https://github.com/codex-storage/nim-codex/issues/951](https://github.com/codex-storage/nim-codex/issues/951) - **Control BitTorrent** +- [Libtorrent and Deluge from sources](https://hackmd.io/ESDTgprbSPmViMxc5yKTiQ) +- [controlling Deluge using its RPC interface](https://github.com/gmega/bt-experiment/blob/c6af36b349f0211df69781233d387de229d68f62/experiment.py#L91) \ No newline at end of file diff --git a/10 Notes/How BitTorrent-Codex integration may look like?.md b/10 Notes/How BitTorrent-Codex integration may look like?.md index dbd6223..491738e 100644 --- a/10 Notes/How BitTorrent-Codex integration may look like?.md +++ b/10 Notes/How BitTorrent-Codex integration may look like?.md @@ -1 +1,86 @@ -From [[Learn BitTorrent]] we got a decent overview how BitTorrent protocol looks 
like. \ No newline at end of file +--- +tags: + - bittorrent +--- +From [[Learn BitTorrent]] we got a decent overview how BitTorrent protocol looks like. + +It turns out there are many similarities between BitTorrent and [Codex](https://docs.codex.storage/learn/whitepaper) at the protocol level. + +The following diagram makes the similarity and some differences easier to observe and should be a good introduction to the further discussion: + +![[Codex_BitTorrent.svg]] + +> If you are browsing this documentation on the web, it might be more convenient to see an online version where you can freely pan and zoom the content: https://link.excalidraw.com/readonly/OOttLqCGKn5smHduI3b9. +> If you are already using obsidian, you can just access the high-res original vector graphic on your file system. + +## Discussion + +Let's briefly look at the similarities and differences between BitTorrent and Codex. + +### Content + +BitTorrent clearly focuses on (relatively) small or moderate size content. It also allows to *publish* whole directory tree. Publishing directories remains a relevant feature since it has been subject to deep changes in the version 2 of the protocol. +### Blocks + +Both Codex and BitTorrent operate on fixed size blocks on the physical layers. For BitTorrent, the block size is `16KiB` and is fixed. For Codex it is `64KiB` (and is tightly coupled with the storage proofs). Although in the end, what's get exchanged are `16KiB` blocks, `piece`, `request`, `bitfield` and `have` messages operate on so called pieces. The `piece length` is a key in the `info` dictionary and it must be a power of two and at least `16KiB`. + +### Manifest + +Both protocols use a *manifest* containing metadata regarding the content. + +### DHT + +Both Codex and BitTorrent use DHT to locate the content. They do it differently though. +In Codex, the manifest file and the dataset itself get separate CIDs (Content IDentifiers) that DHT maps to the SPR records of the announcing peers. 
To discover peers that store (part of) the dataset, a node first asks the DHT for the SPR of the peer having the metadata file. In the metadata file, the node finds the CID of the dataset itself (basically a multihash of the corresponding merkle root), and subsequently queries the DHT for the SPR of the corresponding node. After that the exchange protocol starts. + +In BitTorrent, the `infohash` - a hash of the `info` dictionary from the manifest (or `.torrent`) file - is used to directly query the DHT for the addresses of the corresponding peers. Here, we are particularly interested in the *tracker-less* configuration. In this case, the input to find the content to be downloaded (a file or a directory tree) is the `infohash`, which can be delivered to the node out-of-band or, commonly, via a so-called *magnet link*: + +``` +magnet:?xt=urn:btmh: +``` + +Using the `infohash`, the node discovers the relevant peers and then uses the protocol extension [BEP-9: Extension for Peers to Send Metadata Files](https://www.bittorrent.org/beps/bep_0009.html) to fetch the metadata from them. **Supporting protocol extensions may be one of the more important BitTorrent aspects that we may need to address in Codex if we want to remain inter-operable with other clients supporting them.** + +### Resuming downloads + +In the [libtorrent tutorial](https://www.libtorrent.org/tutorial-ref.html) we can read the following: + +> Since bittorrent downloads pieces of files in random order, it's not trivial to resume a partial download. When resuming a download, the bittorrent engine must restore the state of the downloading torrent, specifically which parts of the file(s) are downloaded. There are two approaches to doing this: +> +> 1. read every piece of the downloaded files from disk and compare it against its expected hash. +> 2. save, to disk, the state of which pieces (and partial pieces) are downloaded, and load it back in again when resuming. 
+> +> If no resume data is provided with a torrent that's added, libtorrent will employ (1) by default. + +**Thus, BitTorrent does not have any protocol level for resuming downloads.** + +[[libtorrent-rasterbar|libtorrent]] uses so called *sessions* to manage torrent downloads. Together with [torrent_handles](https://www.libtorrent.org/reference-Torrent_Handle.html#torrent_handle) it provides the means to serialise the relevant state data. + +### Integrating Codex and BitTorrent + +Despite the conceptual similarities between Codex and BitTorrent, trying to perform some surgical cuts in the Codex protocol to make it able to talk to other BitTorrent clients does not seem to be the best way. The most intuitive approach seems to be providing a library with a similar interface as [[libtorrent-rasterbar|libtorrent]] (at least for the clients that depend on it), yet using Codex under the hood. Doing this alone, however, will allow the involved clients to interoperate only if all of them selected "Codex Durability Engine" in the settings instead of using the default BitTorrent protocol. What if a client using the default BitTorrent protocol wants to join a swarm with nodes using Codex protocol? + +![[Code-BitTorrent interoperability.svg]] + + +The most logical (so it seems) approach will be to expect that the user will either: + +1. use two separate clients: one for regular BitTorrent downloads, and one for the clients using Codex under the hood. E.g. the users can use regular BitTorrent clients to do best-effort download of the content they are interested in, and then use Codex for more reliable, durable storage options. If we can, with Codex, achieve better performance and stability than BitTorrent, we can gradually see users transitioning to a more reliable protocol. +2. use one client with integrated Codex as an optional setting/plugin. 
This way, the regular BitTorrent swarms will be handled with the standard BitTorrent protocol while, if enabled, we will at the same time be seeding and downloading content from the clients that have the Codex protocol available. By further extending the client with Marketplace support, we can not only provide durability, but also handle other scenarios like backups or high-volume storage. + +Option (2) seems to be more pragmatic, and moreover it makes us more independent from the details of the [[libtorrent-rasterbar|libtorrent]] API (or any other BitTorrent library out there). The following drawing shows how we could bridge the two communities: + +![[Code-BitTorrent interoperability-3.svg]] + +The nodes in the middle take care of making the content discoverable on both protocols, while the nodes on the edges operate using their native protocols. + +Thus, two DHTs. This duplication allows us to keep the original protocols mostly intact: we still have to look at the things that BitTorrent *has* and Codex *doesn't*, and at which other BitTorrent protocol extensions may have a serious impact on the relative performance. + +Can we use one DHT? No, unless we want to switch the whole exchange protocol to that of BitTorrent - and we should have the ambition to perform better than BitTorrent. + +But imagine we want to "hot-swap" the BitTorrent protocol with Codex in such a way that the original client (UI/CLI) remains unchanged, supporting all the original BitTorrent options. I fail to find a clear advantage of such an approach, especially since it does not seem to improve interoperability: original BitTorrent clients will still not be able to talk to the *Codexified* BitTorrent clients - their exchange protocols and their DHTs will be different. + +What does make sense, though, is to take [[libtorrent-rasterbar]] and the BitTorrent protocol extensions as a compass to understand what we need to add/change in the Codex API to support the functionalities that BitTorrent users expect. 
Here, we will have two groups: + +1. Things that BitTorrent clients provide and that are not available in Codex - e.g. resuming downloads, session management, or bandwidth utilisation, and - let's call them this way - *integrations* that allow users to observe and take actions based on internal protocol events. Some of those things we have already described above; the rest we will cover in the section [[What BitTorrent has that Codex does not]]. +2. Low-level protocol options reaching down to the peer-exchange level. We could treat this part as an additional study in order to search for potential improvement points. Some if not all of this work was done when designing the Codex protocol, yet some aspects may have been missed. This part in turn will be covered in [[APIs and protocol optimisations]]. \ No newline at end of file diff --git a/10 Notes/How big the effort would be?.md b/10 Notes/How big the effort would be?.md new file mode 100644 index 0000000..f0216e9 --- /dev/null +++ b/10 Notes/How big the effort would be?.md @@ -0,0 +1,6 @@ +We can only start answering it after we: + +1. Revise the Codex peer-exchange protocol and see which BitTorrent extensions we want to adopt in Codex. +2. Settle on which Open Source client we want to use. + +Once we have this, we can start to incrementally realise the Codex/BitTorrent integration. E.g. we can start by integrating Codex as it is and then incrementally add the extensions we want to support. This approach would allow us to bootstrap the session management. diff --git a/10 Notes/Is it all worth it?.md b/10 Notes/Is it all worth it?.md new file mode 100644 index 0000000..cfb82af --- /dev/null +++ b/10 Notes/Is it all worth it?.md @@ -0,0 +1,8 @@ +Some remarks/questions to consider. + +1. BitTorrent is old technology. Open Source BitTorrent clients also feel a bit archaic (old Boost code). +2. Is the community behind BitTorrent the community we want? 
Are the current BitTorrent users interested in anything other than dirty stuff? Does BitTorrent have any serious use besides streaming pirated content? Are the current BitTorrent users eager to pay? If we cannot draw constructive conclusions from those questions, then it does not really matter how many users are currently behind BitTorrent. This is even more important for the Open Source clients, which cover only a portion of the traffic. +3. BitTorrent has a somewhat dark history, including past occurrences where malware or mining software ended up on users' computers. The current ad-based marketing model also gives the impression that the community may be in worse condition than the usage numbers suggest. +4. What is the future of BitTorrent? Is the technology in decay, or do we want to believe it is actually coming back to life (I was not really able to answer that question myself)? + +In the end we also need to ask what the benefits are for the BitTorrent user when enabling Codex/Marketplace integration. This is mostly the durability of the data and the opportunity for storage providers to monetise, which contrasts with BitTorrent's best-effort usage. diff --git a/10 Notes/Learn BitTorrent.md b/10 Notes/Learn BitTorrent.md index d54c427..c4410dc 100644 --- a/10 Notes/Learn BitTorrent.md +++ b/10 Notes/Learn BitTorrent.md @@ -19,13 +19,20 @@ In order to imagine, what do we mean by *BitTorrent - Codex Integration*, we nee BitTorrent spec is build incrementally from so called [BitTorrent Enhancement Proposals (BEPs)](http://bittorrent.org/beps/bep_0000.html). Each BEP adds something to the BitTorrent Protocol. 
The most important BEPs to study in order to get a good initial grip on the BitTorrent protocol are: -- [BEP3 - The BitTorrent Protocol Specification](http://bittorrent.org/beps/bep_0003.html) +- [[BEP3 - The BitTorrent Protocol Specification]] +- [[BEP52 - The BitTorrent Protocol Specification v2]] - [[BEP5 - DHT Protocol]] -- [BEP9 - Extension for Peers to Send Metadata Files](http://bittorrent.org/beps/bep_0009.html) -- [BEP10 -Extension Protocol](http://bittorrent.org/beps/bep_0010.html), see also [extension protocol for bittorrent](https://www.rasterbar.com/products/libtorrent/extension_protocol.html) -- [BEP11 - Peer Exchange (PEX)](http://bittorrent.org/beps/bep_0011.html) -- [BEP23 - Tracker Returns Compact Peer Lists](http://bittorrent.org/beps/bep_0023.html) -- [BEP29 - uTorrent transport protocol](http://bittorrent.org/beps/bep_0029.html) +- [BEP9 - Extension for Peers to Send Metadata Files](https://bittorrent.org/beps/bep_0009.html) +- [BEP10 -Extension Protocol](https://bittorrent.org/beps/bep_0010.html), see also [extension protocol for BitTorrent](https://www.rasterbar.com/products/libtorrent/extension_protocol.html) +- [BEP11 - Peer Exchange (PEX)](https://bittorrent.org/beps/bep_0011.html) +- [BEP23 - Tracker Returns Compact Peer Lists](https://bittorrent.org/beps/bep_0023.html) +- [BEP29 - uTorrent transport protocol](https://bittorrent.org/beps/bep_0029.html) + +### libtorrent + +[[libtorrent-rasterbar|libtorrent]] is also an excellent source of information about the protocol and it will be the main source of learning the APIs and potential integration points. + +> The further you go the more you realise that libtorrent **is** probably the best source of information about the BitTorrent protocol. Perhaps one can even say that most of the protocol developments in the are of BitTorrent happens in libtorrent. Simply speaking, libtorrent is BitTorrent. ### Papers @@ -33,7 +40,6 @@ Selection of some more important BitTorrent papers: 1. 
[[Incentives Build Robustness in BitTorrent]] - original "BitTorrent" paper by [[Bram Cohen]]. 2. [[The Bittorrent P2P File-Sharing System - Measurements And Analysis]] - Paper from the creators of the [[Tribler]] protocol. - ### Books Not so many decent books about the BitTorrent protocol. You can find some *publications* from Springer and occasionally IEEE that focus on some aspects of BitTorrent or its performance. @@ -43,7 +49,6 @@ Below some recommendations: 1. [[The World of Peer-to-Peer (P2P)]]. Community book, free. 2. [[BitTorrent chapter from book P2P and Grids to Services on the Web]]. - ### BitTorrent Token Not sure how to categorise this, especially it is just in the area of the BitTorrent ambitions. Yet, clearly, they want to be on the same market as we are: [Whitepaper](https://www.bittorrent.com/btt/btt-docs/BitTorrent_(BTT)_White_Paper_v0.8.7_Feb_2019.pdf). diff --git a/10 Notes/Modern Python Tools.md b/10 Notes/Modern Python Tools.md new file mode 100644 index 0000000..59fb37f --- /dev/null +++ b/10 Notes/Modern Python Tools.md @@ -0,0 +1,18 @@ +--- +tags: + - python/tools + - python/virtualenv +related-to: + - "[[Deluge]]" +--- +#python/tools #python/virtualenv + +| related-to | [[Deluge]] | +| ---------- | ----------------------- | + +Some modern Python tooling to look at: + +- [uv](https://docs.astral.sh/uv/), +- [pipx](https://pipx.pypa.io/stable/) +- [tox - automation project](https://tox.wiki/en/4.23.0/index.html) + diff --git a/10 Notes/Protocol v1 clarifications.md b/10 Notes/Protocol v1 clarifications.md new file mode 100644 index 0000000..e9787e2 --- /dev/null +++ b/10 Notes/Protocol v1 clarifications.md @@ -0,0 +1,16 @@ +--- +tags: + - bittorrent +related-to: + - "[[Learn BitTorrent]]" + - "[[libtorrent-rasterbar]]" +--- +#bittorrent + +| related-to | [[Learn BitTorrent]], [[libtorrent-rasterbar]], [[BitTorrent DHT clarifications]] | +| ---------- | --------------------------------------------------------------------------------- | + 
+Here I include some clarifying notes. I found the BitTorrent documents’ language quite far from being formal and it looks like one can only learn the detail by looking into some “reference” implementation (currently that reference is most probably [[libtorrent-rasterbar]]). + +Before doing that, I used ChatGPT to see how much clarification we can get with its help. It looks that ChatGPT has pretty good “understanding” of BitTorrent and one can find its explanations way more formal and precise than the original spec (or it is just more intelligent than I am…). Thus, in this section I include some conversations with ChatGPT including my questions (which I want to believe also capture some interesting points). I am providing a similar transcript for the [[Protocol v2 clarifications]], and for [[BitTorrent DHT clarifications]]. + diff --git a/10 Notes/Transmission.md b/10 Notes/Transmission.md index e6670c9..049610d 100644 --- a/10 Notes/Transmission.md +++ b/10 Notes/Transmission.md @@ -8,12 +8,12 @@ related-to: #bittorrent -| link | https://transmissionbt.com | -| ---------- | ------------------------------------------- | +| link | https://transmissionbt.com | +| ---------- | -------------------------------------------- | | source | https://github.com/transmission/transmission | -| related-to | | +| related-to | | -[[Learn BitTorrent]] client. One of the three popular clients, the other are [[Deluge (BitTorrent)]] and [[qBittorrent]], with the most attractive looking website. Comparing to the other two, Transmission does not depend on [[libtorrent-rasterbar]], but rather has its own implementation of the BitTorrent protocol. +[[Learn BitTorrent]] client. One of the three popular clients, the other are [[Deluge]] and [[qBittorrent]], with the most attractive looking website. Comparing to the other two, Transmission does not depend on [[libtorrent-rasterbar]], but rather has its own implementation of the BitTorrent protocol. 
> As noted in GitHub’s REDME, [Transmission's documentation](https://github.com/transmission/transmission/blob/main/docs/README.md) is currently out-of-date, but the team has recently begun a new project to update it and is looking for volunteers. If you're interested, please feel free to submit pull requests! @@ -23,11 +23,26 @@ The latest release is from 30 May 2024. ### How does it look like? -TBD... +![[Pasted image 20241107085814.png]] +or in light mode: + +![[Pasted image 20241107091409.png]] +![[Pasted image 20241107092811.png]] + +Transmission does not seem to support version 2 torrent files and magnet links. + +#### Settings + +- Network + ![[Pasted image 20241107094119.png]] +- Peers + ![[Pasted image 20241107094219.png]] +Transmission has the most limited settings that are available to the user. +There are also GTK and QT versions, which on a Mac look just terrible... (I did not yet check how they build and look on ubuntu). ### Building -Some loose notes for now... TBD.. +Some loose notes for now... TBF... ```bash CMake Error in gtk/CMakeLists.txt: diff --git a/10 Notes/What BitTorrent has that Codex does not.md b/10 Notes/What BitTorrent has that Codex does not.md new file mode 100644 index 0000000..26f09e9 --- /dev/null +++ b/10 Notes/What BitTorrent has that Codex does not.md @@ -0,0 +1,63 @@ +--- +tags: + - bittorrent +--- +#bittorrent + +### BitTorrent protocol extensions + +If we look at the settings in all three clients, we will easily find something all three BitTorrent client I investigated has some must-haves: + +![[protocol-extensions.png]] + +All have the following options (all can be enabled or disabled): + +- DHT support, +- peer-exchange or PEX, +- Local Peer Discovery +- Encryption + +Additionally, Deluge lists here: + +- NAT PMP (Port Mapping Protocol), +- UPnP (Universal Plug and Play), + +### DHT + +In our case DHT needs to be always on unless we want to support centralised trackers. 
We have already discussed some differences in the DHT usage in [[How BitTorrent-Codex integration may look like?]]. + +### Peer-Exchange or PEX + +Peer Exchange (PEX) provides an alternative peer discovery mechanism for swarms once peers have bootstrapped via other mechanisms such as DHT or Tracker announces. + +It provides a more up-to-date view of the swarm than most other sources and also reduces the need to query other sources frequently. + +We do not seem to have anything like this in Codex... We just use [BitSwap](https://specs.ipfs.tech/bitswap-protocol/). + +See [BEP11 - Peer Exchange (PEX)](https://www.bittorrent.org/beps/bep_0011.html). + +### Local Service Discovery + +Another addition to the BitTorrent protocol that Codex does not seem to have. Local Service Discovery (LSD), or Local Peer Discovery as it is sometimes called, provides an SSDP-like (HTTP over UDP multicast) mechanism to announce the presence in specific swarms to local neighbours. See [BEP14 - Local Service Discovery](https://www.bittorrent.org/beps/bep_0014.html). + +In a way, it is related to NAT PMP. + +### NAT PMP (Port Mapping Protocol) + +NAT Port Mapping Protocol (NAT-PMP) is a network protocol for establishing network address translation (NAT) settings and port forwarding configurations automatically without user effort. + +I do not believe we support that in our clients. Do we? + +### uTorrent transport protocol (uTP) + +What is not shown in the picture above, and what is also supported by all BitTorrent clients, is [BEP29 - uTorrent transport protocol](https://www.bittorrent.org/beps/bep_0029.html). + +It is a transport protocol with delay-based congestion control. See the separate [article](http://www.libtorrent.org/utp.html). The motivation for uTP is for BitTorrent clients to not disrupt internet connections, while still fully utilizing the unused bandwidth. + +We do not have this in Codex as far as I know. 
+ +### Superseeding + +Superseeding ([BEP16](https://www.bittorrent.org/beps/bep_0016.html)) is a feature designed to help a torrent initiator with limited bandwidth to "pump up" a large torrent, reducing the amount of data it needs to upload in order to spawn new seeds in the torrent. + +Simply speaking, the node has the ability to "focus" on the content it really wants to get out, without losing bandwidth to multiple smaller seeds. *Super-seed mode is **NOT** recommended for general use.* \ No newline at end of file diff --git a/10 Notes/What are existing (open source) BitTorrent clients?.md b/10 Notes/What are existing (open source) BitTorrent clients?.md new file mode 100644 index 0000000..4c665c2 --- /dev/null +++ b/10 Notes/What are existing (open source) BitTorrent clients?.md @@ -0,0 +1,17 @@ +--- +tags: + - bittorrent +--- +#bittorrent + +On Wikipedia we can get a decent overview of [[Learn BitTorrent|BitTorrent]] clients: https://en.wikipedia.org/wiki/Comparison_of_BitTorrent_clients. + +> It should be noted here that Wikipedia sources are often quite outdated. + +Three BitTorrent clients stand out based on their popularity and feature set: + +- [[qBittorrent]] +- [[Transmission]] +- [[Deluge]] + +We briefly summarise them all in [[Comparison of BitTorrent clients]]. \ No newline at end of file diff --git a/10 Notes/libtorrent Python bindings.md b/10 Notes/libtorrent Python bindings.md new file mode 100644 index 0000000..4197bda --- /dev/null +++ b/10 Notes/libtorrent Python bindings.md @@ -0,0 +1,61 @@ + +> If you are here because you are trying to build the Deluge client, before proceeding, please check the build instructions in [[Deluge]] regarding the location of the python virtual environment folder (`.venv` in this case). 
+ +To build a debug version with python bindings, first (re)build and (re)install the debug target: + +```bash +b2 crypto=openssl cxxstd=14 debug +sudo b2 install --prefix=/usr/local +``` + +Then check [[Modern Python Tools|Python Tools]], and install `uv`: + +```bash +curl -LsSf https://astral.sh/uv/install.sh | sh +``` + +Then create a virtual env indicating the python version you want to use: + +```bash +uv venv --python 3.12.3 +``` + +It will create a folder called `.venv` in your current folder. + +Activate the environment: + +```bash +source .venv/bin/activate +``` + +> Make sure you have the correct path to the `.venv` folder, If you are not in the folder directly containing `.venv` folder. + +Then, install `setuptools`: + +```bash +uv pip install setuptools +``` + +Also make sure that you have: + +```bash +sudo apt install libboost-python-dev +``` + +as specified in [libtorrent python binding](http://www.libtorrent.org/python_binding.html) in *prerequisites*. + +Finally, from `bindings/python` run: + +```bash +python setup.py build_ext --b2-args=variant=debug install +``` + +Python bindings should be ready to use. You should be able to test it with: + +```bash +python +Python 3.12.3 (main, Sep 11 2024, 14:17:37) [GCC 13.2.0] on linux +Type "help", "copyright", "credits" or "license" for more information. 
+>>> import libtorrent +>>> +``` diff --git a/10 Notes/libtorrent-rasterbar.md b/10 Notes/libtorrent-rasterbar.md index 996ac2e..123d189 100644 --- a/10 Notes/libtorrent-rasterbar.md +++ b/10 Notes/libtorrent-rasterbar.md @@ -4,7 +4,7 @@ tags: link: http://www.libtorrent.org/ source: https://github.com/arvidn/libtorrent related-to: - - "[[Deluge (BitTorrent)]]" + - "[[Deluge]]" - "[[qBittorrent]]" --- #bittorrent @@ -12,9 +12,25 @@ related-to: | link | http://www.libtorrent.org/ | | ---------- | ---------------------------------------- | | source | https://github.com/arvidn/libtorrent | -| related-to | [[Deluge (BitTorrent)]], [[qBittorrent]] | +| related-to | [[Deluge]], [[qBittorrent]] | -Open Source BitTorrent library. +Open Source BitTorrent library. The place to go to learn about the internals of the BitTorrent protocol, to find clarifications and answers to questions you cannot find elsewhere. + +### Selected Docs + +- [BitTorrent v2](https://blog.libtorrent.org/2020/09/bittorrent-v2/) - after you know the foundation (e.g. [[BEP3 - The BitTorrent Protocol Specification]], [[BEP52 - The BitTorrent Protocol Specification v2]], [[Learn BitTorrent]]), this document, already 3 years old, is still probably the best overview of the changes introduced in the BitTorrent protocol version 2. +- [libtorrent introduction](https://libtorrent.org/features-ref.html) and some follow-up links from here: + - [DHT Extensions](https://www.libtorrent.org/dht_extensions.html) and [BEP5](https://www.bittorrent.org/beps/bep_0005.html) + - [Extension Protocol](https://libtorrent.org/extension_protocol.html), [libtorrent overview (aka manual)](https://libtorrent.org/manual-ref.html), [BEP10 - Extension Protocol](https://www.bittorrent.org/beps/bep_0010.html) + - uTorrent metadata transfer protocol [BEP 9](https://www.bittorrent.org/beps/bep_0009.html) (i.e. magnet links). + - uTP implementation ([BEP 29](https://www.bittorrent.org/beps/bep_0029.html)). 
See separate [article](https://libtorrent.org/utp.html). +- [libtorrent tutorial](https://libtorrent.org/tutorial-ref.html) - a good place to get some intuition on how the library is used. +- [examples](https://libtorrent.org/examples.html) - good as a follow-up to the intro. + +### Other interesting docs + +- [Question about Bittorrent V2 File Hashes #7604](https://github.com/arvidn/libtorrent/discussions/7604) - a good clarification about blocks, padding, and "piece layers". +- From [BitTorrent v2](https://blog.libtorrent.org/2020/09/bittorrent-v2/) mentioned above, we can see how impactful libtorrent is for the BitTorrent protocol version 2: [# Draft: base protocol with merkle trees and new hash algorithms](https://github.com/bittorrent/bittorrent.org/pull/59) Languages: @@ -23,9 +39,22 @@ Languages: ### Building I have followed the instructions from [https://www.libtorrent.org/building.html](https://www.libtorrent.org/building.html). -I've followed sections *building from git*, *building with boost build*, and then immediately *Step 4: Installing libtorrent*. Documentation not really clear - hard to see the logical order. +Relevant sections: +- *building from git*, +- *building with boost build*, and then immediately: +- *Step 4: Installing libtorrent*. -Building steps: +The documentation is not really clear - it is hard to see the logical order. + +### Summary of the building steps + +Before anything, install the prerequisites: + +```bash +sudo apt install build-essential cmake git ninja-build pkg-config libboost-dev libssl-dev zlib1g-dev libgl1-mesa-dev +``` + +Then follow with: ```bash git clone --recurse-submodules https://github.com/arvidn/libtorrent.git @@ -34,3 +63,5 @@ echo "using gcc ;" >>~/user-config.jam b2 crypto=openssl cxxstd=14 release sudo b2 install --prefix=/usr/local ``` + +See also [[libtorrent Python bindings]]. 
\ No newline at end of file diff --git a/10 Notes/ping.md b/10 Notes/ping.md new file mode 100644 index 0000000..1999089 --- /dev/null +++ b/10 Notes/ping.md @@ -0,0 +1,38 @@ +--- +tags: + - bittorrent + - dht +related-to: + - "[[BEP5 - DHT Protocol]]" +--- +#bittorrent #dht + +| related-to | [[BEP5 - DHT Protocol]] | +| ---------- | ----------------------- | + +`ping` query is used to check if another node (one from its DHT routing table) is online and reachable. + +### ChatGPT clarifications + +### Question 1 + +In a DHT `ping` query, there is only one argument, the `id` of the “sender”. How the node knows where to send the ping to? Does it send it to all nodes in its routing table? +#### Answer + +In a DHT ping query, the purpose is for one node to check if another node is still online and reachable. The ping query has only one argument, the id of the sender, which identifies the node that initiated the ping. + +**How the Ping Target is Chosen** + +The ping query is not sent to all nodes in the routing table. Instead, it is typically sent to: + +- **A Specific Node**: The node chooses one specific target node from its routing table, usually one it hasn’t heard from recently or one that it wants to verify as active. + +- **Nodes to Maintain the Routing Table**: Nodes also use pings to verify or refresh nodes in their routing table, ensuring that inactive nodes are removed and replaced by active ones. + +**When a Ping is Sent** + +1. **Maintaining the Routing Table**: Periodically, nodes check entries in their routing table by pinging other nodes, especially if they haven’t been active recently. This ensures that stale entries are updated or removed. + +2. **As Part of Lookup or Join Procedures**: During node lookups or network joins, nodes may use pings to verify the status of other nodes before including them in the results or adding them to their own routing table. + +In summary, a ping is directed to specific nodes rather than all nodes in the routing table. 
This selective pinging keeps the routing table accurate without creating unnecessary network traffic. \ No newline at end of file diff --git a/10 Notes/qBittorrent.md b/10 Notes/qBittorrent.md index 6ccee3a..ce2d87b 100644 --- a/10 Notes/qBittorrent.md +++ b/10 Notes/qBittorrent.md @@ -26,6 +26,8 @@ Note about macOS support: I am not sure if that information is completely up-to-date. I did not test building the client on macOS, yet, there seem to be instructions available: [Compilation macOS (x86_64, arm64, cross compilation)](https://github.com/qbittorrent/qBittorrent/wiki/Compilation-macOS-(x86_64,-arm64,-cross-compilation)). +This is the most advanced and comprehensive open source client and has similar user coverage as [[Transmission]]. + Languages: ![[Pasted image 20241023155058.png]] @@ -34,6 +36,23 @@ Languages: ![[qBittorrent-ubuntu.png]] +#### Preferences + +- Connection + + ![[Pasted image 20241107093730.png]] + +- BitTorrent + + ![[Pasted image 20241107093241.png]] + +- Advanced + + ![[Pasted image 20241107093434.png]] + + See also [settings_pack](https://www.libtorrent.org/reference-Settings.html) and [Explanation of Options in qBittorrent](https://github.com/qbittorrent/qBittorrent/wiki/Explanation-of-Options-in-qBittorrent#Advanced). + + ### Building Machine: [[Linux Machine]]. @@ -49,10 +68,11 @@ sudo apt install --no-install-recommends qtbase5-dev qttools5-dev libqt5svg5-dev Unfortunately, this did not work. 
When subsequently running: ```bash -cmake -G "Ninja" -B build -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_INSTALL_PREFIX=/usr/local +$ cmake -G "Ninja" -B build -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_INSTALL_PREFIX=/usr/local +$ cmake --build build ``` -I was getting the following error: +I was getting the following error already with the first command above: ```bash CMake Error at cmake/Modules/CheckPackages.cmake:49 (find_package): @@ -88,4 +108,8 @@ set(Qt6_DIR "~/Qt/6.8.0/gcc_64/lib/cmake/Qt6/") set(Qt6GuiTools_DIR "~/Qt/6.8.0/gcc_64/lib/cmake/Qt6GuiTools/") ``` -After that I was able to successfully compile the client. +After that I was able to successfully compile the client. I skipped the installation step: + +```bash +$ sudo cmake --install build +``` diff --git a/90 Extras/91 Templates/Page Template.md b/90 Extras/91 Templates/Page Template.md new file mode 100644 index 0000000..57b7726 --- /dev/null +++ b/90 Extras/91 Templates/Page Template.md @@ -0,0 +1,9 @@ +--- +tags: + - bittorrent +related-to: [] +--- +#bittorrent + +| related-to | | +| ---------- | --- | diff --git a/90 Extras/91 Templates/Q & A.md b/90 Extras/91 Templates/Q & A.md new file mode 100644 index 0000000..d945d51 --- /dev/null +++ b/90 Extras/91 Templates/Q & A.md @@ -0,0 +1,6 @@ +### Question 1 + +Question…. 
+#### Answer + +Answer… \ No newline at end of file diff --git a/90 Extras/91 Templates/Authors Table.md b/90 Extras/91 Templates/Table Template.md similarity index 75% rename from 90 Extras/91 Templates/Authors Table.md rename to 90 Extras/91 Templates/Table Template.md index 1e0be3e..8e9d7cf 100644 --- a/90 Extras/91 Templates/Authors Table.md +++ b/90 Extras/91 Templates/Table Template.md @@ -1,5 +1,5 @@ -| authors | | +| source | | | ---------- | --- | | link | | | related-to | | diff --git a/90 Extras/92 Assets/Code-BitTorrent interoperability-3.svg b/90 Extras/92 Assets/Code-BitTorrent interoperability-3.svg new file mode 100644 index 0000000..3f77cc9 --- /dev/null +++ b/90 Extras/92 Assets/Code-BitTorrent interoperability-3.svg @@ -0,0 +1,11 @@ + + + + + + + + BitTorrentBitTorrentDHTDHTCodexBitTorrentCodexBitTorrentCodexBitTorrentCodexCodexCodex WorldBitTorrent World \ No newline at end of file diff --git a/90 Extras/92 Assets/Code-BitTorrent interoperability.svg b/90 Extras/92 Assets/Code-BitTorrent interoperability.svg new file mode 100644 index 0000000..e01abf6 --- /dev/null +++ b/90 Extras/92 Assets/Code-BitTorrent interoperability.svg @@ -0,0 +1,11 @@ + + + + + + + + CodexCodexCodexBitTorrentBitTorrentDHTDHT?? 
\ No newline at end of file diff --git a/90 Extras/92 Assets/Codex_BitTorrent.svg b/90 Extras/92 Assets/Codex_BitTorrent.svg new file mode 100644 index 0000000..38dcd82 --- /dev/null +++ b/90 Extras/92 Assets/Codex_BitTorrent.svg @@ -0,0 +1,10 @@ + + + + + + + + FILE— OR —Dictionary TreeFILEBlocks(16KiB)pieces(power of 2>= 16KiB)piece layersMerkle TreeMManifest:- info - piece length - file tree- piece layersmultihash of bencodedform of the info dictionarymagnet:?xt=urn:btmh:<tagged-info-hash>&dn=<name>&tr=<tracker-url>optionalmagnet linkDHTget_peerspeersDownloading/uploadingcontent from/to peersBitTorrent v2CodexFILEBlocks(64KiB)Merkle TreeMManifest:- content size (bytes)- block size (64KiB)- Tree CIDget_peersTree CIDManifest CIDDHTSPRManifest CIDTree CIDSPRDownloading/uploadingcontent from/to peersWe could also usein magnet linksManifest CIDmagnet:?xt=urn:btmh:< >&dn=<name>&tr=<tracker-url>optionalmagnet linkManifest CIDinfohashinfohashURLCodex MarketplaceFILEBlocks(64KiB)Merkle TreeTree CIDSlotsEC SlotsStorage ProofsMerkle TreeMarketplace- Redundancy- Auditing - Repair- Incentives,- DispersalMManifest:- content size (bytes)- block size (64KiB)- Tree CIDget_peersManifest CIDDHTSPRManifest CIDTree CIDSPRDownloading/uploadingcontent from/to peersWe could also usein magnet linksManifest CIDmagnet:?xt=urn:btmh:< >&dn=<name>&tr=<tracker-url>optionalmagnet linkManifest CID \ No newline at end of file diff --git a/90 Extras/92 Assets/Pasted image 20241105084831.png b/90 Extras/92 Assets/Pasted image 20241105084831.png new file mode 100644 index 0000000..17f56d6 Binary files /dev/null and b/90 Extras/92 Assets/Pasted image 20241105084831.png differ diff --git a/90 Extras/92 Assets/Pasted image 20241107082113.png b/90 Extras/92 Assets/Pasted image 20241107082113.png new file mode 100644 index 0000000..94e833b Binary files /dev/null and b/90 Extras/92 Assets/Pasted image 20241107082113.png differ diff --git a/90 Extras/92 Assets/Pasted image 20241107085814.png b/90 
Extras/92 Assets/Pasted image 20241107085814.png new file mode 100644 index 0000000..c6c5bef Binary files /dev/null and b/90 Extras/92 Assets/Pasted image 20241107085814.png differ diff --git a/90 Extras/92 Assets/Pasted image 20241107091409.png b/90 Extras/92 Assets/Pasted image 20241107091409.png new file mode 100644 index 0000000..2e0e948 Binary files /dev/null and b/90 Extras/92 Assets/Pasted image 20241107091409.png differ diff --git a/90 Extras/92 Assets/Pasted image 20241107091551.png b/90 Extras/92 Assets/Pasted image 20241107091551.png new file mode 100644 index 0000000..2e0e948 Binary files /dev/null and b/90 Extras/92 Assets/Pasted image 20241107091551.png differ diff --git a/90 Extras/92 Assets/Pasted image 20241107092811.png b/90 Extras/92 Assets/Pasted image 20241107092811.png new file mode 100644 index 0000000..5f88645 Binary files /dev/null and b/90 Extras/92 Assets/Pasted image 20241107092811.png differ diff --git a/90 Extras/92 Assets/Pasted image 20241107093241.png b/90 Extras/92 Assets/Pasted image 20241107093241.png new file mode 100644 index 0000000..79e3425 Binary files /dev/null and b/90 Extras/92 Assets/Pasted image 20241107093241.png differ diff --git a/90 Extras/92 Assets/Pasted image 20241107093434.png b/90 Extras/92 Assets/Pasted image 20241107093434.png new file mode 100644 index 0000000..b7dcf88 Binary files /dev/null and b/90 Extras/92 Assets/Pasted image 20241107093434.png differ diff --git a/90 Extras/92 Assets/Pasted image 20241107093730.png b/90 Extras/92 Assets/Pasted image 20241107093730.png new file mode 100644 index 0000000..537f486 Binary files /dev/null and b/90 Extras/92 Assets/Pasted image 20241107093730.png differ diff --git a/90 Extras/92 Assets/Pasted image 20241107094119.png b/90 Extras/92 Assets/Pasted image 20241107094119.png new file mode 100644 index 0000000..4f0388b Binary files /dev/null and b/90 Extras/92 Assets/Pasted image 20241107094119.png differ diff --git a/90 Extras/92 Assets/Pasted image 
20241107094219.png b/90 Extras/92 Assets/Pasted image 20241107094219.png new file mode 100644 index 0000000..7d1e45f Binary files /dev/null and b/90 Extras/92 Assets/Pasted image 20241107094219.png differ diff --git a/90 Extras/92 Assets/Pasted image 20241107095618.png b/90 Extras/92 Assets/Pasted image 20241107095618.png new file mode 100644 index 0000000..97ec0bc Binary files /dev/null and b/90 Extras/92 Assets/Pasted image 20241107095618.png differ diff --git a/90 Extras/92 Assets/Pasted image 20241107095922.png b/90 Extras/92 Assets/Pasted image 20241107095922.png new file mode 100644 index 0000000..2c73766 Binary files /dev/null and b/90 Extras/92 Assets/Pasted image 20241107095922.png differ diff --git a/90 Extras/92 Assets/Pasted image 20241110195407.png b/90 Extras/92 Assets/Pasted image 20241110195407.png new file mode 100644 index 0000000..b7dcf88 Binary files /dev/null and b/90 Extras/92 Assets/Pasted image 20241110195407.png differ diff --git a/90 Extras/92 Assets/Pasted image 20241110195544.png b/90 Extras/92 Assets/Pasted image 20241110195544.png new file mode 100644 index 0000000..fe1fc33 Binary files /dev/null and b/90 Extras/92 Assets/Pasted image 20241110195544.png differ diff --git a/90 Extras/92 Assets/Pasted image 20241110200105.png b/90 Extras/92 Assets/Pasted image 20241110200105.png new file mode 100644 index 0000000..3c58970 Binary files /dev/null and b/90 Extras/92 Assets/Pasted image 20241110200105.png differ diff --git a/90 Extras/92 Assets/Pasted image 20241110200233.png b/90 Extras/92 Assets/Pasted image 20241110200233.png new file mode 100644 index 0000000..1609977 Binary files /dev/null and b/90 Extras/92 Assets/Pasted image 20241110200233.png differ diff --git a/90 Extras/92 Assets/Pasted image 20241110200313.png b/90 Extras/92 Assets/Pasted image 20241110200313.png new file mode 100644 index 0000000..70948d6 Binary files /dev/null and b/90 Extras/92 Assets/Pasted image 20241110200313.png differ diff --git a/90 Extras/92 
Assets/protocol-extensions.png b/90 Extras/92 Assets/protocol-extensions.png new file mode 100644 index 0000000..653b71d Binary files /dev/null and b/90 Extras/92 Assets/protocol-extensions.png differ diff --git a/README.md b/README.md index 4f037d1..8c1c4cc 100644 --- a/README.md +++ b/README.md @@ -14,21 +14,26 @@ source: https://github.com/marcinczenko/bittorrent-codex-docs In this repository your will find a complete documentation of our BitTorrent/Codex investigation. +## How to read this document + +The most important parts to read are the 3 points at the beginning of the [Objectives](#objectives) section below, followed by the introductory [[Is it all worth it?]]. I tried to keep that part concise yet informative. + +Along the way, you will find a number of references to more detailed, sometimes low-level, technical content. I hope this will be useful supporting information that will help us decide the next steps and assist in making more informed, strategic decisions about Codex/BitTorrent integration. + ### Objectives I was trying to answer the following questions: 1. [[How BitTorrent-Codex integration may look like? ]] 2. [[What are existing (open source) BitTorrent clients?]] -3. [[What are the challenges ahead and how big the effort would be?]] +3. [[How big the effort would be?]] Answering those three questions should give us answer to the big ultimate question: [[Is it all worth it?]] -Getting there in short time is not easy, especially without recent hands-on experience with the BitTorrent protocol. Thus in this quest, the following actions are being taken: +Getting there in a short time is not easy, especially without recent hands-on experience with the BitTorrent protocol. For this reason, in this quest, the following two actions have been taken: -1. Identity [[Learn BitTorrent|the best resources to learn BitTorrent protocol]], to get some intuition about the protocol. This is mostly to answer the question number (1) above. +1. 
Identify [[Learn BitTorrent|the best resources to learn BitTorrent protocol]], to get some intuition about the protocol. This is mostly to answer the question number (1) above. 2. [[Comparison of BitTorrent clients|Compare existing BitTorrent clients]], trying to focus on those that are open sourced. We want to learn about their market share, how do they look and feel, and how easy would it be to work with their codebase. -3. Chose one of the client, and try to cat it through and learn how easy is to do what we think we want to do. What are the obstacles, and unexpected difficulties? This is the major part of our investigation. ### Obsidian