From 08233a7ce2ce1d922c82ec178e82610c72f9feed Mon Sep 17 00:00:00 2001
From: Sergei Tikhomirov
Date: Fri, 13 Oct 2023 16:02:12 +0200
Subject: [PATCH 01/18] create incentivization-outline.md

---
 incentivization-outline.md | 230 +++++++++++++++++++++++++++++++++++++
 1 file changed, 230 insertions(+)
 create mode 100644 incentivization-outline.md

diff --git a/incentivization-outline.md b/incentivization-outline.md
new file mode 100644
index 0000000..6d583d3
--- /dev/null
+++ b/incentivization-outline.md
@@ -0,0 +1,230 @@
+Our goal is to add an incentivization scheme to Waku to make it (more) incentive compatible.
+In what follows, we abbreviate incentivization as i13n.
+
+We aim to answer the following questions:
+
+1. what is the structure of the protocols in question?
+2. what is the desired behavior of protocol participants?
+3. what deviations from the desired behavior occur without incentivization?
+4. what incentivization tools do we have?
+5. what tools are appropriate for our purpose?
+6. what parameters can we choose? what are our restrictions?
+7. suggest a concrete i13n architecture.
+8. how do we check if we've solved the problem?
+
+# Overview
+
+Waku implements a modular decentralized censorship-resistant P2P communications protocol.
+Waku consists of multiple protocols (see [architecture](https://waku.org/about/architect)).
+We focus on the main four: Relay (a P2P protocol) and three light protocols (Filter, Store, and Lightpush), which have a client-server architecture (aka request-response).
+
+A Waku node is a node that runs at least one of the Waku protocols.
+A full Waku node is a node that runs Relay.
+A light Waku node is a node that only runs the client side of one of the light protocols.
+See also: https://github.com/waku-org/research/issues/28
+
+In light protocols, a client sends a request to a server.
+A server (a Relay node) performs some actions and returns a response, in particular:
+- [[Filter]]: the server will relay (only) messages that pass a filter to the client;
+- [[Store]]: the server responds with messages broadcast within the specified time frame;
+- [[Lightpush]]: the server publishes the client's message to the Relay network.
+
+Waku aims to function on widely available hardware.
+Hardware requirements for light nodes are lower than for full nodes.
+In particular, bandwidth consumption should be limited (estimated at 10 Mbps).
+See also: https://github.com/waku-org/research/issues/31
+
+# Store protocol
+
+We first focus on i13n of the Store protocol.
+Similar techniques may be later applied to other protocols.
+
+Store is a client-server protocol.
+A client asks the node to respond with relevant messages previously relayed through the Relay protocol.
+A relevant message is a message that has been broadcast via Relay within the specified time frame.
+The response may be split into multiple parts, as specified by pagination parameters.
+
+TODO: Strictly speaking, the definition of relevant is inconsistent because there is no consensus over messages. A message may be broadcast but not received by some nodes. Does this happen often? Can and should we do something about it?
+
+## Desired behavior
+
+The desired behavior of a Store-server node is to store all non-ephemeral messages forever.
+
+TODO: address the obvious concern that storing everything forever is unsustainable. Should there be some cut-off time after which old messages are no longer stored?
+
+Let's say a client issues a request to the server.
+We want the following to happen:
+
+- the server responds quickly;
+- the response contains all relevant messages;
+- the response contains only relevant messages.
+
+TODO: is this the full definition of the desired behavior?
+
+### RLN as a proxy metric of message relevance
+
+RLN (rate limiting nullifiers) is a method of spam prevention in Relay.
+The message sender generates a proof of enrollment in some membership set.
+Multiple proofs generated within one epoch lead to punishment.
+This technique limits the message rate from each node to at most one message per epoch.
+See also: https://rfc.vac.dev/spec/17/
+
+In the i13n context, we can't prove whether a message has indeed been broadcast in the past.
+Instead, we use RLN proofs as a proxy metric.
+A valid RLN proof signifies that the message has been generated by a node with an active membership during a particular epoch.
+TODO: make sure the above is correct: what exactly does RLN prove?
+
+## Deviations from the desired behavior
+
+There are multiple ways for a node to deviate from the desired behavior.
+TODO: are we talking only about the server here, or should we also discuss the client (e.g., DoS)?
+### Slow response
+The server takes too long to respond.
+Possible reasons:
+- the server is offline accidentally;
+- the request describes too many relevant messages (the server is overwhelmed);
+- the server is malicious and deliberately delays the response;
+- the server doesn't have some of the relevant messages and tries to request them from other nodes.
+
+### Incomplete response
+A relevant message is missing from the response.
+Possible explanations:
+- the server didn't receive the message when it was broadcast;
+- the server deliberately withholds the message.
+
+Contrary to blockchains, Relay doesn't have consensus over relayed messages.
+Therefore, it's impossible to distinguish between the two scenarios above.
+TODO: given this fact, what's the best we can aim for?
+
+### Irrelevant response
+The response contains a message that is not relevant.
+There are two scenarios here depending on whether RLN proofs are enforced.
+If RLN is not enforced, a server may insert any number of irrelevant messages into the response.
+If RLN is enforced, a server can only do so as long as it has a valid membership to generate the respective proofs.
+This doesn't eliminate the attack but limits its consequences.
+
+TODO: what are the powers of a malicious server when it comes to generating proofs for irrelevant messages? Can the server generate proofs for past epochs?
+
+## Privacy considerations
+
+Light protocols, in general, have weaker privacy properties than P2P protocols.
+In a client-server exchange, a client wants to selectively interact with the network.
+By doing so, it often reveals what it is interested in (e.g., subscribes to particular topics).
+
+A malicious Store server can spy on a client in the following ways:
+- track what time frames a client is interested in;
+- analyze the timing of requests;
+- link requests done by the same client.
+
+TODO: expand in the context of an incentivized protocol.
+
+
+# Cost-benefit analysis
+The goal of i13n is to make nodes more likely to exhibit the desired behavior.
+An incentive scheme links the payoffs to whether nodes follow the protocol or not.
+Good behavior should be rewarded, bad behavior punished.
+
+An incentive scheme should balance the costs and benefits for a node.
+Rewards should compensate for the cost of good behavior.
+Punishments should offset the benefits that bad behavior may bring.
+
+Let us analyze the costs and benefits of a server that are specific to the Store protocol:
+- storage;
+- bandwidth;
+- computation.
+
+Let us assume a constant flow of messages per epoch and a constant flow of requests for older messages.
+There are two processes: storing incoming messages, and serving old messages to clients.
+
+The cost of storing incoming messages for one epoch is composed of:
+- storage:
+ - storage costs of all older messages: proportional to cumulative (message size x time stored);
+ - storage costs of newly arrived messages: proportional to message size;
+ - a constant cost for I/O operations (storing new messages);
+- bandwidth (download) for receiving new messages: proportional to the total size of incoming messages per epoch;
+- computational costs of receiving and storing new messages.
+
+(Strictly speaking, the I/O cost may not always be constant due to caching, disk fragmentation, etc.)
+
+The cost of serving messages to clients, per epoch, is composed of:
+- storage: none (it's accounted for as storing cost);
+- bandwidth
+ - upload: proportional to (number of clients) x (length of time frame requested) x (message size);
+ - download: proportional to the number of requests;
+- computational cost of handling requests.
+
+TODO: write this down mathematically.
+
+Storage is likely the dominating cost.
+Storage cost is proportional to the amount of information stored and the time it is stored for.
+A cumulative cost of storing a single message grows linearly with time.
+Assuming a constant stream of new messages, the total storage cost is quadratic in time.
+
+The number of messages in a response may be approximated by the length of the time frame requested.
+This assumes that messages are broadcast in the Relay network at a constant rate.
+
+Computation: the server spends computing cycles while handling requests.
+This cost likely depends not only on the computation itself, but also on the database structure.
+For example, retrieving old or rarely requested messages from the local database may be more expensive than fresh or popular ones due to caching.
+
+TODO: In file storage, I store a file and I pay for the ability to query it later. In Store, Alice relays a message, a server stores it, and later Bob queries it (and pays for it under an i13n scheme). Is there a mismatch between who incurs costs and who pays for it? Shall we think of ways to make Alice incur some costs too? See: https://github.com/waku-org/research/issues/32
+
+# Incentivization tools
+
+We can think of incentivization tools as a two-by-two matrix:
+- rewards vs punishment;
+- monetary vs reputation.
+
+In other words, there are four quadrants:
+- monetary reward: the client pays the server;
+- monetary punishment: the server makes a deposit in advance and gets slashed in case of misbehavior;
+- reputation reward: the server's reputation increases if it behaves well;
+- reputation punishment: the server's reputation decreases if it behaves badly.
+
+Reputation can only work if there are tangible benefits of having a high reputation and drawbacks of having a low reputation.
+For example:
+- clients are more likely to connect to servers with high reputation;
+- clients disconnect from servers with low reputation.
+Assuming there is a monetary aspect too, low-reputation servers miss out on potential revenue or lose their deposit.
+Reputation, however, assumes either a repeated interaction (i.e., local reputation), or some amount of trust / centralization (centrally managed rankings).
+
+Monetary i13n tools, in turn, pose a key question: how to ensure atomicity between performance and reward or punishment?
+In other words, if the client pays first, the server may take the money and not provide the service.
+Analogously, if the payment is due after the fact, the client can refuse to pay.
+Linking payments with behavior involves a certain amount of trust as well.
+
+This issue is somewhat linked to the problem of Lightning watchtower incentivization (see https://www.talaia.watch/).
+
+A general observation: if monetary flows are dependent on events in the past, and there is no consensus on what exactly happened in the past, the scheme can be exploited.
+TODO: can we use some on-chain component here as a semi-trusted arbiter?
+
+## Payment methods
+
+What we want from a payment method (order of priority to be discussed):
+- wide distribution (many people already have it);
+- high liquidity (i.e., easy to buy or sell at a reasonable exchange rate);
+- low latency;
+- high security.
+
+Let's list all (decentralized) payment options that we have:
+- proof-of-work: outsource-able, unavailable for consumer hardware - or is it? (Equihash etc)
+- proof-of-X (storage, etc)
+- cryptocurrency:
+ - ETH
+ - a token on Ethereum (ERC20)
+ - a token on another EVM blockchain
+ - a token on an EVM-based rollup
+ - a token on a non-EVM blockchain (BTC / Lightning?)
+
+# Related work
+
+Decentralized storage is not a new idea. What is relevant for us?
+
+1. Federated real-time messaging (IRC, mailing lists). There is no "sync" in IRC; there are simply logs of prior conversations optionally hosted wherever.
+2. Centralized file storage (FTP, later Dropbox). Requires trust in availability, but not necessarily confidentiality: content can be encrypted (modulo metadata).
+3. P2P file-sharing: Napster, BitTorrent, eDonkey. The power of defaults, local reputation.
+4. Decentralized storage in the blockchain age: Storj, Sia, Filecoin, IPFS, Codex...
+
+# Future work
+
+How to generalize i13n for Store to other Waku protocols?

From edcf90319fd638abd11b6b4bf04a913367cd42a7 Mon Sep 17 00:00:00 2001
From: Sergei Tikhomirov
Date: Mon, 16 Oct 2023 17:51:15 +0200
Subject: [PATCH 02/18] Revert "create incentivization-outline.md"

This reverts commit 08233a7ce2ce1d922c82ec178e82610c72f9feed.

---
 incentivization-outline.md | 230 -------------------------------------
 1 file changed, 230 deletions(-)
 delete mode 100644 incentivization-outline.md

diff --git a/incentivization-outline.md b/incentivization-outline.md
deleted file mode 100644
index 6d583d3..0000000
--- a/incentivization-outline.md
+++ /dev/null
@@ -1,230 +0,0 @@
-Our goal is to add an incentivization scheme to Waku to make it (more) incentive compatible.
-In what follows, we abbreviate incentivization as i13n.
-
-We aim to answer the following questions:
-
-1. what is the structure of the protocols in question?
-2. what is the desired behavior of protocol participants?
-3. what deviations from the desired behavior occur without incentivization?
-4. what incentivization tools do we have?
-5. what tools are appropriate for our purpose?
-6. what parameters can we choose? what are our restrictions?
-7. suggest a concrete i13n architecture.
-8. how do we check if we've solved the problem?
-
-# Overview
-
-Waku implements a modular decentralized censorship-resistant P2P communications protocol.
-Waku consists of multiple protocols (see [architecture](https://waku.org/about/architect)).
-We focus on the main four: Relay (a P2P protocol) and three light protocols (Filter, Store, and Lightpush), which have a client-server architecture (aka request-response).
-
-A Waku node is a node that runs at least one of the Waku protocols.
-A full Waku node is a node that runs Relay.
-A light Waku node is a node that only runs the client side of one of the light protocols.
-See also: https://github.com/waku-org/research/issues/28
-
-In light protocols, a client sends a request to a server.
-A server (a Relay node) performs some actions and returns a response, in particular:
-- [[Filter]]: the server will relay (only) messages that pass a filter to the client;
-- [[Store]]: the server responds with messages broadcast within the specified time frame;
-- [[Lightpush]]: the server publishes the client's message to the Relay network.
-
-Waku aims to function on widely available hardware.
-Hardware requirements for light nodes are lower than for full nodes.
-In particular, bandwidth consumption should be limited (estimated at 10 Mbps).
-See also: https://github.com/waku-org/research/issues/31
-
-# Store protocol
-
-We first focus on i13n of the Store protocol.
-Similar techniques may be later applied to other protocols.
-
-Store is a client-server protocol.
-A client asks the node to respond with relevant messages previously relayed through the Relay protocol.
-A relevant message is a message that has been broadcast via Relay within the specified time frame.
-The response may be split into multiple parts, as specified by pagination parameters.
-
-TODO: Strictly speaking, the definition of relevant is inconsistent because there is no consensus over messages. A message may be broadcast but not received by some nodes. Does this happen often? Can and should we do something about it?
-
-## Desired behavior
-
-The desired behavior of a Store-server node is to store all non-ephemeral messages forever.
-
-TODO: address the obvious concern that storing everything forever is unsustainable. Should there be some cut-off time after which old messages are no longer stored?
-
-Let's say a client issues a request to the server.
-We want the following to happen:
-
-- the server responds quickly;
-- the response contains all relevant messages;
-- the response contains only relevant messages.
-
-TODO: is this the full definition of the desired behavior?
-
-### RLN as a proxy metric of message relevance
-
-RLN (rate limiting nullifiers) is a method of spam prevention in Relay.
-The message sender generates a proof of enrollment in some membership set.
-Multiple proofs generated within one epoch lead to punishment.
-This technique limits the message rate from each node to at most one message per epoch.
-See also: https://rfc.vac.dev/spec/17/
-
-In the i13n context, we can't prove whether a message has indeed been broadcast in the past.
-Instead, we use RLN proofs as a proxy metric.
-A valid RLN proof signifies that the message has been generated by a node with an active membership during a particular epoch.
-TODO: make sure the above is correct: what exactly does RLN prove?
-
-## Deviations from the desired behavior
-
-There are multiple ways for a node to deviate from the desired behavior.
-TODO: are we talking only about the server here, or should we also discuss the client (e.g., DoS)?
-### Slow response
-The server takes too long to respond.
-Possible reasons:
-- the server is offline accidentally;
-- the request describes too many relevant messages (the server is overwhelmed);
-- the server is malicious and deliberately delays the response;
-- the server doesn't have some of the relevant messages and tries to request them from other nodes.
-
-### Incomplete response
-A relevant message is missing from the response.
-Possible explanations:
-- the server didn't receive the message when it was broadcast;
-- the server deliberately withholds the message.
-
-Contrary to blockchains, Relay doesn't have consensus over relayed messages.
-Therefore, it's impossible to distinguish between the two scenarios above.
-TODO: given this fact, what's the best we can aim for?
-
-### Irrelevant response
-The response contains a message that is not relevant.
-There are two scenarios here depending on whether RLN proofs are enforced.
-If RLN is not enforced, a server may insert any number of irrelevant messages into the response.
-If RLN is enforced, a server can only do so as long as it has a valid membership to generate the respective proofs.
-This doesn't eliminate the attack but limits its consequences.
-
-TODO: what are the powers of a malicious server when it comes to generating proofs for irrelevant messages? Can the server generate proofs for past epochs?
-
-## Privacy considerations
-
-Light protocols, in general, have weaker privacy properties than P2P protocols.
-In a client-server exchange, a client wants to selectively interact with the network.
-By doing so, it often reveals what it is interested in (e.g., subscribes to particular topics).
-
-A malicious Store server can spy on a client in the following ways:
-- track what time frames a client is interested in;
-- analyze the timing of requests;
-- link requests done by the same client.
-
-TODO: expand in the context of an incentivized protocol.
-
-
-# Cost-benefit analysis
-The goal of i13n is to make nodes more likely to exhibit the desired behavior.
-An incentive scheme links the payoffs to whether nodes follow the protocol or not.
-Good behavior should be rewarded, bad behavior punished.
-
-An incentive scheme should balance the costs and benefits for a node.
-Rewards should compensate for the cost of good behavior.
-Punishments should offset the benefits that bad behavior may bring.
-
-Let us analyze the costs and benefits of a server that are specific to the Store protocol:
-- storage;
-- bandwidth;
-- computation.
-
-Let us assume a constant flow of messages per epoch and a constant flow of requests for older messages.
-There are two processes: storing incoming messages, and serving old messages to clients.
-
-The cost of storing incoming messages for one epoch is composed of:
-- storage:
- - storage costs of all older messages: proportional to cumulative (message size x time stored);
- - storage costs of newly arrived messages: proportional to message size;
- - a constant cost for I/O operations (storing new messages);
-- bandwidth (download) for receiving new messages: proportional to the total size of incoming messages per epoch;
-- computational costs of receiving and storing new messages.
-
-(Strictly speaking, the I/O cost may not always be constant due to caching, disk fragmentation, etc.)
-
-The cost of serving messages to clients, per epoch, is composed of:
-- storage: none (it's accounted for as storing cost);
-- bandwidth
- - upload: proportional to (number of clients) x (length of time frame requested) x (message size);
- - download: proportional to the number of requests;
-- computational cost of handling requests.
-
-TODO: write this down mathematically.
-
-Storage is likely the dominating cost.
-Storage cost is proportional to the amount of information stored and the time it is stored for.
-A cumulative cost of storing a single message grows linearly with time.
-Assuming a constant stream of new messages, the total storage cost is quadratic in time.
-
-The number of messages in a response may be approximated by the length of the time frame requested.
-This assumes that messages are broadcast in the Relay network at a constant rate.
-
-Computation: the server spends computing cycles while handling requests.
-This cost likely depends not only on the computation itself, but also on the database structure.
-For example, retrieving old or rarely requested messages from the local database may be more expensive than fresh or popular ones due to caching.
-
-TODO: In file storage, I store a file and I pay for the ability to query it later. In Store, Alice relays a message, a server stores it, and later Bob queries it (and pays for it under an i13n scheme). Is there a mismatch between who incurs costs and who pays for it? Shall we think of ways to make Alice incur some costs too? See: https://github.com/waku-org/research/issues/32
-
-# Incentivization tools
-
-We can think of incentivization tools as a two-by-two matrix:
-- rewards vs punishment;
-- monetary vs reputation.
-
-In other words, there are four quadrants:
-- monetary reward: the client pays the server;
-- monetary punishment: the server makes a deposit in advance and gets slashed in case of misbehavior;
-- reputation reward: the server's reputation increases if it behaves well;
-- reputation punishment: the server's reputation decreases if it behaves badly.
-
-Reputation can only work if there are tangible benefits of having a high reputation and drawbacks of having a low reputation.
-For example:
-- clients are more likely to connect to servers with high reputation;
-- clients disconnect from servers with low reputation.
-Assuming there is a monetary aspect too, low-reputation servers miss out on potential revenue or lose their deposit.
-Reputation, however, assumes either a repeated interaction (i.e., local reputation), or some amount of trust / centralization (centrally managed rankings).
-
-Monetary i13n tools, in turn, pose a key question: how to ensure atomicity between performance and reward or punishment?
-In other words, if the client pays first, the server may take the money and not provide the service.
-Analogously, if the payment is due after the fact, the client can refuse to pay.
-Linking payments with behavior involves a certain amount of trust as well.
-
-This issue is somewhat linked to the problem of Lightning watchtower incentivization (see https://www.talaia.watch/).
-
-A general observation: if monetary flows are dependent on events in the past, and there is no consensus on what exactly happened in the past, the scheme can be exploited.
-TODO: can we use some on-chain component here as a semi-trusted arbiter?
-
-## Payment methods
-
-What we want from a payment method (order of priority to be discussed):
-- wide distribution (many people already have it);
-- high liquidity (i.e., easy to buy or sell at a reasonable exchange rate);
-- low latency;
-- high security.
-
-Let's list all (decentralized) payment options that we have:
-- proof-of-work: outsource-able, unavailable for consumer hardware - or is it? (Equihash etc)
-- proof-of-X (storage, etc)
-- cryptocurrency:
- - ETH
- - a token on Ethereum (ERC20)
- - a token on another EVM blockchain
- - a token on an EVM-based rollup
- - a token on a non-EVM blockchain (BTC / Lightning?)
-
-# Related work
-
-Decentralized storage is not a new idea. What is relevant for us?
-
-1. Federated real-time messaging (IRC, mailing lists). There is no "sync" in IRC; there are simply logs of prior conversations optionally hosted wherever.
-2. Centralized file storage (FTP, later Dropbox). Requires trust in availability, but not necessarily confidentiality: content can be encrypted (modulo metadata).
-3. P2P file-sharing: Napster, BitTorrent, eDonkey. The power of defaults, local reputation.
-4. Decentralized storage in the blockchain age: Storj, Sia, Filecoin, IPFS, Codex...
-
-# Future work
-
-How to generalize i13n for Store to other Waku protocols?

From 7f6b5f914c772447f5318d0d6868114122d2ca0e Mon Sep 17 00:00:00 2001
From: Sergei Tikhomirov
Date: Fri, 13 Oct 2023 16:02:12 +0200
Subject: [PATCH 03/18] create incentivization-outline.md

---
 incentivization-outline.md | 230 +++++++++++++++++++++++++++++++++++++
 1 file changed, 230 insertions(+)
 create mode 100644 incentivization-outline.md

diff --git a/incentivization-outline.md b/incentivization-outline.md
new file mode 100644
index 0000000..6d583d3
--- /dev/null
+++ b/incentivization-outline.md
@@ -0,0 +1,230 @@
+Our goal is to add an incentivization scheme to Waku to make it (more) incentive compatible.
+In what follows, we abbreviate incentivization as i13n.
+
+We aim to answer the following questions:
+
+1. what is the structure of the protocols in question?
+2. what is the desired behavior of protocol participants?
+3. what deviations from the desired behavior occur without incentivization?
+4. what incentivization tools do we have?
+5. what tools are appropriate for our purpose?
+6. what parameters can we choose? what are our restrictions?
+7. suggest a concrete i13n architecture.
+8. how do we check if we've solved the problem?
+
+# Overview
+
+Waku implements a modular decentralized censorship-resistant P2P communications protocol.
+Waku consists of multiple protocols (see [architecture](https://waku.org/about/architect)).
+We focus on the main four: Relay (a P2P protocol) and three light protocols (Filter, Store, and Lightpush), which have a client-server architecture (aka request-response).
+
+A Waku node is a node that runs at least one of the Waku protocols.
+A full Waku node is a node that runs Relay.
+A light Waku node is a node that only runs the client side of one of the light protocols.
+See also: https://github.com/waku-org/research/issues/28
+
+In light protocols, a client sends a request to a server.
+A server (a Relay node) performs some actions and returns a response, in particular:
+- [[Filter]]: the server will relay (only) messages that pass a filter to the client;
+- [[Store]]: the server responds with messages broadcast within the specified time frame;
+- [[Lightpush]]: the server publishes the client's message to the Relay network.
+
+Waku aims to function on widely available hardware.
+Hardware requirements for light nodes are lower than for full nodes.
+In particular, bandwidth consumption should be limited (estimated at 10 Mbps).
+See also: https://github.com/waku-org/research/issues/31
+
+# Store protocol
+
+We first focus on i13n of the Store protocol.
+Similar techniques may be later applied to other protocols.
+
+Store is a client-server protocol.
+A client asks the node to respond with relevant messages previously relayed through the Relay protocol.
+A relevant message is a message that has been broadcast via Relay within the specified time frame.
+The response may be split into multiple parts, as specified by pagination parameters.
+
+TODO: Strictly speaking, the definition of relevant is inconsistent because there is no consensus over messages. A message may be broadcast but not received by some nodes. Does this happen often? Can and should we do something about it?
+
+## Desired behavior
+
+The desired behavior of a Store-server node is to store all non-ephemeral messages forever.
+
+TODO: address the obvious concern that storing everything forever is unsustainable. Should there be some cut-off time after which old messages are no longer stored?
+
+Let's say a client issues a request to the server.
+We want the following to happen:
+
+- the server responds quickly;
+- the response contains all relevant messages;
+- the response contains only relevant messages.
+
+TODO: is this the full definition of the desired behavior?
+
+### RLN as a proxy metric of message relevance
+
+RLN (rate limiting nullifiers) is a method of spam prevention in Relay.
+The message sender generates a proof of enrollment in some membership set.
+Multiple proofs generated within one epoch lead to punishment.
+This technique limits the message rate from each node to at most one message per epoch.
+See also: https://rfc.vac.dev/spec/17/
+
+In the i13n context, we can't prove whether a message has indeed been broadcast in the past.
+Instead, we use RLN proofs as a proxy metric.
+A valid RLN proof signifies that the message has been generated by a node with an active membership during a particular epoch.
+TODO: make sure the above is correct: what exactly does RLN prove?
+
+## Deviations from the desired behavior
+
+There are multiple ways for a node to deviate from the desired behavior.
+TODO: are we talking only about the server here, or should we also discuss the client (e.g., DoS)?
+### Slow response
+The server takes too long to respond.
+Possible reasons:
+- the server is offline accidentally;
+- the request describes too many relevant messages (the server is overwhelmed);
+- the server is malicious and deliberately delays the response;
+- the server doesn't have some of the relevant messages and tries to request them from other nodes.
+
+### Incomplete response
+A relevant message is missing from the response.
+Possible explanations:
+- the server didn't receive the message when it was broadcast;
+- the server deliberately withholds the message.
+
+Contrary to blockchains, Relay doesn't have consensus over relayed messages.
+Therefore, it's impossible to distinguish between the two scenarios above.
+TODO: given this fact, what's the best we can aim for?
+
+### Irrelevant response
+The response contains a message that is not relevant.
+There are two scenarios here depending on whether RLN proofs are enforced.
+If RLN is not enforced, a server may insert any number of irrelevant messages into the response.
+If RLN is enforced, a server can only do so as long as it has a valid membership to generate the respective proofs.
+This doesn't eliminate the attack but limits its consequences.
+
+TODO: what are the powers of a malicious server when it comes to generating proofs for irrelevant messages? Can the server generate proofs for past epochs?
+
+## Privacy considerations
+
+Light protocols, in general, have weaker privacy properties than P2P protocols.
+In a client-server exchange, a client wants to selectively interact with the network.
+By doing so, it often reveals what it is interested in (e.g., subscribes to particular topics).
+
+A malicious Store server can spy on a client in the following ways:
+- track what time frames a client is interested in;
+- analyze the timing of requests;
+- link requests done by the same client.
+
+TODO: expand in the context of an incentivized protocol.
+
+
+# Cost-benefit analysis
+The goal of i13n is to make nodes more likely to exhibit the desired behavior.
+An incentive scheme links the payoffs to whether nodes follow the protocol or not.
+Good behavior should be rewarded, bad behavior punished.
+
+An incentive scheme should balance the costs and benefits for a node.
+Rewards should compensate for the cost of good behavior.
+Punishments should offset the benefits that bad behavior may bring.
+
+Let us analyze the costs and benefits of a server that are specific to the Store protocol:
+- storage;
+- bandwidth;
+- computation.
+
+Let us assume a constant flow of messages per epoch and a constant flow of requests for older messages.
+There are two processes: storing incoming messages, and serving old messages to clients.
+ +The cost of storing incoming messages for one epoch is composed of: +- storage: + - storage costs of all older messages: proportional to cumulative (message size x time stored); + - storage costs of newly arrived messages: proportional to message size; + - a constant cost for I/O operations (storing new messages); +- bandwidth (download) for receiving new messages: proportional to the total size of incoming messages per epoch; +- computational costs of receiving and storing new messages. + +(Strictly speaking, the I/O cost may not always be constant due to caching, disk fragmentation, etc.) + +The cost of storing messages to clients, per epoch, is composed of: +- storage: none (it's accounted for as storing cost); +- bandwidth + - upload: proportional to (number of clients) x (length of time frame requested) x (message size); + - download: proportional to the number of requests; +- computational cost of handling requests. + +TODO: write this down mathematically. + +Storage is likely the dominating cost. +Storage costs is proportional to the amount of information stored and the time it is stored for. +A cumulative cost of storing a single message grows linearly with time. +Assuming a constant stream of new messages, the total storage cost is quadratic in time. + +The number of messages in a response may be approximated by the length of the time frame requested. +This assumes that messages are broadcast in the Relay network at a constant rate. + +Computation: the server spends computing cycles while handling requests. +This costs likely depends not only on the computation itself, but also at the database structure. +For example, retrieving old or rarely requested messages from the local database may be more expensive than fresh or popular ones due to caching. + +TODO: In file storage, I store a file and I pay for the ability to query it later. In Store, Alice relays a message, a server stores is, and later Bob queries it (and pays for it under an i13n scheme). 
Is there a mismatch between who incurs costs and who pays for it? Shall we think of ways to make Alice incur some costs too? See: https://github.com/waku-org/research/issues/32 + +# Incentivization tools + +We can think of incentivization tools as a two-by-two matrix: +- rewards vs punishment; +- monetary vs reputation. + +In other words, there are four quadrants: +- monetary reward: the client pays the server; +- monetary punishment: the server makes a deposit in advance and gets slashed in case of misbehavior; +- reputation reward: the server's reputation increases if it behaves well; +- reputation punishment: the server's reputation decreases if it behaves badly. + +Reputation can only work if there are tangible benefits of having a high reputation and drawbacks of having a low reputation. +For example: +- clients are more likely to connect to servers with high reputation; +- clients disconnect from servers with low reputation. +Assuming there is a monetary aspect too, low-reputation servers miss out on potential revenue or lose their deposit. +Reputation, however, assumes ether a repeated interaction (i.e., local reputation), or some amount of trust / centralization (centrally managed rankings). + +Monetary i13n tools, in turn, pose a key question: how to ensure atomicity between performance and reward or punishment? +In other words, if the client pays first, the server may take the money and not provide the servers. +Analogously, if the payment is due after the fact, the client can refuse to pay. +Linking payments with behavior involves a certain amount of trust as well. + +This issue is somewhat linked to the problem of Lightning watchtower incentivization (see https://www.talaia.watch/). + +A general observation: if monetary flows are dependent on events in the past, and there is no consensus on what exactly happened in the past, the scheme can be exploited. +TODO: can we use some on-chain component here as a semi-trusted arbiter? 
+ +## Payment methods + +What we want from a payment method (order of priority to be discussed): +- wide distribution (many people already have it); +- high liquidity (i.e., easy to buy or sell at a reasonable exchange rate); +- low latency; +- high security. + +Let's list all (decentralized) payment options that we have: +- proof-of-work: outsource-able, unavailable for consumer hardware - or is it? (Equihash etc) +- proof-of-X (storage, etc) +- cryptocurrency: + - ETH + - a token on Ethereum (ERC20) + - a token on another EVM blockchain + - a token on an EVM-based rollup + - a token on a non-EVM blockchain (BTC / Lightning?) + +# Related work + +Decentralized storage is not a new idea. What is relevant for us? + +1. Federated real-time messaging (IRC, mailing lists). There is no "sync" in IRC; there are simply logs of prior conversations optionally hosted wherever. +2. Centralized file storage (FTP, later Dropbox). Requires trust in availability, but not necessarily confidentiality: content can be encrypted (modulo metadata). +3. P2P file-sharing: Napster, BitTorrent, eDonkey. The power of defaults, local reputation. +4. Decentralized storage in the blockchain age: Storj, Sia, Filecoin, IPFS, Codex... + +# Future work + +How to generalize i13n for Store to other Waku protocols? 
From 60f3aaa62ef3ad128c2245061c566d37e008f9d3 Mon Sep 17 00:00:00 2001 From: Sergei Tikhomirov Date: Tue, 17 Oct 2023 13:01:55 +0200 Subject: [PATCH 04/18] rename incentivization-outline.md to incentivization.md --- incentivization-outline.md => incentivization.md | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename incentivization-outline.md => incentivization.md (100%) diff --git a/incentivization-outline.md b/incentivization.md similarity index 100% rename from incentivization-outline.md rename to incentivization.md From 25bed7304ae5b182e1fed7fc6dd5e8b7fe9e1b49 Mon Sep 17 00:00:00 2001 From: Sergei Tikhomirov Date: Tue, 17 Oct 2023 17:24:29 +0200 Subject: [PATCH 05/18] add notes on client as attacker --- incentivization.md | 22 ++++++++++++++++------ 1 file changed, 16 insertions(+), 6 deletions(-) diff --git a/incentivization.md b/incentivization.md index 6d583d3..3cff8be 100644 --- a/incentivization.md +++ b/incentivization.md @@ -71,14 +71,14 @@ See also: https://rfc.vac.dev/spec/17/ In the i13n context, we can't prove whether a message has indeed been broadcast in the past. Instead, we use RLN proofs as a proxy metric. -A valid RLN proof signifies that the message has been generated by a node with an active membership during a particular eposh. +A valid RLN proof signifies that the message has been generated by a node with an active membership during a particular epoch. TODO: make sure the above is correct: what exactly does RLN prove? ## Deviations from the desired behavior There are multiple ways for a node to deviate from the desired behavior. -TODO: are we talking only about the server here, or should also discuss client (e.g., DoS)? -### Slow response +We look at potential misbehavior from the server side and from the client side separately. +### Server: Slow response The server takes too long to respond. 
Possible reasons: - the server is offline accidentally; @@ -86,7 +86,7 @@ Possible reasons: - the server is malicious and deliberately delays the response; - the server doesn't have some of the relevant messages and tries to request them from other nodes. -### Incomplete response +### Server: Incomplete response A relevant message is missing from the response. Possible explanations: - the server didn't receive the message when it was broadcast; @@ -96,15 +96,25 @@ Contrary to blockchains, Relay doesn't have consensus over relayed messages. Therefore, it's impossible to distinguish between the two scenarios above. TODO: given this fact, what's the best we can aim for? -### Irrelevant response +### Server: Irrelevant response The response contains a message that is not relevant. There are two scenarios here depending on whether RLN proofs are enforced. If RLN is not enforced, a server may insert any number or irrelevant messages into the response. If RLN is enforced, a server can only do so as long as it has a valid membership to generate the respective proofs. -This doesn't eliminate the attackbut limits its consequences. +This doesn't eliminate the attack but limits its consequences. TODO: what are the powers of a malicious server when it comes to generating proofs for irrelevant messages? Can the server generate proofs for past epochs? +### Client: Too many requests +The client sends many request to the server within a short period of time. +This may be seen as a DoS attack. + +### Client: Request is too large +The client sends a response that incurs excessive expenses on the server. +For example, the request covers a very long period in history, or, more generically, +a period that contains many messages. +This may also be seen as a DoS attack. + ## Privacy considerations Light protocols, in general, have weaker privacy properties than P2P protocols. 
From 8d7b6efd433dd843fa354a74788138b01f4440c7 Mon Sep 17 00:00:00 2001 From: Sergei Tikhomirov Date: Wed, 1 Nov 2023 18:21:25 +0100 Subject: [PATCH 06/18] incentivization outline majorly restructured --- incentivization.md | 319 ++++++++++++++++++++++++++------------------- 1 file changed, 183 insertions(+), 136 deletions(-) diff --git a/incentivization.md b/incentivization.md index 3cff8be..2b4799b 100644 --- a/incentivization.md +++ b/incentivization.md @@ -1,27 +1,84 @@ -Our goal is to add an incentivization scheme to Waku to make it (more) incentive compatible. -In what follows, we abbreviate incentivization as i13n. +Waku is a family of decentralized communication protocols. +The Waku network consists of independent nodes running the corresponding protocols. +Waku needs incentivization (i13n) to ensure proper node behavior in the absence of any centralized coordinator. -We aim to answer the following questions: +In this document, we overview the problem of i13n in decentralized systems. +We classify the possible methods of i13n and give example used in prior successful P2P networks. +We then briefly introduce Waku and outline the unique i13n challenges it presents. -1. what is the structure of the protocols in question? -2. what is the desired behavior of protocol participants? -3. what deviations from the desired behavior occur without incentivization? -4. what incentivization tools do we have? -5. what tools are appropriate for our purpose? -6. what parameters can we chose? what are our restrictions? -7. suggest a concrete i13n architecture. -8. how do we check if we've solved the problem? +We then go into more detail into one of the Waku's protocols, Store, responsible for archival storage. +We propose an i13n scheme for Store and implement an MVP solution. +We discuss the choices we have made for the MVP version, and what design options may be considered in the future. 
-# Overview +# Classification of i13n tools -Waku implements a modular decentralized censorship resistant P2P communications protocol. -Waku consists of multiple protocols (see [architecture](https://waku.org/about/architect)). -We focus on the main four are Relay (a P2P protocol), and three light protocols: Filter, Store, and Lightpush, which have a client-server architecture (aka request-response). +We can think of incentivization tools as a two-by-two matrix: +- rewards vs punishment; +- monetary vs reputation. -A Waku node is a node that runs at least one of the Waku protocols. -A full Waku node is a node that runs Relay. -A light Waku node as a node that only runs client-side of one of the light protocols. -See also: https://github.com/waku-org/research/issues/28 +In other words, there are four quadrants: +- monetary reward: the client pays the server; +- monetary punishment: the server makes a deposit in advance and gets slashed in case of misbehavior; +- reputation reward: the server's reputation increases if it behaves well; +- reputation punishment: the server's reputation decreases if it behaves badly. + +Reputation can only work if there are tangible benefits of having a high reputation and drawbacks of having a low reputation. +For example: +- clients are more likely to connect to servers with high reputation; +- clients disconnect from servers with low reputation. + +In the presence of monetary rewards, low-reputation servers miss out on potential revenue or lose their deposit. +Without the monetary aspects, low-reputation nodes can't get as much benefit from the network. +Reputation either assumes a repeated interaction (i.e., local reputation), or some amount of trust (centrally managed rankings). + +Ideally, monetary motivation should be atomically linked with performance. +A node should be rewarded if and only if it performed the desired action. +Analogously, it should be punished if and only it it misbehaved. 
+In other words, if the client pays first, the server cannot deny service, +and if the client pays after the fact, it's impossible to default on the obligation. + +In blockchain networks, the desired behavior of miners or validators can be automatically verified and rewarded with native tokens (or punished by slashing). +Enforcing atomicity in decentralized data-focused networks can be challenging: +it is non-trivial to prove that a certain piece of data was sent or received. +Therefore, such cases may warrant a combination of monetary and reputation-based approaches. + +# Related work + +There have been many example of incentivized decentralized systems. + +## Early P2P file-sharing + +Early P2P file-sharing networks employed reputation-based approaches and stickly defaults. +For instance, in BitTorrent, a peer by default shares pieces of a file before having received it in whole. +At the same time, the bandwidth that a peer can use depends on how much is has shared previously. +This policy rewards nodes who share by allowing them to download file faster. +While this reward is not monetary, it has proven to be sufficient in practice. + +## Blockchains + +The key innovation of Bitcoin, inherited and built upon by later blockchain networks, is the introduction of native monetary i13 mechanism. +In the case of Bitcoin, miners create new blocks and are automatically rewarded with newly mined coins, as prescribed by the protocol. +An invalid block will be rejected by other nodes and not rewarded, which incentivizes good behavior. +There are no intrinsic monetary punishments in Bitcoin, only rewards. +However, mining nodes are required to expend physical resources for block generation. + +Proof-of-stake consensus algorithms introduced intrinsic monetary punishments in the blockchain context. +A validator locks up (stakes) native tokens and gets rewarded for validating new blocks. 
+In case of misbehavior, the deposit is automatically taken away (i.e., the bad actor is slashed). + +## Decentralized storage + +Multiple decentralized storage networks have appeared in recent years, including Codex, Storj, Sia, Filecoin, IPFS. +They combine the techniques from early P2P file-sharing and blockchain-inspired reward mechanisms. + +# Waku + +Waku is a family of protocols (see [architecture](https://waku.org/about/architect)) for a modular decentralized censorship-resistant P2P communications network. +The backbone of Waku is the Relay protocol ([RLN-Relay](https://rfc.vac.dev/spec/17/) is an spam-protected version of the protocol). +Additionally, there are three light (client-server, request-response) protocols: Filter, Store, and Lightpush. + +There is no strict definition of a full node vs a light node in Waku (see https://github.com/waku-org/research/issues/28). +In this document, we may refer to a node that is running Relay and Store (server-side) as a full node, and to a node that is running a client-side of any of the light protocols as a light node. In light protocols, a client sends a request to a server. A server (a Relay node) performs some actions and returns a response, in particular: @@ -29,14 +86,21 @@ A server (a Relay node) performs some actions and returns a response, in particu - [[Store]]: the server responds with messages broadcast within the specified time frame; - [[Lightpush]]: the server publishes the client's message to the Relay network. -Waku aims to function on widely available hardware. -Hardware requirements for light nodes are lower than for full nodes. -In particular, bandwidth consumption should be limited (estimated at 10 Mbps). -See also: https://github.com/waku-org/research/issues/31 +## Waku i13n challenges -# Store protocol +As a communication protocol, Waku lacks consensus or a native token. +These properties bring Waku closer to purely reputation-incentivized file-sharing systems. 
+Our goal nevertheless is to combine monetary and reputation-based incentives in Waku. +The rationale for that is that monetary incentives have demonstrated their robustness in blockchain networks, +and are well-suited for a network designed to scale well beyond the initial phase when it's mainly maintained by enthusiasts for altruistic reasons. -We first focus on i13n of the Store protocol. +In our i13n framework, currently Waku only operates under reputation-based rewards and punishments. +While [RLN-Relay](https://rfc.vac.dev/spec/17/) adds monetary punishments for spammers, slashing is yet to be activated. + + +# Waku Store + +In this section, we design a monetary-based i13n scheme for Waku Store. Similar techniques may be later applied to other protocols. Store is a client-server protocol. @@ -44,14 +108,6 @@ A client asks the node to respond with relevant messages previously relayed thro A relevant message is a message that has been broadcast via Relay within the specified time frame. The response may be split into multiple parts, as specified by pagination parameters. -TODO: Strictly speaking, the definition of relevant is inconsistent because there is no consensus over messages. A message may be broadcast but not received by some nodes. Does this happen often? Can and should we do something about it? - -## Desired behavior - -The desired behavior of a Store-server node is to store all non-ephemeral messages forever. - -TODO: address the obvious concern that storing everything forever is unsustainable. Should there be some cut-off time after which old messages are no longer stored? - Let's say, a client issues a request to the server. We want the following to happen: @@ -59,89 +115,44 @@ We want the following to happen: - all the messages in the response are relevant; - the response contains only relevant messages. -TODO: is this the full definition of the desired behavior? 
+From a security standpoint, each Store node should enforce limits on requests as to not be DoS-ed. -### RLN as a proxy metric of message relevance +As Waku doesn't intent to establish consensus over past messages, +we can only rely on heuristics to determine whether a message had been relayed earlier. +To decrease the chance of missing some messages, a client may query multiple servers and combine their replies (union of all messages; messages reported by some majority of servers, etc). -RLN (rate limiting nullifiers) is a method of spam prevention in Relay. -The message sender generates a proof of enrollment in some membership set. -Multiple proofs generated within one epoch lead to punishment. -This technique limits the message rate from each node to at most one message per epoch. -See also: https://rfc.vac.dev/spec/17/ +# Store i13n MVP -In the i13n context, we can't prove whether a message has indeed been broadcast in the past. -Instead, we use RLN proofs as a proxy metric. -A valid RLN proof signifies that the message has been generated by a node with an active membership during a particular epoch. -TODO: make sure the above is correct: what exactly does RLN prove? +We propose Store-i13n-MVP - the simplest version of i13n in Store. -## Deviations from the desired behavior +In broad strokes: +- client: I want this piece of history +- server (after internal calculations): here is the price +- client: pays (if price is ok; otherwise conversation ends) +- server: responds with data +- client: checks the data: if data is irrelevant - decreases server's reputation +- client (optionally): queries another server; compares responses; maybe decreases reputation of both (?) if responses diverge. Or queries 3 servers and assumes that messages returned by 2/3 or 3/3 are "real" ("Never Take Two Chronometers to Sea"). -There are multiple ways for a node to deviate from the desired behavior. 
-We look at potential misbehavior from the server side and from the client side separately. -### Server: Slow response -The server takes too long to respond. -Possible reasons: -- the server is offline accidentally; -- the request describes too many relevant messages (the server is overwhelmed); -- the server is malicious and deliberately delays the response; -- the server doesn't have some of the relevant messages and tries to request them from other nodes. +# Evaluation -### Server: Incomplete response -A relevant message is missing from the response. -Possible explanations: -- the server didn't receive the message when it was broadcast; -- the server deliberately withholds the message. +We measure the performance of our i13-ed Store protocol by the following metrics... -Contrary to blockchains, Relay doesn't have consensus over relayed messages. -Therefore, it's impossible to distinguish between the two scenarios above. -TODO: given this fact, what's the best we can aim for? +TODO: how do we check if we've solved the problem? -### Server: Irrelevant response -The response contains a message that is not relevant. -There are two scenarios here depending on whether RLN proofs are enforced. -If RLN is not enforced, a server may insert any number or irrelevant messages into the response. -If RLN is enforced, a server can only do so as long as it has a valid membership to generate the respective proofs. -This doesn't eliminate the attack but limits its consequences. +# Future work -TODO: what are the powers of a malicious server when it comes to generating proofs for irrelevant messages? Can the server generate proofs for past epochs? +Store-i13n-MVP is the simplest protocol we can start with. +Let us now outline the design choices to be made if we were to go beyond MVP. -### Client: Too many requests -The client sends many request to the server within a short period of time. -This may be seen as a DoS attack. 
- -### Client: Request is too large -The client sends a response that incurs excessive expenses on the server. -For example, the request covers a very long period in history, or, more generically, -a period that contains many messages. -This may also be seen as a DoS attack. - -## Privacy considerations - -Light protocols, in general, have weaker privacy properties than P2P protocols. -In a client-server exchange, a client wants to selectively interact with the network. -By doing so, it often reveals what it is interested in (e.g., subscribes to particular topics). - -A malicious Store server can spy on a client in the following ways: -- track what time frames a client is interested in; -- analyze the timing of requests; -- link requests done by the same client. - -TODO: expand in the context of an incentivized protocol. - - -# Cost-benefit analysis -The goal of i13n is to make nodes more likely to exhibit the desired behavior. -An incentive scheme links the payoffs to whether nodes follow the protocol or not. -Good behavior should be rewarded, bad behavior punished. +## Price discovery An incentive scheme should balance the costs and benefits for a node. Rewards should compensate the cost of good behavior. Punishments should offset the benefits that bad behavior may bring. -Let us analyze the costs and benefit of a server that are specific to the Store protocol: -- storage; -- bandwidth; -- computation. +In the MVP i13n protocol, clients and servers establish a free market by negotiating prices. +A server should understand its true costs to negotiate effectively. +The costs of a Store server are storage, bandwidth, and computation. Let us assume a constant flow of messages per epoch and a constant flow of requests for older messages. There are two processes: storing incoming messages, and serving old messages to clients. 
@@ -163,8 +174,6 @@ The cost of storing messages to clients, per epoch, is composed of: - download: proportional to the number of requests; - computational cost of handling requests. -TODO: write this down mathematically. - Storage is likely the dominating cost. Storage costs is proportional to the amount of information stored and the time it is stored for. A cumulative cost of storing a single message grows linearly with time. @@ -177,40 +186,41 @@ Computation: the server spends computing cycles while handling requests. This costs likely depends not only on the computation itself, but also at the database structure. For example, retrieving old or rarely requested messages from the local database may be more expensive than fresh or popular ones due to caching. -TODO: In file storage, I store a file and I pay for the ability to query it later. In Store, Alice relays a message, a server stores is, and later Bob queries it (and pays for it under an i13n scheme). Is there a mismatch between who incurs costs and who pays for it? Shall we think of ways to make Alice incur some costs too? See: https://github.com/waku-org/research/issues/32 +## RLN as a proxy for message relevance -# Incentivization tools +RLN (rate limiting nullifiers) is a method of spam prevention in Relay ([RLN-Relay](https://rfc.vac.dev/spec/17/)). +The message sender generates a proof of enrollment in some membership set. +Multiple proofs generated within one epoch lead to punishment. +This technique limits the message rate from each node to at most one message per epoch. -We can think of incentivization tools as a two-by-two matrix: -- rewards vs punishment; -- monetary vs reputation. +In the i13n context, we can't prove whether a message has indeed been broadcast in the past. +Instead, we use RLN proofs as a proxy metric. +A valid RLN proof signifies that the message has been generated by a node with an active membership during a particular epoch. 
+Note that a malicious node with a valid membership can generate messages but not broadcast them. +Such messages would not be "relevant" (i.e., other nodes would be unaware of them), but they would satisfy the RLN-based heuristic. -In other words, there are four quadrants: -- monetary reward: the client pays the server; -- monetary punishment: the server makes a deposit in advance and gets slashed in case of misbehavior; -- reputation reward: the server's reputation increases if it behaves well; -- reputation punishment: the server's reputation decreases if it behaves badly. +Ideally, we would like to punish a server that omits relevant messages. +But as this can't be proven, we resort to reputation in this case. +In other words: if a client is dissatisfied with the response, it simply won't query this server anymore. +A way for the client to know (with some certainty) whether relevant messages have been omitted is to query another server. -Reputation can only work if there are tangible benefits of having a high reputation and drawbacks of having a low reputation. -For example: -- clients are more likely to connect to servers with high reputation; -- clients disconnect from servers with low reputation. -Assuming there is a monetary aspect too, low-reputation servers miss out on potential revenue or lose their deposit. -Reputation, however, assumes ether a repeated interaction (i.e., local reputation), or some amount of trust / centralization (centrally managed rankings). +## Privacy considerations -Monetary i13n tools, in turn, pose a key question: how to ensure atomicity between performance and reward or punishment? -In other words, if the client pays first, the server may take the money and not provide the servers. -Analogously, if the payment is due after the fact, the client can refuse to pay. -Linking payments with behavior involves a certain amount of trust as well. +Light protocols, in general, have weaker privacy properties than P2P protocols. 
+In a client-server exchange, a client wants to selectively interact with the network. +By doing so, it often reveals what it is interested in (e.g., subscribes to particular topics). -This issue is somewhat linked to the problem of Lightning watchtower incentivization (see https://www.talaia.watch/). - -A general observation: if monetary flows are dependent on events in the past, and there is no consensus on what exactly happened in the past, the scheme can be exploited. -TODO: can we use some on-chain component here as a semi-trusted arbiter? +A malicious Store server can spy on a client in the following ways: +- track what time frames a client is interested in; +- analyze the timing of requests; +- link requests done by the same client. ## Payment methods -What we want from a payment method (order of priority to be discussed): +The MVP protocol is agnostic to payment methods. +However, some payment methods may be more suitable than others. + +What we want from a payment method: - wide distribution (many people already have it); - high liquidity (i.e., easy to buy or sell at a reasonable exchange rate); - low latency; @@ -226,15 +236,52 @@ Let's list all (decentralized) payment options that we have: - a token on an EVM-based rollup - a token on a non-EVM blockchain (BTC / Lightning?) -# Related work +Note also that there may be different market models. +One model is that each client pays for its requests. +Another model assumes that (centralized) applications built on top of Waku buy "credits" in bulk for their users, for whom using the application (which may involve querying Store servers under the hood) is free of charge. -Decentralized storage is not a new idea. What is relevant for us? +## Incentive compatibility -1. Federated real-time messaging (IRC, mailing lists). There is no "sync" in IRC; there are simply logs of prior conversations optionally hosted wherever. -2. Centralized file storage (FTP, later Dropbox). 
Requires trust in availability, but not necessarily confidentiality: content can be encrypted (modulo metadata). -3. P2P file-sharing: Napster, BitTorrent, eDonkey. The power of defaults, local reputation. -4. Decentralized storage in the blockchain age: Storj, Sia, Filecoin, IPFS, Codex... +In file storage, I store a file and I pay for the ability to query it later. In Store, Alice relays a message, a server stores is, and later Bob queries it (and pays for it under an i13n scheme). Is there a mismatch between who incurs costs and who pays for it? Shall we think of ways to make Alice incur some costs too? See: https://github.com/waku-org/research/issues/32 -# Future work +## Generalization for other Waku protocols -How to generalize i13n for Store to other Waku protocols? +We plan to generalize i13n for Store to other Waku protocols, in particular, to light protocols (Lightpush and Filter). + +# Appendix: Deviations from the desired behavior + +There are multiple ways for a node to deviate from the desired behavior. +We look at potential misbehavior from the server side and from the client side separately. +### Server: Slow response +The server takes too long to respond. +Possible reasons: +- the server is offline accidentally; +- the request describes too many relevant messages (the server is overwhelmed); +- the server is malicious and deliberately delays the response; +- the server doesn't have some of the relevant messages and tries to request them from other nodes. + +### Server: Incomplete response +A relevant message is missing from the response. +Possible explanations: +- the server didn't receive the message when it was broadcast; +- the server deliberately withholds the message. + +Contrary to blockchains, Relay doesn't have consensus over relayed messages. +Therefore, it's impossible to distinguish between the two scenarios above. + +### Server: Irrelevant response +The response contains a message that is not relevant. 
+There are two scenarios here depending on whether RLN proofs are enforced. +If RLN is not enforced, a server may insert any number or irrelevant messages into the response. +If RLN is enforced, a server can only do so as long as it has a valid membership to generate the respective proofs. +This doesn't eliminate the attack but limits its consequences. + +### Client: Too many requests +The client sends many request to the server within a short period of time. +This may be seen as a DoS attack. + +### Client: Request is too large +The client sends a response that incurs excessive expenses on the server. +For example, the request covers a very long period in history, or, more generically, +a period that contains many messages. +This may also be seen as a DoS attack. \ No newline at end of file From a5e1f59a3ba7093b09f4650e09297a372ecacb89 Mon Sep 17 00:00:00 2001 From: Sergei Tikhomirov Date: Fri, 3 Nov 2023 20:16:00 +0100 Subject: [PATCH 07/18] draft description of store incentivization MVP --- incentivization.md | 129 +++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 120 insertions(+), 9 deletions(-) diff --git a/incentivization.md b/incentivization.md index 2b4799b..4d25560 100644 --- a/incentivization.md +++ b/incentivization.md @@ -48,7 +48,7 @@ There have been many example of incentivized decentralized systems. ## Early P2P file-sharing -Early P2P file-sharing networks employed reputation-based approaches and stickly defaults. +Early P2P file-sharing networks employed reputation-based approaches and sticky defaults. For instance, in BitTorrent, a peer by default shares pieces of a file before having received it in whole. At the same time, the bandwidth that a peer can use depends on how much is has shared previously. This policy rewards nodes who share by allowing them to download file faster. 
@@ -125,13 +125,121 @@ To decrease the chance of missing some messages, a client may query multiple ser We propose Store-i13n-MVP - the simplest version of i13n in Store. -In broad strokes: -- client: I want this piece of history -- server (after internal calculations): here is the price -- client: pays (if price is ok; otherwise conversation ends) -- server: responds with data -- client: checks the data: if data is irrelevant - decreases server's reputation -- client (optionally): queries another server; compares responses; maybe decreases reputation of both (?) if responses diverge. Or queries 3 servers and assumes that messages returned by 2/3 or 3/3 are "real" ("Never Take Two Chronometers to Sea"). +## Current protocol + +As currently defined, the Store protocol works as follows: +1. the client sends a `HistoryQuery` to the server; +2. the server sends a `HistoryResponse` to the client. + +A response may come in multiple parts (pagination). +Pagination parameters are defined in `PagingInfo` message inside both the `HistoryQuery` and a ``HistoryResponse``. +Let us ignore the pagination considerations for now (assume it just works). + +## Proposed modification + +We proposes modification consists of three parts: +1. price negotiation; +2. reputation accounting; +3. results cross-checking. + +### Price negotiation + +Upon receiving a `HistoryQuery`, the server does the following: +1. Internally calculate the price it wants to charge. +2. Send a message to the client: "I can serve your request for N tokens". +3. If the client agrees, it pays and sends a proof of payment to the server. +4. The server sends the response. + +Price discovery to be discussed in a later section. +In particular, we will reason about which message properties (age, size, etc) should contribute to the price. + +Potential issues: +- A malicious client overwhelms a server with requests and doesn't follow through. A countermeasure: ignore requests from the same client if they come too often. 
+- (other attacks?) + +### Proof of payment + +If the client agrees to the price, it sends a _proof of payment_ to the server. +The nature of such proof depends on the means of payment. +Assuming the payment takes place on a blockchain, it could simply be a transaction hash. + +It's unclear if we need to ensure that a particular txid is linked to a particular request. +Including request ID into the payment (a-la "memo field") threatens privacy. +Not including it looks fine though, assuming the server keeps track which transactions it had received correspond to which requests (and responses). + +TODO: explore the idea of service credentials: +- [https://forum.vac.dev/t/vac-sustainability-and-business-workshop/116](https://forum.vac.dev/t/vac-sustainability-and-business-workshop/116 "https://forum.vac.dev/t/vac-sustainability-and-business-workshop/116") +- [https://github.com/vacp2p/research/issues/99](https://github.com/vacp2p/research/issues/99 "https://github.com/vacp2p/research/issues/99") +- [https://github.com/vacp2p/research/issues/135](https://github.com/vacp2p/research/issues/135 "https://github.com/vacp2p/research/issues/135") + +#### Who pays first? + +We have to make a design decision: who pays first? +Our options are: +1. the client pays first and trust the server to deliver; +2. the client pays after the fact: the server trusts the client; +3. the client pays partly upfront and partly after the fact; +4. there is a third party (escrow) that ensures atomicity (a trusted third party or a semi-trusted, semi-automated entity like a smart contract). + +Here are our design considerations: +- the MVP protocol should be simple; +- servers are considered to be a more "permanent" entities, that are more likely to have a long-lived ID; +- it is more important to protect the clients's privacy than the server's privacy: a client knows what server it queries in any case, while ideally the server shouldn't know who the client is. 
(This isn't entirely rigorous, think about it.) + +With that in mind, we suggest the scheme where the client pays first. +It is simpler than splitting the payment, which would involve a) two payments, and b) negotiating the split. +It is also simpler than a trusted third party (the centralized flavor of which we want to avoid anyway). +Comparing to "client pays after the fact", we observe that there is a balance between risk and privacy. +If the server "pays first", it assumes risk. +This risk should be decreased or paid for. +Decreasing the risk means keeping track of the clients' reputation from the server's standpoint, which may endanger clients' privacy. +Paying for the risk means increasing prices (i.e., well-behaved clients in aggregate pay for free-riders). +We suggest that the preferable design is the opposite: the client assumes the risk. +Why this is better: +- it's more likely that the server is professionalized: serving data is its business which it wouldn't want to sabotage; +- the client keeps their privacy, essentially paying for privacy with taking on more risk - this is OK, as risk is "anonymous", and reputation is not. + +### Reputation accounting + +Our protocol assumes that the client trusts the server. +In particular, the client pays first, and then hopes that the server sends back the response. +A server may technically take the money and do nothing. +To discourage this behavior, we use reputation: a client keeps track of the server's behavior. + +The MVP version could be: +- all servers start with zero reputation +- if the server honors the request, it gets +1; +- if the server does not respond after the initial query, it gets -1; +- if the server takes the money and _then_ does not respond, it gets banned (this client will never query it again). + +Potential issues: +- An attacker can establish new server identities and continually run away with clients' money. 
+ - countermeasures: + - a client only queries "trusted" servers (centralization); + - when querying a new server, a client first sends a small (i.e. cheap) request to not risk too much. +- Think about how the ban mechanism can be abused. Can an attacker "frame" competitors' servers so that many clients ban them? + +### Results cross-checking + +> Never go to sea with two chronometers; take one or three. + +The client not only wants to receive _some_ response, it wants to receive all relevant messages and only them. +We don't have consensus over history, so it's impossible to know for sure if a message is relevant. +In non security-critical settings, a client may just accept the risk that some messages may be missing. +For more certainty, the client may query 3 independent servers and compare the results. +Only messages returned by 3/3 or 2/3 are considered relevant. + +Servers' reputation may then be adjusted, but it's not completely obvious how: +- imagine a server whose response has a message that no other response has. + - should we punish it for inserting a fake message into history? or + - should we reward it for providing the data that other are (perhaps intentionally) hiding? +- Same with a server that _misses_ some message that others have delivered: we don't know what the ground truth is anyway. + +However, in the absence of a better mechanism, we can _define_ 2/3 as a validity criteria. +Then, it follows that a server that does _not_ have the "2/3" message is either malicious (censors) or badly managed (went offline when this message was propagated). +In any case, its reputation should be decreased. + +Note: the cross-checking part is optional and may be considered to be out of scope for the MVP protocol. # Evaluation @@ -215,6 +323,9 @@ A malicious Store server can spy on a client in the following ways: - analyze the timing of requests; - link requests done by the same client. 
+Also, citing the [Store specification](https://rfc.vac.dev/spec/13/): +> The main security consideration ... is that a querying node have to reveal their content filters of interest to the queried node, hence potentially compromising their privacy. + ## Payment methods The MVP protocol is agnostic to payment methods. @@ -266,7 +377,7 @@ Possible explanations: - the server didn't receive the message when it was broadcast; - the server deliberately withholds the message. -Contrary to blockchains, Relay doesn't have consensus over relayed messages. +Contrary to blockchains, Relay doesn't have consensтus over relayed messages. Therefore, it's impossible to distinguish between the two scenarios above. ### Server: Irrelevant response From 33ddb367cf9f9dba5b1b8c336a925096951f7d4b Mon Sep 17 00:00:00 2001 From: Sergei Tikhomirov Date: Tue, 7 Nov 2023 14:38:06 +0100 Subject: [PATCH 08/18] edit incentivization outline around MPV protocol --- incentivization.md | 412 ++++++++++++++++++++------------------------- 1 file changed, 178 insertions(+), 234 deletions(-) diff --git a/incentivization.md b/incentivization.md index 4d25560..c60726c 100644 --- a/incentivization.md +++ b/incentivization.md @@ -18,27 +18,22 @@ We can think of incentivization tools as a two-by-two matrix: In other words, there are four quadrants: - monetary reward: the client pays the server; -- monetary punishment: the server makes a deposit in advance and gets slashed in case of misbehavior; +- monetary punishment: the server makes a deposit and gets slashed if it misbehaves; - reputation reward: the server's reputation increases if it behaves well; - reputation punishment: the server's reputation decreases if it behaves badly. -Reputation can only work if there are tangible benefits of having a high reputation and drawbacks of having a low reputation. -For example: -- clients are more likely to connect to servers with high reputation; -- clients disconnect from servers with low reputation. 
- +Reputation can only work if there are tangible benefits of having a high reputation. +For example, clients should be more likely to connect to servers with high reputation and disconnect from servers with low reputation. In the presence of monetary rewards, low-reputation servers miss out on potential revenue or lose their deposit. Without the monetary aspects, low-reputation nodes can't get as much benefit from the network. Reputation either assumes a repeated interaction (i.e., local reputation), or some amount of trust (centrally managed rankings). -Ideally, monetary motivation should be atomically linked with performance. -A node should be rewarded if and only if it performed the desired action. -Analogously, it should be punished if and only it it misbehaved. -In other words, if the client pays first, the server cannot deny service, -and if the client pays after the fact, it's impossible to default on the obligation. +Monetary motivation should ideally be atomically linked with performance. +If the client pays first, the server cannot deny service, +and if the client pays after the fact, it's impossible to default on this obligation. -In blockchain networks, the desired behavior of miners or validators can be automatically verified and rewarded with native tokens (or punished by slashing). -Enforcing atomicity in decentralized data-focused networks can be challenging: +In blockchains, the desired behavior of miners or validators can be automatically verified and rewarded with native tokens (or punished by slashing). +Enforcing atomicity in decentralized data-focused networks is challenging: it is non-trivial to prove that a certain piece of data was sent or received. Therefore, such cases may warrant a combination of monetary and reputation-based approaches. 
@@ -52,231 +47,186 @@ Early P2P file-sharing networks employed reputation-based approaches and sticky For instance, in BitTorrent, a peer by default shares pieces of a file before having received it in whole. At the same time, the bandwidth that a peer can use depends on how much is has shared previously. This policy rewards nodes who share by allowing them to download file faster. -While this reward is not monetary, it has proven to be sufficient in practice. +While this reward is not monetary, it has proven to work in practice. ## Blockchains -The key innovation of Bitcoin, inherited and built upon by later blockchain networks, is the introduction of native monetary i13 mechanism. -In the case of Bitcoin, miners create new blocks and are automatically rewarded with newly mined coins, as prescribed by the protocol. -An invalid block will be rejected by other nodes and not rewarded, which incentivizes good behavior. +The key innovation of Bitcoin, inherited and built upon by later blockchains, is native monetary i13n. +In Bitcoin, miners create new blocks and are automatically rewarded with newly mined coins. +An invalid block is rejected by other nodes and not rewarded. There are no intrinsic monetary punishments in Bitcoin, only rewards. However, mining nodes are required to expend physical resources for block generation. -Proof-of-stake consensus algorithms introduced intrinsic monetary punishments in the blockchain context. -A validator locks up (stakes) native tokens and gets rewarded for validating new blocks. -In case of misbehavior, the deposit is automatically taken away (i.e., the bad actor is slashed). +Proof-of-stake algorithms introduce intrinsic monetary punishments in the blockchain context. +A validator locks up (stakes) native tokens and gets rewarded for validating new blocks and slashed for misbehavior.
## Decentralized storage -Multiple decentralized storage networks have appeared in recent years, including Codex, Storj, Sia, Filecoin, IPFS. -They combine the techniques from early P2P file-sharing and blockchain-inspired reward mechanisms. +Decentralized storage networks, including Codex, Storj, Sia, Filecoin, IPFS, combine the techniques from early P2P file-sharing and blockchain-inspired reward mechanisms to incentivize nodes to store data. -# Waku +# Waku background Waku is a family of protocols (see [architecture](https://waku.org/about/architect)) for a modular decentralized censorship-resistant P2P communications network. -The backbone of Waku is the Relay protocol ([RLN-Relay](https://rfc.vac.dev/spec/17/) is an spam-protected version of the protocol). -Additionally, there are three light (client-server, request-response) protocols: Filter, Store, and Lightpush. +The backbone of Waku is the Relay protocol (and its spam-protected version [RLN-Relay](https://rfc.vac.dev/spec/17/)). +Additionally, there are three light (or client-server, or request-response) protocols: Filter, Store, and Lightpush. -There is no strict definition of a full node vs a light node in Waku (see https://github.com/waku-org/research/issues/28). -In this document, we may refer to a node that is running Relay and Store (server-side) as a full node, and to a node that is running a client-side of any of the light protocols as a light node. +There is no strict definition of a full node vs a light node in Waku (see [discussion](https://github.com/waku-org/research/issues/28)). +In this document, we refer to a node that is running Relay and Store (server-side) as a full node or a server, and to a node that is running a client-side of any of the light protocols as a light node or a client. In light protocols, a client sends a request to a server. 
A server (a Relay node) performs some actions and returns a response, in particular: - [[Filter]]: the server will relay (only) messages that pass a filter to the client; -- [[Store]]: the server responds with messages broadcast within the specified time frame; +- [[Store]]: the server responds with messages that had been broadcast within the specified time frame; - [[Lightpush]]: the server publishes the client's message to the Relay network. ## Waku i13n challenges As a communication protocol, Waku lacks consensus or a native token. These properties bring Waku closer to purely reputation-incentivized file-sharing systems. -Our goal nevertheless is to combine monetary and reputation-based incentives in Waku. -The rationale for that is that monetary incentives have demonstrated their robustness in blockchain networks, +Our goal nevertheless is to combine monetary and reputation-based incentives. +The rationale is that monetary incentives have demonstrated their robustness in blockchains, and are well-suited for a network designed to scale well beyond the initial phase when it's mainly maintained by enthusiasts for altruistic reasons. - -In our i13n framework, currently Waku only operates under reputation-based rewards and punishments. +Currently, Waku only operates under reputation-based rewards and punishments. While [RLN-Relay](https://rfc.vac.dev/spec/17/) adds monetary punishments for spammers, slashing is yet to be activated. +## Waku Store -# Waku Store +In this document, we focus on i13n for Waku Store. +Similar techniques may be later applied to other Waku light protocols. -In this section, we design a monetary-based i13n scheme for Waku Store. -Similar techniques may be later applied to other protocols. +Store is a client-server protocol that currently works as follows: +1. the client sends a `HistoryQuery` to the server; +2. the server sends a `HistoryResponse` to the client. -Store is a client-server protocol. 
-A client asks the node to respond with relevant messages previously relayed through the Relay protocol. -A relevant message is a message that has been broadcast via Relay within the specified time frame. -The response may be split into multiple parts, as specified by pagination parameters. +The response may be split into multiple parts, as specified by pagination parameters in `PagingInfo`. -Let's say, a client issues a request to the server. -We want the following to happen: +Let us define a relevant message as a message that has been broadcast via Relay within the time frame that the client specified. +The desired functionality of Store can be described as follows: - the server responds quickly; - all the messages in the response are relevant; - the response contains only relevant messages. -From a security standpoint, each Store node should enforce limits on requests as to not be DoS-ed. +# Waku Store incentivization MVP -As Waku doesn't intent to establish consensus over past messages, -we can only rely on heuristics to determine whether a message had been relayed earlier. -To decrease the chance of missing some messages, a client may query multiple servers and combine their replies (union of all messages; messages reported by some majority of servers, etc). +In this section, we aim to define the simplest viable i13n modification to the Store protocol. -# Store i13n MVP +We propose to add the following aspects to the protocol: +1. price offer; +2. proof of payment; +3. reputation accounting. -We propose Store-i13n-MVP - the simplest version of i13n in Store. +### Price offer -## Current protocol - -As currently defined, the Store protocol works as follows: -1. the client sends a `HistoryQuery` to the server; -2. the server sends a `HistoryResponse` to the client. - -A response may come in multiple parts (pagination). -Pagination parameters are defined in `PagingInfo` message inside both the `HistoryQuery` and a ``HistoryResponse``.
-Let us ignore the pagination considerations for now (assume it just works). - -## Proposed modification - -We proposes modification consists of three parts: -1. price negotiation; -2. reputation accounting; -3. results cross-checking. - -### Price negotiation - -Upon receiving a `HistoryQuery`, the server does the following: -1. Internally calculate the price it wants to charge. -2. Send a message to the client: "I can serve your request for N tokens". -3. If the client agrees, it pays and sends a proof of payment to the server. -4. The server sends the response. - -Price discovery to be discussed in a later section. -In particular, we will reason about which message properties (age, size, etc) should contribute to the price. +After the client sends a `HistoryQuery` to the server: +1. The server internally calculates the offer price and sends it to the client. +2. If the client agrees, it pays and sends a proof of payment to the server. +3. If the client does not agree, it sends a rejection message to the server. +4. If the server receives a valid proof of payment before a certain timeout, it sends the response to the client. +5. If the server receives a rejection message, or receives no message before a timeout, the server assumes that the client has rejected the offer. Potential issues: -- A malicious client overwhelms a server with requests and doesn't follow through. A countermeasure: ignore requests from the same client if they come too often. -- (other attacks?) +- The client overwhelms a server with requests but doesn't proceed with payment. Countermeasure: ignore requests from the same client if they come too often. +- The server and the client have no means to negotiate the price - see a later section on price negotiation. ### Proof of payment If the client agrees to the price, it sends a _proof of payment_ to the server. The nature of such proof depends on the means of payment. 
-Assuming the payment takes place on a blockchain, it could simply be a transaction hash. +Assuming the payment takes place on a blockchain, it could simply be a transaction hash (`txid`). -It's unclear if we need to ensure that a particular txid is linked to a particular request. +It's unclear whether we need to ensure that a given `txid` is linked to a particular request. Including request ID into the payment (a-la "memo field") threatens privacy. -Not including it looks fine though, assuming the server keeps track which transactions it had received correspond to which requests (and responses). - -TODO: explore the idea of service credentials: -- [https://forum.vac.dev/t/vac-sustainability-and-business-workshop/116](https://forum.vac.dev/t/vac-sustainability-and-business-workshop/116 "https://forum.vac.dev/t/vac-sustainability-and-business-workshop/116") -- [https://github.com/vacp2p/research/issues/99](https://github.com/vacp2p/research/issues/99 "https://github.com/vacp2p/research/issues/99") -- [https://github.com/vacp2p/research/issues/135](https://github.com/vacp2p/research/issues/135 "https://github.com/vacp2p/research/issues/135") +Not including it could lead to the server's confusion regarding which received payments correspond to which requests. #### Who pays first? We have to make a design decision: who pays first? Our options are: -1. the client pays first and trust the server to deliver; -2. the client pays after the fact: the server trusts the client; +1. the client pays first; +2. the client pays after the fact; 3. the client pays partly upfront and partly after the fact; -4. there is a third party (escrow) that ensures atomicity (a trusted third party or a semi-trusted, semi-automated entity like a smart contract). +4. a third party (escrow) ensures atomicity (it may be a centralized trusted third party or a semi-trusted entity like a smart contract).
-Here are our design considerations: +Our design considerations are: - the MVP protocol should be simple; -- servers are considered to be a more "permanent" entities, that are more likely to have a long-lived ID; -- it is more important to protect the clients's privacy than the server's privacy: a client knows what server it queries in any case, while ideally the server shouldn't know who the client is. (This isn't entirely rigorous, think about it.) +- servers are more "permanent" entities and are more likely to have long-lived identities; +- it is more important to protect the client's privacy than the server's privacy: a client knows what server it queries, while the server ideally shouldn't know who the client is. -With that in mind, we suggest the scheme where the client pays first. +With that in mind, we suggest that the client pays first. It is simpler than splitting the payment, which would involve a) two payments, and b) negotiating the split. -It is also simpler than a trusted third party (the centralized flavor of which we want to avoid anyway). +It is also simpler than a trusted third party (the centralized flavor of which we want to avoid). + Comparing to "client pays after the fact", we observe that there is a balance between risk and privacy. -If the server "pays first", it assumes risk. -This risk should be decreased or paid for. -Decreasing the risk means keeping track of the clients' reputation from the server's standpoint, which may endanger clients' privacy. -Paying for the risk means increasing prices (i.e., well-behaved clients in aggregate pay for free-riders). -We suggest that the preferable design is the opposite: the client assumes the risk. -Why this is better: -- it's more likely that the server is professionalized: serving data is its business which it wouldn't want to sabotage; -- the client keeps their privacy, essentially paying for privacy with taking on more risk - this is OK, as risk is "anonymous", and reputation is not.
+If the server "pays first", it assumes risk, which should be decreased or paid for. +Decreasing the risk means that the server keeps track of the clients' reputation, which endangers privacy. +Paying for the risk means higher prices (well-behaved clients pay for free-riders). + +We propose that the client assumes the risk and pays for it because: +- the server is more likely to be professionalized, so dropping paid requests would sabotage its business; +- the client pays for their privacy by assuming risk, which is acceptable (risk is "anonymous", reputation is not). ### Reputation accounting -Our protocol assumes that the client trusts the server. -In particular, the client pays first, and then hopes that the server sends back the response. -A server may technically take the money and do nothing. -To discourage this behavior, we use reputation: a client keeps track of the server's behavior. - -The MVP version could be: -- all servers start with zero reputation +We use reputation to discourage the server from taking the payment and not responding. +The client keeps track of the server's reputation: +- all servers start with zero reputation; - if the server honors the request, it gets +1; -- if the server does not respond after the initial query, it gets -1; -- if the server takes the money and _then_ does not respond, it gets banned (this client will never query it again). +- if the server does not respond to the initial request, it gets -1; +- if the server takes the money and does not respond before a timeout, the client will never query it again. Potential issues: -- An attacker can establish new server identities and continually run away with clients' money. - - countermeasures: - - a client only queries "trusted" servers (centralization); - - when querying a new server, a client first sends a small (i.e. cheap) request to not risk too much. -- Think about how the ban mechanism can be abused.
Can an attacker "frame" competitors' servers so that many clients ban them? +- An attacker can establish new server identities and continue running away with clients' money. Countermeasures: + - a client only queries "trusted" servers (which however leads to centralization); + - when querying a new server, a client first sends a small (i.e. cheap) request as a test. +- The ban mechanism can theoretically be abused. For instance, a competitor may attack the victim server and cause the clients who were awaiting the response to ban that server. Countermeasure: prevent DoS-attacks. -### Results cross-checking +# Payment methods -> Never go to sea with two chronometers; take one or three. +The MVP protocol is agnostic to payment methods. +A payment method should generally have the following properties: +- wide distribution; +- good liquidity; +- low latency; +- good privacy; +- high security. -The client not only wants to receive _some_ response, it wants to receive all relevant messages and only them. -We don't have consensus over history, so it's impossible to know for sure if a message is relevant. -In non security-critical settings, a client may just accept the risk that some messages may be missing. -For more certainty, the client may query 3 independent servers and compare the results. -Only messages returned by 3/3 or 2/3 are considered relevant. +Let's list all decentralized payment options: +- ETH; +- a token on Ethereum (ERC20); +- a token on another EVM-based blockchain or a rollup; +- a token on a non-EVM blockchain (such as BTC / Lightning). -Servers' reputation may then be adjusted, but it's not completely obvious how: -- imagine a server whose response has a message that no other response has. - - should we punish it for inserting a fake message into history? or - - should we reward it for providing the data that other are (perhaps intentionally) hiding? 
-- Same with a server that _misses_ some message that others have delivered: we don't know what the ground truth is anyway. +Note also that there may be different market models that may motivate the choice of the payment method. +One model assumes that each client pays for its own requests. +Another model includes (centralized?) entities (e.g., developers of Waku-based apps) that pay for their users in bulk. -However, in the absence of a better mechanism, we can _define_ 2/3 as a validity criteria. -Then, it follows that a server that does _not_ have the "2/3" message is either malicious (censors) or badly managed (went offline when this message was propagated). -In any case, its reputation should be decreased. - -Note: the cross-checking part is optional and may be considered to be out of scope for the MVP protocol. +We also note that: +- eventually the protocol may support multiple payment methods; +- however, the MVP version should be simple, which likely means supporting just one payment method; +- if the initially supported payment method is an ERC-20 token, it should be easy to add other ERC-20 tokens later, including a potential WAKU token. # Evaluation -We measure the performance of our i13-ed Store protocol by the following metrics... - -TODO: how do we check if we've solved the problem? +We should think about what the success metrics for an incentivized protocol are, and how to measure them both in simulated settings and in a live network. # Future work -Store-i13n-MVP is the simplest protocol we can start with. -Let us now outline the design choices to be made if we were to go beyond MVP. - +Let us now outline some of the open questions beyond MVP. ## Price discovery -An incentive scheme should balance the costs and benefits for a node. -Rewards should compensate the cost of good behavior. -Punishments should offset the benefits that bad behavior may bring.
- -In the MVP i13n protocol, clients and servers establish a free market by negotiating prices. -A server should understand its true costs to negotiate effectively. +To offer a reasonable price, a server should understand its costs. The costs of a Store server are storage, bandwidth, and computation. +A Store server does two things: it stores messages, and serves messages to clients. -Let us assume a constant flow of messages per epoch and a constant flow of requests for older messages. -There are two processes: storing incoming messages, and serving old messages to clients. - -The cost of storing incoming messages for one epoch is composed of: +The cost of storing messages is composed of: - storage: - storage costs of all older messages: proportional to cumulative (message size x time stored); - - storage costs of newly arrived messages: proportional to message size; - - a constant cost for I/O operations (storing new messages); -- bandwidth (download) for receiving new messages: proportional to the total size of incoming messages per epoch; -- computational costs of receiving and storing new messages. + - the cost for I/O operations for storing new messages (roughly constant per unit time, though may fluctuate due to caching, disk fragmentation, etc.); +- bandwidth (download) for receiving new messages; +- computational costs. -(Strictly speaking, the I/O cost may not always be constant due to caching, disk fragmentation, etc.) - -The cost of storing messages to clients, per epoch, is composed of: -- storage: none (it's accounted for as storing cost); +The cost of serving messages to clients, per unit time, is composed of: - bandwidth - upload: proportional to (number of clients) x (length of time frame requested) x (message size); - download: proportional to the number of requests; @@ -285,32 +235,60 @@ The cost of storing messages to clients, per epoch, is composed of: Storage is likely the dominating cost. 
Storage costs is proportional to the amount of information stored and the time it is stored for. A cumulative cost of storing a single message grows linearly with time. -Assuming a constant stream of new messages, the total storage cost is quadratic in time. - The number of messages in a response may be approximated by the length of the time frame requested. -This assumes that messages are broadcast in the Relay network at a constant rate. Computation: the server spends computing cycles while handling requests. This costs likely depends not only on the computation itself, but also at the database structure. For example, retrieving old or rarely requested messages from the local database may be more expensive than fresh or popular ones due to caching. -## RLN as a proxy for message relevance +More formal calculations should be done, under certain assumptions about message flow (i.e., that it is constant). -RLN (rate limiting nullifiers) is a method of spam prevention in Relay ([RLN-Relay](https://rfc.vac.dev/spec/17/)). -The message sender generates a proof of enrollment in some membership set. -Multiple proofs generated within one epoch lead to punishment. -This technique limits the message rate from each node to at most one message per epoch. +## Price negotiation -In the i13n context, we can't prove whether a message has indeed been broadcast in the past. +If the server offers the price that is too high for the client, the client has no means to make a counter-offer. +This results in wasted bandwidth on requests that don't result in responses. +We could introduce a price negotiation step in the protocol, where the client and the server would exchange messages naming their acceptable prices until they agree of one of them decides to stop the negotiation. +We should make sure that price negotiation does not become a DoS vector (i.e., a client initiates a lengthy negotiation but ultimately rejects, wasting the server's resources). 
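
One way to keep the negotiation step from becoming a DoS vector is to cap the number of rounds. The sketch below is only an illustration under that assumption: `negotiate`, `step`, and `max_rounds` are hypothetical names and are not part of the Store specification. Each side concedes by a fixed amount per round and never moves past its own private limit.

```python
from typing import Optional


def negotiate(server_ask: int, server_floor: int,
              client_bid: int, client_ceiling: int,
              step: int, max_rounds: int = 5) -> Optional[int]:
    """Return the agreed price, or None if the round budget runs out."""
    ask, bid = server_ask, client_bid
    for _ in range(max_rounds):
        if bid >= ask:
            # The client's bid covers the server's ask: settle at the ask.
            return ask
        # Each side concedes by `step`, but never past its private limit.
        ask = max(server_floor, ask - step)
        bid = min(client_ceiling, bid + step)
    # Round budget exhausted: stop instead of negotiating indefinitely.
    return None
```

With overlapping limits (server floor 60, client ceiling 80) the two sides meet within the budget; with disjoint limits the function returns `None` after `max_rounds` rounds, which bounds the work a non-paying client can force on the server.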
+ +## Results cross-checking + +> Never go to sea with two chronometers; take one or three. + +The client wants to receive all relevant messages and only them. +Without consensus, it's impossible to check if a message is relevant. +In non security-critical settings, a client may accept the risk that some messages may be missing. +For more certainty, the client may query 3 independent servers and compare the results. +Messages returned by 3/3 or 2/3 are considered relevant. + +The servers' reputation may then be adjusted, but it's not completely obvious how. +Let A, B, C be the three servers. +Imagine there is a message that only A's response contains. +From the client's standpoint, this message is not relevant. + +How should A's reputation be adjusted? +Should A be punished for inserting a fake message into history? +Or should A be rewarded for providing a "rare" message that B and C have either missed or are intentionally censoring? + +## Preventing DoS attacks + +The client can overwhelm the server in at least two ways: +- sending many requests; +- sending a request that covers a very long time frame. + +The former can be prevented with rate limiting: the server would disconnect from such clients. +The latter can be mitigated economically, if the price depends on the length of the requested time frame. + +## Heuristics of relevance + +In the absence of consensus, we can't prove whether a message has indeed been broadcast in the past. Instead, we use RLN proofs as a proxy metric. +RLN (rate limiting nullifiers) is a method of spam prevention in Relay ([RLN-Relay](https://rfc.vac.dev/spec/17/)). +The message sender generates a proof of enrollment in a membership set. +Multiple proofs generated and revealed within one epoch lead to punishment. A valid RLN proof signifies that the message has been generated by a node with an active membership during a particular epoch. Note that a malicious node with a valid membership can generate messages but not broadcast them. 
-Such messages would not be "relevant" (i.e., other nodes would be unaware of them), but they would satisfy the RLN-based heuristic. - -Ideally, we would like to punish a server that omits relevant messages. -But as this can't be proven, we resort to reputation in this case. -In other words: if a client is dissatisfied with the response, it simply won't query this server anymore. -A way for the client to know (with some certainty) whether relevant messages have been omitted is to query another server. +Such messages would not be known to other nodes, but they would satisfy the RLN-based heuristic. +We may later look into other ways for the client to check message relevance. ## Privacy considerations @@ -319,80 +297,46 @@ In a client-server exchange, a client wants to selectively interact with the net By doing so, it often reveals what it is interested in (e.g., subscribes to particular topics). A malicious Store server can spy on a client in the following ways: -- track what time frames a client is interested in; +- track the topics the client is interested in; +- analyze the periods of history interesting for the client; - analyze the timing of requests; -- link requests done by the same client. +- link requests made by the same client. -Also, citing the [Store specification](https://rfc.vac.dev/spec/13/): +Citing the [Store specification](https://rfc.vac.dev/spec/13/): > The main security consideration ... is that a querying node have to reveal their content filters of interest to the queried node, hence potentially compromising their privacy. -## Payment methods +### Service credentials -The MVP protocol is agnostic to payment methods. -However, some payment methods may be more suitable than others. +Service credentials break the link between paying for the service and the service itself. +Such scheme may be explored in the context of payment methods for higher user privacy. 
-What we want from a payment method: -- wide distribution (many people already have it); -- high liquidity (i.e., easy to buy or sell at a reasonable exchange rate); -- low latency; -- high security. +In a credential-based scheme: +1. the client deposits funds into an on-chain pool; +2. the client generates a credential that proves the transfer in zero-knowledge; +3. the client sends the credential to the server; +4. the server uses the credential to pull funds from the pool. -Let's list all (decentralized) payment options that we have: -- proof-of-work: outsource-able, unavailable for consumer hardware - or is it? (Equihash etc) -- proof-of-X (storage, etc) -- cryptocurrency: - - ETH - - a token on Ethereum (ERC20) - - a token on another EVM blockchain - - a token on an EVM-based rollup - - a token on a non-EVM blockchain (BTC / Lightning?) +Further reading: [one](https://forum.vac.dev/t/vac-sustainability-and-business-workshop/116), [two](https://github.com/vacp2p/research/issues/99), [three](https://github.com/vacp2p/research/issues/135). -Note also that there may be different market models. -One model is that each client pays for its requests. -Another model assumes that (centralized) applications built on top of Waku buy "credits" in bulk for their users, for whom using the application (which may involve querying Store servers under the hood) is free of charge. +## Relation to long-term storage solutions -## Incentive compatibility +Decentralized file storage networks, such as Codex, could (and perhaps should) be the backend for Store servers. +Alternatives to Codex include IPFS, Filecoin, Sia, and Storj. +We should explore this landscape and understand its relevance for Store i13n. -In file storage, I store a file and I pay for the ability to query it later. In Store, Alice relays a message, a server stores is, and later Bob queries it (and pays for it under an i13n scheme). Is there a mismatch between who incurs costs and who pays for it? 
Shall we think of ways to make Alice incur some costs too? See: https://github.com/waku-org/research/issues/32 +## Should message senders pay? + +To ensure protocol sustainability, we should analyze its game theoretic properties. +Note that there are in fact more than two party to the Store protocol: +- the server; +- the client; +- the sender of the message. + +In particular, it is the _sender_ who imposes major costs on the server: the more messages the sender (or, indeed, multiple senders) broadcast, the higher are the Store server's storage costs. +However, it is the Store client who pays for fetching messages. +Is it fair / sustainable that the client pays for costs that the sender causes? +Would it be desired or possible to make the sender pay as well (see [issue](https://github.com/waku-org/research/issues/32))? ## Generalization for other Waku protocols -We plan to generalize i13n for Store to other Waku protocols, in particular, to light protocols (Lightpush and Filter). - -# Appendix: Deviations from the desired behavior - -There are multiple ways for a node to deviate from the desired behavior. -We look at potential misbehavior from the server side and from the client side separately. -### Server: Slow response -The server takes too long to respond. -Possible reasons: -- the server is offline accidentally; -- the request describes too many relevant messages (the server is overwhelmed); -- the server is malicious and deliberately delays the response; -- the server doesn't have some of the relevant messages and tries to request them from other nodes. - -### Server: Incomplete response -A relevant message is missing from the response. -Possible explanations: -- the server didn't receive the message when it was broadcast; -- the server deliberately withholds the message. - -Contrary to blockchains, Relay doesn't have consensus over relayed messages. -Therefore, it's impossible to distinguish between the two scenarios above.
- -### Server: Irrelevant response -The response contains a message that is not relevant. -There are two scenarios here depending on whether RLN proofs are enforced. -If RLN is not enforced, a server may insert any number or irrelevant messages into the response. -If RLN is enforced, a server can only do so as long as it has a valid membership to generate the respective proofs. -This doesn't eliminate the attack but limits its consequences. - -### Client: Too many requests -The client sends many request to the server within a short period of time. -This may be seen as a DoS attack. - -### Client: Request is too large -The client sends a response that incurs excessive expenses on the server. -For example, the request covers a very long period in history, or, more generically, -a period that contains many messages. -This may also be seen as a DoS attack. \ No newline at end of file +Think about how to generalize Store i13n to other Waku light protocols: Lightpush and Filter. From 5e2caaacb0243883bef911c4fd2241fd6732a1c3 Mon Sep 17 00:00:00 2001 From: Sergei Tikhomirov Date: Fri, 10 Nov 2023 12:33:20 +0100 Subject: [PATCH 09/18] fix definition of complete and correct response Co-authored-by: Hanno Cornelius <68783915+jm-clius@users.noreply.github.com> --- incentivization.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/incentivization.md b/incentivization.md index c60726c..3c97ade 100644 --- a/incentivization.md +++ b/incentivization.md @@ -104,7 +104,7 @@ Let us define a relevant message as a message that has been broadcast via Relay The desired functionality of Store can be described as following: - the server responds quickly; -- all the messages in the response are relevant; +- all relevant messages are in the response; - the response contains only relevant messages. 
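
The two content conditions in the list above can be read as set inclusions in opposite directions. The sketch below assumes a hypothetical reference set `relevant` of message identifiers. In practice no single party holds such a set, because Relay has no consensus over messages, so this only illustrates the definitions and is not a check a client can actually run.

```python
def is_complete(response: set, relevant: set) -> bool:
    """All relevant messages are in the response (nothing is omitted)."""
    return relevant <= response


def is_correct(response: set, relevant: set) -> bool:
    """The response contains only relevant messages (nothing is injected)."""
    return response <= relevant
```

A response that satisfies both conditions is exactly equal to the relevant set. Failing `is_complete` corresponds to omitted (possibly censored) messages, and failing `is_correct` corresponds to inserted ones.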
# Waku Store incentivization MVP From 5d01e32e374922c6aa31d411c582e294a22e37f9 Mon Sep 17 00:00:00 2001 From: Sergei Tikhomirov Date: Fri, 10 Nov 2023 12:35:39 +0100 Subject: [PATCH 10/18] more precise wording on filter criteria Co-authored-by: Hanno Cornelius <68783915+jm-clius@users.noreply.github.com> --- incentivization.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/incentivization.md b/incentivization.md index 3c97ade..5021951 100644 --- a/incentivization.md +++ b/incentivization.md @@ -100,7 +100,7 @@ Store is a client-server protocol that currently works as follows: The response may be split into multiple parts, as specified by pagination parameters in `PagingInfo`. -Let us define a relevant message as a message that has been broadcast via Relay within the time frame that the client specified. +Let us define a relevant message as a message that has been broadcast via Relay within the time frame and matching the filter criteria that the client specified. The desired functionality of Store can be described as following: - the server responds quickly; From ec1e25b827b15a8a8a83fb1d32e9402b731a442f Mon Sep 17 00:00:00 2001 From: Sergei Tikhomirov Date: Fri, 10 Nov 2023 13:08:42 +0100 Subject: [PATCH 11/18] minor fixes --- incentivization.md | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/incentivization.md b/incentivization.md index 5021951..566187a 100644 --- a/incentivization.md +++ b/incentivization.md @@ -159,7 +159,7 @@ It is also simpler than a trusted third party (the centralized flavor of which w Comparing to "client pays after the fact", we observe that there is a balance between risk and privacy. If the server "pays first", it assumes risk, which should be decreased or paid for. -Decreasing the risk means that the client keeps track of the clients' reputation, which endangers privacy. +Decreasing the risk means that the server keeps track of the clients' reputation, which endangers privacy. 
Paying for the risk means higher prices (well-behaved clients pay for free-riders). We propose that the client assumes the risk and pays for it because: @@ -172,14 +172,15 @@ We use reputation to discourage the server from taking the payment and not respo The client keeps track of the server's reputation: - all servers start with zero reputation; - if the server honors the request, it gets +1; -- if the server does not respond to the initial request, it gets -1; -- if the server takes the money and does not respond before a timeout, the client will never query it again. +- if the server does not respond before the payment, it gets -1; +- if the server does not respond after the payment and before a timeout, the client will never query it again. Potential issues: - An attacker can establish new server identities and continue running away with clients' money. Countermeasures: - a client only queries "trusted" servers (which however leads to centralization); - when querying a new server, a client first sends a small (i.e. cheap) request as a test. - The ban mechanism can theoretically be abused. For instance, a competitor may attack the victim server and cause the clients who were awaiting the response to ban that server. Countermeasure: prevent DoS-attacks. +- Servers may also farm reputation by running clients and querying their own server. # Payment methods @@ -247,7 +248,7 @@ More formal calculations should be done, under certain assumptions about message If the server offers the price that is too high for the client, the client has no means to make a counter-offer. This results in wasted bandwidth on requests that don't result in responses. -We could introduce a price negotiation step in the protocol, where the client and the server would exchange messages naming their acceptable prices until they agree of one of them decides to stop the negotiation. 
+We could introduce a price negotiation step in the protocol, where the client and the server would exchange messages naming their acceptable prices until they agree, or one of them decides to stop the negotiation. We should make sure that price negotiation does not become a DoS vector (i.e., a client initiates a lengthy negotiation but ultimately rejects, wasting the server's resources). ## Results cross-checking From f5f89494c306d9480c0e5adf53d5d44fd5599eb4 Mon Sep 17 00:00:00 2001 From: Sergei Tikhomirov Date: Fri, 17 Nov 2023 18:03:23 +0100 Subject: [PATCH 12/18] rewrite with MVP-focus; move future work to issues --- incentivization.md | 242 +++++++++++---------------------------------- 1 file changed, 55 insertions(+), 187 deletions(-) diff --git a/incentivization.md b/incentivization.md index 566187a..ac4b1c3 100644 --- a/incentivization.md +++ b/incentivization.md @@ -92,7 +92,6 @@ While [RLN-Relay](https://rfc.vac.dev/spec/17/) adds monetary punishments for sp ## Waku Store In this document, we focus on i13n for Waku Store. -Similar techniques may be later applied to other Waku light protocols. Store is a client-server protocol that currently works as follows: 1. the client sends a `HistoryQuery` to the server; @@ -112,61 +111,62 @@ The desired functionality of Store can be described as following: In this section, we aim to define the simplest viable i13n modification to the Store protocol. We propose to add the following aspects to the protocol: -1. price offer; -2. proof of payment; -3. reputation accounting. +1. pricing: + 1. cost calculation + 2. price advertisement + 3. price negotiation +2. payment: + 1. payment itself + 2. proof of payment +3. reputation +4. results cross-checking -### Price offer +The MVP version of the protocol has no price advertisement, no price negotiation, and no results cross-checking. +Other elements are present in a minimal version. +## Pricing + +For MVP, we assume a constant price per hour of history. 
After the client sends a `HistoryQuery` to the server: -1. The server internally calculates the offer price and sends it to the client. +1. The server internally calculates the offer price and sends it to the client. 2. If the client agrees, it pays and sends a proof of payment to the server. 3. If the client does not agree, it sends a rejection message to the server. -4. If the server receives a valid proof of payment before a certain timeout, it sends the response to the client. +4. If the server receives a valid payment before a certain timeout, it sends the response to the client. 5. If the server receives a rejection message, or receives no message before a timeout, the server assumes that the client has rejected the offer. -Potential issues: -- The client overwhelms a server with requests but doesn't proceed with payment. Countermeasure: ignore requests from the same client if they come too often. -- The server and the client have no means to negotiate the price - see a later section on price negotiation. +### Future work -### Proof of payment +- DoS protection: a client can overwhelm a server with requests and not proceed to payment. Countermeasure: ignore requests from the same client if they come too often; generalize a reputation system to servers ranking clients. +- Cost calculation - see https://github.com/waku-org/research/issues/35 +- Price advertisement - see https://github.com/waku-org/research/issues/51 +- Price negotiation - see https://github.com/waku-org/research/issues/52 + +## Payment If the client agrees to the price, it sends a _proof of payment_ to the server. -The nature of such proof depends on the means of payment. -Assuming the payment takes place on a blockchain, it could simply be a transaction hash (`txid`). +For the MVP, each request is paid for with a separate transaction. +The transaction hash (`txid`) acts as a proof of payment. -It's unclear whether we need to ensure that a given `txid` is linked to a particular request. 
-Including request ID into the payment (a-la "memo field") threatens privacy. -Not including it could lead to the server's confusion regarding which received payments correspond to which requests. - -#### Who pays first? - -We have to make a design decision: who pays first? -Our options are: -1. the client pays first; -2. the client pays after the fac; -3. the client pays partly upfront and partly after the fact; -4. a third party (escrow) ensures atomicity (it may be a centralized trusted third party or a semi-trusted entity like a smart contract). +Note that client gives proof of payment before it receives the response. +Other options could be: +1. the client pays after the fact; +2. the client pays partly upfront and partly after the fact; +3. an escrow (a centralized trusted third party, or a semi-trusted entity like a smart contract) ensures atomicity . Our design considerations are: - the MVP protocol should be simple; -- servers are more "permanent" entities and are more likely to have a long-lived identities; -- it is more important to protect the clients's privacy than the server's privacy: a client knows what server it queries, while the server ideally shouldn't know who the client is. +- servers are more "permanent" entities and are more likely to have long-lived identities; +- it is more important to protect the clients's privacy than the server's privacy. -With that in mind, we suggest that the client pays first. -It is simpler than splitting the payment, which would involve a) two payments, and b) negotiating the split. -It is also simpler than a trusted third party (the centralized flavor of which we want to avoid). +In light of these criteria, we suggest that the client pays first: this is simpler than splitting the payment, more secure than trusting a third party, and (arguably) more privacy-preserving for the client than the alternative where the client pays after the fact (that would encourage servers to deanonymize clients to prevent fraud). 
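
The MVP flow (constant price per hour of history, client pays first, `txid` as proof of payment) can be sketched as follows. This is a minimal illustration under stated assumptions, not the Store implementation: `PRICE_PER_HOUR`, `offer_price`, `Server`, and the in-memory `ledger` are hypothetical stand-ins, and rounding up to whole hours and rejecting a reused `txid` are assumptions that the text does not prescribe.

```python
from dataclasses import dataclass, field

PRICE_PER_HOUR = 10  # hypothetical constant price, in the smallest token unit


def offer_price(start_ts: int, end_ts: int) -> int:
    """Price a query by the requested time frame, rounded up to whole hours."""
    seconds = max(0, end_ts - start_ts)
    hours = -(-seconds // 3600)  # ceiling division
    return hours * PRICE_PER_HOUR


@dataclass
class Server:
    # txid -> paid amount; stands in for looking the transaction up on-chain.
    ledger: dict
    # Assumption: each txid pays for one request only.
    used_txids: set = field(default_factory=set)

    def serve(self, start_ts, end_ts, txid, offered_at, now, timeout=60):
        price = offer_price(start_ts, end_ts)
        # No proof of payment, or proof arrived after the timeout:
        # the server treats the offer as rejected.
        if txid is None or now - offered_at > timeout:
            return "rejected"
        # The payment must exist, cover the price, and not be reused.
        if txid in self.used_txids or self.ledger.get(txid, 0) < price:
            return "invalid payment"
        self.used_txids.add(txid)
        return "response"
```

For example, a two-hour query costs `2 * PRICE_PER_HOUR`. A server whose ledger shows a matching payment answers the first request carrying that `txid` and refuses a second request that reuses it.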
-Comparing to "client pays after the fact", we observe that there is a balance between risk and privacy. -If the server "pays first", it assumes risk, which should be decreased or paid for. -Decreasing the risk means that the server keeps track of the clients' reputation, which endangers privacy. -Paying for the risk means higher prices (well-behaved clients pay for free-riders). +### Future work -We propose that the client assumes the risk and pays for it because: -- the server is more likely to be professionalized, so dropping paid requests would sabotage its business; -- the client pays for their privacy by assuming risk, which is acceptable (risk is "anonymous", reputation is not). +- Add more payment methods - see https://github.com/waku-org/research/issues/58 +- Design a subscription model with service credentials - see https://github.com/waku-org/research/issues/59 +- Add privacy to service credentials - see https://github.com/waku-org/research/issues/60 -### Reputation accounting +## Reputation We use reputation to discourage the server from taking the payment and not responding. The client keeps track of the server's reputation: @@ -182,162 +182,30 @@ Potential issues: - The ban mechanism can theoretically be abused. For instance, a competitor may attack the victim server and cause the clients who were awaiting the response to ban that server. Countermeasure: prevent DoS-attacks. - Servers may also farm reputation by running clients and querying their own server. -# Payment methods +### Future work -The MVP protocol is agnostic to payment methods. -A payment method should generally have the following properties: -- wide distribution; -- good liquidity; -- low latency; -- good privacy; -- high security. 
+Design a more comprehensive reputation system: +- local reputation - see https://github.com/waku-org/research/issues/48 +- global reputation - see https://github.com/waku-org/research/issues/49 -Let's list all decentralized payment options: -- ETH; -- a token on Ethereum (ERC20); -- a token on another EVM-based blockchain or a rollup; -- a token on a non-EVM blockchain (such as BTC / Lightning). +Reputation may also be use to rank clients to prevent DoS attacks when a client overwhelms the server with requests. +While rate limiting stops such attack, the server would need to link requests coming from one client, threatening its privacy. -Note also that there may be different market models that may motivate the choice of the payment method. -One model assumes that each client pays for its own requests. -Another model includes (centralized?) entities (i.e., developers of Waku-based apps) that pay for their users in bulk. +## Results cross-checking -We also note that: -- eventually the protocol may support multiple payment methods; -- however, the MVP version should be simple, which likely means supporting just one payment method; -- if the initially supported payment method is an ERC-20 token, it should be easy to add other ERC-20 tokens later, including a potential WAKU token. +Cross-checking is absent in MVP but should be considered later. +We can separate it into two questions: the client want to ensure that servers are a) not censoring real messages; b) not injecting fake messages into history. + +- Cross-checking the results against censorship - see https://github.com/waku-org/research/issues/57 +- Use RLN to limit fake message insertion - see https://github.com/waku-org/research/issues/38 # Evaluation We should think about what the success metrics for an incentivized protocol are, and how to measure them both in simulated settings, as well as in a live network. -# Future work +# Longer-term future work -Let us now outline some of the open questions beyond MVP. 
-## Price discovery - -To offer a reasonable price, a server should understand its costs. -The costs of a Store server are storage, bandwidth, and computation. -A Store server does two things: it stores messages, and serves messages to clients. - -The cost of storing messages is composed of: -- storage: - - storage costs of all older messages: proportional to cumulative (message size x time stored); - - the cost for I/O operations for storing new messages (roughly constant per unit time, though may fluctuate due to caching, disk fragmentation, etc.); -- bandwidth (download) for receiving new messages; -- computational costs. - -The cost of serving messages to clients, per unit time, is composed of: -- bandwidth - - upload: proportional to (number of clients) x (length of time frame requested) x (message size); - - download: proportional to the number of requests; -- computational cost of handling requests. - -Storage is likely the dominating cost. -Storage costs is proportional to the amount of information stored and the time it is stored for. -A cumulative cost of storing a single message grows linearly with time. -The number of messages in a response may be approximated by the length of the time frame requested. - -Computation: the server spends computing cycles while handling requests. -This costs likely depends not only on the computation itself, but also at the database structure. -For example, retrieving old or rarely requested messages from the local database may be more expensive than fresh or popular ones due to caching. - -More formal calculations should be done, under certain assumptions about message flow (i.e., that it is constant). - -## Price negotiation - -If the server offers the price that is too high for the client, the client has no means to make a counter-offer. -This results in wasted bandwidth on requests that don't result in responses. 
-We could introduce a price negotiation step in the protocol, where the client and the server would exchange messages naming their acceptable prices until they agree, or one of them decides to stop the negotiation. -We should make sure that price negotiation does not become a DoS vector (i.e., a client initiates a lengthy negotiation but ultimately rejects, wasting the server's resources). - -## Results cross-checking - -> Never go to sea with two chronometers; take one or three. - -The client wants to receive all relevant messages and only them. -Without consensus, it's impossible to check if a message is relevant. -In non security-critical settings, a client may accept the risk that some messages may be missing. -For more certainty, the client may query 3 independent servers and compare the results. -Messages returned by 3/3 or 2/3 are considered relevant. - -The servers' reputation may then be adjusted, but it's not completely obvious how. -Let A, B, C be the three servers. -Imagine there is a message that only A's response contains. -From the client's standpoint, this message is not relevant. - -How should A's reputation be adjusted? -Should A be punished for inserting a fake message into history? -Or should A be rewarded for providing a "rare" message that B and C have either missed or are intentionally censoring? - -## Preventing DoS attacks - -The client can overwhelm the server in at least two ways: -- sending many requests; -- sending a request that covers a very long time frame. - -The former can be prevented with rate limiting: the server would disconnect from such clients. -The latter can be mitigated economically, if the price depends on the length of the requested time frame. - -## Heuristics of relevance - -In the absence of consensus, we can't prove whether a message has indeed been broadcast in the past. -Instead, we use RLN proofs as a proxy metric. 
-RLN (rate limiting nullifiers) is a method of spam prevention in Relay ([RLN-Relay](https://rfc.vac.dev/spec/17/)). -The message sender generates a proof of enrollment in a membership set. -Multiple proofs generated and revealed within one epoch lead to punishment. -A valid RLN proof signifies that the message has been generated by a node with an active membership during a particular epoch. -Note that a malicious node with a valid membership can generate messages but not broadcast them. -Such messages would not be known to other nodes, but they would satisfy the RLN-based heuristic. -We may later look into other ways for the client to check message relevance. - -## Privacy considerations - -Light protocols, in general, have weaker privacy properties than P2P protocols. -In a client-server exchange, a client wants to selectively interact with the network. -By doing so, it often reveals what it is interested in (e.g., subscribes to particular topics). - -A malicious Store server can spy on a client in the following ways: -- track the topics the client is interested in; -- analyze the periods of history interesting for the client; -- analyze the timing of requests; -- link requests made by the same client. - -Citing the [Store specification](https://rfc.vac.dev/spec/13/): -> The main security consideration ... is that a querying node have to reveal their content filters of interest to the queried node, hence potentially compromising their privacy. - -### Service credentials - -Service credentials break the link between paying for the service and the service itself. -Such scheme may be explored in the context of payment methods for higher user privacy. - -In a credential-based scheme: -1. the client deposits funds into an on-chain pool; -2. the client generates a credential that proves the transfer in zero-knowledge; -3. the client sends the credential to the server; -4. the server uses the credential to pull funds from the pool. 
- -Further reading: [one](https://forum.vac.dev/t/vac-sustainability-and-business-workshop/116), [two](https://github.com/vacp2p/research/issues/99), [three](https://github.com/vacp2p/research/issues/135). - -## Relation to long-term storage solutions - -Decentralized file storage networks, such as Codex, could (and perhaps should) be the backend for Store servers. -Alternatives to Codex include IPFS, Filecoin, Sia, and Storj. -We should explore this landscape and understand its relevance for Store i13n. - -## Should message senders pay? - -To ensure protocol sustainability, we should analyze its game theoretic properties. -Note that there are in fact more than two party to the Store protocol: -- the server; -- the client; -- the sender of the message. - -In particular, it is the _sender_ who imposes major costs on the server: the more messages the sender (or, indeed, multiple senders) broadcast, the higher are the Store server's storage costs. -However, it is the Store client who pays for fetching messages. -Is it fair / sustainable that the client pays for costs that the sender causes? -Would it be desired or possible to make the sender pay as well (see [issue](https://github.com/waku-org/research/issues/32))? - -## Generalization for other Waku protocols - -Think about how to generalize Store i13n to other Waku light protocols: Lightpush and Filter. +- Analyze privacy issues - see https://github.com/waku-org/research/issues/61 +- Analyze decentralized storage protocols and their relevance e.g. as back-end storage for Store servers - see https://github.com/waku-org/research/issues/34 +- Analyze the role of message senders, in particular, whether they should pay for sending non-ephemeral messages - see https://github.com/waku-org/research/issues/32 +- Generalize incentivization protocol to other Waku light protocols: Lightpush and Filter. 
\ No newline at end of file From a656e965adb9958d73e04091435e5050824b28f7 Mon Sep 17 00:00:00 2001 From: Sergei Tikhomirov Date: Mon, 20 Nov 2023 11:57:59 +0100 Subject: [PATCH 13/18] finalize MVP-centric version with links to future work issues --- incentivization.md | 111 ++++++++++++++++++++++----------------------- 1 file changed, 53 insertions(+), 58 deletions(-) diff --git a/incentivization.md b/incentivization.md index ac4b1c3..59bfe94 100644 --- a/incentivization.md +++ b/incentivization.md @@ -1,47 +1,36 @@ Waku is a family of decentralized communication protocols. -The Waku network consists of independent nodes running the corresponding protocols. -Waku needs incentivization (i13n) to ensure proper node behavior in the absence of any centralized coordinator. +The Waku Network (TWN) consists of independent nodes running Waku protocols. +TWN needs incentivization (shortened to i13n) to ensure proper node behavior. -In this document, we overview the problem of i13n in decentralized systems. -We classify the possible methods of i13n and give example used in prior successful P2P networks. -We then briefly introduce Waku and outline the unique i13n challenges it presents. +The goal of this document is to outline and contextualize our approach to TWN i13n. +After providing an overview of Waku and relevant prior work, +we focus on Waku Store - a client-server protocol for quer +We then introduce a minimal viable addition to Store to enable i13n, and list research directions for future work. -We then go into more detail into one of the Waku's protocols, Store, responsible for archival storage. -We propose an i13n scheme for Store and implement an MVP solution. -We discuss the choices we have made for the MVP version, and what design options may be considered in the future. 
- -# Classification of i13n tools +# Incentivization in decentralized networks +## Incentivization tools We can think of incentivization tools as a two-by-two matrix: - rewards vs punishment; - monetary vs reputation. In other words, there are four quadrants: -- monetary reward: the client pays the server; -- monetary punishment: the server makes a deposit and gets slashed if it misbehaves; -- reputation reward: the server's reputation increases if it behaves well; -- reputation punishment: the server's reputation decreases if it behaves badly. +- monetary reward: the node gets rewarded; +- monetary punishment: the nodes deposits funds that are taken away (slashed) if it misbehaves; +- reputation reward: the node's reputation increases if it behaves well; +- reputation punishment: the node's reputation decreases if it behaves badly. -Reputation can only work if there are tangible benefits of having a high reputation. -For example, clients should be more likely to connect to servers with high reputation and disconnect from servers with low reputation. -In the presence of monetary rewards, low-reputation servers miss out on potential revenue or lose their deposit. -Without the monetary aspects, low-reputation nodes can't get as much benefit from the network. -Reputation either assumes a repeated interaction (i.e., local reputation), or some amount of trust (centrally managed rankings). +Reputation only works if high reputation brings tangible benefits. +For example, if nodes chose neighbors based on reputation, low-reputation nodes may miss out on potential revenue. +Reputation scores may be local (a node assigns scores to its neighbors) or global (each node gets a uniform score). +Global reputation in its simplest implementation involves a trusted third party, +although decentralized approaches are also possible. -Monetary motivation should ideally be atomically linked with performance. 
-If the client pays first, the server cannot deny service, -and if the client pays after the fact, it's impossible to default on this obligation. +## Prior work -In blockchains, the desired behavior of miners or validators can be automatically verified and rewarded with native tokens (or punished by slashing). -Enforcing atomicity in decentralized data-focused networks is challenging: -it is non-trivial to prove that a certain piece of data was sent or received. -Therefore, such cases may warrant a combination of monetary and reputation-based approaches. +We may split incentivized decentralized networks into early file-sharing, blockchains, and decentralized storage. -# Related work - -There have been many example of incentivized decentralized systems. - -## Early P2P file-sharing +### Early P2P file-sharing Early P2P file-sharing networks employed reputation-based approaches and sticky defaults. For instance, in BitTorrent, a peer by default shares pieces of a file before having received it in whole. @@ -49,46 +38,52 @@ At the same time, the bandwidth that a peer can use depends on how much is has s This policy rewards nodes who share by allowing them to download file faster. While this reward is not monetary, it has proven to be working in practice. -## Blockchains +### Blockchains -The key innovation of Bitcoin, inherited and built upon by later blockchains, is native monetary i13. -In Bitcoin, miners create new blocks and are automatically rewarded with newly mined coins. -An invalid block is rejected by other nodes and not rewarded. -There are no intrinsic monetary punishments in Bitcoin, only rewards. -However, mining nodes are required to expend physical resources for block generation. +Bitcoin has introduced native monetary i13n in a P2P network with proof-of-work (PoW). +PoW miners are automatically rewarded with newly mined coins for generating blocks. +There are no intrinsic monetary punishments in Bitcoin. 
+However, miners must expend physical resources before claiming the reward. +Proof-of-stake (PoS) algorithms introduce intrinsic monetary punishments. +PoS validators lock up (stake) native tokens to get rewarded for validating blocks or slashed for misbehavior. -Proof-of-stake algorithms introduce intrinsic monetary punishments in the blockchain context. -A validator locks up (stakes) native tokens and gets rewarded for validating new blocks and slashed for misbehavior. +### Decentralized storage -## Decentralized storage - -Decentralized storage networks, including Codex, Storj, Sia, Filecoin, IPFS, combine the techniques from early P2P file-sharing and blockchain-inspired reward mechanisms to incentivize nodes to store data. +Post-Bitcoin decentralized storage networks include Codex, Storj, Sia, Filecoin, IPFS. +Their i13n methods combine techniques from early P2P file-sharing with blockchain-inspired reward mechanisms. # Waku background Waku is a family of protocols (see [architecture](https://waku.org/about/architect)) for a modular decentralized censorship-resistant P2P communications network. The backbone of Waku is the Relay protocol (and its spam-protected version [RLN-Relay](https://rfc.vac.dev/spec/17/)). -Additionally, there are three light (or client-server, or request-response) protocols: Filter, Store, and Lightpush. +Additionally, there are light protocols: Filter, Store, and Lightpush. +Light protocols are also referred to as client-server protocols and request-response protocols. +A server is a node running Relay and Store (server-side). +A client is a node running a client-side of any of the light protocols as a light node or a client. +A server may sometimes be referred to as a full node, and a client as a light node. There is no strict definition of a full node vs a light node in Waku (see [discussion](https://github.com/waku-org/research/issues/28)). 
-In this document, we refer to a node that is running Relay and Store (server-side) as a full node or a server, and to a node that is running a client-side of any of the light protocols as a light node or a client. -In light protocols, a client sends a request to a server. -A server (a Relay node) performs some actions and returns a response, in particular: +In light protocols, a client sends a request to a server, and a server performs some actions and returns a response: - [[Filter]]: the server will relay (only) messages that pass a filter to the client; -- [[Store]]: the server responds with messages that had been broadcast within the specified time frame; +- [[Store]]: the server responds with messages broadcast earlier within the specified time frame; - [[Lightpush]]: the server publishes the client's message to the Relay network. ## Waku i13n challenges -As a communication protocol, Waku lacks consensus or a native token. -These properties bring Waku closer to purely reputation-incentivized file-sharing systems. -Our goal nevertheless is to combine monetary and reputation-based incentives. -The rationale is that monetary incentives have demonstrated their robustness in blockchains, -and are well-suited for a network designed to scale well beyond the initial phase when it's mainly maintained by enthusiasts for altruistic reasons. -Currently, Waku only operates under reputation-based rewards and punishments. +Waku lacks consensus or a native token, which brings it closer to reputation-incentivized file-sharing systems. +Indeed, currently Waku only operates under reputation-based rewards and punishments. While [RLN-Relay](https://rfc.vac.dev/spec/17/) adds monetary punishments for spammers, slashing is yet to be activated. +Monetary rewards and punishments should ideally be atomically linked with performance. +A benefit of blockchains in this respect is that the desired behavior of miners or validators can be verified on-chain. 
+Enforcing atomicity in decentralized data-focused networks is more challenging: +it is non-trivial to prove that a certain piece of data was sent or received. + +Our goal is to combine monetary and reputation-based incentives for Waku. +Monetary incentives have demonstrated their robustness in blockchains. +We think they are necessary to scale the network beyond the initial phase when it's maintained altruistically. + ## Waku Store In this document, we focus on i13n for Waku Store. @@ -108,8 +103,6 @@ The desired functionality of Store can be described as following: # Waku Store incentivization MVP -In this section, we aim to define the simplest viable i13n modification to the Store protocol. - We propose to add the following aspects to the protocol: 1. pricing: 1. cost calculation @@ -121,8 +114,10 @@ We propose to add the following aspects to the protocol: 3. reputation 4. results cross-checking -The MVP version of the protocol has no price advertisement, no price negotiation, and no results cross-checking. +In this document, we define the simplest viable i13n modification to the Store protocol (MVP). +The MVP protocol has no price advertisement, no price negotiation, and no results cross-checking. Other elements are present in a minimal version. +In further subsections, we list the potential direction for future work towards a fully-fledged i13n protocol. ## Pricing @@ -144,7 +139,7 @@ After the client sends a `HistoryQuery` to the server: ## Payment If the client agrees to the price, it sends a _proof of payment_ to the server. -For the MVP, each request is paid for with a separate transaction. +For the MVP, each request is paid for with a separate blockchain transaction. The transaction hash (`txid`) acts as a proof of payment. Note that client gives proof of payment before it receives the response. 
@@ -194,7 +189,7 @@ While rate limiting stops such attack, the server would need to link requests co ## Results cross-checking Cross-checking is absent in MVP but should be considered later. -We can separate it into two questions: the client want to ensure that servers are a) not censoring real messages; b) not injecting fake messages into history. +We can separate it into two tasks for the client: ensure that servers are a) not censoring real messages; b) not injecting fake messages into history. - Cross-checking the results against censorship - see https://github.com/waku-org/research/issues/57 - Use RLN to limit fake message insertion - see https://github.com/waku-org/research/issues/38 From 36ba5dacff516165592e81dbd018a97823a57605 Mon Sep 17 00:00:00 2001 From: Sergei Tikhomirov Date: Thu, 23 Nov 2023 12:49:29 +0100 Subject: [PATCH 14/18] clarify MVP pricing; generalize reputation scores; minor edits --- incentivization.md | 128 ++++++++++++++++++++++++++------------------- 1 file changed, 74 insertions(+), 54 deletions(-) diff --git a/incentivization.md b/incentivization.md index 59bfe94..0b83208 100644 --- a/incentivization.md +++ b/incentivization.md @@ -4,8 +4,8 @@ TWN needs incentivization (shortened to i13n) to ensure proper node behavior. The goal of this document is to outline and contextualize our approach to TWN i13n. After providing an overview of Waku and relevant prior work, -we focus on Waku Store - a client-server protocol for quer -We then introduce a minimal viable addition to Store to enable i13n, and list research directions for future work. +we focus on Waku Store - a client-server protocol for querying historical messages. +We introduce a minimal viable addition to Store to enable i13n, and list research directions for future work. # Incentivization in decentralized networks ## Incentivization tools @@ -21,9 +21,9 @@ In other words, there are four quadrants: - reputation punishment: the node's reputation decreases if it behaves badly. 
Reputation only works if high reputation brings tangible benefits. -For example, if nodes chose neighbors based on reputation, low-reputation nodes may miss out on potential revenue. +For example, if nodes chose neighbors based on reputation, low-reputation nodes miss out on potential revenue. Reputation scores may be local (a node assigns scores to its neighbors) or global (each node gets a uniform score). -Global reputation in its simplest implementation involves a trusted third party, +Global reputation in its simplest form involves a trusted third party, although decentralized approaches are also possible. ## Prior work @@ -32,20 +32,20 @@ We may split incentivized decentralized networks into early file-sharing, blockc ### Early P2P file-sharing -Early P2P file-sharing networks employed reputation-based approaches and sticky defaults. -For instance, in BitTorrent, a peer by default shares pieces of a file before having received it in whole. -At the same time, the bandwidth that a peer can use depends on how much is has shared previously. -This policy rewards nodes who share by allowing them to download file faster. -While this reward is not monetary, it has proven to be working in practice. +Early P2P file-sharing networks employ reputation-based approaches and sticky defaults. +For instance, the BitTorrent protocol rewards uploading peers with faster downloads. +The download bandwidth available to a peer depends on how much it has uploaded. +Moreover, peers share pieces of a file before having received it in whole. +This non-monetary i13n policy has been proved to work in practice. ### Blockchains -Bitcoin has introduced native monetary i13n in a P2P network with proof-of-work (PoW). -PoW miners are automatically rewarded with newly mined coins for generating blocks. +Bitcoin has introduced proof-of-work (PoW) for native monetary rewards in a P2P network. +PoW miners are automatically assigned newly mined coins for generating blocks. 
There are no intrinsic monetary punishments in Bitcoin. However, miners must expend physical resources before claiming the reward. -Proof-of-stake (PoS) algorithms introduce intrinsic monetary punishments. -PoS validators lock up (stake) native tokens to get rewarded for validating blocks or slashed for misbehavior. +Proof-of-stake (PoS), used in Ethereum and many other cryptocurrencies, introduces intrinsic monetary punishments. +PoS validators lock up (stake) native tokens and get rewarded for validating blocks or slashed for misbehavior. ### Decentralized storage @@ -54,31 +54,31 @@ Their i13n methods combine techniques from early P2P file-sharing with blockchai # Waku background -Waku is a family of protocols (see [architecture](https://waku.org/about/architect)) for a modular decentralized censorship-resistant P2P communications network. +Waku is a family of protocols (see [architecture](https://waku.org/about/architect)) for a modular privacy-preserving censorship-resistant decentralized communications network. The backbone of Waku is the Relay protocol (and its spam-protected version [RLN-Relay](https://rfc.vac.dev/spec/17/)). Additionally, there are light protocols: Filter, Store, and Lightpush. Light protocols are also referred to as client-server protocols and request-response protocols. -A server is a node running Relay and Store (server-side). -A client is a node running a client-side of any of the light protocols as a light node or a client. +A server is a node running Relay and a server-side of at least one light protocol. +A client is a node running a client-side of any of the light protocols. A server may sometimes be referred to as a full node, and a client as a light node. There is no strict definition of a full node vs a light node in Waku (see [discussion](https://github.com/waku-org/research/issues/28)). 
In light protocols, a client sends a request to a server, and a server performs some actions and returns a response: - [[Filter]]: the server will relay (only) messages that pass a filter to the client; -- [[Store]]: the server responds with messages broadcast earlier within the specified time frame; +- [[Store]]: the server responds with messages relayed within the specified earlier time frame; - [[Lightpush]]: the server publishes the client's message to the Relay network. ## Waku i13n challenges Waku lacks consensus or a native token, which brings it closer to reputation-incentivized file-sharing systems. -Indeed, currently Waku only operates under reputation-based rewards and punishments. +As of late 2023, Waku only operates under reputation-based rewards and punishments. While [RLN-Relay](https://rfc.vac.dev/spec/17/) adds monetary punishments for spammers, slashing is yet to be activated. Monetary rewards and punishments should ideally be atomically linked with performance. A benefit of blockchains in this respect is that the desired behavior of miners or validators can be verified on-chain. Enforcing atomicity in decentralized data-focused networks is more challenging: -it is non-trivial to prove that a certain piece of data was sent or received. +it is non-trivial to prove that a certain piece of data has been relayed. Our goal is to combine monetary and reputation-based incentives for Waku. Monetary incentives have demonstrated their robustness in blockchains. @@ -86,24 +86,19 @@ We think they are necessary to scale the network beyond the initial phase when i ## Waku Store -In this document, we focus on i13n for Waku Store. - -Store is a client-server protocol that currently works as follows: +Waku Store is a light protocol for querying historic messages. +It currently works as follows: 1. the client sends a `HistoryQuery` to the server; 2. the server sends a `HistoryResponse` to the client. 
The response may be split into multiple parts, as specified by pagination parameters in `PagingInfo`. -Let us define a relevant message as a message that has been broadcast via Relay within the time frame and matching the filter criteria that the client specified. -The desired functionality of Store can be described as following: +We define a _relevant_ message as a message that matches a client-defined filter (e.g., it has been relayed within a specified time frame). +Ideally, after receiving a request, a server should quickly send back a response containing all relevant messages and only them. -- the server responds quickly; -- all relevant messages are in the response; -- the response contains only relevant messages. +# Waku Store incentivization -# Waku Store incentivization MVP - -We propose to add the following aspects to the protocol: +An incentivized Store protocol has the following extra steps: 1. pricing: 1. cost calculation 2. price advertisement @@ -114,20 +109,31 @@ We propose to add the following aspects to the protocol: 3. reputation 4. results cross-checking -In this document, we define the simplest viable i13n modification to the Store protocol (MVP). -The MVP protocol has no price advertisement, no price negotiation, and no results cross-checking. -Other elements are present in a minimal version. -In further subsections, we list the potential direction for future work towards a fully-fledged i13n protocol. +In this document, we focus on the simplest viable i13n for Store (MVP). +Compared to the fully-fledged protocol, the MVP version is simplified in the following ways: +- cost calculation is based on a common-knowledge price; +- there is no price advertisement and no price negotiation; +- each query is paid for in a separate transaction, `txid` acts a proof of payment; +- the reputation system is simplified (see below); +- there is no results cross-checking. + +In the MVP protocol: +1. 
the client calculates the price based on the known rate per hour of history; +2. the client pays the appropriate amount to the server's address; +3. the client sends a `HistoryQuery` to the server alongside the proof of payment (`txid`); +4. the server checks that the `txid` corresponds to a confirmed transaction with at least the required amount; +5. the server sends a `HistoryResponse` to the client. + +In further subsections, we list the potential direction for future work towards a fully-fledged i13n mechanism. ## Pricing For MVP, we assume a constant price per hour of history. -After the client sends a `HistoryQuery` to the server: -1. The server internally calculates the offer price and sends it to the client. -2. If the client agrees, it pays and sends a proof of payment to the server. -3. If the client does not agree, it sends a rejection message to the server. -4. If the server receives a valid payment before a certain timeout, it sends the response to the client. -5. If the server receives a rejection message, or receives no message before a timeout, the server assumes that the client has rejected the offer. +This price and the blockchain address of the server are assumed to be common knowledge. +This simplifies the client-server interaction, avoiding the price negotiation step. + +In the future versions of the protocol, the price will be negotiated and will depend on multiple parameters, +such as the total size of the relevant messages in the response. ### Future work @@ -138,44 +144,55 @@ After the client sends a `HistoryQuery` to the server: ## Payment -If the client agrees to the price, it sends a _proof of payment_ to the server. -For the MVP, each request is paid for with a separate blockchain transaction. +For the MVP, each request is paid for with a separate transaction. The transaction hash (`txid`) acts as a proof of payment. +The server verifies the payment by ensuring that: +1. the transaction has been confirmed; +2. 
the transaction is paying the proper amount to the server's account; +3. the `txid` does not correspond to any prior response. -Note that client gives proof of payment before it receives the response. +The client gives proof of payment before it receives the response. Other options could be: 1. the client pays after the fact; 2. the client pays partly upfront and partly after the fact; -3. an escrow (a centralized trusted third party, or a semi-trusted entity like a smart contract) ensures atomicity . +3. a centralized third party (either trusted or semi-trusted, like a smart contract) ensures atomicity; +4. cryptographically ensured atomicity (similar to atomic swaps, Lightning, or Hopr). Our design considerations are: - the MVP protocol should be simple; - servers are more "permanent" entities and are more likely to have long-lived identities; - it is more important to protect the clients's privacy than the server's privacy. -In light of these criteria, we suggest that the client pays first: this is simpler than splitting the payment, more secure than trusting a third party, and (arguably) more privacy-preserving for the client than the alternative where the client pays after the fact (that would encourage servers to deanonymize clients to prevent fraud). +In light of these criteria, we suggest that the client pays first. +This is simpler than splitting the payment, or involving an extra atomicity-enforcing mechanism. +Moreover, pre-payment is arguably more privacy-preserving than post-payment, which encourages servers to deanonymize clients to prevent fraud. 
### Future work - Add more payment methods - see https://github.com/waku-org/research/issues/58 - Design a subscription model with service credentials - see https://github.com/waku-org/research/issues/59 - Add privacy to service credentials - see https://github.com/waku-org/research/issues/60 +- Consider the impact of network disruptions - see https://github.com/waku-org/research/issues/65 ## Reputation We use reputation to discourage the server from taking the payment and not responding. The client keeps track of the server's reputation: -- all servers start with zero reputation; -- if the server honors the request, it gets +1; -- if the server does not respond before the payment, it gets -1; -- if the server does not respond after the payment and before a timeout, the client will never query it again. +- all servers start with zero reputation points; +- if the server honors the request, it gets `+n` points; +- if the server does not respond before a timeout, it gets `-m` points. +- if the server's reputation drops below `k` points, the client will never query it again. + +`n`, `m`, and `k` are subject to configuration. + +Optionally, a client may treat a given server as trusted, assigning it a constant positive reputation. Potential issues: - An attacker can establish new server identities and continue running away with clients' money. Countermeasures: - - a client only queries "trusted" servers (which however leads to centralization); - - when querying a new server, a client first sends a small (i.e. cheap) request as a test. + - a client only queries trusted servers (which however leads to centralization); + - when querying a new server, a client first sends a small (i.e. cheap) request as a test; + - more generally, the client selects a server on a case-by-case basis, weighing the payment amount against the server's reputation. - The ban mechanism can theoretically be abused. 
For instance, a competitor may attack the victim server and cause the clients who were awaiting the response to ban that server. Countermeasure: prevent DoS-attacks. -- Servers may also farm reputation by running clients and querying their own server. ### Future work @@ -188,8 +205,11 @@ While rate limiting stops such attack, the server would need to link requests co ## Results cross-checking -Cross-checking is absent in MVP but should be considered later. -We can separate it into two tasks for the client: ensure that servers are a) not censoring real messages; b) not injecting fake messages into history. +As there is no consensus over past messages, a client may want to query multiple servers and merge their responses. +Cross-checking helps ensure that servers are a) not censoring real messages; b) not injecting fake messages into history. +Cross-checking is absent in MVP but may be considered later. + +### Future work - Cross-checking the results against censorship - see https://github.com/waku-org/research/issues/57 - Use RLN to limit fake message insertion - see https://github.com/waku-org/research/issues/38 From de49c9ea941b06a68a3fe483845a373387675ead Mon Sep 17 00:00:00 2001 From: Sergei Tikhomirov Date: Thu, 23 Nov 2023 19:28:41 +0100 Subject: [PATCH 15/18] MVP -> PoC for consistency with other docs --- incentivization.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/incentivization.md b/incentivization.md index 0b83208..57597a6 100644 --- a/incentivization.md +++ b/incentivization.md @@ -109,15 +109,15 @@ An incentivized Store protocol has the following extra steps: 3. reputation 4. results cross-checking -In this document, we focus on the simplest viable i13n for Store (MVP). -Compared to the fully-fledged protocol, the MVP version is simplified in the following ways: +In this document, we focus on the simplest proof-of-concept i13n for Store (PoC). 
+Compared to the fully-fledged protocol, the PoC version is simplified in the following ways: - cost calculation is based on a common-knowledge price; - there is no price advertisement and no price negotiation; - each query is paid for in a separate transaction, `txid` acts a proof of payment; - the reputation system is simplified (see below); - there is no results cross-checking. -In the MVP protocol: +In the PoC protocol: 1. the client calculates the price based on the known rate per hour of history; 2. the client pays the appropriate amount to the server's address; 3. the client sends a `HistoryQuery` to the server alongside the proof of payment (`txid`); @@ -128,7 +128,7 @@ In further subsections, we list the potential direction for future work towards ## Pricing -For MVP, we assume a constant price per hour of history. +For PoC, we assume a constant price per hour of history. This price and the blockchain address of the server are assumed to be common knowledge. This simplifies the client-server interaction, avoiding the price negotiation step. @@ -144,7 +144,7 @@ such as the total size of the relevant messages in the response. ## Payment -For the MVP, each request is paid for with a separate transaction. +For the PoC, each request is paid for with a separate transaction. The transaction hash (`txid`) acts as a proof of payment. The server verifies the payment by ensuring that: 1. the transaction has been confirmed; @@ -159,7 +159,7 @@ Other options could be: 4. cryptographically ensured atomicity (similar to atomic swaps, Lightning, or Hopr). Our design considerations are: -- the MVP protocol should be simple; +- the PoC protocol should be simple; - servers are more "permanent" entities and are more likely to have long-lived identities; - it is more important to protect the clients's privacy than the server's privacy. 
@@ -207,7 +207,7 @@ While rate limiting stops such attack, the server would need to link requests co As there is no consensus over past messages, a client may want to query multiple servers and merge their responses. Cross-checking helps ensure that servers are a) not censoring real messages; b) not injecting fake messages into history. -Cross-checking is absent in MVP but may be considered later. +Cross-checking is absent in PoC but may be considered later. ### Future work From 63fbc7c3cd1bbf755ff78f399735d734c63fe326 Mon Sep 17 00:00:00 2001 From: Sergei Tikhomirov Date: Fri, 24 Nov 2023 11:52:49 +0100 Subject: [PATCH 16/18] clarify: requests are not only time-based Co-authored-by: fryorcraken <110212804+fryorcraken@users.noreply.github.com> --- incentivization.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/incentivization.md b/incentivization.md index 57597a6..ace1596 100644 --- a/incentivization.md +++ b/incentivization.md @@ -66,7 +66,7 @@ There is no strict definition of a full node vs a light node in Waku (see [discu In light protocols, a client sends a request to a server, and a server performs some actions and returns a response: - [[Filter]]: the server will relay (only) messages that pass a filter to the client; -- [[Store]]: the server responds with messages relayed within the specified earlier time frame; +- [[Store]]: the server responds with messages relayed that matches a set of criteria - [[Lightpush]]: the server publishes the client's message to the Relay network. 
## Waku i13n challenges From 1566cff17d9c353ac8795f535ba7ba4449da5b8e Mon Sep 17 00:00:00 2001 From: Sergei Tikhomirov Date: Fri, 24 Nov 2023 13:05:54 +0100 Subject: [PATCH 17/18] minor edits --- incentivization.md | 43 ++++++++++++++++++++----------------------- 1 file changed, 20 insertions(+), 23 deletions(-) diff --git a/incentivization.md b/incentivization.md index ace1596..dbe06af 100644 --- a/incentivization.md +++ b/incentivization.md @@ -5,7 +5,8 @@ TWN needs incentivization (shortened to i13n) to ensure proper node behavior. The goal of this document is to outline and contextualize our approach to TWN i13n. After providing an overview of Waku and relevant prior work, we focus on Waku Store - a client-server protocol for querying historical messages. -We introduce a minimal viable addition to Store to enable i13n, and list research directions for future work. +We introduce a minimal viable addition to Store to enable i13n, +and list research directions for future work. # Incentivization in decentralized networks ## Incentivization tools @@ -42,9 +43,9 @@ This non-monetary i13n policy has been proved to work in practice. Bitcoin has introduced proof-of-work (PoW) for native monetary rewards in a P2P network. PoW miners are automatically assigned newly mined coins for generating blocks. -There are no intrinsic monetary punishments in Bitcoin. -However, miners must expend physical resources before claiming the reward. -Proof-of-stake (PoS), used in Ethereum and many other cryptocurrencies, introduces intrinsic monetary punishments. +Miners must expend physical resources to generate a block. +If the block is invalid, these expenses are not compensated (implicit monetary punishment). +Proof-of-stake (PoS), used in Ethereum and many other cryptocurrencies, introduces explicit monetary punishments. PoS validators lock up (stake) native tokens and get rewarded for validating blocks or slashed for misbehavior. 
### Decentralized storage @@ -54,9 +55,9 @@ Their i13n methods combine techniques from early P2P file-sharing with blockchai # Waku background -Waku is a family of protocols (see [architecture](https://waku.org/about/architect)) for a modular privacy-preserving censorship-resistant decentralized communications network. +Waku is a [family of protocols](https://waku.org/about/architect) for a modular privacy-preserving censorship-resistant decentralized communication network. The backbone of Waku is the Relay protocol (and its spam-protected version [RLN-Relay](https://rfc.vac.dev/spec/17/)). -Additionally, there are light protocols: Filter, Store, and Lightpush. +Additionally, there are light protocols: Store, Filter, and Lightpush. Light protocols are also referred to as client-server protocols and request-response protocols. A server is a node running Relay and a server-side of at least one light protocol. @@ -65,20 +66,20 @@ A server may sometimes be referred to as a full node, and a client as a light no There is no strict definition of a full node vs a light node in Waku (see [discussion](https://github.com/waku-org/research/issues/28)). In light protocols, a client sends a request to a server, and a server performs some actions and returns a response: +- [[Store]]: the server responds with messages relayed that match a set of criteria; - [[Filter]]: the server will relay (only) messages that pass a filter to the client; -- [[Store]]: the server responds with messages relayed that matches a set of criteria - [[Lightpush]]: the server publishes the client's message to the Relay network. ## Waku i13n challenges -Waku lacks consensus or a native token, which brings it closer to reputation-incentivized file-sharing systems. +Waku has no consensus and no native token, which brings it closer to reputation-incentivized file-sharing networks. As of late 2023, Waku only operates under reputation-based rewards and punishments. 
While [RLN-Relay](https://rfc.vac.dev/spec/17/) adds monetary punishments for spammers, slashing is yet to be activated. -Monetary rewards and punishments should ideally be atomically linked with performance. -A benefit of blockchains in this respect is that the desired behavior of miners or validators can be verified on-chain. -Enforcing atomicity in decentralized data-focused networks is more challenging: -it is non-trivial to prove that a certain piece of data has been relayed. +Monetary rewards and punishments should ideally be atomically linked with the node's behavior. +A benefit of blockchains in this respect is that the desired behavior of miners or validators can be verified automatically. +Enforcing atomicity in a communication network is more challenging: +it is non-trivial to prove that a given piece of data has been relayed. Our goal is to combine monetary and reputation-based incentives for Waku. Monetary incentives have demonstrated their robustness in blockchains. @@ -86,15 +87,14 @@ We think they are necessary to scale the network beyond the initial phase when i ## Waku Store -Waku Store is a light protocol for querying historic messages. -It currently works as follows: +Waku Store is a light protocol for querying historic messages that works as follows: 1. the client sends a `HistoryQuery` to the server; 2. the server sends a `HistoryResponse` to the client. The response may be split into multiple parts, as specified by pagination parameters in `PagingInfo`. -We define a _relevant_ message as a message that matches a client-defined filter (e.g., it has been relayed within a specified time frame). -Ideally, after receiving a request, a server should quickly send back a response containing all relevant messages and only them. +We define a _relevant_ message as a message that matches client-defined criteria (e.g., relayed within a given time frame). 
+Upon receiving a request, a server should quickly send back a response containing all and only relevant messages. # Waku Store incentivization @@ -109,13 +109,13 @@ An incentivized Store protocol has the following extra steps: 3. reputation 4. results cross-checking -In this document, we focus on the simplest proof-of-concept i13n for Store (PoC). +In this document, we focus on the simplest proof-of-concept (PoC) i13n for Store. Compared to the fully-fledged protocol, the PoC version is simplified in the following ways: - cost calculation is based on a common-knowledge price; - there is no price advertisement and no price negotiation; - each query is paid for in a separate transaction, `txid` acts a proof of payment; - the reputation system is simplified (see below); -- there is no results cross-checking. +- the results are not cross-checked. In the PoC protocol: 1. the client calculates the price based on the known rate per hour of history; @@ -137,7 +137,7 @@ such as the total size of the relevant messages in the response. ### Future work -- DoS protection: a client can overwhelm a server with requests and not proceed to payment. Countermeasure: ignore requests from the same client if they come too often; generalize a reputation system to servers ranking clients. +- DoS protection - see https://github.com/waku-org/research/issues/66 - Cost calculation - see https://github.com/waku-org/research/issues/35 - Price advertisement - see https://github.com/waku-org/research/issues/51 - Price negotiation - see https://github.com/waku-org/research/issues/52 @@ -200,9 +200,6 @@ Design a more comprehensive reputation system: - local reputation - see https://github.com/waku-org/research/issues/48 - global reputation - see https://github.com/waku-org/research/issues/49 -Reputation may also be use to rank clients to prevent DoS attacks when a client overwhelms the server with requests. 
-While rate limiting stops such attack, the server would need to link requests coming from one client, threatening its privacy. - ## Results cross-checking As there is no consensus over past messages, a client may want to query multiple servers and merge their responses. @@ -223,4 +220,4 @@ We should think about what the success metrics for an incentivized protocol are, - Analyze privacy issues - see https://github.com/waku-org/research/issues/61 - Analyze decentralized storage protocols and their relevance e.g. as back-end storage for Store servers - see https://github.com/waku-org/research/issues/34 - Analyze the role of message senders, in particular, whether they should pay for sending non-ephemeral messages - see https://github.com/waku-org/research/issues/32 -- Generalize incentivization protocol to other Waku light protocols: Lightpush and Filter. \ No newline at end of file +- Generalize incentivization protocol to other Waku light protocols (Lightpush and Filter) - see https://github.com/waku-org/research/issues/67. \ No newline at end of file From d3313d67db0459048e0df1ece8043bb78fdb81a8 Mon Sep 17 00:00:00 2001 From: Sergei Tikhomirov Date: Fri, 24 Nov 2023 13:09:16 +0100 Subject: [PATCH 18/18] add links to RFCs of light protocols --- incentivization.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/incentivization.md b/incentivization.md index dbe06af..2749bdc 100644 --- a/incentivization.md +++ b/incentivization.md @@ -66,9 +66,9 @@ A server may sometimes be referred to as a full node, and a client as a light no There is no strict definition of a full node vs a light node in Waku (see [discussion](https://github.com/waku-org/research/issues/28)). 
In light protocols, a client sends a request to a server, and a server performs some actions and returns a response: -- [[Store]]: the server responds with messages relayed that match a set of criteria; -- [[Filter]]: the server will relay (only) messages that pass a filter to the client; -- [[Lightpush]]: the server publishes the client's message to the Relay network. +- [Store](https://rfc.vac.dev/spec/13/): the server responds with messages relayed that match a set of criteria; +- [Filter](https://rfc.vac.dev/spec/12/): the server will relay (only) messages that pass a filter to the client; +- [Lightpush](https://rfc.vac.dev/spec/19/): the server publishes the client's message to the Relay network. ## Waku i13n challenges
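The PoC flow described in the patched document (the client computes a price from a known rate per hour of history, pays the server's address, and sends a `HistoryQuery` together with the `txid`) can be sketched roughly as follows. This is a minimal illustration, not the Waku implementation. `PRICE_PER_HOUR`, `SERVER_ADDRESS`, `Transaction`, `price_for`, and `StoreServer` are hypothetical names. Rounding up to a started hour, checking the recipient and the amount, and rejecting a reused `txid` are assumptions; the text shown here only states that the transaction must be confirmed.

```python
from dataclasses import dataclass

# Hypothetical values: the document only says that the price and the
# server's address are common knowledge, not what they are.
PRICE_PER_HOUR = 10               # smallest unit of some payment token
SERVER_ADDRESS = "server-address"


@dataclass(frozen=True)
class Transaction:
    """Toy stand-in for an on-chain transaction (not a real blockchain API)."""
    txid: str
    recipient: str
    amount: int
    confirmed: bool


def price_for(start_ts: int, end_ts: int) -> int:
    """Price of querying [start_ts, end_ts) in seconds.

    Assumption: every started hour is billed in full.
    """
    duration = end_ts - start_ts
    if duration <= 0:
        raise ValueError("empty or inverted time frame")
    hours = (duration + 3599) // 3600  # integer ceiling division
    return hours * PRICE_PER_HOUR


class StoreServer:
    """Server-side payment check performed before answering a query."""

    def __init__(self, ledger: dict[str, Transaction]):
        self.ledger = ledger                 # txid -> Transaction (toy ledger)
        self.used_txids: set[str] = set()    # assumption: one txid pays for one query

    def verify_payment(self, txid: str, start_ts: int, end_ts: int) -> bool:
        tx = self.ledger.get(txid)
        if tx is None or not tx.confirmed:
            return False                     # the transaction must be confirmed
        if tx.recipient != SERVER_ADDRESS:
            return False                     # assumption: paid to this server
        if tx.amount < price_for(start_ts, end_ts):
            return False                     # assumption: amount covers the price
        if txid in self.used_txids:
            return False                     # assumption: no reuse of a txid
        self.used_txids.add(txid)
        return True
```

In this sketch a client would call `price_for` for the requested time frame, pay at least that amount, and attach the resulting `txid` to its query. The server would run `verify_payment` before building the `HistoryResponse`.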