Return 2315 to draft status (#3453)

* Return to Draft

  Not ready for London review, cleaned up for further consideration, if any.

* Update eip-2315.md
2025-02-23 12:18:16 +00:00 · 2021-03-31 21:07:21 -04:00 · 2021-03-31 21:07:21 -04:00 · c65abe153d
commit c65abe153d
parent 4f6c9eea29
1 changed files with 37 additions and 47 deletions
--- a/EIPS/eip-2315.md
+++ b/EIPS/eip-2315.md
@@ -1,7 +1,7 @@
 ---
 eip: 2315
 title: Simple Subroutines for the EVM
-status: Review
+status: Draft
 type: Standards Track
 category: Core
 author: Greg Colvin <greg@colvin.org>, Martin Holst Swende (@holiman)
@@ -9,19 +9,23 @@ discussions-to: https://ethereum-magicians.org/t/eip-2315-simple-subroutines-for
 created: 2019-10-17
 ---
 ## Simple Summary
 (Almost) the smallest possible change that provides native subroutines without breaking backwards compatibility.
 ## Abstract
-This proposal introduces three opcodes to support subroutines: `BEGINSUB`, `JUMPSUB` and `RETURNSUB`.
+This proposal introduces three opcodes to support subroutines: `BEGINSUB`, `JUMPSUB` and `RETURNSUB`.  (The smallest possible change would do without `BEGINSUB`.)
 Safety and amenability to static analysis equivalent to  [EIP-615](https://eips.ethereum.org/EIPS/eip-615) can be ensured by enforcing a few simple rules, and validated with the provided algorithm.
 ## Motivation
-The EVM does not provide subroutines as a primitive.  Instead, calls can be synthesized by fetching and pushing the current program counter on the data stack and jumping to the subroutine address; returns can be synthesized by getting the return address to the top of the stack and jumping back to it.  Over the course of 30 years the computer industry struggled with this complexity and cost and settled in on providing primitive operations to directly support subroutines.  These are provided in some form by most all physical and virtual machines going back at least 50 years.
+The EVM does not provide subroutines as a primitive.  Instead, calls can be synthesized by fetching and pushing the current program counter on the data stack and jumping to the subroutine address; returns can be synthesized by getting the return address to the top of the stack and jumping back to it. 
-In whatever form, these operations provide for capturing the current context of execution, transferring control to a new context, and returning to original context.
+Facilities to directly support subroutines are provided in some form by most physical and virtual machines going back at least fifty years.  In whatever form, these operations provide for capturing the current context of execution, transferring control to a new context, and returning to the original context.
-We propose a safe, simple _return-stack_ mechanism, proven to work well for stack machines, which we specify here.  Note that this specification is entirely semantic.  It constrains only stack usage and control flow and imposes no syntax on code beyond being a sequence of bytes to be executed.
+We propose a simple _return-stack_ mechanism, known to work well for stack machines, which we specify here.  Note that this specification is entirely semantic.  It constrains only stack usage and control flow and imposes no syntax on code beyond being a sequence of bytes to be executed.
 In the future, amenability to static analysis equivalent to  [EIP-615](https://eips.ethereum.org/EIPS/eip-615) could be ensured by enforcing a few simple rules, and validated with the provided algorithm, still without imposing syntactic constraints.
 ## Specification
@@ -61,11 +65,11 @@ _Note 3: The description above lays out the semantics of this feature in terms o
 ### Indirect Jumps
-If [EIP-3337 BEGINDATA](https://github.com/ethereum/EIPs/blob/master/EIPS/eip-3337.md) is implemented then the indirect jumps from  [EIP-615](https://eips.ethereum.org/EIPS/eip-615) -- `JUMPV` and `JUMPSUBV` -- can be implemented.  These would take two arguments on the stack: a constant offset relative to `BEGINDATA` to a jump table, and a variable index into that table.  Detailed specifications can await the acceptance of EIP-3337.
+If [EIP-2327: BEGINDATA](https://eips.ethereum.org/EIPS/eip-2327) or similar is implemented then the indirect jumps from  [EIP-615](https://eips.ethereum.org/EIPS/eip-615) -- `JUMPV` and `JUMPSUBV` -- can be implemented.  These could take two arguments on the stack: a constant offset relative to `BEGINDATA` to a jump table, and a variable index into that table.
 ## Rationale
-We modeled this design on the simple, proven, archetypal Forth virtual machine of 1970.  It is a two-stack design -- the data stack is supplemented with a return stack to support jumping into and returning from subroutines, as specified above.  The separate return stack ensures that the return address cannot be overwritten or mislaid, and obviates any need to swap the return address past the arguments on the stack.  Importantly, a dynamic jump is not needed to implement subroutine returns, allowing for deprecation of dynamic uses of JUMP and JUMPI.  Deprecating dynamic jumps is key to practical static analysis of code.
+We modeled this design on the simple, proven, archetypal Forth virtual machine of 1970.  It is a two-stack design -- the data stack is supplemented with a return stack to support jumping into and returning from subroutines, as specified above.  The separate return stack ensures that the return address cannot be overwritten or mislaid, and obviates any need to swap the return address past the arguments on the stack.  Importantly, a dynamic jump is not needed to implement subroutine returns, allowing for deprecation of dynamic uses of JUMP and JUMPI.  Eventually deprecating dynamic jumps is key to practical static analysis of code.
 (JUMPSUB and RETURNSUB were also defined in terms of a `return stack` in [EIP-615](https://eips.ethereum.org/EIPS/eip-615))
 .
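The two-stack design described above can be illustrated with a minimal sketch, not the EIP's reference implementation. It assumes the opcode values suggested by EIP-2315 (`0x5c`, `0x5d`, `0x5e`), uses one-byte data stack items for brevity, and omits gas accounting, the return-stack size limit, and the check that a destination is not inside PUSH data. The function name `run` and the error `errHalt` are hypothetical.

```go
package main

import "errors"

// STOP and PUSH1 are standard EVM opcodes. The three subroutine opcodes
// use the values suggested by EIP-2315 (an assumption of this sketch).
const (
	STOP      = 0x00
	BEGINSUB  = 0x5c
	RETURNSUB = 0x5d
	JUMPSUB   = 0x5e
	PUSH1     = 0x60
)

// errHalt stands in for any exceptional halting state (hypothetical name).
var errHalt = errors.New("exceptional halt")

// run interprets a tiny subset of the EVM with a separate return stack.
// It returns the data stack left behind when execution stops.
func run(code []byte) ([]byte, error) {
	var data []byte // data stack, one-byte items for brevity
	var ret []int   // return stack, holds return addresses only
	pc := 0
	for pc < len(code) {
		switch code[pc] {
		case STOP:
			return data, nil
		case PUSH1:
			if pc+1 >= len(code) {
				return nil, errHalt
			}
			data = append(data, code[pc+1])
			pc += 2
		case JUMPSUB:
			if len(data) == 0 {
				return nil, errHalt // data stack underflow
			}
			dest := int(data[len(data)-1])
			data = data[:len(data)-1]
			if dest >= len(code) || code[dest] != BEGINSUB {
				return nil, errHalt // not a subroutine entry
			}
			ret = append(ret, pc+1) // return address never touches the data stack
			pc = dest + 1
		case RETURNSUB:
			if len(ret) == 0 {
				return nil, errHalt // nothing to return to
			}
			pc = ret[len(ret)-1]
			ret = ret[:len(ret)-1]
		default:
			// Includes walking into a BEGINSUB without a JUMPSUB.
			return nil, errHalt
		}
	}
	return data, nil // running off the end behaves like STOP
}
```

For example, `run([]byte{0x60, 0x04, 0x5e, 0x00, 0x5c, 0x60, 0x2a, 0x5d})` calls the subroutine at offset 4, which pushes `0x2a` and returns, so the caller resumes at the `STOP` without ever swapping a return address past its arguments.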
@@ -73,7 +77,7 @@ We modeled this design on the simple, proven, archetypal Forth virtual machine o
 These changes do not affect the semantics of existing EVM code.
-These changes are compatible with using [EIP-3337](https://github.com/ethereum/EIPs/blob/master/EIPS/eip-3337.md) to provide stack frames, by associating a frame with each subroutine.
+These changes are compatible with using [EIP-3337](https://eips.ethereum.org/EIPS/eip-3337) to provide stack frames, by associating a frame with each subroutine.
 ## Implementations
@@ -106,15 +110,13 @@ We suggest the following opcodes:
 These changes do introduce new flow control instructions, so any software which does static/dynamic analysis of evm-code needs to be modified accordingly. The `JUMPSUB` semantics are similar to `JUMP` (but jumping to a `BEGINSUB`), whereas the `RETURNSUB` instruction is different, since it can 'land' on any opcode (but the possible destinations can be statically inferred).
-The safety and amenability to static analysis of valid programs is equivalent to  [EIP-615](https://eips.ethereum.org/EIPS/eip-615), but without imposing syntactic constraints, and thus with minimal impact on low-level optimizations.  Validity is ensured by the following rules, and programs can be validated with the provided algorithm.
+The safety and amenability to static analysis of valid programs can be made comparable to [EIP-615](https://eips.ethereum.org/EIPS/eip-615), but without imposing syntactic constraints, and thus with minimal impact on low-level optimizations.  Validity can be ensured by following the rules given in the next section, and programs can be validated with the provided algorithm.  The validation algorithm is simple and bounded by the size of the code, allowing for validation at deploy time or at load time.
-As with  [EIP-615](https://eips.ethereum.org/EIPS/eip-615), contract code must be validated at deploy time for contracts created by external transactions.  Unlike EIP-615, backwards compatibility means that no versioning is needed.
+While it is crucial going forward that it be possible to validate programs, this EIP does not propose that validity be enforced.  Note that much value for people doing static analysis (e.g. for proofs that bytecode meets formal specifications of a contract) can be had without enforcement.  Code can be scanned in linear time to ensure that the rules are or are not followed before analysis begins.  And compilers can easily follow the rules up front.
 However, as soon as these rules are enforced compilers that generate dynamic jumps will be broken.  Therefore, in the initial upgrade there should not be any deploy-time validation, though compilers are encouraged to emit only valid code from the start.  A future upgrade will start enforcing the rules once compilers and tools are ready.
 ### Validity
-We would like to consider EVM code valid iff no execution of the program can lead to an exceptional halting state, but we must validate code in linear time.  (More precisely, in time `O(vertices + edges)` in the control-flow graph.) So our validation algorithm does not consider the code’s data and computations, only its control flow and stack use.  This means we will reject programs with any invalid code paths, even if those paths are not reachable at runtime.
+*** Exceptional Halting States
 _Execution_ is as defined in the [Yellow Paper](https://ethereum.github.io/yellowpaper/paper.pdf)—a sequence of changes in the EVM state.  The conditions on valid code are preserved by state changes.  At runtime, if execution of an instruction would violate a condition the execution is in an exceptional halting state.  The Yellow Paper defines five such states.
 1. Insufficient gas
@@ -123,42 +125,35 @@ _Execution_ is as defined in the [Yellow Paper](https://ethereum.github.io/yello
 4. Invalid jump destination
 5. Invalid instruction
-Conditions 1 and 2 -- Insufficient gas and stack overflow, must be checked at runtime.  Conditions 3, 4, and 5 cannot occur if the code conforms to the following rules.
+We would like to consider EVM code valid iff no execution of the program can lead to an exceptional halting state, but we must be able to validate code in linear time to avoid denial-of-service attacks.  So in practice, we can only partially meet these requirements.  Our validation algorithm does not consider the code’s data and computations, only its control flow and stack use.  This means we will reject programs with any invalid code paths, even if those paths are not reachable at runtime.  Further, conditions 1 and 2 (insufficient gas and stack overflow) must in general be checked at runtime.  Conditions 3, 4, and 5 cannot occur if the code conforms to the following rules.
-* `JUMP` and `JUMPI` address only  valid `JUMPDEST` instructions.
+*** The Rules 
 * `JUMPSUB`  addresses only valid `BEGINSUB` instructions.
-Valid instructions are not part of PUSH data.
+1. `JUMP` and `JUMPI` address only valid `JUMPDEST` instructions.
 2. `JUMPSUB` addresses only valid `BEGINSUB` instructions.
 3. `JUMP`, `JUMPI` and `JUMPSUB` are always preceded by one of the `PUSH` instructions.
 4. For each instruction in the code the `stack depth` is always the same.
 5. The `stack depth` is always positive and at most 1024.
-* `JUMP`, `JUMPI` and `JUMPSUB` are always preceded by one of the `PUSH` instructions.
+Rules 1 and 2 are currently enforced at runtime.  _Note: Valid instructions are not part of PUSH data._
-Requiring a `PUSH` before each `JUMP` forbids dynamic jumps.  Absent dynamic jumps another mechanism is needed for subroutine returns, as provided here.
+Rule 3, requiring a `PUSH` before each `JUMP*`, would forbid dynamic jumps.  Absent dynamic jumps another mechanism is needed for subroutine returns, as provided here.
-The `stack pointer` or `SP` points just past the top item on the `data stack`.  We define the `stack depth` as the number of stack elements between the current `SP` and the current `stack base`.  The `stack base` was the `SP` at the previous `JUMPSUB`, or `0` on program entry.  So we can preclude all stack underflows and some stack overflows.
+For rules 4 and 5 we need to define `stack depth`.  The Yellow Paper has the `stack pointer` or `SP` pointing just past the top item on the `data stack`.   We define the `stack base` as where the `SP` pointed at the most recent `JUMPSUB`, or `0` on program entry.  So we can define the `stack depth` as the number of stack elements between the current `SP` and the current `stack base`.  
-* The `stack depth` is always positive and at most 1024.
+Given our definition of `stack depth` Rule 4 ensures that control flows which return to the same place with a different `stack depth` are invalid.  These can be caused by irreducible paths like jumping into loops and subroutines, and calling subroutines with different numbers of arguments.  Taken together, these rules allow for code to be validated  by following the control-flow graph, traversing each edge only once.
-Control flows which return to the same place with a different `stack depth` are invalid.  These can be caused by irreducible paths like jumping into loops and subroutines.
+Finally, Rule 5 precludes all stack underflows (and some stack overflows).
 * For each instruction in the code the `stack depth` is always the same.
 Internal calls require that we add the block number of creation to the account state.  Further contracts created by an account will be validated as of the upgrade level at that block.  This ensures that all of an account's contracts are mutually compatible, and we can rule that:
 * Validated contracts can only call contracts which have been validated at the same or greater level of upgrade.
 If [EIP-3337 BEGINDATA](https://github.com/ethereum/EIPs/blob/master/EIPS/eip-3337.md) is implemented we can add one last rule.
 * All statically unreachable instructions must be INVALID.
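Rules 1 to 3 need no control-flow traversal, so they can be checked with a plain linear scan. The following is a minimal sketch under stated assumptions, not part of the EIP: it uses the opcode values suggested by EIP-2315 for `BEGINSUB` (`0x5c`) and `JUMPSUB` (`0x5e`), and the function name `checkStaticJumps` is hypothetical. It does not check rules 4 and 5, which depend on the `stack depth` along each path.

```go
package main

// JUMP, JUMPI, JUMPDEST and PUSH1..PUSH32 are standard EVM opcodes.
// BEGINSUB and JUMPSUB use the values suggested by EIP-2315 (assumed).
const (
	JUMP     = 0x56
	JUMPI    = 0x57
	JUMPDEST = 0x5b
	BEGINSUB = 0x5c
	JUMPSUB  = 0x5e
	PUSH1    = 0x60
	PUSH32   = 0x7f
)

func isPush(op byte) bool { return op >= PUSH1 && op <= PUSH32 }

// checkStaticJumps reports whether code follows rules 1 to 3: every
// JUMP, JUMPI and JUMPSUB is immediately preceded by a PUSH whose
// constant addresses a valid JUMPDEST or BEGINSUB respectively.
func checkStaticJumps(code []byte) bool {
	// Pass 1: mark the bytes that start an instruction, so that
	// bytes inside PUSH data are never treated as valid targets.
	isInstr := make([]bool, len(code))
	for pc := 0; pc < len(code); {
		isInstr[pc] = true
		if isPush(code[pc]) {
			pc += int(code[pc]-PUSH1) + 2 // opcode plus 1..32 data bytes
		} else {
			pc++
		}
	}
	// Pass 2: check each jump against the PUSH that precedes it.
	prev := -1 // offset of the previous instruction, -1 if none
	for pc := 0; pc < len(code); pc++ {
		if !isInstr[pc] {
			continue
		}
		op := code[pc]
		if op == JUMP || op == JUMPI || op == JUMPSUB {
			if prev < 0 || !isPush(code[prev]) {
				return false // rule 3: dynamic jump
			}
			// Decode the big-endian constant, bailing out as soon
			// as it can no longer be an offset into the code.
			dest := 0
			for _, b := range code[prev+1 : pc] {
				dest = dest<<8 | int(b)
				if dest >= len(code) {
					return false
				}
			}
			want := byte(JUMPDEST) // rule 1
			if op == JUMPSUB {
				want = BEGINSUB // rule 2
			}
			if !isInstr[dest] || code[dest] != want {
				return false
			}
		}
		prev = pc
	}
	return true
}
```

Both passes visit each byte once, so the scan is linear in the code size. A jump whose constant points into PUSH data, such as `PUSH1 4, JUMP, PUSH1 0x5b`, is rejected even though the byte at offset 4 has the value of `JUMPDEST`.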
 ### Validation
-The following is a pseudo-Go specification of an algorithm for enforcing program validity.  It recursively traverses the bytecode, following its control flow and stack use and checking for violations of the rules above.  (For simplicity we ignore the issue of JUMPDEST or BEGINSUB bytes in PUSH data.)  It runs in time == O(vertices + edges) in the program's control-flow graph.
+The following is a pseudo-Go specification of an algorithm for enforcing program validity.  It recursively traverses the bytecode, following its control flow and stack use and checking for violations of the rules above.  (For simplicity we ignore the issue of JUMPDEST or BEGINSUB bytes in PUSH data.)  It runs in time == O(vertices + edges) in the program's control-flow graph, where vertices represent control-flow instructions and the edges represent basic blocks.
 ```
-   bytecode []byte
+   var bytecode []byte
-   stack_depth []int
+   var stack_depth []int
-   SP := 0
+   var SP = 0
-   validate(PC :=0) {
+   func validate(PC :=0) boolean {
      // traverse code sequentially, recurse for subroutines and conditional jumps
      while true {
         instruction = bytecode[PC]
@@ -242,18 +237,13 @@ The following is a pseudo-Go specification of an algorithm for enforcing program
             continue
         }
         if instruction == RETURNSUB {
             // successful return from recursion
             PC = prevPC
             return true
         }
         // advance PC according to instruction
         PC = advance_pc(PC, instruction)
-      }
+      }    
      // reached end of code
      return true       
   }
 ```
 ## Test Cases
@@ -343,8 +333,8 @@ Consumed gas: `30`
 ## References
 Gavin Wood, [Ethereum:  A  Secure  Decentralized Generalized  Transaction  Ledger](https://ethereum.github.io/yellowpaper/paper.pdf), 2014-2021
 Greg Colvin, Brooklyn Zelenka, Paweł Bylica, Christian Reitwiessner, [EIP-615: Subroutines and Static Jumps for the EVM](https://eips.ethereum.org/EIPS/eip-615),  2016-2019
-Martin Lundfall,  [EIP-2327: BEGINDATA Opcode](https://github.com/ethereum/EIPs/blob/master/EIPS/eip-2327.md), 2019
+Martin Lundfall, [EIP-2327: BEGINDATA Opcode](https://eips.ethereum.org/EIPS/eip-2327), 2019
-Nick Johnson, [EIP-3337: Frame pointer support for memory load and store operations](https://github.com/ethereum/EIPs/blob/master/EIPS/eip-3337.md), 2021
+Nick Johnson, [EIP-3337: Frame pointer support for memory load and store operations](https://eips.ethereum.org/EIPS/eip-3337), 2021
 ## Copyright
 Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).