EIPs/eip-2315.md at fb32ff8ea22d48dd445a1dbcb7f1b0d9d033c8c1

mirror of https://github.com/status-im/EIPs.git synced 2025-02-23 04:08:09 +00:00

* JUMPSUB improvements.

* typo

Co-authored-by: Greg Colvin <gcolvin@Tinman.local>

2021-08-19 20:34:08 +00:00

22 KiB

Raw Blame History

eip	title	status	type	category	author	discussions-to	created
2315	Simple Subroutines for the EVM	Draft	Standards Track	Core	Greg Colvin <greg@colvin.org>, Martin Holst Swende (@holiman)	https://ethereum-magicians.org/t/eip-2315-simple-subroutines-for-the-evm/3941	2019-10-17

Simple Summary

(Almost) the smallest possible change that provides native subroutines.

Abstract

This proposal introduces three opcodes to support subroutines: BEGINSUB, JUMPSUB and RETURNSUB. (The smallest possible change would do without BEGINSUB).

Substantial gains in efficiency are achieved.

Safety properties equivalent to EIP-615 can be ensured by enforcing a few simple rules, which are validated with the provided algorithm, and without imposing syntactic constraints.

Motivation

The EVM does not provide subroutines as a primitive. Instead, calls can be synthesized by fetching and pushing the current program counter on the data stack and jumping to the subroutine address; returns can be synthesized by getting the return address to the top of the stack and jumping back to it. These conventions are more costly than necessary, and impede static analysis of EVM code, especically rapid validation of some important safety properties.

Facilities to directly support subroutines are provided by all but one of the machines programmed by the lead author, including the Burroughs 5000, CDC 7600, IBM 360, DEC PDP 11 and VAX, Motorola 68000, Intel 80x86, Sun SPARC, USCD p-Machine, Sun JVM, Wasm, and the EVM. In whatever form, these operations provide for capturing the current context of execution, transferring control to a new context, and returning to the original context. The concept goes back to Turing (1946):

We also wish to be able to arrange for the splitting up of operations into subsidiary operations. This should be done in such a way that once we have written down how an operation is done we can use it as a subsidiary to any other operation.

... When we wish to start on a subsidiary operation we need only make a note of where we left off the major operation and then apply the first instruction of the subsidiary. When the subsidiary is over we look up the note and continue with the major operation. Each subsidiary operation can end with instructions for this recovery of the note. How is the burying and disinterring of the note to be done? There are of course many ways. One is to keep a list of these notes in one or more standard size delay lines, (1024) with the most recent last. The position of the most recent of these will be kept in a fixed TS, and this reference will be modified every time a subsidiary is started or finished...

Notes: TS 1 contains the address of the currently executing instruction.

We propose to use Turing's simple mechanism, long known to work well for virtual stack machines, which we specify here. Note that this specification is entirely semantic. It constrains only stack usage and control flow and imposes no syntax on code beyond being a sequence of bytes to be executed.

Specification

We introduce one more stack into the EVM in addition to the existing data stack which we call the return stack. The return stack is limited to 1024 items. This stack supports the three new instructions for subroutines.

`BEGINSUB`

Marks an entry point to a subroutine. Execution of a BEGINSUB is a no-op.

`JUMPSUB <location>`

Transfers control to a subroutine.

Decode the location from the immediate data. The data is encoded as two bytes, MSB-first.
If the opcode at location is not a BEGINSUB abort.
If the return stack already has 1024 items abort.
Push the current pc + 1 to the return stack.
Set pc to location + 1.

pops one item off the data stack
pushes one item on the return stack

`RETURNSUB`

Returns control to the caller of a subroutine.

If the return stack is empty abort.
Pop pc off the return stack.

pops one item off the return stack

Note 1: If a resulting pc to be executed is beyond the last instruction then the opcode is implicitly a STOP, which is not an error.

Note 2: Values popped off the return stack do not need to be validated, since they are alterable only by JUMPSUB and RETURNSUB.

Note 3: The description above lays out the semantics of this feature in terms of a return stack. But the actual state of the return stack is not observable by EVM code or consensus-critical to the protocol. (For example, a node implementor may code JUMPSUB to unobservably push pc on the return stack rather than pc + 1, which is allowed so long as RETURNSUB observably returns control to the pc + 1 location.)

Dependencies

EVM Object Format (EOF) v1

EOF static relative jumps

Rationale

We modeled this design on the simple, proven, archetypal Forth virtual machine of 1970. It is a two-stack design -- the data stack is supplemented with Turing's return stack to support jumping into and returning from subroutines, as specified above.

The separate return stack ensures that the return address cannot be overwritten or mislaid, and obviates any need to swap the return address past the arguments on the stack. Importantly, a dynamic jump is not needed to implement subroutine returns, allowing for deprecation of JUMP and JUMPI.

(JUMPSUB and RETURNSUB are also defined in terms of a return stack in EIP-615).

Backwards and Forwards Compatibility

These changes affect the semantics of existing EVM code. The EVM Object Format is required to allow for immediate data.

These changes are compatible with using EIP-3337 to provide stack frames, by associating a frame with each subroutine.

Implementations

Three clients have implemented this (or an earlier version of this) proposal:

Costs and Codes

We suggest that the cost of

BEGINSUB be jumpdest (1)
JUMPSUB be low (5)
RETURNSUB be low (5)

The low costs are justified by these instructions requiring only a few operations on the return stack and pc, with no 256-bit arithmetic and no checking for valid destinations.

Benchmarking might be needed to tell if the costs are well-balanced.

We suggest the following opcodes:

0x5c BEGINSUB
0x5d RETURNSUB
0x5e JUMPSUB

Efficiency

Consider this example of calling a minimal subroutine.

TEST_DIV:
    beginsub        ; 1 gas
    0x02            ; 3 gas
    0x03            ; 3 gas
    jumpsub DIVIDE  ; 5 gas
    returnsub       ; 5 gas

DIVIDE:
    beginsub        ; 1 gas
    mul             ; 5 gas
    returnsub       ; 5 gase

Total 28 gas.

The same code, using JUMP.

TEST_DIV:
   jumpdest         ; 1 gas
   RTN_MUL          ; 3 gas
   0x02             ; 3 gas
   0x03             ; 3 gas
   DIVIDE           ; 3 gas
   jump             ; 8 gas
RTN_DIV:
   jumpdest         ; 1 gas
   swap1            ; 3 gas
   jump             ; 8 gas

DIVIDE:
   jumpdest         ; 1 gas
   div              ; 5 gas
   swap1            ; 3 gas
   jump             ; 8 gas

50 gas total.

Both approaches need to push two arguments and divide, using 11 gas. So control flow gas is 39 using JUMP versus 17 using JUMPSUB.

That’s a 56% savings of 22 gas to just jump to and return from a subroutine.

In the general case of one routine calling another I don’t think the JUMP version can do better. Of course in this case we can optimize the tail call, so that the final jump in DIVIDE actually returns from TEST_DIV.

TEST_DIV:
   jumpdest         ; 1 gas
   0x02             ; 3 gas
   0x03             ; 3 gas
   DIVIDE           ; 3 gas
   jump             ; 8 gas

DIVIDE:
   jumpdest         ; 1 gas
   div              ; 5 gas
   swap1            ; 3 gas
   jump             ; 8 gas

Total 35 gas, which is still worse than with JUMPSUB.

We could even take advantage of DIVIDE just happening to directly follow TEST_DIV and just fall through:

TEST_DIV:
   jumpdest         ; 1 gas
   0x02             ; 3 gas
   0x03             ; 3 gas
DIVIDE:
   jumpdest         ; 1 gas
   div              ; 5 gas
   swap1            ; 3 gas
   jump             ; 8 gas

Total 24 gas, better than JUMPSUB.

However, JUMPSUB can do even better with the same optimizations.

We can optimize the tail call.

TEST_DIV:
    beginsub        ; 1 gas
    0x02            ; 3 gas
    0x03            ; 3 gas
    DIVIDE          ; 3 gas
    jump            ; 9 gas

DIVIDE:
    beginsub        ; 1 gas
    div             ; 5 gas
    returnsub       ; 5 gase

Total 30 gas. 5 better than with JUMP.

And we can fall directly into the "called" routine.

TEST_DIV:
    beginsub        ; 1 gas
    0x02            ; 3 gas
    0x03            ; 3 gas
DIVIDE:
    beginsub        ; 1 gas
    div             ; 5 gas
    returnsub       ; 5 gase

Total 18 gas. 6 better than with JUMP.

So these opcodes both improve simple subroutine calls, and increase the efficiency of available optimizations.

Security Considerations

These changes do introduce new flow control instructions, so any software which does static/dynamic analysis of evm-code needs to be modified accordingly. The JUMPSUB semantics are similar to JUMP (but jumping to a BEGINSUB), whereas the RETURNSUB instruction is different, since it can 'land' on any opcode (but the possible destinations can be statically inferred).

Safety and amenability to static analysis of valid programs can be made comparable to EIP-615, but without imposing syntactic constraints, and thus with minimal impact on low-level optimizations (as shown above.) Validity can ensured by following the rules given in the next section, and programs can be validated with the provided algorithm. The validation algorithm is simple and bounded by the size of the code, allowing for validation at creation time. And compilers can easily follow the rules.

Validity

We define safety here as avoiding exceptional halting states.

Exceptional Halting States

Execution is as defined in the Yellow Paper — a sequence of changes in the EVM state. The conditions on valid code are preserved by state changes. At runtime, if execution of an instruction would violate a condition the execution is in an exceptional halting state. The Yellow Paper defines five such states.

Insufficient gas
More than 1024 stack items
Insufficient stack items
Invalid jump destination
Invalid instruction

We would like to consider EVM code valid iff no execution of the program can lead to an exceptional halting state, but we must be able to validate code in linear time to avoid denial of service attacks. So in practice, we can only partially meet these requirements. Our validation algorithm does not consider the code’s data and computations, only its control flow and stack use. This means we will reject programs with any invalid code paths, even if those paths are not reachable at runtime. Further, conditions 1 and 2 — Insufficient gas and stack overflow — must in general be checked at runtime. Conditions 3, 4, and 5 cannot occur if the code conforms to the following rules.

The Rules

JUMP and JUMPI are deprecated.
RJUMP and RJUMPI address only valid JUMPDEST instructions.
JUMPSUB addresses only valid BEGINSUB instructions.
For each instruction in the code the stack depth is always the same.
The stack depth is always positive and at most 1024.

Rule 0, depracating JUMP and JUMPI, would forbid dynamic jumps. Absent dynamic jumps another mechanism is needed for subroutine returns, as provided here.

Jump destinations are currently checked at runtime. Static jumps allow them to be validated at creation time, per rule 1. Note: Valid instructions are not part of immediate data.

For rules 3 and 4 we need to define stack depth. The Yellow Paper has the stack pointer or SP pointing just past the top item on the data stack. We define the stack base as where the SP pointed before the most recent JUMPSUB, or 0 on program entry. So we can define the stack depth as the number of stack elements between the current SP and the current stack base.

Given our definition of stack depth Rule 3 ensures that control flows which return to the same place with a different stack depth are invalid. These can be caused by irreducible paths like jumping into loops and subroutines, and calling subroutines with different numbers of arguments. Taken together, these rules allow for code to be validated by following the control-flow graph, traversing each edge only once.

Finally, Rule 4 catches all stack underflows. It also catches stack overflows in programs that overflow without (or before) recursing.

Validation

The following is a pseudo-Go specification of an algorithm for enforcing program validity. It recursively traverses the bytecode, following its control flow and stack use and checking for violations of the rules above. (For simplicity we ignore the issue of JUMPDEST or BEGINSUB bytes in immediate data, assume an advance_pc() routine, and don't specify JUMPTABLE, which amounts to a loop over RJUMP.) It runs in time == O(vertices + edges) in the program's control-flow graph, where vertices represent control-flow instructions and the edges represent basic blocks.

   var bytecode []byte
   var stack_depth []int
   var SP := 0

   func validate(PC :=0) boolean {
      // traverse code sequentially
      // recurse for subroutines and conditional jumps
      while true {
         instruction = bytecode[PC]
         if is_invalid(instruction) {
            return false;
         }

         // if stack depth non-zero we have been here before 
         // check for constant depth and return to break cycle
         if stack_depth[PC] != 0 {
             if SP != stack_depth[PC] {
                 return false
             } 
             return true
         }
         stack_depth[PC] = SP

         // effect of instruction on stack
         SP -= removed_items(instruction)
         SP += added_items(instruction)
         if SP < 0 || 1024 < SP {
             return false
         }

         // successful validation of path
         if instruction == STOP, RETURN, or SUICIDE {
             return true
         }

         if instruction == RJUMP {

             // check for valid destination
             jumpdest = *PC, PC++, jumpdest << 8, jumpdest = *PC
             if bytecode[jumpdest] != JUMPDEST {
                 return false
             }

             // reset PC to destination of jump 
             PC = jumpdest
             continue
         }
         if instruction == RJUMPI {

             // check for valid destination
             jumpdest = *PC, PC++, jumpdest << 8, jumpdest = *PC
             if bytecode[jumpdest] != JUMPDEST {
                 return false
             }

             // reset PC to destination of jump 
             PC = jumpdest

             // recurse to jump to code to validate 
             if !validate(stack[SP])) {
                 return false
             }
             continue 
         }
         if instruction == JUMPSUB {

             // check for valid destination
             jumpdest = *PC, PC++, jumpdest << 8
             if bytecode[jumpdest] != BEGINSUB {
                 return false
             }

             // recurse to jump to code to validate
             prevSP = SP
             depth = SP - prevSP
             SP = depth
             if  !validate(stack[SP]+1)) {
                 return false
             }
             SP = prevSP - depth + SP
             PC = prevPC
             continue
         }
         if instruction == RETURNSUB {
             PC = prevPC
             return true
         }

         // advance PC according to instruction
         PC = advance_pc(PC, instruction)
      }    
   }

Test Cases

TO DO: Test cases are not up-to-date.

Simple routine

This should jump into a subroutine, back out and stop.

Bytecode: 0x60045e005c5d (PUSH1 0x04, JUMPSUB, STOP, BEGINSUB, RETURNSUB)

Pc	Op	Cost	Stack	RStack
0	JUMPSUB	5	[]	[]
3	RETURNSUB	5	[]	[0]
4	STOP	0	[]	[]

Output: 0x Consumed gas: 10

Two levels of subroutines

This should execute fine, going into one two depths of subroutines

Bytecode: 0x6800000000000000000c5e005c60115e5d5c5d (PUSH9 0x00000000000000000c, JUMPSUB, STOP, BEGINSUB, PUSH1 0x11, JUMPSUB, RETURNSUB, BEGINSUB, RETURNSUB)

Pc	Op	Cost	Stack	RStack
0	JUMPSUB	5	[]	[]
3	JUMPSUB	5	[]	[0]
4	RETURNSUB	5	[]	[0,3]
5	RETURNSUB	5	[]	[3]
6	STOP	0	[]	[]

Consumed gas: 20

Failure 1: invalid jump

This should fail, since the given location is outside of the code-range. The code is the same as previous example, except that the pushed location is 0x01000000000000000c instead of 0x0c.

Bytecode: (PUSH9 0x01000000000000000c, JUMPSUB, 0x6801000000000000000c5e005c60115e5d5c5d, STOP, BEGINSUB, PUSH1 0x11, JUMPSUB, RETURNSUB, BEGINSUB, RETURNSUB)

Pc	Op	Cost	Stack	RStack
0	JUMPSUB	10	[18446744073709551628]	[]

Error: at pc=10, op=JUMPSUB: invalid jump destination

Failure 2: shallow `return stack`

This should fail at first opcode, due to shallow return_stack

Bytecode: 0x5d5858 (RETURNSUB, PC, PC)

Pc	Op	Cost	Stack	RStack
0	RETURNSUB	5	[]	[]

Error: at pc=0, op=RETURNSUB: invalid retsub

Subroutine at end of code

In this example. the JUMPSUB is on the last byte of code. When the subroutine returns, it should hit the 'virtual stop' after the bytecode, and not exit with error

Bytecode: 0x6005565c5d5b60035e (PUSH1 0x05, JUMP, BEGINSUB, RETURNSUB, JUMPDEST, PUSH1 0x03, JUMPSUB)

Pc	Op	Cost	Stack	RStack
0	PUSH1	3	[]	[]
2	JUMP	8	[5]	[]
5	JUMPDEST	1	[]	[]
6	JUMPSUB	5	[]	[]
2	RETURNSUB	5	[]	[2]
7	STOP	0	[]	[]

Consumed gas: 30

Appendix: Stack Frames

Given EIP-3337, we can easily create an auxiliary stack in memory for local variables and temporaries. Two conventions are typical -- either create the frame before calling the subroutine, or create the frame on entry to the subroutine. Adding these two instructions would easse the way. They are modelled on the corresponding Intel opcdodes.

ENTER <frame_size>

Write the current FP to memory as 4 bytes, MSB-first at FP - frame_size, then set FP to FP - frame size.

This should be placed either before a JUMPSUB operation, or after a BEGINSUB -- depending on the calling convemtion. Repeated calls to PUSHFP create a stack of frames, linked by their previous FP field. The stack grows towards lower addresses in memory, as is the common, efficient practice.

`LEAVE`

Restores FP to the value which was stored at offset FP in memory by ENTER.

This should be placed after the JUMPSUB operation or before the RETURNSUB -- depending on the calling convention -- to pop the most recent frame from the call stack.

These operations would be especially useful if the gas formula for memory was reinterpreted so that writes from the top of memory down (equivalently, from -1 down, twos-complement) are charged the same as or writes from bottom of memory up.

The total cost to expand the memory to size a words is

Cmem(a) = 3 * a + floor(a ** 2 / 512)

If the memory is already b words long, the incremental cost is

Cmem(a) - Cmem(b)

If a, b and memory offsets in general are allowed to be negative the formula gives the desired results -- stack memory can grow from the top down, and heap memory can be allocated from the bottom up, without address conflicts or excessive gas charges.

Appendix: Code Sections

Given EIP-3540 we can divide code into sections, with subroutines contained within sections. These sections could provide syntactic boundaries against cross-routine control flow. Calls from within code sections to subroutines in other sections would be allowed. but not Jumps into or out of code sections would not be allowed.

References

A.M. Turing, Proposals for the development in the Mathematics Division of an Automatic Computing Engine (ACE) Report E882, Executive Committee, NPL 1946 Alex Beregszaszi, Paweł Bylica, Andrei Maiboroda, EVM Object Format (EOF) v1 2021 Andrei Maiboroda, EOF static relative jumps Gavin Wood, Ethereum: A Secure Decentralized Generalized Transaction Ledger, 2014-2021 Greg Colvin, Brooklyn Zelenka, Paweł Bylica, Christian Reitwiessner, EIP-615: Subroutines and Static Jumps for the EVM, 2016-2019 Nick Johnson, EIP-3337: Frame pointer support for memory load and store operations, 2021

22 KiB Raw Blame History Unescape Escape