15 KiB
eip | title | status | type | category | author | discussions-to | created |
---|---|---|---|---|---|---|---|
2315 | Simple Subroutines for the EVM | Draft | Standards Track | Core | Greg Colvin <greg@colvin.org>, Martin Holst Swende (@holiman) | https://ethereum-magicians.org/t/eip-2315-simple-subroutines-for-the-evm/3941 | 2019-10-17 |
Abstract
This proposal introduces five opcodes to better support simple subroutines and relative jumps: BEGINSUB
, JUMPSUB
RETURNSUB
, JUMPR
and JUMPRI
.
This change supports substantial reductions in the gas costs of calling and optimizing simple subroutines – from %33 to as much as 54%.
Motivation
The EVM does not provide subroutines as a primitive. Instead, calls can be synthesized by fetching and pushing the current program counter on the data stack and jumping to the subroutine address; returns can be synthesized by getting the return address to the top of the stack and jumping back to it. These conventions are more costly than necessary.
Facilities to directly support subroutines are provided by all but one of the real and virtual machines programmed by the lead author, including the Burroughs 5000, CDC 7600, IBM 360, DEC PDP 11 and VAX, Motorola 68000, a few generations of Intel silicon, Sun SPARC, UCSD p-Machine, Sun JVM, Wasm, and the sole exception the EVM. In whatever form, these operations provide for
- capturing the current context of execution,
- transferring control to a new context, and
- returning to the original context
- after possible further transfers of control
- to some arbitrary depth.
The concept goes back to Turing, 1946:
We also wish to be able to arrange for the splitting up of operations into subsidiary operations. This should be done in such a way that once we have written down how an operation is done we can use it as a subsidiary to any other operation. ... When we wish to start on a subsidiary operation we need only make a note of where we left off the major operation and then apply the first instruction of the subsidiary. When the subsidiary is over we look up the note and continue with the major operation. Each subsidiary operation can end with instructions for this recovery of the note. How is the burying and disinterring of the note to be done? There are of course many ways. One is to keep a list of these notes in one or more standard size delay lines, (1024) with the most recent last. The position of the most recent of these will be kept in a fixed TS, and this reference will be modified every time a subsidiary is started or finished...
We propose to follow Turing's simple concept in our subroutine design, as specified below. Note that this specification is entirely semantic. It constrains only stack usage and control flow and imposes no syntax on code beyond being a sequence of bytes to be executed.
Specification
We introduce one more stack into the EVM in addition to the existing data stack
, which we call the return stack
. The return stack
is limited to 1024
items. This stack supports three new instructions for subroutines.
Instructions
BEGINSUB( 0x5c)
Marks an entry point to a subroutine. Execution of a
BEGINSUB
is a no-op. The cost is jumpdest.
JUMPSUB (0x5d) location
Transfers control to a subroutine.
- Decode the
location
from the immediate data. The data is encoded as three bytes, MSB-first.- If the opcode at
location
is not aBEGINSUB
abort
.- If the
return stack
already has1024
itemsabort
.- Push the current
PC + 1
to thereturn stack
.- Set
PC
tolocation
.The cost is low.
- pops one item off the
data stack
- pushes one item on the
return stack
RETURNSUB (0x5e)
Returns control to the caller of a subroutine.
- If the
return stack
is emptyabort
.- Pop
PC
off thereturn stack
.The cost is verylow.
- pops one item off the
return stack
To take full advantage of the performance benefits of simple subroutines we also provide two new static, relative jump functions that take their arguments as immediate data rather then off the stack.
JUMPR (0x??) offset
Transfers control to the address
PC + offset
, where offset is a three-byte, MSB first, twos-complement integer.
- Decode the
offset
from the immediate data. The data is encoded as three bytes, MSB first, twos-complement.- If the opcode at
location
is not aJUMDEST
thenabort
.- Set
PC
tolocation
.The cost is low.
JUMPRI (0x??) offset
Conditionally transfers control to the address
PC + offset
, where offset is a three-byte, MSB first, twos-complement integer.
- Decode the
offset
from the immediate data. The data is encoded as three bytes, MSB first, twos-complement.- Pop the
condition
from the stack.- If the condition is true then continue
- If the opcode at
PC + offset
is not aJUMDEST
abort
. SetPC
toPC + offset
.The cost is mid.
Notes:
- If a resulting
PC
to be executed is beyond the last instruction then the opcode is implicitly aSTOP
, which is not an error. - Values popped off the
return stack
do not need to be validated, since they are alterable only byJUMPSUB
andRETURNSUB
. - The description above lays out the semantics of this feature in terms of a
return stack
. But the actual state of thereturn stack
is not observable by EVM code or consensus-critical to the protocol. (For example, a node implementor may codeJUMPSUB
to unobservably pushPC
on thereturn stack
rather thanPC + 1
, which is allowed so long asRETURNSUB
observably returns control to thePC + 1
location.) - The
return stack
is the functional equivalent of Turing's "delay line".
Dependencies
We need EIP-3540: EVM Object Format (EOF) to allow for immediate arguments without special encoding.
Rationale
We modeled this design on Moore's 1970 Forth virtual machine. It is a two-stack design – the data stack is supplemented with a return stack to support jumping into and returning from subroutines, as specified above. The return address (Turing's "note") is pushed onto the return stack (Turing's "delay line") when calling, and the stack is popped into the PC
when returning.
The alternative design is to push the return address and the destination address on the data stack before jumping, and to pop the data stack and jump back to the popped PC
to return. We prefer the separate return stack because it ensures that the return address cannot be overwritten or mislaid, uses fewer data stack slots, and obviates any need to swap the return address past the arguments or return values on the stack. Crucially, a dynamic jump is not needed to implement subroutine returns`.
The low cost of JUMPSUB
is justified by needing only about six Go operations to push the return address on the return stack, and decode the immediate two byte destination to the PC
. The verylow cost of RETURNSUB
is justified by needing only about three Go operations to pop the return stack into the PC
. No 256-bit arithmetic or checking for valid destinations is needed. Also, JUMP
is assigned mid, and JUMPSUB
and JUMPR should be more efficient, as decoding immediate bytes should be cheaper than than converting 32-byte stack items, and the destination address will not need to be checked for either JUMPSUB
or RETURNSUB
. Benchmarking will be needed to tell if the costs are well-balanced.
Gas Cost Analysis
These opcodes reduce the gas costs of both ordinary subroutine calls and low-level optimizations. The savings reported here will of course be less relevant to programs that use a few large subroutines rather than being a factored than in to smaller ones. The choice of gas costs for the new opcodes above does not make a large difference in this analysis, as much of the improvement is due to PUSH and SWAP operations that are no longer needed. Even if JUMPSUB
cost the same as JUMP
– 8 gas rather than 5 - a simple subroutine call would still be 52% less costly versus 54%.
_Note: the JUMP versions of the examples below are all valid code._
Simple Subroutine Call
Consider this example of calling a minimal subroutine
using JUMPSUB
ADD:
beginsub ; 1 gas
0x02 ; 3 gas
0x03 ; 3 gas
jumpsub ADDITION ; 5 gas
returnsub ; 3 gas
ADDITION:
beginsub ; 1 gas
add ; 3 gas
returnsub ; 3 gas
Total 22 gas.
The same code, using JUMP
.
TEST_ADD:
jumpdest ; 1 gas
RTN_ADD ; 3 gas
0x02 ; 3 gas
0x03 ; 3 gas
ADDITION ; 3 gas
jump ; 8 gas
RTN_ADD:
jumpdest ; 1 gas
swap1 ; 3 gas
jump ; 8 gas
ADDITION:
jumpdest ; 1 gas
add ; 3 gas
swap1 ; 3 gas
jump ; 8 gas
Total: 48 gas
Using JUMPSUB
saves 48 - 22 = 26 gas versus using JUMP
– a 54% performance improvement.
The advantages of JUMPR can be seen in, e.g., the tail simple subroutine call.
Tail Call Optimization
Of course in cases like this one we can optimize the tail call, so that the final RETURNSUB
in ADDITION
actually returns from TEST_ADD.
TEST_ADD:
beginsub ; 1 gas
0x02 ; 3 gas
0x03 ; 3 gas
jumpsub ADDITION ; 3 gas
ADDITION:
beginsub ; 1 gas
add ; 3 gas
returnsub ; 3 gas
Total: 20 gas
Or the same code, using JUMP
TEST_ADD:
jumpdest ; 1 gas
0x02 ; 3 gas
0x03 ; 3 gas
ADDITION ; 3 gas
jump ; 8 gas
ADDITION:
jumpdest ; 1 gas
add ; 3 gas
swap1 ; 3 gas
jump ; 8 gas
Total: 33 gas
Using JUMPSUB
saves 33 - 20 = 13 gas versus using JUMP
– a 39% performance improvement.
Tail Call Elimination
We can even take advantage of ADDITION
just happening to directly follow TEST_ADD
and just fall through rather than jump at all.
TEST_ADD:
beginsub ; 1 gas
0x02 ; 3 gas
0x03 ; 3 gas
ADDITION:
beginsub ; 1 gas
add ; 3 gas
returnsub ; 3 gas
Total 16 gas.
The same code, using JUMP.
TEST_ADD:
jumpdest ; 1 gas
0x02 ; 3 gas
0x03 ; 3 gas
ADDITION:
jumpdest ; 1 gas
add ; 3 gas
swap1 ; 3 gas
jump ; 8 gas
Total: 24 gas
Using JUMPSUB
saves 22 - 14 = 8 gas versus using JUMP
– a 36% performance improvement.
Finally, we can take a look at using JUMPR
instead of JUMP
Tail Calls with JUMPR
TEST_ADD:
jumpdest ; 1 gas
0x02 ; 3 gas
jumpr ADDITION
; 3 gas
ADDITION:
jumpdest ; 1 gas
add ; 3 gas
swap1 ; 3 gas
jump ; 8 gas
Total: 22 gas.
Using JUMPR
saves 33 - 22 = 11_ gas – a 33% performance improvement.
Backwards and Forwards Compatibility
These changes do not affect the semantics of existing EVM code.
These changes are compatible with using EIP-3337 to provide stack frames, by associating a frame with each subroutine.
Security Considerations
These changes do introduce new flow control instructions, so any software which does static/dynamic analysis of EVM code needs to be modified accordingly. The JUMPSUB
semantics are similar to JUMP
(but jumping to a BEGINSUB
), whereas the RETURNSUB
instruction is different, since it can 'land' on any opcode (but the possible destinations can be statically inferred).
If EIP-
3779 – Safe Control Flow for the EVM – advances then the requirement on JUMPSUB
to abort
if the opcode at location
is not a BEGINSUB
will need to be enforced at creation time rather than runtime.
Test Cases
Simple routine
This should jump into a subroutine, back out and stop.
Bytecode: 0x60045e005c5d
(PUSH1 0x04, JUMPSUB, STOP, BEGINSUB, RETURNSUB
)
Pc | Op | Cost | Stack | RStack |
---|---|---|---|---|
0 | JUMPSUB | 5 | [] | [] |
3 | RETURNSUB | 5 | [] | [0] |
4 | STOP | 0 | [] | [] |
Output: 0x
Consumed gas: 10
Two levels of subroutines
This should execute fine, going into one two depths of subroutines
Bytecode: 0x6800000000000000000c5e005c60115e5d5c5d
(PUSH9 0x00000000000000000c, JUMPSUB, STOP, BEGINSUB, PUSH1 0x11, JUMPSUB, RETURNSUB, BEGINSUB, RETURNSUB
)
Pc | Op | Cost | Stack | RStack |
---|---|---|---|---|
0 | JUMPSUB | 5 | [] | [] |
3 | JUMPSUB | 5 | [] | [0] |
4 | RETURNSUB | 5 | [] | [0,3] |
5 | RETURNSUB | 5 | [] | [3] |
6 | STOP | 0 | [] | [] |
Consumed gas: 20
Failure 1: invalid jump
This should fail, since the given location is outside of the code-range. The code is the same as previous example,
except that the pushed location is 0x01000000000000000c
instead of 0x0c
.
Bytecode: (PUSH9 0x01000000000000000c, JUMPSUB,
0x6801000000000000000c5e005c60115e5d5c5d, STOP, BEGINSUB, PUSH1 0x11, JUMPSUB, RETURNSUB, BEGINSUB, RETURNSUB
)
Pc | Op | Cost | Stack | RStack |
---|---|---|---|---|
0 | JUMPSUB | 10 | [18446744073709551628] | [] |
Error: at pc=10, op=JUMPSUB: invalid jump destination
Failure 2: shallow return stack
This should fail at first opcode, due to shallow return_stack
Bytecode: 0x5d5858
(RETURNSUB, PC, PC
)
Pc | Op | Cost | Stack | RStack |
---|---|---|---|---|
0 | RETURNSUB | 5 | [] | [] |
Error: at pc=0, op=RETURNSUB: invalid retsub
Subroutine at end of code
In this example. the JUMPSUB is on the last byte of code. When the subroutine returns, it should hit the 'virtual stop' after the bytecode, and not exit with error
Bytecode: 0x6005565c5d5b60035e
(PUSH1 0x05, JUMP, BEGINSUB, RETURNSUB, JUMPDEST, PUSH1 0x03, JUMPSUB
)
Pc | Op | Cost | Stack | RStack |
---|---|---|---|---|
0 | PUSH1 | 3 | [] | [] |
2 | JUMP | 8 | [5] | [] |
5 | JUMPDEST | 1 | [] | [] |
6 | JUMPSUB | 5 | [] | [] |
2 | RETURNSUB | 5 | [] | [2] |
7 | STOP | 0 | [] | [] |
Consumed gas: 30