eip | title | status | type | category | author | discussions-to | created |
---|---|---|---|---|---|---|---|
2315 | Simple Subroutines for the EVM | Draft | Standards Track | Core | Greg Colvin (greg@colvin.org), Martin Holst Swende (@holiman) | https://ethereum-magicians.org/t/eip-2315-simple-subroutines-for-the-evm/3941 | 2019-10-17 |
Abstract
This proposal introduces three opcodes to support subroutines: BEGINSUB, JUMPSUB and RETURNSUB.
Motivation
The EVM does not provide subroutines as a primitive. Instead, calls can be synthesized by fetching and pushing the current program counter on the data stack and jumping to the subroutine address; returns can be synthesized by contriving to get the return address back to the top of stack and jumping back to it. Complex calling conventions are then needed to use the same stack for computation and control flow. Code becomes harder to read and write, and tools may need to pattern-match the conventions to identify the use of subroutines. Such calling conventions can be avoided by using memory instead, but either approach costs a lot of gas.
Opcodes that directly support subroutines can eliminate this complexity and cost, as they have on other physical and virtual machines going back at least 50 years.
In the Appendix we show example solc output for a simple program that uses over three times as much gas just calling and returning from subroutines as comparable code using these opcodes.
Specification
We introduce one more stack into the EVM, called return_stack. The return_stack is limited to 1023 items.
BEGINSUB
Marks the entry point to a subroutine.
pops: 0
pushes: 0
JUMPSUB
- Pops 1 value from the stack, hereafter referred to as location.
  - 1.1 If the opcode at location is not a BEGINSUB, abort with error.
- Pushes the current pc+1 to the return_stack.
  - 2.1 If the return_stack already has 1023 items, abort with error.
- Sets the pc to location.

pops: 1
pushes: 0
(return_stack pushes: 1)
RETURNSUB
- Pops 1 value from the return_stack.
  - 1.1 If the return_stack is empty, abort with error.
- Sets pc to the popped value.

pops: 0
(return_stack pops: 1)
pushes: 0
Note: Values popped from return_stack do not need to be validated, since they cannot be set arbitrarily from code, only implicitly by the EVM.
Note2: A value popped from return_stack may be outside of the code length, if the last JUMPSUB was the last byte of the code. In this case the next opcode is implicitly a STOP, which is not an error.
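The steps above can be sketched as a tiny interpreter. This is only an illustration under stated assumptions, not a client implementation: the opcode bytes (0xb3, 0xb5, 0xb7) are read off the test-case bytecode in this document, the names `run` and `SubroutineError` are hypothetical, only STOP, PUSH1..PUSH32 and the three new opcodes are handled, gas is ignored, and the destination check does not exclude bytes that sit inside PUSH data.

```python
# Minimal sketch of the subroutine opcodes; not a full EVM.
# Assumption: opcode bytes are taken from the test-case bytecode.
STOP, PUSH1, PUSH32 = 0x00, 0x60, 0x7F
JUMPSUB, BEGINSUB, RETURNSUB = 0xB3, 0xB5, 0xB7
RETURN_STACK_LIMIT = 1023


class SubroutineError(Exception):
    """Raised wherever the specification says 'abort with error'."""


def run(code: bytes, max_steps: int = 10_000) -> list:
    """Execute `code` and return the (pc, opcode) pairs that were visited."""
    stack, return_stack, trace = [], [], []
    pc = 0
    for _ in range(max_steps):
        # Running past the end of the code is treated as an implicit STOP.
        op = code[pc] if pc < len(code) else STOP
        trace.append((pc, op))
        if op == STOP:
            return trace
        if PUSH1 <= op <= PUSH32:
            size = op - PUSH1 + 1
            # Truncated immediates are right-padded with zero bytes.
            data = code[pc + 1 : pc + 1 + size].ljust(size, b"\x00")
            stack.append(int.from_bytes(data, "big"))
            pc += 1 + size
        elif op == BEGINSUB:
            pc += 1  # marker only: pops 0, pushes 0
        elif op == JUMPSUB:
            if not stack:
                raise SubroutineError("stack underflow")
            location = stack.pop()
            # Simplification: no analysis of PUSH data is done here.
            if location >= len(code) or code[location] != BEGINSUB:
                raise SubroutineError("invalid jump destination")
            if len(return_stack) >= RETURN_STACK_LIMIT:
                raise SubroutineError("return stack overflow")
            return_stack.append(pc + 1)
            pc = location
        elif op == RETURNSUB:
            if not return_stack:
                raise SubroutineError("invalid retsub")
            pc = return_stack.pop()
        else:
            raise SubroutineError(f"unsupported opcode {op:#x}")
    raise SubroutineError("step limit exceeded")
```

Calling `run(bytes.fromhex("6004b300b5b7"))` visits the program counters 0, 2, 4, 5 and 3 in that order, ending on the STOP that follows the JUMPSUB.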
Rationale
This is the smallest possible change that provides native subroutines without breaking backwards compatibility.
Backwards Compatibility
These changes do not affect the semantics of existing EVM code.
Alternative variants
One possible variant would be to add an extra clause to the BEGINSUB opcode.
- A BEGINSUB opcode may only be reached via a JUMPSUB.

This would make walking into a subroutine an error. The rationale would be to improve static analysis by allowing stronger assertions about the code flow.
This is not part of the current specification, since code generators can trivially implement these guarantees by always placing a STOP opcode before any BEGINSUB operation.
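A code generator or linter could check that convention with a single pass over the bytecode. The sketch below is hypothetical: the function name `unguarded_subroutines` is invented for illustration, and the BEGINSUB byte 0xb5 is an assumption read off the test-case bytecode. It skips PUSH immediates so that data bytes are not mistaken for opcodes, and it reports every BEGINSUB whose preceding instruction is not a STOP.

```python
# Assumption: BEGINSUB byte taken from the test-case bytecode.
STOP, PUSH1, PUSH32, BEGINSUB = 0x00, 0x60, 0x7F, 0xB5


def unguarded_subroutines(code: bytes) -> list:
    """Return the pc of every BEGINSUB not directly preceded by a STOP."""
    unguarded = []
    previous_op = None  # no instruction precedes pc 0
    pc = 0
    while pc < len(code):
        op = code[pc]
        if op == BEGINSUB and previous_op != STOP:
            unguarded.append(pc)
        previous_op = op
        pc += 1
        if PUSH1 <= op <= PUSH32:
            pc += op - PUSH1 + 1  # skip the immediate data bytes
    return unguarded
```

For the bytecode `0x6004b300b5b7` this returns an empty list, because the only BEGINSUB follows a STOP. The check is deliberately strict: a BEGINSUB that follows a RETURNSUB also cannot be walked into, but it is still reported here because the text above only names STOP.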
Test Cases
Simple routine
This should jump into a subroutine, back out and stop.
Bytecode: 0x6004b300b5b7
Pc | Op | Cost | Stack | RStack |
---|---|---|---|---|
0 | PUSH1 | 3 | [] | [] |
2 | JUMPSUB | 8 | [4] | [] |
4 | BEGINSUB | 1 | [] | [ 2] |
5 | RETURNSUB | 2 | [] | [ 2] |
3 | STOP | 0 | [] | [] |
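To see how the Pc column relates to the raw bytes, the bytecode can be decoded with a few lines of Python. This is a sketch only: `disassemble` and `NAMES` are hypothetical names, the opcode bytes are assumed from the bytecode in these test cases, and any other opcode is printed as unknown.

```python
# Assumption: opcode bytes as they appear in the test-case bytecode.
NAMES = {0x00: "STOP", 0xB3: "JUMPSUB", 0xB5: "BEGINSUB", 0xB7: "RETURNSUB"}


def disassemble(code: bytes) -> list:
    """Return (pc, mnemonic, immediate_hex) for each instruction in `code`."""
    listing = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        if 0x60 <= op <= 0x7F:  # PUSH1..PUSH32 carry 1..32 immediate bytes
            size = op - 0x5F
            immediate = code[pc + 1 : pc + 1 + size].hex()
            listing.append((pc, f"PUSH{size}", immediate))
            pc += 1 + size
        else:
            listing.append((pc, NAMES.get(op, f"UNKNOWN_{op:#04x}"), ""))
            pc += 1
    return listing


for pc, name, immediate in disassemble(bytes.fromhex("6004b300b5b7")):
    print(pc, name, immediate)
```

The listing places PUSH1 at pc 0, JUMPSUB at 2, STOP at 3, BEGINSUB at 4 and RETURNSUB at 5, which are the positions the trace visits.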
Two levels of subroutines
This should execute fine, going two levels deep into subroutines.
Bytecode: 0x6800000000000000000cb300b56011b3b7b5b7
Pc | Op | Cost | Stack | RStack |
---|---|---|---|---|
0 | PUSH9 | 3 | [] | [] |
10 | JUMPSUB | 8 | [12] | [] |
12 | BEGINSUB | 1 | [] | [10] |
13 | PUSH1 | 3 | [] | [10] |
15 | JUMPSUB | 8 | [17] | [10] |
17 | BEGINSUB | 1 | [] | [10,15] |
18 | RETURNSUB | 2 | [] | [10,15] |
16 | RETURNSUB | 2 | [] | [10] |
11 | STOP | 0 | [] | [] |
Failure 1: invalid jump
This should fail, since the given location is outside of the code-range. The code is the same as the previous example, except that the pushed location is 0x01000000000000000c instead of 0x0c.
Bytecode: 0x6801000000000000000cb300b56011b3b7b5b7
Pc | Op | Cost | Stack | RStack |
---|---|---|---|---|
0 | PUSH9 | 3 | [] | [] |
10 | JUMPSUB | 8 | [18446744073709551628] | [] |
Error: at pc=10, op=JUMPSUB: evm: invalid jump destination
Failure 2: shallow return_stack
This should fail at the first opcode, because the return_stack is empty.
Bytecode: 0xb75858 (RETURNSUB, PC, PC)
Pc | Op | Cost | Stack | RStack |
---|---|---|---|---|
0 | RETURNSUB | 2 | [] | [] |
Error: at pc=0, op=RETURNSUB: evm: invalid retsub
Implementations
No clients have implemented this proposal as of yet, but there are Draft PRs for
Costs and Codes
We suggest that the cost of BEGINSUB be base, JUMPSUB be low, and RETURNSUB be verylow. Measurement will tell. We suggest the following opcodes:
0xb2 BEGINSUB
0xb3 JUMPSUB
0xb7 RETURNSUB
Security Considerations
These changes introduce new flow control instructions, so any software that performs static or dynamic analysis of EVM code needs to be modified accordingly. The JUMPSUB semantics are similar to JUMP (but jumping to a BEGINSUB), whereas the RETURNSUB instruction is different, since it can 'land' on any opcode (but the possible destinations can be statically inferred).
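One way to read "statically inferred": a RETURNSUB can only land on a value that some JUMPSUB pushed, so the candidate destinations are the positions immediately after each JUMPSUB. The sketch below collects them in one pass. The function name `return_destinations` is hypothetical, and the JUMPSUB byte 0xb3 is an assumption taken from the test-case bytecode.

```python
# Assumption: JUMPSUB byte taken from the test-case bytecode.
PUSH1, PUSH32, JUMPSUB = 0x60, 0x7F, 0xB3


def return_destinations(code: bytes) -> set:
    """Return every pc that a RETURNSUB could land on in `code`."""
    destinations = set()
    pc = 0
    while pc < len(code):
        op = code[pc]
        if op == JUMPSUB:
            # JUMPSUB pushes pc+1, so that is the only place it returns to.
            destinations.add(pc + 1)
        pc += 1
        if PUSH1 <= op <= PUSH32:
            pc += op - PUSH1 + 1  # skip immediate data, which is not code
    return destinations
```

For the two-level test case this yields `{11, 16}`: the STOP after the outer JUMPSUB and the RETURNSUB after the inner one. A destination equal to the code length is possible when a JUMPSUB is the last byte, which corresponds to the implicit STOP described in the specification.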
Appendix: Comparative costs.
    contract fun {
        function test(uint x, uint y) public returns (uint) {
            return test_mul(2,3);
        }
        function test_mul(uint x, uint y) public returns (uint) {
            return multiply(x,y);
        }
        function multiply(uint x, uint y) public returns (uint) {
            return x * y;
        }
    }
Here is solc 0.6.3 assembly code with labeled destinations.
    TEST:
        jumpdest
        0x00
        RTN
        0x02
        0x03
        TEST_MUL
        jump
    TEST_MUL:
        jumpdest
        0x00
        RTN
        dup4
        dup4
        MULTIPLY
        jump
    RTN:
        jumpdest
        swap4
        swap3
        pop
        pop
        pop
        jump
    MULTIPLY:
        jumpdest
        mul
        swap1
        jump
solc does a good job with the multiply() function, which is a leaf. Non-leaf functions are more awkward to get out of. Calling fun.test() will cost 118 gas, plus 5 for the mul.
This is the same code written using jumpsub and returnsub. Calling fun.test() will cost 34 gas (plus 5).
    TEST:
        beginsub
        0x02
        0x03
        TEST_MUL
        jumpsub
        returnsub
    TEST_MUL:
        beginsub
        MULTIPLY
        jumpsub
        returnsub
    MULTIPLY:
        beginsub
        mul
        returnsub
Copyright and related rights waived via CC0.