EIPs/EIPS/eip-2315.md

---
eip: 2315
title: Simple Subroutines for the EVM
status: Draft
type: Standards Track
category: Core
author: Greg Colvin <greg@colvin.org>, Martin Holst Swende (@holiman)
discussions-to: https://ethereum-magicians.org/t/eip-2315-simple-subroutines-for-the-evm/3941
created: 2019-10-17
---

## Abstract

This proposal introduces three opcodes to better support subroutines: `BEGINSUB`, `JUMPSUB` and `RETURNSUB`.

This change supports substantial reductions in the gas costs of using subroutines.

## Motivation

The EVM does not provide subroutines as a primitive.  Instead, calls can be synthesized by fetching and pushing the current program counter on the data stack and jumping to the subroutine address; returns can be synthesized by getting the return address to the top of the stack and jumping back to it.  These conventions are more costly than necessary.

Facilities to directly support subroutines are provided by all but one of the real and virtual machines programmed by the lead author, including the Burroughs 5000, CDC 7600, IBM 360, DEC PDP 11 and VAX, Motorola 68000, a few generations of Intel silicon, Sun SPARC, UCSD p-Machine, Sun JVM, Wasm, and the sole exception  the EVM.  In whatever form, these operations provide for
* capturing the current context of execution,
* transferring control to a new context, and
* returning to the original context
   * after possible further transfers of control
   * to some arbitrary depth.

The concept goes back to [Turing, 1946](http://www.alanturing.net/turing_archive/archive/p/p01/P01-001.html):
> We also wish to be able to arrange for the splitting up of operations into subsidiary operations.  This should be done in such a way that once we have written down how an operation is done we can use it as a subsidiary to any other operation.
> ...
> When we wish to start on a subsidiary operation we need only make a note of where we left off the major operation and then apply the first instruction of the subsidiary.  When the subsidiary is over we look up the note and continue with the major operation. Each subsidiary operation can end with instructions for this recovery of the note.  How is the burying and disinterring of the note to be done?  There are of course many ways.  One is to keep a list of these notes in one or more standard size delay lines, (1024) with the most recent last.  The position of the most recent of these will be kept in a fixed TS, and this reference will be modified every time a subsidiary is started or finished...

We propose to use Turing's simple mechanism to implement subroutines, as specified below.  Note that this specification is entirely semantic.  It constrains only stack usage and control flow and imposes no syntax on code beyond being a sequence of bytes to be executed.

## Specification

We introduce one more stack into the EVM in addition to the existing `data stack`, which we call the `return stack`. The `return stack` is limited to `1024` items. This stack supports three new instructions for subroutines.

###  `BEGINSUB( 0x5c)`

> Marks an entry point to a subroutine.  Execution of a `BEGINSUB` is a no-op.  The cost is _jumpdest_.

#### `JUMPSUB (0x5d) location`

> Transfers control to a subroutine.
>
> 1. Decode the `location` from the immediate data.  The data is encoded as three bytes, MSB-first.
> 2. If the opcode at `location` is not a `BEGINSUB` _`abort`_.
> 3. If the `return stack` already has `1024` items _`abort`_.
> 4. Push the current `PC + 1` to the `return stack`.
> 5. Set `PC` to `location`.
>
>  The cost is _low_.
>
> * _pops one item off the `data stack`_
> * _pushes one item on the `return stack`_

#### `RETURNSUB (0x5e)`

> Returns control to the caller of a subroutine.
>
> 1. If the `return stack` is empty _`abort`_.
> 2. Pop `PC` off the `return stack`.
>
> The cost is _verylow_.
>
> * _pops one item off the `return stack`_

_Notes:_
* _If a resulting `PC` to be executed is beyond the last instruction then the opcode is implicitly a `STOP`, which is not an error._
* _Values popped off the `return stack` do not need to be validated, since they are alterable only by `JUMPSUB` and `RETURNSUB`._
* _The description above lays out the semantics of this feature in terms of a `return stack`.  But the actual state of the `return stack` is not observable by EVM code or consensus-critical to the protocol.  (For example, a node implementor may code `JUMPSUB` to unobservably push `PC` on the `return stack` rather than `PC + 1`, which is allowed so long as `RETURNSUB` observably returns control to the `PC + 1` location.)_
* _The `return stack` is the functional equivalent of Turing's "delay line"._

### Dependencies

We need [EIP-3540: EVM Object Format (EOF)](./eip-3540.md) to allow for immediate arguments without special encoding and to deprecate dynamic use of JUMP and JUMPI.

## Rationale

We modeled this design on Moore's 1970 [Forth virtual machine](http://www.ultratechnology.com/4th_1970.pdf). It is a two-stack design  the data stack is supplemented with a return stack to support jumping into and returning from subroutines, as specified above.  The return address (Turing's "note") is pushed onto the return stack (Turing's "delay line") when calling, and the stack is popped into the `PC` when returning.

The alternative design is to push the return address and the destination address on the data stack before jumping, and to pop the data stack and jump back to the popped `PC` to return.  We prefer the separate return stack because it ensures that the return address cannot be overwritten or mislaid, uses fewer data stack slots, and obviates any need to swap the return address past the arguments or return values on the stack.  Crucially, a dynamic jump is not needed to implement subroutine returns, allowing for deprecation of `JUMP` and `JUMPI`.

The _low_ cost of `JUMPSUB` is justified by needing only about six Go operations to push the return address on the return stack, and decode the immediate two byte destination to the `PC`.   The _verylow_ cost of `RETURNSUB` is justified by needing only about three Go operations to pop the return stack into the `PC`.  No 256-bit arithmetic or checking for valid destinations is needed.  Also, `JUMP` is assigned _mid_, and `JUMPSUB` should be more efficient, as decoding immediate bytes should be cheaper than than converting 32-byte stack items, and the destination address will not need to be checked for either `JUMPSUB` or `RETURNSUB`.  Benchmarking will be needed to tell if the costs are well-balanced.

### Gas Costs in Practice

These opcodes reduce the gas costs of both ordinary subroutine calls and low-level optimizations.  The savings reported here will of course be less relevant to programs that use a few large subroutines rather than being a factored than in to smaller ones.   The choice of gas costs for the new opcodes above does not make a large difference in this analysis, as much of the improvement is due to PUSH and SWAP operations that are no longer needed.

**_Note**: the **JUMP** versions of the examples below are all **valid code**._

#### Subroutine Call

Consider this example of calling a minimal subroutine
using  `JUMPSUB`
```
ADD:
    beginsub          ; 1 gas
    0x02              ; 3 gas
    0x03              ; 3 gas
    jumpsub ADDITION  ; 5 gas
    returnsub         ; 3 gas

ADDITION:
    beginsub          ; 1 gas
    add               ; 3 gas
    returnsub         ; 3 gas

Total 22 gas.
```
The same code, using `JUMP`.
```
TEST_ADD:
   jumpdest           ; 1 gas
   RTN_ADD            ; 3 gas
   0x02               ; 3 gas
   0x03               ; 3 gas
   ADDITION           ; 3 gas
   jump               ; 8 gas
RTN_ADD:
   jumpdest           ; 1 gas
   swap1              ; 3 gas
   jump               ; 8 gas

ADDITION:
   jumpdest           ; 1 gas
   add                ; 3 gas
   swap1              ; 3 gas
   jump               ; 8 gas

Total: 48 gas
```
Using `JUMP` uses **_48 - 22 = 26_** more gas than using `JUMPSUB`.

#### Tail Call Optimization

Of course in cases like this one we can optimize the tail call, so that the final `RETURNSUB` in `ADDITION` actually returns from TEST_ADD.
```
TEST_ADD:
    beginsub          ; 1 gas
    0x02              ; 3 gas
    0x03              ; 3 gas
    rjump ADDITION    ; 3 gas

ADDITION:
    beginsub          ; 1 gas
    add               ; 3 gas
    returnsub         ; 3 gas

Total: 20 gas
```
Or the same code, using `JUMP`
```
TEST_ADD:
   jumpdest           ; 1 gas
   0x02               ; 3 gas
   0x03               ; 3 gas
   ADDITION           ; 3 gas
   jump               ; 8 gas

ADDITION:
   jumpdest           ; 1 gas
   add                ; 3 gas
   swap1              ; 3 gas
   jump               ; 8 gas

Total: 33 gas
```
Using `JUMP` cost **_33 - 20 = 13_** more gas than using `JUMPSUB`.

#### Tail Call Elimination

We can even take advantage of `ADDITION` just happening to directly follow `TEST_ADD` and just fall through rather than jump at all.
```
TEST_ADD:
    beginsub          ; 1 gas
    0x02              ; 3 gas
    0x03              ; 3 gas
ADDITION:
    beginsub          ; 1 gas
    add               ; 3 gas
    returnsub         ; 3 gas

Total 16 gas.
```
The same code, using JUMP.
```
TEST_ADD:
   jumpdest           ; 1 gas
   0x02               ; 3 gas
   0x03               ; 3 gas
ADDITION:
   jumpdest           ; 1 gas
   add                ; 3 gas
   swap1              ; 3 gas
   jump               ; 8 gas

Total: 24 gas
```
Using `JUMP` costs **_22 - 14 = 8_** more gas than using `JUMPSUB`.

## Backwards and Forwards Compatibility

These changes affect the semantics of existing EVM code.  The [EVM Object Format (EOF)](./eip-3540.md) is required to allow for immediate arguments without special encoding and to deprecate dynamic use of JUMP and JUMPI.

Contracts that make only static use of JUMP and JUMPI remain valid.

These changes are compatible with using [EIP-3337](https://eips.ethereum.org/EIPS/eip-3337) to provide stack frames, by associating a frame with each subroutine.

## Security Considerations

These changes do introduce new flow control instructions, so any software which does static/dynamic analysis of EVM code needs to be modified accordingly. The `JUMPSUB` semantics are similar to `JUMP` (but jumping to a `BEGINSUB`), whereas the `RETURNSUB` instruction is different, since it can 'land' on any opcode (but the possible destinations can be statically inferred).

## Test Cases

### Simple routine

This should jump into a subroutine, back out and stop.

Bytecode: `0x60045e005c5d` (`PUSH1 0x04, JUMPSUB, STOP, BEGINSUB, RETURNSUB`)

|  Pc   |      Op     | Cost |   Stack   |   RStack  |
|-------|-------------|------|-----------|-----------|
|    0  |    JUMPSUB  |    5 |        [] |        [] |
|    3  |  RETURNSUB  |    5 |        [] |       [0] |
|    4  |       STOP  |    0 |        [] |        [] |

Output: 0x
Consumed gas: `10`

### Two levels of subroutines

This should execute fine, going into one two depths of subroutines

Bytecode: `0x6800000000000000000c5e005c60115e5d5c5d` (`PUSH9 0x00000000000000000c, JUMPSUB, STOP, BEGINSUB, PUSH1 0x11, JUMPSUB, RETURNSUB, BEGINSUB, RETURNSUB`)

|  Pc   |      Op     | Cost |   Stack   |   RStack  |
|-------|-------------|------|-----------|-----------|
|    0  |    JUMPSUB  |    5 |        [] |        [] |
|    3  |    JUMPSUB  |    5 |        [] |       [0] |
|    4  |  RETURNSUB  |    5 |        [] |     [0,3] |
|    5  |  RETURNSUB  |    5 |        [] |       [3] |
|    6  |       STOP  |    0 |        [] |        [] |

Consumed gas: `20`

### Failure 1: invalid jump

This should fail, since the given location is outside of the code-range. The code is the same as previous example,
except that the pushed location is `0x01000000000000000c` instead of `0x0c`.

Bytecode: (`PUSH9 0x01000000000000000c, JUMPSUB, `0x6801000000000000000c5e005c60115e5d5c5d`, STOP, BEGINSUB, PUSH1 0x11, JUMPSUB, RETURNSUB, BEGINSUB, RETURNSUB`)

|  Pc   |      Op     | Cost |   Stack   |   RStack  |
|-------|-------------|------|-----------|-----------|
|    0  |    JUMPSUB  |   10 |[18446744073709551628] |        [] |

```
Error: at pc=10, op=JUMPSUB: invalid jump destination
```

### Failure 2: shallow `return stack`

This should fail at first opcode, due to shallow `return_stack`

Bytecode: `0x5d5858` (`RETURNSUB, PC, PC`)

|  Pc   |      Op     | Cost |   Stack   |   RStack  |
|-------|-------------|------|-----------|-----------|
|    0  |  RETURNSUB  |    5 |        [] |        [] |

```
Error: at pc=0, op=RETURNSUB: invalid retsub
```

### Subroutine at end of code

In this example. the JUMPSUB is on the last byte of code. When the subroutine returns, it should hit the 'virtual stop' _after_ the bytecode, and not exit with error

Bytecode: `0x6005565c5d5b60035e` (`PUSH1 0x05, JUMP, BEGINSUB, RETURNSUB, JUMPDEST, PUSH1 0x03, JUMPSUB`)

|  Pc   |      Op     | Cost |   Stack   |   RStack  |
|-------|-------------|------|-----------|-----------|
|    0  |      PUSH1  |    3 |        [] |        [] |
|    2  |       JUMP  |    8 |       [5] |        [] |
|    5  |   JUMPDEST  |    1 |        [] |        [] |
|    6  |    JUMPSUB  |    5 |        [] |        [] |
|    2  |  RETURNSUB  |    5 |        [] |       [2] |
|    7  |       STOP  |    0 |        [] |        [] |

Consumed gas: `30`