Merge pull request #612 from ethereum/jumps&subs

Jumps&subs
2025-01-26 23:00:38 +00:00 · 2017-04-25 15:42:58 +02:00 · 2b7c657bee
commit 2b7c657bee
parent 4ce5a7ccdc 2733748276
1 changed files with 35 additions and 18 deletions
--- a/EIPS/JumpsAndSubs.md
+++ b/EIPS/JumpsAndSubs.md
@@ -1,5 +1,5 @@
 ```
-EIP: 187
+EIP: TBD
 Title: Subroutines and Static Jumps for the EVM
 Status: Draft
 Type: Core
@@ -90,37 +90,49 @@ Dynamic jumps to a `JUMPDEST` are used to implement O(1) jumptables, which are u
 * `JUMPV n, jumpdest ...`
 jumps to one of a vector of `n` `JUMPDEST` offsets via a zero-based index on the stack.  The vector is stored inline in the bytecode.  If the index is greater than or equal to `n - 1` the last (default) offset is used.  `n` is given as four immediate bytes, all `JUMPDEST` offsets as four immediate bytes each.

-Dynamic jumps to a `BEGINSUB` are used to implement O(1) virtual functions and callbacks, which take just two pointer dereferences in C.
+Dynamic jumps to a `BEGINSUB` are used to implement O(1) virtual functions and callbacks, which take just two pointer dereferences on most CPUs.
 * `JUMPSUBV n, beginsub ...`
 jumps to one of a vector of `n` `BEGINSUB` offsets via a zero-based index on the stack.  The vector is stored inline in the bytecode, MSB-first.  If the index is greater than or equal to `n - 1` the last (default) offset is used.  `n` is given as four immediate bytes, the `n` offsets as four immediate bytes each.

 `JUMPV` and `JUMPSUBV` are not strictly necessary.  They provide O(1) operations that can be replaced by O(n) or O(log n) EVM code using static jumps, but that code will be slower, larger and use more gas for things that can and should be fast, small, and cheap, and that are directly supported in WASM with br_table and call_indirect.

+### Variable Access
+
+These operations provide convenient access to subroutine parameters and other variables at fixed stack offsets within a subroutine.
+
+* `PUTLOCAL n`
+Pops the top value on the stack and copies it to local variable `n`.
+
+* `GETLOCAL n`
+Pushes the value of local variable `n` on the EVM stack.
+
+Local variable `n` is `FP[-n]` as defined below.
+
 ## SEMANTICS

 Jumps to and returns from subroutines are described here in terms of
-* the EVM data stack, usually just called “the stack”,
+* the EVM data stack (as defined in the [Yellow Paper](https://ethereum.github.io/yellowpaper/paper.pdf)), usually just called “the stack”,
 * a return stack of `JUMPSUB` and `JUMPSUBV` offsets, and
-* a virtual stack of frame pointers (not needed at runtime).
+* a frame stack of frame pointers.

 We will adopt the following conventions to describe the machine state:
 * The _program counter_ `PC` is (as usual) the byte offset of the currently executing instruction.
-* The _stack pointer_ `SP` corresponds to the number of items on the stack - the _stack size_.  As an offset it addresses the current top of the stack of data values, where new items are pushed.
-* The virtual _frame pointer_ `FP` is set to `SP - n_args` at entry to the currently executing subroutine.
+* The _stack pointer_ `SP` corresponds to the number of items on the stack - the _stack size_.  As a negative offset it addresses the current top of the stack of data values, where new items are pushed.
+* The _frame pointer_ `FP` is set to `SP + n_args` at entry to the currently executing subroutine.
 * The _stack items_ between the frame pointer and the current stack pointer are called the _frame_.
-* The current number of items in the frame, `SP - FP`, is the _frame size_.
+* The current number of items in the frame, `FP - SP`, is the _frame size_.

-Placing the frame pointer at the beginning of the arguments rather than the end is unconventional, but better fits our stack semantics and simplifies the remainder of the proposal.
+Defining the frame pointer so as to include the arguments is unconventional, but better fits our stack semantics and simplifies the remainder of the proposal.

-Note that frame pointers and the frame pointer stack, being virtual, are only needed for the following descriptions of subroutine semantics, not for their actual implementation.  Also, the return stack is internal to the subroutine mechanism, and not directly accessible to the program.
+The frame pointer and return stacks are internal to the subroutine mechanism, and not directly accessible to the program.  This is necessary to prevent the program from modifying its state in ways that could be invalid.

 The first instruction of an array of EVM bytecode begins execution of a _main_ routine with no arguments, `SP` and `FP` set to 0, and with one value on the return stack - `code size - 1`. (Executing the virtual byte of 0 after this offset causes an EVM to stop.  Thus executing a `RETURNSUB` with no prior `JUMPSUB` or `JUMPSUBV` - that is, in the _main_ routine - executes a `STOP`.)

 Execution of a subroutine begins with `JUMPSUB` or `JUMPSUBV`, which
 * push `PC` on the return stack,
-* push `FP` on the virtual frame stack,
+* push `FP` on the frame stack,
 thus suspending execution of the current subroutine, and
-* set `FP` to `SP - n_args`, and
+* set `FP` to `SP + n_args`, and
 * set `PC` to the specified `BEGINSUB` address,
 thus beginning execution of the new subroutine.
 (The _main_ routine is not addressable by `JUMPSUB` instructions.)
@@ -155,7 +167,7 @@ To handle the return stack we expand the conditions on stack size:
 >2b  The size of the return stack does not exceed 1024.

 Given our more detailed description of the data stack we restate condition 3 - stack underflow - as
->3  `SP` must be greater than or equal to `FP`
+>3  `SP` must be less than or equal to `FP`

 Since the various `DUP` and `SWAP` operations are formalized as taking items off the stack and putting them back on, this prevents `DUP` and `SWAP` from accessing data below the frame pointer, since taking too many items off of the stack would mean that `SP` is less than `FP`.

@@ -337,10 +349,15 @@ All of the instructions are O(1) with a small constant, requiring just a few mac

 We tentatively suggest the following opcodes:
 ```
-0x4C JUMPTO     0x5C BEGINSUB
-0x4D JUMPIF     0x5D BEGINDATA
-0X4E JUMPSUB    0x5E RETURNSUB
-0x4F JUMPSUBV
+0xB0 JUMPTO
+0xB1 JUMPIF
+0xB2 JUMPSUB
+0xB4 JUMPSUBV
+0xB5 BEGINSUB
+0xB6 BEGINDATA
+0xB8 RETURNSUB
+0xB9 PUTLOCAL
+0xBA GETLOCAL
 ```

 ### GETTING THERE FROM HERE
@@ -350,6 +367,6 @@ These changes would need to be implemented in phases at decent intervals:

 >2 A later hard fork would require clients to place only valid code on the block chain.  Note that despite the fork old EVM code will still need to be supported indefinitely.

-If desired, the period of deprecation can be extended indefinitely by continuing to accept code not versioned as new - but without validation.  Since we must continue to run old code this is not technically difficult. 
+If desired, the period of deprecation can be extended indefinitely by continuing to accept code not versioned as new - but without validation.  That is, by delaying step 2.  Since we must continue to run old code this is not technically difficult. 

-Implementation of this proposal need not be difficult,  At the least, interpreters can simply be extended with the new opcodes and run unchanged otherwise.  The new opcodes require only a stack for the return offsets and the few pushes, pops, and assignments described above.  JIT code can use native calls.  Further optimizations include minimizing runtime checks for exceptions and taking advantage of validated code wherever possible.
+Implementation of this proposal need not be difficult.  At the least, interpreters can simply be extended with the new opcodes and run unchanged otherwise.  The new opcodes require only stacks for the frame pointers and return offsets and the few pushes, pops, and assignments described above.  JIT code can use native calls.  Further optimizations include minimizing runtime checks for exceptions and taking advantage of validated code wherever possible.
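
Reviewer's note (not part of the patch): the bookkeeping that the revised text asks of an interpreter can be sketched roughly as below. This is a minimal toy model under stated assumptions, not the proposal's reference implementation. The class `Machine`, the helper `select_target`, and all field names are hypothetical. It assumes `SP` is the stack size expressed as a negative offset, that offset `a` maps to list index `-a`, and that local variable `n` (`FP[-n]`) is therefore zero-based from the first argument; the draft does not pin these details down.

```python
from dataclasses import dataclass, field


def select_target(offsets, index):
    """JUMPV / JUMPSUBV selection: an index >= n - 1 picks the last (default) offset."""
    return offsets[min(index, len(offsets) - 1)]


@dataclass
class Machine:
    """Toy model of the machine state; all names here are illustrative."""
    data: list = field(default_factory=list)          # data stack, bottom first
    return_stack: list = field(default_factory=list)  # saved PC values
    frame_stack: list = field(default_factory=list)   # saved FP values
    pc: int = 0
    fp: int = 0  # frame pointer, a non-positive offset in this model

    @property
    def sp(self):
        # Assumption: SP is the stack size, used as a negative offset.
        return -len(self.data)

    def frame_size(self):
        # The diff defines the frame size as FP - SP.
        return self.fp - self.sp

    def jumpsub(self, beginsub, n_args):
        # The callee's arguments must come from the caller's own frame.
        if n_args > self.frame_size():
            raise IndexError("stack underflow")
        self.return_stack.append(self.pc)  # push PC on the return stack
        self.frame_stack.append(self.fp)   # push FP on the frame stack
        self.fp = self.sp + n_args         # set FP to SP + n_args
        self.pc = beginsub                 # set PC to the BEGINSUB address

    def _slot(self, n):
        # FP[-n] is offset FP - n; offset a maps to list index -a (assumption).
        addr = self.fp - n
        if n < 0 or addr <= self.sp:
            raise IndexError("local variable outside the current frame")
        return -addr

    def getlocal(self, n):
        # Push the value of local variable n.
        self.data.append(self.data[self._slot(n)])

    def putlocal(self, n):
        # Pop the top of the stack and store it in local variable n.
        value = self.data.pop()
        self.data[self._slot(n)] = value
```

For example, with three items on the stack and a call taking two arguments, `FP` lands just below the two arguments, so local 0 is the deeper argument and the frame size at entry equals `n_args`.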