mirror of
https://github.com/status-im/EIPs.git
synced 2025-02-23 12:18:16 +00:00
* Switch EOF EIPs to Review * Add the description header to the EOF * Add test cases to 3540 * Add security considerations * Add reference code validation implementation to EIP-3540 * Fix the simplified validator in 3540 * Turn the simplified validator into python too
115 lines
5.1 KiB
Markdown
115 lines
5.1 KiB
Markdown
---
|
|
eip: 3670
|
|
title: EOF - Code Validation
|
|
description: Validate EOF bytecode for correctness at the time of deployment.
|
|
author: Alex Beregszaszi (@axic), Andrei Maiboroda (@gumb0), Paweł Bylica (@chfast)
|
|
discussions-to: https://ethereum-magicians.org/t/eip-3670-eof-code-validation/6693
|
|
status: Review
|
|
type: Standards Track
|
|
category: Core
|
|
created: 2021-06-23
|
|
requires: 3540
|
|
---
|
|
|
|
## Abstract
|
|
|
|
Introduce code validation at contract creation time for EOF formatted ([EIP-3540](./eip-3540.md)) contracts. Reject contracts which contain truncated `PUSH`-data or undefined instructions. Legacy bytecode (code which is not EOF formatted) is unaffected by this change.
|
|
|
|
## Motivation
|
|
|
|
Currently existing contracts require no validation of correctness and EVM implementations can decide how they handle truncated bytecode or undefined instructions. This change aims to bring code validity into consensus, so that it becomes easier to reason about bytecode. Moreover, EVM implementations may require less paths to decide which instruction is valid in the current execution context.
|
|
|
|
If it will be desired to introduce new instructions without bumping EOF version, having undefined instructions already deployed would mean such contracts potentially can be broken (since some of the instructions are changing their behaviour). Rejecting to deploy undefined instructions allows introducing new instructions with or without bumping the EOF version.
|
|
|
|
## Specification
|
|
|
|
*Remark:* We rely on the notation of *initcode*, *code* and *creation* as defined by [EIP-3540](./eip-3540.md).
|
|
|
|
This feature is introduced on the very same block EIP-3540 is enabled, therefore every EOF1-compatible bytecode MUST be validated according to these rules.
|
|
|
|
At contract creation time both *initcode* and *code* are iterated instruction-by-instruction (the same process is used to perform "JUMPDEST-analysis"). Bytecode is deemed invalid if any of these conditions is true:
|
|
- it contains opcodes which are not currently assigned to an instruction (for the sake of assigned instructions, we count `INVALID` (0xfe) as assigned),
|
|
- the data portion of a `PUSHn` (0x60..0x7f) instruction is beyond the end of the code section.
|
|
|
|
If *initcode* is invalid, the contract creation results in an exceptional abort.
|
|
If *code* is invalid, the contract creation results in an exceptional abort.
|
|
|
|
## Rationale
|
|
|
|
The deprecated `CALLCODE` (0xf2) opcode may be dropped from the `valid_opcodes` list to prevent use of this instruction in future. Likewise `SELFDESTRUCT` (0xff) could also be rejected. Yet we decided not to mix such changes in.
|
|
|
|
## Reference Implementation
|
|
|
|
```python
|
|
# The below are ranges as specified in the Yellow Paper.
|
|
# Note: range(s, e) excludes e, hence the +1
|
|
valid_opcodes = [
|
|
*range(0x00, 0x0b + 1),
|
|
*range(0x10, 0x1d + 1),
|
|
0x20,
|
|
*range(0x30, 0x3f + 1),
|
|
*range(0x40, 0x47 + 1),
|
|
*range(0x50, 0x5b + 1),
|
|
*range(0x60, 0x6f + 1),
|
|
*range(0x70, 0x7f + 1),
|
|
*range(0x80, 0x8f + 1),
|
|
*range(0x90, 0x9f + 1),
|
|
*range(0xa0, 0xa4 + 1),
|
|
# Note: 0xfe is considered assigned.
|
|
*range(0xf0, 0xf5 + 1), 0xfa, 0xfd, 0xfe, 0xff
|
|
]
|
|
|
|
# Fails with assertion on invalid code
|
|
def validate_code(code: bytes):
|
|
pos = 0
|
|
while pos < len(code):
|
|
# Ensure the opcode is valid
|
|
opcode = code[pos]
|
|
pos += 1
|
|
assert(opcode in valid_opcodes)
|
|
|
|
# Skip pushdata
|
|
if opcode >= 0x60 and opcode <= 0x7f:
|
|
pos += opcode - 0x60 + 1
|
|
|
|
# Ensure last PUSH doesn't go over code end
|
|
assert(pos == len(code))
|
|
```
|
|
|
|
## Test Cases
|
|
|
|
#### Contract creation
|
|
|
|
Each case should be tested for creation transaction, `CREATE` and `CREATE2`.
|
|
|
|
- Invalid initcode
|
|
- Valid initcode returning invalid code
|
|
- Valid initcode returning valid code
|
|
|
|
#### Valid codes
|
|
|
|
- EOF code containing `INVALID`
|
|
- EOF code with a code section ending with `PUSH` instruction followed by correct number of bytes of data
|
|
- EOF code with data section containing bytes that are undefined instructions
|
|
- Legacy code containing undefined instruction
|
|
- Legacy code ending with incomplete PUSH instruction
|
|
|
|
#### Invalid codes
|
|
|
|
- EOF code containing undefined instruction
|
|
- EOF code ending with incomplete `PUSH` instruction
|
|
- This can include `PUSH` instruction unreachable by execution, e.g. after `STOP`
|
|
|
|
## Backwards Compatibility
|
|
|
|
This change poses no risk to backwards compatibility, as it is introduced at the same time EIP-3540 is. The validation does not cover legacy bytecode (code which is not EOF formatted).
|
|
|
|
## Security Considerations
|
|
|
|
The *initcode* validation adds additional overhead. We consider adding additional gas cost for the contract creation proportional to the length of the code section in *initcode* (e.g. 3 gas per byte).
|
|
|
|
This is not an issue for *code* in average case because the deployment cost is already 200 gas per byte. However, an attack may be constructed where long *code* is first validated and then contract creation fails because of insufficient gas for the deployment cost. The solution is to either charge the deployment cost before validation, or add a similar charge as *initcode* will have.
|
|
|
|
## Copyright
|
|
|
|
Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/). |