EIPs/EIPS/eip-3670.md
Alex Beregszaszi a380e73193
Add EOF code validation draft (#3670)
* Add EOF validation draft

* Rename to EIP-3670

* Use new discussion url
2021-07-20 09:29:11 +00:00

4.3 KiB

eip title author discussions-to status type category created requires
3670 EOF - Code Validation Alex Beregszaszi (@axic), Andrei Maiboroda (@gumb0), Paweł Bylica (@chfast) https://ethereum-magicians.org/t/eip-3670-eof-code-validation/6693 Draft Standards Track Core 2021-06-23 3540

Abstract

Introduce code validation at contract creation time for EOF formatted (EIP-3540) contracts. Reject contracts which contain truncated PUSH-data or unassigned instructions. Legacy bytecode is unaffected by this change.

Motivation

Currently existing contracts require no validation of correctness and EVM implementations can decide how they handle truncated bytecode or unassinged instructions. This change aims to bring code validity into consensus, so that it becomes easier to reason about bytecode. Moreover, EVM implementations may require less paths to decide which instruction is valid in the current execution context.

If it will be desired to introduce new instructions without bumping EOF version, having unassigned instructions already deployed would mean such contracts potentially can be broken (since some of the instructions are changing their behaviour). Rejecting to deploy unassigned instructions allows introducing new instructions with or without bumping the EOF version.

Specification

Remark: We rely on the notation of initcode, code and creation as defined by EIP-3540.

This feature is introduced on the very same block EIP-3540 is enabled, therefore every EOF1-compatible bytecode MUST be validated according to these rules.

At contract creation time both initcode and code are iterated instruction-by-instruction (the same process is used to perform "JUMPDEST-analysis"). Bytecode is deemed invalid if any of these conditions is true:

  • it contains opcodes which are not currently assigned to an instruction (for the sake of assigned instructions, we count INVALID (0xfe) as assigned),
  • the data portion of a PUSHn (0x60..0x7f) instruction is beyond the end of the code section.

If initcode is invalid, the contract creation results in an exceptional abort. If code is invalid, the contract creation results in an exceptional abort.

Rationale

The deprecated CALLCODE (0xf2) opcode may be dropped from the valid_opcodes list to prevent use of this instruction in future. Likewise SELFDESTRUCT (0xff) could also be rejected. Yet we decided not to mix such changes in.

Reference Implementation

# The below are ranges as specified in the Yellow Paper.
# Note: range(s, e) excludes e, hence the +1
valid_opcodes = [
    *range(0x00, 0x0b + 1),
    *range(0x10, 0x1d + 1),
    0x20,
    *range(0x30, 0x3f + 1),
    *range(0x40, 0x47 + 1),
    *range(0x50, 0x5b + 1),
    *range(0x60, 0x6f + 1),
    *range(0x70, 0x7f + 1),
    *range(0x80, 0x8f + 1),
    *range(0x90, 0x9f + 1),
    *range(0xa0, 0xa4 + 1),
    # Note: 0xfe is considered assigned.
    *range(0xf0, 0xf5 + 1), 0xfa, 0xfd, 0xfe, 0xff
]

# Fails with assertion on invalid code
def validate_code(code: bytes):
    pos = 0
    while pos < len(code):
        # Ensure the opcode is valid
        opcode = code[pos]
        pos += 1
        assert(opcode in valid_opcodes)

        # Skip pushdata -- over-reading will be checked on next iteration
        if opcode >= 0x60 and opcode <= 0x7f:
            pos += opcode - 0x60 + 1
    
    # Ensure last PUSH doesn't go over code end
    assert(pos == len(code))

Test Cases

TBA

Backwards Compatibility

This change poses no risk to backwards compatibility, as it is introduced at the same time EIP-3540 is. The validation does not cover legacy bytecode.

Security Considerations

The initcode validation adds additional overhead. We consider adding additional gas cost for the contract creation proportional to the length of the code section in initcode (e.g. 3 gas per byte).

This is not an issue for code in average case because the deployment cost is already 200 gas per byte. However, an attack may be constructed where long code is first validated and then contract creation fails because of insufficient gas for the deployment cost. The solution is to either charge the deployment cost before validation, or add a similar charge as initcode will have.

Copyright and related rights waived via CC0.