Ahhh I'm Huffing - Low Level EVM Programming in Huff
07/14/23 · 8 min read
by Mantle
Note: This article is aimed at proficient Solidity/Vyper developers with a good understanding of advanced concepts such as the EVM stack, calldata, memory, and storage.
Huff is a low-level programming language designed for developing highly optimized smart contracts that run on the Ethereum Virtual Machine (EVM). Huff does not hide the inner workings of the EVM. Instead, Huff exposes its programming stack to the developer for manual manipulation. — Huff
Prelude
Gas golfing is something that is uniquely Ethereum. The concept is simple: write code that accomplishes a task using the least amount of gas. This works because each opcode has its own gas cost. For example, according to EVM.codes, the ADD opcode costs 3 gas, while the MUL opcode costs 5 gas. While compilers generally do a decent job of optimizing code to minimize gas consumption, they are not a silver bullet for all gas-related concerns.
Gas golfing is something that Huff is great for.
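To make the idea concrete, here is a minimal Python sketch that sums the static gas cost of two opcode sequences that both double the value on top of the stack. The gas figures are the static costs listed on EVM.codes; the helper name gas_cost and the two sequences are illustrative assumptions, not part of Huff.

```python
# Static gas costs for a few opcodes (as listed on evm.codes).
GAS = {"PUSH1": 3, "DUP1": 3, "ADD": 3, "MUL": 5}


def gas_cost(opcodes):
    """Sum the static gas cost of a straight-line opcode sequence."""
    return sum(GAS[op] for op in opcodes)


# Two ways to double the value on top of the stack.
double_with_mul = ["PUSH1", "MUL"]  # push 2, then multiply
double_with_add = ["DUP1", "ADD"]   # duplicate the value, then add it to itself

print(gas_cost(double_with_mul))  # 8
print(gas_cost(double_with_add))  # 6
```

Both sequences leave the same result on the stack, but the second is 2 gas cheaper. This is the kind of choice a low-level language lets you make by hand.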
Why Huff?
Huff was initially introduced by the Aztec Protocol for the purpose of writing a highly optimized elliptic curve arithmetic library. Over time, it has gained widespread popularity and cultivated a dedicated user base.
The language's success is evident through the emergence of community plugins and tooling, which seamlessly integrate with popular industry-standard tools like Forge and Hardhat.
With its strong community support, robust tooling, and user-friendly language design, Huff stands as the clear preference for those aiming to delve deep into the bare metal of EVM programming.
Development Environment
To install the Huff compiler, run the command below:
curl -L get.huff.sh | bash
Once that is done, restart your terminal session or reload your shell path and run huffup. If you can run huffc --version, then you're ready for the next step.
Huff Features
One of the key reasons for using Huff is to avoid the burden of manually managing the program counter while keeping complete control over low-level features such as jump destinations, the stack, memory, and calldata.
Huff exposes two building blocks to achieve this:
- Macros
- Jump labels
Macros
A macro is defined much like a function (take X arguments, return Y pieces of data), but it operates slightly differently behind the scenes: it takes X pieces of data from the stack and puts Y pieces of data onto the stack.
// Macro that takes 1 piece of data from the stack and
// puts 2 pieces of data onto the stack
#define macro TAKE_ONE_PUT_TWO() = takes (1) returns (2) {
    dup1 // [value, value]; illustrative body that duplicates the top item
}
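The takes/returns annotation can be pictured with a small Python model in which a list plays the role of the stack (the last element is the top). This is only a sketch: the function name take_one_put_two is hypothetical, and it assumes, for illustration, a macro body that duplicates the top item.

```python
def take_one_put_two(stack):
    """Model of a macro declared with takes (1) returns (2)."""
    value = stack.pop()   # takes (1): consume one item from the stack
    stack.append(value)   # returns (2): leave two items behind,
    stack.append(value)   # here the original value and a copy of it
    return stack


stack = [7]               # the last list element is the top of the stack
take_one_put_two(stack)
print(stack)              # [7, 7]
```

Unlike a function call, nothing is passed or returned explicitly: the macro's code is pasted in place and works directly on whatever is already on the stack.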
Jump Labels
In Huff, there are no "ifs"; instead, execution jumps to a destination when a certain condition is met (or not). This enables efficient execution flows:
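#define jumptable JUMP_TABLE {
    lsb_0 lsb_1
}
#define macro EXAMPLE() = takes(0) returns(0) {
    0x01 // [0x01]
    // Copy the jump table from the bytecode into memory at 0x00
    __tablesize(JUMP_TABLE) __tablestart(JUMP_TABLE) 0x00 codecopy
    // Use the first calldata word as a memory offset into the table, then jump
    0x00 calldataload mload jump
    lsb_0:
        0x01 add
    lsb_1:
        0x02 add
}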
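One detail worth noticing in the example above is that labels only mark positions in the code, so execution falls through from lsb_0 into lsb_1. The Python sketch below models just that part (the table lookup itself is left out); the program list and the run_from helper are assumptions made for illustration.

```python
# The labelled part of EXAMPLE, as a flat list of (operation, argument) pairs.
PROGRAM = [
    ("label", "lsb_0"),
    ("add", 0x01),
    ("label", "lsb_1"),
    ("add", 0x02),
]


def run_from(label, stack_top=0x01):
    """Jump to `label` and run to the end, starting with 0x01 on the stack."""
    start = PROGRAM.index(("label", label))
    for op, arg in PROGRAM[start:]:
        if op == "add":
            stack_top += arg
    return stack_top


print(run_from("lsb_0"))  # 4: both additions run
print(run_from("lsb_1"))  # 3: only the second addition runs
```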
Simple Bank Example
I believe the best way to learn is by example. As such, we will explore a simple bank written in Huff where you can:
- Increment a value
- Set a value
- Retrieve the value
To begin, we will define the 3 functions that our contract will contain:
increment()
setValue(uint256 value)
getValue()
In Huff, we can generate the 4-byte function signatures using the #define function [FUNCTION NAME] syntax:
// Interface
#define function increment() nonpayable returns () // 0xd09de08a
#define function setValue(uint256) nonpayable returns () // 0x55241077
#define function getValue() view returns (uint256) // 0x20965255
When storing a value, defining a storage slot becomes essential. Rather than manually assigning slot 0 (which works in our case as we are only storing one variable), we can instead leverage Huff's built-in FREE_STORAGE_POINTER().
#define constant VALUE_SLOT = FREE_STORAGE_POINTER()
We will then define our macros. By convention, the state of the stack after each instruction is written as a comment next to that line of code.
INCREMENT
#define macro INCREMENT() = takes (0) returns (0) {
    [VALUE_SLOT] // [0]
    sload        // [var_value]
    0x01         // [0x01, var_value]
    add          // [var_value++]
    [VALUE_SLOT] // [0, var_value++]
    sstore       // []; stores updated value
    stop
}
To increase the stored value, we:
- Push the constant VALUE_SLOT onto the stack. VALUE_SLOT is 0 in our case, as it is the only value stored in storage.
- Use the SLOAD opcode, which takes the top value of the stack as the key, loads the value from storage using that key, and puts the loaded value onto the top of the stack.
- Push a new value, 1, onto the stack.
- Call the ADD opcode to add 1 and the value loaded in the second step. This consumes two values from the stack and puts their sum onto the top of the stack.
- Push the constant VALUE_SLOT onto the stack again.
- Call the SSTORE opcode, which takes the first value (VALUE_SLOT) from the stack as the storage location and the second value (var_value++) as the value to be stored.
- Call the STOP opcode, which stops execution.
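The steps above can be replayed in a few lines of Python, with a list as the stack and a dict as contract storage. This is a rough model for following the stack comments, not an EVM implementation; the names increment and WORD are assumptions.

```python
WORD = 2**256  # EVM words are 256 bits wide, so ADD wraps around at 2**256


def increment(storage, value_slot=0):
    """Replay INCREMENT step by step; the last list element is the stack top."""
    stack = []
    stack.append(value_slot)                          # [VALUE_SLOT]
    stack.append(storage.get(stack.pop(), 0))         # sload  -> [var_value]
    stack.append(0x01)                                # [0x01, var_value]
    stack.append((stack.pop() + stack.pop()) % WORD)  # add    -> [var_value + 1]
    stack.append(value_slot)                          # [VALUE_SLOT, var_value + 1]
    key = stack.pop()                                 # sstore pops the slot first,
    value = stack.pop()                               # then the value to store
    storage[key] = value
    return stack                                      # [] once sstore has run


storage = {}
increment(storage)
increment(storage)
print(storage)  # {0: 2}
```

Note that the macro has no overflow check: in this model, incrementing a slot that holds 2**256 - 1 wraps back to 0, which is also how the ADD opcode behaves.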
SET_VALUE
#define macro SET_VALUE() = takes (0) returns (0) {
    0x04         // [0x04]
    calldataload // [new_value]
    [VALUE_SLOT] // [value_slot, new_value]
    sstore       // []; stores new_value
    stop
}
To set the stored value, we:
- Push 0x04 onto the stack.
- Call CALLDATALOAD to read 32 bytes of calldata, starting from the offset on the top of the stack (0x04). We start at offset 4 because the function signature occupies the first 4 bytes and should be skipped.
- Push the constant VALUE_SLOT onto the stack.
- Call the SSTORE opcode, which takes the first value (VALUE_SLOT) from the stack as the storage location and the second value (new_value) as the value to be stored.
- Call the STOP opcode, which stops execution.
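As a sketch of what the CALLDATALOAD step does, the Python function below reads one 32-byte word from a given offset and pads with zeros when the calldata is too short, which is how the opcode behaves. The helper name calldataload and the sample calldata (a setValue call with the argument 256) are chosen for illustration.

```python
def calldataload(calldata: bytes, offset: int) -> int:
    """Read a 32-byte big-endian word at `offset`, zero-padded past the end."""
    word = calldata[offset:offset + 32].ljust(32, b"\x00")
    return int.from_bytes(word, "big")


# 4-byte signature of setValue(uint256) followed by one 32-byte argument.
calldata = bytes.fromhex("55241077") + (256).to_bytes(32, "big")

print(calldataload(calldata, 0x04))  # 256: the argument, signature skipped
print(calldataload(calldata, 0x24))  # 0: reading past the end yields zeros
```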
GET_VALUE
#define macro GET_VALUE() = takes (0) returns (0) {
    [VALUE_SLOT] // [0]
    sload        // [var_value]
    0x00         // [0x00, var_value]
    mstore       // []; stores var_value into memory location 0x0
    0x20         // [0x20]
    0x00         // [0x00, 0x20]
    return
}
To get the stored value, we:
- Push the constant VALUE_SLOT onto the stack.
- Use the SLOAD opcode, which takes the top value of the stack as the key, loads the value from storage using that key, and puts the loaded value onto the top of the stack.
- Push 0x00 onto the stack.
- Call the MSTORE opcode, which stores var_value into memory location 0x00. We write the value to memory because return data is read from memory, not from the stack.
- Push 0x20 onto the stack.
- Push 0x00 onto the stack.
- Call the RETURN opcode, which takes the first value from the stack (0x00) as the memory offset of the return data and the second value from the stack (0x20) as the length of the return data (in memory).
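The memory round trip in these steps can be sketched in Python as follows. The model assumes a 32-byte scratch memory and a dict for storage; get_value is a hypothetical name, and the real EVM memory grows on demand rather than being preallocated.

```python
def get_value(storage, value_slot=0):
    """Model GET_VALUE: sload, mstore at 0x00, then return 0x20 bytes."""
    memory = bytearray(32)                          # scratch memory, zeroed
    value = storage.get(value_slot, 0)              # sload
    memory[0x00:0x20] = value.to_bytes(32, "big")   # mstore at offset 0x00
    offset, length = 0x00, 0x20                     # the two arguments of return
    return bytes(memory[offset:offset + length])


print(get_value({0: 1}).hex())
# 0000000000000000000000000000000000000000000000000000000000000001
```

The caller always receives a full 32-byte word, which is why a stored value of 1 comes back left-padded with zeros.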
MAIN (Entrypoint)
In Huff, MAIN() is a reserved keyword that serves as the entry point for Huff contracts. All calls to a contract will start from MAIN.
#define macro MAIN() = takes (0) returns (0) {
    // Load the function selector
    pc calldataload 0xE0 shr // [sig]
    // Dispatch: jump to the label whose signature matches
    dup1 __FUNC_SIG(increment) eq increment jumpi
    dup1 __FUNC_SIG(getValue) eq getValue jumpi
    dup1 __FUNC_SIG(setValue) eq setValue jumpi
    // If nothing matches, execution falls through to getValue
    getValue:
        GET_VALUE()
    setValue:
        SET_VALUE()
    increment:
        INCREMENT()
}
In the above code, we are:
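- Extracting the 4-byte function signature from the calldata with pc calldataload 0xE0 shr. Because pc is the very first instruction, it pushes 0 onto the stack, so calldataload reads the first 32 bytes of calldata; shifting that word right by 0xE0 (224) bits leaves only its first 4 bytes.
- Comparing it to the pre-defined function signatures and branching off if one matches.
- For example, if the calldata starts with 0xd09de08a, it will jump to the increment label; if it starts with 0x55241077, it will jump to setValue; and 0x20965255 will route it to getValue.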
To put it simply, the MAIN macro in Huff serves as a way to handle function routing.
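The routing logic can be mirrored in a short Python sketch. It uses the three signatures from the interface above; the route function and the SIGNATURES table are illustrative names. It also models one consequence of the label order in MAIN: when nothing matches, execution falls through to the first label, getValue.

```python
SIGNATURES = {
    0xD09DE08A: "increment",
    0x20965255: "getValue",
    0x55241077: "setValue",
}


def route(calldata: bytes) -> str:
    """Return the label MAIN would jump to for the given calldata."""
    # calldataload at offset 0, zero-padded, then shr 0xE0 keeps the top 4 bytes.
    word = int.from_bytes(calldata[:32].ljust(32, b"\x00"), "big")
    sig = word >> 0xE0
    for signature, label in SIGNATURES.items():
        if sig == signature:   # dup1 __FUNC_SIG(...) eq <label> jumpi
            return label
    return "getValue"          # no match: fall through to the first label


print(route(bytes.fromhex("d09de08a")))  # increment
print(route(bytes.fromhex("deadbeef")))  # getValue (fall-through)
```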
SimpleBank.huff
Putting everything together, we get:
// SimpleBank.huff
// Interface
#define function increment() nonpayable returns () // 0xd09de08a
#define function setValue(uint256) nonpayable returns () // 0x55241077
#define function getValue() view returns (uint256) // 0x20965255
// Storage definitions
#define constant VALUE_SLOT = FREE_STORAGE_POINTER()
// Functions
#define macro INCREMENT() = takes (0) returns (0) {
    [VALUE_SLOT] // [0]
    sload        // [var_value]
    0x01         // [0x01, var_value]
    add          // [var_value++]
    [VALUE_SLOT] // [0, var_value++]
    sstore       // []; stores updated value
    stop
}
#define macro SET_VALUE() = takes (0) returns (0) {
    0x04         // [0x04]
    calldataload // [value]
    [VALUE_SLOT] // [value_slot, value]
    sstore       // []; stores value
    stop
}
#define macro GET_VALUE() = takes (0) returns (0) {
    [VALUE_SLOT] // [0]
    sload        // [var_value]
    0x00         // [0x00, var_value]
    mstore       // []; stores var_value into memory location 0x0
    0x20         // [0x20]
    0x00         // [0x00, 0x20]
    return
}
#define macro MAIN() = takes (0) returns (0) {
    // Load the function selector
    pc calldataload 0xE0 shr // [sig]
    dup1 __FUNC_SIG(increment) eq increment jumpi
    dup1 __FUNC_SIG(getValue) eq getValue jumpi
    dup1 __FUNC_SIG(setValue) eq setValue jumpi
    getValue:
        GET_VALUE()
    setValue:
        SET_VALUE()
    increment:
        INCREMENT()
}
We can now compile it and get the runtime bytecode via the -r flag:
huffc SimpleBank.huff -r
This yields the runtime bytecode:
583560e01c8063d09de08a1461003a57806320965255146100265780635524107714610032575b60005460005260206000f35b600435600055005b60005460010160005560206000f3
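To see how the labels ended up in this bytecode, the Python sketch below scans it for JUMPDEST opcodes (0x5b) while skipping over the immediate data of PUSH instructions. The bytecode is the string above, split into chunks only for readability; jumpdests is a hypothetical helper, not a huffc feature.

```python
RUNTIME = (
    "583560e01c"                      # pc calldataload 0xE0 shr
    "8063d09de08a1461003a57"          # dup1 <increment sig> eq 0x003a jumpi
    "8063209652551461002657"          # dup1 <getValue sig>  eq 0x0026 jumpi
    "8063552410771461003257"          # dup1 <setValue sig>  eq 0x0032 jumpi
    "5b60005460005260206000f3"        # first labelled block
    "5b60043560005500"                # second labelled block
    "5b60005460010160005560206000f3"  # third labelled block
)


def jumpdests(code_hex):
    """Return the byte offsets of every JUMPDEST, skipping PUSH data."""
    code = bytes.fromhex(code_hex)
    found, pc = [], 0
    while pc < len(code):
        op = code[pc]
        if op == 0x5B:           # JUMPDEST
            found.append(pc)
        if 0x60 <= op <= 0x7F:   # PUSH1..PUSH32 carry 1..32 bytes of data
            pc += op - 0x5F
        pc += 1
    return found


print([hex(d) for d in jumpdests(RUNTIME)])  # ['0x26', '0x32', '0x3a']
```

The three offsets match the jump targets pushed before each jumpi (0x003a, 0x0026, and 0x0032), which is the bookkeeping Huff's labels take care of.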
Testing SimpleBank
A handy tool that I've been using to test low level bytecode is evm.codes.
Pasting the bytecode into the playground and running it with the getValue() function signature (0x20965255) as calldata, we can see that our return value is 0. This is expected, as we haven't initialized the value yet.
We can then call the increment() function (0xd09de08a) to increment the stored value, and we can see that storage slot 0x00 now has a value of 0x01.
If we were to run getValue() again, we can see that the return value is now 0x01! Just as expected.
Let's try the setValue(uint256) function. We will set the value to 256. Encoded properly, the calldata will be:
0x552410770000000000000000000000000000000000000000000000000000000000000100
You can do so with the help of this handy tool from hashex.
And if we run it in the playground, we can see in the storage section that slot 0 now has a value of 0x100 (256), which is what we expect.
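The setValue calldata used above can also be built by hand. The following Python sketch concatenates a 4-byte function signature with each argument encoded as a 32-byte big-endian word; it only handles uint256 arguments, and encode_call is a hypothetical helper name.

```python
def encode_call(signature_hex: str, *args: int) -> str:
    """Encode a call: 4-byte signature followed by 32-byte uint256 arguments."""
    data = bytes.fromhex(signature_hex)
    for arg in args:
        data += arg.to_bytes(32, "big")  # left-padded with zeros to 32 bytes
    return "0x" + data.hex()


# setValue(256): 4 bytes of signature plus one 32-byte word ending in 0100
print(encode_call("55241077", 256))

# getValue() takes no arguments, so the calldata is just the signature
print(encode_call("20965255"))  # 0x20965255
```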
Conclusion
Huff is like a double-edged sword: It allows you to get as close to pure EVM bytecode as possible, but if you don't know what you're doing, you can easily shoot your foot off.
The benefits it provides for extreme code optimization are unparalleled, but they come at the expense of modern abstractions (such as for loops).
Ultimately, it's an extremely useful tool to have in your tool belt. Just as you shouldn't use a hammer for every occasion, you shouldn't use Huff for every problem.