Ethereum and Solidity: Application Binary Interfaces (and Function Signatures)

30 September 2017

In the last section, we covered some blockchain fundamentals and became a little acquainted with Solidity. Before we move toward setting up a development environment, let's take a few minutes to grasp some important concepts that many development frameworks conceal.

First off: please read the docs. The official documentation not only tells us how Solidity is intended to work; it also lets us intuit the rationale behind many of the design decisions. That intuition is invaluable, because it helps us determine what measures need to be taken when drafting robust contracts, especially once we later introduce complexity and shirk Solidity's safety mechanisms^[1]. It can only help to absorb the documentation.

Now, the core topic!

ABI (Part I)

An 'Application Binary Interface' (ABI) is the bridge between the application code (the higher level code we smart programmers may write) and the lower level, binary code that gets processed by dumb computers.

In the context of Ethereum, an ABI is how we are able to interact with smart contracts once they are put onto the Ethereum network. Remember, while we may write contracts in Solidity, we compile (specifically, using solc—the SOLidity Compiler) the Solidity code we write into bytecode. This is because the Ethereum Virtual Machine (EVM) cannot run Solidity—it knows nothing about Solidity. The EVM only knows opcodes (those things defined in Appendix H of the yellow paper). Solidity is just an abstraction^[2].

The official documentation states the reason for ABIs: "The encoding is not self describing and thus requires a schema in order to decode". (Strictly, 'encoding' there refers to the binary data passed to and from a contract, but the same is true of the compiled bytecode itself. You may also want to note that in the next paragraph, the documentation goes on to say, "No introspection mechanism will be provided.")

So, what this is telling us is that the bytecode for a contract is not something that we can look at and use to figure out how we are to interact with the contract. The ABI is the schema they are referring to. Note that an ABI is not actually part of the Ethereum protocol: you can write bytecode without an Application Binary Interface (or with an ABI of your own definition). While no ABI is defined as part of the Ethereum protocol, there is a standard ABI, used by seemingly everyone, that serves as the conventional bridge between Solidity (as well as the other popular higher level languages) and bytecode.

"Why isn't the encoding (bytecode) self describing?" you may ask. The reason that we require a separate schema to assist us in interactions with the encoding is that the Ethereum Virtual Machine is an abstraction (only the essentials) of the operations a processor would typically perform when given machine code. This is similar to Java and the Java Virtual Machine. Different processors and hardware have different instruction sets, so the abstraction is necessary if we want some arbitrary code to run across multiple machines of varying hardware: programmers code in a higher level language that compiles into bytecode, and that bytecode is translated into the proper machine language for a particular machine. This scheme allows programmers to focus on the rules for their systems without having to worry about whether the code will run on Windows, or Linux, or macOS, etc.

When a Solidity contract is compiled, all the variable names, function names, and even the contract name are excluded from the resulting bytecode. The resulting bytecode is merely the opcodes the EVM needs, and nothing more. The variable names and all that stuff were actually just helpful tools and symbols to aid the development process. Because this development-time information isn't contained in the bytecode, we need a separate schema: separate lists of the properties, functions, and events that our contracts have. That is the need for an ABI.

Function Signatures

But, even with a list of all the properties and functions, how will we interact with them on a contract that lives on the Ethereum network? Well, before we answer that, let's simplify. One of the things that Solidity does is generate functions for the properties that are declared public (interactable); these functions simply return the value of the property. The mechanism that is used to interact with properties thus becomes the same mechanism that is used for interacting with functions, so only one mechanism is needed.

Now, how does the mechanism work? The mechanism is surprisingly simple, and we'll walk through our own design process of sorts to understand the concerns. Let us articulate what we require: a way to call a specific 'function' that exists in bytecode. We would need some kind of unique identifier for each function; what we need is called a 'function signature'.

A simple way, if you and I were to have designed it, might be to use the function or property names in the bytecode. An example function we might want to invoke could be exampleFunction(), a made-up function for demonstration purposes that'll take some int and do something with it. We might just design our EVM code to work with labels directly, so somewhere in the EVM code there might be a label exampleFunction, and based on some input, we would jump (go to) the label relevant to the input.

But what would we do in the case of overloaded functions (multiple functions with the same name, but accepting different arguments)? Well, then we would have to include the parameters (arguments) in the identifier that goes into the bytecode. So, for two example functions sharing the same name, we might have exampleFunction(int thing) and exampleFunction(int thing1, int thing2) (ignore the illegal characters for now).

There are a couple of concerns with this approach, though. The first is that, for functions with a lot of parameters, the function signatures could get very long and take up additional storage to fully encode. Remember that blockchain storage is expensive: whatever data is put on the chain will stay there for as long as the community carries on the project, and judging by the fact that [EXPLETIVE] FidoNet still lives, that'll be pretty much forever (or until the sun blows up and kills us all). The point is, it's expensive, so we need to look for parsimonious ways to shorten the signature while keeping it unique to the function we intend to call. One easy way is by omitting any spaces. Let's do that. It's also good to make a rule like this just for consistency.

Now, another thing we don't actually need in function signatures is the names of the parameters, since those are only used in the function body (that is, used by the code in the function), so having just the types of the parameters will suffice. But we have an important consideration: the type we have chosen for our example is int, which is really just another name for int256. We'll have to change that to get rid of any potential ambiguity (among programmers with experience in other languages, 32-bit integers are typically the default, so 256-bit integers by default might be unexpected or even seem a little insane). Let's be explicit and consistent.

So, with all these considerations, this is where we have ended up:
exampleFunction(int256)
exampleFunction(int256,int256)
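
The clean-up rules we just settled on (drop the spaces, drop the parameter names, spell out aliases) can be sketched in a few lines of Python. This is only an illustration: canonical_signature and the ALIASES table are made-up names, not part of any Ethereum tooling, and the parsing assumes simple parameters written as "type name".

```python
# Hypothetical helper (not from any library): turn a human-written declaration
# such as "exampleFunction(int thing1, int thing2)" into its canonical form.

# Aliases that get spelled out in full (int is shorthand for int256, and so on).
ALIASES = {"int": "int256", "uint": "uint256"}


def canonical_signature(declaration: str) -> str:
    # Split "name(params)" into the name and the raw parameter list.
    name, _, rest = declaration.partition("(")
    params = rest.rsplit(")", 1)[0]

    types = []
    for param in params.split(","):
        words = param.split()
        if not words:  # empty parameter list, e.g. "owner()"
            continue
        type_name = words[0]  # first word is the type; the parameter name is dropped
        types.append(ALIASES.get(type_name, type_name))

    # Join without spaces so the result is always written the same way.
    return name.strip() + "(" + ",".join(types) + ")"


print(canonical_signature("exampleFunction(int thing)"))
print(canonical_signature("exampleFunction(int thing1, int thing2)"))
```

Running it prints the two signatures shown above, exampleFunction(int256) and exampleFunction(int256,int256).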

Hmmm . . . We end up with things of varying lengths, and they still can get pretty lengthy if they have many parameters. :/ Now, these are the core signatures of the functions that we will want to interact with, but remember—this is a cryptosystem—we haven't been thinking with fancy cryptographic tools in mind.

Omitted from the first Solidity/Blockchain post entirely was the concept of 'hashing'. We will define hashing as a means of fingerprinting a chunk of data. It is a process usually used to come up with (practically) unique identifiers for files so that people can check whether there have been any changes to the files since they were initially 'hashed'. 'Hashing algorithms' are the means of doing this, with the result being a 'hash code' (or 'hash' for short; 'hash' is used as a verb, too), which is a numerical value and as such can be represented in decimal (0–9), hexadecimal (0–F), etc. There are many different hashing algorithms around, and Ethereum largely uses Keccak, the algorithm that was selected to become the latest 'Secure Hash Algorithm'. Within Ethereum it was originally called SHA3, but unfortunately the standard that was eventually finalized as SHA-3 differs slightly from the version Ethereum had adopted, so the two produce different hashes. Now we all try to call Ethereum's version by its proper name, Keccak-256. (The 256 means the resulting hash code is 256 bits, or 32 bytes, in size.)
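
To get a feel for the 'fingerprint' idea, here is a small sketch using Python's built-in hashlib. One loud caveat: hashlib.sha3_256 is the finalized SHA-3 standard, not the Keccak-256 that Ethereum uses, so these digests will not match anything on Ethereum. The general properties are the same, though: a fixed-length result, the same input always giving the same hash, and a different input giving a different one. The fingerprint function is just an illustrative name.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    # NOTE: this is standardized SHA3-256, NOT Ethereum's Keccak-256.
    # It is used here only to illustrate what hashing does in general.
    return hashlib.sha3_256(data).hexdigest()


first = fingerprint(b"exampleFunction(int256)")
second = fingerprint(b"exampleFunction(int256,int256)")
again = fingerprint(b"exampleFunction(int256)")

print(first)
print(second)
print(len(first))      # 64 hexadecimal characters = 32 bytes = 256 bits
print(first == again)  # same input, same fingerprint
print(first == second) # different input, different fingerprint
```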

With hashing now in mind, the last step is to put the function signature through the Keccak-256 algorithm, and then we'll get a unique fingerprint of the function signature of a standard length. Now, smart contracts will probably not have so many functions and properties where we need a 32 byte number—the max value for a 256-bit unsigned integer is:

115,792,089,237,316,195,423,570,985,008,687,907,853,269,984,665,640,564,039,457,584,007,913,129,639,935

That's two to the two-hundred-fifty-sixth power minus one (2**256 - 1). We're probably not going to have that many functions, so we can just shorten the resulting hash code. How about just four bytes (or 32 bits)? That gives us 4,294,967,295 as the max value (2**32 - 1). A bit more reasonable; it's a number we can at least say. Remember from the first section that a byte can be represented with two hexadecimal characters, so we end up with a number representable in eight characters, which is fairly tolerable (and doesn't contain any illegal characters).
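
The arithmetic is easy to check in Python, which handles arbitrarily large integers. A minimal sketch (the variable names are only for illustration); note that an unsigned 32-bit value tops out at 4,294,967,295, while 2,147,483,647 is the ceiling only when one of the 32 bits is reserved for a sign:

```python
max_256_bit = 2**256 - 1        # largest unsigned 256-bit (32-byte) value
max_32_bit = 2**32 - 1          # largest unsigned 32-bit (4-byte) value
max_signed_32_bit = 2**31 - 1   # largest value if one bit is used for the sign

print(f"{max_256_bit:,}")
print(f"{max_32_bit:,}")
print(f"{max_signed_32_bit:,}")

# Four bytes written in hexadecimal: two characters per byte, eight in total.
as_hex = f"{max_32_bit:x}"
print(as_hex, len(as_hex))
```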

So, a summary of what our process will be: write the function name followed by the parameter types in parentheses (unaliased, comma-separated, no spaces). This is what would be called the function signature. Then hash the function signature (using the Keccak-256 hash algorithm), and take the first 4 bytes of the resulting hash. That number is what we could put in our bytecode that'll run on the EVM. Well, actually, this is exactly what solc does for our Solidity smart contract properties and functions! Know that the resulting number should be called the MethodID, but note that it is often also referred to as the function signature, just to make things more confusing. (Here's a Stack Exchange post that shows some pseudo-EVM code so you can sort of see what it looks like in action.)
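
The whole process can be sketched in Python. Since Python's standard library only ships the finalized SHA-3 (not Ethereum's Keccak-256), this sketch includes a compact, from-scratch Keccak-256, written for readability rather than speed; for real work you would reach for a vetted library instead. The names keccak256 and method_id are illustrative choices, not an official API.

```python
MASK_64 = (1 << 64) - 1
RATE = 136  # bytes absorbed per block for Keccak-256 (1088-bit rate)


def _rol64(value, shift):
    """Rotate a 64-bit lane left by `shift` bits."""
    shift %= 64
    return ((value << shift) | (value >> (64 - shift))) & MASK_64


def _keccak_f1600(lanes):
    """Apply the 24-round Keccak-f[1600] permutation to a 5x5 grid of lanes."""
    lfsr = 1  # generates the round constants on the fly
    for _ in range(24):
        # theta: mix each column's parity into its neighbouring columns
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ _rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi: rotate each lane and move it to a new position
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], _rol64(current, (t + 1) * (t + 2) // 2)
        # chi: the non-linear step, applied row by row
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota: XOR the round constant into lane (0, 0)
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    return lanes


def keccak256(data: bytes) -> bytes:
    """Original Keccak-256 (padding byte 0x01; the SHA-3 standard uses 0x06)."""
    padded = bytearray(data)
    pad_len = RATE - (len(padded) % RATE)
    padded += b"\x01" + b"\x00" * (pad_len - 1)
    padded[-1] |= 0x80

    state = bytearray(200)
    for offset in range(0, len(padded), RATE):
        for i in range(RATE):
            state[i] ^= padded[offset + i]
        lanes = [[int.from_bytes(state[8 * (x + 5 * y):8 * (x + 5 * y) + 8], "little")
                  for y in range(5)] for x in range(5)]
        lanes = _keccak_f1600(lanes)
        for x in range(5):
            for y in range(5):
                state[8 * (x + 5 * y):8 * (x + 5 * y) + 8] = lanes[x][y].to_bytes(8, "little")
    return bytes(state[:32])


def method_id(signature: str) -> str:
    """Hash the function signature and keep the first 4 bytes, as hex."""
    return keccak256(signature.encode("ascii"))[:4].hex()


# Known-answer check: the Keccak-256 hash of empty input.
assert keccak256(b"").hex() == (
    "c5d2460186f7233c927e7db2dcc703c0e500b653ca82273b7bfad8045d85a470")

print(method_id("exampleFunction(int256)"))
print(method_id("exampleFunction(int256,int256)"))
```

Each call prints eight hexadecimal characters. The two overloads get different MethodIDs even though they share a name, which is the whole point of including the parameter types.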

solc uses this approach when compiling the Solidity contract into bytecode, so if we use that same approach when wanting to interact with the contract, we'll come up with the same MethodID that exists in the bytecode, and should be able to invoke the desired function.

ABI (Part II)

We've digressed from discussing the actual ABI, though. Remember, the ABI is just the core, essential information needed to bridge from the application level to the bytecode level, such as what is needed to carry out the above steps and get the encoding for a function (the compiler does this when compiling the contract, and we'll do it when calling functions on our deployed contract). The conventional ABI description is essentially just JSON.

For example, here is our finished contract from the Solidity introduction:

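contract ExampleContract {
    // The account allowed to hand ownership to someone else.
    // Being public, it also gets a getter function, owner().
    address public owner;

    // Constructor: whoever deploys the contract becomes its first owner.
    function ExampleContract() {
        owner = msg.sender;
    }

    // Returns true if _account is the current owner.
    function isOwner(address _account) public constant returns (bool) {
        // Local renamed so that it does not shadow the isOwner function.
        bool accountIsOwner = (owner == _account);
        return accountIsOwner;
    }

    // Transfers ownership; does nothing if the caller is not the owner.
    function changeOwner(address _newOwner) public {
        if (isOwner(msg.sender)) {
            owner = _newOwner;
        }
    }
}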

Now, here is the ABI that gets generated when compiling the above contract:

[{
    "constant": true,
    "inputs": [{
        "name": "_account",
        "type": "address" }],
    "name": "isOwner",
    "outputs": [{
        "name": "",
        "type": "bool" }],
    "payable": false,
    "type": "function"
},{
    "constant": true,
    "inputs": [],
    "name": "owner",
    "outputs": [{
        "name": "",
        "type": "address" }],
    "payable": false,
    "type": "function"
},{
    "constant": false,
    "inputs": [{
        "name": "_newOwner",
        "type": "address" }],
    "name": "changeOwner",
    "outputs": [],
    "payable": false,
    "type": "function"
},{
    "inputs": [],
    "payable": false,
    "type": "constructor"
}]

The resulting ABI looks a little bit longer than the contract. This is because of the formatting, and because our contract's method bodies are really short. More complicated contracts would likely have ABIs that are significantly shorter than the actual Solidity contract code.
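
Since the ABI is just JSON, getting back from it to the function signatures discussed earlier takes very little code. Here is a minimal Python sketch using the ABI above (minified); signatures_from_abi is a made-up helper name, and it assumes every input has a simple type such as address or bool.

```python
import json

# The ABI generated for ExampleContract, with the whitespace minified.
ABI_JSON = """
[{"constant":true,"inputs":[{"name":"_account","type":"address"}],"name":"isOwner","outputs":[{"name":"","type":"bool"}],"payable":false,"type":"function"},
 {"constant":true,"inputs":[],"name":"owner","outputs":[{"name":"","type":"address"}],"payable":false,"type":"function"},
 {"constant":false,"inputs":[{"name":"_newOwner","type":"address"}],"name":"changeOwner","outputs":[],"payable":false,"type":"function"},
 {"inputs":[],"payable":false,"type":"constructor"}]
"""


def signatures_from_abi(abi_json: str) -> list:
    """Build the function signature for every function entry in an ABI."""
    signatures = []
    for entry in json.loads(abi_json):
        if entry.get("type") != "function":
            continue  # the constructor has no name, so it has no signature to build
        # Only the parameter types matter; the parameter names are dropped.
        types = ",".join(param["type"] for param in entry["inputs"])
        signatures.append(entry["name"] + "(" + types + ")")
    return signatures


for signature in signatures_from_abi(ABI_JSON):
    print(signature)
```

This prints isOwner(address), owner(), and changeOwner(address). Hashing each of those with Keccak-256 and keeping the first four bytes gives the MethodIDs that sit in the compiled bytecode.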

Bonus

Let's go back to the docs: remember that the documentation said, "No introspection mechanism will be provided"? A pity. I don't know about you, but I'm pretty disorganized at times. I might misplace my ABI after deploying a contract.

We'll just have to make our own introspection mechanism, I guess. This is an unfinalized idea.

This is how it'll work (this is an abstract base):

contract IQueryable {
    function getABI() constant returns (string);
    function getParentTypes() constant returns (string);
}

It could either be a contract or an interface at this point.
getABI() (or potentially getAbi(), depending on a decision of what the style will be) is intended to return the ABI of a contract as a string with minified whitespace.
getParentTypes() is intended to return a comma delimited list of all the types that the contract implementing the interface derives from. It is not intended to be recursive (one could define such a method either on the contract or make the contract implement an additional interface for this and either getABI() or getParentTypes() would allow it to be discovered).

All contracts I write will have to implement this interface (some currently do). Right now it suffices to copy and paste the ABI after initially compiling a contract and then compile it again. Eventually this will be automated with a special tool, but my efforts are prioritized and I am working on planning something else at the moment. The special tool will be implemented with a finalized IQueryable interface once other similar features are designed.

Conclusion

In this post, we've explored Application Binary Interfaces and got a good grasp on how solc turns function signatures into the MethodIDs that are encoded in the bytecode that the EVM plays with. With this information, we'll be able to do some interesting things later on, but at this point, we're just getting acquainted with the basics of all this crazy programmable-blockchain stuff.

Next week is either a continuation of this type of beginner, non-programmer track, or the beginning of a more advanced track on higher-level design decisions.

[1] There will be a later post describing these, so don't worry if you don't yet know what is meant by 'safety mechanisms'; it suffices to say for now that there are restrictions for good reasons.
[2] As an aside, all Solidity really does is give us programmers convenient and relatively safe, readable patterns of opcodes: patterns that are designed to be defensive. If you don't like the patterns or they don't fit your use case, you are free to build your own language that compiles into bytecode, but that really isn't necessary, because Solidity gives you the ability to write 'assembly' (work directly with those opcodes and the memory on the stack in either a somewhat human readable manner or instructional manner) within contracts. This, I'd argue, is the most powerful part of Solidity: in it, you can tell Solidity to go [EXPLETIVE] right off with its safety and staticity, and do some wicked cool [EXPLETIVE] when you need to.
