A normie's guide to rollups

“Blockchains don’t scale.” You’ve probably heard this a million times. A couple years ago, it felt like a real threat to the industry. How can we bank the unbanked if we can’t even process 15 transactions per second?

But big problems fuel innovation. And blockchain scalability was a big problem. It quickly caught the attention of engineers and scientists, and, fast forward to today, we’re getting more and more confident that blockchains can scale. In fact, most of the current debate concerns which scalability solution will win.

That’s why, in this blog post, I’m going to break down some of the Ethereum scalability solutions that have been explored in the past five years — and explain how each of these was a stepping stone to what may be emerging as the winner: Rollups.

Let’s get right into it.

Facing the “Scalability Trilemma”

We can’t talk about scalability without mentioning the famous “scalability trilemma.” The term was coined by Vitalik to explain the three properties that a blockchain aims to have: scalability, decentralization, and security. It’s clear, so far, that we can achieve two of these properties. But getting all three is really, really difficult.

Before we understand why, let’s get our terms straight.

Scalability just means that the blockchain can process a lot of transactions, measured in transactions per second (TPS).

Decentralization means the blockchain is run by many “trustless” nodes across the world — not run by a small group of centralized “trusted” nodes.

Security means that the blockchain is resistant to attack even if a certain percent of the nodes in the network are malicious. Ideally, it should be able to handle up to 50% of malicious nodes.

Vitalik created a simple triangle with each corner representing one of the three properties. Each side of the triangle represents the type of blockchain design that achieves the two properties at its ends.

Decentralized & Secure

On the bottom of the triangle are traditional blockchains like Bitcoin and Ethereum 1.0.

These types of blockchains are...

  • Decentralized: Yes. Because anyone in the world can choose to become a mining node. There are thousands of miners around the world who participate in securing Bitcoin and Ethereum networks. There’s also no need to authorize yourself as a miner; it’s completely trustless.
  • Secure: Yes. Because every node in the network keeps a copy of the blockchain and verifies every transaction. Moreover, Proof-of-Work is designed to handle up to 50% of the nodes being malicious.
  • Scalable: No. The same design that makes it secure, with every node keeping a copy of the blockchain and verifying every transaction, is also inefficient: latency is high and throughput is low. For Bitcoin, it’s about 7 transactions per second, and for Ethereum, it’s about 15 transactions per second.

Secure & Scalable

On the right side of the triangle are the typical high-TPS chains, such as Binance Smart Chain. They use a consensus algorithm called “Proof of Staked Authority,” where 21 nodes are “elected” to produce new blocks. Every 24 hours, a fresh set of 21 nodes is elected to produce new blocks for that 24-hour period.

  • Secure: Yes. Each “elected” node is authorized, so we have control over adversaries in the system.
  • Scalable: Yes. Since only a small number of elected nodes produce new blocks at any time, there’s far less communication overhead than if every node had to validate every transaction. That gives us much higher transaction throughput and lower latency.
  • Decentralized: No. Since there are only 21 elected validators, it’s way less decentralized than traditional blockchains. Moreover, each “elected” node is authorized, so we introduce trust into the system, making it less decentralized.

Scalable & Decentralized

On the left side of the triangle are multi-chain ecosystems, such as Cosmos, Polkadot, and Avalanche. These systems have many independent blockchain networks that all communicate as part of the larger blockchain network.

  • Scalable: Yes. Since we’re no longer required to store all of the state on a single blockchain, we can split the state across many independent blockchains, achieving higher scalability than a traditional blockchain.
  • Decentralized: Depends. Each blockchain in the ecosystem has a set of nodes that validate the blockchain. Some chains in the ecosystem will have many validators (such as a stablecoin chain that needs to be decentralized), while others may have very few or even just one (such as an enterprise chain, which doesn’t need much decentralization). So, the level of decentralization depends on which chain in the ecosystem we’re referring to.
  • Secure: Not really. If there’s a chain in the ecosystem that gets attacked, then it could have ripple effects on the rest of the system. For example, if chain B gets attacked and Chain A, C, and D rely on it, then those other chains will be affected as well.

As you can see, there have been many different attempts at scaling blockchains, but it almost always comes at the cost of one of these three properties. Vitalik and the Ethereum community have been unwilling to make that compromise. Their aim is to get all three.

Before we get into how that could be possible, it’s important to understand one more thing: Layer 1 scaling vs. Layer 2 scaling.

Understanding layer 1 & 2 scaling

At the highest level, layer 1 scaling refers to scaling the core blockchain itself. In contrast, layer 2 scaling refers to moving transactions off the main blockchain layer into a separate layer that can communicate with the mainchain.

Ethereum hopes to use both layer 1 and layer 2 solutions to solve the scalability trilemma. Sharding is Ethereum’s layer 1 solution, while rollups are Ethereum’s layer 2 solution.

Ethereum’s earliest layer 2 solutions

But rollups and sharding have to wait for just a moment. First, we should chronicle the layer 2 scaling solutions that Ethereum explored in the past before finally arriving at rollups, Ethereum’s “holy grail.” After all, that’s how engineering works — we come up with ideas, test them, and iterate until we find a solution that works.

State Channels

State channels have been around for a LONG time, so they’re nothing new. Here’s a quick illustration of how they work. Let’s say we have two people, Alice and Bob, who want to transact with each other. Alice will pay Bob $1 every time he tweets. But since Bob tweets a LOT every day, using Ethereum to transact would be too slow and too expensive.

Instead, they use “state channels:”

  • Alice puts $500 into a smart contract on Ethereum.
  • Whenever Alice wants to send Bob $1, she signs a message that indicates how much she wants to send Bob. She keeps signing messages until Bob is ready to “cash out” his funds.
  • Bob submits a new message indicating that he’s ready to close the state channel. The smart contract on Ethereum verifies Alice and Bob's signatures, pays Bob the amount due, and returns the rest to Alice.

Notice how only the first and last steps require us to transact on the blockchain. In between those steps, Alice and Bob can send an unlimited number of signed messages to each other indicating a payment.
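
Here’s a rough Python sketch of those three steps. It’s a toy model, not a real channel: an HMAC with a demo key stands in for Alice’s digital signature (a real channel would use public-key signatures that the contract can check), and the names sign_update, ChannelContract, and the nonce field are all made up for illustration.

```python
import hashlib
import hmac

# Assumption: an HMAC key stands in for Alice's private key. A real channel
# would use public-key signatures (e.g. ECDSA) that the contract can verify.
ALICE_KEY = b"alice-demo-key"


def sign_update(key: bytes, nonce: int, owed_to_bob: int) -> bytes:
    """Sign an off-chain message stating the running total owed to Bob."""
    message = f"{nonce}:{owed_to_bob}".encode()
    return hmac.new(key, message, hashlib.sha256).digest()


class ChannelContract:
    """Toy stand-in for the on-chain contract holding Alice's deposit."""

    def __init__(self, deposit: int) -> None:
        self.deposit = deposit  # on-chain transaction 1: open the channel
        self.closed = False

    def close(self, nonce: int, owed_to_bob: int, signature: bytes) -> tuple:
        """On-chain transaction 2: verify the latest update and pay out."""
        if self.closed:
            raise ValueError("channel already closed")
        expected = sign_update(ALICE_KEY, nonce, owed_to_bob)
        if not hmac.compare_digest(expected, signature):
            raise ValueError("invalid signature")
        if owed_to_bob > self.deposit:
            raise ValueError("update exceeds deposit")
        self.closed = True
        return owed_to_bob, self.deposit - owed_to_bob  # (to Bob, back to Alice)


contract = ChannelContract(deposit=500)

# Off-chain: Alice signs a new running total each time Bob tweets (42 tweets).
latest = None
for tweet_count in range(1, 43):
    signature = sign_update(ALICE_KEY, tweet_count, tweet_count)
    latest = (tweet_count, tweet_count, signature)

# Bob cashes out with only the most recent signed message.
bob_payout, alice_refund = contract.close(*latest)
print(bob_payout, alice_refund)  # 42 458
```

Forty-two payments happened, but the contract was only touched twice: once to open the channel and once to close it.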

In this case, the Ethereum blockchain is used only as a settlement layer to process the final transaction with the lump-sum payment. This lifts the burden from the underlying blockchain.

Here’s the take-home point: By opening up a channel to transact off the blockchain, we massively increase transaction capacity and speed while keeping costs low. This is possible because:

  • First, a majority of the transactions are happening off-chain. That means payments can be handled instantaneously since off-chain updates between two parties don’t require the extra time to be processed and verified by the blockchain network.
  • Second, payments incur lower fees because we only need to transact on-chain when we open and close the state channel. That means most of the transactions are happening off-chain with much lower fees.

So, why isn’t this the end-all solution? Well, there are limits to what state channels can do.

For example, we can’t use state channels to transact with people who aren’t part of the state channel. We’re also limited to the types of state updates that are possible in the state channel. A complex application like Uniswap can’t be used in a state channel because, when we swap two tokens on Uniswap, smart contracts automatically carry out a bunch of intermediary steps to make a swap. Those steps don’t have an authorized user signing off on each of them.

The other downside with state channels is that they require us to lock up liquidity to instantiate the channel and protect against cases where a malicious counter-party might never actually pay the promised funds. This might be fine for a single channel, but when we’re trying to route payments through a network of state channels, the liquidity locked in intermediate channels makes it quite “capital inefficient.”

Lastly, state channels require someone who can periodically watch the network (or delegate this responsibility to someone else). This ensures the security of your funds, which adds another layer of complexity and inefficiency.

All in all, state channels work well for use cases where two parties need to transact with one another quickly, cheaply, and over a period of time (such as a merchant and customer). But given their limited use cases and capital inefficiency, we’re not betting on state channels being the ultimate scaling solution for Ethereum.

Sidechains

Sidechains have also been around for a long time, and they’re pretty easy to understand. Briefly, a sidechain is an independent blockchain that’s “pegged” to the main blockchain.

When we “peg” one blockchain to another, that means we can move assets between the two blockchains. A “one-way” peg is where we move the assets from the main blockchain to the sidechain, but not the other way. This works by “burning” the tokens on the main blockchain by sending them to an unspendable address — and then “minting” the equivalent tokens on the sidechain.

A “two-way peg,” then, is when we can move assets to and from the main blockchain and sidechain. This requires “locking” up our tokens on the mainchain and then “minting” an equivalent amount of tokens on the sidechain. When we want to convert back to the original token, we “burn” the tokens on the sidechain and then unlock the tokens on the mainchain.

Therefore, a sidechain is when we create a new blockchain that’s two-way pegged to the main blockchain. When we want faster transactions, we can move our funds from the mainchain to the sidechain and transact there. When we’re done, we move our funds back to the mainchain.
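
To make the lock/mint and burn/unlock flow concrete, here’s a minimal Python sketch. It models both chains as plain dictionaries inside one object, which is obviously not how a real bridge works; the class name TwoWayPeg and its methods are hypothetical.

```python
class TwoWayPeg:
    """Toy model: balances on the mainchain, pegged tokens on the sidechain."""

    def __init__(self) -> None:
        self.mainchain = {}  # user -> spendable mainchain balance
        self.locked = 0      # total held by the peg on the mainchain
        self.sidechain = {}  # user -> pegged-token balance on the sidechain

    def deposit(self, user: str, amount: int) -> None:
        """Lock on the mainchain, then mint the same amount on the sidechain."""
        if self.mainchain.get(user, 0) < amount:
            raise ValueError("insufficient mainchain balance")
        self.mainchain[user] -= amount
        self.locked += amount
        self.sidechain[user] = self.sidechain.get(user, 0) + amount

    def withdraw(self, user: str, amount: int) -> None:
        """Burn on the sidechain, then unlock the same amount on the mainchain."""
        if self.sidechain.get(user, 0) < amount:
            raise ValueError("insufficient sidechain balance")
        self.sidechain[user] -= amount
        self.locked -= amount
        self.mainchain[user] = self.mainchain.get(user, 0) + amount


peg = TwoWayPeg()
peg.mainchain["alice"] = 10
peg.deposit("alice", 4)   # move 4 tokens to the sidechain
peg.withdraw("alice", 1)  # bring 1 token back

# The peg always holds exactly as much as exists on the sidechain.
print(peg.locked == sum(peg.sidechain.values()))
```

The invariant in the last line is the whole point of a peg: every token on the sidechain is backed by one locked on the mainchain.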

An example of a sidechain for Bitcoin is the Liquid Network. The Liquid Network is pegged to Bitcoin and permits faster and cheaper Bitcoin payments. Another popular example is Polygon, which is a sidechain pegged to Ethereum.

When users want faster transactions, they can lock up some ETH on Ethereum and receive an equivalent amount of pegged ETH on the Polygon sidechain. There, they can enjoy faster and cheaper transactions, paying fees in Polygon’s Matic token. When they’re done transacting, they can burn the pegged ETH and unlock their ETH on Ethereum.

Note: Technically Matic is not a sidechain because it periodically commits the sidechain’s state to Ethereum. Hence, they like to call themselves a “commit chain.”

Overall, sidechains are scalable because they typically make a trade-off in decentralization and/or security by using a different consensus algorithm that allows for scalability.

Sidechains are a good stop-gap solution for the insane amount of congestion on the Ethereum network, but Ethereum is not betting on this being the ultimate scalability solution either. The reason will become clear as you continue to read :)

Plasma

Plasma is another “layer 2” solution that lets us move transactions off the base layer. Before we get into Plasma, it’s important to note that there have been many iterations of Plasma over time, each with its own trade-offs. You can check out the plasma world map, which lists the many different types of designs that people have tried to create to solve the challenges that come with Plasma. There’s a lot!

Of course, for the purpose of this post, I’ll have to generalize the Plasma concept without focusing too much on individual implementations. If you want to dig deeper, definitely check out the world map.

So, what’s Plasma? Plasma is essentially a series of smart contracts (or “plasma chains”) that run outside of the main blockchain.

The Plasma chains are like branches of a tree, where Ethereum is the trunk and each plasma chain is a branch. Each branch is treated as a blockchain that has its own blockchain history and computations.

The “root blockchain” (i.e., Ethereum blockchain) enforces the validity of the state in the Plasma chains using something called “fraud proofs.” Fraud proofs are a mechanism by which we provide some piece of data, and anyone can determine if the data is invalid using mathematical proofs.

Each plasma blockchain does not need to post transaction data onto the root chain. Instead, each plasma chain has an “operator.” This could be a centralized actor, a multisig representing multiple people, or even a committee who stake to participate in being an operator. The operator of the Plasma chain submits a Merkle root of the transfers that happened on the Plasma chain.

Note: If you don’t know how Merkle trees work, then I highly recommend reading this explainer before moving on.

At a high level, a Merkle tree lets us take a large data set (e.g. transactions in a block) and generate a single root hash that represents the entire data set.

Later, we can easily prove that a piece of data from the large data set (i.e., a single transaction from a block of transactions) existed in that data set by providing just the branch leading up to that piece of data.

If someone tries to prove the existence of a fraudulent transaction, then the hashes wouldn’t match and we would know right away.
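
If you’d rather see that in code, here’s a small Python sketch of building a Merkle root and checking a branch. It makes a few simplifying assumptions that real implementations handle differently: it uses SHA-256, it duplicates the last node when a level has an odd number of nodes, and it doesn’t distinguish leaf hashes from internal hashes. The function names are illustrative.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root_and_branch(leaves: list, index: int) -> tuple:
    """Return (root, branch), where branch proves leaves[index] is in the set."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [h(leaf) for leaf in leaves]
    branch = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # simplification: duplicate the last node
        sibling = index ^ 1          # the neighbour sharing the same parent
        branch.append((level[sibling], sibling < index))  # (hash, sibling_is_left)
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], branch


def verify_branch(leaf: bytes, branch: list, root: bytes) -> bool:
    """Recompute the path from the leaf up to the root using only the branch."""
    node = h(leaf)
    for sibling, sibling_is_left in branch:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


transactions = [b"tx0", b"tx1", b"tx2", b"tx3", b"tx4"]
root, branch = merkle_root_and_branch(transactions, index=2)

print(verify_branch(b"tx2", branch, root))         # a real transaction checks out
print(verify_branch(b"fake-tx", branch, root))     # a made-up one does not
print(len(branch), "hashes needed for", len(transactions), "transactions")
```

The branch grows with the logarithm of the data set, which is why a single root hash on-chain can stand in for a whole block of transactions.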

Okay, back to Plasma.

Every Plasma chain is submitting a Merkle root of the transfers happening on it. When a user tries to later move their assets from the Plasma chain back to the root chain, the user can submit the Merkle branch of the most recent transaction sending the asset to them (the most recent transaction is enough for us to know the current balance on the plasma chain). This starts a challenge period where anyone can try to prove that the user’s Merkle branch is fraudulent. If the Merkle branch is fraudulent, then a fraud proof can be submitted.
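
Here’s a simplified Python sketch of that exit flow. It only models the timing rules: whether a fraud proof is valid is passed in as a plain boolean rather than checked against a Merkle branch, and the seven-day window is the figure this post uses for Plasma withdrawals. The ExitQueue class and its method names are hypothetical.

```python
CHALLENGE_PERIOD = 7 * 24 * 60 * 60  # seconds; the seven-day wait cited in this post


class ExitQueue:
    """Toy model of exits from a Plasma chain back to the root chain."""

    def __init__(self) -> None:
        self.exits = {}  # exit_id -> details of the pending exit

    def start_exit(self, exit_id: str, owner: str, amount: int, now: int) -> None:
        self.exits[exit_id] = {
            "owner": owner,
            "amount": amount,
            "started_at": now,
            "cancelled": False,
        }

    def challenge(self, exit_id: str, fraud_proof_is_valid: bool, now: int) -> None:
        """Anyone may challenge while the window is open."""
        pending = self.exits[exit_id]
        if now >= pending["started_at"] + CHALLENGE_PERIOD:
            raise ValueError("challenge period is over")
        if fraud_proof_is_valid:
            pending["cancelled"] = True

    def finalize(self, exit_id: str, now: int) -> tuple:
        """Pay out only if the window has passed without a valid challenge."""
        pending = self.exits[exit_id]
        if pending["cancelled"]:
            raise ValueError("exit was cancelled by a fraud proof")
        if now < pending["started_at"] + CHALLENGE_PERIOD:
            raise ValueError("challenge period still running")
        del self.exits[exit_id]
        return pending["owner"], pending["amount"]


queue = ExitQueue()
queue.start_exit("exit-1", owner="alice", amount=5, now=0)
payout = queue.finalize("exit-1", now=CHALLENGE_PERIOD)
print(payout)
```

An honest user gets their funds after the window closes; a dishonest exit can be cancelled by anyone who shows up with a valid proof in time.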

Since the root blockchain only keeps track of Merkle roots, it has to process much less data than if those transactions happened on the mainchain. This significantly decreases the amount of data stored on the root blockchain and lets us scale the root chain.

Moreover, if there’s a malicious attack on a particular plasma chain, people can do a “mass-exit” from the corrupt child chain.

Plasma chains are more favorable than state channels because you can send assets to anyone, whereas with state channels, you can only transact with the people in the state channel. Moreover, the benefit of Plasma chains over sidechains is that the Plasma chain is secured by Ethereum.

The fundamental difference between the two (Plasma vs. sidechains) is that sidechains have their own security model. They have their own consensus mechanism and a separate set of nodes that validate the state. Even if the sidechain gets attacked, nothing will happen to the mainchain, and vice versa. If an attack happens on the sidechain, the mainchain can’t do anything to protect users.

Plasma chains, on the other hand, have a dependent security model. Each Plasma chain can use its own mechanism for validating transactions, but it still uses the Ethereum blockchain as a final arbiter of truth. In the case of byzantine attacks, plasma chain users can exit to Ethereum.

But Plasma has several downsides that make it a lackluster scalability solution.

First, when users want to move their assets from Plasma contracts to the main Ethereum blockchain, they need to wait seven days. This is enough time for people to verify that the withdrawal transaction isn’t fraudulent. If it is, they can construct a fraud proof using the Merkle tree on the Plasma chain.

Second, each Plasma chain requires an operator posting the Merkle root commitments to the mainchain. This requires us to rely on a third party to accurately post the Merkle root commitments on the chain. Unfortunately, operators can perform what’s known as a “data availability attack,” where they post a Merkle root but withhold the transaction data behind it for malicious reasons.

In this case, operators can convince the network to accept invalid blocks, and there’s no way to prove invalidity. This prevents other users from knowing the accurate state of the blockchain. This also prevents people from creating blocks or transactions because they lack the information to construct the proofs. Data availability attacks, unlike fraud, are not uniquely attributable: when data is missing, we have no way of proving on-chain who is at fault.

An operator can be malicious in more explicit ways as well, such as through submitting fraudulent transactions. In this case, people can do “mass exits,” as explained above. But these have proven to be much harder to implement in practice. If many users want to mass exit, it can lead to congestion on mainchain, and users may not be able to exit in time, causing them to lose their funds.

Third, Plasma requires that owners of transacting assets be present. This ensures the safety of the Plasma chain since it effectively makes it impossible to transact without the owner’s consent (such as sending ERC 20 tokens to an approved address). Plasma works best for simple transfers, but as transactions get more complex, the design space gets unruly.

For the reasons above, the Polygon and OMG networks initially pursued Plasma architectures for scaling but have since pivoted away.

And it was in this context, after various solutions had been proposed, that we got rollups.

Rollups

Just like state channels, sidechains, and plasma chains, rollups are a “layer 2” solution. In fact, rollups are very similar to Plasma in that we’re batching transactions off-chain and posting an update to the main blockchain. The key difference, however, is that with rollups, we also post the transaction data of each batch of transactions on-chain. With Plasma, we’re only posting the Merkle roots.

In other words, with rollups, we do the transaction processing off-chain, but we post transaction data on-chain. The amount of data we post on-chain is the minimum amount required to locally validate the rollup’s transactions. By putting data on-chain, anyone can detect fraud, initiate withdrawals, or personally start producing transaction batches. Therefore, rollups give us much higher security guarantees than Plasma chains or sidechains.

The other key differentiation between rollups and Plasma is that we don’t have to worry about data availability issues. After all, we’re posting transaction data onto the mainchain. This is a huge win.

With rollups, we’re effectively running a version of the EVM inside of the rollup layer. That means any transaction possible on Ethereum is possible to execute in the rollup.

And that raises the question: If we’re still posting transaction data on-chain, then how does this scale layer 1? Isn’t the scalability still limited to the data bandwidth of the mainchain?

Yes, it is. The key here is that we get 5x to 100x scalability with rollups, but not infinite scalability. Rollups also use a lot of fancy compression tricks to minimize the transaction data we post on-chain, so there’s much less data storage on-chain than there would be otherwise.

Meanwhile, we outsource all the heavy lifting of transaction execution off-chain into the rollup. Recall that during transaction execution, a transaction has to be processed by the Ethereum Virtual Machine (EVM) and interact with state (such as storage, account balances, etc.). This is expensive.

With rollups, however, we move this execution to the rollup layer by running a version of the EVM in the rollups — so we’re still doing the same execution, but gas costs on the rollup layer are much cheaper than on Ethereum.

A closer look at rollups

Now, let’s take a look at how rollups work under the hood.

There’s a “rollup contract” on the mainchain that maintains the current state of the rollup layer. This includes account balances of the users transacting on it and smart contract code of the contracts that live in it. In brief, the rollup contract keeps track of the “state root” of the transactions in the rollup layer.

The “state root” is the Merkle root of a key-value map where the keys are addresses and the values are accounts. Every account has up to 4 properties: balance, nonce, code (only for smart contracts), and storage (only for smart contracts).

When transactions happen on the rollup layer, a state change occurs. Of course, that means the state root needs to be updated as well. But rather than updating the state root for every transaction, the transactions are “batched” and sent to the rollup contract on the mainchain. The batch will include a compressed form of the batch of transactions and an updated state root that represents the data after the batch of transactions has been processed.

The rollup contract on the mainchain checks that the previous state root in the batch matches its current state root — if it does, it switches the state root to the new state root.
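
Here’s a toy Python version of that check. It takes some liberties: the “state root” is just a SHA-256 hash of the JSON-encoded balances rather than a real Merkle root, zlib stands in for whatever compression a real rollup uses, and RollupContract, submit_batch, and toy_state_root are names invented for this sketch.

```python
import hashlib
import json
import zlib


def toy_state_root(balances: dict) -> bytes:
    """Assumption: a hash of the sorted balances stands in for a Merkle root."""
    encoded = json.dumps(balances, sort_keys=True).encode()
    return hashlib.sha256(encoded).digest()


class RollupContract:
    """Toy stand-in for the rollup contract on the mainchain."""

    def __init__(self, genesis_root: bytes) -> None:
        self.state_root = genesis_root
        self.batches = []  # compressed transaction data, published for everyone

    def submit_batch(self, prev_root: bytes, compressed_txs: bytes, new_root: bytes) -> None:
        # The batch must build on the state the contract currently knows about.
        if prev_root != self.state_root:
            raise ValueError("batch does not build on the current state root")
        self.batches.append(compressed_txs)
        self.state_root = new_root  # note: the transactions are NOT executed here


# Off-chain: the rollup executes a batch and computes the new state root.
balances = {"alice": 10, "bob": 0}
contract = RollupContract(toy_state_root(balances))

batch = [{"from": "alice", "to": "bob", "value": 3}]
prev_root = contract.state_root
for tx in batch:
    balances[tx["from"]] -= tx["value"]
    balances[tx["to"]] += tx["value"]

# On-chain: post the compressed batch plus the old and new state roots.
contract.submit_batch(prev_root, zlib.compress(json.dumps(batch).encode()), toy_state_root(balances))
print(contract.state_root.hex()[:16])
```

Notice that submit_batch never runs the transactions. It only checks that the batch starts from the right place, which is exactly why we need some other way to know the new root is honest.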

Since the published transaction data is not actually interpreted by the EVM, we aren’t accessing or writing to state — that would be far too expensive. Instead, we’re posting the compressed transaction data as a “calldata” parameter to the rollup contract.

Here’s why that’s neat. In Solidity, calldata is the cheapest place to put data. In fact, arguments passed as calldata parameters don’t get stored in Ethereum’s state at all, meaning we avoid a lot of gas fees. Meanwhile, Ethereum nodes still keep the transaction data as part of the block’s history.

Astute readers might be wondering what the difference is between Plasma and rollups. Here’s the key divide: With rollups, we’re posting the transaction data on-chain along with the state root. With Plasma, we just post the state root of the transactions.

Unlike Plasma, where we have an operator who posts the Merkle root to the root chain, rollups let anyone post the new batch of transactions to the rollup contract on-chain, as we’ll explore in a bit more depth later.

Once again, this raises the question: since we’re just posting transaction data to the mainchain and not executing the transactions on-chain, how do we know that the transaction data and state root posted on the mainchain aren’t fraudulent?

Enter: optimistic rollups and ZK rollups, each of which has its own way of handling and verifying the correctness of batches.

Optimistic Rollups

You can probably guess what optimistic rollups are based on the name. When a new batch of transactions is “rolled up” to the mainchain, the state root and hash of each batch are posted, but we don’t actually validate that the transactions have been executed correctly, at least not at the time of posting.

In this way, we’re “optimistically” posting the new state root and transaction data to the rollup contract on the mainchain. When a new state root is posted to the mainchain by someone, the rollup smart contract simply takes their word for it.

If someone discovers that an invalid state transition was published to the rollup smart contract, they can generate a “fraud proof.”

The fraud proof includes:

  • a proof of “pre-state,” or, how things looked before a transaction was applied
  • a proof of “post-state,” or, how the state should have looked after the transaction was applied
  • a proof of the transactions that were applied during a state transition

The workflow is straightforward: This fraud proof is posted to the rollup contract on the mainchain. The rollup contract then verifies the proof and applies the transaction logic to the pre-state. Then, it compares the result to the post-state. If there’s a mismatch, then this proves that whoever posted the batch didn’t apply the transactions properly. The smart contract then reverts that batch of transactions and all batches after it.
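
That workflow can be sketched in a few lines of Python. As a simplification, this sketch hands the contract the entire pre-state, whereas a real fraud proof would supply Merkle proofs for only the parts of the state the transactions touch. The state root is again a plain hash of the balances, and every function name here is hypothetical.

```python
import hashlib
import json


def state_root(balances: dict) -> bytes:
    """Assumption: a hash of the sorted balances stands in for a Merkle root."""
    return hashlib.sha256(json.dumps(balances, sort_keys=True).encode()).digest()


def apply_tx(balances: dict, tx: dict) -> dict:
    """The transaction logic the contract re-runs: a simple transfer."""
    if balances.get(tx["from"], 0) < tx["value"]:
        raise ValueError("insufficient balance")
    updated = dict(balances)
    updated[tx["from"]] -= tx["value"]
    updated[tx["to"]] = updated.get(tx["to"], 0) + tx["value"]
    return updated


def is_fraudulent(pre_state: dict, committed_pre_root: bytes, txs: list, claimed_post_root: bytes) -> bool:
    """Re-execute the batch on the pre-state and compare with the claimed root."""
    if state_root(pre_state) != committed_pre_root:
        raise ValueError("pre-state does not match the committed root")
    state = pre_state
    try:
        for tx in txs:
            state = apply_tx(state, tx)
    except ValueError:
        return True  # the batch contains a transaction that cannot execute
    return state_root(state) != claimed_post_root


pre = {"alice": 10, "bob": 0}
txs = [{"from": "alice", "to": "bob", "value": 3}]

honest_root = state_root({"alice": 7, "bob": 3})
dishonest_root = state_root({"alice": 7, "bob": 300})  # Bob's balance is inflated

print(is_fraudulent(pre, state_root(pre), txs, honest_root))     # False
print(is_fraudulent(pre, state_root(pre), txs, dishonest_root))  # True
```

If is_fraudulent returns True, the contract has everything it needs to revert the batch.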

To this end, anyone who posts batches to the mainchain has to put up a deposit, so if they behave maliciously and get caught, they can be “slashed.”

ZK Rollups

If optimistic rollups use the “Innocent until proven guilty” mentality, then ZK rollups use the “Don’t trust, verify” mentality.

With ZK rollups, every batch includes a cryptographic proof called a ZK-SNARK that proves the state root is the correct result of executing the batch of transactions. The ZK-SNARK is a short proof, not a hash: it attests that applying the batch’s transactions to the previous state produces the new state root. This validity proof is posted to the rollup contract, so anyone can use it to verify transactions in a particular batch on the rollup layer.

The magic here lies in how ZK-SNARKs work. They let us generate a proof of an underlying piece of data without revealing the data. Anyone can later verify that the data existed, even if they don’t have access to the data itself.

The mathematical basis of ZK-SNARKs is complex and beyond our scope, but if you’re curious, I encourage you to hop on Google or YouTube and spend some time understanding how they work.

Which is better?

The obvious next question is... which is better? It’s hard to say. There are pros and cons for each, which we’ll take a look at next.

Cost

“Cost” doesn’t mean much in the abstract, but when we break it into parts, the performance of optimistic and ZK rollups begins to diverge.

  • Gas cost to post a new batch on-chain: Optimistic rollups cost less. We’re optimistically posting the new state root and data, so it’s a simple transaction. ZK rollups cost more. When we post a new batch on-chain, we have to verify a ZK-SNARK validity proof. This is computationally more expensive.
  • Gas cost per transaction posted on-chain: Optimistic rollups cost more. We have to post sufficient data on-chain to later verify the fraud proof. ZK rollups cost less. We can leave out most transaction data because the validity proof is enough for anyone to verify the correctness of the batch.
  • Off-chain computation costs: Optimistic rollups cost less. We’re only posting the new state root and not executing/validating transactions. That said, we still need some people to watch the creation of new batches and the execution of new transactions to ensure the correctness of the batches. ZK rollups cost more. ZK-SNARKs are expensive to compute (between 20 and 1,000 times more expensive, though it continues to get cheaper with innovation).

Side note: Even though the off-chain computation costs for ZK rollups are higher, it’s also important to consider that gas prices off-chain are orders of magnitude lower.

Speed

Optimistic rollups are slow. Typically, a user has to wait around one week to withdraw their assets. This gives someone enough time to publish a fraud proof in case a user attempts to withdraw tokens they don’t actually own on the rollup layer.

ZK rollups are fast. Users typically wait less than 10 minutes to withdraw their assets. We just need to wait until the next batch to process withdrawals because all of the rollup state is already verified.

Side note: There are ways to get around this one-week waiting period by using “fast withdrawals.” This is accomplished using liquidity providers, who maintain a “cookie jar” of funds on the mainchain. When users withdraw funds quickly, they give a liquidity provider an IOU for funds in the rollup, and then they get paid by the liquidity provider on the mainchain immediately (for a fee).

Later, when the one-week period is over and the user gets their assets back from the rollup layer, the user can send the liquidity provider the funds they owed them. The liquidity provider can even choose to run a validator node that validates the user’s transaction on the rollup before issuing them the funds on mainchain, further reducing their risk.

However, this “fast withdrawal” scheme wouldn’t be possible for NFTs since there’s only one of any NFT in existence, and a liquidity provider can’t create the same NFT on-chain.

Complexity

Optimistic rollups are simpler. The concept of fraud proofs has existed for a long time, so the solution is relatively straightforward.

ZK rollups are more complicated. ZK-SNARKs are new and mathematically complex.

Generalizability

Optimistic rollups are easier to generalize. Engineers have already built an EVM-compatible Virtual Machine called OVM (Optimistic Virtual Machine), which allows optimistic rollups to process any transactions that can be processed on Ethereum.

ZK rollups are harder to generalize. Proving general-purpose EVM execution using ZK-SNARKs is much harder than proving simple computations, like value transfers. That said, there’s a lot of innovation happening in the space. In fact, StarkNet alpha introduced a new programming language called Cairo, a Turing-complete language for writing programs whose execution can be proven and then verified on Ethereum, which lets us verify general-computation smart contracts.

Scalability

Optimistic rollups are less scalable. When we post data on-chain, it typically includes some state (such as transaction details) and a witness (such as digital signatures that prove the consent of the transacting parties). With optimistic rollups, we have to post the witness for every transaction, so people can prove fraud later. The witnesses take up a lot of storage space, anywhere from 3 to 10 times that of the transaction data.

ZK rollups are more scalable. We don’t need to include the witness for every transaction because all digital signatures were verified when the ZK-SNARK was computed. Instead, we just need one witness per batch. This significantly reduces the data stored on-chain.

Security

Optimistic rollups are less secure. Optimistic rollups rely on cryptoeconomics to ensure the security of the chain. In other words, they have to incentivize people to watch the batches being posted on-chain and detect fraud.

ZK rollups are more secure. ZK rollups rely on math, and they don’t require incentives. They use cryptography rather than cryptoeconomics.

So, now that we’ve broken these down, which is better? It’s still hard to say, but that’s a testament to the great work that the engineers behind these programs have done. We have teams like Optimism and Arbitrum working hard on optimistic rollups, which are already available for Ethereum developers. And we have companies like StarkWare and zkSync bringing general-purpose ZK rollups to Ethereum.

Both solutions are very much in their infancy. But optimistic rollups are closer to being adopted since they’re less complex and can be used for general-purpose computation today. ZK rollups, on the other hand, will take some time to catch up, but many engineers would argue that ZK rollups are a superior technology. After all, they rely on math instead of cryptoeconomics, and they’re more scalable than optimistic rollups.

That said, the best technology doesn’t always win. We can’t ignore that once a technology becomes entrenched, it’s hard for it to be replaced. Optimistic rollups definitely have the lead — so only time will tell which will “win” in the long run.

The cleverness of rollups

Before we discuss some of the enduring challenges of rollups, let’s take a look at the compression tricks rollups use to be so efficient.

Source
  • Nonce: In a typical Ethereum transaction, we include the nonce to prevent replay attacks. Rollups omit it entirely because it can be recomputed using the previous state of the blockchain. In this way, rollups replace data with computation where possible.
  • Gas price: Rather than denominating gas prices in gwei (where 1 gwei is 10^-9 ETH), gas price could be limited to a fixed range of prices, significantly reducing the amount of storage required to record the gas price in transaction data. This really adds up!
  • Gas: Same as above.
  • To: An address is 20-bytes long, plus 1 byte for RLP encoding. Rather than including the address, rollups can store a mapping of indices to addresses and only include the index in the “to” field (e.g., 1234). It’s like leaving coordinates to a destination rather than rendering the entire location itself.
  • Value: The “value” field takes up about 9 bytes in a typical transaction. Rollups can instead store values in a more compact form, such as a small number of significant digits, which brings the field down to about 3 bytes and saves us 6 bytes. Seems practical!
  • Signatures: As mentioned above, the “witness” of digital signatures takes up a lot of storage. Rollups can use BLS signatures, which allow us to aggregate many signatures into one. This saves a ton of storage space!
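As an example of one of these tricks, here is a minimal sketch of the index-to-address mapping described for the “to” field. The `AddressRegistry` class and the 4-byte index width are hypothetical choices made for illustration; a real rollup would keep this mapping in its own state and pick its own encoding.

```python
class AddressRegistry:
    """Hypothetical index-to-address mapping kept in rollup state."""

    def __init__(self) -> None:
        self._index_of: dict[bytes, int] = {}
        self._address_at: list[bytes] = []

    def index_for(self, address: bytes) -> int:
        """Return the address's index, registering it on first use."""
        if len(address) != 20:
            raise ValueError("expected a 20-byte address")
        if address not in self._index_of:
            self._index_of[address] = len(self._address_at)
            self._address_at.append(address)
        return self._index_of[address]

    def address_for(self, index: int) -> bytes:
        """Look the full address back up from its index."""
        return self._address_at[index]


registry = AddressRegistry()
alice = bytes.fromhex("11" * 20)           # made-up 20-byte address
encoded = registry.index_for(alice).to_bytes(4, "big")  # assumed 4-byte index

# The index round-trips back to the full address.
assert registry.address_for(int.from_bytes(encoded, "big")) == alice
print(len(alice), "->", len(encoded), "bytes")
```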


Let’s put all of this into action. Using all of these compression tricks, how many bytes can we save on an ETH transfer? Well, the typical ETH transfer requires 112 bytes. But with these compression tricks? Just 12 bytes. That’s almost 10 times more efficient!
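Here is one way the 112-byte and 12-byte totals could break down per field. Only the 21-byte “to” field, the 9-to-3-byte “value” field, and the two totals come from the text above. The remaining per-field sizes, including the extra “from” index, are approximate assumptions chosen to be consistent with those totals.

```python
# (bytes before, bytes after). Sizes not stated in the text are assumptions.
FIELDS = {
    "nonce":     (3, 0),     # assumed size; omitted entirely on the rollup
    "gas price": (8, 0.5),   # assumed; restricted to a fixed range of prices
    "gas":       (3, 0.5),   # assumed; same trick as gas price
    "to":        (21, 4),    # 20-byte address + 1 RLP byte -> assumed 4-byte index
    "value":     (9, 3),     # 9 bytes -> 3 bytes
    "signature": (68, 0.5),  # assumed; amortized share of an aggregated signature
    "from":      (0, 4),     # assumed; sender index replaces signature recovery
}

before = sum(b for b, _ in FIELDS.values())
after = sum(a for _, a in FIELDS.values())

for name, (b, a) in FIELDS.items():
    print(f"{name:10s} {b:>4} -> {a}")
print(f"total      {before:>4} -> {after}  (about {before / after:.0f}x smaller)")
```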

ZK rollups can get even more optimizations than optimistic rollups can since they conduct transaction verification off-chain before posting the transaction data to the mainchain. They also don’t need to include the “verification” parts of the transaction data; the validity proof is enough. All ZK rollups need to store is the data necessary to compute the state transition.

So, the big take-home point is that rollups aren’t just efficient because they move computation off-chain but because of their very clever data compression tricks.

Rollups aren’t exactly the “Holy Grail”

But don’t get too excited. There are still plenty of kinks to work out, even though rollups are very promising. Here are some challenges you should be aware of.

Scalability has a ceiling

By now, you understand the key difference between rollups and other layer 2 solutions, like Plasma and sidechains. Rollups, you’ll recall, move computation off-chain but store the data on-chain. This is immensely helpful for solving the data availability problem. But since we’re storing transaction data on-chain (though in a very compressed form), we’re still limited by Ethereum’s storage capacity.

We can take a look at what the theoretical TPS would be using rollups with some back-of-the-envelope math.

  • Ethereum block gas limit: 12.5 million gas
  • Cost per byte of data stored on-chain: 16 gas
  • Max number of bytes per block: ~781,000 bytes (12.5 million gas / 16 gas per byte)
  • Bytes of data required to do an ETH transfer using rollups: 12 bytes (see math in previous section)
  • Transactions per block: ~65,000 (~781,000 bytes per block / 12 bytes per ETH transfer)
  • Average block time on Ethereum: 13 seconds
  • Transactions per second: ~5000 TPS (~65,000 transactions per block / 13 seconds per block)
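The same back-of-the-envelope math can be written as a few lines of Python, using only the figures from the list above:

```python
BLOCK_GAS_LIMIT = 12_500_000   # gas per Ethereum block
GAS_PER_BYTE = 16              # gas per byte of data stored on-chain
BYTES_PER_TRANSFER = 12        # compressed ETH transfer on a rollup
BLOCK_TIME_SECONDS = 13        # average Ethereum block time

bytes_per_block = BLOCK_GAS_LIMIT // GAS_PER_BYTE
txs_per_block = bytes_per_block // BYTES_PER_TRANSFER
tps = txs_per_block / BLOCK_TIME_SECONDS

print(bytes_per_block, txs_per_block, round(tps))
```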

Of course, this math assumes that every transaction in a block is an ETH transfer and that the block contains nothing but batches of rollup transactions. This is highly unlikely. Most blocks will include a variety of transactions, including some layer 1 transactions, which would cost more than 16 gas per byte. Moreover, for ZK-rollup batches, this math excludes the cost of verifying a SNARK proof on-chain, which is about 500,000 gas.
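To get a feel for how much these caveats matter, here is a small sketch that parameterizes the estimate. The `rollup_share` parameter (the fraction of block gas spent on rollup data) and the choice of one proof verification per block are assumptions made for illustration, not figures from any real deployment.

```python
def rollup_tps(block_gas: int = 12_500_000, gas_per_byte: int = 16,
               bytes_per_tx: int = 12, block_time: float = 13.0,
               rollup_share: float = 1.0, proof_gas: int = 0) -> float:
    """Rough TPS estimate.

    rollup_share: assumed fraction of block gas available for rollup data.
    proof_gas: gas spent verifying one proof per block (0 for optimistic rollups).
    """
    data_gas = max(block_gas * rollup_share - proof_gas, 0)
    txs_per_block = data_gas // (gas_per_byte * bytes_per_tx)
    return txs_per_block / block_time


print(round(rollup_tps()))                                     # idealized case
print(round(rollup_tps(proof_gas=500_000)))                    # one SNARK verification per block
print(round(rollup_tps(rollup_share=0.5, proof_gas=500_000)))  # half the block is other traffic
```

Even with a proof verified in every block, the estimate drops only slightly; sharing the block with other transactions has a much bigger effect.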

Nonetheless, this gives you a starting point for TPS with rollups. 5,000 isn’t anywhere near the 65,000 TPS number that Visa apparently has, but it’s far better than the TPS Ethereum has today.

Fractured Liquidity

Rollup technologies are being created as independent projects — not by the Ethereum protocol itself. So, there are going to be several different rollup technologies existing in parallel. That’s where fractured liquidity comes in.

As liquidity moves from the mainchain to rollups, it “fractures” across the different rollup networks. That said, this can likely be solved once there are mechanisms to communicate across rollups, which are already being investigated by some smart engineers!

Reduced Composability

One of the key benefits of building on Ethereum is composability. Each new protocol that gets to build on Ethereum is like a Lego piece that other protocols can easily build on top of. That’s what makes DeFi so powerful, for example. It lets us create Money Legos.

When applications and liquidity move to rollups, we lose some of this composability. After all, passing messages and transactions between the rollup layer and the mainchain is not nearly as easy as doing so within the context of the base layer.

But it may be just a matter of time before this is solved. I can certainly see a world where smart contracts existing on different rollups can still communicate with one another. As always, we’re just a few smart engineers away from working out these problems.

Centralization

We glossed over the part where we discuss who’s actually responsible for posting new batches to the mainchain, so let’s come back to that.

Most rollups rely on a “sequencer” to do the job: A sequencer is a node that batches transactions and posts the result to the rollup contract on-chain. In the case of Arbitrum, Optimism, and StarkNet, the sequencer is a single node that they run themselves 😱

I know. “Decentralization” is at the heart of the blockchain, and this, though efficient, is obviously very centralized. What if the sequencer goes down or censors transactions?

Well, it’s not quite that simple. These projects are taking this route at the moment because it’s easier and faster to iterate this way. To reduce the risks of centralization, most rollups will want to decentralize their sequencers over time, and many of them do have plans to do so.

How would the decentralization of sequencers work? There are a few methods. For one, we can create a Proof-of-Stake-like system where sequencers have to stake tokens for a chance to propose the next batch. Or we can do Delegated-Proof-of-Stake where a single sequencer is elected and can be unelected if it does a poor job.
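As a toy illustration of the first idea, here is a sketch of stake-weighted selection, where a sequencer’s chance of proposing the next batch is proportional to its stake. The sequencer names and stake amounts are made up, and a seeded local random generator stands in for whatever shared, verifiable randomness a real protocol would need.

```python
import random


def pick_sequencer(stakes: dict[str, int], rng: random.Random) -> str:
    """Pick the next batch proposer with probability proportional to stake."""
    candidates = [name for name, stake in stakes.items() if stake > 0]
    if not candidates:
        raise ValueError("no sequencer has a positive stake")
    weights = [stakes[name] for name in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


# Hypothetical sequencers and stakes.
stakes = {"seq-a": 60, "seq-b": 30, "seq-c": 10, "seq-d": 0}
rng = random.Random(42)  # stand-in for a verifiable randomness source

counts = {name: 0 for name in stakes}
for _ in range(10_000):
    counts[pick_sequencer(stakes, rng)] += 1
print(counts)
```

Over many rounds, the share of batches each sequencer proposes should roughly track its share of the total stake, and a sequencer with no stake is never chosen.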

All-in-all, the ways in which sequencers will become decentralized have yet to be seen!

A Side-by-Side Comparison

Phew. Hopefully, you have a better understanding of rollups and why Ethereum is betting on them (and sharding!) as a scalability solution. Of course, rollups stand on the shoulders of the giants that came before them: we wouldn’t have them without sidechains, state channels, and Plasma.

It’s also informative to compare rollups with other layer 2 solutions using that “trilemma” framework of decentralization, security, and scalability. Except I will add one more dimension: generalizability. We’ve realized over the years that it is important for a layer 2 solution to be generalizable, so that it can do anything that is possible on the mainchain.

The comparison makes it clear that rollups give us moderate scalability without sacrificing decentralization, security, or generalizability.

The trade-off, however, is scalability. Because we are still storing data on-chain, we are limited in how much scalability we get compared to layer 2 scaling solutions that store data off-chain. Moreover, in the short term, rollups are relying on centralized sequencers, which reduces security. But this is a short-term problem, and it seems likely that rollups will decentralize their sequencers over time, making them a superior technology to Plasma, sidechains, and state channels.

So are rollups the holy grail? I will let you decide ;) 

P.S...

For the sake of brevity, I left out a lot of interesting details on how rollups work. But I also left out a new layer 2 scaling scheme called “Validium.”

Validium is similar to Plasma in that we move data and computation off-chain. The key difference is that Validium doesn't rely on fraud proofs for validating transactions. Instead, operators are required to use zero-knowledge proofs to make new state commitments. This makes it impossible for operators to advance invalid state transitions.

It also removes the need to have "mass exit" schemes or long withdrawal delays in the protocol. But we still get the unlimited scalability of Plasma chains because we aren't storing transaction data on-chain. Anyway, it's worth mentioning. I encourage you to read about it!

Conclusion

This post turned out way longer than I intended. If you're still reading, you’re my type ;)

Despite the length, this is really the bare minimum you need to know to have a foundational understanding of rollups.  After all, rollups are only half of the scalability solution for Ethereum. The other half is sharding, which I’ll save for a separate post.

If building Web 3.0 applications is something you are interested in, then sign up for our next cohort of the DappCamp, where you will learn how to build and deploy secure smart contracts on Ethereum.

If you have any feedback, questions, or corrections on this post, please post them in the comments. I would love to hear from you, especially if you find a technical inaccuracy anywhere in the post!

Founder & CEO of TruStory. I have a passion for understanding things at a fundamental level and sharing it as clearly as possible.

Preethi Kasireddy