The burden of Ethereum’s unbounded state growth is driving developers towards third-party, centralized infrastructure services, and Infura is at the center of it all. Concern has been mounting in the community of late: this backstage presence, with even deeper ties to internet-wide centralization via AWS, is surreptitiously undermining the project of decentralization that Ethereum was envisioned for. Why is this happening, and how can Ethereum reverse the damage?
*For a 101 on nodes, read the section at the bottom of this article.
A persistent problem with Ethereum is the constant growth of the EVM state trie. In the last 6 months alone, fast-synced full Geth nodes have doubled in size, to 126GB (and full archive nodes are around 1.8TB). Today, launching and syncing an Ethereum node means not only hard work (learning how to interact with it, building a server to wrap it in), but also an investment of both money and storage space. It is thus understandable that many users (e.g. developers eager to test their code or launch apps) are not interested in the trouble of hosting their own node. People looking for ways to streamline the whole process have found a comfortable solution in third-party centralized infrastructure service providers like Infura.
This has not been a low-key shift: according to Ethernodes.org, the number of active synced nodes fell from 30,000 to 9,500 in 2018. The unintended, and potentially dire, consequence of this migration towards Infura is that Infura itself is becoming another layer of Ethereum – a centralization layer. Lane Rettig, Ethereum core developer, puts it this way:
"If the requirements and cost of running a node continue to exceed the rate at which commodity hardware improves and cheapens, Ethereum will collapse down to a small, relatively centralized core of nodes run by economically incentivized participants such as Infura and Etherscan."
Infura is a SaaS platform that provides APIs and developer tools for easy and secure access to Ethereum and IPFS. In other words, it is core infrastructure that serves as a gateway to the blockchain. Co-founder Michael Wuehler, author at ConsenSys, has said that Infura’s goal is to simplify development cycles and the process of creating products on Ethereum. And developers have responded positively to the help. Jackson Ng sums up the feelings of many developers like himself in his blog:
"It takes hours to fast-sync and this sometimes fails. Light-sync will look for a node on the Blockchain that supports light-syncing but these nodes are far and few (…) I don’t like this unpredictability and this is where Infura is such as beautiful solution in my opinion"
Infura is currently handling around 13 billion code requests per day (according to Cointelegraph) for 50,000+ developers and dapps. The service is the back-end to dapps and protocols such as MetaMask, 0x Protocol, CryptoKitties, Truffle, and many more. Though there aren’t more precise publicly available stats, it is widely accepted that Infura’s presence has become the backdrop to a large part of the Ethereum dapp ecosystem; it could become the feared single point of failure.
Bleaker still is the fact that Infura itself is operated by a single provider – ConsenSys – and is based on Amazon Web Services (AWS), which is, in a similar manner, a point of centralization for the entire internet. So, how can the community backtrack on this turn towards centralization?
Sources and personalities of the crypto world have, since last year, begun to raise awareness of the issue. Afri Schoedon, Constantinople hard fork coordinator, has claimed “if we don’t stop relying on infura, the vision of ethereum failed” (sic).
The truth is that running a node need not be so difficult. Schoedon has been adamant that users can run nodes at home relatively cheaply, because there is rarely a need to run archival nodes (which are as big as 1.4 Terabytes and can cost thousands of dollars to maintain, as attested by Rettig’s personal experiences). Instead, Schoedon claims, pruned nodes – which can take up 90 GB and cost under $100 – are the way to go (see the quick explanation of Ethereum nodes at the bottom of this article).
Other kinds of solutions are also on the horizon: services like dappnode.io, BlockCypher, DeNode and quicknode.io are emerging as decentralized alternatives to Infura. Furthermore, Infura itself is aware of the problem, and the company claims to be making efforts to reverse the centralizing trend, including searching for cloud providers other than Amazon. Wuehler has stated, “Our efforts are mainly about continually trying to push more and more decentralization into the way that our technology stack is delivered.”
On January 15th Ethereum announced a delay in its Constantinople hard fork (second phase of Metropolis) due to a security vulnerability allowing a reentrancy attack. While the community awaits Constantinople and Serenity, which will usher in PoS Casper, it would be wise to remember that band-aid solutions – which is what Infura is to scalability issues – are simply that: temporary and fickle.
Ethereum nodes (or clients) are software that implements the Ethereum protocol. Nodes give users access to the blockchain: they usually provide wallet functionality, can be used for mining, and can sometimes connect to other nodes or indirectly connect other programs to the blockchain (through web APIs or Unix sockets). There are different implementations written in different languages (Go, C++, Python, Java, Rust, etc.). The most popular is Geth (short for Go-Ethereum), representing 56.2% of the network according to Etherscan (Parity-Ethereum and Parity follow).
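As a rough illustration of how another program reaches the blockchain through a node's web API, the sketch below builds a JSON-RPC 2.0 request for the standard `eth_blockNumber` method and decodes the hex-encoded reply. The address `http://localhost:8545` is an assumption (a commonly used default for a local node with its HTTP API enabled), and the helper names `build_request`, `parse_quantity` and `call_node` are made up for this example.

```python
import json
import urllib.request

# Assumption: a local node exposing its HTTP JSON-RPC API on this address.
NODE_URL = "http://localhost:8545"


def build_request(method, params=None, request_id=1):
    """Return the JSON-RPC 2.0 payload for a call such as eth_blockNumber."""
    return {
        "jsonrpc": "2.0",
        "method": method,
        "params": params or [],
        "id": request_id,
    }


def parse_quantity(response):
    """Decode a hex-encoded quantity (e.g. "0x10") from a JSON-RPC response."""
    if "error" in response:
        raise RuntimeError(response["error"])
    return int(response["result"], 16)


def call_node(method, params=None, url=NODE_URL):
    """Send one JSON-RPC call to the node and return the decoded JSON reply."""
    body = json.dumps(build_request(method, params)).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With a node running, `parse_quantity(call_node("eth_blockNumber"))` would return the height of the latest block that node knows about. Pointing `url` at a hosted provider instead of your own machine is exactly the dependency this article describes.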
Not all nodes are created equal: in broad terms, there are full nodes and light clients. Full nodes verify all blocks and maintain the current state of the network. They are therefore very large, and not accessible on low-end hardware like phones. Light clients, on the other hand, only download headers and verify small bits of the chain, allowing speedier access and requiring less storage space.
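To make the light-client idea more concrete, here is a toy sketch of checking that a sequence of headers links together by parent hash, without downloading any block bodies. It is a simplification under stated assumptions: the `Header` fields and the SHA-256-over-JSON hash are stand-ins chosen for this example, whereas real Ethereum headers carry many more fields, are hashed with Keccak-256 over an RLP encoding, and are also checked for valid proof-of-work.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Header:
    # Hypothetical, heavily reduced header: real headers have many more fields.
    number: int
    parent_hash: str
    state_root: str


def header_hash(header):
    """Stand-in hash (SHA-256 of a JSON encoding), not Ethereum's Keccak-256/RLP."""
    encoded = json.dumps(asdict(header), sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def headers_link(headers):
    """Return True if every header points at the hash of the one before it."""
    for parent, child in zip(headers, headers[1:]):
        if child.number != parent.number + 1:
            return False
        if child.parent_hash != header_hash(parent):
            return False
    return True


# A three-header chain, plus a forged header that does not link to its parent.
genesis = Header(number=0, parent_hash="0" * 64, state_root="aa")
block1 = Header(number=1, parent_hash=header_hash(genesis), state_root="bb")
block2 = Header(number=2, parent_hash=header_hash(block1), state_root="cc")
forged = Header(number=2, parent_hash="f" * 64, state_root="cc")
```

Here `headers_link([genesis, block1, block2])` holds, while swapping in `forged` breaks the chain. The point is only that headers are small and can be checked on their own, which is why light clients need far less storage than full nodes.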
Finally, there are pruning modes, which determine how much block data is saved. Different node implementations will have slightly different pruning modes, but in general we see variations of, for example, "archive" mode (saves all past states of the chain), "fast" and "light" modes (save only the current state and a few before), etc.
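The difference between these modes can be pictured with a toy model: an archive node keeps the state at every block, while a pruned node keeps only the most recent few. The class below is a sketch for illustration only; `StateStore`, `keep_last` and the dictionary-based "state" are hypothetical and do not reflect how Geth or Parity actually store the state trie.

```python
from collections import OrderedDict


class StateStore:
    """Toy store mapping block number -> state snapshot."""

    def __init__(self, keep_last=None):
        # keep_last=None models "archive" mode (keep every past state);
        # an integer models a pruned node that keeps only recent states.
        self.keep_last = keep_last
        self.states = OrderedDict()

    def add_block(self, number, state):
        """Record the state after a block and prune old states if required."""
        self.states[number] = state
        if self.keep_last is not None:
            while len(self.states) > self.keep_last:
                self.states.popitem(last=False)  # drop the oldest state

    def state_at(self, number):
        """Return the stored state, or None if it has been pruned away."""
        return self.states.get(number)


archive = StateStore()
pruned = StateStore(keep_last=3)
for n in range(10):
    snapshot = {"balance": n * 100}
    archive.add_block(n, snapshot)
    pruned.add_block(n, snapshot)
```

After ten blocks the archive store can still answer a query about block 0, while the pruned store only holds blocks 7 to 9. That trade-off, historical queries versus disk space, is what separates the terabyte-scale archive nodes from the far smaller pruned ones mentioned earlier.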