Episode 219

Hashgraph – A Radically Novel Consensus Algorithm

Mance Harmon

Hashgraph is a new consensus algorithm that radically differs from proof-of-work as well as proof-of-stake consensus algorithms. While work on Hashgraph began in 2012, its design is radically different from today’s blockchain architectures. The Hashgraph team claims that it has found an optimal consensus algorithm design that will be impossible to significantly improve upon.

We were joined by Mance Harmon, who is CEO of Swirlds, the company developing Hashgraph. Our conversation covered the origin story of Hashgraph, how it compares to existing consensus algorithms and how Hashgraph works.

Topics discussed in the episode

  • Leemon Baird and Mance Harmon’s long history of building companies together
  • What motivated Leemon Baird to start working on Hashgraph in 2012
  • The existing categories of consensus algorithms and their problems
  • How Hashgraph consensus combines voting and gossip protocols
  • The performance characteristics of Hashgraph
  • What a public Hashgraph network could look like

Meher Roy: Today, we have a very interesting episode lined up for our audience. We’re going to interview Mance Harmon of the project Hashgraph. Hashgraph is pioneering a very unique consensus algorithm that offers something super interesting to the blockchain consensus toolset. Mance, welcome to the show.

Mance Harmon: Thank you. Thank you for having me. Glad to be here.

Meher: So we’re going to discuss how Hashgraph works during the episode. But perhaps before we get into consensus algorithms, Hashgraph and such complex topics, tell us a bit about your background and how you came to be involved in the blockchain space.

Mance: Sure. Well, I have a deep tech background. I started off my career doing research for the Air Force senior scientist for machine intelligence. I worked on a team of five doing basic research in machine learning, specifically, reinforcement learning and the combination of reinforcement learning with neural networks, convolutional networks back in the early and mid-90s before it was known as deep learning. Incidentally, that’s where I met my business partner and co-founder and creator of Hashgraph, Leemon Baird, Dr. Leemon Baird. We were on that same team working together for the senior scientist. I also taught computer science at the Air Force Academy. I was a course director for cybersecurity there at the academy. Leemon was also at the academy and was a full professor at the academy. I then went off and managed a massive software program for the Missile Defense Agency for the US government. Basically, a massive simulator that allowed the government to learn how to protect its citizens and its allies from incoming ballistic missile attacks.

Leemon and I decided we wanted to become entrepreneurs. And so we started our first company. I left the military and it was an identity and access management company with a distributed single sign-on solution back in the days of the Palm Pilots around the year 2000, and we sold that in 2004. Went to work for the acquirer, a Fortune 500, and became the Senior Exec in the company for product security. Stayed in that role for a short amount of time. And Leemon and I decided to start our second company. Again, in the space of identity, and ran that for six and a half years. Sold that to private equity, ended up going to work for a friend, Andre Durand, who’s the CEO of Ping Identity. I became the head of Labs. I stood up the labs organization for Ping and also headed up Architecture for Ping.

In 2012, Leemon decided that he was seriously interested in this space of distributed consensus, sort of independently of anything that was going on with blockchain and Bitcoin. You know, his vision required a far more performant and secure approach than what’s required for just a cryptocurrency. And so he went to work, and he sort of gnawed on this problem for years. And in 2015, he got to the point where he’d solved the problem and it’s what we now call today, Hashgraph. This happened while I was in that role at Ping. And so I decided to leave Ping and started our third company with Leemon, mine and Leemon’s third company. And it just turns out that Ping happened to be a first investor in Swirlds, the company that’s commercializing Hashgraph, the technology. And that’s how we got to where we are today.

Brian: Cool. Well, you guys have an interesting background. That’s a bit unusual from many in the blockchain space and the distributed ledger space, I think more experienced than probably your average person in this industry.

Mance: Well, Leemon has been inventing things for as long as I’ve known him. It just happens that what he invented this time was exceptionally good and at exactly the right time in the market, right? [Laughs] You know, Hashgraph. You know, it’s not unusual for Leemon to come up with things. There are a lot of inventions I’m sure that we’ll talk about some other time, but Hashgraph is what we’re pursuing today. Yep.

Brian: Cool. Yeah, I mean, of course, we will speak more about Hashgraph later and how it exactly works, but I do think it’s interesting when people approach this problem of distributed ledger and blockchain, or you know, the thing that this technology is trying to solve without coming, you know, from having sort of gone through the sequence of Bitcoin and then next, because I think that kind of enforces a certain way of thinking. But still, it was interesting to hear that Leemon started working on this in 2012. So was he aware of Bitcoin back then? Did he kind of follow it and did that in any way kind of shape his thinking that say, okay, I’m not going to do it that way, or I see some problems there and that’s going to inform the design of Hashgraph. Do you know, kind of how that—

Mance: Yeah. Oh, no, I know exactly how it happened. I mean, Leemon and I, this was in Austin, Texas by the way, Cedar Park, a suburb of Austin. And there was a particular Starbucks on the corner, Leemon and I lived a mile apart. There’s a Starbucks on the corner and we spent many, many, many evenings at that Starbucks talking about the different approaches that he had been taking. And the simple answer is no. I mean, he did know about Bitcoin, but it didn’t really inform his thinking at all. Because the problem set, the use cases that he had in mind, he understood very clearly early on that Bitcoin would never be able to address them. And so it wasn’t as though he said, well, here’s how blockchain works and here’s what I don’t like about it, et cetera, et cetera. It was more like I have this clean slate to start with, sort of tabula rasa approach to coming up with an algorithm that can address this set of constraints that I have, that are implicit in the use cases we care about. And so I mean, clearly, he understood what was going on in the market for Bitcoin, but it really had no influence on the design of Hashgraph at all.

Brian: Cool. Well, even though Hashgraph, since he said, you know, it’s independent in its origination and kind of been developed in parallel, you guys, when you explain Hashgraph often, kind of analyze, okay, these are the existing type of kind of consensus systems in blockchain and how they work. Would you mind kind of running us through how you viewed those different categories and kind of your perspective on the strengths and shortcomings of them?

Mance: Sure. Yeah. Well, so when Leemon had the inspiration of gossip about gossip, you know, that enables this approach and we’ll talk later about what that means. You know, it had a unique set of properties in the market and what we did at that point was begin to do a serious deep dive into the full range of categories of consensus algorithms to just explore how Hashgraph was different from all of those. And it turns out that there are only a handful of categories. Now Leemon had done an exhaustive literature search and so he understood what these approaches were and the categories were, but when you put it on paper and you start to point out the differences, the strengths and weaknesses, you can more clearly see really, you know, the power and performance of Hashgraph as it relates to the field. And so the field is pretty small in terms of categories. In the permissioned world, you know, Hyperledger and Corda from R3 and EEA, et cetera. Of course, they don’t use blockchain. They’ve replaced proof-of-work blockchain with what we’ll call, as a category, leader-based algorithms, and these are represented by Paxos and Raft and PBFT (Practical Byzantine Fault Tolerance). And there are dozens of variants on this category. And it works very simply. The nodes in the network elect a leader, they send all of their transactions to the leader. The leader has responsibility for taking those transactions, putting the transactions in an order, and then sending those transactions in that order out to every member of the network, out to every node. The nodes, of course, just take them in that order and commit them to the underlying database in the same order. And that’s how they keep the databases in sync.
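The leader-based flow described here can be sketched in a few lines of Python. This is only an illustrative toy, not how Paxos, Raft or PBFT are actually implemented: the class names `Node` and `Leader` and the methods `submit`, `broadcast` and `commit` are hypothetical, leader election is skipped entirely, and the leader is assumed to order transactions by arrival.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A follower that commits transactions in whatever order the leader sends."""
    name: str
    ledger: list = field(default_factory=list)

    def commit(self, ordered_txs):
        self.ledger.extend(ordered_txs)


@dataclass
class Leader(Node):
    """The elected node: it alone decides the order of pending transactions."""
    pending: list = field(default_factory=list)

    def submit(self, tx):
        # Every node forwards its transactions to the leader.
        self.pending.append(tx)

    def broadcast(self, nodes):
        # The leader fixes an order (assumed here: arrival order) and sends it to all.
        ordered, self.pending = self.pending, []
        for node in nodes:
            node.commit(ordered)


leader = Leader("n0")
followers = [Node("n1"), Node("n2")]
for tx in ["pay A->B", "pay B->C", "pay C->A"]:
    leader.submit(tx)
leader.broadcast([leader] + followers)

# Every replica ends up with the same ledger in the same order.
assert all(n.ledger == leader.ledger for n in followers)
```

Because everything funnels through `Leader.broadcast`, that one object is both the throughput bottleneck and the only place where ordering is decided, which is the structural point being made in the conversation.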

So this is okay. It doesn’t scale that well. You can, you know, get to dozens of nodes. You cannot scale to thousands or tens of thousands of nodes. You know, in fact, in the literature the most I’ve ever seen is about 100 nodes that scale in this manner. The problem is that there is a leader and because there’s a leader, there’s a bottleneck. These systems top out at about a thousand transactions per second, roughly speaking. If you look at IBM and you know, Hyperledger and Coco, Microsoft’s Coco, and others, they’ll announce, or they’ve marketed the fact that they can achieve a thousand transactions per second. They’re all using this category. And the problem is that there’s a leader. So there’s a bottleneck in terms of scalability both in the number of nodes and the number of transactions per second, but more importantly, maybe more importantly, is the fact that it’s vulnerable to distributed denial of service attacks and nobody ever really talks about this, but each member of the network by design has to know the IP address of the leader or at least how to communicate with the leader. And if that’s the case, then it becomes possible for any node that is compromised, whether that be an insider threat, you know, a disgruntled employee or the node has been compromised by an attacker from a foreign country, et cetera. Whatever the case might be, if the attacker can direct a botnet to do a DDoS attack on the leader, then they can attack one computer at a time, and stop the communication flow for the entire network. Now what happens is that the network recognizes the leader’s gone offline and so they decide to elect a new leader. That happens very quickly, by the way, you know, within a couple of seconds normally. But the attacker knows the IP address of the new leader and so they just change the target of the attack and just follow the leader with this DDoS attack. So this is a fundamental problem in the architecture.

So the other problem is that it’s not fair in a certain technical sense. So there are use cases where maintaining the actual order of events as they flow into the network is extremely important. For example, maybe a distributed stock market. If it’s possible for a single party to prevent bids or asks from flowing into the market or to influence the order of those transactions to unfair benefit, then that’s a problem, right? And with a leader-based system, the leader has responsibility for exactly that, for putting the transactions in an order. And if the leader so chose, they could drop transactions, they could influence the order to unfair advantage. And so in other words, the leader could be bribed. So this category of use cases that require fairness is fairly broad. It’s not just trading platforms. It could be games if you had an MMO and you know, people in the game environment, two people reach over to pick up the pot of gold at the same time. The one that actually pushed the button first should get the gold. It shouldn’t be possible to influence that. Or auctions if you’re creating an eBay, a distributed eBay. The person that pushed the button last should actually get to win the auction. And this fairness property is really important for a large group of use cases and leader-based systems just inherently are not fair. So that’s the problem with the leader-based category. Of course, the public networks don’t use leaders or most of them don’t. We’re beginning to see some of the public networks beginning to use some types of leader-based technology.

But proof-of-work blockchain is inherently different. All the transactions flow to all of the miners in the network. The miners collect all those transactions, put them into a block, and then they compete with each other to solve a hard crypto puzzle. This is the proof-of-work part of this. And the miner that solves the puzzle first earns the right of publishing their block of transactions to the rest of the miners. That then gets put on top of their local copy of the chain, the blockchain. The miners unilaterally decide which transactions go into the block. So it isn’t fair in the sense that they can prevent transactions from flowing into the network. And if they put it in the block, they decide the order of the transactions within the block and so it’s not fair in ordering either. And so proof-of-work blockchain, and blockchain in general where there is a miner that decides which transactions go into the block and the order, is not fair. Now, proof-of-work blockchain is more secure than leader-based systems because you can’t predict which miner is going to win the crypto puzzle first, right? Who’s gonna solve the inverse hash puzzle first. And therefore, you can’t do a DDoS attack against the network in the same way you can in a leader-based system. So it’s more secure. It’s not fair. Of course, performance is terrible compared to leader-based systems today, a few transactions per second and extremely expensive when using proof-of-work.
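To make the "crypto puzzle" concrete, here is a minimal Python sketch of a hash puzzle in the spirit of proof-of-work. It is a simplification under stated assumptions: the function name `mine`, the `block_data|nonce` string encoding and the "leading zero hex digits" difficulty rule are illustrative choices, not Bitcoin's actual block format or target calculation.

```python
import hashlib


def mine(block_data: str, difficulty: int = 4, max_nonce: int = 10_000_000):
    """Search for a nonce whose SHA-256 digest starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    for nonce in range(max_nonce):
        digest = hashlib.sha256(f"{block_data}|{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
    raise RuntimeError("no nonce found within max_nonce attempts")


# The miner alone chooses which transactions go in, and in what order.
block = "tx3,tx1,tx2"
nonce, digest = mine(block, difficulty=4)
print(nonce, digest)
```

Nobody can tell in advance which miner will find a valid nonce first, which is why there is no fixed leader to target. The `block` string, though, is assembled entirely by whoever wins, which is the fairness concern raised above.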

And so those are two of the major categories. Now in response to the problems of proof-of-work blockchain, the market has come up with a new generation of technologies that try to replace proof-of-work with other mechanisms. We call these economy-based solutions for the simple reason that the approach is to make a fundamental assumption that the stakers, in this case we change the term miners to stakers for economy-based systems, are rational and they’re going to do what’s in their own self-interest. In other words, they’re going to try to maximize the amount of money that they make. So that’s an assumption. And then we come up with a system of incentives that when applied across all of the stakers, will result in sort of an emergent behavior that the stakers come to agreement on the blocks that should go on top of the chain. In other words, they come to consensus. And there are various ways of doing this. But what’s important is to understand that one, the fundamental assumption that the stakers are rational is not a good assumption. Of course, we assume that those running the nodes, the owner of this hardware that’s running the node in this economy-based system, wants to maximize the amount of money they make. But if that node is compromised by a virus or a disgruntled employee or anything, then that assumption is out the window. It’s not a good assumption. The virus doesn’t care whether or not the person that owns the computer makes money. The virus only cares about bringing down the network, that’s the goal of the virus, perhaps. And so it’s not a good assumption to assume rationality as the basis of consensus for the entire solution. Number one.

And then secondly, the whole approach is extremely complex in the sense that it’s impossible to formally analyze it in the sense that we would formally analyze an algorithm and write math proofs of certain properties about that algorithm. The community is talking about proving formally that code is correct and that the code will do certain things. For example, a lot has been made that we can write proofs that certain smart contracts will result in the staker either winning more money or losing money once they’ve entered into the contract. That’s not the right question to be asking, right? We care about that, but that’s not what we fundamentally care about. What we fundamentally care about is whether or not a single attacker can do something maybe unknown to us today but can figure out a way to attack the network and cause the network to basically come to a standstill, cause the network to not be able to come to consensus. That’s the math proof we care about. We need to know for certain, guaranteed, that it’s impossible for a single attacker to prevent the network from coming to consensus. And those proofs don’t exist for the economy-based model. And they don’t because the system is complex, it’s chaotic, it’s an economy, it’s a model of an economy. And so for the same reasons, it’s not possible to write a formal proof that, for example, a stock market will never crash again because it’s too complex. And it’s a chaotic system. You can’t do the same for this category of approach, these economy-based solutions.

So those are three, right? Leader-based system, proof-of-work blockchain, economy-based solutions. The question then is, well, what can you do formally? I mean, we want to guarantee the best security possible for the platform. If this platform is going to be processing trillions of dollars of value, that’s fundamentally important. And we want it to be extremely fast. We want to be able to scale up way beyond, orders and orders of magnitude beyond where we are today. So what do you do? It turns out there is another category we’ll call pure voting-based algorithms. They go back decades, 30-plus years, and they work in the following way. Let’s assume perhaps we have ten nodes in our network and if I want those to agree on the order of transactions, I can ask those nodes to vote on a question that’s related to the order. And what that means is that I ask each node the same question. Each node will cast their vote and when they do so, they send it to every other node in the network. And then when a node receives a vote, it sends an acknowledgment, not just back to the sender but again, to every other node in the network. And then there may be multiple rounds of this voting process that are required for those nodes, those ten, to come to agreement on the order of transactions. And if you have a stream of transactions, it becomes even more complicated. The point here is this, the amount of bandwidth required, the amount of traffic that’s flowing just to accommodate the voting process just explodes and it explodes in the number of nodes and in the number of transactions. And so for that reason, this category is not practical at scale and no pure voting-based system exists at scale today.
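A back-of-the-envelope count shows why the traffic "explodes". The Python sketch below assumes the simplified model described in the conversation: in one round every node sends its vote to every other node, and every received vote is acknowledged to every other node. The function names are hypothetical, and real voting protocols differ in their exact message patterns.

```python
def messages_per_round(n: int) -> int:
    """Messages in one voting round under the simplified model described above."""
    votes = n * (n - 1)       # each node sends its vote to every other node
    acks = votes * (n - 1)    # each received vote is acknowledged to every other node
    return votes + acks       # equals n^2 * (n - 1)


def total_messages(n: int, rounds: int, questions: int) -> int:
    """Scale the per-round cost by the number of rounds and questions voted on."""
    return messages_per_round(n) * rounds * questions


for n in (10, 100, 1000):
    print(n, messages_per_round(n))
# 10 900
# 100 990000
# 1000 999000000
```

Under this model the cost grows roughly with the cube of the number of nodes, and then multiplies again by the number of rounds and the number of questions, which is the scaling problem being described.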

However, this approach has phenomenal security properties. These pure voting-based systems have been shown to achieve the gold standard in terms of security, provably in the mathematical sense that I’ve been describing. And that standard is called Asynchronous Byzantine Fault Tolerance. Everything else prior to this point has been something less than that. And only these pure voting-based systems have been shown to achieve asynchronous BFT, but they’ve not been able to achieve it at scale. So those are the four. Leader-based systems, proof-of-work blockchain, economy-based solutions, pure voting-based systems that are theoretical and don’t scale. The question then is how do you achieve the properties of these pure voting-based systems?

Brian: Well, maybe let me just jump in here a little bit.

Mance: Yeah, please.

Brian: Because I feel there are a bunch of things, you know, maybe I don’t fully agree with. So you know, for the last year, as I’m sure most listeners know, a bit more than that, I’ve been working on Cosmos and Tendermint. So Tendermint, it is essentially an implementation of—so it has kind of these voting-based properties, and proof-of-stake. I think, game-theoretically, it’s extremely safe. In terms of performance, I know they have run, we have done some benchmarking. We have, I don’t know, 100 nodes, a few hundred nodes and something like 10,000 transactions per second.

Mance: Sure, sure.

Brian: And yeah, it is kind of leader-based. So there is a validator that proposes each block, there is kind of a leader. And it is of course an interesting question: how can they be attacked? I think DDoS is something that can be mitigated. I think the point you made about fairness is a more fundamental one, and I can see that being a massive issue. And of course, we are going to speak about—and I think where you’re making a good point is that, even though I think these systems seem to scale pretty well up to a point, to the point of maybe a few hundred validators or something like that, you know, you won’t be able to do ten thousand validators, right? That seems clearly infeasible, or even a thousand, you know, performance will get much worse.

Mance: And of course, if you only have hundreds, that may be enough, if you’re sharding, right? In a single shard, it may be sufficient to only have hundreds of validators to ensure security of the system.

Brian: Well, I mean, my thought on this is actually that 100 validators is perfectly fine if those are, you know, independent, running their own separate systems and there’s some sort of reasonable distribution of power. I mean, we look at Bitcoin today and you know, many times it was maybe three miners, or three mining pools, that actually had a majority. And that’s concerning even though so far it has worked. So if you have a hundred and this reasonable distribution, then I think that’s fine. Of course the question is always, okay, who are all these validators, what are their incentives, how much power do they have, how are they associated? You kind of made the point before, oh, someone could have like a virus and bring the system down. Well, I think then the question is, of course, if they all ran on the same hardware and it’s the same setup, then yeah, that could be a risk. So you want diversity in their setup as well.

Mance: Well, certainly, there’s that, but it’s still the question of is there a way to bring down the network? Period. And unless you can, so what we always have done throughout our careers, Leemon and I, is start with first principles, right? Start with the fundamentals and start with the math and create the best, most solid foundation possible. And once you have the best building block, then the building you build with those blocks turns out to be fundamentally different in some meaningful ways than if you start with something that’s not quite as strong, you know, has a flaw in one way or another or can be attacked in one way or another. Because these vulnerabilities compound, right? And so for example, just to make the point, what Leemon has designed is a way to scale this, you know, a sharded system, a protocol for sharding, that maintains the asynchronous BFT nature across the entire system.

Right now, this is not yet public. It will become public in the coming weeks and months. But you know, in most cases, when you shard that, the security of the system as a whole is degraded. In this case, because we’ve started with asynchronous BFT as the building block, it turns out you’re able then to create a sharded solution that system wide is asynchronous BFT. And then I would say just one other thing, there are a number of different platforms out there that are experimenting with the combination of these different architectures and they’re doing so because they need to achieve scale and they want to achieve high throughput and et cetera, et cetera, et cetera. So they’re doing it for good reasons and I can appreciate why they’re making these combinations. The problem is that when you do create the combinations, you may improve certain aspects, but you inherit the vulnerabilities of all. And so the vulnerabilities associated with leader-based and the vulnerabilities associated with a proof-of-work blockchain or economy-based solutions, if you combine them, you have then, you know, a larger set of vulnerabilities to defend against than if you stay pure to one of the categories.

Meher: So before we move on to Hashgraph, I’d like to know sort of the history and genealogy of voting-based protocols. So Leslie Lamport and Shostak, they defined the Byzantine Generals’ Problem in 1982 in a paper, right? And they defined the problem as being, there’s a city and there’s a set of generals around the city and these generals must coordinate and attack at the same time, yet some of the generals are probably corrupted and will try to make sure that part of the generals attacks at one time and the other part attacks at a different time. So you have traitors inside this group of generals, but despite these traitors, these generals must come to an agreement on when to attack. And Byzantine Fault Tolerance, like the Byzantine Generals’ Problem, stayed an open problem for quite a while until algorithms that were based on leader election, and voting followed by leader election, were kind of proposed. So the idea would be one general would become a leader, that general will propose a time to attack, the other generals would vote. And then once enough generals vote affirmatively, you could like sort of collect these votes and attack at that time. And the problem there also was that none of these protocols were, in a computer science sense, performant. And finally in 1999, at MIT, there was a PhD thesis and paper published on Practical Byzantine Fault Tolerance. I think it was Miguel Castro that published that paper. And Practical Byzantine Fault Tolerance showed for the first time that a leader-based system, with a leader and other people voting on what the leader proposes, can be made practical and turned into a computer system that can be used to vote not just on one piece of data like when to attack, but on thousands of transactions.

So in the family of leader-based algorithms, you see this pattern. There’s an idea, it’s not practical for a decade or decade and a half and then some person comes and makes it practical in a system. And at some level, with Hashgraph what you’re saying is there’s another family that’s voting-based algorithm that’s different from this genealogy, this part of the consensus tree. And that has existed as an academic idea. And what you’re doing is you’re making it practical. Can you point us to who came up with this family and do you know what are the important papers in voting-based algorithms?

Mance: Yeah. Well, so thank you for the genealogy. I think you’ve done a better job there than I could have done off the cuff. Leemon—

Meher: Thank you.

Mance: That was good. It was great. So I can’t tell you off the cuff what the papers are by reference, but I certainly could get the references. The starting point is the same, right? The pure voting-based algorithms that go back to the 70s and 80s are the starting point and then there’s sort of two trees, if you will, that would branch. One is the Paxos and Raft and PBFT tree that you’ve mentioned. And by the way, the whole DDoS set of attacks and problems that are associated with this category, there is a phenomenal paper. Again, I can get you the reference and you can provide it, you know, when you publish this. It deals specifically with PBFT and how it’s vulnerable to these various types of timing-based and DDoS attacks. So that’s a branch and then there hasn’t been another branch. I mean, there’s no in-between paper from the 70s to Hashgraph, you know. Hashgraph as an approach is brand new. And what makes that possible for the very first time is this inspiration that if you’re going to gossip information, you know, you’re sending transactions across to everybody, you can do something special, a little bit special, there that makes it possible to use this internal data structure that you build up. That is the Hashgraph. And then combine that with the use of a pure voting-based algorithm without ever having to cast the votes. That’s entirely new. And it’s not as though there are steps from the 70s to 2015. There’s one step from the 70s to 2015. And that was Leemon and the Hashgraph.

Brian: So I think you just kind of summarized Hashgraph, but I’m pretty sure nobody would be able to understand it from that. So you said, right, the gossip, so we have the gossip sending transactions, and then kind of a data structure emerges that is called the Hashgraph, and that can somehow be combined with this voting type

Mance: Of approach.

Brian: Approach to create a consensus algo. And so maybe we can unpack that exactly what’s going on here. First of all, gossip. I don’t think many, everyone will be kind of, it will be clear what that means. Can you explain what that is?

Mance: Yes. Well, gossip protocols have existed for decades as well. It’s a common approach for sending information to a large population very efficiently over a data network. And the idea is, you know, it gets its name from what you observe in the workplace or in your social circles. You know, one person tells another person a rumor and then that person tells somebody else the same rumor and the rumor just flows through the population exponentially fast. Alice tells Bob the rumor and then you know, at time t+1, Bob is telling Ed and Alice is telling Charlie the rumor, et cetera. It explodes through the population exponentially fast. That is a gossip protocol and you can implement that in data networks. And the question then is, what is it that you are gossiping about? And you can gossip about a lot of different things. You can gossip identity information, you can gossip transactions, sort of opaque arbitrary transactions that are only understood by an application that sits above.
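The "exponentially fast" spread can be seen in a tiny simulation. The Python sketch below models a simple push-style gossip in which every node that knows the rumor tells one randomly chosen node per round. The function name `gossip_rounds`, the random peer selection and the fixed seed are assumptions made for illustration, not a description of any particular network's protocol.

```python
import random


def gossip_rounds(n: int, seed: int = 0) -> int:
    """Rounds of push gossip until all n nodes know the rumor."""
    rng = random.Random(seed)
    informed = {0}                 # node 0 (Alice) starts the rumor
    rounds = 0
    while len(informed) < n:
        # Every informed node tells one randomly chosen node this round.
        newly_told = {rng.randrange(n) for _ in informed}
        informed |= newly_told
        rounds += 1
    return rounds


for n in (16, 1024, 65536):
    print(n, gossip_rounds(n))
```

The informed set can at most double each round, so at least log2(n) rounds are needed. In practice the number of rounds stays close to that order of magnitude, which is why gossip reaches a large population so quickly.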

What we do is we gossip about gossip and that’s going to be a little bit hard to wrap your head around initially. Every node in the network has a local copy of a database. And the goal is for all those databases to stay in sync. Some people call them a ledger, right? Have the ledgers stay in sync. If I write a transaction, if I create a transaction and I want to update my ledger, that transaction has to go to every other node in the network so that they can update their copy as well. So all transactions have to go to all ledgers. That’s the minimum bandwidth required for this gossip protocol. What we do is create a graph that memorializes when people talk to each other, when nodes talk to each other. And so if Alice talks to Bob, Bob creates a circle that goes into this graph. If you were to draw the graph, you know, you’d get lines and circles. Bob creates a circle that memorializes this syncing event between himself and Alice. And as part of that circle, Bob creates some transactions that would be understood by the application and they’re the payload that then goes in this circle. All right. Also in the circle is encapsulated a timestamp and a couple of other hashes and I’ll get to what those are in just a moment. But that’s it. You know, there’s a data structure that represents this circle. And then the data structure is a payload of transactions, a timestamp and then two hashes and that gets signed, that structure gets signed.

Brian: And maybe just to sort of bring this one up here, gossip protocols are being used by existing blockchains like Bitcoin and others, basically used to, again, propagate transactions, fill up the mempool, propagate blocks, but they’re not used in consensus, those are kind of separate networks. Or in Bitcoin, I think there’s even something like a relay network between miners to propagate blocks faster, and maybe that also uses some kind of gossip protocol, probably?

Mance: They’re used everywhere. I mean, gossip protocols as a category are used all throughout all kinds of applications across computer science. So yes, that’s a very well understood approach and we’re certainly not unique in the fact that we are using the gossip protocol. What we’re unique in is what we’re gossiping, you know. We’re gossiping the transactions as part of that event, but we’re also gossiping two hashes. The first represents the last event that was passed to me by somebody else. That event with its package of transactions gets hashed, and that hash goes into my event, this new event. And the last event I created, it gets hashed, and its hash goes into my event. And so this event with all of the different data items that I’ve said plus the two hashes that link back to two prior events, that then gets gossiped to the network. And when members of the network receive these events through the gossip protocol, they can, by using these hashes, create the hashgraph. The hashes make it possible to link together these events in a certain order that creates the hashgraph. And provably, each node in the network ends up having the same hashgraph. They’re identical, at least through a point in time. They’re identical. And so new events are always flowing in on top, you know, you’re always gossiping and talking to other members of the network. The network just runs as fast as it can. And at a moment in time, it becomes the case that all the events prior to that moment in time are known by all the members of the network. And you agree on the hashgraph. So that’s important. That’s the gossip about gossip that we’re talking about.
And the inspiration was that if we do that, there is enough information there that we can take a pure voting-based algorithm and use the information in the local copy of the hashgraph and instead of sending votes over the network, you know, asking each member of the network to cast a vote on a certain question, we just locally say, for this question, what would each member of the network vote if they were to cast a vote and there’s enough information in this hashgraph to answer that question for every member.

And so it’s gossip about gossip with virtual voting. You never have to send a vote over the network and you’re done. That’s it. And it maintains the properties of the pure voting-based systems in the sense that it’s asynchronous BFT. It achieves for the first time ever the gold standard in security as it relates to distributed systems and it does it at scale. And also, in addition, because you have to send the transactions to everybody, that’s the baseline. And all we’re adding is two hashes to that event, that message that’s being sent. We don’t think you can do any better in terms of bandwidth efficiency, so we’ve achieved the best one can achieve in terms of bandwidth efficiency and we’ve achieved asynchronous BFT for the first time. That’s why we make the claim, and I realize that’s a really bold statement, that Leemon has actually solved the problem of distributed consensus at scale with the Hashgraph.
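A small sketch can show why the two hashes carry enough information to reason locally about what another member knows. The actual voting rules are not spelled out in this conversation and are not reproduced here. The code only illustrates the ancestry lookup that such reasoning would build on, and the dictionary layout and function names are assumptions.

```python
from typing import Dict, Optional, Set, Tuple

# Assumed layout: event id -> (creator, self-parent id, other-parent id).
Hashgraph = Dict[str, Tuple[str, Optional[str], Optional[str]]]


def ancestors(graph: Hashgraph, event_id: str) -> Set[str]:
    """Return the event itself plus every event reachable via parent hashes."""
    seen: Set[str] = set()
    stack = [event_id]
    while stack:
        current = stack.pop()
        if current is None or current in seen:
            continue
        seen.add(current)
        _, self_parent, other_parent = graph[current]
        stack.extend([self_parent, other_parent])
    return seen


def known_creators(graph: Hashgraph, event_id: str) -> Set[str]:
    """Which members' events had reached the creator when this event was made."""
    return {graph[e][0] for e in ancestors(graph, event_id)}


if __name__ == "__main__":
    graph: Hashgraph = {
        "a1": ("alice", None, None),
        "b1": ("bob", None, None),
        "c1": ("carol", None, None),
        "b2": ("bob", "b1", "a1"),    # Bob records a sync with Alice
        "a2": ("alice", "a1", "b2"),  # Alice records a sync with Bob
    }
    print(sorted(ancestors(graph, "a2")))
    print(sorted(known_creators(graph, "a2")))
```

Any node holding the same graph computes the same ancestor sets, which is what lets each one work out locally what every other member had heard at a given point, without any vote messages being sent.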

Meher: It’s actually pretty easy to imagine the big contribution of Hashgraph, right? We can harken back to that imagination that there’s a city and the generals gathered around the city, so let’s say 50 generals, the three of us are part of these 50. So we each have our individual armies, Brian, Mance and me and we are part of this group. And the group of 50 generals, we must attack together, and we have to sort of agree on a time and let’s say not only do we have to agree on a time, but we have to agree on a sequence of times. So maybe we have to agree on when will we fire the cannons, when will we re-deploy the artillery and there’s a sequence of times that we want to agree on. And if we agree on this sequence of times correctly, then our attack will happen flawlessly. If we don’t agree on it or some of us agree on one sequence and the others on another sequence, then our attack can fail. And then there are some generals inside the camp that are traitorous, and they want half of us to attack through one plan and the other half to attack through another plan. So what is Bitcoin? Bitcoin is the idea that, hey, all of us are going to solve puzzles and one of us is going to win at solving the puzzle and then the person that wins at solving the puzzle then announces one time to do something, like we’ll fire the cannons, and then he’s going to solve the puzzle and broadcast the solution and the time of firing to all of the other generals. Then all of the other generals, now all of us will start working on the next problem. And then one of us will solve that problem and the solution to that problem is additive on the solution of the first problem. And that second solution broadcasts the second time to do something else, like deploy the artillery. Then a third round will happen. All these generals will solve a different problem. Again, one of them will solve it and so on. So that’s Bitcoin, effectively.

A leader-based system is one where one of the generals, let’s say me, will become the leader. And I will say, let’s fire the cannons at like 12PM and I’ll sign this statement and I’ll send this message to all of the generals through my messenger. So generals are connected to messengers, like delivery boys that can go and deliver the message. The other generals will receive my message and vote affirmatively, right? Okay. Yeah, let’s agree to attack at 12. And then these votes are then sent through delivery boys to other generals. And if at some point, I as a general receive a message with enough votes saying, let’s attack at 12, I accept that as the correct time to, you know, fire the cannons or whatever. And then somebody else becomes the general that proposes the time to deploy the artillery. All of us vote on it. And then finally, if there are enough votes, we agree on the time to deploy the artillery. Then Brian, let’s say, creates a message. Okay, let’s send in the cavalry at this time. And then we vote, and we decide on when to send the cavalry. So that is how like a leader-based system will work. And to me, it appears that in both Bitcoin and a leader-based system, the thing that is getting exchanged between the generals is data about the strategy itself, at what time should we do what, but in Hashgraph [overlapping].

Mance: That’s the application level information.

Meher: That’s the application level information, right? What should we do when. Whereas in Hashgraph, what you’re saying effectively is, what if in addition to application level information, when to do what, we also record the sequence of what the messenger boys have done in order to send these messages across to the generals. So I’m a general, let’s say I say, okay, attack at 12 and I send it to Mance. Mance receives that messenger boy. In Mance’s letter, it says, okay, I received a messenger boy from Meher that told me this.

Mance: Right.

Meher: Then you add information about what you think. I think not only should we deploy the cannons at 12, but we should deploy the artillery at 1PM. And then you also put in some past information, that is, when was the last time you sent a messenger boy out to some other general, and you compile all of this into a packet and then you instruct your messenger to go to another general. And when the general receives your messenger boy, he is going to create another data packet, but include the behavior of your messenger boy in his data packet. And so once these messenger boys end up delivering and you start to collect information about who proposed what and how the messenger boys worked, after a certain point in time, everyone will agree on who proposed what and what the messenger boys did.

Mance: Right.

Meher: And based on that, they can come to the conclusion on when to deploy the cannons, when to deploy the artillery, something like that.

Mance: That’s it. And it’s great that you use that fundamental analogy for the different categories. And that’s right. I mean, that sort of demonstrates that when we say we’re gossiping, we’re not gossiping transactions. Of course, the transactions go along for the ride as a payload in the event, but what the event is describing is the two hashes, you know, who we talked to last, effectively, right? It’s the hash of the last event that was given to me and the hash of the last event that I created. And so it’s the gossip about gossip, the metadata that’s on top of the transactions that’s important in creating the hashgraph. And that’s important also, you know, it’s interesting to me that hashgraph as a term is becoming increasingly used. And you know, there are a lot of platforms, an increasing number of platforms in the market today that are based on DAGs (Directed Acyclic Graphs). And of course, hashgraph is a directed acyclic graph as well. It’s a DAG. The fundamental difference, what makes a hashgraph a hashgraph is what that DAG represents. And in our case, it represents the gossip flow, the flow of the communication across the network as opposed to the transactions. That’s how we’re fundamentally different than everything else. And we define hashgraph to be a DAG that represents the communication across the network.

Brian: I mean, listeners will probably have some familiarity with DAG and DAG-based systems. We did do a podcast about this before, about something called SPECTRE, which is a paper by two Israeli academics on this topic. Now that also seems very interesting. So why is this gossip about gossip so fundamentally different from having a DAG-based structure that has transactions or blocks?

Mance: Well, the information’s not there to calculate the votes. You have to know how Alice would reason about the information she received from Bob, as opposed to, you know, if I just send out a transaction and don’t tell her anything else, she doesn’t know where that transaction came from and she can’t reason about where I received the transaction from in the first place, right? All of that information is lost, if all you include are the transactions. You really need to include the information that represents the flow of the transactions across the network in order to be able to use the voting-based algorithms.

Brian: Right, right. Now that makes sense. I think that’s very well put, right? So the essence is you can kind of think of it like this, right? What you guys are doing is almost continuously that all the nodes kind of update, okay, this is what I’m seeing and that gets kind of all adjusted and so of course, they can say, okay, now I’m using these simple rules and will say this is the order of transactions and I know what everyone else would do because I have kind of synched up on what information they received in what order.

Mance: Yeah. So everybody, when I talk, when Alice talks to Bob, for example, she has a graph, a hashgraph, Bob has a hashgraph, they’re identical up to a point. And then Alice knows some things that Bob doesn’t and vice versa, and they share with each other the delta. What Alice knows that Bob doesn’t, she gives to Bob and vice versa. At the end of that synching event, their hashgraphs are synched. In other words, Bob now knows everything that Alice knows and vice versa, and they can reason about where transactions came from, who talked to who and when. And because everybody’s gossiping with everybody all the time, the local copy of the hashgraph is always being brought into increasing sync across the network and at a moment in time, they are in sync and they will forever be in sync from that moment in time.
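The exchange of "the delta" can be sketched as a two-way set difference over event hashes. This is a simplified illustration under stated assumptions: each local hashgraph is modeled as a dictionary keyed by event hash, the name `sync` is hypothetical, and the new event that records the sync itself (described earlier in the conversation) is left out.

```python
from typing import Dict


def sync(alice: Dict[str, str], bob: Dict[str, str]) -> None:
    """Give each side the events it is missing, modifying both in place.

    Keys are event hashes; values stand in for the event contents.
    """
    missing_for_bob = {h: e for h, e in alice.items() if h not in bob}
    missing_for_alice = {h: e for h, e in bob.items() if h not in alice}
    bob.update(missing_for_bob)
    alice.update(missing_for_alice)


if __name__ == "__main__":
    alice = {"h1": "event 1", "h2": "event 2", "h3": "event 3"}
    bob = {"h1": "event 1", "h2": "event 2", "h4": "event 4"}
    sync(alice, bob)
    print(alice == bob, sorted(alice))
```

Only the events one side lacks cross the wire, which matches the point that the two graphs are identical up to a point and differ only in the most recent events.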

Brian: So that actually brings up an interesting point. So there is a term finality, right? Which is used, for example, in things like Tendermint where every block is final, right? So you know, okay, a block takes three seconds and it’s final. In proof-of-work systems, it’s much harder. So one never really has finality. So one uses these, like [indiscernible 0:52:10] finality. People are probably familiar with that in Bitcoin, where for a long time, you know, or maybe still to a large extent, one would kind of say, okay, six transactions, oh, six confirmations. So once a transaction is six blocks deep, you kind of consider it final, but you know, it’s not exactly the same thing. And then in Ethereum as well with Casper, I think there will be some kind of finality coming, but they will take some time to get there. What does it look like now in Hashgraph? At what point, do you know, how many seconds does it take until you can say it’s final and how does that depend on, you know, the number of nodes that are participating in this?

Mance: So what I can tell you is sort of what we’ve seen empirically and then explain it. So information that’s gossiped into the network, it goes out to the entire network exponentially fast. The amount of time it takes to get to every node is logarithmic in the number of nodes. And so that’s an important point. The amount of time it takes information to get from the originator to everybody else is logarithmic in the number of nodes. In terms of finality, what we see is that on average, it takes about eight gossip periods. I’ll call that a gossip period. It takes about eight gossip periods before the transactions are final. On average. It can be more, it can be less, but on average, it’s about eight gossip periods before the transactions are final. Now, finality, in our case, is fundamentally different than in, say, a proof-of-work blockchain. In a proof-of-work blockchain, you have a partial order on the transactions. What you want to know for sure is that two coins, or that a person can’t double spend a single coin. And so if Bob tries to spend that coin in two different places, double spend the coin, then the partial ordering of those transactions to ensure that you can’t do that is all that’s important. In a total ordering system, every transaction gets put into a total order across all transactions for the entire universe. That’s a much harder problem to solve. And so when we say finality, what we mean is that we have a total order on all transactions across the entire universe. And that becomes really important when you start thinking about treating this like a database. You know, if you literally have databases underneath the nodes then you care about the order of every transaction and you have to have a total order to ensure ACID compliance across all the databases. And we have that.

The other thing that’s different is that when we achieve finality, we’re 100 percent sure of the fact that it’s final. And we know for certain that every other member of the network knows for certain that it’s final and it will never change, guaranteed. So it’s fundamentally different than Bitcoin, for example, where you have blocks and with each new block on top of the chain, you are probabilistically more certain that, you know, that the coin’s not going to disappear if you’re the merchant, for example. And so, you know, we achieve a higher level of certainty or finality than what’s currently achieved in Bitcoin. And that’s also part of the definition of BFT. You can’t be BFT unless you have 100 percent certainty that at a moment in time, everyone agrees on the order and the order will never change. And so the Bitcoin blockchain is not BFT by definition.

Brian: And so you were speaking about eight gossip periods and how long was, let’s assume we have now a public Hashgraph network.

Mance: Sure, sure. So empirically what we’ve done in the lab, and we’ve done lots of tests both in a single data center, cross-country, you know, using AWS for example, from Oregon to Virginia, cross data center, cross-country, and then globally. And what we’re able to see in those contexts with a varying number of nodes, and we haven’t achieved hundreds of nodes yet. We haven’t tested hundreds of nodes yet. We’ll achieve it. We just don’t have the quota from Amazon to do hundreds of nodes yet. It’s that, you know, depending on the number of nodes and the geo distribution, we go anywhere from sub-second finality, total order finality, with 100 percent certainty, to seconds of finality. And it may end up being that it’s tens of seconds to finality on the order of transactions when we go global at scale. It remains to be seen. We’re still doing that work.

Brian: So of course, that kind of already answered this a little bit into a topic that I’m sure many people will be curious about, which is, you know, is there going to be a public Hashgraph network and especially, you know, how would that look like in a public chain context? Now we’re speaking about nodes and just gossip about gossip. So presumably, let’s say if I could spin up lots of nodes, run lots of nodes and kind of gossip fake information stuff. I mean, I presume I will be able to disrupt this process. Up to, you know, how is that security—

Mance: Well, the real question you’re asking is, how do we go from a permissioned network where there’s one vote per computer and you have to give permission to a computer to join the network to an open network where you can’t have one vote per computer? Because to your point, you know, somebody could stand up a bunch of sock puppets and if they stand up enough, then they can have two thirds of the total vote, which would dictate the order of transactions. In all cases, you know, whether it’s Hashgraph or leader-based systems or any of these distributed consensus algorithms, what’s required to be public or open is a scarce resource. You have to make it expensive for a would-be attacker to stand up all those sock puppets, or you have to make it impossible for them to have enough of the voting weight, if you’re weighting votes, to be able to influence the order of the transactions.

And so a public network built on Hashgraph would be the same. You know, that’s why the cryptocurrencies are actually required in the public networks. A lot of people view the cryptocurrencies as sort of the point of the public network. That’s not how we view it. I mean, if there were a public network that’s built on Hashgraph, it would be to make it possible to build globally distributed applications and it turns out you have to have a scarce resource. And the cryptocurrency serves as that scarce resource. And so it certainly would be a feature, but it’s a necessary feature for the bigger vision of globally distributed applications. And yeah, there would be a cryptocurrency associated with that and there would be a way of using the number of coins that perhaps one owns when they run a node as a mechanism for weighting the influence of their vote on the consensus voting process. And so instead of it being one computer, one vote, it’s one coin, one vote. And then there’s a whole set of issues that you have to deal with if you take that approach which I’m not prepared to go into today. Certainly we will, you know, we would explore that in great detail, you know, when that becomes appropriate for a public network.
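The move from "one computer, one vote" to "one coin, one vote" can be sketched as a weighted supermajority check. This is a hypothetical illustration rather than a description of any announced design: the function name, the stake table and the choice of a strict "more than two thirds" threshold are assumptions.

```python
from typing import Dict, Iterable


def has_supermajority(stake: Dict[str, int], voters: Iterable[str]) -> bool:
    """True if the voters jointly hold more than two thirds of all stake.

    Integer arithmetic avoids floating-point rounding at the boundary.
    """
    total = sum(stake.values())
    voting = sum(stake[member] for member in set(voters))
    return 3 * voting > 2 * total


if __name__ == "__main__":
    # One computer, one vote: every node has weight 1.
    equal = {"node1": 1, "node2": 1, "node3": 1, "node4": 1}
    print(has_supermajority(equal, ["node1", "node2", "node3"]))

    # One coin, one vote: cheap sock-puppet nodes with no coins add no weight.
    coins = {"honest1": 40, "honest2": 35, "puppet1": 0, "puppet2": 0, "whale": 25}
    print(has_supermajority(coins, ["puppet1", "puppet2", "whale"]))
```

Under coin weighting, standing up extra machines does nothing by itself. An attacker would need to acquire the scarce resource, which is the cost the weighting is meant to impose.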

Brian: So that’s very interesting. Of course, that makes perfect sense. I think the aspect of having kind of coins or stake or something like that that weights votes and protects from Sybil attacks, you know, that’s where many of these networks are going. And that seems very reasonable. One thing I’m curious about here: in such a scenario, would you need a block reward or some other kind of incentivization for people to participate in this or is that not necessary?

Mance: Well, yeah, no. Absolutely. I don’t think there’s a free lunch and a lot of the systems that we see today really don’t account for all the costs that are associated with being a node and I view those as flaws in the economic underpinnings of those platforms. So for example, you know, if you’re running a full node then you are spending money on bandwidth, you spend money on storage and you spend money on CPU cycles. And I think for all three of those, you would want to compensate a full node for the amount of resource that they provide in each one of those categories. And so if you’re going to have a public network at scale and it’s open consensus, anybody who wants to run a node can run a node, then you have to have a system of economics that actually accounts for all of the costs in order to be viable at full scale at maturity. And yeah, certainly that would be part of a public platform.

Meher: So of course, Mance, Hashgraph kind of looks like a very interesting consensus algorithm and deploying it in a public setting could make an interesting asset or coin or however we might choose to see it. My question would be what other applications—it’s a two-part question. First part is what other applications are you deploying Hashgraph to in the short-term apart from potential public system. And the second question is, since you and your team are approaching the problem of consensus from a very different perspective to people from the blockchain industry itself, how do you see the blockchain industry evolve if your consensus algorithm proved successful?

Mance: Interesting. Okay. Well, to start, yeah, we started by addressing enterprise use cases, right? I mean, that’s where Swirlds is focused as a company. And the first use case, Ping Identity, I’d mentioned that I was Head of Labs and Architecture for Ping. The first actual application built on top of the Hashgraph was an identity focused application that actually solves sort of an obscure problem in the world of identity. It’s something associated with the protocol called OpenID Connect (OIDC) and Ping has then taken that and proposed that to the OIDF, the standards body associated with this, as a way to address the problem in the standard. And they continue to pursue that even now that I’m gone.

We then approached the Credit Union industry and the Credit Unions as an industry wanted to create a platform that makes it possible for the 6,000 Credit Unions in North America that represent 105 million members to build arbitrary distributed applications. I mean, not just FinServ-related applications (Financial Services-related applications), but in addition to those, messaging applications, information exchange as it relates to risk, for example. Of course, identity related applications. And they’ve now created a new organization called CULedger that stands for Credit Union Ledger. The industry CEOs voted as an industry to create a new service organization, CULedger. And CULedger now is capitalized. It has a CEO. The CEO came from Mastercard, a Senior Exec from Mastercard, and they this year are rolling out this new platform built entirely on Hashgraph and there will be an app store. They have an existing, you know, robust third-party developer system that already builds applications for the industry. Well, now, they’ll be able to build distributed applications and market those through the app store as a channel to the 6,000 Credit Unions. And so that will be the first large scale deployment of the technology and that’s happening even as we speak. And then there’s a large pipeline of other customers that are cross industry. There’s healthcare. There is supply chain management. There are other financial services organizations that are using the technology. None of these have been yet made public and they will be. They’re less mature than CULedger is, but you know, they’re coming, and you’ll see those this year. And so that’s, you know, that’s what Swirlds has been focused on.

As an industry on the public side, what I think will happen, you know, with or without Hashgraph, and I think Hashgraph maybe will accelerate this in some ways, is that the stack of technologies that are there to deliver the product, the elements of that stack will become increasingly differentiated. I mean, we start with Bitcoin. It’s like one monolithic codebase and in some ways it’s a mess. What we would like to see and what I’m sure we will see as the industry matures, is that each layer of the stack, for example, with Hashgraph, what we have is a consensus. I think of it as a consensus server. Maybe server is not the right word, it’s an SDK built in Java, entirely in Java, that handles communications across all the nodes. It implements that gossip protocol that we’ve talked about and puts arbitrary transactions in consensus order. So that is it. That’s all it does. That’s the foundation. That’s the consensus layer, the bottom of the stack. If there were going to be a public network, then there would be a whole range of services that go on top of that layer, that consensus layer. There’d be a services layer and you know, one of those might be a cryptocurrency, another one might be storage, another one would be smart contracts, et cetera, et cetera, et cetera. And anonymity services, whatever you can imagine, whatever kind of services you might want. I could see and expect that there will be competition for providing best of breed in the market for maybe each one of these services and for the consensus layer itself. And that’s how markets mature. And so, you know, we will certainly hopefully accelerate that process. We have our own ideas about how all of this technology should work in a mature well architected stack, you know, that serves all the use cases of the industry and I think that the industry will, you know, come to agreement on what the definition of those pieces are and you’ll see companies spring up that do one of those things and they do it really well. That’s what I would expect.

Brian: Cool. Well, maybe one last question here. What’s the timeline of Hashgraph? Where are you guys at currently and what’s the roadmap for the next, you know, maybe 12 to 24 months?

Mance: Yeah. Well, so the work that we’re doing on that consensus layer that I just mentioned is coming close to version 1.0 capability. That work has been going on for years. This is a complicated thing. So Hashgraph, in some ways, is far more complicated to implement than blockchain is to implement. And you know, making a gossip protocol that is highly performant is a work of art. And so a lot’s gone into that. So we will continue to harden that consensus layer. And then even with or without a public network, it’s still important to have that services layer that I just described. All of the services I’ve mentioned have value even in a permissioned context, right? And so, you know, we will build out that services layer. And everybody wonders if there will be a public network. If you were to speculate that there will be, you probably would be right. So, you know, we’re not making any announcements but we’re doing all the things that anybody would do in our situation. [Laughs]

Brian: Okay, okay. I think that that is the most elegant and non-announcement announcement that I’ve heard. [Laughs]

Mance: Non-announcement. Thank you. That’s right. That’s a non-announcement.

Brian: Yeah. I mean, you guys certainly have already published a lot of material about Hashgraph, writing in detail, technical papers and comparisons and I mean, that’s great. We have already a lot of substance there to dive into. And there are also some nice interviews, nice talks too that are available. Well, I think we’re at the end of the episode. Thanks so much, Mance, for joining us today. It’s been a pleasure to learn about Hashgraph. I’m super excited to see what’s going to come out of that, you know, whether your claims are true and it’s going to be as revolutionary, I don’t know, but it’s certainly original and it’s certainly something truly novel with interesting properties. So I think that, you know, I’m thrilled about it, I’m sure Meher and many others are thrilled about it, to see what comes out of that, so thanks so much.

Mance: Well, thank you. No, thank you so much. I mean, it is different than anything else I’ve seen in the market. We’re very proud of that fact. We don’t want to look like anybody else. And so our approach has been fundamentally different as well and it’s all a very staid, well thought out, mature approach to both the technology and the marketplace. So we’re looking forward to it as well. Thank you for your interest.

Brian: Yeah, and thanks so much for listening. For once again, tuning in. Of course, we’re going to put a link to their Hashgraph whitepaper and many of the other materials, website, in the show notes, so if people want to learn more about it and dive deeper, they can go there. And yeah, thanks so much to our listeners for tuning in. So we put out the episodes of Epicenter every Tuesday, usually, although sometimes a little bit later. And you can subscribe to the show in iTunes, SoundCloud, or your favorite podcast application, or you can watch our videos on youtube.com/epicenterbitcoin, I think. And you can support the show by leaving us a review on iTunes or you can of course send us a tip in Bitcoin and Bitcoin Cash or Ether. So thanks so much and we look forward to being back next week.