We enter dozens of trust relationships every time we interact with the Web. Browsers, ISPs, DNS providers, cloud hosting companies, all the way down to the handful of people who control certificate root keys: we rely on the integrity of these intermediaries to serve reliable and accurate information. The concentration of power in any one of these actors threatens to compromise the foundational principles of the Web. Decentralized technologies like Bitcoin, Ethereum, Tor, and IPFS seek to reverse this trend.
We’re joined by Nick Sullivan, Head of Cryptography at Cloudflare. Founded less than 10 years ago, the company offers content delivery (CDN), DNS, and DDoS protection services to over 12 million websites. The company contributes to open source cryptography libraries, some of which are used by Ethereum. They recently launched an IPFS gateway and features that give users strong guarantees about the integrity of the content.
Topics we discussed in this episode
- Nick’s background as a cryptographer and previous position at Apple
- The Internet’s infrastructure and trust model
- How Cloudflare is experimenting with IPFS
- The challenges to hosting static websites with IPFS
- Cloudflare’s Onion routing service (Tor) and the benefits to users
- The Roughtime protocol and encrypted SNI
- Cloudflare’s contribution to open-source cryptography libraries
- The vulnerabilities of DNS and Cloudflare’s free private DNS service (1.1.1.1)
- Microsoft Azure: Deploy enterprise-ready consortium blockchain networks that scale in just a few clicks. More at aka.ms/epicenter.
Sebastien Couture: Hi welcome to Epicenter. My name is Sebastien Couture.
Sunny Aggarwal: My name is Sunny Aggarwal.
Sebastien: Hey Sunny, how’s it going?
Sunny: Going well, how about you?
Sebastien: Yeah, pretty good. It’s been a while since we’ve done this together.
Sunny: Yeah, definitely.
Sebastien: I think Enigma was the last one. You’ve done some great episodes. Congratulations on the Coil. It was terrific.
Sunny: Thank you.
Sebastien: I listened to it twice. That’s how good it was. Today we’re speaking with Nick Sullivan. Nick Sullivan is the head of cryptography at Cloudflare. Cloudflare is not one of your typical companies that we usually cover on the podcast. It’s more of a traditional internet company. But they’re doing some really, really interesting stuff with cryptography that I really wasn’t quite aware of before recording this podcast.
Sunny: Yeah. They do a lot, too, especially to help bring a lot of decentralization technologies, like Tor and IPFS and whatnot, to the masses. A lot of blockchain companies are building really cool tech. But it’s truly hard to get this stuff into the hands of everyday users. Cloudflare is making a lot of this stuff a lot more accessible.
Sebastien: Yeah, absolutely. I think just the fact that a company like Cloudflare is writing blog posts, quite long and detailed blog posts, about what IPFS is and how they’re using it. These posts are read by probably tens of thousands of people outside of the crypto space. It’s great for the ecosystem, I think. Yeah, Nick was a great guest. Very articulate.
Sunny: I really am a huge fan of, I’ve always been a huge fan of decentralization projects that aren’t necessarily like blockchain focused. Cloudflare has been working a lot with Tor and IPFS, which really excited me.
Sebastien: Yeah. We hope you’ll enjoy this episode with Nick. We do have a couple of announcements. I think on the last episode I was on, I mentioned that I’ll be at the Hyperledger Global Forum. It is from the 12th to the 15th. I’ll be there on the 12th and the 13th. I have the discount code now. If you’re interested in attending, it is in Basel, Switzerland. It’s Brian’s home town and Meher’s old city where he used to live. I’ve never been there so I’m excited. There’s a discount code for 15%. It is H-G-F-18-NEWS, so H-G-F-18-NEWS. You can go to the events page. If you search for Hyperledger Global Forum, you’ll find it. If you can’t, remember this discount code. We tweeted it a few days ago so you can see it. It was tweeted on the 23rd of November so you can always go back to our Twitter feeds and see it. Hopefully if you’re there, come say hi. I’d be glad to see you. Sunny, you mentioned you’re also attending two events.
Sunny: Yeah. Next month, December 11th, there’s a company called DoraHacks, which has hosted a bunch of really cool blockchain hackathons throughout the world. In China, in Berlin, and Toronto. They’re actually holding their first event here in SF on December 11th. It’s completely free for anyone to attend. I’ll be speaking and mentoring at this event. If you want to come hack with me on some cool stuff, definitely check it out. It’s free. Just look it up on Eventbrite, DoraHacks SF.
Sebastien: I got a question for you. I’ve never attended a hackathon. What would you suggest? I know how to code. I don’t code like every day anymore, but at some point in my life, I was like a developer. I have some experience with stuff like node. My smart contract development skills are near zero. But practically speaking, if I’m interested in learning how to do things, is a hackathon a good place to jump in both feet first? Is that not recommended for someone like me?
Sunny: I would say that there are usually two classes of people who attend hackathons. There are people who are there to win, like they want to build a cool project and end up with something at the end of the weekend. And there are often people who are there just to learn something. I’ve worn both hats throughout my hackathon career, if you will. Sometimes I go in with a project I really want to build. I’m like, I want to get this done this weekend. I’ll do that. But then there are times where it’s like, you know, I just want to learn a new piece of technology, and I just try to choose a very, very simple project. Honestly, at hackathons, when you’re experimenting with a new technology, anyone who’s done this before knows that half the time goes into installing the software, which is not fun. But yeah, definitely check out tutorials. I would say, instead of building the product first, spend the first half of the hackathon going through tutorials. Then in the second half, if you’re feeling comfortable, start trying to work on a project directly.
Sebastien: Okay. It might be a good exercise to properly take some time and go do some tutorials around people that can perhaps mentor you. A good opportunity to learn if you’re not going to build a proper project.
Sunny: At hackathons, there are often a lot of mentors there. That’s honestly sometimes one of the most underused amenities that hackathons offer. Definitely talk to the mentors. Then also, especially when I’m doing one of the more learning-style ones, I really like to not show up with a team. I really like to get to the hackathon and find new people there to work with. It just makes it a much more fun experience in my opinion.
Sebastien: Cool. Thanks for the tips. I think maybe I’ll look out for some hackathons to attend then.
Sebastien: All right. Without further ado, here’s Nick Sullivan of Cloudflare. Hi, so we’re here today with Nick Sullivan, who’s head of cryptography at Cloudflare. Nick, thanks for joining us today.
Nick Sullivan: Absolutely. Thanks for having me.
Sebastien: We’re really excited about this show. When I found out that Cloudflare was dabbling with IPFS, it led me to do a bit more research about what you guys are doing in the area of cryptography. It turns out that you guys are doing a lot of really, really cool, interesting stuff. I’m always really fascinated when companies in the more traditional web space intersect with projects and companies that we’re more familiar with in the blockchain space, like IPFS. We’re really happy to have you on. Let’s start off by talking a bit about your background, how you got involved in cryptography, and how you landed as head of cryptography at Cloudflare.
Nick: Sure. I have always been interested in mathematics and solving problems and puzzles and cryptography in general. When I went to school in Canada, at the University of Waterloo, I did a pure math degree and was enthralled by the abstract notion of understanding the mathematical world. Understanding how objects fit together, how prime numbers work, how you could take something as simple as 2 and 3 and 5 and 7 and have this infinite number of interesting problems and challenges to go through and discover. After I did a master’s degree in cryptography, I got into the computer security world and worked for a little bit at Symantec. I wrote some documents, basically, on internet security in general. They have this thing called the Internet Security Threat Report that helps analyze what’s going on online. My two passions were the internet, understanding what people are doing in this really amazing interconnected network that we all enjoy, and cryptography, which is the science of secret information.
After leaving Symantec, I joined Apple, where I worked on a lot of somewhat secret cryptography-related efforts for about six years or so. Eventually I learned about this company called Cloudflare, which was a very young startup at the time but was doing some really interesting things. For example, they had withstood what was at the time the largest distributed denial of service attack in history. A lot of what Cloudflare was doing was really interesting to me because they were offering a free service to help accelerate the web as well as protect it from threats. They were kind of at the center of everything that was going on online. When I joined, I was the first security engineering focused person at the company. I’ve been here for about five and a half years, growing the team. The company has grown tremendously since then. We’re now a big startup, if you will. Still a private company. I started the cryptography team at Cloudflare in order to use this really interesting tool, which is cryptography: encryption, hash functions, all this really cool math and science that lets you protect information online, as well as provide properties like integrity and non-repudiation. I started building a team to take cryptography and apply it to some of the bigger problems that Cloudflare is facing, and to basically spearhead new research in this area. This is what I’ve been doing ever since.
Sebastien: When you were in college, or university as us Canadians say, studying cryptography in Waterloo and getting into your career, did you have any idea that cryptography would become such an important thing today? If you think of blockchains, it plays such a central role in the functioning of that technology. Also, just generally the web. Did you think that this was something that would become so massively important for the world?
Nick: It’s very hard to see what will happen, right? It’s hard to predict what happens. For example, my thesis was on elliptic curve cryptography, which at the time was barely ever used for anything in production. You could use SSL for your website. You’d have encryption for your website. But everything that people were using was based on Diffie-Hellman and RSA, which were the two standard algorithms developed in the 70’s. Elliptic curves were just kind of a new thing. Now, this is actually the fundamental glue that holds together Bitcoin, as well as Ethereum. It’s also the most fundamental cryptography for protecting information online when you’re browsing the internet. It was very hard to see at the time that this interest of mine would become one of the key technologies of the 21st century.
Sunny: Could you give us a brief lowdown on what Cloudflare is overall? It’s not like a traditional blockchain company. Some of our listeners I’m sure have heard of Cloudflare but maybe don’t know quite exactly what they do. It’s a relatively young company, actually, right? I think only nine years. Somehow it’s grown to become this centrepiece, a very integral part of the entire web infrastructure. Can you tell us a little bit about the different kinds of things Cloudflare is working on?
Nick: Sure, yeah. Cloudflare is an internet security and performance company. The mission of Cloudflare is to help build a better internet. That’s really what we’re trying to do for folks who operate websites and web services, and who offer services online. Whether you’re the smallest individual hosting your own blog or a very large enterprise with massive sets of customers and very, very high requirements, what Cloudflare does is help make your site or your property faster, more secure, and more available, and give you insights. The way that Cloudflare does this is using the two main, traditional protocols on the internet: HTTP (or HTTPS, the encrypted version) and DNS. Cloudflare has data centers distributed all around the world, over a hundred and fifty. I don’t have the exact number now. But basically on every continent except for Antarctica.
The way it works is, if you sign up for Cloudflare, rather than visitors going directly to your site, which could mean traveling across the entire world, which due to speed-of-light considerations can actually slow things down, they connect to the nearest Cloudflare location. If you have static content on your site, we can serve it directly from there. We can also apply rules to protect against different types of attacks. If you think of people doing SQL injection or cross-site scripting attacks, all these sorts of rampant security things, by being able to inspect the traffic, we can block these attacks. The part that’s closer to my responsibility is that we can also provide encryption.
Since the early days of the web, one of the more challenging things that a web administrator has to do is set up encryption for their website. To move from HTTP to HTTPS, you have to buy a certificate, or get a certificate issued, and manage the configuration and do these sorts of things that are a little tricky. Cloudflare makes that dead simple and handles it on your behalf. Cloudflare as a service has grown tremendously. One of the reasons for that is that we offer a free service. There are over 11 million domains or so that use the free service, which is probably why so many people have heard of it. Yeah, you can sign up for Cloudflare and get denial of service protection, if someone’s trying to knock you off the internet. We can see the bad traffic, and we can keep you online while other people are trying to take you off.
It’s great because having all of these different customers gives us visibility into what’s really happening on the internet. We take what we see from the general set of customers. If you see an attack against one customer, you can use it to protect other people. It’s a real center-of-the-internet kind of thing, where things go through us and we learn about them, and we help make the internet better. We’re not only involved in providing this service. We also really care about making the internet secure going forward. Making the internet better. We’re involved in standards, for example TLS 1.3, which is the recent encryption standard for websites. We were closely involved with that. My team does a lot of research on the cryptography side, to see what new ways we can change things, so that in the future, using the internet is safer, more secure, and faster than it is today.
Sunny: Are you using your own dark fiber between data centers?
Nick: No, we use the internet, which is why we rely on strong encryption so much. Every one of Cloudflare’s data centers is independent and, I guess you could say, technically decentralized, although administratively centralized. We communicate over the internet, over different interconnections with different networks. Cloudflare is actually the most connected network on the internet. We have more peering sessions with other networks than anybody else online.
Sebastien: Yes. We use Cloudflare on the website. We use the paid service, and I also use the free service on some other websites. I kind of see Cloudflare as this nice blanket of security that also provides a bunch of optimizations. It serves your CSS and your JavaScript super fast, and your HTML. It has these built-in fortresses that you can call upon at will if you’re being attacked, or that come into action if certain rules are triggered. Yeah, it’s a really great service. No wonder that a lot of people are using it. It does show up in a lot of places on the internet. You very often see Cloudflare landing pages and CAPTCHA landing pages online. We’ll come back to the CAPTCHA thing later.
But in September, you and some colleagues of yours wrote a series of blog posts, and we’ll link to these. I strongly encourage anyone listening to this to check out these blog posts because they’re really terrific. It’s called Crypto Week, so “Welcome to Crypto Week,” in which you describe all the different things that Cloudflare is doing with crypto, sort of innovative stuff. With IPFS, with Tor, with DNSSEC. Reading these blog posts, I was like, these are great primers for anybody who’s really looking to understand fundamentally how this stuff works. Like how does DNS work, how does your HTTP request function? When you call a website, who are the different parties at play here? Where are the trust points? Where are the vulnerabilities? How is Cloudflare doing it better? I thought these posts were terrific. In this post you mention the trust relationships that one has to engage in when using the internet. Whether that’s visiting a website, chatting online, or using social media, what are your thoughts about how we trust the internet on a broad scale? Do you think most people have a good understanding of where the trust points are on the internet? If not, how can companies like Cloudflare help make that better?
Nick: Yeah. I would say in general people don’t understand the trust relationships online. You enter a hostname or URL and the website comes to you. You click on a link or open an app and you just get content. But there are a lot of interesting things that go on behind the scenes. A lot of these have to do with trust, and actually the implicit trust that is built into the technology you’re using to browse the internet, to show you what you expect and to make sure that what you’re getting is what you intended to get. There are a lot of parties involved in this. Some of the very obvious ones are registrars. A registrar is a company that you use to buy a domain name. If you buy google.com, or mysite.com, then you have a registrar. You work with this registrar to make sure that your website is advertised. Your registrar is connected to a DNS provider. When you type cloudflare.com into your browser, behind the scenes it has to know what IP address that is on. There’s this entire system called DNS, which is a name system, managed by a lot of different entities. It’s one of the first decentralized systems, or I guess hierarchical systems, out there. You have to look up who .com is. Then .com tells you who example.com is. Then you talk to example.com and it tells you what IP address to actually use to connect to example.com. That’s going from names to numbers; the rest of the internet is based on IP addresses, which are numbers. DNS is kind of the phonebook that goes from a name to a number. Other pieces that you have to trust come in when you’re doing encrypted connections. If you’re going to an HTTPS version of a site, that site has a cryptographic key. This is embedded in a certificate. They present you a certificate and you do this sort of handshake, and then you have a secure channel.
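To make the delegation chain Nick describes concrete (root, then .com, then example.com), here is a minimal toy sketch. It does not speak the real DNS protocol; the server names, the zone table, and the address 192.0.2.10 (from the reserved documentation range) are all invented for illustration.

```python
# Toy model of hierarchical DNS delegation. All zone data is hypothetical;
# a real resolver sends network queries instead of reading a dictionary.
SERVERS = {
    "root": {"delegations": {"com.": "com-tld"}},
    "com-tld": {"delegations": {"example.com.": "example-ns"}},
    "example-ns": {"addresses": {"example.com.": "192.0.2.10"}},
}


def in_zone(name: str, zone: str) -> bool:
    """True if `name` is the zone itself or a name underneath it."""
    return name == zone or name.endswith("." + zone)


def resolve(name: str):
    """Walk from the root down, following delegations until an address is found.

    Returns the address and the list of servers that were asked, in order.
    """
    server = "root"
    path = [server]
    while True:
        data = SERVERS[server]
        if name in data.get("addresses", {}):
            return data["addresses"][name], path
        for zone, next_server in data.get("delegations", {}).items():
            if in_zone(name, zone):
                server = next_server
                path.append(server)
                break
        else:
            # No server in the chain knows about this name.
            raise LookupError(f"no delegation for {name} at {server}")
```

Calling `resolve("example.com.")` walks root, then the .com server, then the example.com server. Each hop is a party whose answer the client has to trust, which is the point Nick is making.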
One of the things that your browser has to do is know which certificates to trust, and which certificates correspond to which websites. This is another system of different organizations that make up something called the public key infrastructure. Your browser trusts a bunch of certificate authorities, who are the only ones allowed to mint certificates for different hostnames. The system has been around since the 90’s. There have been some problems with it over time. Certificate authorities have been compromised, and that’s put a lot of people at risk. Certificates themselves need to have an expiration period, or else certificates from the 1990’s using old cryptography that’s been broken would still be valid. There are a lot of challenges with trusting this. We can’t even go into all of it here. But even at the lower layers of the internet, with IP addresses, the internet is a set of hundreds of thousands of interconnected networks that have to actually exchange data. When you’re one network and you say, hey, this IP address 126.96.36.199 or 188.8.131.52 belongs to me, then, well, others need to actually trust that. Someone else could say, yeah, send that traffic to me, when it actually belongs to you. There are multiple different layers. The trust really goes deep.
As a general user, all of this is happening behind the scenes. You really have to trust it. There’s the very minimal thing that you have in browsers, which is that padlock, which does imply something. It implies that the certificate you’re getting is valid for the site, and that this is the site you’re trying to go to. But there are a lot of threats out there. There are a lot of ways that people try to manipulate this and hijack this and steal people’s traffic. Generally, this is not well understood by the public. Companies like Cloudflare are investing in various technologies that help simplify this for folks, and help make it so that if we are connecting with other entities around the internet, we can trust them. We have to agree on protocols to do this, define these protocols and implement them, and get everyone to agree on standards. That’s one of the interesting interorganizational challenges that we have to deal with right now. But luckily for our security and for people’s privacy online, there are a lot of organizations who do care about this and who are impacted when these things happen. Companies like Cloudflare and others are working to help improve the situation.
Sunny: A lot of these authorities that you mentioned, like, for example, the certificate authorities, and you mentioned DNS as a hierarchical system: where do these authorities come from? Who decided on them? Did it just happen to be all the companies who were around back in the 80’s, and they were grandfathered in? Or how does that process work?
Nick: Well, yeah, the internet has evolved over the years in various different ways. We can go into the origins of the internet as a DARPA project, the switch to TCP/IP in the 80’s, and the evolution of the DNS. But it really happened organically over time. Then some organizational bodies were put in place to help guide this. For example, for internet protocols, there’s a volunteer group called the Internet Engineering Task Force, the IETF. If you’ve heard about RFCs, when people say, RFC whatever, whatever, this is a certain protocol. DNS is a set of RFCs. That’s what the IETF does. There’s IANA, which is an organization that is associated with managing names and numbers. They have lots of processes around that. There’s ICANN, and there’s the set of regional registries. For the IP space, North America has a group called ARIN. They distribute the IPs to different organizations in different blocks. These are often organizations that are a mix of for-profit and non-profit, but generally have a mandate to be good stewards of the internet, and to make sure that this technology that we all rely on is available to everyone in the world, enables equal access, and continues to grow in terms of having commercial and non-commercial uses.
Sunny: One thing I find interesting is, often when people are talking about cryptography and blockchain things, there seem to be three somewhat independent goals that often get conflated. But I think they should actually be thought of as somewhat independent. The three I see here are privacy, security, and decentralization. The third one, decentralization, is a very vague concept that came up in the last few years along with the blockchain space. Reading through your blog post, “Welcome to Crypto Week,” you talk a lot about the immutability that IPFS provides, which kind of goes along with security. You talk about the privacy that Tor provides, but not so much about decentralization. Would it be a fair characterization to say that when you guys approach cryptography on the internet, you’re almost willing to accept the authorities and centralization that exist on the internet, even becoming one of the central authorities on the internet, but are really trying to focus on pushing the security and privacy side of things? Would that be a fair characterization?
Nick: Well, I would say that Cloudflare is trying to serve its customers. Cloudflare’s customers are not only the websites and web services that use Cloudflare. You can think of the users of the internet as a whole. If the internet becomes more functional, and if people are happier online and are more likely to do business online, then it leads to the growth of the entire industry. Security is one of the very, very most important things for the company. If you get hacked, or somebody steals data from your website, or someone tries to mess around with your users, this is gonna impact trust. It’s gonna impact the bottom line for a bunch of businesses. Same with privacy. If you think of how people are really waking up to privacy online, to what you share, and to what organizations that are based on monetizing individuals’ actions online have done, and how that’s grown, I think it’s another really big, really salient thing to humans. Security and privacy I think are things that human beings understand and relate to, and businesses understand.
Decentralization is more of a second-order goal, right? If you have fully centralized systems, you have these really, really inherent risks to your system. If you think back to the mid 20th century, the telephone system in the United States, Bell had this massive monopoly over the way that telecommunications happened. That led to a lot of really fascinating and amazing innovations. You think of the transistor, a lot of radio communications, and all sorts of amazing things they created, and they actually did connect everybody. But until Bell was broken up, we didn’t have this ability for all of these internet companies to come out of nowhere and be able to compete with each other. If you think in corporate terms, monopolies are ways to build wealth and make something really good. But they also lead to the tendency to abuse power. Having a diversity of participants, a diversity of views, and a diversity of components in the system, and I guess decentralization is one component of that, a result of having a lot of different participants, is something that actually really helps innovation, helps competition, and helps things grow. It’s less relevant to individual customers and people. But it is a second-order goal. It is something that we think about as well. When talking about the cloud computing space, and how people are running services, we do worry about companies that are massive, central points of lock-in. If you think of the AWS re:Invent conference going on this year, it’s the largest trade conference in the United States. That’s a company that wants everybody to put all of their computing workloads onto a single company. There’s a lot of lock-in associated with that. I think from a cryptography perspective, decentralization is important. But I think also from a business perspective, having a lot of different options is important for a healthy ecosystem.
Sebastien: Yeah, there’s this thing that you might be familiar with, which is Zooko’s Triangle, from Zooko Wilcox-O’Hearn, the founder of Zcash. It’s like you have security, decentralization, and human-readable names. I think there’s a lot of overlap with this question here, where user experience also plays a big role, or should be considered in how we build systems. If you have a system that’s secure and easy to use, but you don’t have the robustness which is meant to be brought on by decentralization, then really, you might have to choose two of the three points on the triangle. I don’t know if someone will actually solve that, but it seems difficult.
Nick: Yeah, it is difficult. There are trade-offs that you can make in any one of these little corners. Finding the right trade-offs is hard to do. But considering where we are as a status quo, there are always improvements to be made to try to square Zooko’s Triangle, if you will.
Sebastien: Yeah. Let’s move on to the core topic of what we wanted to bring you on to discuss today, and that’s IPFS. Within this Crypto Week series of blog posts, there were two blog posts about IPFS. One explains what IPFS is for the average person who doesn’t necessarily know about it. Another post describes an experiment based on this concept of end-to-end integrity. Could you describe why Cloudflare is experimenting with IPFS? What are you guys doing here?
Nick: Yeah. I think one of the important things that Cloudflare’s trying to do is, as I mentioned, make the internet better. One of the aspects of this is connecting users of the web to some of these new networks that have values and properties that the current web doesn’t have. IPFS is one of them. As a content-addressed network, every piece of content has a hash, a specific unique fingerprint associated with it. Unlike the web, where you look things up by name, with IPFS you look things up via a fingerprint of what they are. The traditional web is not necessarily immutable. You have different things that can happen. You have a lot of very dynamic web pages. You have services like Cloudflare that can see and detect things going wrong, and modify and optimize things on the fly, which is great. But with IPFS, there are certain use cases where people just want an absolute guarantee that you’re getting exactly what was published.
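The idea of looking content up by its fingerprint rather than by name can be sketched in a few lines. This is only an illustration under simplifying assumptions: it uses a plain SHA-256 hex digest as the address, whereas real IPFS uses its own content identifier format and splits files into blocks. The `ContentStore` class and `address_of` function are hypothetical names, not part of any IPFS library.

```python
import hashlib


def address_of(content: bytes) -> str:
    """Derive the address from the content itself (simplified: SHA-256 hex)."""
    return hashlib.sha256(content).hexdigest()


class ContentStore:
    """Hypothetical in-memory content-addressed store."""

    def __init__(self):
        self._blocks = {}

    def put(self, content: bytes) -> str:
        """Store content and return the address it can be fetched by."""
        addr = address_of(content)
        self._blocks[addr] = content
        return addr

    def get(self, addr: str) -> bytes:
        """Fetch content and check that it still matches its address."""
        content = self._blocks[addr]
        if address_of(content) != addr:
            raise ValueError("content does not match its address")
        return content
```

Because the address is computed from the bytes, the same content always gets the same address, and any change to the bytes is detectable by whoever holds the address. That is the immutability property Nick refers to.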
If you think of things like package managers, or image sharing, or things like this where you have something static that’s never gonna change, then IPFS makes a lot of sense. The IPFS gateway is the first of what we’re calling the distributed web gateway, which is a way to access IPFS as a network through HTTP. There are a lot of experts and people interested in the space who are really keen on these decentralized networks, who run nodes and are happy to do these sorts of things. But the general populace has a web browser, and they know how to use a web browser. What this gateway does is allow people with web browsers to connect directly to IPFS. IPFS is static. Cloudflare’s a really, really great service for that because we can do caching. We can keep copies of data really close to people. We can distribute data all around the world.
You mentioned the experiment that we did, which is a browser plugin for end-to-end integrity. I guess one of the purists’ complaints about having a gateway to something like IPFS is the question: what if the gateway changes the value, or changes the content? The value of IPFS is in the fact that it’s content addressed. If you build a website, it’s guaranteed to be the same for every single person who sees it. There’s no censorship, there’s nothing like that. It’s just, you publish one thing once, and then it’s there in the universe forever. This is why it’s called the InterPlanetary File System, or one of the reasons: publish something once, and it’s available at all times. If you have a gateway, HTTP is, as I mentioned, not really based on this integrity concept. But with an IPFS gateway, you can put the hash as part of the URL. With this extension, you can actually validate that the hash in the URL matches the hash that you expect. The way that it’s actually chained together is with DNS. In the typical sense, if you have a website, you say, here’s my hostname, and it gives you an IP address. That gives you the address. This is about routing. With our IPFS experiment, you say, this is the hostname, and this is the hash that represents the content on this website. What the browser extension does is make sure that what you’re seeing on the site matches exactly what was published in the DNS. This kind of ties in with our other efforts of the week, especially DNSSEC, which is just signatures in the DNS itself. If you trust the DNS, and you trust the DNS central authority, then this is a way to put IPFS into an existing system, to help validate the integrity from within the browser.
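As a rough illustration of the hostname-to-hash binding Nick describes, here is a minimal Python sketch. It assumes the hash is published in a DNS TXT record using the DNSLink-style value `dnslink=/ipfs/<hash>`; the interview does not name the record format, and the function names and example hash below are hypothetical. No DNS lookup is performed; the TXT value is passed in as a string.

```python
from typing import Optional

# Assumed record format (not stated in the interview): "dnslink=/ipfs/<hash>"
DNSLINK_PREFIX = "dnslink=/ipfs/"


def parse_dnslink(txt_value: str) -> Optional[str]:
    """Return the content hash from a DNSLink-style TXT value, or None."""
    value = txt_value.strip().strip('"')
    if not value.startswith(DNSLINK_PREFIX):
        return None
    # Keep only the hash, dropping any sub-path such as /index.html.
    content_hash = value[len(DNSLINK_PREFIX):].split("/", 1)[0]
    return content_hash or None


def hash_matches_dns(txt_value: str, served_hash: str) -> bool:
    """True if the hash the gateway served equals the one published in DNS."""
    published = parse_dnslink(txt_value)
    return published is not None and published == served_hash
```

A verifier in this style would still need the DNS answer itself to be trustworthy, which is where the DNS signatures mentioned above come in.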
Sunny: How has the adoption been of this Cloudflare IPFS gateway? You can almost consider IPFS as a CDN of sorts, a Content Delivery Network. Have you seen that Cloudflare’s offering has helped increase the adoption? Because I actually tried to put my website on IPFS. It’s been a while, probably over a year now, so it was probably a little bit more immature then. I had quite a hard time doing so. You guys have built a lot of tooling to make this easier. How has the public reception been?
Nick: Yeah, I think people are really excited about the IPFS gateway. They’re really excited because of the possibilities that it unlocks. The content hosting side of IPFS, I agree, is relatively immature. If you want to host something on IPFS, you can host it from your local laptop, or you can use one of these services called a pinning service. But yeah, the publishing side of it, I think, needs some development. Integrating the access side is where the gateway really shines. We’ve seen all sorts of different customers, or websites, or properties that really believe in decentralization, and believe in having a source of truth for their data that is distributed beyond their own data centers. This is actually good for things like disaster recovery. They need a way to bootstrap their application. The fundamental belief is that we want to build this in a distributed way. But one of the drawbacks of IPFS as it is, is that it’s relatively slow to actually get content. Having this gateway is a way to speed things up. You get all the benefits of Cloudflare in front of this network, you have integrity protection, and you have decentralization. It’s been coming up. We’ve definitely seen a lot more adoption since we launched this several months ago. Not just from the distributed application space, but also from more traditional companies that have an interest in decentralization.
Sebastien: If I can just rephrase what you guys are doing here. There are different components, I think, that need to be separated out. The first is an IPFS gateway. There are tons of IPFS gateways out there. I think most of our listeners are probably familiar with them. These are websites where you go to a URL. Example, dashgateway.com, I think is one of them. You go to this website and you just add the IPFS hash to the URL. This gateway is connected in the back end to an IPFS node, and it serves you the content on the IPFS network. The vulnerability here is that perhaps this website is doing a man-in-the-middle type of attack, where it’s serving you a different piece of content than the one you initially requested. You have really no way of knowing that unless you do an MD5, or verify the hash of the content once you’ve downloaded it, and verify that it matches the hash of the address.
Nick: Yeah, that’s right.
Sebastien: What you guys are doing is like a step beyond that. You’re actually putting one of those Gateways in the Cloudflare sort of wrapper. All the IPFS, all the content on IPFS is now available, superfast, in one of these 150 data centers that you mentioned earlier.
Nick: Yeah, that’s right. That’s what the Cloudflare IPFS gateway is, yeah. It’s like you take any typical gateway and then you Cloudflare-ify it.
Sebastien: That’s great because all of a sudden you have this really fast content delivery network that’s serving up IPFS content. It reminds me of this project we had on a few weeks ago called bloXroute, which is like a network for blockchains. It’s sort of similar to that. But then the issue is that if you’re using it, maybe you would trust examplegateway.com because some nice crypto person is hosting it. It’s someone you may have met at a conference, so you might trust that person. The issue here is that people might not trust Cloudflare, or at least Cloudflare would like to prove that the content that they’re delivering to you is actually the content that you requested. What you’ve built here is a browser plugin that checks the IPFS network and makes sure that the content matches the hash that you requested.
Nick: More specifically, it checks that the hash, if you’re using the Cloudflare gateway as Cloudflare-ipfs.com/your hash value, it checks the value of the content against that hash. The really, really cool part that I didn’t actually go into detail but …
Sebastien: Sorry, so it checks the value of the content against the hash. It does this in the browser. It doesn’t go and do an IPFS request, a parallel request to verify. It checks the hash and it does the hashing algorithm internally, and verifies if it matches. It’s like doing an MD5 verification.
Nick: It’s like MD5 but with a better hash function. MD5 is a little breakable right now. But the other really cool thing that you can do with Cloudflare’s gateway is bring your own hostname. Rather than have Cloudflare-IPFS.com/whatever, you can just have mywebsite.com and you just say mywebsite.com is on IPFS. Here’s the hash of the root file of my website.
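The in-browser check discussed above, where the client hashes what it received and compares it against the hash it asked for, can be sketched in a few lines of Python. This is a simplification under stated assumptions: it uses a plain SHA-256 hex digest of the whole response, whereas real IPFS identifiers are encoded hashes over a chunked Merkle structure, and the function name is hypothetical rather than taken from Cloudflare’s extension.

```python
import hashlib
import hmac


def content_matches_hash(content: bytes, expected_hex: str) -> bool:
    """Hash the received bytes locally and compare with the requested hash.

    Simplifying assumption: the address is a plain SHA-256 hex digest of the
    full content, which is not how real IPFS identifiers are encoded.
    """
    actual_hex = hashlib.sha256(content).hexdigest()
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(actual_hex, expected_hex.lower())
```

No second network request is needed: the check only uses the bytes already delivered by the gateway, which matches the point made in the conversation.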
Sebastien: Okay, so this is the third thing I wanted to talk about. There’s the gateway, there’s the verification tool. But then what wraps this together is saying, okay, as a Cloudflare user, as someone who has a website hosted on Cloudflare, like Epicenter for instance, you can set this up on your website, or your Cloudflare account, and tell Cloudflare to index your site in IPFS mode. It basically creates a copy of your website and all the web pages on it, the static content, on this IPFS node, which is now available to the entire IPFS ecosystem.
Nick: Yeah, it’s sort of like that. It’s more that you have to put your content onto IPFS some way. If someone tries to access your website through Cloudflare, we will fetch it from IPFS. Cloudflare as a service doesn’t host content. This is a very important part of what Cloudflare does: we cache content. We need a place, some source of truth. If you’re gonna use the service, you can run a local node on your computer and say, I’m gonna host from here. We will grab it from there, we’ll keep a copy around as long as we can and serve it from Cloudflare’s cache. Alternatively, you could pay a service to keep a copy of your content on IPFS. That’s the host. Then Cloudflare just goes on to IPFS, fetches your content, puts it around the world and anybody who wants it can get it through us. Does that make sense?
Sebastien: Yeah, that makes sense.
Nick: We’re not actually hosting things on IPFS. Although if you fetch something through IPFS, our node will have a copy of it, so it actually helps improve the duplication of content in IPFS. Which is really important because if there’s only one copy of your content in the IPFS network, then if that copy goes offline, the content is no longer available.
Sunny: Essentially, with that third part of this project, you’ve allowed IPFS to integrate well with the existing DNS system, right? I can now have my IPFS-hosted website accessible through my own personal domain name, but still going through Cloudflare’s CDN?
Nick: Yeah, that’s right. Because we are so good at issuing certificates and kind of managing that, then you also get encryption for that website, sort of automatically.
Sunny: How do you see the future of IPFS? Do you see it being sort of a complementary service, or protocol to http, or it’s more competitive? Do you see like maybe websites will be served over http, but like certain assets on the website are over IPFS? How do you see this like amalgamation of these two protocols going forward?
Nick: It’s an interesting question because nobody really knows. The hope is that IPFS fills a specific niche with a specific property that HTTP doesn’t have, and the expectation would be that they would both live in parallel. You can’t necessarily do a lot of dynamic stuff with IPFS. But the integrity protection that it has, and then the actual distributed nature of the hosting, I think, makes it useful for specific applications. I think you will find applications that are mostly HTTP, applications that are mostly IPFS and applications that are sort of a mix of the two. It really depends on how well browsers and other technologies adopt this. If you have a mobile app that has native IPFS support, or a mobile SDK that comes with native IPFS support, then maybe it will become more popular in apps that need this. But yeah, I tend to see them as complementary. They both have their advantages and disadvantages.
Sebastien: Yeah, I was speaking with Juan Benet at Web3. I kind of bumped into him and asking him about this very thing. From his perspective, browser support is at least, partly possible. I guess like chromium is supporting it now, and like they’re diversions, and maybe Firefox will support it sooner. I can’t remember exactly. But browser support is coming. I was actually quite surprised to see how fast it has come. We had them on, I guess it was episode 100, well to a 163 weeks ago or something like that. I thought this will take years to get integrated in the browsers. But it seems like it’s moving much faster.
Nick: Yeah. My understanding is that the path that they’re taking is first exposing IPFS as a first order protocol. You have http:// whatever. You can have IPFS:// whatever and the transition to get there is that if you have IPFS://, you can register a plugin that is able to handle that for you. That’s sort of the first step. Then eventually down the line, the IPFS node will potentially be native in the browser. But right now it’s all browser extensions.
Sebastien: What did you learn from this?
Sunny: Cool. One of my favorite stories regarding IPFS and gateways is from when I was talking to Jeromy Johnson from Protocol Labs. This was about a year ago, at Devcon 3, so November 2017. It was right after the whole Catalonian referendum around independence. What was happening during that process was that there were a lot of pro-referendum websites. Websites showing people how to go vote, reasons why, just general pro-referendum websites. The Spanish government was censoring these and shutting a lot of them down. What was really cool was that IPFS was actually being used to keep some of these websites up. People were hosting them on IPFS. I thought it was really cool because it was one of the first times that this generation of decentralization technologies has really been used to cause a tangible impact on world politics. But then something interesting happened: the websites were being hosted on IPFS, but everyone was accessing them through the ipfs.io gateway. What the Spanish government essentially ended up doing was censoring the ipfs.io gateway. People weren’t aware of any other gateways. People didn’t have the software. It’s not easy to install the IPFS software. It suddenly became very inaccessible to them. This leads into one of the other centrepieces of your Crypto Week, which was about Tor. How do you see this interesting relationship between IPFS and Tor, and what can IPFS gain by being served over Tor?
Nick: Yeah. I think of Tor as in the same family of technologies as IPFS and a lot of these new blockchain, distributed web type technologies. Because it really is a lot of independent nodes that work together to create a property that you wouldn’t get with the regular web. With Tor, what it does is provide you with routing anonymity. It uses a layered encryption approach to do so. In terms of their trade-offs, latency is one that they just don’t really care about. Anonymity is much more important than getting things quickly. The typical web, I mean, the unencrypted web and potentially even IPFS, if you’re talking about distributing this content, is the opposite of anonymous, right? You’re connecting directly with another person and requesting a very specific thing. They know what you’re asking and they know who you are. But it provides integrity. You have one network that provides integrity, and one network that provides anonymity. It sort of made sense that, if you want both, you can put one on top of the other.
What Cloudflare launched during Crypto Week was essentially a way to access the Tor network. It’s kind of like how Cloudflare put an IPFS node into the IPFS network. Cloudflare put a Tor node into the Tor network as well. This Tor node is used to route any traffic to any site that’s on Cloudflare. You connect through Cloudflare’s Tor node, which is an onion address. We’ve got about ten of them. If you connect to any one of those and make a request for any site that’s on Cloudflare, it kind of goes through. The bottom of the diagram that I think you’re referencing on the page shows a user going through Tor, then connecting through Cloudflare, then to the Cloudflare IPFS gateway and then to IPFS. I think if you’re doing so, you’re gonna get a very slow connection, but it’s going to be very private. Even Cloudflare doesn’t know who you are. But you also gain the integrity properties of IPFS. They’re pretty cool complementary technologies if you’re okay with things being extremely slow.
Sunny: I see. About this whole onion routing service that you guys built: I know in the past year, especially on Hacker News, people like to blame Cloudflare, and especially the reCAPTCHA features, for the degradation of the user experience on Tor. I always thought that was a bit unfair. But could you explain a little bit of why this whole reCAPTCHA system is so necessary with Tor? Then how does your onion routing service help resolve some of those pain points?
Nick: Yeah, absolutely. As I mentioned, people come to Cloudflare for security, site acceleration, things like this. Security is one of the main things. If you talk to the average webmaster, or the person running the website, they actually don’t have a very favourable opinion of Tor because, as an anonymity network, it’s very easy to send abusive traffic through it and not have to deal with the consequences. A lot of the traffic that actually comes through Tor and comes through exit nodes is attack traffic. It hits our web application firewall. We say, what is this, and sort of block it.
The way that Cloudflare is currently set up, and we’re hoping to improve the system, is to use something called IP reputation and IP reputation databases to help determine how likely an HTTP request is to be malicious, or part of a flood. Is this an attack or not? What we do is use a CAPTCHA to force the user coming through to prove that they are a human, or at least able to solve one of these human interaction puzzles. Once they prove that they’re a person, then we say, okay, great. You can come through, do whatever you want with this website. But where you’re coming from seems to have a lot of bad requests, so the danger level gets elevated. This is something that our customers expect, because they have to pay for bandwidth, they have to pay for what it takes to administer a site and run it, and deal with comment spam and all these sorts of things.
This IP reputation is a very coarse way of lowering the amount of crap that you get, if you will, on the site. Because of how Tor works, there’s a small set of computers called Tor exit nodes, where traffic that went into the Tor network exits out into the internet. These IPs tend to be given a pretty bad reputation because there’s so much bad stuff coming from them. This is the crux of the reason why people see so many CAPTCHAs while using Tor, and why Cloudflare is sort of being blamed for the degradation of this network. We didn’t like that. We think that Tor is a valuable tool. But we still need to protect our customers from attacks. This is who we’re building the service for. These are the people who we want to use Cloudflare. We still want to give them that service.
But we also think that the secondary effects on the internet as a whole are important as well. Having more people use an anonymity network, having people gain the properties of these alternative networks if they choose to use them, and not be punished for it, is something that we’re really interested in. What our Tor gateway does is allow folks who are browsing websites on Tor to access Cloudflare websites through, as I mentioned, a node that’s running in the Tor network, that has an onion address. Every time that you connect through the Tor network to an onion service, you connect through a circuit. There’s an entry node, there’s a transit node, there’s a third node, and then you can connect to the site. Every one of these circuits is unique for every person. When you run an onion service, you actually get a circuit ID. You get to know whether or not two different connections to the same service are from two different people. Because of that, you can actually apply policies on a very selective basis.
If someone is actually sending a lot of comment spam, then you can say, this circuit is bad. We can block this without blocking legitimate people. I think this is one of the great things that we helped put together with this Tor thing. We worked with the Tor browser team as well, to help implement this. If you visit a site that’s on Cloudflare, we’ll send an HTTP header that says, hey, by the way, if you’re gonna reconnect, we have all these onion addresses. You can just use these and connect through Tor, instead of connecting through an IP address. This has been very successful actually. We turned it on for all the Cloudflare sites.
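To make the per-circuit idea concrete, here is a small Python sketch of a policy keyed on circuit ID rather than on exit-node IP. Everything in it is hypothetical: the class name, the threshold, and the string circuit IDs are illustrative assumptions, not Cloudflare’s implementation.

```python
from collections import Counter


class CircuitPolicy:
    """Hypothetical policy that tracks abuse per Tor circuit ID."""

    def __init__(self, max_abusive_requests: int = 3) -> None:
        self.max_abusive_requests = max_abusive_requests
        self.abuse_counts: Counter = Counter()

    def record_abuse(self, circuit_id: str) -> None:
        # Count one abusive request (for example, comment spam) for a circuit.
        self.abuse_counts[circuit_id] += 1

    def is_blocked(self, circuit_id: str) -> bool:
        # Only the offending circuit is blocked; other circuits are unaffected.
        return self.abuse_counts[circuit_id] >= self.max_abusive_requests
```

The contrast with IP reputation is that the key identifies one visitor’s circuit, so a block does not spill over onto everyone sharing the same exit node.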
Sebastien: With all of this, you guys are quite involved in the open source space. In fact, Sunny was mentioning earlier, and I wasn’t really aware of this, that you guys have quite a few crypto libraries that are open source. Some of them are being used by Ethereum and a bunch of other websites. Pretty much all of the internet is using your crypto libraries. How does this, and this experimentation with IPFS, and this Tor stuff that you guys are working on, all fit into your business model? Are there specific businesses here that you’re looking to develop? Or is it more just being at the cutting edge of these technologies and improving the experience of everybody using the web?
Nick: It’s part of the mission statement of the company, which is to help build a better internet, and open source is something that’s core to what we are. I think Cloudflare doesn’t necessarily have secret sauce in the software, right? Almost everything that we use, we try to open source, because it will be usable for other folks online. For example, four years ago we released a library called CFSSL, which is a Go-based certificate authority. You can use it to build certificates and build a PKI inside your own organization. It actually got picked up by Let’s Encrypt. Now it’s the core of the Let’s Encrypt certificate authority, and Salesforce and a bunch of other really big companies are using it as well. We’ve contributed code to the Go standard library. P-256, which is one of the most commonly used elliptic curves, was optimized by one of Cloudflare’s engineers, because we do so much cryptography. Why not share this with the world? I think there’s no drawback to everybody having a better version of cryptographic tools. If you have a faster library that’s secure and safe, put it out there for people to use.
Sunny: I guess so far we’ve talked a lot about two major decentralization technologies, which are IPFS and Tor. But one that we haven’t really talked too much about yet, which is probably one of the most well-known, is blockchains, right? I was wondering, how do you guys think about blockchains? I know you have this one protocol you dubbed clockchain, as a joke. That one is about a timing system for SSL certificates, you know, synchronized clocks. But another option is, I actually worked on a project where instead of doing SSL certificate expiration, you can use a blockchain as a public bulletin board where you list compromised signing systems. Another use case where I think blockchains fit within the web infrastructure: throughout this entire conversation you’ve talked a lot about using the DNS system. You talked about how you’re using DNS for the IPFS resolution. You have this other project called encrypted SNI, in which you’re trying to basically create a PKI, and you’re using the DNS system to do that as well. Like we mentioned, the DNS system is a very hierarchical system. Have you ever thought about exploring the option of using blockchains to do so? We mentioned Zooko’s Triangle earlier as well. The cool thing about Zooko’s Triangle was this whole thing about human readability, decentralization, and security. But Aaron Swartz actually had this observation that a blockchain is a way to get around Zooko’s Triangle. That kind of led to projects like Namecoin and Handshake and things like this. I guess my overall question here is, how do you guys think about integrating blockchain technology into some of your offerings, or just into the general web infrastructure as a whole?
Nick: Yeah. I think there’s another kind of trilemma that our CEO Matthew Prince put out in a blog post about Tor a few years ago, about making things usable, secure and having low latency. I think when you’re in the web context, this is something that’s very underrated: the ability to get things fast, and to get things immediately. That applies when it comes to certificates and time, and a lot of different things. If you’re connecting to a website, 100 milliseconds is gonna kill you. There’s a number of initiatives that we’re interested in that sort of seem blockchain-esque. One of those is certificate transparency.
I guess one of the main differences here is that in a lot of the blockchain technologies that we’re talking about, we’re talking about fully trustless decentralized systems where you have a lot of different peers. This is why consensus is so important: being able to have all these different peers agree on a specific thing. I think in the web PKI, and at least in the website situation, that’s fine but that’s sort of a step too far. Or at least it’s a step that’s a little bigger than the technology is willing to take us right now. Certificate transparency is an example of one step. It’s essentially a hash tree of all the certificates that have ever been issued. For certificate transparency to work, you need independent groups to manage these certificates as well. You end up with something that’s sort of analogous to a permissioned blockchain. With certificate transparency, you actually don’t have to do the lookup on the machine. You don’t have to run a node on your machine. You don’t have to synchronise with the blockchain. The cost of latency to a system like this is not big enough to slow its progress. I think that the main challenge for integrating traditional web PKI technologies and blockchains is really about being fast: being able to synchronise things, being able to transfer data fast, and being able to have fast consensus. Having a fully trustless system is not necessarily conducive to that. Although we’ve seen some pretty good experiments in that direction.
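The “hash tree of all the certificates” can be illustrated with a short Python sketch of a Merkle tree root. It follows the tree-hash construction used by certificate transparency logs as specified in RFC 6962 (a 0x00 prefix for leaves, a 0x01 prefix for interior nodes, and a split at the largest power of two below the leaf count); the byte strings standing in for certificates are placeholders.

```python
import hashlib
from typing import Sequence


def merkle_root(leaves: Sequence[bytes]) -> bytes:
    """Merkle tree hash over a list of entries, in the style of RFC 6962."""
    n = len(leaves)
    if n == 0:
        # The hash of an empty tree is the hash of the empty string.
        return hashlib.sha256(b"").digest()
    if n == 1:
        # Leaf hashes are domain-separated with a 0x00 prefix.
        return hashlib.sha256(b"\x00" + leaves[0]).digest()
    # Split at the largest power of two strictly smaller than n.
    k = 1
    while k * 2 < n:
        k *= 2
    left = merkle_root(leaves[:k])
    right = merkle_root(leaves[k:])
    # Interior nodes are domain-separated with a 0x01 prefix.
    return hashlib.sha256(b"\x01" + left + right).digest()


# Placeholder entries; a real log would hash encoded certificate entries.
root = merkle_root([b"cert-1", b"cert-2", b"cert-3"])
```

Because the root changes if any entry changes, a client only needs a signed root and a short proof from the log operator, which is why no local node or chain synchronisation is required.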
Sunny: I see, cool. Earlier you mentioned that this IPFS gateway is just one of the first projects in this larger decentralized gateway series of projects. What are some of the other projects that you have planned in the sphere of the decentralized web? One that I think would be really cool would be, maybe in your Cloudflare DNS, 1.1.1.1, to integrate Namecoin resolution. But I don’t know. What are some of the other ones that you guys are thinking about?
Nick: We’ve talked to the Namecoin folks. We’ve talked to folks at Ethereum. We’re really kind of testing the waters at this point. Right now we’re mostly investing in how we can make the IPFS gateway better. That’s what the short-term roadmap looks like. But down the road, there are so many interesting technologies in this space, solving different problems. You shouldn’t be surprised to see any one of those pop up down the line.
Sebastien: Yeah. You mentioned 1.1.1.1. That’s basically a free DNS service that you provide. It’s similar to Google DNS or OpenDNS, something like that, but with privacy. I was reading your website earlier, and I guess KPMG is auditing your servers to make sure that you’re not actually logging anything. Privacy is a big deal here. I’m curious: people hear about flipping domain names, and paying enormous amounts of money for domain names. What goes into buying the IP address 1.1.1.1?
Nick: Well, we didn’t buy 1.1.1.1. I mentioned how there are different authorities that manage IPs. The 1 space is actually owned by APNIC, which is the registry that distributes IPs for the Asia-Pacific region. They never thought that it would be possible to even give this IP address to anybody, because it was so, so bad in terms of the amount of garbage traffic that would come to it. Anybody who’s building any sort of test for an IP address in any documentation is gonna say 1.1.1.1. It’s just the simplest example that you can use. There’s such an enormous amount of internet background radiation hitting the 1.1.1.1 IP address that they were like, we can’t allocate this. There’s no reason anybody would ever want to use it. It’s basically constantly under DDoS from just the background internet radiation. Cloudflare was one of the few organizations in the world for which it’s not a big deal to handle a bunch of unexpected traffic. We made a deal with APNIC, and they’re lending us the IP address for this project. It’s been a pretty fruitful collaboration with them so far. A really successful project.
Sunny: That’s pretty funny. It kind of shows off your DDoS protection capabilities as well.
Nick: Yeah, absolutely.
Sebastien: One other thing we should have mentioned earlier: I guess in the lobby of your office there’s a bunch of lava lamps generating entropy. Can you tell us about that?
Nick: Sure, yeah. Anybody who saw the first episode of NCIS this year, might recognize they kind of stole the plot idea from Cloudflare’s office. But yeah, we have a wall in our front lobby that has about a hundred lava lamps. We record it with a digital recorder. We turned that data stream into a source of random numbers that we actually send out to our data centers and our servers and feed it in as an additional random source.
Sebastien: Is there any academic research or anything like that that would suggest that lava lamps are actually random?
Nick: Well lava lamps themselves are pretty unpredictable. But the main thing is it doesn’t really matter if you have a sufficiently advanced camera. There’s gonna be enough noise in it to actually create enough entropy to be a useful source. Also the lighting is a big part of it at any time of day, you’re gonna have different sources of light and people walking in front of the camera. There’s enough entropy in an HD film to use for a lifetime.
Sunny: I’m sure the temperature fluctuations in the room also affect the lava lamp as well?
Nick: Yeah. It’s very hard to predict the levels of the lava lamp. But to predict everything else, all the other atmospheric conditions, is basically impossible. Even if they were predictable, we mix it in with other sources such as hardware random numbers.
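The point about mixing sources can be sketched in Python: hash the camera data together with randomness from another source, so the output stays unpredictable as long as at least one input is. This is an illustrative assumption about how such mixing could look, not a description of Cloudflare’s pipeline; the function name and the placeholder frame bytes are hypothetical, and `os.urandom` stands in for the hardware random source.

```python
import hashlib
import os


def mix_entropy(camera_frame: bytes, system_random: bytes) -> bytes:
    """Combine two entropy sources into one 32-byte seed using SHA-256."""
    h = hashlib.sha256()
    # Length-prefix the first input so the two inputs cannot be confused.
    h.update(len(camera_frame).to_bytes(8, "big"))
    h.update(camera_frame)
    h.update(system_random)
    return h.digest()


# Placeholder frame; a real system would feed in raw camera output.
seed = mix_entropy(b"placeholder camera frame bytes", os.urandom(32))
```

Even if an attacker could predict every pixel of the frame, they would still need the second input to reproduce the seed.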
Sebastien: Okay well, with that, Nick, I want to thank you for coming on the show today. It was a fascinating discussion. I look forward to seeing what comes out of Cloudflare in the future. Now that things are so easy, thanks to Cloudflare, we might look into making our website available as an onion domain or available on IPFS, something like that.
Nick: Yeah, absolutely. Thanks for having me on.