Multiple Vantage Point Validation to Secure Domain Validation

This OTF-supported effort aids the deployment of a more secure domain validation protocol to secure Internet domain validation against attackers that manipulate Internet routing via Border Gateway Protocol (BGP) hijack…
Mon, 2022-06-27 14:00

The Public Key Infrastructure (PKI) protects users from malicious man-in-the-middle attacks by having trusted Certificate Authorities (CAs) vouch for the identity of servers on the Internet through digitally signed certificates – usually displayed to users on their Internet browser via a small padlock icon near the address bar. Ironically, the mechanism that CAs use to issue certificates (domain validation) is itself vulnerable to man-in-the-middle attacks by network-level adversaries.

Adversaries capable of launching Border Gateway Protocol (BGP) attacks can spoof the domain validation process and obtain digital certificates for domains they do not control. This has devastating implications for the Internet freedom community because it allows BGP attacks (which are often used by nation-state actors) to compromise high-security TLS traffic, such as the HTTPS connections used in banking and medical services. Thus, the privacy and security of Internet communications depend critically on the domain validation protocol.

A strong countermeasure to protect domain validation from BGP attacks is multiple vantage point validation. Using multiple vantage point validation, a CA validates a domain from multiple diverse vantage points spread throughout the Internet. This ensures the CA has a global view of Internet routing and prevents the CA from falling victim to BGP attacks that often only affect a portion of the Internet.

Through the multiple vantage point domain validation project, we were able to deploy and rigorously verify the effectiveness of multiple vantage point domain validation at the world’s largest certificate authority (Let’s Encrypt). We took the first steps towards industry-wide adoption by publishing our results in our USENIX Security ’21 paper (“Experiences Deploying Multi-Vantage-Point Domain Validation at Let’s Encrypt”), which was a finalist in the CSAW’21 applied research competition (https://www.csaw.io/research).

We successfully explored the design space of multi-vantage-point domain validation to achieve (1) security via sufficiently diverse vantage points, (2) performance by ensuring low latency and overhead in certificate issuance, (3) manageability by complying with CA/Browser Forum requirements and requiring minimal changes to CA operations, and (4) a low benign failure rate for legitimate requests.

Our open-source implementation deployed by the Let’s Encrypt CA in February 2020 has secured the issuance of more than a billion certificates during the first year of its deployment. Using real-world operational data from Let’s Encrypt we found that our approach has negligible latency and communication overhead, and a benign failure rate comparable to conventional designs with one vantage point. Finally, we evaluated the security improvements using a combination of ethically conducted real-world BGP hijacks, Internet-scale traceroute experiments, and a novel BGP simulation framework. We showed that multi-vantage-point domain validation can thwart the vast majority of BGP attacks.

We have already received interest in our work from certificate authorities like DigiCert, and we hope to further disseminate this research moving forward. Below we explain how we addressed the objectives of our research project (for reference, we refer to Let’s Encrypt’s specific implementation of Multi Vantage Point Domain Validation, or MVP-DV, as MultiVA for short):

(1) Reduce false positives associated with MVP-DV: We performed an extensive analysis of false positives and identified the causes of false positives as well as two practical mitigations: a quorum policy and customer retries. Analyzing the data we collected from Let’s Encrypt’s deployment, we found that the most significant source of false positives associated with multiple vantage point domain validation was DNS propagation delay. Customers often requested certificates shortly after they registered domains (or moved domains to new hosting providers). Under these circumstances, it was not uncommon that the customer’s DNS records had been updated at some authoritative servers but had not propagated globally.

Thus, when multiple vantage point validation was performed, remote vantage points would contact out-of-date DNS resolvers and get incorrect (or missing) records for the customer’s domain. In light of this, we employed a quorum policy that allowed one remote vantage point to fail. Our security analysis showed that even under this configuration the system still offered substantial security, and allowing a vantage point to fail meant that even if some DNS records were out of date, a customer could still get a certificate (thus reducing false positives). We also encouraged customers to simply retry validation later (potentially after DNS records had fully propagated). Looking at Let’s Encrypt logs, we found that around half of all validations that were rejected due to multiVA were eventually retried and led to successful certificates within 20 days. With these two measures together, we found false positives to be well under 1% of validations, allowing the system to operate at scale.
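The quorum policy described above can be sketched in a few lines of Python. This is an illustrative sketch only, not Let’s Encrypt’s implementation: the function name, the vantage point labels, and the requirement that the primary vantage point must always succeed are assumptions made for the example.

```python
from typing import Mapping


def should_issue(primary_ok: bool,
                 remote_results: Mapping[str, bool],
                 max_remote_failures: int = 1) -> bool:
    """Decide whether to issue a certificate under a quorum policy.

    Assumption (not stated in the text): the primary vantage point must
    always succeed. Up to `max_remote_failures` remote vantage points
    may fail without blocking issuance.
    """
    if not primary_ok:
        return False
    remote_failures = sum(1 for ok in remote_results.values() if not ok)
    return remote_failures <= max_remote_failures


# Hypothetical vantage point names, used only for illustration.
stale_dns_at_one_remote = {"remote-a": True, "remote-b": True, "remote-c": False}
stale_dns_at_two_remotes = {"remote-a": True, "remote-b": False, "remote-c": False}

print(should_issue(True, stale_dns_at_one_remote))   # one remote failure is tolerated
print(should_issue(True, stale_dns_at_two_remotes))  # two remote failures block issuance
```

With `max_remote_failures=0` the same function models a strict policy in which every vantage point must agree, which is the configuration most exposed to benign failures from DNS propagation delay.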

(2) Quantify security improvement of MVP-DV with theoretical and simulation-based analysis: We innovated on Internet topology simulations to perform extensive simulations of multi-vantage-point validation. We developed IP-prefix-level simulations (as opposed to the AS-level simulations used in previous work) to improve accuracy and used public BGP data and the bdrmap tool to create an accurate model of the different Let’s Encrypt data centers used for validation. With these improvements we found that multiVA improved the resilience of the median customer domain to BGP attacks from 0.62 (pre-deployment) to 0.94 (using multiVA). We further identified potential extensions to the multiVA system that could offer 0.99 resilience, and we are working with Let’s Encrypt on the deployment of these extensions.
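To make the resilience figures concrete, the following Python sketch computes a resilience score for a single domain from simulated hijack outcomes. It rests on assumptions that the text above does not spell out: resilience is taken to be the fraction of potential attackers whose hijack fails to obtain a certificate, an attack is taken to succeed only if it captures the primary vantage point and all but at most one remote, and every name and data value is hypothetical.

```python
from typing import Mapping, Sequence, Set


def attack_succeeds(captured: Set[str], primary: str,
                    remotes: Sequence[str],
                    max_remote_failures: int = 1) -> bool:
    """True if the hijack captures enough vantage points to pass validation.

    Assumed rule: the attacker must capture the primary vantage point, and
    the number of remotes that still reach the real domain must not exceed
    the quorum's failure allowance.
    """
    if primary not in captured:
        return False
    uncaptured = sum(1 for vp in remotes if vp not in captured)
    return uncaptured <= max_remote_failures


def resilience(captures_by_attacker: Mapping[str, Set[str]], primary: str,
               remotes: Sequence[str], max_remote_failures: int = 1) -> float:
    """Fraction of simulated attackers that fail to obtain a certificate."""
    if not captures_by_attacker:
        raise ValueError("need at least one simulated attacker")
    failed = sum(
        1 for captured in captures_by_attacker.values()
        if not attack_succeeds(captured, primary, remotes, max_remote_failures)
    )
    return failed / len(captures_by_attacker)


# Hypothetical simulation output: which vantage points each attacker captures.
simulated = {
    "attacker-1": {"primary", "r1", "r2", "r3"},
    "attacker-2": {"primary", "r1", "r2"},
    "attacker-3": {"primary", "r1"},
    "attacker-4": {"r1", "r2", "r3"},
}

single_vp = resilience(simulated, "primary", remotes=[])
multi_vp = resilience(simulated, "primary", remotes=["r1", "r2", "r3"])
print(single_vp, multi_vp)
```

In this toy data set, adding three remote vantage points raises the score because an attacker that captures only the primary and one remote can no longer pass validation.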

(3) Validate efficiency of MVP-DV empirically: To empirically validate multiVA we launched real-world BGP attacks using the PEERING testbed and measured multiVA’s ability to prevent certificate issuance under these attacks. Using the PEERING node wisc01 as the victim and all other active PEERING nodes as potential attackers, we found that in the absence of multiVA, five of the six attacks we tested succeeded. Under Let’s Encrypt’s multiVA deployment, only one of the six attacks succeeded.

(4) Evaluate usability and practicality of MVP-DV: We performed extensive analysis of Let’s Encrypt’s operating logs to establish the usability and practicality of multiVA. We found that multiVA has a negligible latency impact because the remote vantage points had better network performance than Let’s Encrypt’s original data centers. Thus, even with the overhead of sending remote procedure calls to remote vantage points, the latency of the vast majority of validations was still determined by the time it took Let’s Encrypt’s original data center to perform validation. Furthermore, the costs associated with multiVA were minimal: operating the system cost around $100 per month per vantage point, even at Let’s Encrypt’s scale of roughly 1.5 million certificates a day.
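The latency argument can be illustrated with a simple model, assuming the primary and remote validations run in parallel and the overall result is available once the slowest of them finishes. The function name and all timings below are made up for illustration and are not measurements from Let’s Encrypt.

```python
from typing import Sequence


def validation_latency(primary_s: float,
                       remote_s: Sequence[float],
                       rpc_overhead_s: float) -> float:
    """Completion time of a validation under a parallel-execution model.

    Assumption: each remote validation costs its own time plus a fixed
    RPC overhead, and the overall result waits for the slowest participant.
    """
    slowest_remote = max((rpc_overhead_s + t for t in remote_s), default=0.0)
    return max(primary_s, slowest_remote)


# Hypothetical timings in seconds: remotes are fast, so the primary dominates.
fast_remotes = validation_latency(1.0, [0.25, 0.5], rpc_overhead_s=0.25)

# Hypothetical timings where one remote is slow enough to add latency.
slow_remote = validation_latency(1.0, [1.0], rpc_overhead_s=0.25)

print(fast_remotes, slow_remote)
```

Under this model, remote vantage points add latency only when a remote validation plus its RPC overhead takes longer than the primary validation.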

(5) Tooling and dissemination, including development of open-source MVP-DV and tools for facilitating MVP-DV deployment, and dissemination of code, tutorials, best practices, and deployment plan of MVP-DV: Our primary method of dissemination was through our peer-reviewed USENIX Security ’21 paper. In addition, Let’s Encrypt’s deployment of multiVA is well documented, open source, and available at https://github.com/letsencrypt/boulder.

Overall, our project made huge strides in securing domain validation from BGP attacks. Not only does multiVA represent the first security improvement of domain validation to protect against BGP attacks, but our work shows that multiVA is practical even at the largest scale in the industry and operates at a cost (financial, engineering, and operational) that is reasonable even for some of the smaller players in the CA industry. With the results presented in our USENIX Security paper, we are optimistic about the next steps towards a wider deployment of multiVA.
