Let’s Encrypt has been providing free TLS/SSL certificates since April 2016. These certificates are used to encrypt communication between a web server and its users. Certificate Authorities (CAs) cryptographically sign these certificates to vouch for their authenticity. Before issuing, every CA has to make sure that the domain does not publish CAA (Certificate Authority Authorization) records in its DNS that restrict issuance to other CAs. All CAs must abide by the CAA specification, and failing to do so can lead to heavy penalties. This requirement was approved in 2017 and has been in force since.
To make sure that they do not issue certificates for domains they are not authorized to serve, CAs run their own software that performs this check.
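As a rough illustration of the rule described above, the following sketch decides whether a CA may issue for a name given that name’s CAA records. It is a simplification and not any CA’s actual code: the function name `may_issue` and the `(tag, value)` record shape are assumptions made for this example, and it ignores wildcard-specific tags, critical flags, and the lookup of records on parent domains.

```python
from typing import List, Tuple

# Each CAA record is modelled as a (tag, value) pair,
# e.g. ("issue", "letsencrypt.org"). This shape is an assumption.
CAARecord = Tuple[str, str]


def may_issue(records: List[CAARecord], ca_domain: str) -> bool:
    """Return True if the CA identified by ca_domain may issue."""
    issue_values = [value for tag, value in records if tag.lower() == "issue"]
    if not issue_values:
        # No "issue" records published: issuance is not restricted.
        return True
    for value in issue_values:
        # Drop optional parameters after ';' and surrounding whitespace.
        issuer = value.split(";", 1)[0].strip().lower()
        if issuer == ca_domain.lower():
            return True
    # "issue" records exist, but none of them names this CA.
    return False


print(may_issue([("issue", "other-ca.example")], "letsencrypt.org"))  # False
print(may_issue([], "letsencrypt.org"))  # True
```

An `issue` value of just `;` names no CA at all, so the sketch treats it as forbidding issuance by everyone.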
Let’s Encrypt’s CA software is called “Boulder”. It checks CAA records at the same time as it validates the subscriber’s control of a domain name. Let’s Encrypt treats a validation as good for 30 days. Most subscribers issue a certificate immediately after domain control validation, but some wait, and in that case Boulder has to check the CAA records a second time before issuing to make sure Let’s Encrypt is still permitted to issue.
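The timing rule can be sketched as two small checks. The 30-day lifetime comes from the paragraph above; the 8-hour recheck threshold is an assumption added for this example (it reflects the usual limit on how old a CAA check may be at issuance, but the article does not state it), and the function names are hypothetical.

```python
from datetime import datetime, timedelta

VALIDATION_LIFETIME = timedelta(days=30)  # from the article
CAA_RECHECK_AFTER = timedelta(hours=8)    # assumption, not from the article


def validation_usable(validated_at: datetime, now: datetime) -> bool:
    """A domain control validation can be reused for up to 30 days."""
    return timedelta(0) <= now - validated_at <= VALIDATION_LIFETIME


def needs_caa_recheck(validated_at: datetime, now: datetime) -> bool:
    """If issuance happens well after validation, CAA must be checked again."""
    return now - validated_at > CAA_RECHECK_AFTER


validated = datetime(2020, 1, 1, 12, 0)
later = validated + timedelta(days=10)
print(validation_usable(validated, later))  # True
print(needs_caa_recheck(validated, later))  # True
```

A subscriber who issues ten days after validating is still inside the validation window, but the original CAA check is stale, so every name on the request needs a fresh one.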
A bug was introduced into Boulder back in July 2019, and 3,048,289 certificates were affected by it. This is what Jacob Hoffman-Andrews, a Let’s Encrypt engineer, had to say about the bug: “When a certificate request contained N domain names that needed CAA rechecking, Boulder would pick one domain name and check it N times. What this means in practice is that if a subscriber validated a domain name at time X, and the CAA records for that domain at time X allowed Let’s Encrypt issuance, that subscriber would be able to issue a certificate containing that domain name until X+30 days, even if someone later installed CAA records on that domain name that prohibit issuance by Let’s Encrypt.” He went on to say that the bug was detected at 2020-02-29 03:08 UTC and fixed by 05:22 UTC, after which issuance was re-enabled.
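To make the quoted behaviour concrete, here is a small Python illustration of how a loop can end up checking one name N times instead of N names once each. It is not Boulder’s code (Boulder is written in Go), and the function names and the `caa_allows` lookup table are invented for this example; it only reproduces the same class of mistake using Python’s late-binding closures.

```python
from typing import Callable, Dict, List


def recheck_buggy(names: List[str], caa_allows: Dict[str, bool]) -> bool:
    # Bug: each lambda looks up `name` when it is called, not when it is
    # created, so all N checks end up looking at the last name in the list.
    checks: List[Callable[[], bool]] = [lambda: caa_allows[name] for name in names]
    return all(check() for check in checks)


def recheck_fixed(names: List[str], caa_allows: Dict[str, bool]) -> bool:
    # Fix: bind the current name as a default argument so that every
    # check keeps its own domain name.
    checks: List[Callable[[], bool]] = [lambda n=name: caa_allows[n] for name in names]
    return all(check() for check in checks)


# Hypothetical CAA results: one domain now forbids issuance.
caa_allows = {"blocked.example": False, "open.example": True}
names = ["blocked.example", "open.example"]

print(recheck_buggy(names, caa_allows))  # True: "open.example" was checked twice
print(recheck_fixed(names, caa_allows))  # False: "blocked.example" is caught
```

In the buggy version the request passes even though one of its names no longer permits issuance, which is the situation the quote describes.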
However, about 3 million affected certificates were signed in those few months, and Let’s Encrypt had no choice but to revoke all of them so that they could be reissued for the domains it is authorized for. The revocation came into effect at 00:00 on March 4th, 2020, and the impacted certificates have since been triggering errors in browsers and other clients. New TLS certificates must be requested to replace the old ones.
Domain owners are certainly not happy with the way Let’s Encrypt handled the problem. The engineers at Let’s Encrypt have offered assurance that this is not the first time a CA has been affected by a bug, that earlier incidents were dealt with smoothly, and that this one will be too.
They went on to say that of the roughly 116 million active Let’s Encrypt certificates, only 2.6% were impacted by this issue. About 1 million of the affected certificates are duplicates of other affected certificates, covering the same set of domain names, so the number of distinct affected certificates is closer to 2 million. According to them, the certificates most likely to be affected were those that are reissued very frequently, which is why so many of the affected certificates are duplicates.
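The figures quoted above can be checked with a few lines of arithmetic. The inputs are the numbers from the article; the 116 million and 1 million values are round approximations, so the results are approximate too.

```python
affected = 3_048_289      # affected certificates (from the article)
active = 116_000_000      # active certificates, approximate
duplicates = 1_000_000    # duplicates among the affected, approximate

share_percent = affected / active * 100
distinct = affected - duplicates

print(f"{share_percent:.1f}%")  # 2.6%
print(distinct)                 # 2048289, i.e. around 2 million
```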
Despite minor issues in the past, this is the first major problem that a bug has caused for Let’s Encrypt. It remains the most successful CA to date, having recently issued its one-billionth free TLS/SSL certificate.
They have announced that they are pushing back the deadline to give subscribers time to deal with the issue on their end. Anyone using Let’s Encrypt can consult the list of impacted domains and check whether their own are among them. More information is available in the help section of the Let’s Encrypt website.