Google and others contribute TLS certificates to Certificate Transparency logs as they crawl the web, but there is less systematic effort devoted to discovering certificates from secure, non-HTTPS protocols. I searched the Rapid7 “More SSL” scan log for novel certificates and contributed them to CT logs.

Certificate Transparency (CT) is an effort to publicly log all publicly trusted TLS certificates. Publishing certificates has a number of benefits, including allowing site owners to detect fraudulent certificates issued in their name, and allowing more comprehensive oversight of CA operations.

Starting next month, Google Chrome will require CT records before it trusts newly issued certificates, but historical certificates may not have been logged, and site operators may prefer not to log certificates that are not intended for web browsers.

TLS certificates protecting HTTPS connections are collected by Google’s spider and added to CT logs. I hypothesized that TLS-protected, non-HTTPS services might be an additional source of novel certificates for logging, and could reveal unidentified misissuance.

Rapid7’s Project Sonar scans selected services on the entire IPv4 internet at regular intervals and produces reports containing, among other things, any TLS certificates provided by the scanned hosts. This work is based on the moressl dataset, which contains certificates discovered while crawling services including SMTP, IMAP, and FTP in both TLS-wrapped and STARTTLS modes.

## Discovered certificates

This data set did indeed reveal many certificates which had not been previously logged.

• The moressl dataset contains 10,341,552 certificates as of Mar 22, 2018
• 2,172,408 certificates (21%) had valid signatures and chained to roots included in the Mozilla trust store
• 56,249 trusted certificates (2.6% of trusted certificates) were novel to crt.sh, which has views of many public CT logs
• 22,893 trusted certificates (41% of novel certificates) were unexpired
• 2,279 unexpired, trusted certificates were novel but contained an SCT, meaning that a precertificate had already been disclosed to CT, but the final, issued certificate had not been logged
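As a quick arithmetic check, the funnel in the list above can be recomputed from the raw counts. This is only a sketch: the stage labels are my own shorthand, and each percentage is taken relative to the stage immediately before it.

```python
# Counts copied from the list above; labels are shorthand for illustration.
funnel = [
    ("all certificates in moressl", 10_341_552),
    ("valid, chained to Mozilla root", 2_172_408),
    ("novel to crt.sh", 56_249),
    ("unexpired", 22_893),
    ("carrying embedded SCTs", 2_279),
]


def stage_percentages(stages):
    """Return (label, count, percent of previous stage) for every stage after the first."""
    rows = []
    for (_, previous), (label, count) in zip(stages, stages[1:]):
        rows.append((label, count, 100.0 * count / previous))
    return rows


for label, count, pct in stage_percentages(funnel):
    print(f"{label:32s} {count:>10,d}  {pct:5.1f}% of previous stage")
```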

This validates the idea that non-HTTPS TLS services are an important source of undisclosed certificates, and confirms the novelty of this analysis. :)

Embedding SCTs in certificates, which requires CAs to submit precertificates to logs before issuance, will become more common, but some CAs do not have a practice of subsequently logging the final issued certificate containing the embedded SCTs. Although the precertificates are sufficient to identify most cases of misissuance, understanding which proofs are actually embedded in certificates helps assess the impact of decertifying logs, so collecting these issued certificates is still worthwhile even in the presence of a logged precertificate.
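One rough way to tell these cases apart is to look for the two extension OIDs that RFC 6962 defines: 1.3.6.1.4.1.11129.2.4.2 for the embedded SCT list and 1.3.6.1.4.1.11129.2.4.3 for the precertificate poison. The sketch below scans the DER bytes for the encoded OIDs instead of parsing the certificate, which is a heuristic and an assumption of this example; a real pipeline would parse the extensions with an X.509 library. The function names are hypothetical.

```python
def encode_oid(dotted: str) -> bytes:
    """DER-encode a dotted OID string, including the tag (0x06) and length."""
    arcs = [int(part) for part in dotted.split(".")]
    # The first two arcs share a single byte: 40 * first + second.
    body = bytearray([40 * arcs[0] + arcs[1]])
    for arc in arcs[2:]:
        # Remaining arcs are base-128, most significant group first, with the
        # high bit set on every byte except the last.
        chunk = [arc & 0x7F]
        arc >>= 7
        while arc:
            chunk.append(0x80 | (arc & 0x7F))
            arc >>= 7
        body.extend(reversed(chunk))
    if len(body) > 127:
        raise ValueError("long-form lengths are not handled in this sketch")
    return bytes([0x06, len(body)]) + bytes(body)


SCT_LIST_OID = encode_oid("1.3.6.1.4.1.11129.2.4.2")
PRECERT_POISON_OID = encode_oid("1.3.6.1.4.1.11129.2.4.3")


def classify(der_cert: bytes) -> str:
    """Heuristic classification based on a raw byte search, not an ASN.1 parse."""
    if PRECERT_POISON_OID in der_cert:
        return "precertificate"
    if SCT_LIST_OID in der_cert:
        return "final certificate with embedded SCTs"
    return "no embedded SCTs"
```

A final certificate that classifies as carrying embedded SCTs but is absent from the logs is exactly the kind of certificate worth submitting.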

## cablint results

Running cablint, which audits certificates to the CA/Browser Forum Baseline Requirements, over the unexpired certificates reveals a variety of badly encoded or otherwise non-compliant certificates, presented here in approximate order of interest.

### Noncompliance possibly revealed by this work

These certificates were unrevoked at the time of discovery. I believe these issues have not previously been discussed for these issuers, or were considered resolved; at a minimum, these were the only results I saw for these CAs on the misissued.com cablint page. Please let me know if you think I shouldn’t take credit for them. :)

I have not applied cablint to all certificates in CT; it is completely possible that other certificates with these problems, from these issuers or others, have already been publicly disclosed. But these were the ones that you couldn’t have seen before yesterday!

| Issuer | Last issued | Certificate | Details |
| --- | --- | --- | --- |
| GoDaddy | 2013 | 1 | ≥ 60 month validity period |
| Digicert | 2016 | 1 | IP address encoded as a DNS SAN |
| ABB (Digicert) | 2016 | 1 | CN not in SAN * |
| NetLock | 2017 | 1, 2, 3 | Invalid EKU (KeyAgreement for RSA key) |
| KPN (Logius/PKIoverheid) | 2016 | 1, 2, 3 | Invalid EKU (KeyAgreement for RSA key) |
| ECCE 001 (Digicert) | 2018 | 1, 2 | Invalid string encoding |
| Tenera (Digicert) | 2017 | 1, 2 | Invalid string lengths |
| Entrust | 2015 | 1 | Invalid string encoding |

* A certificate with the same problem from the same issuer was revoked after it was disclosed last August.
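To give a flavor of what a few of these findings mean mechanically, here is a small sketch of three checks operating on fields that have already been extracted from a certificate. This is not cablint's implementation; the function names and the field representation (plain strings and dates) are assumptions for illustration, and the 60-month threshold is simply the one quoted in the table.

```python
import ipaddress
from datetime import date


def dns_sans_that_are_ip_addresses(dns_sans):
    """Return dNSName SAN entries that parse as IPv4 or IPv6 addresses."""
    flagged = []
    for name in dns_sans:
        try:
            ipaddress.ip_address(name)
        except ValueError:
            continue  # not an IP literal, so it is fine as a DNS name
        flagged.append(name)
    return flagged


def common_name_missing_from_sans(common_name, dns_sans):
    """True when the subject CN does not appear among the SAN entries."""
    wanted = common_name.lower()
    return wanted not in {name.lower() for name in dns_sans}


def validity_in_months(not_before: date, not_after: date) -> int:
    """Whole calendar months between the two dates, rounded down."""
    months = (not_after.year - not_before.year) * 12
    months += not_after.month - not_before.month
    if not_after.day < not_before.day:
        months -= 1
    return months
```

For example, `validity_in_months(date(2013, 1, 15), date(2018, 1, 15)) >= 60` would flag a certificate like the first row, and `dns_sans_that_are_ip_addresses(["mail.example.com", "192.0.2.10"])` would flag the second entry.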

### Noncompliance already visible from previously logged certificates

• The HydrantID SSL ICA G2 CA is trusted by Mozilla (via QuoVadis) for TLS authentication, but issues certs intended for IPsec which lack serverAuth and clientAuth EKU values. Correction: I originally described these as not BR-compliant (7.1.2.3.f). These certificates are compliant; zlint was reporting them incorrectly. I regret the error.
• Digicert has issued certificates with underscores in DNS names
• Microsoft’s CA software inserts trailing NULs in the CPSuri field and fails to flag the KeyUsage extension as critical.

### Otherwise notable

• Comodo issued certificates in 2017 and 2014 where the CN is a U-label and only the matching A-label is present as a SAN; this situation, but not this certificate, was discussed on m.d.s.p
• Thawte very recently revoked two certificates containing only metadata in the subject OU identifier; 1, 2
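For readers unfamiliar with the U-label and A-label distinction in the Comodo item: a U-label is the Unicode form of an internationalized name, and the A-label is its ASCII `xn--` encoding. The sketch below uses Python's built-in `idna` codec, which implements IDNA 2003 and so may differ from what a CA or linter uses; the function names and the example hostname are hypothetical.

```python
def to_a_label(hostname: str) -> str:
    """Convert a hostname to its ASCII (A-label) form with the built-in
    "idna" codec (IDNA 2003)."""
    return hostname.encode("idna").decode("ascii")


def cn_matches_a_san(common_name: str, dns_sans) -> bool:
    """True when the CN, converted to A-label form, equals some SAN entry."""
    wanted = to_a_label(common_name).lower()
    return any(wanted == san.lower() for san in dns_sans)
```

With these helpers, `to_a_label("bücher.example")` gives `xn--bcher-kva.example`, so a certificate whose CN is the Unicode form and whose only SAN is the `xn--` form describes the same name in two encodings.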

## Discussion

This work shows that non-HTTPS servers present a meaningful volume of undisclosed browser-trusted TLS certificates, and that these certificates contain interesting examples of BR violations.

It is not obvious to me that the rate or distribution of problems in this data set differs from what a random sample of certificates already disclosed to CT would show, which suggests an interesting follow-on experiment. The misissued.com cablint page monitors recent, but not historical, disclosures, and unknown horrors may lurk in CT’s merkled depths. Cthulhu fhtagn!

Happily, systematic evaluations of the quality of historical logged certificates have been undertaken recently, including work presented a few days ago by Oliver Gasser at TU Munich, and an upcoming conference paper by Deepak Kumar at UIUC and colleagues. Their work on logged certificates provides a useful baseline for future comparisons to discovered certificates.

The source code for retrieving and processing scan data is available at https://github.com/tdsmith/tattle.

Other outcomes of this work included a bug report against pyca/cryptography.

## Acknowledgements

Thanks to:

Any errors are, of course, mine alone.