What the heck is so hard about Email Validation?

Posted: 05.29.2024

Nobody really tells you this, but did you know that “com” and “uk” are TLDs, or top-level domains, and that a suffix like “co.uk” sits directly beneath one?

Let’s take a typical HTML5 email field for a spin:

<input type="email" name="email" required>

Modern browsers validate this field against the HTML Standard’s own definition of a valid email address, a deliberate simplification of RFC 5322. That definition accepts addresses such as “admin@com”. While this is correct per the specification, casting such a wide net ends up hurting businesses that rely on email and lead capture. Your average user isn’t a super-admin at a TLD, and you end up with many incomplete email addresses added to sending lists and campaigns.
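To see why “admin@com” gets through, here is a small sketch using the regular expression that the HTML Standard publishes for this check. Treat it as an approximation of what a given browser does internally; the variable name htmlEmailRegex is ours.

```javascript
// Regular expression given in the HTML Standard for type="email" values.
// Every dot-separated label after the first is optional, so a bare TLD passes.
const htmlEmailRegex = /^[a-zA-Z0-9.!#$%&'*+\/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/;

console.log(htmlEmailRegex.test("admin@com"));        // true
console.log(htmlEmailRegex.test("user@example.com")); // true
console.log(htmlEmailRegex.test("user@"));            // false
```

Nothing in that expression requires a dot in the domain, which is exactly the gap a custom pattern is meant to close.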

As always, it is best practice to validate email addresses on the back end, but it is also desirable to give users instant feedback as they fill in a form. One way of doing this with HTML5 form fields is by specifying the pattern attribute.
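For the back-end half, a minimal sketch might look like the following. The function name hasDottedDomain and its rules are our own assumptions for illustration. It only enforces the “domain must contain a dot” rule discussed in this post, and a real back end would likely do more, such as length limits or a confirmation email.

```javascript
// Hypothetical server-side helper: rejects addresses whose domain has no dot.
function hasDottedDomain(address) {
  if (typeof address !== "string") return false;
  const trimmed = address.trim();
  const at = trimmed.lastIndexOf("@");
  // Need at least one character on each side of the "@".
  if (at < 1 || at === trimmed.length - 1) return false;
  const labels = trimmed.slice(at + 1).toLowerCase().split(".");
  // Require at least two non-empty labels, e.g. "domain" and "com".
  if (labels.length < 2 || labels.some((label) => label.length === 0)) return false;
  // Final label must be two or more letters, mirroring [a-z]{2,} in the pattern.
  return /^[a-z]{2,}$/.test(labels[labels.length - 1]);
}

console.log(hasDottedDomain("admin@com"));          // false
console.log(hasDottedDomain("user@example.co.uk")); // true
```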

The Problem

While there are a lot of great tutorials and examples for using HTML5 email fields with the pattern attribute, a shift in how Chrome treats RegEx patterns has left many web devs out in the cold.

Like us, you have probably come across many articles or Stack Overflow posts suggesting the pattern be formatted as [a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}. Adding this to our email field is easy:

<input type="email" name="email" required pattern="[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}">

However, when testing an address such as “email@domain”, this fails to catch our incomplete email address and instead produces an error in the browser’s console like so:

Pattern attribute value [a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,} is not a valid regular expression.
Uncaught SyntaxError: Invalid regular expression: /[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}/v.
Invalid character in character class
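The error can be reproduced outside a form. The sketch below assumes the browser wraps the attribute value as ^(?:…)$ and compiles it with a RegExp flag, which is how the HTML Standard describes it. The helper names tryCompile and supportsFlag are ours, and the v flag is only tried when the JavaScript engine supports it.

```javascript
// The pattern exactly as it appears in the attribute above.
const originalPattern = String.raw`[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}`;

// Returns true when this engine accepts the given RegExp flag at all.
function supportsFlag(flag) {
  try {
    new RegExp("", flag);
    return true;
  } catch (error) {
    return false;
  }
}

// Anchors the pattern the way the browser does; returns null if it will not compile.
function tryCompile(pattern, flags) {
  try {
    return new RegExp(`^(?:${pattern})$`, flags);
  } catch (error) {
    return null;
  }
}

const legacyRegex = tryCompile(originalPattern, "u");
console.log(legacyRegex !== null); // true: the older u flag accepts the pattern

if (supportsFlag("v")) {
  // Under the v flag the unescaped "-" is a syntax error, so this is null.
  console.log(tryCompile(originalPattern, "v"));
}
```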

What Happened?

The /v at the end of the error message is the clue. The HTML Standard was updated so that the pattern attribute is compiled with the RegExp v flag (unicodeSets) rather than the older u flag, and Chrome adopted that behavior in 2023. The v flag parses character classes more strictly, which affects how certain special characters inside them must be written.

In practice, a pattern that the browser used to accept silently can now fail to compile. When that happens, the browser logs the error and ignores the pattern attribute, so the field falls back to the default type="email" check.

The Solution

The fix is easy, but understanding what this means can take a moment. Some previously valid patterns now produce errors, specifically those whose character classes contain unescaped special characters such as ( ) [ ] { } / - \ |.

Special characters in RegEx provide functions such as capture groups or ranges of characters to match. One example is a-z inside a character class, which matches any lowercase letter from a through z. If you want to match a literal - instead, it needs to be escaped like so: \-.
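A quick sketch of the difference, using plain regular expressions with no flags. The variable names are ours.

```javascript
// Unescaped "-" between two characters defines a range.
const rangeClass = /^[a-z]$/;
// Escaped "\-" is a literal dash, so this class is just "a", "-", or "z".
const literalDashClass = /^[a\-z]$/;
// Without the v flag, a trailing "-" is also read as a literal dash,
// which is why the original pattern used to work.
const trailingDashClass = /^[az-]$/;

console.log(rangeClass.test("m"));        // true
console.log(literalDashClass.test("m"));  // false
console.log(literalDashClass.test("-"));  // true
console.log(trailingDashClass.test("-")); // true
```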

Looking at our pattern above, we are using several of these special characters. Most are used correctly for their special meaning. However, two - characters, the ones at the end of each character class, are meant to match a literal dash in the email address. Let’s update our RegEx pattern:

<input type="email" name="email" required pattern="[a-z0-9._%+\-]+@[a-z0-9.\-]+\.[a-z]{2,}">

Testing with “email@domain” now alerts the user that the email is not in the requested format.
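The same check can be sketched in script form. This assumes the browser anchors the pattern as ^(?:…)$. The helper matchesPattern is our own name, and it falls back to the u flag on engines that do not support v.

```javascript
// The escaped pattern from the updated field above.
const fixedPattern = String.raw`[a-z0-9._%+\-]+@[a-z0-9.\-]+\.[a-z]{2,}`;

// Picks the v flag when the engine supports it, otherwise the older u flag.
function pickFlag() {
  try {
    new RegExp("", "v");
    return "v";
  } catch (error) {
    return "u";
  }
}

// Approximates the browser's pattern check: the whole value must match.
function matchesPattern(pattern, value) {
  return new RegExp(`^(?:${pattern})$`, pickFlag()).test(value);
}

console.log(matchesPattern(fixedPattern, "email@domain"));     // false
console.log(matchesPattern(fixedPattern, "email@domain.com")); // true
console.log(matchesPattern(fixedPattern, "Email@Domain.com")); // false: only lowercase is listed
```

Note that the pattern only lists lowercase letters, so an address typed with capitals is rejected as well. Adding A-Z to each character class is one way to relax that.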

Now we finally have proper front-end validation for email addresses!

Why Email Validation Matters

Email validation is crucial for several reasons:

  • Data Quality: Ensures that the collected email addresses are correctly formatted and valid, which is essential for maintaining high-quality data.
  • User Experience: Prevents users from submitting incorrect email addresses, reducing frustration and ensuring they can receive important communications.
  • Security: Helps in preventing malicious inputs that could exploit your system.
  • Deliverability: Reduces bounce rates by catching incorrectly formatted addresses before they reach your lists, improving the overall deliverability of your emails.

Conclusion

Email validation might seem straightforward, but the nuances of handling regular expressions and browser behavior can complicate things. By understanding these challenges and applying the correct patterns, you can ensure robust email validation on your forms.

At Epogee Design, our developers stay updated with the latest best practices to ensure that every element of your website, including email validation, works flawlessly. Contact us today to find out how our small team can make a big difference for your business.