Find Duplicates: Email/Email_Local Element Modifier

Dan Mason (Experian Contributor)
edited July 4 in Ideas board

What problem are you facing?

The Email and Email_Local elements in Find Duplicates Standardization currently retain all characters by default, including dots (.) and plus-aliases (+alias) in the local part of email addresses. This means that emails like johnsmith@email.com, john.smith@email.com, and johnsmith+alias@email.com are treated as completely different email addresses during duplicate detection, even though many email providers (particularly Gmail) treat these variations as identical.

What impact does this problem have on you/your business?

This creates gaps in duplicate detection accuracy, allowing individuals to circumvent email-based matching by using common email aliasing strategies. Since some email providers (notably Gmail) ignore dots in the local part and many deliver plus-aliases to the same inbox, our duplicate detection should align with this real-world behavior. The current approach means we're missing potential duplicates that could be critical for business processes like fraud detection, customer identification, or account management.

Do you have any existing workarounds? If so, please describe those.

Currently, we need to create derived fields that manually clean email addresses before using them in blocking keys and matching rules. This involves:

  • Removing all dots from the local part of email addresses
  • Removing any +<alias> portion before the @ symbol
  • Using these cleaned fields instead of the native Email/Email_Local elements

This approach is not appropriate for a highly available and scalable Real Time solution.
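For illustration, the cleaning step behind those derived fields can be sketched roughly as below. This is only a minimal Python sketch of the two rules listed above, not how our actual workflow is built; the function name `clean_email` is made up for the example.

```python
def clean_email(email: str) -> str:
    """Return a cleaned copy of an email address for use in a derived field.

    Hypothetical helper: applies only the two rules from the workaround
    (drop the +alias, drop dots) to the local part, leaving the domain as-is.
    """
    # Split on the last "@" so the domain is kept untouched.
    local, sep, domain = email.rpartition("@")
    if not sep:
        # No "@" present: not an email address, so return it unchanged.
        return email
    # Remove any "+<alias>" portion before the "@" symbol.
    local = local.split("+", 1)[0]
    # Remove all dots from the local part.
    local = local.replace(".", "")
    return f"{local}@{domain}"


if __name__ == "__main__":
    for raw in ("johnsmith@email.com",
                "john.smith@email.com",
                "johnsmith+alias@email.com"):
        print(raw, "->", clean_email(raw))
```

All three sample addresses clean to the same value, which is what lets the derived field be used in blocking keys and matching rules.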

Do you have any suggestions to solve the problem? Feel free to add images if this helps.

Could element modifiers be added to Email/Email_Local elements (similar to how Root Names work) that would automatically handle common email aliasing strategies? Specifically:

  1. Dot Removal Modifier: Automatically removes dots from the local part of email addresses
  2. Plus-Alias Removal Modifier: Automatically removes any +<alias> portion from the local part before the @ symbol

This would allow users to choose whether they want strict email matching (current behavior) or normalized email matching (proposed enhancement) depending on their specific use case. The modifiers could be applied directly to the Email/Email_Local elements in blocking keys and matching rules.
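To make the opt-in behavior concrete, here is a rough Python sketch of how the two modifiers might behave when applied to Email_Local. The names `EmailModifier` and `email_local` are invented for this example and are not part of the product; the real modifier syntax would be whatever fits the existing element modifier conventions.

```python
from enum import Flag, auto


class EmailModifier(Flag):
    """Hypothetical opt-in modifiers for the Email/Email_Local elements."""
    NONE = 0                      # strict matching (current behavior)
    DOT_REMOVAL = auto()          # remove dots from the local part
    PLUS_ALIAS_REMOVAL = auto()   # remove "+<alias>" from the local part


def email_local(email: str, modifiers: EmailModifier = EmailModifier.NONE) -> str:
    """Return the local part of an email with the chosen modifiers applied."""
    local, sep, _domain = email.rpartition("@")
    if not sep:
        # No "@" found: treat the whole value as the local part.
        local = email
    if EmailModifier.PLUS_ALIAS_REMOVAL in modifiers:
        local = local.split("+", 1)[0]
    if EmailModifier.DOT_REMOVAL in modifiers:
        local = local.replace(".", "")
    return local


if __name__ == "__main__":
    a, b = "john.smith+alias@email.com", "johnsmith@email.com"
    both = EmailModifier.DOT_REMOVAL | EmailModifier.PLUS_ALIAS_REMOVAL
    # Strict matching: the two local parts differ.
    print(email_local(a) == email_local(b))
    # Normalized matching: the two local parts are equal.
    print(email_local(a, both) == email_local(b, both))
```

With no modifiers the element behaves exactly as it does today, so existing blocking keys and rules would be unaffected unless a modifier is explicitly selected.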

3 votes

Gathering interest

Comments

  • I believe the dot removal modifier should apply only to a specific (internal?) list of known email domains, such as Gmail, that ignore dots. Many email domains do not disregard dots, and for those domains the dots are an essential part of the address. Restricting the modifier this way should be more robust, and it would guard against end users enabling dot removal without covering the dot-sensitive case in their rules.
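
    A rough Python sketch of that domain-restricted behavior, in case it helps the discussion. The domain list here is only an assumption for the example (Gmail's two domains); the real list would need to be maintained internally, and the name `remove_dots_if_safe` is made up.

```python
# Assumed example list of domains that ignore dots in the local part.
# A real list would need to be curated and maintained.
DOT_INSENSITIVE_DOMAINS = frozenset({"gmail.com", "googlemail.com"})


def remove_dots_if_safe(email: str, domains=DOT_INSENSITIVE_DOMAINS) -> str:
    """Remove dots from the local part only for known dot-insensitive domains."""
    local, sep, domain = email.rpartition("@")
    # Domain names are case-insensitive, so compare in lower case.
    if not sep or domain.lower() not in domains:
        # Unknown or dot-sensitive domain: leave the address untouched.
        return email
    return f"{local.replace('.', '')}@{domain}"


if __name__ == "__main__":
    print(remove_dots_if_safe("john.smith@gmail.com"))   # dots removed
    print(remove_dots_if_safe("john.smith@email.com"))   # left unchanged
```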