Inferring Gender from Title or Honorific and Gender Reversal

Clinton Jones · December 2019

Sometimes it is useful to infer a likely gender from a person's honorific, for example to decide whether a greeting should be Sir or Madam depending on whether the honorific is Mr, Mrs or Miss.

The interesting aspect of this inference task is that Data Studio offers a good many ways to achieve this result.

Here's one that uses the EQUALS condition evaluation and IF THEN ELSE operators

The result is then as follows
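Outside Data Studio, the same equality-and-branching logic can be sketched in Python. This is only an illustration of the approach, not the Data Studio function itself: the function name, the normalisation step, and the empty-string fallback for unrecognised honorifics are all assumptions.

```python
def salutation_from_honorific(honorific: str) -> str:
    """Mirror an IF / ELSE IF / ELSE chain built on equality checks."""
    # Assumed normalisation: trim whitespace, drop a trailing full stop, ignore case.
    title = honorific.strip().rstrip(".").lower()
    if title == "mr":
        return "Sir"
    elif title == "mrs" or title == "miss":
        return "Madam"
    else:
        # Assumed fallback: return nothing rather than guess.
        return ""


for value in ["Mr", "Mrs", "Miss", "Major"]:
    print(repr(value), "->", repr(salutation_from_honorific(value)))
```

Every new honorific needs another branch here, which is the maintenance cost of hard-coding the rule.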

You can also use a lookup, but this requires a lookup list that holds both Honorific and Salutation in the same dataset.

Is there another way you could do this, perhaps using a regular expression?
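One possible answer, sketched in Python rather than Data Studio: a single regular expression with named groups can recognise the honorific at the start of a full name and classify it in one pass. The pattern, the group names and the function name are illustrative assumptions, and only the three honorifics mentioned above are covered.

```python
import re

# Named groups classify the honorific; the lookahead requires the title to be
# a whole word, so a first name such as "Mrinal" is not mistaken for "Mr".
HONORIFIC_PATTERN = re.compile(
    r"^\s*(?:(?P<male>mr)|(?P<female>mrs|miss))\.?(?=\s|$)",
    re.IGNORECASE,
)

SALUTATION_BY_GROUP = {"male": "Sir", "female": "Madam"}


def infer_salutation(text: str):
    """Return 'Sir', 'Madam', or None when no known honorific leads the text."""
    match = HONORIFIC_PATTERN.match(text)
    if match is None:
        return None
    # lastgroup holds the name of the named group that matched.
    return SALUTATION_BY_GROUP[match.lastgroup]


for value in ["Mr John Smith", "Mrs. Green", "Mrinal Sen", "Mx Taylor"]:
    print(value, "->", infer_salutation(value))
```

Compared with plain equality checks, this tolerates a trailing full stop and a name following the title, at the cost of a pattern that is harder to read and maintain.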

Nigel Light · December 2019

You might hit trouble once you start getting military ranks or clergy titles, e.g. Major Smith or The Right Reverend Green: male or female?

The recent adoption of Mx causes similar issues too.

Clinton Jones · December 2019

@Nigel Light yes, that is a good point, but you might then use a combination of conditional IF THEN logic. Most importantly though: aside from using a lookup, is this really the only way to determine an appropriate salutation?

Nigel Light · December 2019

Lookup table?

It would then allow the table to be used in other applications and promote improved governance, rather than hard-coding the values in the program itself.

Nige
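To illustrate the lookup-table idea, here is a minimal Python sketch in which the mapping lives in data rather than in the logic. The CSV content, the column names `honorific` and `salutation`, and the function names are hypothetical; in practice the table would be a governed reference dataset shared between applications.

```python
import csv
import io

# Hypothetical reference table; inlined here only to keep the sketch self-contained.
LOOKUP_CSV = """honorific,salutation
Mr,Sir
Mrs,Madam
Miss,Madam
"""


def load_salutation_lookup(csv_text: str) -> dict:
    """Build a case-insensitive honorific -> salutation mapping from CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["honorific"].strip().lower(): row["salutation"].strip() for row in reader}


def lookup_salutation(honorific: str, table: dict, default=None):
    """Return the salutation for an honorific, or the default when it is not listed."""
    key = honorific.strip().rstrip(".").lower()
    return table.get(key, default)


table = load_salutation_lookup(LOOKUP_CSV)
for value in ["Mr", "Miss", "Major", "Mx"]:
    print(value, "->", lookup_salutation(value, table))
```

Titles such as Major or Mx are simply absent from the table, so they fall through to the default instead of being forced into a guess, and adding or removing a title is a data change rather than a code change.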

Clinton Jones · December 2019

Yes, you could consider something that appends a salutation based on the honorific identified

Luke Westlake · January 2020

From a real-world usage perspective, whilst this is a useful calculation, its use also needs to be carefully considered. Here are a few consulting recommendations:

Use honorifics only to derive a possible gender, not to make assumptions.
Any column should be named as "possible/derived" with metadata/data dictionary entries indicating the derivation method.
Suitable use cases include:
- In a data quality rule to compare an honorific with a gender flag field to look for "possible" but not "certain" mismatches
- Analytics with large enough sample sizes to minimise errors (assuming no systematic errors)
- Light touch personalisation with marketing, but still without assuming gender (caution still required)
Unsuitable use cases include:
- Creating a new gender flag (or changing an existing one) from a derived calculation
- Updating source systems with new gender flags
- Indicating gender in marketing or service communications
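The first suitable use case, a data quality rule that compares the honorific with an existing gender flag, might look something like the following Python sketch. The flag values "M" and "F", the result labels, and all names are assumptions made for illustration.

```python
# Assumed mapping from honorific to a *possible* gender, never a certain one.
POSSIBLE_GENDER = {"mr": "M", "mrs": "F", "miss": "F"}


def check_gender_flag(honorific: str, gender_flag: str) -> str:
    """Classify a record as 'consistent', 'possible mismatch', or 'not assessed'."""
    derived = POSSIBLE_GENDER.get(honorific.strip().rstrip(".").lower())
    recorded = (gender_flag or "").strip().upper()
    if derived is None or not recorded:
        # Unknown honorific (e.g. Major, Mx) or no flag to compare against.
        return "not assessed"
    return "consistent" if derived == recorded else "possible mismatch"


for honorific, flag in [("Mr", "M"), ("Mrs", "M"), ("Major", "F"), ("Miss", "")]:
    print(honorific, flag or "(blank)", "->", check_gender_flag(honorific, flag))
```

The rule only reports; it never writes a derived value back to the gender flag.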

A good example of a suitable use case might be using this calculation to "blank out" potentially inaccurate gender flags before loading a dataset to a marketing platform ahead of a personalised campaign.
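A minimal Python sketch of that "blank out" step, assuming records are dictionaries with hypothetical `honorific` and `gender` fields and flag values "M" and "F":

```python
# Assumed mapping from honorific to a possible gender.
POSSIBLE_GENDER = {"mr": "M", "mrs": "F", "miss": "F"}


def blank_uncertain_flags(records):
    """Return copies of the records with conflicting gender flags blanked out."""
    cleaned = []
    for record in records:
        row = dict(record)  # copy, so the source records are left untouched
        derived = POSSIBLE_GENDER.get(row.get("honorific", "").strip().rstrip(".").lower())
        recorded = (row.get("gender") or "").strip().upper()
        if derived is not None and recorded and recorded != derived:
            # Conflict between honorific and flag: treat the flag as unreliable.
            row["gender"] = ""
        cleaned.append(row)
    return cleaned


source = [
    {"honorific": "Mr", "gender": "M"},
    {"honorific": "Mrs", "gender": "M"},
    {"honorific": "Major", "gender": "F"},
]
for row in blank_uncertain_flags(source):
    print(row)
```

Only flags that conflict with the honorific are cleared. Records with an unrecognised honorific keep their existing flag, and no flag is ever created or changed to a derived value.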
