Address validation - format the column first
I have a postal address column that has the data in one column and is in this format:
1 Smith Street MOE VIC 3220
No commas. I am unable to do an address validation and the file is sent to me automatically so no manual intervention. I am looking to see if l can change the file before sending the data but is there anyway l could split this to 2 columns
1 smith Street MOE VIC 3220 - the address validation will read this
Is there a way to do by capital letters as that is where the split will need to happen
Hey! You will be signed out in 60 seconds due to inactivity. Click here to continue using the site.
@Carolyn that particular address in MOE doesn't look right.
If you're trying to get this address to parse out into constituent pieces, it isn't straight forward.
Ahead of pushing the data into the Batch engine, we standardize the data using country specific standardization. In principle this means that we work out the locality, the post code, the province, the street/avenue/road and house number all through the standardization logic.
You can somewhat achieve this using data manipulations like searches within the string to find the position of the Street Avenue Road Way Boulevard etc identifiers but it is cumbersome and you would need to build for all possible roadway identfiiers and their alternates.
I have previously built a compound reusable function that will do this but I would not be confident in saying that it covers all scenarios.
When you have addresses like these, what is the batch address engine giving you as splits?
@Carolyn here is one way to do it, but it might not be ideal, depending on how standardized your input address format is.
I'll first show you what i mean with some sample addresses here. As you can see the first one split out okay, but the subsequent ones may not be accurate depending on where the capital letters start. Also, with this approach, i assume the minimum length for the locality is 3.
This is achieved by adding a Transformation step, and having 2 additional columns for Part 1 and 2 of the address respectively. Part 1 is expected to consist of the premises and street, part 2 is expected to consist of the locality state and postcode.
For part 1, we use a regular expression as the separator value for a split function where we expect 3 or more capital letters between A-Z. We would then return the 1st portion of the results.
For part 2, we would use the Part 1 value as the separator. We would then return the 2nd portion of the results.
Hope this helps.
@Carolyn this example of @Sueann See seems to get you what you asked for however, i did observe in my sample data sets that very rural addresses don't get handled well, and in particular when the state is SA or WA. Be sure to check all your rows as carefully as practically possible Also towns that start with ST like ST KILDA wont get handled correctly the ST splits off.
thank you l will try these today. However if l get the address to have a comma after each space, address validation likes it.
I did use parse but it worked then didn't work. So l cant rely on it
So if the address is 1, smith, street, MOE, VIC, 3220 - sorry l used a madeup address for privacy reasons - address validation works and that is all l need.
Is there an easy step to do this outside of parse. I had it working with parse by regular expression but l went back in and it wasn't working.
I will try above and see how l go
@Carolyn parse should work as shown, the issue might be handling the states and the locale but since you have a limited number of states you could pull those out also.
or you could use this if you don't care about length of the uppercsae string
@Carolyn you can use the Replace function to replace Space with comma if you just want to get to 1, smith, street, MOE, VIC, 3220.
However, looking back at a previous post here https://community.experianaperture.io/discussion/comment/645#Comment_645 i've done a test before using this method and the address validation results may not be as good as having the proper address components, though it is better than using space.
Thank you Clinton and Sueann. This has helped me again.
Really appreciate your quick responses. Really helping finish my build