Address Recognition
ShipEngine can use machine learning and natural language processing (NLP) to parse address data from unstructured text using the /v1/addresses/recognize endpoint.
Data can often enter your system as unstructured text (as with emails, SMS messages, support tickets, or other documents). ShipEngine's address recognition endpoint saves you from parsing this text manually and trying to extract the useful data within it. Instead, you can send us the unstructured text and we'll return whatever address data it contains.
Our machine learning model learns and improves over time so it becomes more accurate and better at understanding different writing patterns.
Example Use Case
Let's say you receive an order via email. You can send the text of the email to ShipEngine and it will automatically extract the customer's address.
Here's an example of unstructured text from an email:
I need to send a package to my friend Amanda Miller’s house at 525 Winchester Blvd in San Jose (that's california, obviously). The zip code there is 95128.
When you send this text to ShipEngine via the /v1/addresses/recognize endpoint, it will recognize the following pieces of information:
Property | Value |
---|---|
person | Amanda Miller |
address | Amanda Miller 525 Winchester Blvd San Jose, CA 95128 |
address_residential_indicator | residential |
address_line1 | 525 Winchester Blvd |
city_locality | San Jose |
state_province | CA |
postal_code | 95128 |
For more complex parsing that includes other types of shipment data beyond just the address, see our Shipment Recognition page.
Requirements
- This endpoint is only available to accounts on the Advanced plan or higher.
- The unstructured text goes into the text property as a string in the request body.
- ShipEngine NLP currently supports English text and can recognize addresses for the following countries:
  - Australia
  - Canada
  - Ireland
  - New Zealand
  - United Kingdom
  - United States
Already-Known Properties
You can specify any already-known properties for your address in the request. This lets you supply values you have already collected instead of relying on the text alone, such as:
- name
- city_locality
- state_province
- postal_code
- country_code
Entity Types
The response includes an entities array, which breaks down the separate pieces that the NLP model parsed from the unstructured text. Each type of information is called an "entity". For example, an address, a city, and a phone number would all be individual entities. Additionally, entities can have one or more attributes.
ShipEngine's address recognition can currently recognize the following types of entities and the associated attributes:
Entity Type | Recognized Attributes |
---|---|
address | direction: enumerated string (from or to); name: string; company_name: string; phone: string; address_line1: string; address_line2: string; address_line3: string; city_locality: string; state_province: string; postal_code: string; country_code: string; address_residential_indicator: enumerated string (yes, no, or unknown) |
address_line | line: number (usually 1, 2 or 3); value: string (ex: "525 Winchester Blvd") |
city_locality | value: string |
country | name: string; value: string |
number | type: enumerated string (cardinal, ordinal, or percentage); value: number |
person | value: string |
phone_number | value: string |
postal_code | value: string |
residential_indicator | value: enumerated string (yes, no, or unknown) |
state_province | name: string (ex: "Texas", "Quebec", "New South Wales"); value: string (ex: "TX", "QC", "NSW"); country: string (ex: "US", "CA", "AU") |
Example Request & Response
We'll use the example use case from above in our example request, with the additional known properties for name and country_code.
PUT /v1/addresses/recognize
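As a rough illustration of how such a request could be assembled, here is a minimal Python sketch using only the standard library. Several details are assumptions rather than facts stated on this page: the api.shipengine.com host, the API-Key header name, and the nesting of already-known properties under an address key. The build_recognize_request helper is hypothetical, and the name and country_code values are illustrative. Check each of these against the API reference before relying on them.

```python
import json
import urllib.request

# Assumed base URL; confirm the host against your account's API reference.
API_URL = "https://api.shipengine.com/v1/addresses/recognize"


def build_recognize_request(text, api_key, known_address=None):
    """Build (but do not send) a PUT request for the address recognition endpoint."""
    body = {"text": text}  # the unstructured text goes in the "text" property
    if known_address:
        # Assumption: already-known properties are nested under an "address" key.
        body["address"] = known_address
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        # Assumption: the API key is passed in an "API-Key" header.
        headers={"API-Key": api_key, "Content-Type": "application/json"},
    )


request = build_recognize_request(
    text=(
        "I need to send a package to my friend Amanda Miller’s house at "
        "525 Winchester Blvd in San Jose (that's california, obviously). "
        "The zip code there is 95128."
    ),
    api_key="YOUR_API_KEY",  # placeholder, not a real key
    known_address={"name": "Amanda Miller", "country_code": "US"},
)
```

To actually send the request, pass it to urllib.request.urlopen and decode the JSON body of the response.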
Example Response
The response includes a score property, a decimal number that indicates the level of confidence in the parsing accuracy. In this example, the response has an overall score of 0.971069..., which indicates 97% confidence that the text was parsed correctly. The score value can help your application programmatically decide whether it needs additional input or verification from your user.
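One way an application might act on the score is to compare it with a cutoff and ask the user to confirm the address when confidence is low. The sketch below assumes the parsed response is a Python dict with a top-level score key; the needs_review helper and the 0.9 threshold are hypothetical and should be tuned to your own tolerance for errors.

```python
# Hypothetical cutoff; this value is not recommended anywhere in the docs.
REVIEW_THRESHOLD = 0.9


def needs_review(response, threshold=REVIEW_THRESHOLD):
    """Return True when the overall score is missing or below the threshold."""
    score = response.get("score")
    return score is None or score < threshold


high_confidence = needs_review({"score": 0.971069})  # above the cutoff
low_confidence = needs_review({"score": 0.62})       # below the cutoff
```

Treating a missing score as "needs review" is a deliberately conservative choice: an unexpected response shape prompts verification rather than silent acceptance.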
The entities array breaks the recognized data down further into individual objects, each with its attributes as properties, its result, and its own confidence score.
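To make the entity breakdown easier to consume, an application could group the recognized results by entity type. The following sketch assumes each entity object exposes type, score, and result keys. Those key names are an assumption based on the description above, not a confirmed schema, and the sample data is hand-written for illustration rather than real API output.

```python
from collections import defaultdict


def entities_by_type(response, min_score=0.0):
    """Group entity results by entity type, skipping low-confidence entities."""
    grouped = defaultdict(list)
    for entity in response.get("entities", []):
        # Assumed keys: "type", "score", and "result".
        if entity.get("score", 0.0) >= min_score:
            grouped[entity["type"]].append(entity.get("result", {}))
    return dict(grouped)


# Hand-written sample with made-up scores, not actual endpoint output.
sample_response = {
    "score": 0.97,
    "entities": [
        {"type": "person", "score": 0.97, "result": {"value": "Amanda Miller"}},
        {"type": "postal_code", "score": 0.99, "result": {"value": "95128"}},
        {"type": "city_locality", "score": 0.40, "result": {"value": "San Jose"}},
    ],
}

grouped = entities_by_type(sample_response, min_score=0.5)
```

With min_score set to 0.5, the low-scoring city_locality entity in the sample is dropped, so grouped contains only the person and postal_code results.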