Geographic data comes in various granularities: zip code, state, country, etc. Sometimes your data is more granular than what you want to show in your analysis. For example, you might have data at the zip code level but want to analyze it by state. Displayr can automatically combine smaller geographic regions into larger ones, saving you the time of mapping them manually.
This article describes how you can automatically combine smaller geographic categories like zip codes or postal codes into larger ones like regions or states.
Requirements
A dataset that contains variables with geographic data.
Currently, the regions that are supported are:
- United States
- Canada
- United Kingdom
- Europe
- Australia
- New Zealand
Options are also available if the user has data from two adjacent regions (e.g. if doing a multi-country study):
- United States and Canada
- Europe (including UK)
- Australia and New Zealand
The output can be one of several geographic designations, including states, provinces, regions, counties, and countries; which ones are available depends on how each region defines its geographic levels. Note that our geographic map visualization shown above does not support all aggregations; see How to Access an Exhaustive List of Geographic Entities Available for Geographic Maps.
Method
Use Case 1 - Combining zip codes, postcodes, and other unambiguous geographies like state
When the selected input variable is not ambiguous, you only need to run the option from the menu. Some geographic names, such as cities, are ambiguous (they can refer to more than one place); for these, see Use Case 2.
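Conceptually, combining an unambiguous geography is a lookup from each small unit to the larger unit that contains it. The following is a minimal sketch of that idea, not Displayr's implementation: the two tables, `postcode_area`, and `combine` are hypothetical names, and mapping a whole postcode area to a single county is a simplification made only for illustration.

```python
import re

# Hypothetical, heavily simplified lookup tables (assumption for illustration).
# Real postcode areas can span more than one county.
POSTCODE_AREA_TO_COUNTY = {
    "OX": "Oxfordshire",
    "CB": "Cambridgeshire",
    "NR": "Norfolk",
}
COUNTY_TO_REGION = {
    "Oxfordshire": "South East",
    "Cambridgeshire": "East of England",
    "Norfolk": "East of England",
}


def postcode_area(postcode):
    """Return the leading letters of a UK postcode, e.g. 'OX1 2JD' -> 'OX'."""
    match = re.match(r"[A-Za-z]+", postcode.strip())
    return match.group(0).upper() if match else ""


def combine(postcodes, output_type="County"):
    """Map each postcode to a county, or to a region when output_type is 'Region'.

    Values that cannot be found in the lookup tables become None.
    """
    combined = []
    for postcode in postcodes:
        county = POSTCODE_AREA_TO_COUNTY.get(postcode_area(postcode))
        if output_type == "County" or county is None:
            combined.append(county)
        else:
            combined.append(COUNTY_TO_REGION.get(county))
    return combined


postcodes = ["OX1 2JD", "cb2 1tn", "NR1 3JU", "ZZ9 9ZZ"]
counties = combine(postcodes, output_type="County")
regions = combine(postcodes, output_type="Region")
```

Because each input value maps to exactly one larger unit, switching the output from counties to regions only changes which table is consulted last; the input variable stays the same.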
In the following example, we combine UK postcodes into counties:
- Select the variable you want to aggregate (in this case postcode) in the Data Sources tree.
- Hover and click + > Ready-Made New Variables > Automatically Combine Categories > By Geography > United Kingdom > To Counties. A new variable will appear just below the original.
- To view the results, drag the original variable and the new variable onto the same page.
- At this point, you might decide that the counties aren't very useful and you would prefer regions instead. To do so, select the new variable you created from the Data Sources tree.
- In the Object Inspector, go to Data > Automatically Combine Categories > Output geographic type and change the selection to Region. Other selections are also available.
The results are as follows:
Use Case 2 - Combining ambiguous place names
Some geographic names can refer to more than one place. For example, there are multiple places called “Brooklyn” in the United States. It is impossible for the software to know exactly which “Brooklyn” is which unless the user provides some additional, unambiguous information. For example, if combining place names from the US into counties, the user could supply an additional variable telling us what State each place is in. Then the places could be mapped to counties. The feature will detect if there is ambiguity in the data the user has selected, and it will prompt the user to select an additional variable to disambiguate the places.
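The role of the additional variable can be sketched as a lookup keyed on both the place name and the state. This is only an illustration under stated assumptions: `GAZETTEER` is a tiny hypothetical table, and `build_index`, `has_ambiguity`, and `place_to_county` are made-up helper names, not part of Displayr.

```python
from collections import defaultdict

# Hypothetical gazetteer rows: (place, state, county). Assumption for illustration.
GAZETTEER = [
    ("Brooklyn", "New York", "Kings County"),
    ("Brooklyn", "Connecticut", "Windham County"),
    ("Honolulu", "Hawaii", "Honolulu County"),
]


def build_index(rows):
    """Index counties by lower-cased place name, then by lower-cased state."""
    index = defaultdict(dict)
    for place, state, county in rows:
        index[place.lower()][state.lower()] = county
    return index


def has_ambiguity(index, places):
    """Return True if any place name matches more than one location."""
    return any(len(index.get(place.lower(), {})) > 1 for place in places)


def place_to_county(index, place, state=None):
    """Resolve a place to a county, using the state only when it is supplied."""
    candidates = index.get(place.lower(), {})
    if state is not None:
        return candidates.get(state.lower())
    if len(candidates) == 1:
        return next(iter(candidates.values()))
    return None  # unknown, or ambiguous without a state


index = build_index(GAZETTEER)
needs_state = has_ambiguity(index, ["Brooklyn", "Honolulu"])
county = place_to_county(index, "Brooklyn", state="New York")
```

Without the state, "Brooklyn" cannot be resolved in this sketch, which mirrors why the feature asks for a second variable when it detects ambiguity.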
For example, assume you want to combine United States cities into counties.
- Select the "city" variable from the Data Sources tree.
- Hover and click + > Ready-Made New Variables > Automatically Combine Categories > By Geography > United States > To Counties.
- A message will appear confirming that you are using a variable that contains city or town data:
- Click Yes.
- Select a secondary variable, such as "State", and click OK. Displayr uses this information to identify the correct location for each city in cases of ambiguity.
The results are as follows:
Use Case 3 - World Region
The World section of the menu is not limited to any specific region or country, but it is limited in the type of data it can use.
The World section can map either of the following inputs to the Country, or the State or Province, that corresponds to each data point:
- A pair of latitude/longitude variables
- A single variable containing IP addresses
In the example below, we have latitude and longitude coordinates stored in two variables.
- Select the latitude and longitude variables from the Data Sources tree.
- Hover and click + > Ready-Made New Variables > Automatically Combine Categories > By Geography > World > To States/Provinces.
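One common way to assign a latitude/longitude pair to a state or province is a point-in-polygon test against each boundary. The sketch below uses the standard ray-casting test; it is not a description of how Displayr does this. The `STATE_POLYGONS` table is a hypothetical stand-in that approximates two nearly rectangular US states with four corner points each, and `coordinates_to_state` is a made-up helper name.

```python
# Approximate boundaries as (longitude, latitude) corner points.
# These rectangles are rough assumptions for illustration only.
STATE_POLYGONS = {
    "Colorado": [(-109.05, 37.0), (-102.05, 37.0), (-102.05, 41.0), (-109.05, 41.0)],
    "Wyoming": [(-111.05, 41.0), (-104.05, 41.0), (-104.05, 45.0), (-111.05, 45.0)],
}


def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: count how many edges a ray heading east crosses."""
    inside = False
    count = len(polygon)
    for i in range(count):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % count]
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def coordinates_to_state(latitude, longitude):
    """Return the first state whose polygon contains the point, else None."""
    for state, polygon in STATE_POLYGONS.items():
        if point_in_polygon(longitude, latitude, polygon):
            return state
    return None


latitudes = [39.74, 41.14, 0.0]
longitudes = [-104.99, -104.82, 0.0]
states = [coordinates_to_state(lat, lon) for lat, lon in zip(latitudes, longitudes)]
```

The two input lists play the role of the latitude and longitude variables: each row's pair of values yields one category in the combined output, and points outside every known boundary are left unassigned.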
Options
The following options appear in the Object Inspector of the resulting variable, which has "Combined" in its name.
Variable: The input variable containing geographic data to be combined into categories.
Combine by: Use this control to toggle between the other methods for combining categories.
World region: The geographic region that the input data/variable comes from.
Input data type: The type of data/geographic unit, such as States, Postcodes, or Place (city, town, etc.), that the input variable contains.
Output geographic type: The desired geographic unit to combine the input data into. Must be a larger type than Input data type; e.g. it is possible to map U.S. counties to U.S. states, but not the other way around.
Check spelling: If this option is selected, approximate matching using the Levenshtein distance is performed instead of requiring exact matches when looking up the input data values in the regional database.
Check neighboring region: Select this option if the input data also includes locations from a region adjacent to the one specified by World region. For example, with World region set to USA and this option selected, matches for the input data will also be looked for within Canada.
Supplementary variable: Only shown when Input data type is Place (city, town, etc.). Use this dropdown to supply an additional variable with geographic information (such as state or region) to disambiguate place names in the input data that could represent multiple distinct locations in the region.
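To give a feel for what the Check spelling option does, the sketch below computes the Levenshtein distance (the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another) and uses it to pick the closest known name. The `closest_match` helper and its `max_distance` threshold are assumptions made for illustration; the threshold Displayr actually applies is not stated here.

```python
def levenshtein(a, b):
    """Edit distance between two strings, computed one row at a time."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def closest_match(value, known_names, max_distance=2):
    """Return the known name nearest to value, or None if none is close enough.

    max_distance is a hypothetical cut-off chosen for this example.
    """
    best_name, best_distance = None, max_distance + 1
    for name in known_names:
        distance = levenshtein(value.lower(), name.lower())
        if distance < best_distance:
            best_name, best_distance = name, distance
    return best_name


known_states = ["California", "Colorado", "Connecticut"]
match = closest_match("Califronia", known_states)
```

Here the misspelling "Califronia" is two substitutions away from "California", so it is matched, whereas a value far from every known name returns no match rather than a wrong one.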
Next
How to Create a Geographic Map
How to Automatically Combine Categories - By Value
How to Automatically Combine Categories - By Pattern (CHAID)