Geographic data comes in various granularities, such as zip code, state, or country. Sometimes your data is more granular than what you want to show in your analysis. For example, you might have data at the zip code level but want to analyze it by state. Displayr can automatically combine smaller geographic regions into larger ones, saving you the time of mapping them manually.
This article describes how you can automatically combine smaller geographic categories like zip codes or postal codes into larger ones like regions or states.
Requirements
Please note these steps require a Displayr license.
A dataset which contains geographic variables.
Currently, the regions that are supported are:
- United States
- Canada
- United Kingdom
- Europe
- Australia
- New Zealand
Options are also available if the user has data from two adjacent regions (e.g. if doing a multi-country study):
- United States and Canada
- Europe (including UK)
- Australia and New Zealand
The output can be one of several geographic designations, including states, provinces, regions, counties, and countries. The available designations depend on how each region defines its geographic levels. Note that the geographic map visualization does not support all aggregations; see How to Access an Exhaustive List of Geographic Entities Available for Geographic Maps.
Method
Use Case 1 - Combining zip codes, postcodes, and other unambiguous geographies like state
When the selected input variable is unambiguous, you only need to run the option from the menu. Some geographic names, such as cities, are ambiguous (they can refer to more than one place); for these, see Use Case 2.
In this example, we will combine UK postcodes into counties:
- Select the variable you want to aggregate (in this case postcode) and click the plus sign to the right of the variable
- From the Insert Variable(s) menu, select Ready Made New Variables > Automatically Combine Categories > By Geography > United Kingdom > To Counties.
The results are as follows:
(As an alternative, use the toolbar and navigate to Data > Variables > New > Ready Made New Variables > Automatically Combine Categories > By Geography > United Kingdom > To Counties.)
- To view the results, drag the original variable and the new variable onto the same page.
- At this point, you might decide that counties aren't very useful and that you would prefer regions instead. To switch, select the new variable you created in the Data Sets tree.
- In the Object Inspector, go to Data > Automatically Combine Categories and open the Output geographic type menu.
- Change the selection to Region. There are other possible selections as well.
The results are as follows:
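As a conceptual illustration only (this is not Displayr's implementation), combining an unambiguous geography such as a postcode into a county, and then into a region, can be thought of as a chain of table lookups. The sketch below assumes two tiny, hypothetical lookup tables with a few sample rows and a hypothetical `combine` helper; a real tool would rely on a complete postcode database.

```python
# Illustrative sample rows only; a real lookup would cover every postcode district.
OUTWARD_TO_COUNTY = {
    "OX1": "Oxfordshire",
    "CB2": "Cambridgeshire",
    "NR1": "Norfolk",
}
COUNTY_TO_REGION = {
    "Oxfordshire": "South East",
    "Cambridgeshire": "East of England",
    "Norfolk": "East of England",
}


def outward_code(postcode: str) -> str:
    """Return the outward part of a UK postcode, e.g. 'OX1' from 'ox1 2jd'."""
    cleaned = postcode.strip().upper().replace(" ", "")
    # The inward part of a UK postcode is always the final three characters.
    return cleaned[:-3]


def combine(postcodes, output_type="county"):
    """Map each postcode to a county or region; unmatched values become None."""
    result = []
    for postcode in postcodes:
        county = OUTWARD_TO_COUNTY.get(outward_code(postcode))
        if county is None:
            result.append(None)
        elif output_type == "county":
            result.append(county)
        else:
            result.append(COUNTY_TO_REGION.get(county))
    return result


sample = ["OX1 2JD", "cb2 1tn", "ZZ9 9ZZ"]
counties = combine(sample, output_type="county")
regions = combine(sample, output_type="region")
print(counties)
print(regions)
```

Switching `output_type` from `"county"` to `"region"` mirrors changing the Output geographic type selection: the input stays the same and only the level of aggregation changes.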
Use Case 2 - Combining ambiguous place names
Some geographic names can refer to more than one place. For example, there are multiple places called “Brooklyn” in the United States. It is impossible for the software to know exactly which “Brooklyn” is which unless the user provides some additional, unambiguous information. For example, if combining place names from the US into counties, the user could supply an additional variable telling us what State each place is in. Then the places could be mapped to counties. The feature will detect if there is ambiguity in the data the user has selected, and it will prompt the user to select an additional variable to disambiguate the places.
For example, assume you want to combine United States cities into counties.
- Select city from the Data Sets tree and click the plus sign to the right of the variable
- From the Insert Variable(s) menu, select Ready Made New Variables > Automatically Combine Categories > By Geography > United States > To Counties.
- When prompted to select an additional variable to disambiguate the places, click Yes.
- Click State:State and then OK. This variable is used to identify the correct location for each city in cases of ambiguity.
The results are as follows:
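To illustrate why a supplementary variable resolves the ambiguity, here is a minimal sketch, assuming a tiny hypothetical table of places and hypothetical helpers `is_ambiguous` and `to_county`. It is not how Displayr implements the feature; it only shows that a place name alone can match several counties, while a place name plus a state matches at most one.

```python
from collections import defaultdict

# Tiny hypothetical sample: (place, state, county).
PLACES = [
    ("Brooklyn", "New York", "Kings County"),
    ("Brooklyn", "Connecticut", "Windham County"),
    ("Albany", "New York", "Albany County"),
]

# Index: place name -> {state -> county}
by_place = defaultdict(dict)
for place, state, county in PLACES:
    by_place[place.lower()][state.lower()] = county


def is_ambiguous(place: str) -> bool:
    """True when the place name occurs in more than one state in the table."""
    return len(by_place.get(place.lower(), {})) > 1


def to_county(place: str, state=None):
    """Return the county for a place, using the state to disambiguate if given."""
    candidates = by_place.get(place.lower(), {})
    if not candidates:
        return None
    if state is not None:
        return candidates.get(state.lower())
    if len(candidates) == 1:
        return next(iter(candidates.values()))
    return None  # ambiguous without a supplementary state value


print(is_ambiguous("Brooklyn"))
print(to_county("Brooklyn"))
print(to_county("Brooklyn", "New York"))
```

In this sketch, an ambiguous place with no state is left unmatched rather than guessed, which is the situation in which the feature asks for an additional variable.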
Use Case 3 - World Region
The World section of the menu is not limited to any specific region or country, but it is limited in the type of data it can use.
The World section can map either of the following inputs:
- A pair of latitude/longitude variables
- A single variable containing IP addresses
The output is either the Country that corresponds to each data point or its State or Province.
- Select both latitude and longitude from the Data Sets tree and click the plus sign to the right of the variables
- From the Insert Variable(s) menu, select Ready Made New Variables > Automatically Combine Categories > By Geography > World > To States/Provinces.
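For readers curious about what mapping an IP address to a country involves, the following is a rough sketch using Python's standard `ipaddress` module. The network table is entirely made up: the ranges are reserved documentation ranges, and the names "Country A" and "Country B" are placeholders. The source does not describe the database Displayr uses, so treat this as an assumption-laden illustration of range lookup, not the actual method.

```python
import ipaddress

# Made-up table. These ranges are reserved for documentation (RFC 5737)
# and do not correspond to real country allocations.
NETWORK_TO_COUNTRY = [
    (ipaddress.ip_network("192.0.2.0/24"), "Country A"),
    (ipaddress.ip_network("198.51.100.0/24"), "Country B"),
]


def ip_to_country(value: str):
    """Return the country whose network contains the address, or None."""
    try:
        address = ipaddress.ip_address(value.strip())
    except ValueError:
        return None  # not a valid IP address
    for network, country in NETWORK_TO_COUNTRY:
        if address in network:
            return country
    return None  # valid address, but not covered by the table


print(ip_to_country("192.0.2.17"))
print(ip_to_country("not an ip"))
```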
Options
Variable: The input variable containing geographic data to be combined into categories.
Combine by: Use this control to switch to the other methods for combining categories in the Automatically Combine Categories menu, such as By Value > Tidy Categories.
World region: The geographic region that the input data/variable comes from.
Input data type: The type of data/geographic unit, such as States, Postcodes, or Place (city, town, etc.), that the input variable contains.
Output geographic type: The desired geographic unit to combine the input data into. Must be a larger type than Input data type; e.g. it is possible to map U.S. counties to U.S. states, but not the other way around.
Check spelling: If this option is selected, approximate matching using the Levenshtein distance is performed instead of requiring exact matches when looking up the input data values in the regional database.
Check neighboring region: Select this option if the input data also comes from a region other than the one specified by World region. For example, with World region set to USA and this option selected, matches for the input data will also be looked for within Canada.
Supplementary variable: Only shown when Input data type is Place (city, town, etc.). Use this drop-down to supply an additional variable with geographic information (such as state or region) to disambiguate place names in the input data that could represent multiple distinct locations in the region.
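The Check spelling option relies on the Levenshtein distance, which counts the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another. The sketch below shows a standard dynamic-programming version and a hypothetical `closest_match` helper. The `max_distance` threshold of 2 and the sample list of names are assumptions for illustration; the source does not state what threshold Displayr applies.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def closest_match(value: str, known_names, max_distance: int = 2):
    """Return the nearest known name within max_distance edits, else None."""
    best_name, best_distance = None, max_distance + 1
    for name in known_names:
        distance = levenshtein(value.lower(), name.lower())
        if distance < best_distance:
            best_name, best_distance = name, distance
    return best_name


STATES = ["California", "Colorado", "Connecticut"]  # sample names only
print(closest_match("Califronia", STATES))
print(closest_match("Xyz", STATES))
```

A small threshold keeps obvious typos matchable while leaving clearly unrelated values unmatched, which is the trade-off any approximate lookup has to make.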
Next
How to Create a Geographic Map
How to Automatically Combine Categories - By Value
How to Automatically Combine Categories - By Pattern (CHAID)
How to Recode into Existing or New Variables
How to Recode Variables Using Category Midpoints
How to Recode High Values (Capping) in Numeric Variables
How to Recode Low Values (Capping) in Numeric Variables
How to Recode Numeric Variable(s) from Code/Category Midpoints