Chatham County, North Carolina Address Import
Completed
This import has now been completed.
Goals
To add every missing address in Chatham County to OpenStreetMap without creating duplicates.
Schedule
The data has already been converted to OSM format, and all duplicates of addresses already in OSM have been removed. It can be found here. The import will take place on July 21 and 22, 2018. The data will be uploaded in 5 chunks of fewer than 10,000 addresses each, so that any problem is confined to a single chunk and is easy to revert.
Import Data
Background
Data source site: Chatham County Open Data Portal
Data license: Public domain, with no direct resale allowed. (You can't download the data and sell it, but adding it to OSM and then selling OSM data as a whole is allowed.)
Link to permission: none
ODbL Compliance verified: Yes, via direct communication with the GIS department.
OSM Data Files
Already transformed data that will be imported.
Import Type
One-time import, completed in five chunks with JOSM.
Data Preparation
Data Reduction & Simplification
Tagging Plans
Each address will carry "source:addr"="Chatham County Open Data" as its source tag, along with addr:housenumber, addr:street, addr:city, addr:state, and addr:postcode. addr:unit will be added to addresses that include a unit. addr:country will not be used.
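As an illustration of the tagging plan above, here is a minimal Python sketch that assembles the tags for one address. The function name `address_tags` and the sample values are hypothetical; only the tag keys and the source:addr value come from this page.

```python
def address_tags(housenumber, street, city, state, postcode, unit=None):
    """Build the tag set for a single address node (illustrative helper)."""
    tags = {
        "addr:housenumber": housenumber,
        "addr:street": street,
        "addr:city": city,
        "addr:state": state,
        "addr:postcode": postcode,
        "source:addr": "Chatham County Open Data",
    }
    # addr:unit is only added when the address actually has a unit.
    if unit:
        tags["addr:unit"] = unit
    # addr:country is deliberately never set.
    return tags


# Made-up sample values, for illustration only.
example = address_tags("100", "North Columbia Street", "Chapel Hill", "NC", "00000", unit="B")
print(example)
```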
Changeset Tags
"source"="Chatham County Open Data Portal", "source:website"="https://gisopendata-chathamncgis.opendata.arcgis.com/datasets/7f7881ca05a94fa1ac6fe618e84fb725_0", and "source:date"="July 5th, 2018" will be used on the changeset. Adequate comments such as "Imported addresses in Chatham County https://wiki.openstreetmap.org/wiki/Chatham_County,_North_Carolina_Address_Import (3/5)" will also be used.
Data Transformation
I have already completed all of the necessary data transformation using a combination of JOSM, the opendata plugin, and a custom-made Java program for editing XML.
- First, I downloaded the data in Shapefile format from Chatham County Open Data.
- Then, I opened the data with JOSM using the opendata plugin.
- I then saved the file.
- Then, I downloaded every object with "addr:housenumber" and "addr:street" in Chatham County using the Overpass API. I saved that data as a file as well.
- I then ran my address-editing program on the new data and the existing data. The program corrected casing (CHAPEL HILL -> Chapel Hill), built addr:street from the source fields ("streetDirection"="N", "streetName"="COLUMBIA", and "streetType"="ST" -> "addr:street"="North Columbia Street"), and removed duplicates by comparing the dataset with the addresses already in the OSM database.
- I opened the file created by the program with JOSM and did some final modifications.
- Finally, I saved the ready-for-upload files on Google Drive so that anyone could view them.
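The casing and addr:street steps described above can be sketched as follows. This is a Python illustration, not the Java program itself; the lookup tables and the function names `title_case` and `build_street` are assumptions, and the real program may cover more abbreviations and special cases.

```python
# Assumed abbreviation tables; the county data may use other codes as well.
DIRECTIONS = {"N": "North", "S": "South", "E": "East", "W": "West"}
STREET_TYPES = {
    "ST": "Street", "RD": "Road", "AVE": "Avenue",
    "DR": "Drive", "LN": "Lane", "CT": "Court",
}


def title_case(text):
    """Convert 'CHAPEL HILL' to 'Chapel Hill' (naive, word by word)."""
    return " ".join(word.capitalize() for word in text.split())


def build_street(direction, name, street_type):
    """Combine the three source fields into a single addr:street value."""
    parts = [
        DIRECTIONS.get(direction.strip().upper(), title_case(direction)),
        title_case(name),
        STREET_TYPES.get(street_type.strip().upper(), title_case(street_type)),
    ]
    # Skip empty fields, e.g. streets without a direction prefix.
    return " ".join(part for part in parts if part)


print(title_case("CHAPEL HILL"))            # Chapel Hill
print(build_street("N", "COLUMBIA", "ST"))  # North Columbia Street
```

Word-by-word capitalization is a simplification: names such as "MCDONALD" or route numbers such as "US 64" would need extra rules.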
Data Transformation Results
Data Merge Workflow
Team Approach
I will be working on this import alone. It is very simple because it only involves adding nodes to the database. All merging with existing objects will be done after upload. Since Chatham County is almost blank on OpenStreetMap, the number of existing POIs and buildings that need to be merged with the addresses is very small. Merging after the fact is a much simpler way to do an import like this.
Workflow
For each of the 5 smaller address OSM files (available on Google Drive):
- Download from Google Drive.
- Open with JOSM.
- Upload. (Each file contains 5,000 to 9,000 addresses, so the 10,000-change limit will not be a problem.)
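A minimal sketch of how a list of addresses could be split so that every file stays under 10,000 entries. The even split and the name `split_into_chunks` are assumptions; the actual files range from 5,000 to 9,000 addresses and may have been divided differently.

```python
def split_into_chunks(items, n_chunks=5, max_size=10_000):
    """Split items into at most n_chunks lists, each shorter than max_size."""
    if not items:
        return []
    size = -(-len(items) // n_chunks)  # ceiling division
    if size >= max_size:
        raise ValueError("chunks would reach the size limit; use more chunks")
    return [items[i:i + size] for i in range(0, len(items), size)]


# Placeholder data: integers stand in for address records.
chunks = split_into_chunks(list(range(35_000)))
print([len(chunk) for chunk in chunks])
```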
I will be using the account LeifRasmussen_import to do the import. This should make reverting changes easier in the unlikely event of a screw-up.
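Assuming each chunk is uploaded as its own changeset (as the "(3/5)" in the example comment under Changeset Tags suggests), the changeset tags could be generated like this. The function name `changeset_tags` is hypothetical; `comment` is the standard key for a changeset comment.

```python
WIKI_PAGE = (
    "https://wiki.openstreetmap.org/wiki/"
    "Chatham_County,_North_Carolina_Address_Import"
)


def changeset_tags(chunk_number, total_chunks=5):
    """Return the changeset tags for one chunk of the import."""
    return {
        "comment": (
            f"Imported addresses in Chatham County {WIKI_PAGE} "
            f"({chunk_number}/{total_chunks})"
        ),
        "source": "Chatham County Open Data Portal",
        "source:website": (
            "https://gisopendata-chathamncgis.opendata.arcgis.com/"
            "datasets/7f7881ca05a94fa1ac6fe618e84fb725_0"
        ),
        "source:date": "July 5th, 2018",
    }


for number in range(1, 6):
    print(changeset_tags(number)["comment"])
```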
Conflation
My data transformation Java program automatically removes duplicates of existing addresses from the dataset. Conflation will not be a problem.
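The duplicate removal could look roughly like the sketch below. It is a Python illustration rather than the Java program, and the matching rule (house number, street, and unit, compared case-insensitively) is an assumption; this page does not specify how the program compares addresses.

```python
def address_key(tags):
    """Normalize the fields used to decide whether two addresses match."""
    return (
        tags.get("addr:housenumber", "").strip().lower(),
        tags.get("addr:street", "").strip().lower(),
        tags.get("addr:unit", "").strip().lower(),
    )


def remove_existing(new_addresses, existing_addresses):
    """Keep only new addresses whose key is not already present."""
    seen = {address_key(tags) for tags in existing_addresses}
    kept = []
    for tags in new_addresses:
        key = address_key(tags)
        if key not in seen:
            seen.add(key)  # also drops repeats within the new data
            kept.append(tags)
    return kept


# Made-up sample records, for illustration only.
existing = [{"addr:housenumber": "12", "addr:street": "North Columbia Street"}]
new = [
    {"addr:housenumber": "12", "addr:street": "North Columbia Street"},
    {"addr:housenumber": "14", "addr:street": "North Columbia Street"},
]
print(remove_existing(new, existing))
```

Matching on tags alone ignores location, so two different streets with the same name would be treated as one; a real comparison might also check distance between the points.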