Utah/UtahBuildingsImport
Utah Buildings Import is an import of the Utah AGRC dataset which contains building footprints and address information covering sections of the state of Utah. If you'd like to help with this import, message OSM user osmjwh to get started.
Goals
Improve address and building footprint coverage in Utah
Schedule
2017-11 to 2017-12: Planning stage
2018-01 to ??: Manual implementation and minor ongoing updates to processing scripts
Import Data
Background
The data is sourced from the Utah AGRC, which is a publicly funded governmental dataset. This import focuses primarily on the Building and Address datasets.
Data source site: https://gis.utah.gov/data/location/
Data license: No license (explicit permission has been given to use the data in OSM)
Type of license (if applicable): n/a (No license has been selected yet)
Link to permission (if required): https://lists.openstreetmap.org/pipermail/imports-us/2018-January/000836.html
OSM attribution (if required): n/a
ODbL Compliance verified: n/a
Import Details
Import Type
One-time import, done in many separate uploads using manual extraction from QGIS, script-assisted processing in JOSM, and a manual review and upload.
Team Approach
OSM user osmjwh plans to lead the effort, but others are absolutely welcome to help.
Conflation Plan
At this time, we plan to have a manual conflation strategy, where imports are kept to a small enough size that identifying any duplicated features is easy. When duplicated features are found, the import will be compared to the existing feature, and the one with a more detailed footprint will be kept. All tags will be merged so nothing is lost from the deletion of the less detailed feature, and any conflicting tags will be investigated for quality and the higher quality one will be taken.
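The conflation rule above can be sketched in a few lines. This is only an illustration, not part of the import scripts: the feature structure (a dict with `nodes` and `tags`) and the use of node count as a stand-in for "more detailed footprint" are assumptions made for the example. Conflicting tags are reported rather than resolved, since the plan calls for investigating them by hand.

```python
def conflate(existing, imported):
    """Keep the more detailed footprint and merge the tags of both features.

    Each feature is a dict with 'nodes' (list of coordinates) and 'tags' (dict).
    Assumption: the feature with more nodes is the more detailed one.
    Returns (kept_feature, conflicts), where conflicts maps a tag key to
    (existing_value, imported_value) for manual investigation.
    """
    if len(imported["nodes"]) > len(existing["nodes"]):
        keep, drop = imported, existing
    else:
        keep, drop = existing, imported

    merged = dict(keep["tags"])
    conflicts = {}
    for key, value in drop["tags"].items():
        if key not in merged:
            # Tag only exists on the dropped feature: carry it over.
            merged[key] = value
        elif merged[key] != value:
            # Both features have the key with different values: flag it.
            conflicts[key] = (existing["tags"][key], imported["tags"][key])
    return {"nodes": keep["nodes"], "tags": merged}, conflicts
```

The kept feature temporarily holds its own value for any conflicting key; the mapper then decides which value is of higher quality.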
OSM User Requirement
These imports must be made using the UtahBuildingsImport OSM user. If interested in contributing, contact osmjwh to get credentials for this user.
Data Processing Scripts
A large portion of the data processing is scripted, using the JOSM/Plugins/Scripting and the scripts found at the OSM Utah Buildings Imports code repository.
Changeset Tags
Changesets will be tagged with the following tags:
- comment=UtahBuildingsImport - Include a description of what area is being imported here
- source=Utah AGRC
- import=yes
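As a quick reference, the changeset tags listed above can be written as a small helper. The function name `changeset_tags` and its parameter are hypothetical; only the tag keys and values come from the list.

```python
def changeset_tags(area_description):
    """Build the changeset tags for one upload.

    area_description is free text naming the area being imported.
    """
    return {
        "comment": f"UtahBuildingsImport - {area_description}",
        "source": "Utah AGRC",
        "import": "yes",
    }
```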
QA
We plan to QA using a peer review method.
Workflow
Before Starting
Make sure you have the following installed:
The following plugins are optional, but very helpful:
- JOSM/Plugins/Scripting if you would like to use the automated scripts.
- JOSM/Plugins/utilsplugin2 eases splitting building footprints on objects like duplexes.
Data Extraction/Preparation
The dataset will be processed in small subsets of roughly 10-30 city blocks. QGIS is used to extract this subset from the AddressPoints.shp and Buildings.shp files using the following process for each file. Note that the same area should be extracted for both files.
- Open the shapefile in QGIS
- Select the entities to upload
- It is usually easier to orient yourself if you have aerial imagery, available by downloading the OpenLayers plugin for QGIS.
- Copy the selected entities (for addresses, this may take a while)
- Paste as new vector layer (Edit -> Paste Features As... -> New Vector Layer)
- Save this new layer as ESRI shapefile
Then, open both of the new ESRI shapefiles in JOSM using the opendata plugin, by following the instructions here.
Example
See *.shp file examples of extracted Address and Building datasets here
Data Processing
Buildings Shapefile
After preparing the data as specified in the "Data Extraction" section, remove all tags from the Buildings layer. Select all items in this layer, de-select all nodes, and assign all the polygons the tag of building=yes.
If present, process the Type tag as appropriate based on knowledge of OSM tags to further refine the building tag value.
This process is automated by running the "UtahBuildingsImport_Buildings.js" script from the OSM Utah Buildings Imports code repository.
Addresses Shapefile
Process the address points layer by converting the following AGRC tags to the relevant OSM tags:
Shapefile Attribute | OSM Tag | Description |
---|---|---|
AddNum | addr:housenumber=* | The house number of the address. |
City | addr:city=* | The city of the address in all caps. Decapitalize prior to assigning to the OSM tag. |
ParcelID | utahagrc:parcelid=* | The Utah AGRC parcel ID number |
PtLocation | name=* | The common name of the address in all caps. This is not present on most address points. |
State | addr:state=* | The US State of the address. These should all be 'UT'. |
ZipCode | addr:postcode=* | The postcode of the address |
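The attribute mapping in the table above could look roughly like the following. This is a sketch and not the repository's "UtahBuildingsImport_Addresses.js" script. It assumes each address point is available as a plain dict of shapefile attributes, and it assumes PtLocation should be converted from all caps in the same way as City. Note that `str.title()` is a simple approach that mis-cases some names (for example, names containing apostrophes), so results still need review.

```python
# Shapefile attribute -> OSM tag key, taken from the table above.
ATTRIBUTE_TO_OSM = {
    "AddNum": "addr:housenumber",
    "City": "addr:city",
    "ParcelID": "utahagrc:parcelid",
    "PtLocation": "name",
    "State": "addr:state",
    "ZipCode": "addr:postcode",
}

# Attributes delivered in all caps that are converted to title case.
# Including PtLocation here is an assumption.
TITLE_CASED = {"City", "PtLocation"}


def convert_address(attrs):
    """Translate one address point's shapefile attributes into OSM tags."""
    tags = {}
    for attr, key in ATTRIBUTE_TO_OSM.items():
        value = str(attrs.get(attr) or "").strip()
        if not value:
            continue  # attribute missing or empty: emit no tag
        tags[key] = value.title() if attr in TITLE_CASED else value
    return tags
```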
Furthermore, keep the following tags (if present) to help in processing street/location names:
Shapefile Attribute | Description |
---|---|
PrefixDir | The street prefix in N, S, E, or W, representing North, South, East, or West. For example, for 123 S 4500 E, PrefixDir=S. Only present where applicable. |
PtType | Denotes the type of building. Common values are 'Residential', 'Commercial', etc. |
StreetName | The name of the street in all caps. This does not include the street type. For example, for Abc Ave, StreetName=ABC. |
StreetType | The type of the street in all caps. For example, for Abc Ave, StreetType=AVE. Options include 'AVE', 'CIR', 'DR', 'ST', 'WAY', etc. |
SuffixDir | The street suffix in N, S, E, or W, representing North, South, East, or West. For example, for 123 S 4500 E, SuffixDir=E. Only present where applicable. |
To process the street names, follow the steps below:
- For each street in the dataset, find all entities in the layer that have a StreetName matching it. Combine the StreetName and StreetType or SuffixDir tags to create the addr:street OSM tag.
  - For named streets (e.g. Wilson Ave), the StreetType tag contains the type of street (St, Ave, Cir, etc). For example, for Wilson Ave, StreetName = Wilson and StreetType = Ave. If there are multiple street types for the same name (e.g. Wilson Ave and Wilson Ct), make sure to include the StreetType in your entity query.
  - For numbered streets (e.g. 700 East), the SuffixDir tag contains the direction. For example, for 700 East, StreetName = 700 and SuffixDir = E.
- Assign the selection an addr:street OSM tag according to the tagging above. This will require decapitalizing the street name and fully spelling out the street type or direction.
- Remove all points with duplicate addresses, especially those with a PtType="BASE ADDRESS" tag.
The process above is automated by running the "UtahBuildingsImport_Addresses.js" script from the OSM Utah Buildings Imports code repository.
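For readers who want to see the street-name rule in isolation, here is a minimal sketch of building the addr:street value. It is not the repository script. The abbreviation table covers only the street types mentioned on this page, unknown types fall back to title case, and preferring StreetType over SuffixDir when both are present is an assumption.

```python
# Expansions for the street types mentioned on this page; extend as needed.
STREET_TYPES = {
    "AVE": "Avenue",
    "CIR": "Circle",
    "CT": "Court",
    "DR": "Drive",
    "ST": "Street",
    "WAY": "Way",
}
DIRECTIONS = {"N": "North", "S": "South", "E": "East", "W": "West"}


def build_street(attrs):
    """Return the addr:street value for one address point, or None."""
    name = (attrs.get("StreetName") or "").strip()
    street_type = (attrs.get("StreetType") or "").strip().upper()
    suffix = (attrs.get("SuffixDir") or "").strip().upper()
    if not name:
        return None
    if street_type:
        # Named street, e.g. WILSON + AVE -> "Wilson Avenue"
        expanded = STREET_TYPES.get(street_type, street_type.title())
        return f"{name.title()} {expanded}"
    if suffix:
        # Numbered street, e.g. 700 + E -> "700 East"
        return f"{name.title()} {DIRECTIONS.get(suffix, suffix)}"
    return name.title()
```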
Next, merge the Address and Building layers together.
Merging AGRC Shapefiles
Using the "BuildingsTools" JOSM plugin, merge the address points into the building shapes (Data -> Merge Address Points). This will only merge the addresses if there is a single address inside of the building footprint. You will likely run into the following issues:
- The address point lies just outside of a building footprint. In this case, just move the address inside of the building footprint and merge the address points again.
- A building has 3 address points with the same house number within the same building. Just delete two of these address points and merge the address points again.
- A building has 2 address points with different house numbers within the same building. If this is a house, it is likely a duplex. You may choose to split the house into two separate shapes, each with the relevant house number.
- A commercial building has many address points. You may choose to split the commercial building into smaller spaces, or leave the address points unmerged depending on the building configuration. It may be easier to merge the data into any existing OSM nodes if the addresses are not merged into the buildings.
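The cases above can be told apart by counting the address points that fall inside each footprint. The sketch below only illustrates that case analysis; it does not describe how the JOSM plugin works internally. The data layout (footprints as vertex lists, address points as `(x, y, housenumber)` tuples) and the category labels are assumptions made for the example.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test. polygon is a list of (x, y) vertices, first vertex not repeated."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def classify_building(footprint, address_points):
    """Classify a footprint by the address points that fall inside it.

    address_points is a list of (x, y, housenumber) tuples.
    """
    inside = [hn for (x, y, hn) in address_points
              if point_in_polygon((x, y), footprint)]
    if not inside:
        return "no address"          # check for a point just outside
    if len(inside) == 1:
        return "merge"               # single address: safe to merge
    if len(set(inside)) == 1:
        return "duplicate points"    # same house number repeated
    return "multiple addresses"      # duplex or multi-tenant building
```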
Processing Merged AGRC Data
Process the PtType and UnitType tags using the following guidance:
- If the UnitType tag is "APT", adjust building=yes to building=apartments.
- If the PtType tag is "Commercial", adjust building=yes to building=commercial.
- If the PtType tag is "Residential", and the building tag hasn't been adjusted to apartments in the steps above, apply the tag building=house.
The process above is automated by running the "UtahBuildingsImport_Merged.js" script from the OSM Utah Buildings Imports code repository.
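The three rules and their order of precedence can be expressed as a short function. This is a sketch rather than the repository script; it assumes the merged object's tags are available as a dict and compares attribute values case-insensitively, which is an assumption about the data.

```python
def refine_building(tags):
    """Return a copy of tags with building=yes refined from UnitType/PtType."""
    result = dict(tags)
    if result.get("building") != "yes":
        return result  # only generic footprints are adjusted

    unit_type = str(tags.get("UnitType") or "").upper()
    pt_type = str(tags.get("PtType") or "").upper()

    if unit_type == "APT":
        result["building"] = "apartments"
    elif pt_type == "COMMERCIAL":
        result["building"] = "commercial"
    elif pt_type == "RESIDENTIAL":
        # Reached only if the building was not already set to apartments.
        result["building"] = "house"
    return result
```

Objects that match none of the rules keep building=yes and are handled during review.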
Review
Next, review the remaining footprints with building=yes.
- There may be some larger public buildings, like churches or libraries, so tag them accordingly.
- Some of these may be sheds, in which case apply the building=shed tag.
- Most of the remaining entities are garages, so apply the building=garage tag.
Using an aerial imagery underlay that has been calibrated using GPS traces, adjust the building entities so that they match the aerial imagery footprints.
If using automated scripts, review the objects that have the "utahagrc:review" tag. As these objects are reviewed and tagging is corrected, remove the "utahagrc:review" tag.
At this point, remove all AGRC tagging that was not removed earlier in the process. All that should remain are name, building, and various addr tags.
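A minimal sketch of this final cleanup, assuming an object's tags are held in a dict: keep name, building, and any key starting with `addr:`, and drop everything else (including the leftover shapefile attributes and any utahagrc:* keys).

```python
def strip_agrc_tags(tags):
    """Keep only name, building, and addr:* tags."""
    return {
        key: value
        for key, value in tags.items()
        if key in ("name", "building") or key.startswith("addr:")
    }
```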
Example
See a *.osm file example of a fully processed dataset (the same as was extracted above) here
Data Merging
Finally, data should be merged into a downloaded OSM layer by selecting both layers and clicking 'Merge'. Then upload this layer and tag the changeset as specified in the Import Details section above.
Note that uploads of large changesets can take a long time. To speed them up, consider adjusting the upload settings so that a larger number of objects (5000 or so) is sent at once in a single changeset.
Warning about canceling uploads
DO NOT CANCEL IN THE MIDDLE OF THE UPLOAD! The data uploaded so far will be committed to OSM, but the remainder will not. If you simply try to upload all of the data again, it will cause a large number of upload errors because duplicates will be detected. At this point, you have a few options:
- Revert your changeset and re-upload your work (not especially easy or clean)
- Figure out what was committed by your change and remove it from either your upload or OSM.
Even after this, you will likely deal with duplicated objects, for which the verifier can be a big help.
Basically, it turns into a huge mess so just don't cancel your uploads halfway through.