Santa Clara County, California/San Jose building import
This is an in-progress import of municipal building and address data throughout the city of San José, California.
Source
The City of San José makes several GIS layers available as shapefiles. Basemap contains parcel outlines, while Basemap_2 contains address points, building footprints, and condominium outlines. The data comes projected in a mix of NAD 1983/CORS96 California state plane 3 (ESRI:103240) and NAD83 California zone 3 (EPSG:2227). Basemap_2 is the same shapefile from which San José sidewalks were imported in 2017 and 2018. The buildings are generally from 2006.
Parcel and condo outlines will not be imported; instead, the import script uses them to cross-reference addresses to buildings where possible. Building footprints include height and elevation information. Address points include full address information as well as unit type, "place type," exploded street name, associated parcel, and "Addtl_Loc," which seems to refer to the parcel owner.
The city website claims the data is updated monthly, but we do not plan to do a continuous import.
License
The city's data is not copyrighted because (lacking an exception in statute like those for works of the Department of Toxic Substances Control or works of certain colleges established by statute) "unrestricted disclosure is required".
Preprocessing
start.sh downloads and imports the shapefile data and existing OSM data into a PostgreSQL/PostGIS database, runs merge.sql, and runs ogr2osm once for each traffic analysis zone (TAZ).
merge.sql:
- removes inactive addresses;
- finds and deletes address points that don't have a matching street name in OSM;
- detects and merges groups of address points on grids;
- flags addresses and buildings that intersect existing OSM data;
- on condo parcels where there is only one address or merged address, merges addresses onto the building closest to the address point;
- on small- to medium-sized parcels where there is only one address or merged address that is not a hospital or school, merges addresses onto the building closest to the address point;
- on parcels where there are more buildings than addresses and every address point intersects a building, merges addresses onto intersecting buildings;
- on parcels where all addresses have a common Addtl_Loc value, assigns that value to the parcel; and
- assigns buildings, addresses, and named parcels to the nearest TAZ key.
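The single-address merge steps above can be illustrated with a short sketch. It is not the logic of merge.sql itself, which runs in PostGIS; it is plain Python under simplifying assumptions. The `parcels` structure, the function names, and the use of a vertex-average centroid as the distance target are all hypothetical, and the parcel-size and hospital/school exclusions are left out.

```python
from math import hypot


def vertex_centroid(ring):
    """Average of a ring's vertices: a rough stand-in for a true polygon centroid."""
    xs, ys = zip(*ring)
    return sum(xs) / len(xs), sum(ys) / len(ys)


def merge_single_addresses(parcels):
    """Attach the address to the nearest building on parcels with exactly one address.

    `parcels` is a hypothetical list of dicts, each holding the address points
    and building rings that fall inside one parcel. Returns (merged, leftover):
    merged pairs a building id with address tags; leftover addresses stay as
    standalone points for manual review.
    """
    merged, leftover = [], []
    for parcel in parcels:
        addresses, buildings = parcel["addresses"], parcel["buildings"]
        if len(addresses) != 1 or not buildings:
            # Ambiguous or empty parcel: leave the address points alone.
            leftover.extend(addresses)
            continue
        address = addresses[0]
        ax, ay = address["point"]

        def distance(building):
            cx, cy = vertex_centroid(building["ring"])
            return hypot(cx - ax, cy - ay)

        nearest = min(buildings, key=distance)
        merged.append((nearest["id"], address["tags"]))
    return merged, leftover
```

Parcels with more than one address fall through to `leftover`, mirroring the way the merge script leaves uncertain address points for mappers to resolve by hand.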
basemap.py is an ogr2osm translation filter that maps fields in the shapefile or database to OSM tags.
Some parts of the scripts above are based on the scripts used in the Hamilton County Building Import. The TAZ areas are the same as the ones used in the crossings part of the sidewalk import.
Tag mapping
Building footprints:
Source tag | OSM tag |
---|---|
always | building=yes |
BLDGHEIGHT | height=* |
BLDGELEV | ele=* |
Address points:
Source tag | OSM tag |
---|---|
Inc_Muni | addr:city=* |
Add_Number, AddNum_Suf | addr:housenumber=* |
CompName | addr:street=* |
Unit | addr:unit=* |
Post_Code | addr:postcode=* |
Place_Type | see below |
Place_Type values, feature tag:
Source value | OSM tag |
---|---|
ED | amenity=school |
FB | amenity=place_of_worship |
GO | office=government |
GQ | amenity=social_facility |
HS | amenity=hospital |
HT | tourism=hotel |
RE | club=sport |
RT | amenity=restaurant |
RL | shop=yes |
TR | public_transport=platform |
Place_Type values, building tag:
Source value | OSM tag |
---|---|
BU | building=commercial |
ED | building=school |
FB | building=religious |
GO | building=government |
HS | building=hospital |
HT | building=hotel |
MH | building=static_caravan |
Condominium | building=residential |
MF | building=residential |
RL | building=retail |
RT | building=retail |
SF | building=house |
Named parcels:
Source tag | OSM tag |
---|---|
always | landuse=residential |
Addtl_Loc | name=* |
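As a rough illustration of how the tables above could be expressed in code, here is a minimal Python sketch. It is not the contents of basemap.py and does not use the ogr2osm API; the function and dictionary names are hypothetical. Three choices are assumptions: Add_Number and AddNum_Suf are joined with a space, MF and RT reuse the building value of the row above them, and height and elevation are copied verbatim without any unit conversion.

```python
# Place_Type code -> (key, value) feature tag.
FEATURE_TAGS = {
    "ED": ("amenity", "school"),
    "FB": ("amenity", "place_of_worship"),
    "GO": ("office", "government"),
    "GQ": ("amenity", "social_facility"),
    "HS": ("amenity", "hospital"),
    "HT": ("tourism", "hotel"),
    "RE": ("club", "sport"),
    "RT": ("amenity", "restaurant"),
    "RL": ("shop", "yes"),
    "TR": ("public_transport", "platform"),
}

# Place_Type code -> building=* value.
BUILDING_VALUES = {
    "BU": "commercial", "ED": "school", "FB": "religious",
    "GO": "government", "HS": "hospital", "HT": "hotel",
    "MH": "static_caravan", "SF": "house",
    "Condominium": "residential",
    "MF": "residential",  # assumption: shares the Condominium value
    "RL": "retail",
    "RT": "retail",       # assumption: shares the RL value
}


def _present(value):
    return value not in (None, "")


def address_tags(attrs):
    """Translate one address point's source fields into OSM tags."""
    number_parts = [str(attrs[k]).strip()
                    for k in ("Add_Number", "AddNum_Suf") if _present(attrs.get(k))]
    candidates = {
        "addr:housenumber": " ".join(number_parts),  # assumption: space-separated
        "addr:street": attrs.get("CompName"),
        "addr:unit": attrs.get("Unit"),
        "addr:city": attrs.get("Inc_Muni"),
        "addr:postcode": attrs.get("Post_Code"),
    }
    tags = {k: str(v) for k, v in candidates.items() if _present(v)}
    feature = FEATURE_TAGS.get(attrs.get("Place_Type"))
    if feature:
        key, value = feature
        tags[key] = value
    return tags


def building_tags(attrs, place_type=None):
    """Translate one building footprint; falls back to building=yes."""
    tags = {"building": BUILDING_VALUES.get(place_type, "yes")}
    for field, key in (("BLDGHEIGHT", "height"), ("BLDGELEV", "ele")):
        if _present(attrs.get(field)):
            tags[key] = str(attrs[field])  # copied verbatim, no unit conversion
    return tags
```

Keeping the lookups in dictionaries means a disputed mapping, such as shop=yes for RL, can be changed in one place.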
Issues needing manual resolution
- Some buildings have a negative height.
- The building data is at least two years old; some buildings have been demolished or rebuilt during that time. Check against aerial imagery, and if there is no outline for a new building, remove building=* and add demolished:building=*.
- Occasionally, the merge script will choose the incorrect building to tag with an address, or the data has mismatching parcel information for an address.
- The merge script leaves address points for cases where it isn't sure they can be matched. In cases where there are no conflicts and the assignment is obvious, lone address points should be merged onto buildings.
- Some merged address points include so many unit numbers that the field length exceeds the maximum.
- Addresses tagged as "Retail" are imported as shop=yes. The information is valuable, but shop=yes is considered a tagging mistake and is not rendered in the default tile layer. These should be changed to specific shop types, which can be done after the import.
- Some buildings are split into different pieces in the data set. Ideally, these pieces should be mapped as building:part=*.
- Sometimes it's better to assign an address to a site boundary than a single building, for example hospitals, schools, or apartment complexes. The merge script leaves the address points separate to manually evaluate these cases.
- Some street names in the address data don't exactly match any street names in OSM. Reconciling them needs external validation, so those address points will be saved for after the import.
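A few of the issues above can be detected mechanically before upload. The sketch below is one hedged way a reviewer might flag them; it is not part of the import scripts, and the function name and flag wording are hypothetical. The 255-character figure is the OSM API's limit on tag value length, which is assumed to be "the maximum" referred to above.

```python
MAX_VALUE_LENGTH = 255  # OSM API limit on the length of a tag value


def review_flags(tags):
    """Return a list of human-readable problems found in one feature's tags."""
    flags = []

    # Negative heights come straight from the source data.
    height = tags.get("height")
    if height is not None:
        try:
            if float(height) < 0:
                flags.append("negative height")
        except ValueError:
            pass  # values with units or other text are not checked here

    # Merged address points can accumulate very long unit lists.
    for key, value in tags.items():
        if len(value) > MAX_VALUE_LENGTH:
            flags.append(f"{key} exceeds {MAX_VALUE_LENGTH} characters")

    # Generic retail tag that should become a specific shop type later.
    if tags.get("shop") == "yes":
        flags.append("generic shop=yes")

    return flags
```

An empty list means none of these three checks fired; the feature still needs the manual comparison against aerial imagery described above.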
Workflow
We are using the OSMUS Tasking Manager to distribute the generated tasks of non-conflicting data to volunteers to review and import. Importing conflicting data will be done similarly in a second phase after all non-conflicting data is imported.
Import accounts
- impiooort
- Minh Nguyen_sjmport
- doktorpixel14_import
External links
- Phase 1 tasking manager project: adding new buildings
- Phase 2 tasking manager project: updating preexisting buildings
- Code for San José issue tracking this import