Import/Colorado Addresses
The Colorado Addresses import covers address data for the state of Colorado in the United States.
Goals
- Import addresses from the Colorado Address set into OpenStreetMap, preferring more localized datasets where available. If you know of a dataset that is Public Domain, or whose locality will permit its use in OpenStreetMap (e.g., via a waiver), it still needs to go through the import process, but I am open to using such a dataset instead of the Colorado Address set.
Current Status
Import is in planning
Schedule
- January, 2020 - Checked license of data (Public Domain/permission obtained)
- January, 2020 - Evaluate the data quality
- January, 2020 - Upload data to an OSM-like server for use with MapWithAI (JOSM plugin)
- January, 2020 - Begin the import
Outreach, QA, Review & Feedback
- Imports mailing list TODO
How to Respond
I'll be watching for feedback on:
- the discussion side of this page
- the email thread
Data Source
Statewide
Website: https://data.colorado.gov/, specifically: https://data.colorado.gov/State/Statewide-Aggregate-Addresses-in-Colorado-2019-Pub/n7je-akky
Data license: Public Domain
Type of license (if applicable): Public Domain
OSM attribution (if required): http://wiki.openstreetmap.org/wiki/Contributors#Colorado
ODbL Compliance verified: yes
Mesa County
Website: https://emap.mesacounty.us/DownloadData/ (search for E911 address points)
Data license: TODO check -- I have previously gotten permission to use any of their data in OSM
Type of license (if applicable): TODO check
OSM attribution (if required): http://wiki.openstreetmap.org/wiki/Contributors#Colorado
ODbL Compliance verified: yes
Permission: https://github.com/osmlab/editor-layer-index/blob/gh-pages/sources/north-america/us/co/Mesa_County_Data.pdf
Data Processing Plan
All data manipulation is performed using ogr2osm
Translating Tags
See Scripts.
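Roughly, ogr2osm loads the translation file given on its command line and calls that file's filterTags() function once per source feature, using the returned dictionary as the feature's OSM tags. Below is a minimal sketch of that flow, assuming the statewide translation from the Scripts section is saved as colorado.py; the filename and all attribute values are hypothetical, and ogr2osm's OGR reading and .osm writing are omitted.

import colorado  # the statewide translation from the Scripts section, assumed saved as colorado.py

# One hypothetical feature's attributes, keyed by the dataset's column names
features = [
    {
        "PreDir": "E",
        "AddrNum": "100",
        "NumSuf": "",
        "StreetName": "MAIN",
        "PostType": "ST",
        "PlaceName": "DENVER",
        "STATE": "CO",
        "Zipcode": "80202",
    },
]

for attrs in features:
    tags = colorado.filterTags(attrs)  # ogr2osm does this for every feature
    print(tags)
# -> {'addr:city': 'Denver', 'addr:housenumber': '100', 'addr:state': 'CO',
#     'addr:street': 'East Main Street', 'addr:postcode': '80202'}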
Conflating into OSM
The translated files will be served with https://gitlab.com/smocktaylor/serve_osm_files for use with the JOSM MapWithAI plugin.
Current issues:
- No conflation takes place server-side, so even after all addresses in an area have been added, the service will still provide them (possible duplicates; JOSM's address checker should help catch these).
Manual Review
Same as with the MapWithAI road/building datasets.
Changeset Tags (may change depending on the tool; the source tag, however, should not change)
- comment=Importing Colorado Addresses (only if that is the entire content of the changeset)
- osm_wiki_documentation_page=https://wiki.openstreetmap.org/wiki/Import/Colorado_Addresses (dependent upon contributor remembering to add it)
- source=Statewide Aggregate Addresses in Colorado 2019 (Public) or source=Mesa County GIS E911 Addresses
Sample Data
Post Conflation Data -- note 1525 Lola Court and 1674 Myers Lane
Risks & Known Issues
TODO
Scripts
Colorado
"""
A translation function for Colorado Public Domain address data in ogr2osm
"""
import generic_addresses
PREFIX_DIR = "PreDir"
PREFIX_TYPE = "PreType"
ADDR_HOUSENUMBER = "AddrNum"
ADDR_HOUSENUMBER_SUFFIX = "NumSuf"
STREET_NAME = "StreetName"
STREET_TYPE = "PostType"
POSTFIX_DIR = "PostDir"
ZIPCODE = "Zipcode"
STATE = "STATE"
CITY = "PlaceName"
COUNTY = "County"
UNIT = "UnitNumber"
def filterTags(attrs):
if not attrs:
return
tags = generic_addresses.parseAddressTags(
attrs,
prefix_dir=PREFIX_DIR,
prefix_type=PREFIX_TYPE,
addr_housenumber=ADDR_HOUSENUMBER,
addr_housenumber_suffix=ADDR_HOUSENUMBER_SUFFIX,
unit=UNIT,
street_name=STREET_NAME,
street_type=STREET_TYPE,
postfix_dir=POSTFIX_DIR,
city=CITY,
state=STATE,
zipcode=ZIPCODE,
)
generic_addresses.removeEmpty(tags)
return tags
Mesa County Specific
Running the following script on the January 2020 data set produced this output: https://drive.google.com/open?id=1Sg2mRCED9PGpPjp3vQCK38JPf8In8xXY.
Please note that (a) everything should be expanded, (b) there is an actual "E Road", "N Road", and so on, and (c) streets and addresses really do have "1/2", "3/4", and other fractions in them (see the short example after the script).
"""
A translation function for Mesa County E911 address data in ogr2osm
"""
import generic_addresses
PREFIX_DIR = "PREFIX_DIR"
PREFIX_TYPE = "PREFIX_TYP"
ADDR_HOUSENUMBER = "HOUSE_NUMB"
ADDR_HOUSENUMBER_SUFFIX = "HOUSE_SUFF"
STREET_NAME = "STREET_NAM"
STREET_TYPE = "STREET_TYP"
POSTFIX_DIR = "SUFFIX_DIR"
ZIPCODE = "ZIP"
STATE = "STATE"
CITY = "CITY"
COUNTY = None
UNIT = "UNIT"
def filterTags(attrs):
if not attrs:
return
tags = generic_addresses.parseAddressTags(
attrs,
prefix_dir=PREFIX_DIR,
prefix_type=PREFIX_TYPE,
addr_housenumber=ADDR_HOUSENUMBER,
addr_housenumber_suffix=ADDR_HOUSENUMBER_SUFFIX,
unit=UNIT,
street_name=STREET_NAME,
street_type=STREET_TYPE,
postfix_dir=POSTFIX_DIR,
city=CITY,
state=STATE,
zipcode=ZIPCODE,
)
generic_addresses.removeEmpty(tags)
return tags
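To illustrate points (b) and (c) above, here is a rough sanity check of the translation. The attribute values are made up for illustration (they are not taken from the real dataset), and the filename mesa_county.py is an assumption.

# Illustrative only: hypothetical attribute values run through the Mesa County
# translation above (assumed saved as mesa_county.py next to generic_addresses.py).
import mesa_county

attrs = {
    "HOUSE_NUMB": "2958",
    "HOUSE_SUFF": "1/2",      # fractional house numbers really occur
    "STREET_NAM": "E",        # "E Road" is a real street name, not an abbreviation
    "STREET_TYP": "RD",
    "CITY": "GRAND JUNCTION",
    "STATE": "CO",
    "ZIP": "81504",
}
print(mesa_county.filterTags(attrs))
# -> {'addr:city': 'Grand Junction', 'addr:housenumber': '2958 1/2',
#     'addr:state': 'CO', 'addr:street': 'E Road', 'addr:postcode': '81504'}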
Generic Script
This should be named generic_addresses.py in the translations directory. (This should probably be added to ogr2osm proper.) A few illustrative calls follow the script.
"""
A generic address parsing function
"""
import re
def parseAddressTags(
attrs,
prefix_dir=None,
prefix_type=None,
addr_housenumber=None,
addr_housenumber_suffix=None,
unit=None,
street_name=None,
street_type=None,
postfix_dir=None,
city=None,
state=None,
zipcode=None,
):
tags = {}
if city and city in attrs and attrs[city].strip():
tags["addr:city"] = attrs[city].title()
# COMB_HOUSE is HOUSE_NUMB + " " + HOUSE_SUFF
if addr_housenumber and addr_housenumber in attrs and attrs[addr_housenumber].strip():
addr = []
addr.append(attrs[addr_housenumber])
if addr_housenumber_suffix and addr_housenumber_suffix in attrs and attrs[addr_housenumber_suffix].strip():
addr.append(attrs[addr_housenumber_suffix])
tags["addr:housenumber"] = " ".join(addr)
if unit and unit in attrs and attrs[unit].strip():
tags["addr:unit"] = attrs[unit]
if state and state in attrs and attrs[state].strip():
tags["addr:state"] = attrs[state]
if street_name and street_name in attrs and attrs[street_name].strip():
address = []
if prefix_dir and prefix_dir in attrs and attrs[prefix_dir].strip():
address.append(translateName(attrs[prefix_dir]))
if prefix_type and prefix_type in attrs and attrs[prefix_type].strip():
address.append(translateName(attrs[prefix_type]))
address.append(streetNameCasing(attrs[street_name]))
if street_type and street_type in attrs and attrs[street_type].strip():
address.append(translateName(attrs[street_type]))
if postfix_dir and postfix_dir in attrs and attrs[postfix_dir].strip():
address.append(translateName(attrs[postfix_dir]))
tags["addr:street"] = (" ".join(address)).strip()
if zipcode and zipcode in attrs and attrs[zipcode].strip():
tags["addr:postcode"] = attrs[zipcode]
return tags
def streetNameCasing(name):
    """Case a street name, leaving short letter names (e.g. "E") and letter-plus-fraction names (e.g. "B 1/2") untouched."""
    if len(name) <= 2 or (
        len(name.split(" ")) == 2
        and len(name.split(" ")[0]) <= 2
        and isDigitOrFraction(name.split(" ")[1])
    ):
        return name.strip()
    return mc_name(numberCasingInName(translateName(name, letterNames=True))).replace(
        " And ", " and "
    ).strip()


def mc_name(name):
    """Fix the casing of Mc/Mac surnames (e.g. "Mcdonald" -> "McDonald")."""
    returnWord = []
    for word in name.split(" "):
        if word.startswith("Mc") and len(word) > 2:  # guard against a bare "Mc"
            char = word[2].upper()
            word = word[:2] + char + word[3:]
        elif word.startswith("Mac") and len(word) > 3:  # guard against a bare "Mac"
            char = word[3].upper()
            word = word[:3] + char + word[4:]
        returnWord.append(word)
    return " ".join(returnWord)


def isDigitOrFraction(word):
    """Return a truthy match when the word starts with a digit or "/" (e.g. "1/2", "3/4")."""
    pattern = re.compile("([0-9/])+")
    return pattern.match(word)
def numberCasingInName(name):
    """Lower-case a letter that immediately follows a digit (e.g. "22Nd" -> "22nd")."""
    cased = ""
    number = False
    for char in name:
        if number:
            cased += char.lower()
        else:
            cased += char
        if char.isdigit():
            number = True
        else:
            number = False
    return cased
def removeEmpty(tags):
    """
    Remove empty tags
    """
    toRemove = []
    for i in tags:
        if not tags[i].strip():
            toRemove.append(i)
    for i in toRemove:
        del tags[i]
def translateName(rawname, letterNames=False):
    """
    A general purpose name expander.
    """
    suffixlookup = {
        "Ave": "Avenue",
        "Blvd": "Boulevard",
        "Cir": "Circle",
        "Cl": "Close",
        "Conn": "Connector",
        "Cres": "Crescent",
        "Crt": "Court",
        "Ct": "Court",
        "Div": "Diversion",
        "Dr": "Drive",
        "E": "East",
        "Gr": "Grove",
        "Hwy": "Highway",
        "Lane": "Lane",
        "Ln": "Lane",
        "Lndg": "Landing",
        "N": "North",
        "Pkwy": "Parkway",
        "Pl": "Place",
        "Pt": "Point",
        "Rd": "Road",
        "Rwy": "Railway",
        "S": "South",
        "Sq": "Square",
        "St": "Street",
        "Sw": "South West",
        "Trl": "Trail",
        "W": "West",
    }
    # Assume letter names can have A and AA, but not AAA
    if letterNames:
        toDelete = []
        for entry in suffixlookup:
            if len(entry) <= 2:
                toDelete.append(entry)
        for entry in toDelete:
            del suffixlookup[entry]
    newName = ""
    for partName in rawname.title().split():
        newName = newName + " " + suffixlookup.get(partName, partName)
    return newName.strip()
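As a rough illustration of how these helpers behave, here are a few calls with hypothetical inputs (not taken from either dataset):

# Hypothetical inputs showing the intended behaviour of the helpers above.
import generic_addresses as ga

print(ga.translateName("E MAIN ST"))               # "East Main Street"  (full expansion)
print(ga.translateName("E RD", letterNames=True))  # "E Rd"  (short letter names kept as-is)
print(ga.streetNameCasing("B 1/2"))                # "B 1/2"  (fraction street names left alone)
print(ga.streetNameCasing("MCDONALD"))             # "McDonald"
print(ga.numberCasingInName("22Nd"))               # "22nd"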
Further Readings
A similar import in MA: Import/Catalogue/MassGIS Addresses