Osmsync
Osmsync is an automated edits software library that can be used to synchronize an external dataset with osm. It loads data from the external source, loads the matching data from osm, and from these prepares a "diff" (difference report) and a JOSM-compatible changeset, which allows the proposed changes to be reviewed prior to upload. As with any other automated edit, it is essential that people read and adhere to the Automated Edits code of conduct.
It can be used for the initial import, for adding records later and to alter or rename keys. It can also be used to feed community edits back to the external source, although you should be mindful that data retrieved from OpenStreetMap is covered by the Open Database License which may be incompatible with the licensing restrictions associated with the external data source.
Operation
Osmsync loads both datasets and compares them. Certain keys are defined as "master" in the external dataset; for a chain store these might be the hours of operation or the store's phone number. Other keys are mastered in osm. In almost all cases osm is master for the exact coordinates, though an osmsync plugin may flag locations that differ by more than 100 meters. Thus osm mappers can significantly enhance osmsync data without interfering with the conflation or data freshness updates.
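The master-key rule and the distance check can be sketched as follows. This is a minimal illustration, not osmsync's actual code: the function names merge_tags, distance_m and location_differs, the node layout, and the use of the haversine formula are all assumptions. Only the per-key mastership idea and the 100 meter threshold come from the description above.

```python
import math


def merge_tags(osm_tags, source_tags, source_is_master_for):
    """Return a copy of the osm tags, overwritten only for source-mastered keys."""
    merged = dict(osm_tags)
    for key in source_is_master_for:
        if key in source_tags:
            merged[key] = source_tags[key]
    return merged


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters (haversine formula, spherical Earth)."""
    radius = 6371000.0  # mean Earth radius in meters
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(a))


def location_differs(osm_node, source_node, threshold_m=100.0):
    """True if the two nodes are further apart than the threshold.

    Coordinates may be strings (as read from XML), so convert them first.
    """
    d = distance_m(float(osm_node['lat']), float(osm_node['lon']),
                   float(source_node['lat']), float(source_node['lon']))
    return d > threshold_m
```

With this sketch, a mapper's correction to a name or position in osm survives a sync, while a changed phone number in the external dataset still flows through, provided 'phone' is listed in source_is_master_for.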
Example datasets include:
- Car sharing locations, sourced from the car sharing reservation system, with the number of vehicles and the vehicle types at each location.
- Government stream flow gauges, with a link to real time flow data.
- Current bus stop locations (but see also the gtfs-osm-sync project).
How it works
Osmsync stores a "conflation key" with each osm object, and uses it to match up data later:
source:pkey=xxxx
source=osmsync:yyyy
The source:pkey tag documents the primary key in the original dataset (if there is no primary key in the original dataset a hash is used instead). If these keys are left alone by future mappers, the import will proceed smoothly. Deletion of these keys can lead to duplicates.
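The matching step can be sketched as below. This is an illustration under stated assumptions rather than osmsync's own implementation: the function names conflation_key and match_by_pkey are hypothetical, the node layout follows the example control file, and the fallback hash (SHA-1 over the sorted fields) is an assumption, since the text does not say which hash is used.

```python
import hashlib


def conflation_key(record, pkey_field=None):
    """Return the record's primary key, or a hash of its fields if it has none."""
    if pkey_field and record.get(pkey_field):
        return str(record[pkey_field])
    # Assumed fallback: hash the fields in sorted order so the key is stable.
    canonical = "|".join("%s=%s" % (k, record[k]) for k in sorted(record))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


def match_by_pkey(source_nodes, osm_nodes):
    """Classify records by comparing source keys with source:pkey tags in osm.

    source_nodes: dict mapping conflation key -> node from the external dataset
    osm_nodes:    list of nodes loaded from osm, each with a 'tag' dict
    """
    osm_by_pkey = {}
    for node in osm_nodes:
        pkey = node['tag'].get('source:pkey')
        if pkey is not None:
            osm_by_pkey[pkey] = node
    to_create = sorted(k for k in source_nodes if k not in osm_by_pkey)
    to_compare = sorted(k for k in source_nodes if k in osm_by_pkey)
    orphaned = sorted(k for k in osm_by_pkey if k not in source_nodes)
    return to_create, to_compare, orphaned
```

In this sketch an osm node whose source:pkey tag has been deleted is invisible to the lookup, so its source record lands in to_create. That is how deleting the tag produces a duplicate.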
Download osmsync library
The osmsync library can be downloaded from the OpenStreetMap SVN repository.
Example Control File
#!/usr/bin/python
##
## osmsync module to import USGS water data (specifically stream gauging
## stations)

import urllib2
from xml.etree import ElementTree

from osmsync import osmsync

class osmsync_usgs_waterdata(osmsync):
    # Sample waterdata as supplied:
    # <site>
    #   <agency_cd>USGS</agency_cd>
    #   <site_no>09423350</site_no>
    #   <station_nm>CARUTHERS C NR IVANPAH CA</station_nm>
    #   <site_tp_cd>ST</site_tp_cd>
    #   <dec_lat_va>35.24498915</dec_lat_va><dec_long_va>-115.29887590</dec_long_va>
    #   <coord_acy_cd>F</coord_acy_cd>
    #   <dec_lat_long_datum_cd>NAD83</dec_lat_long_datum_cd>
    # </site>
    def fetch_source(self, sourcedata):
        sourcenodes = {}
        req = urllib2.Request(sourcedata, headers=self.http_headers)
        tree = ElementTree.parse(urllib2.urlopen(req))
        for site in tree.iter('site'):
            # The USGS site number is the primary key used for conflation
            pkey = site.find('site_no').text.strip()
            node = {}
            node['tag'] = {}
            node['id'] = pkey
            node['lat'] = site.find('dec_lat_va').text.strip()
            node['lon'] = site.find('dec_long_va').text.strip()
            node['tag']['source:pkey'] = pkey
            node['tag']['man_made'] = 'monitoring_station'
            node['tag']['monitoring:river_level'] = 'yes'
            node['tag']['operator'] = site.find('agency_cd').text.strip()
            node['tag']['description'] = site.find('station_nm').text.strip()
            node['tag']['website'] = 'http://waterdata.usgs.gov/nwis/inventory/?site_no=' + pkey
            sourcenodes[pkey] = node
        # Keys for which the external dataset, not osm, is the master
        source_is_master_for = ['operator', 'website', 'description']
        return (sourcenodes, source_is_master_for)
JOSM Extension
As of August 2011, JOSM respects changeset notes produced by osmsync:
<osm version="0.6" generator="osmsync">
  <changeset>
    <tag k="source" v="osmsync:ccs"/>
    <tag k="note" v="Prepared by osmsync: car share reservation system..."/>
    <tag k="conflation_key" v="source:pkey"/>
  </changeset>
  ...
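For illustration, a header like the one above could be produced with Python's standard xml.etree.ElementTree module. This is only a sketch: the helper name changeset_header is hypothetical, and osmsync may build its output differently. The tag keys and values are copied from the example above, including its elided note text.

```python
import xml.etree.ElementTree as ET


def changeset_header(source, note, conflation_key="source:pkey"):
    """Build the <osm> root element with a <changeset> block of tags."""
    osm = ET.Element("osm", {"version": "0.6", "generator": "osmsync"})
    changeset = ET.SubElement(osm, "changeset")
    for key, value in (("source", source),
                       ("note", note),
                       ("conflation_key", conflation_key)):
        ET.SubElement(changeset, "tag", {"k": key, "v": value})
    return osm


root = changeset_header(
    "osmsync:ccs",
    "Prepared by osmsync: car share reservation system...",
)
# Serialize to a text string; nodes and ways would be appended to root first.
xml_text = ET.tostring(root, encoding="unicode")
```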
See also
- Potential Datasources - List of datasources which could be imported, with descriptions of licensing status/investigation.
- Foundation/Import Support Working Group - a group formed by the foundation to facilitate investigation and potential import of datasets.