Import/Catalogue/Venice addresses import


About

This page describes the import into OpenStreetMap of the address data provided by the Municipality of Venice (Italy).

The Municipality of Venice released their complete address data in 2023.

The import will be discussed on the Italian OSM mailing list [https:// TBD]. This wiki page will reflect the consensus reached there.

Import Data

Background

Address format

House numbering follows the European scheme. An address is determined by its street name and housenumber, and each housenumber is unique within its street. In several cases, on Venice island itself, addresses refer to a place rather than to a street.

Housenumbers can include:

  • subordinates, noted with suffix letters (e.g. "a" in "7a"); subordinates usually arise when a new house is built between existing houses with consecutive housenumbers
  • extensions, noted with a slash "/" followed by an integer; most cases occur when a single entrance is shared by different buildings.
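The housenumber structure described above can be sketched as a small parser. This is only an illustration: the regular expression, the function name `parse_housenumber` and the assumption that a subordinate and an extension may appear together (as in "7a/2") are not taken from the source dataset.

```python
import re

# Assumed grammar: digits, an optional single suffix letter (subordinate),
# and an optional "/" followed by an integer (extension).
HOUSENUMBER_RE = re.compile(
    r"^(?P<number>\d+)(?P<subordinate>[A-Za-z])?(?:/(?P<extension>\d+))?$"
)


def parse_housenumber(value):
    """Split a housenumber into number, subordinate and extension.

    Returns None when the value does not fit the assumed grammar.
    """
    match = HOUSENUMBER_RE.match(value.strip())
    if match is None:
        return None
    subordinate = match.group("subordinate")
    extension = match.group("extension")
    return {
        "number": int(match.group("number")),
        "subordinate": subordinate.lower() if subordinate else None,
        "extension": int(extension) if extension else None,
    }


print(parse_housenumber("7a"))
print(parse_housenumber("13/2"))
```

Values that do not match (for example free text) return `None`, so they can be listed for manual review instead of being imported silently.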

Data quality

Possible offset issues due to the source data or to reprojection will be inspected. [Figure: Pellestrina housenumber offset]

Legal

Data source site: https://portale.comune.venezia.it/node/117/80

Data license: https://portale.comune.venezia.it/filebrowser/download/14249419

Type of license: IODL 2.0


Source data

The dataset is identified by the string "Strato03_GestioneViabilita_Indirizzi" and has been downloaded from the "toponimi numeri civici" page of the Venice municipality website.

Import Type

The dataset will be cleaned and converted to OSM tagging with OpenRefine; it will then be conflated with OSM Conflator and published in a shared audit map prior to upload.

Data Preparation

The operations applied to the original dataset are listed in this operations file. Because of the large size of the dataset (86k nodes), the import will be split on a geographic basis, i.e. Pellestrina Island, Venice Island, and the Mestre and Marghera mainland.

Tagging

The CSV file produced by the QGIS conversion is a collection of point features, one for each housenumber.

The following fields will be evaluated:

  • INDIRIZZO: place, housenumber, subordinate (San Polo 1175b)
  • DENOMINAZI: street, housenumber, subordinate (Via Piave 13b)
  • lat
  • lon
  • IDMASTER: official housenumber id, used for conflation and optionally for OSM loc_ref tag

Housenumber

addr:housenumber is derived from the INDIRIZZO and DENOMINAZI fields, after converting them to title case.
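As a rough sketch of this step, the snippet below splits a field value such as "San Polo 1175b" into a title-cased name and a housenumber. It assumes that the housenumber is always the last token of the field and that the raw values may be upper case; the helper names `split_address` and `build_tags` are hypothetical and do not come from the operations file.

```python
import re

# Assumption: the value ends with the housenumber (digits, optional letter,
# optional "/n"); everything before it is the place or street name.
ADDRESS_RE = re.compile(
    r"^(?P<name>.+?)\s+(?P<housenumber>\d+[A-Za-z]?(?:/\d+)?)$"
)


def split_address(value):
    """Return (title-cased name, lower-cased housenumber) or None."""
    match = ADDRESS_RE.match(value.strip())
    if match is None:
        return None
    return match.group("name").title(), match.group("housenumber").lower()


def build_tags(value, name_key):
    """Build OSM address tags; name_key is e.g. "addr:place" or "addr:street"."""
    parts = split_address(value)
    if parts is None:
        return {}
    name, housenumber = parts
    return {name_key: name, "addr:housenumber": housenumber}


print(build_tags("SAN POLO 1175B", "addr:place"))
print(build_tags("VIA PIAVE 13b", "addr:street"))
```

Note that `str.title()` also capitalises articles and prepositions ("Calle Dei Fabbri"), so names containing such words would still need a manual or rule-based pass.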


Changeset Tags

Changeset will be tagged with:

  • source=Comune di Venezia
  • source:license=IODL 2.0
  • type=import
  • url=https://wiki.openstreetmap.org/w/index.php?title=Import/Catalogue/Venice_addresses_import

These tags let other mappers see that the data was imported following the import guidelines and lead them to this page for details.

Data Transformation

After the data preparation process, the following workflow has been performed on a subset (Pellestrina Island):

  • the pruned dataset records have been converted to a JSON file;
  • the JSON file has been processed through OSM Conflator, using this profile;
  • the preview of the conflated data has been uploaded to an audit map for shared review.
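The first step of the workflow above could look like the sketch below. It assumes that OSM Conflator accepts a JSON array of objects with `id`, `lat`, `lon` and `tags` keys, and that the prepared CSV already has `housenumber` and `street` columns; those two column names and the sample row are hypothetical, only IDMASTER, lat and lon are named on this page.

```python
import csv
import io
import json


def csv_to_conflator_items(csv_text):
    """Convert prepared CSV rows into a list of conflator source items."""
    items = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        items.append({
            "id": row["IDMASTER"],          # official housenumber id
            "lat": float(row["lat"]),
            "lon": float(row["lon"]),
            "tags": {
                "addr:housenumber": row["housenumber"],  # hypothetical column
                "addr:street": row["street"],            # hypothetical column
            },
        })
    return items


# Illustrative row only, not real dataset content.
sample = "IDMASTER,lat,lon,housenumber,street\nX1,45.0,12.0,13b,Via Piave\n"
print(json.dumps(csv_to_conflator_items(sample), ensure_ascii=False, indent=2))
```

Using IDMASTER as the item id keeps the link to the official record, which is what makes later re-runs and audits traceable.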

Data Transformation Results

After completion of the audit process, the OSM XML upload candidate file will be available here TODO

Data Merge Workflow

Non-node objects

Address data in Italy must be placed exclusively on nodes, because the housenumber identifies the external access that leads from the street to the housing units (houses, stores, offices, etc.). Please read https://wiki.openstreetmap.org/wiki/IT:Addresses#Regole_specifiche_per_l.27Italia (in Italian) for more details. At the time of writing, a query for housenumbers applied to polygons or multipolygons returns 1134 matches. The distance between a dataset node and the centroid of the corresponding polygon is often greater than the 10-metre conflation radius; these cases (tagged with fixme "suppressed or wrong position: please check") will need inspection during post-import QA.
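To get a feeling for how quickly a polygon centroid falls outside the 10-metre radius, the distance can be estimated with the haversine formula. This is a sketch for checking individual cases by hand, not the distance computation used by OSM Conflator; the function names and the sample coordinates are illustrative.

```python
import math

EARTH_RADIUS_M = 6371000.0
CONFLATION_RADIUS_M = 10.0  # radius quoted in the paragraph above


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_radius(node, centroid, radius_m=CONFLATION_RADIUS_M):
    """True when a dataset node and a polygon centroid can be conflated."""
    return haversine_m(node[0], node[1], centroid[0], centroid[1]) <= radius_m


# A shift of 0.0001 degrees of latitude is already slightly more than 10 m.
print(round(haversine_m(45.0, 12.0, 45.0001, 12.0), 2))
print(within_radius((45.0, 12.0), (45.0001, 12.0)))
```

For a building whose entrance is on one side and whose centroid is in the middle, an offset of this size is common, which explains the number of cases expected for QA.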

Conflation

Conflation is performed with OSM Conflator. Objects tagged with addr:housenumber will be extracted from OSM within a bounding box defined by the source dataset. The conflator output will be published as a public audit map for visual review.

OSM objects to be conflated

The following query gathers OSM objects for Pellestrina Island:

[out:xml][timeout:25];
area[name="TBD"]["admin_level"=TBD]->.searchArea;
(
  nwr["addr:housenumber"](area.searchArea);
);
out meta qt center;

At present (February 2024) there are about 16k addresses already in OpenStreetMap. In the Pellestrina subset there are about TBD addresses; the data exported by the query above (export.osm) will be passed to the conflator.

Addresses and tags already present in OSM are merged by the conflator, with addr:housenumber and addr:street from the dataset treated as authoritative. Existing OSM addresses without a match will be kept so that other useful tags (amenities, shops, etc.) are not lost.
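The merge rule can be pictured with the following sketch. It is not the conflator's code: `merge_tags` and `AUTHORITATIVE_KEYS` are hypothetical names that simply restate the rule that the two address keys come from the dataset while every other existing OSM tag is preserved.

```python
# Keys for which the municipal dataset overrides the existing OSM value.
AUTHORITATIVE_KEYS = ("addr:housenumber", "addr:street")


def merge_tags(osm_tags, dataset_tags):
    """Return OSM tags with authoritative address keys taken from the dataset."""
    merged = dict(osm_tags)  # keep amenity, shop, name, etc. untouched
    for key in AUTHORITATIVE_KEYS:
        if key in dataset_tags:
            merged[key] = dataset_tags[key]
    return merged


existing = {"shop": "bakery", "addr:street": "via piave"}
incoming = {"addr:street": "Via Piave", "addr:housenumber": "13b"}
print(merge_tags(existing, incoming))
```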

Matching addrs

Any match between an input dataset address and an OSM element within the matching range (defined in profile.py) will be considered, and a proposed change will be displayed in the audit map as a blue pin.

New addrs

Any input dataset address that has no OSM match within the above range will generate a proposal for a new OSM address, displayed in the audit map as a green pin.

Not in dataset

Existing OSM elements that have no match in the input dataset will generate a proposal for a fixme tag with the text 'this addr is missing from source dataset: please check'. They will be displayed in the audit map as a blue pin.
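The three outcomes described in the sections above (matching, new, not in dataset) can be illustrated with a toy classifier. It is a simplified greedy nearest-neighbour sketch under assumed inputs (dictionaries of id to (lat, lon)) and an assumed 10 m radius; OSM Conflator's real matching is more elaborate, and the function names are hypothetical.

```python
import math


def distance_m(a, b):
    """Approximate distance in metres between two (lat, lon) pairs."""
    mean_lat = math.radians((a[0] + b[0]) / 2)
    dy = math.radians(b[0] - a[0])
    dx = math.radians(b[1] - a[1]) * math.cos(mean_lat)
    return 6371000.0 * math.hypot(dx, dy)


def classify(dataset, osm, radius_m=10.0):
    """Return (matched pairs, new dataset ids, OSM ids not in dataset)."""
    unmatched_osm = dict(osm)
    matched, new = [], []
    for ds_id, ds_pos in dataset.items():
        # Nearest still-unmatched OSM object, or None if none are left.
        best = min(
            unmatched_osm,
            key=lambda osm_id: distance_m(ds_pos, unmatched_osm[osm_id]),
            default=None,
        )
        if best is not None and distance_m(ds_pos, unmatched_osm[best]) <= radius_m:
            matched.append((ds_id, best))   # proposal for change
            del unmatched_osm[best]
        else:
            new.append(ds_id)               # proposal for a new address
    return matched, new, sorted(unmatched_osm)  # the rest get a fixme


dataset = {"a": (45.0, 12.0), "b": (45.001, 12.0)}
osm = {1: (45.00002, 12.0), 2: (45.01, 12.0)}
print(classify(dataset, osm))
```

In this example "a" is matched with OSM object 1, "b" becomes a new address, and OSM object 2 would receive the fixme tag.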

Conflator output example

pi@raspberrypi:~/OSM $ conflate -i pellestrina.json --osm export.osm -v -c preview.json profile.py
08:37:53 Found 421 duplicates in the dataset
08:37:53 Read 4876 items from the dataset
08:37:53 Downloaded 1085 objects from OSM
08:38:13 Matched 790 points
08:38:13 Removed 401 unmatched duplicates
08:38:13 Adding 3685 unmatched dataset points
08:38:14 Deleted 0 and retagged 295 unmatched objects from OSM
pi@raspberrypi:~/OSM $
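The figures in this log are consistent with each other, which is a quick sanity check worth repeating for every subset. The interpretation of each line below is a reading of the log, not documented conflator behaviour.

```python
# Figures copied from the conflator log above.
read_items = 4876
matched = 790
removed_duplicates = 401
added = 3685
downloaded = 1085
deleted = 0
retagged = 295

# Dataset side: items read, minus matched, minus dropped duplicates,
# should equal the new points being added.
dataset_balance = read_items - matched - removed_duplicates

# OSM side: downloaded objects, minus matched, minus deleted,
# should equal the objects retagged with a fixme.
osm_balance = downloaded - matched - deleted

print(dataset_balance == added, osm_balance == retagged)
```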

Conflator re-run

Once the audit is completed, the audit data is downloaded from the conflator project page (example) and the conflator is run again.

pi@raspberrypi:~/OSM $ conflate -i pellestrina.json -a audit_pellestrina.json -o pellestrina.osm profile.py
[some echoes...]
pi@raspberrypi:~/OSM $

Candidates

Municipio | Audit published | Post-audit conflator run | File
9         | 2020-06-01      | 2021-05-18               | TBD.osm

Team Approach

This import is managed and supervised by:

During the upload process, the import of the first subset will be evaluated; the batches will possibly follow the municipal districts (Municipio, in Italian).


Reverting

In case of import anomalies, changeset(s) will be reverted using OSM reverter scripts or, if possible, the JOSM Reverter Plugin.

Post-import QA

Street names

After the import, addr:street values may differ slightly from the names of the existing streets.

These differences should be caught using OSM Inspector (map already centered on Milan).

Unmarked streets

The result can be used to locate areas where streets are missing.

Missing roads will be created in JOSM using PCN 2012 aerial imagery.

Unnamed streets

The result can be used to derive street names for unnamed streets when all the nodes along the street have the same addr:street value.
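The rule stated above (a name is derived only when all address nodes along the street agree) can be sketched as follows. How address nodes are associated with an unnamed way is outside the scope of this snippet and assumed to be given; `infer_street_name` is a hypothetical helper.

```python
def infer_street_name(addr_streets):
    """Return the shared addr:street value if all nodes agree, else None."""
    names = {name.strip() for name in addr_streets if name and name.strip()}
    return names.pop() if len(names) == 1 else None


# Hypothetical input: unnamed way id -> addr:street values of nearby nodes.
candidates = {
    "way/1": ["Via Piave", "Via Piave", "Via Piave"],
    "way/2": ["Via Piave", "Via Roma"],
}
for way_id, streets in candidates.items():
    print(way_id, infer_street_name(streets))
```

Ways with conflicting or missing values return `None` and are left for manual review.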

Missing road names will be identified using the OpenStreetMap NoName Map Overlay: tms:http://tile3.poole.ch/noname/{zoom}/{x}/{y}.png

OSM Inspector can also be used to find these streets.

Non-node objects

Since several polygon and multipolygon address objects in OSM will be tagged as being in the wrong place, they will have to be adapted or deleted manually.

See also

The email to the Imports mailing list was sent on 2020-04-04 and can be found in the imports mailing list archives.