Import/Catalogue/Bed and Breakfast-RAFVG
About
This page describes the import procedure followed to upload 410 new POIs to planet.osm. RAFVG publishes a Bed & Breakfast dataset of more than 600 rows under an open data licence, but without geographic coordinates. Since housenumbers were recently imported from a RAFVG dataset, the B&B addresses could be geocoded.
Dataset
Bed and Breakfast is a fairly new dataset (Oct 17) with more than 600 POIs. It provides many useful fields, such as:
- name and operator
- phone
- site
- opening hours
- category (standard, comfort, superior)
Cleaning data
Data cleaning was carried out with OpenRefine and the Reconcile plugin, connected as a reconciliation service. To standardize the messy B&B addresses (entered by the B&B operators themselves), Reconcile was fed an authoritative set of highway names derived from overpass-turbo (see the Strade d'Italia OSM diary entry).
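The idea behind reconciling addresses against an authoritative street list can be illustrated with a small sketch. This is not how OpenRefine or the Reconcile plugin work internally; it is a stand-in that uses Python's standard difflib module, and the function name, the cutoff values and the sample street list are all assumptions made for illustration.

```python
import difflib

def reconcile_street(raw, authoritative, cutoff=0.8):
    """Return the closest authoritative street name, or None if nothing is close enough.

    Hypothetical helper: the similarity measure and the cutoff are assumptions,
    not the matching rules of the Reconcile plugin.
    """
    # Compare case-insensitively, but return the canonical spelling
    lookup = {name.casefold(): name for name in authoritative}
    matches = difflib.get_close_matches(
        raw.strip().casefold(), list(lookup), n=1, cutoff=cutoff
    )
    return lookup[matches[0]] if matches else None

# Sample list standing in for the highway names extracted from overpass-turbo
streets = ["Via Roma", "Via Giuseppe Garibaldi", "Piazza della Libertà"]

print(reconcile_street("via roma ", streets))
print(reconcile_street("via g. garibaldi", streets, cutoff=0.6))
```

Rows for which the helper returns None would still need a manual check, much like unmatched cells in a reconciliation service.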
Record format and tagging plan
The RAFVG dataset table structure will be adapted and pruned with OpenRefine.
Field | Name | Description:it | Description:en | Example | tagged as |
---|---|---|---|---|---|
1 | codice esercizio | codice univoco scheda BB | unique BB code | 48460 | ref |
2 | PROVINCIA | nome provincia | province name | Gorizia | |
3 | COMUNE | Nome Comune | Municipality name | Cormons | |
4 | CATEGORIA | categoria | category | Superior | |
5 | DENOMINAZIONE | denominazione | name | Il Gelso di Pippo Franco | |
6 | INDIRIZZO | indirizzo | address | Via Bancaria, 7 | |
7 | CAP | codice di avviamento postale | postcode | 33100 | |
8-10 | TELEFONO/FAX/CELLULARE | recapito telefonico | phone/fax/other | 0433 555666 | |
11 | | posta elettronica | email | pippo@franco.it | |
12 | N_CAMERE | numero di camere | number of rooms | | |
13 | N_POSTI_LETTO | numero di posti letto | number of beds | | |
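A record-to-tags mapping could be sketched as follows. Only the ref mapping comes from the table; the tourism=guest_house and guest_house=bed_and_breakfast pair, the name and phone mappings, and the function name itself are assumptions for illustration, not the tagging plan actually adopted. No addr:* tags are produced, in line with the note in the Geocoding section.

```python
def row_to_tags(row):
    """Map one cleaned CSV row (a dict keyed by column name) to OSM tags.

    Hypothetical sketch: every mapping except ref is an assumption.
    """
    tags = {
        "tourism": "guest_house",            # assumption: common tagging for B&Bs
        "guest_house": "bed_and_breakfast",  # assumption
        "ref": row.get("codice esercizio"),  # from the table: field 1 is tagged as ref
        "name": row.get("DENOMINAZIONE"),    # assumption
        "phone": row.get("TELEFONO"),        # assumption: column name and target key
    }
    # Drop empty values so blank source cells do not become empty tags
    return {key: value for key, value in tags.items() if value}

example = {
    "codice esercizio": "48460",
    "DENOMINAZIONE": "Il Gelso di Pippo Franco",
    "TELEFONO": "",
}
print(row_to_tags(example))
```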
Geocoding
The csvgeocode command line tool was used. An issue with apostrophes in municipality names and addresses was bypassed by removing that character from the source data.
Here is a run using the Mapbox service:
$ csvgeocode input.csv output.csv --handler mapbox --delay 1000 --verbose --url "http://api.tiles.mapbox.com/v4/geocode/mapbox.places/{{INDIRIZZO}},{{CAP}} {{COMUNE}}.json?access_token="
And here is a run using Nominatim instead:
$ csvgeocode input.csv output.csv --handler osm --delay 1000 --verbose --url "http://nominatim.openstreetmap.org/search?q={{INDIRIZZO 1}}, {{COMUNE}}&format=json"
Rows geocoded: 468
Rows failed: 114
Time elapsed: 879.4 seconds
The 114 rows that were not geocoded exposed a geocoder issue with apostrophes in the city field. Since the apostrophe cannot be escaped, the workaround is either to remove it (e.g. Farra d'Isonzo >> Farra disonzo) or to use the postcode instead. Addresses have the same problem, but the only solution found is to remove the character (e.g. San Francesco d'Assisi >> San Francesco dassisi). These edits are for the sake of geocoding only, since no addr:* tags will be imported.
Besides, some of the rows reported as successful may have been geocoded despite a missing housenumber, which yields the highway centroid coordinates. To limit these false positives, municipalities without addresses (shown in red) were isolated and the nodes geocoded there were excluded.
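The two precautions above (dropping apostrophes before geocoding, and discarding results from municipalities that lack housenumbers) can be sketched in a few lines of Python. The function names and the sample rows are hypothetical, and the sketch keeps the original letter case rather than lowercasing as in the examples above.

```python
APOSTROPHES = "'’`"  # straight and typographic variants; the exact set is an assumption

def strip_apostrophes(text):
    """Return a copy of text without apostrophes, for geocoding only."""
    return text.translate(str.maketrans("", "", APOSTROPHES))

def keep_reliable(rows, municipalities_without_addresses):
    """Drop geocoded rows located in municipalities that have no housenumbers.

    In those municipalities a 'successful' result is likely a highway centroid.
    """
    excluded = {name.casefold() for name in municipalities_without_addresses}
    return [row for row in rows if row["COMUNE"].casefold() not in excluded]

print(strip_apostrophes("Farra d'Isonzo"))

# Hypothetical geocoded rows; only the COMUNE column matters here
rows = [{"COMUNE": "Cormons"}, {"COMUNE": "Gorizia"}]
print(keep_reliable(rows, ["gorizia"]))
```

The stripped value would only be used to build the geocoder query, leaving the name in the dataset untouched.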
Conflating
Finally, conflation was run to generate an audit map. After a manual check, 410 nodes were written to the final osm file.
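This page does not say which conflation tool or matching rule was used. As a rough illustration of the principle only, the sketch below separates geocoded POIs into probable duplicates of existing OSM nodes and genuinely new ones by distance. The 100 m threshold, the function names and the sample coordinates are assumptions.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    radius = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))

def split_for_audit(new_pois, existing_pois, max_dist_m=100.0):
    """Return (probable_duplicates, new_nodes) using a distance threshold.

    Hypothetical rule: a new POI within max_dist_m of any existing POI is
    flagged for manual review instead of being uploaded.
    """
    duplicates, fresh = [], []
    for poi in new_pois:
        near = any(
            haversine_m(poi["lat"], poi["lon"], other["lat"], other["lon"]) <= max_dist_m
            for other in existing_pois
        )
        (duplicates if near else fresh).append(poi)
    return duplicates, fresh

# Made-up coordinates for illustration
existing = [{"lat": 45.9600, "lon": 13.4700}]
new = [{"lat": 45.9601, "lon": 13.4701}, {"lat": 46.0600, "lon": 13.2400}]
duplicates, fresh = split_for_audit(new, existing)
print(len(duplicates), len(fresh))
```

A real conflation would normally also compare names or refs before deciding, which is why a manual check on an audit map remains useful.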
Files
The files involved can be found here.