Import/Catalogue/AlimentareFVG

From OpenStreetMap Wiki

About

This page describes the import of a dataset of 1389 food businesses published by Confartigianato (CCIAA) of the Friuli Venezia Giulia Region (RAFVG), Italy. The import is being discussed on the OSM mailing list and will proceed according to the consensus reached there.

Food-related POIs will include the following activities (in frequency order):

  • RISTORAZIONE PER ASPORTO PIZZA AL TAGLIO KEBAB ROSTICCERIE
  • GELATIERI E PASTICCERI
  • PANE PRODOTTI DA FORNO
  • LAVORAZIONE PROSCIUTTO E CARNI
  • SPECIALITA' GASTRONOMICHE PRODUZIONE
  • CASEARI
  • MOLITORI MULINI
  • VINO DISTILLATI E BEVANDE
  • BIRRA

Goals

This import aims to provide an up-to-date and maintainable set of POIs for the RAFVG territory (OSM admin_level=4). The import will be affected by geocoding issues, due to missing OSM addr data and/or misspellings in the dataset. Records that cannot be geocoded should be reported to the source.

Schedule

Starting from Oct 2019, a pilot import will be performed through conflation and audit. The GELATIERI E PASTICCERI activity is the candidate. Progress can be tracked in two audit maps, namely ice_cream and pastry. Depending on the number of mappers involved, the import should take 20-40 days to complete.

Import Data

Background

The dataset comes as a single table in xlsx (Excel) format. CCIAA declares the dataset as not comprehensive, since not all businesses are enrolled. Activities are categorized by the "MESTIERE" field, which allows splitting the work into separate conflation and import sessions. The "MESTIERE" field is not free-form, but refers to ATECO codes and definitions.

The dataset does not include coordinates, hence geocoding is performed based on the address fields.

Metadata

As stated on the CCIAA Opendata page, the dataset has the following metadata:

  • source name: Progetto Pilota Imprese artigiane FVG OSM
  • release date: 2019-09-03
  • last update: 2019-10-15
  • AOI: Friuli Venezia Giulia
  • operator: CCIAA

Legal

Record format and tagging plan

The table structure will be pruned and adapted through OpenRefine; fields will be managed with reference to addresses for geocoding, and to the amenity and shop wiki pages for tag mapping.

The table below lists the useful input fields:

field | sample | tag | notes
PARTITA-IVA TXT | 0123456789 | ref:vatin=IT00123456789 | personal "Codice fiscale" erased
N-ALBO-AA | GO-11703 | ref=GO-11703 |
DENOMINAZIONE | ALFREDO SERVIZI DI TOPRAN CUTIN A. | name=ALFREDO SERVIZI; operator=TOPRAN CUTIN A. | split at the first occurrence of the "DI" string
CAP | 33100 | postcode=33100 |
INDIRIZZO | Via Antonio Gramsci 28/B | addr:street=Via Antonio Gramsci; addr:housenumber=28/B | split at the first occurrence of a number
COMUNE | Buttrio | addr:city=Buttrio |
TELEFONO | 0432643333 | phone=+39 0432 643333 |
ATTIVITA' | LAVORAZIONE ARTIGIANALE DI PRODOTTI DOLCIARI | amenity=ice_cream and/or shop=pastry | depending on occurrences of "GELAT" and "PASTICC"
MESTIERE | GELATIERI E PASTICCIERI | n/a | used for dataset filtering
FORMA_GIUR | SOCIETA' IN NOME COLLETTIVO | TBD | legal form
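The splitting rules in the table above can be sketched in Python as follows. This is only an illustration, not the OpenRefine operations actually used: the function names are hypothetical, the "DI" split assumes the word is surrounded by spaces, and the VAT rule assumes the number is left-padded with zeros to 11 digits, as the sample row suggests.

```python
import re


def vatin_tag(partita_iva):
    """Build ref:vatin, assuming zero-padding to 11 digits plus the IT prefix."""
    return {"ref:vatin": "IT" + partita_iva.strip().zfill(11)}


def split_name_operator(denominazione):
    """Split DENOMINAZIONE at the first standalone "DI" (assumed space-delimited)."""
    name, separator, operator = denominazione.partition(" DI ")
    if not separator:
        # No "DI" found: keep the whole string as the name.
        return {"name": denominazione.strip()}
    return {"name": name.strip(), "operator": operator.strip()}


def split_address(indirizzo):
    """Split INDIRIZZO into street and house number at the first digit."""
    match = re.search(r"\d", indirizzo)
    if match is None:
        return {"addr:street": indirizzo.strip()}
    return {
        "addr:street": indirizzo[:match.start()].strip(),
        "addr:housenumber": indirizzo[match.start():].strip(),
    }


def activity_tags(attivita):
    """Pick amenity/shop tags from "GELAT" and "PASTICC" occurrences in ATTIVITA'."""
    text = attivita.upper()
    tags = {}
    if "GELAT" in text:
        tags["amenity"] = "ice_cream"
    if "PASTICC" in text:
        tags["shop"] = "pastry"
    return tags
```

Records where neither substring occurs get no amenity or shop tag from this sketch and would need a manual decision during the audit.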

Import Type

It shall not be a blind import: source data shall be checked and audited by mappers through an audit support map.

Audit support map

The dataset will be imported on a regional basis (OSM admin_level=4). OSM candidate nodes will be shown as pins on dedicated audit support maps.

Pins

  • Blue translucent: dataset position
  • Blue: OSM position (centroid if polygon)
  • Green: new POI, can be dragged to a better position.

Fields

  • Yellow: proposed tag value substitution
  • Green: new tag

Goals

This audit aims to add POIs from the source data that are missing in OSM and to update existing OSM ones. In addition, you can take the chance to check:

  • name typos
  • position
  • duplicates

In case of doubt, "skip" postpones the audit of a POI; alternatively, a "fixme" will be inherited by the OSM candidate object.

Team Approach

Import will be managed by OSM user Cascafico; audit will be open to any OSM user accessing audit map.

Workflow

Step by step operations:

  1. dataset download
  2. OpenRefine operations
  3. conflation
  4. community audit
  5. audit freeze
  6. osm data generation
  7. final check thru JOSM

In case of import problems, the changesets involved will be reverted using an appropriate reverter.
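As an illustration of the "osm data generation" step, the sketch below writes audited POIs as new nodes in OSM XML so they can be opened in JOSM for the final check. It is an assumption about how that step could look, not the tool actually used: the function name, the generator string and the input structure (dicts with lat, lon and tags) are hypothetical. Negative ids follow the usual convention for objects that do not exist on the server yet.

```python
import xml.etree.ElementTree as ET


def build_osm_xml(pois):
    """Serialize POIs as new OSM nodes (negative ids) in OSM XML 0.6."""
    root = ET.Element("osm", version="0.6", generator="audit-export-sketch")
    for index, poi in enumerate(pois, start=1):
        node = ET.SubElement(
            root,
            "node",
            id=str(-index),  # negative id marks a not-yet-uploaded object
            lat=str(poi["lat"]),
            lon=str(poi["lon"]),
        )
        # Sorted keys keep the output stable between runs.
        for key, value in sorted(poi["tags"].items()):
            ET.SubElement(node, "tag", k=key, v=value)
    return ET.tostring(root, encoding="unicode")
```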

Data Preparation

The data is provided in a single Excel spreadsheet as a collection of point elements, one for each business. The data can be imported into OpenRefine.

Refining

Some normalizations require general refining operations. Below is a summary of the actions performed through OpenRefine operations:

  • split DENOMINAZIONE into name and operator at the preposition
  • convert names and operators to title case
  • split the address into addr:street and addr:housenumber
  • Nominatim geocoding
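For the geocoding step, a structured Nominatim search request can be built from the address fields. The sketch below only constructs the request URL and does not call the service; the function name is hypothetical, and the use of the public nominatim.openstreetmap.org endpoint is an assumption (the import may well rely on a different instance or on OpenRefine's own fetch).

```python
from urllib.parse import urlencode

# Assumed endpoint: the public Nominatim instance.
NOMINATIM_SEARCH = "https://nominatim.openstreetmap.org/search"


def nominatim_query_url(street, housenumber, city, postcode):
    """Build a structured Nominatim search URL for one dataset record."""
    params = {
        # Nominatim expects house number and street name in one "street" field.
        "street": f"{housenumber} {street}".strip(),
        "city": city,
        "postalcode": postcode,
        "country": "Italy",
        "format": "jsonv2",
        "limit": 1,
    }
    return f"{NOMINATIM_SEARCH}?{urlencode(params)}"
```

When querying the public instance, requests should be throttled according to its usage policy (at most one request per second).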

Tagging

Tag mapping for food businesses needs to evaluate two input fields: ATTIVITA', used as a rough filter, and MESTIERE.

Attivita'

The ATTIVITA' values, in frequency order, are:

  • RISTORAZIONE PER ASPORTO PIZZA AL TAGLIO KEBAB ROSTICCERIE
  • GELATIERI E PASTICCERI
  • PANE PRODOTTI DA FORNO
  • LAVORAZIONE PROSCIUTTO E CARNI
  • SPECIALITA' GASTRONOMICHE PRODUZIONE
  • CASEARI
  • BIRRA
  • VINO DISTILLATI E BEVANDE
  • MOLITORI MULINI


Mestiere

Each MESTIERE value is mapped to an OSM key=value pair through OpenRefine. Since business operators fill in MESTIERE as free-form text, values are standardized through OpenRefine; refer to the table below for the OpenRefine operations and profile files involved.

Tagging lookup table

MESTIERE | operations | profile
GELATIERI E PASTICCERI | gelatieri e pasticceri | ice_cream, pastry
RISTORAZIONE PER ASPORTO, PIZZA AL TAGLIO, KEBAB, ROSTICCERIE | fast_food | fast_food
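The lookup above can be sketched as a normalization followed by a dictionary lookup. This is a minimal illustration in Python rather than the actual OpenRefine operations; the function names and the normalization rules (uppercase, punctuation removed, whitespace collapsed) are assumptions chosen so that the comma and no-comma spellings of the same MESTIERE resolve to the same profile.

```python
import re

# MESTIERE (normalized) -> conflation profiles, taken from the lookup table.
PROFILES = {
    "GELATIERI E PASTICCERI": ["ice_cream", "pastry"],
    "RISTORAZIONE PER ASPORTO PIZZA AL TAGLIO KEBAB ROSTICCERIE": ["fast_food"],
}


def normalize_mestiere(value):
    """Uppercase, replace punctuation (except apostrophes) and collapse spaces."""
    cleaned = re.sub(r"[^\w\s']", " ", value.upper())
    return re.sub(r"\s+", " ", cleaned).strip()


def profiles_for(value):
    """Return the profiles for a MESTIERE value, or an empty list if unmapped."""
    return PROFILES.get(normalize_mestiere(value), [])
```

Values that return an empty list have no profile yet and would be left out of the current import session.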

Conflation

Conflation parameters are set in a business-specific profile, e.g. the ice_cream profile. Each profile defines:

  • Overpass-turbo query definitions
  • OSM Tags to be replaced
  • unmatched fixme value
  • max distance radius for matching
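To illustrate how the max distance radius could drive matching, the sketch below pairs a geocoded dataset record with the nearest OSM candidate inside the radius, using the haversine great-circle distance. It is not the conflation tool's actual algorithm: the function names and the record structure (dicts with lat and lon) are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6371000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def match_nearest(record, osm_candidates, max_distance_m):
    """Return the closest OSM candidate within max_distance_m, or None."""
    best = None
    best_distance = max_distance_m
    for candidate in osm_candidates:
        distance = haversine_m(
            record["lat"], record["lon"], candidate["lat"], candidate["lon"]
        )
        if distance <= best_distance:
            best, best_distance = candidate, distance
    return best
```

A record for which `match_nearest` returns None would be treated as unmatched, i.e. proposed as a new POI carrying the profile's fixme value.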

Upload

Data shall be uploaded manually through the JOSM editor. The dedicated upload account shall be CCIAAimport.

Changeset Tags

Changesets should be tagged with:

Progress

Mestiere | tag | audit | audit status | ready to upload | changeset
GELATIERI E PASTICCERI | ice_cream | Icecream in FVG | 100 % | ice_cream.osm | TBD