Slovak Libraries Import
Import of Slovak libraries from the directory of the SNK (Slovak National Library).
Goal
Add or complete publicly available libraries from the SNK directory in OSM. The process should be repeated whenever the SNK directory is updated (approximately twice a year).
License
We obtained email approval from SNK on 2016-11-02.
Input Data
The input data is an XLS file on the SNK website with the following columns:
- library name
- type
- status
- address: a string in a format such as "A. Sládkoviča 2" or "Horná Štubňa 472". If the name part of the string equals the city/village name, we assume the municipality does not use so-called "orientation" numbers, only conscription numbers, and in that case we do not use the OSM tags addr:street and addr:streetnumber.
- pobox
- town/village name
- subregion
- region
- web page
- phone
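The address rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name split_address and the regular expression are hypothetical and are not taken from the import script, and the sketch assumes the number is always the last whitespace-separated token of the address.

```python
import re


def split_address(address, town):
    """Split an SNK address string into (street, number).

    street is None when the text before the number is just the town name,
    i.e. the town is assumed to have no street names.
    Returns (None, None) when the string does not end with a number.
    """
    match = re.match(r"^(?P<street>.+?)\s+(?P<number>\d+\S*)$", address.strip())
    if match is None:
        return None, None  # unrecognised format, leave for manual review
    street = match.group("street")
    number = match.group("number")
    if street.casefold() == town.strip().casefold():
        return None, number  # village without street names
    return street, number
```

An address whose name part differs from the town name yields a street and a number; "Horná Štubňa 472" in the village Horná Štubňa yields only a number.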
Identifier
The SNK directory does not contain a simple integer-like identifier. The name of a library is not unique, and neither is the combination of library name and city/village. To match existing OSM libraries with SNK libraries, and to find the location of an SNK library, we match by address (bbox of the subregion, city, street, street number) and visually check the outcome of the script prior to inserting the data into OSM.
Import Lifecycle
OSM contains approximately 500 libraries in Slovakia (as of 2016-11-09).
The SNK directory contains 2349 records (as of 2016-11-09). For the purpose of the import we are interested in publicly available libraries:
- regional: 37
- city libraries: 104
- village libraries: 1601
Initial Import Phase - SNK to OSM sync
Step 1. Search via Overpass API
For each SNK entry we perform the following searches within the bbox of the selected subregion:
- A: we search by name among the approximately 500 libraries currently in OSM, i.e. we search for nodes and ways tagged name=snk.name and amenity=library
- B: we search by address:
  - if the address of the SNK entry contains a street name, we search OSM for nodes and ways tagged addr:city=snk.city, addr:street=snk.street, addr:streetnumber=snk.address_no
  - if the address of the SNK entry does not contain a street name (i.e. it is a village without street names), we search OSM for nodes and ways tagged addr:city=snk.city, addr:housenumber=snk.address_number
We search OSM only for nodes and ways (not relations), because only one existing library in Slovakia is mapped as a relation, and address points in Slovakia are mostly on ways (buildings) and, less often, on nodes.
We pair the OSM search results with SNK records.
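The two searches can be expressed as Overpass QL queries. The sketch below only builds the query text and does not call any server. The helper names (overpass_query, name_query, address_query) and the keys of the snk dictionary (name, city, street, number) are assumptions made for this illustration and do not come from the import script.

```python
def _escape(value):
    """Escape backslashes and double quotes for an Overpass QL string."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def overpass_query(tags, bbox):
    """Build a query for nodes and ways carrying all given tags inside bbox.

    bbox is (south, west, north, east), the order Overpass QL expects.
    """
    south, west, north, east = bbox
    filters = "".join(f'["{key}"="{_escape(value)}"]' for key, value in tags.items())
    area = f"({south},{west},{north},{east})"
    return (
        "[out:json];\n"
        "(\n"
        f"  node{filters}{area};\n"
        f"  way{filters}{area};\n"
        ");\n"
        "out center;"
    )


def name_query(snk, bbox):
    """Search A: existing libraries with the same name."""
    return overpass_query({"amenity": "library", "name": snk["name"]}, bbox)


def address_query(snk, bbox):
    """Search B: address points, with or without a street name."""
    if snk.get("street"):
        tags = {
            "addr:city": snk["city"],
            "addr:street": snk["street"],
            "addr:streetnumber": snk["number"],
        }
    else:
        tags = {"addr:city": snk["city"], "addr:housenumber": snk["number"]}
    return overpass_query(tags, bbox)
```

Using "out center;" makes Overpass return a center point for each way, which is convenient when a new library node has to be placed inside a building.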
Step 2. Prepare osc file
Rules:
- if an SNK record already exists in OSM as a node or way with amenity=library, we do not create a new node but enrich the existing one.
- for an SNK library, if we find in OSM only an address node/way (not tagged as a library), we do the following:
  - if it is a way (building), we create a new node at its center which holds the amenity=library tag.
  - if it is a node, we enrich it.
- the library name from SNK overrides the library name in OSM
- existing addr: tags in OSM are not overridden with SNK data. Missing addr: tags are filled in from SNK data.
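A minimal sketch of the rules above, written in Python for illustration only. The function names, the shape of the match dictionary (keys type and tags) and the returned action names are hypothetical. The rules do not say how tags other than name and addr: are merged, so the sketch assumes they are only filled in when missing.

```python
def plan_change(match):
    """Decide which osc operation a matched OSM element leads to.

    match is None, or a dict with 'type' ('node' or 'way') and 'tags'.
    """
    if match is None:
        return "skip"  # no location known, nothing can be proposed
    if match["tags"].get("amenity") == "library":
        return "modify"  # existing library: enrich it
    if match["type"] == "way":
        return "create"  # address on a building: new node at its center
    return "modify"  # address node: enrich it


def merge_tags(osm_tags, snk_tags):
    """Combine tags of an existing OSM element with tags derived from SNK."""
    merged = dict(osm_tags)
    for key, value in snk_tags.items():
        if key == "name":
            merged[key] = value  # SNK name overrides the OSM name
        else:
            # addr: tags are never overridden; applying the same to the
            # remaining tags is an assumption of this sketch
            merged.setdefault(key, value)
    merged["amenity"] = "library"
    return merged
```

For a newly created node the same merge can be applied with an empty dictionary in place of the OSM tags.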
Note: the Slovak version of this page contains a more detailed description of which particular addr: tags are used and of the rules for adopting them from SNK. We have not translated it into English because the rules are specific to Slovak addresses (non-Slovak readers will not be familiar with them). We use the same address schema as in the kapor import.
The script which implements the rules above is launched with the subregion as a parameter and produces the following two files:
<subregion>_matched_libraries_to_create_update_or_delete.osc
- for records from set A it contains <modify> osc operations.
- for records from set B it contains <create> and <modify> osc operations.
- each entry also carries the tag fixme=yes to force the user to review each entry manually.
<subregion>_nonmatched_amenity_libraries.osm
- contains those OSM nodes and ways from the given subregion bbox which are tagged amenity=library and were not found by our search.
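To make the structure of the .osc file concrete, here is a small sketch that builds an osmChange document with Python's standard xml.etree.ElementTree module. It is not the import script: the function name build_osc, the input dictionaries (keys type, attrs, tags, nds) and the generator value are hypothetical. A modified element has to carry its complete current state (id, version, and the nd references of a way), which the caller is assumed to supply.

```python
import xml.etree.ElementTree as ET


def build_osc(creates, modifies):
    """Return osmChange XML with <create> and <modify> blocks.

    Every element gets fixme=yes so that it must be reviewed by hand.
    """
    root = ET.Element("osmChange", version="0.6", generator="snk-sketch")
    for action, elements in (("create", creates), ("modify", modifies)):
        if not elements:
            continue  # leave out empty blocks
        block = ET.SubElement(root, action)
        for item in elements:
            attrs = {key: str(value) for key, value in item["attrs"].items()}
            element = ET.SubElement(block, item["type"], attrs)
            # keep the node references of a modified way
            for ref in item.get("nds", []):
                ET.SubElement(element, "nd", ref=str(ref))
            tags = dict(item["tags"], fixme="yes")
            for key, value in tags.items():
                ET.SubElement(element, "tag", k=key, v=value)
    return ET.tostring(root, encoding="unicode")
```

New nodes conventionally use negative placeholder ids (for example id -1), which is how editors such as JOSM mark objects that do not exist on the server yet.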
Step 3. Manual review and manual import
The user logs in to JOSM as SKlibraries_bot and opens both files.
The user walks through all the nodes and ways in the .osc file and, after visually confirming the tags and location, removes the tag fixme=yes. Thanks to the .osm file the user can see whether a library proposed for creation in the .osc file in fact already exists in OSM (i.e. it was not discovered by the search process). In such a case the user merges the two libraries manually.
Continuous Updates
In the initial import phase we expect to find a location or an existing entry for approximately 50% of SNK records. Most of the associations between SNK and OSM will be made via address. The import of Slovak address points (kapor) is still in progress; it proceeds subregion by subregion, and approximately 50% of subregions have been imported so far.
We will repeat the SNK import later as the kapor import progresses.
The SNK directory is updated once or twice a year. After each update we will run the update process again, which will allow us to track libraries that have become disused (the SNK directory contains information on disused libraries: snk.status = Zrušená || Stagnujúca). The location of an existing library is not expected to change in Slovakia (or it happens very rarely), so our workflow does not cover the situation where a library changes its address.
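The status check mentioned above amounts to a simple filter. The following is a hedged sketch only: the function name and the status key of the record dictionaries are assumptions, and what is done with the disused records afterwards is left open here.

```python
# Status values that mark a library as disused, as named in the text above.
DISUSED_STATUSES = {"Zrušená", "Stagnujúca"}


def split_by_status(records):
    """Split SNK records into (active, disused) lists by their status."""
    active, disused = [], []
    for record in records:
        status = (record.get("status") or "").strip()
        if status in DISUSED_STATUSES:
            disused.append(record)
        else:
            active.append(record)
    return active, disused
```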
Dedicated user
Data is imported under a dedicated OSM account named SKlibraries_bot.
Source code
https://github.com/Infolovec/mapakniznic.sk
Run the script via rake snk-to-osm <okres>, where <okres> is the subregion (not fully working, work in progress).
Contact
https://www.openstreetmap.org/user/Peter%20Vojtek/