Import Unocha Pcode
Goals
OpenStreetMap takes part in the 2014 United Nations Mission for Ebola Emergency Response (UNMEER), providing maps and services to the humanitarian community deployed in West Africa. Settlement place names are a key element of the response: they help the thousands of workers in the field quickly locate people at risk of contamination. OpenStreetMap settlement places (nodes tagged place=*) are the official repository for this Mission.
The goal is to import into OpenStreetMap the unocha:pcode ID that UN OCHA provides for each OSM place node (node[place]). The dataset contains OSM place nodes for Guinea (8,221), Liberia (16,725) and Sierra Leone (9,137).
This import helps the humanitarian organizations fighting the spread of the Ebola virus in West Africa to report using a unique ID for each settlement place.
Schedule
- Preparation, discussion - September 2014.
- Import - November 2014
Import Data
Data description
A variable unocha:pcode (UN unique ID) is added to each node[place] for Guinea, Liberia and Sierra Leone.
Background
ODbL Compliance verified: YES. See the terms of use:
Terms of Use (UNOCHA P-Code)
UNOCHA is willing to share P-Code for Guinea, Liberia and Sierra Leone upon the following data usage terms:
Warranty
The dataset is made available on an "as is" condition with no representation or guarantee made concerning the accuracy, currency, or completeness of the information provided. The depiction and use of P-Code in this dataset may include inaccuracies or errors and therefore it does not necessarily imply official endorsement or acceptance by the United Nations and UNOCHA as its entity.
Condition of Use
You are free to access, use, distribute and adapt our data as user, as long as you provide attribution or credit to UNOCHA.
Import Type
An extract of the OSM database for nodes[place] for Guinea, Liberia and Sierra Leone (CSV file) on which a variable unocha:pcode is added.
Data Preparation
Upon receipt of the CSV file from UNOCHA, a new extract of the OSM database for the three countries is made. The CSV file is merged with this new extract based on the OSM ID, and only for nodes that are still visible in OSM.
Validation and upload of the OSM file are done through JOSM, which handles conflicts caused by edits made to the database after the extract (a delay of a few hours at most).
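The merge step described above can be sketched as follows. This is a minimal illustration, not the script actually used for the import: the CSV column names (osm_id, pcode), the in-memory layout of the extract and the P-Code values are all hypothetical placeholders.

```python
import csv
import io

def merge_pcodes(pcode_rows, extract_nodes):
    """Attach unocha:pcode to nodes of the fresh extract, matched on OSM ID.

    pcode_rows: iterable of dicts with hypothetical columns 'osm_id' and 'pcode'.
    extract_nodes: dict mapping OSM node ID -> {'visible': bool, 'tags': dict}.
    Returns (merged, skipped): updated copies of the matched nodes, and the
    IDs that are missing from the extract or no longer visible.
    """
    merged, skipped = {}, []
    for row in pcode_rows:
        osm_id = int(row["osm_id"])
        node = extract_nodes.get(osm_id)
        if node is None or not node["visible"]:
            skipped.append(osm_id)  # deleted or unknown node: do not import
            continue
        tags = dict(node["tags"])   # copy, so the extract itself is untouched
        tags["unocha:pcode"] = row["pcode"]
        merged[osm_id] = {**node, "tags": tags}
    return merged, skipped

# Made-up sample data; real P-Codes come from the UNOCHA file.
sample_csv = io.StringIO("osm_id,pcode\n101,XX0001\n102,XX0002\n103,XX0003\n")
extract = {
    101: {"visible": True, "tags": {"place": "village", "name": "Kumah"}},
    102: {"visible": False, "tags": {"place": "hamlet"}},
}
merged, skipped = merge_pcodes(csv.DictReader(sample_csv), extract)
```

Keeping the skipped IDs in a separate list makes it easy to report back which P-Codes could not be attached because the node was deleted after UNOCHA prepared the file.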
Data Reduction & Simplification
The osmChange data is in OSM format with a unocha:pcode=* tag added. No other reduction or simplification is applied.
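As an illustration of what a modified node could look like in a JOSM-readable .osm file, here is a small sketch. It assumes the JOSM file convention of marking changed objects with action="modify"; the function name, generator string, node ID, coordinates and P-Code value are made up for the example.

```python
import xml.etree.ElementTree as ET

def node_to_osm_xml(node_id, version, lat, lon, tags):
    """Serialize one modified node as a minimal JOSM-style .osm document."""
    root = ET.Element("osm", version="0.6", generator="pcode-import-sketch")
    node = ET.SubElement(
        root, "node",
        id=str(node_id), version=str(version),
        lat=f"{lat:.7f}", lon=f"{lon:.7f}",
        action="modify",  # JOSM convention for an edited existing object
    )
    for key, value in sorted(tags.items()):
        ET.SubElement(node, "tag", k=key, v=value)
    return ET.tostring(root, encoding="unicode")

# Hypothetical node: existing tags are kept, only unocha:pcode is added.
xml_text = node_to_osm_xml(
    101, 3, 6.5, -9.5,
    {"place": "village", "name": "Kumah", "unocha:pcode": "XX0001"},
)
```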
Tagging Plans
The following tag is added to each node: unocha:pcode=*
Changeset Tags
The following changeset tags are used:
- comment=UNMIL Liberia places import, #hotosm-task-652
- created_by=JOSM/version
- source=UNMIL
- import=yes
- url=https://wiki.openstreetmap.org/wiki/Import_Liberia_UNMIL_Places
Data Transformation
The original data is in shapefile format. It will be opened in JOSM with the OpenData plugin, and its fields will then be changed or deleted according to the tagging schema.
Data Merge Workflow
Team Approach
Import will be undertaken by experienced OSM mappers, using an import specific OSM user account.
References
The import is being discussed in the import list and in the HOT list. There isn't a Liberian OSM community yet.
Workflow
A separate wiki page with the detailed workflow has been created so that mappers follow a consistent import process.
Reverse plan
In case of any trouble, the JOSM Reverter plugin will be used.
Conflation
As of 13th September 2014, there are 7,510 place nodes and 113 place ways in the OSM database within Liberia. So they amount to nearly half the number of UNMIL nodes, and many of them will be duplicates of the UNMIL nodes that we will have to conflate manually.
The location of the UNMIL nodes is correct for a majority of the nodes. Generally speaking, when merging duplicated nodes, we will take the UNMIL nodes locations as reference.
First of all, we will check that the place name is spelled correctly and respects the cartographic writing conventions.
With very few exceptions, each word of the place name must start with a capital letter followed by lowercase letters, as in Kumah, Gorbo Jellue and Bitter Ball Camp. Double spaces should be removed too.
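A small checker can help spot names that break this convention before they are reviewed by hand. The sketch below is only an aid and the function name is hypothetical; because the rule has exceptions, it flags words for review instead of rewriting them, and it only looks at ASCII letters.

```python
import re

def capitalization_issues(name):
    """Return the parts of `name` that break the 'Initial capital' convention."""
    issues = []
    if "  " in name:
        issues.append("double space")
    for word in name.split():
        letters = re.sub(r"[^A-Za-z]", "", word)  # ignore digits and punctuation
        if letters and letters != letters.capitalize():
            issues.append(word)                   # flag for manual review
    return issues

ok = capitalization_issues("Bitter Ball Camp")
bad = capitalization_issues("BITTER ball Camp")
spaced = capitalization_issues("Kumah  Town")
```

Hyphenated or otherwise unusual names will also be flagged, which is acceptable here since every flagged name is checked by a mapper anyway.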
There are around 300 nodes that have a slash "/" in their names. In this case, we will make sure there is a space before and after the slash.
There are 39 nodes that include an & in their name=* tag. In many cases, the & means "and" and should be changed to "and", but even if we leave it as &, we have to make sure there is a space on both sides.
There are 196 nodes with a #, most of them in the name=* tag, but some others have it in the is_in:clan=* tag, like Marbo #2 and Neezonnie #1. Most probably the meaning here is "number". We will leave the # unchanged, but ensure there is a space before the # and none after. So Marbo #2 would be correct, and Marbo # 2 incorrect.
There are 433 nodes with parentheses. The rule here is a space before and no space after the (, and no space before and a space after the ). Edward Peal Camp (Old) is correct, but Honeyahun(4) and Bennehglay( Leadopoep would both be incorrect (the latter lacks the closing parenthesis too).
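The spacing rules for /, &, # and parentheses are mechanical enough to express as regular expressions. The following sketch (function names are hypothetical) normalizes the spacing and separately reports unbalanced parentheses, which still need a human decision. It does not replace & with "and", since that choice depends on the name.

```python
import re

def normalize_punctuation(name):
    """Apply the spacing rules for '/', '&', '#', '(' and ')' to a place name."""
    name = re.sub(r"\s*/\s*", " / ", name)       # one space on both sides of '/'
    name = re.sub(r"\s*&\s*", " & ", name)       # one space on both sides of '&'
    name = re.sub(r"\s*#\s*", " #", name)        # space before '#', none after
    name = re.sub(r"\s*\(\s*", " (", name)       # space before '(', none after
    name = re.sub(r"\s*\)\s*", ") ", name)       # none before ')', space after
    return re.sub(r"\s{2,}", " ", name).strip()  # collapse runs of spaces, trim

def has_unbalanced_parentheses(name):
    """True when opening and closing parentheses do not match in number."""
    return name.count("(") != name.count(")")

fixed_hash = normalize_punctuation("Marbo # 2")
fixed_paren = normalize_punctuation("Honeyahun(4)")
needs_review = has_unbalanced_parentheses("Bennehglay( Leadopoep")
```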
Some names seem to be truncated at the end. For example, those that end with Village are sometimes truncated to Vi or Villa. In these cases, we should complete those names, and expand any abbreviation in general.
In case the user is not confident with the name of a node, s/he will add a fixme=Confirm name tag to the place node along with a note=* tag to indicate why s/he is not confident about the name.
The UNMIL file contains some pairs of identically named nodes that lie close to each other. They seem to differ in the clan they belong to. In this case, we will delete one of the nodes, place the other in the correct position (if we can) and delete the is_in:clan=* tag. In other cases, we have the same kind of pair but with different names. In this case, we will delete one of the nodes, place the remaining node in the correct position (if we can) and transfer the name of the deleted node as alt_name=* to the remaining node. In all these cases, we will keep the unmil:id=* values of the two nodes (the node we keep and the node we delete) and place them separated by a ; in the unmil:id=* of the resulting node, for example unmil:id=LBR12036;LBR12137.
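The tag handling for such pairs can be summarized in a few lines. This is a sketch of the rules above working on plain tag dictionaries; the function name and the sample names and clan values are hypothetical, and repositioning the node remains a manual step in JOSM.

```python
def merge_duplicate_pair(kept, removed):
    """Return the tags of the surviving node after merging two UNMIL duplicates.

    kept / removed: tag dicts of the node we keep and the node we delete.
    """
    tags = dict(kept)
    # Both unmil:id values are preserved, separated by ';'.
    tags["unmil:id"] = ";".join([kept["unmil:id"], removed["unmil:id"]])
    if removed.get("name") == kept.get("name"):
        # Same name, different clan: drop the clan tag.
        tags.pop("is_in:clan", None)
    elif removed.get("name"):
        # Different names: keep the other name as an alternative.
        tags["alt_name"] = removed["name"]
    return tags

same_name = merge_duplicate_pair(
    {"name": "Kumah", "is_in:clan": "Clan A", "unmil:id": "LBR12036"},
    {"name": "Kumah", "is_in:clan": "Clan B", "unmil:id": "LBR12137"},
)
other_name = merge_duplicate_pair(
    {"name": "Kumah", "unmil:id": "LBR12036"},
    {"name": "Gorbo Jellue", "unmil:id": "LBR12137"},
)
```

The sketch overwrites any alt_name already present on the kept node; a real workflow would have to decide how to combine several alternative names.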
We won't make any change in the name=* of any node with its name ending in Village, Town, Camp, Farm, etc. Bear in mind that many hamlets are named with Village at their end, for example, and this should never make us change the place=* value.
In case of doubt, we won't make any change to the name and we will add a fixme=* tag or note=* tag, and leave a remark on the comments window that pops up when marking the task as done in the HOT Tasking Manager.
In case a node's location is uncertain, we will leave the node in that position and add a fixme=Location approximate tag to the node.
GNS nodes
Most of the place nodes we will encounter in the OSM database are from the GNS database (5,341 out of 7,510).
When we find a duplicated GNS place node already in the OSM database, we will proceed as follows:
1. If the name of the GNS node is the same as the UNMIL one, we will add the GNS source to the source tag of the UNMIL node (source=UNMIL;GNS) and then merge both nodes with the Merge tool (M), moving the resulting node only if needed.
2. If the GNS node name is different from its UNMIL counterpart, we will move the GNS name=* to alt_name=* in the GNS node, add the source:alt_name=GNS tag to that node, and then merge both nodes with the Merge tool, moving the resulting node only if needed.
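The outcome of the two cases above can be sketched on tag dictionaries as follows. The function name and the sample names are hypothetical, and the sketch only models the tags named in the two rules; the actual merge in JOSM also carries over any other tags of the GNS node.

```python
def merge_gns_into_unmil(unmil_tags, gns_tags):
    """Return the tags expected on the merged node for a GNS/UNMIL duplicate."""
    merged = dict(unmil_tags)
    gns_name = gns_tags.get("name")
    if gns_name == unmil_tags.get("name"):
        # Case 1: same name, so GNS is recorded as an additional source.
        sources = merged.get("source", "UNMIL").split(";")
        if "GNS" not in sources:
            sources.append("GNS")
        merged["source"] = ";".join(sources)
    elif gns_name:
        # Case 2: different name, so the GNS name becomes the alternative name.
        merged["alt_name"] = gns_name
        merged["source:alt_name"] = "GNS"
    return merged

same = merge_gns_into_unmil(
    {"name": "Kumah", "source": "UNMIL"}, {"name": "Kumah"}
)
different = merge_gns_into_unmil(
    {"name": "Kumah", "source": "UNMIL"}, {"name": "Kuma Town"}
)
```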
Other non-GNS nodes
If the node has the source=survey tag, we will then keep the OSM node name=* as main name, and put the UNMIL one as alt_name=*. For other cases, use common sense, and write a fixme=* or note=* tag, if needed.
In case of serious doubt, we will leave the node in its original position, add a fixme=* tag to it and leave a comment in the comments box when saving the task in the Tasking Manager, so that it will be carefully reviewed during the validation process.
The nodes inside big towns and cities have to be retagged as place=suburb or place=neighbourhood, depending on their extent.
Nodes that are refugee camps should have the refugee=yes tag added.
For any further questions, you can contact the HOT mailing list.