Import Liberia UNMIL Places
Goals
The goal is to import the place names released by the United Nations Mission in Liberia (UNMIL) into OpenStreetMap. The dataset contains 14,020 place nodes within Liberia, which will have to be merged with the 7,623 place objects already in the OSM database for that country.
This import will help the humanitarian organizations that are fighting the spread of the Ebola virus in West Africa.
To understand the import better, note that Liberia is divided into 15 counties, which are further divided into a total of 90 districts, and those districts are in turn divided into clans.
Schedule
- Preparation, discussion - due to start on 13 September 2014.
- Import - expected to start any time after the community has resolved any issues, doubts or concerns about this import.
Import Data
Data description
The original dataset consists of a shapefile with 14,020 place names. It contains several field attributes. For the import we are mainly interested in the name field, and probably the UNMIL object id. You can download the data here.
Background
ODbL Compliance verified: YES. See the terms of use:
Terms of Use (UNMIL-GIS Populated Places Dataset)
UNMIL GIS Unit is willing to share geographic datasets on populated places in Liberia upon the following data usage terms:
Warranty
The dataset is made available on an "as is" condition with no representation or guarantee made concerning the accuracy, currency, or completeness of the information provided. The depiction and use of place names and related information in this dataset may include inaccuracies or errors and therefore it does not necessarily imply official endorsement or acceptance by the United Nations and UNMIL as its entity.
Condition of Use
You are free to access, use, distribute and adapt our data as user, as long as you provide attribution or credit to UNMIL.
Import Type
The import will be done manually through one job in the HOT Tasking Manager (TM). Each task of the job will contain only the place nodes that lie within the task tile, similar to the approach used for the Central African Republic UNICEF import. For each task, we will first check the UNMIL nodes to be imported, and then compare the data already in the OSM database against the UNMIL data in order to merge the UNMIL data into the OSM database. The OSM mappers who contribute to this import job will follow a detailed workflow to accomplish this.
Data Preparation
Data Reduction & Simplification
The data is originally in shapefile format. The file will be opened in JOSM with the OpenData plugin; the usable fields will be retagged and the unusable ones deleted. A place=hamlet tag and a source=UNMIL tag will be added to every node. No other reduction or simplification will be applied.
Tagging Plans
We propose to retag each place node with the following schema:
UNMIL field | Description | proposed OSM tag | Remarks |
---|---|---|---|
ADMINLVL | Values are 01, 02, 03 and 05. 01 is for country capital (Monrovia), 02 is for county capitals (15 in total), 03 is for district capitals (90 in total) and 05 is for the rest of the nodes. | We don't find it useful for this import, so we propose to ignore it. | This field could be used for tagging the capitals with capital=* and admin_level=* tags, if not yet tagged. |
CLNAME | Name of the clan the node belongs to | is_in:clan=* | The clans boundaries aren't yet in OSM |
CNAME | Name of the county the node belongs to | We ignore it | Counties boundaries are all mapped for Liberia. This field could be used to eventually correct the counties boundaries. |
DNAME | District name to which the node belongs | We ignore it | Districts boundaries are all mapped for Liberia. This field could be used to eventually correct the districts boundaries. |
FNAME | It's exactly the same as NAME_WS for all the 14,020 nodes. | name=* | |
LOC_TEMP_C | UNMIL object id | unmil:id=* | |
NAME_WS | It's exactly the same as FNAME for all the 14,020 nodes. | We ignore it | |
OBJECTID | Another object id | We ignore it | |
X_UTM | Lon coordinates | We ignore it | |
Y_UTM | Lat coordinates | We ignore it | |
To these tags, we will add the following two:
- place=hamlet. 83% of the current place nodes in the OSM database for Liberia are tagged as hamlets, so we expect most of the imported nodes to take this value as well. For those that don't, users will manually upgrade them to place=village, place=neighbourhood, place=town, place=suburb or place=city, or downgrade them to place=isolated_dwelling. If we can't tell which kind of place it is (due to lack of high resolution aerial imagery, presence of clouds, etc.), we will retag it as place=unknown and write a fixme=* tag.
- source=UNMIL.
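The tagging schema above can be sketched as a small mapping function. This is only an illustration of the proposed schema, not the tooling used for the import (the actual retagging is done in JOSM). The function name `retag` and the sample record are hypothetical; the field names and tag keys come from the table.

```python
# Fields kept from the UNMIL shapefile and the OSM key each one maps to.
# All other fields (ADMINLVL, CNAME, DNAME, NAME_WS, OBJECTID, X_UTM, Y_UTM)
# are ignored, as proposed in the table.
FIELD_TO_TAG = {
    "FNAME": "name",
    "LOC_TEMP_C": "unmil:id",
    "CLNAME": "is_in:clan",
}


def retag(record):
    """Return the proposed OSM tags for one UNMIL record (dict of field -> value)."""
    tags = {}
    for field, key in FIELD_TO_TAG.items():
        value = str(record.get(field) or "").strip()
        if value:  # skip empty attributes instead of writing empty tags
            tags[key] = value
    # Default tags added to every imported node; place=* is reviewed by hand later.
    tags["place"] = "hamlet"
    tags["source"] = "UNMIL"
    return tags


# Hypothetical sample record, for illustration only.
sample = {"FNAME": "Kumah", "LOC_TEMP_C": "LBR12036", "CLNAME": "Marbo #2",
          "ADMINLVL": "05", "OBJECTID": 1}
print(retag(sample))
```

The place=hamlet value is only a starting point; mappers still upgrade or downgrade it manually as described above.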
Changeset Tags
We will use the following changeset tags:
- comment=UNMIL Liberia places import, #hotosm-task-652
- created_by=JOSM/version
- source=UNMIL
- import=yes
- url=https://wiki.openstreetmap.org/wiki/Import_Liberia_UNMIL_Places
Data Transformation
The original data is in shapefile format. It will be opened in JOSM with the OpenData plugin, and the fields will then be retagged or deleted according to the tagging schema.
Data Merge Workflow
Team Approach
The import will be undertaken by experienced OSM mappers, using an import-specific OSM user account.
References
The import is being discussed in the import list and in the HOT list. There isn't a Liberian OSM community yet.
Workflow
A separate wiki page with the detailed workflow has been created so that mappers can follow a consistent import process.
Revert plan
In case of any trouble, the JOSM Reverter plugin will be used.
Conflation
As of 13th September 2014, there are 7,510 place nodes and 113 place ways in the OSM database within Liberia. Together they amount to a little over half the number of UNMIL nodes (7,623 against 14,020), and many of them will be duplicates of the UNMIL nodes that we will have to conflate manually.
The location of the UNMIL nodes is correct for a majority of the nodes. Generally speaking, when merging duplicated nodes, we will take the UNMIL nodes locations as reference.
First of all, we will check that the name of the place is spelled correctly and respects the cartographic writing conventions.
With very few exceptions, each word of the place name must start with a capital letter followed by lowercase letters, for example Kumah, Gorbo Jellue and Bitter Ball Camp. Double spaces should be removed too.
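As a rough sketch of the capitalization rule, the function below collapses repeated spaces and capitalizes each word. It is an assumption about how the rule could be automated as a first pass, not part of the agreed workflow, and the function names are hypothetical. Because the rule has a few exceptions, its output should be treated as a suggestion for the mapper to review.

```python
def _capitalize_word(word):
    """Upper-case the first letter of a word and lower-case what follows it.

    Leading punctuation such as "(" or "#" is kept as-is.
    """
    for i, ch in enumerate(word):
        if ch.isalpha():
            return word[:i] + ch.upper() + word[i + 1:].lower()
    return word  # no letters at all, e.g. "#2"


def normalize_case(name):
    """Collapse repeated whitespace and capitalize every word of a place name."""
    return " ".join(_capitalize_word(w) for w in name.split())


print(normalize_case("GORBO  JELLUE"))
print(normalize_case("bitter ball camp"))
```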
There are around 300 nodes that have a slash "/" in their names. In this case, we will make sure there is a space before and after the slash.
There are 39 nodes that include an & in their name=* tag. In many cases, the & means "and" and should be changed to and, but even if we leave it as &, we have to make sure it has a space on both sides.
There are 196 nodes with a #, most of them in the name=* tag, but some others have it in the is_in:clan=* tag, like Marbo #2 and Neezonnie #1. Most probably the meaning here is number. We will leave the # unchanged, but ensure there is a space before the # and none after. So Marbo #2 would be correct, and Marbo # 2 incorrect.
There are 433 nodes with parentheses. The rule here is a space before and no space after the (, and no space before and a space after the ). Edward Peal Camp (Old) is correct, but Honeyahun(4) and Bennehglay( Leadopoep would both be incorrect (the latter lacks the closing parenthesis too).
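The spacing rules for /, &, # and parentheses could be checked with a few regular expressions. The following is a minimal sketch under that assumption; `fix_spacing` and `has_unbalanced_parentheses` are hypothetical helper names, and a missing parenthesis is only reported, never guessed.

```python
import re


def fix_spacing(name):
    """Apply the spacing conventions for "/", "&", "#", "(" and ")"."""
    s = re.sub(r"\s*([/&])\s*", r" \1 ", name)  # one space on each side of / and &
    s = re.sub(r"\s*#\s*", " #", s)              # space before #, none after
    s = re.sub(r"\s*\(\s*", " (", s)             # space before (, none after
    s = re.sub(r"\s*\)\s*", ") ", s)             # none before ), space after
    return re.sub(r"\s+", " ", s).strip()        # tidy any doubled or outer spaces


def has_unbalanced_parentheses(name):
    """Report names such as "Bennehglay( Leadopoep" that need a manual fix."""
    return name.count("(") != name.count(")")


print(fix_spacing("Marbo # 2"))
print(fix_spacing("Honeyahun(4)"))
print(has_unbalanced_parentheses("Bennehglay( Leadopoep"))
```

Whether an & should become and is a judgement the sketch deliberately leaves to the mapper.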
Some names seem to be truncated at the end. For example, names that end with Village are sometimes truncated to Vi or Villa. In these cases, we should complete the name, and in general expand any abbreviation.
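Candidates for a truncated Village suffix could be flagged for review with a simple pattern. This is only a sketch: the helper name is hypothetical, it covers only the Vi and Villa endings mentioned above, and a match is a hint for the mapper rather than proof of truncation. The sample names are made up for illustration.

```python
import re

# Matches names whose last word is exactly "Vi" or "Villa".
TRUNCATED_VILLAGE = re.compile(r"\b(Vi|Villa)$")


def looks_truncated(name):
    """Return True if the name ends with a likely truncation of "Village"."""
    return bool(TRUNCATED_VILLAGE.search(name.strip()))


for candidate in ("Kumah Vi", "Kumah Villa", "Kumah Village"):
    print(candidate, looks_truncated(candidate))
```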
If the user is not confident about the name of a node, s/he will add a fixme=Confirm name tag to the place node, along with a note=* tag explaining why s/he is not confident about the name.
The UNMIL file contains some pairs of identically named nodes that lie close to each other. They seem to differ only in the clan they belong to. In this case, we will delete one of the nodes, place the other in the correct position (if we can) and delete the is_in:clan=* tag. In other cases, we have a similar pair of nodes but with different names. In this case, we will delete one of the nodes, place the remaining node in the correct position (if we can) and transfer the name of the deleted node as alt_name=* to the remaining node. In all these cases, we will keep the unmil:id=* values of the two nodes (the node we keep and the node we delete) and place them separated by a ; in the unmil:id=* of the resulting node, for example unmil:id=LBR12036;LBR12137.
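The handling of such node pairs can be sketched as follows. The function names are hypothetical and the sample tags are for illustration only; which node to keep and where to place it remain manual decisions. The sketch assumes that is_in:clan=* is dropped only in the identically named case, since the text says nothing about it for differently named pairs.

```python
def merge_ids(kept_id, deleted_id):
    """Join unmil:id values with ";", keeping order and dropping repeats."""
    ids = []
    for value in (kept_id, deleted_id):
        for part in value.split(";"):
            part = part.strip()
            if part and part not in ids:
                ids.append(part)
    return ";".join(ids)


def merge_pair(kept, deleted):
    """Return the tags of the node that remains after merging a close pair."""
    merged = dict(kept)
    merged["unmil:id"] = merge_ids(kept["unmil:id"], deleted["unmil:id"])
    if kept["name"] == deleted["name"]:
        # Same name, different clans: the clan can no longer be trusted.
        merged.pop("is_in:clan", None)
    else:
        # Different names: keep the other name as an alternative.
        merged["alt_name"] = deleted["name"]
    return merged


print(merge_ids("LBR12036", "LBR12137"))
```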
We won't make any change in the name=* of any node with its name ending in Village, Town, Camp, Farm, etc. Bear in mind that many hamlets are named with Village at their end, for example, and this should never make us change the place=* value.
In case of doubt, we won't make any change to the name and we will add a fixme=* tag or note=* tag, and leave a remark on the comments window that pops up when marking the task as done in the HOT Tasking Manager.
If a node's location is uncertain, we will leave the node in that position and add a fixme=Location approximate tag to the node.
GNS nodes
Most of the place nodes we will encounter in the OSM database are from the GNS database (5,341 out of 7,510).
When we find a duplicated GNS place node already in the OSM database, we will proceed as follows:
1. If the name of the GNS node is the same as the UNMIL one, we will add the GNS source to the source tag of the UNMIL node (source=UNMIL;GNS) and then merge both nodes with the Merge tool (M), moving the resulting node only if needed.
2. If the GNS node name is different from its UNMIL counterpart, we will move the GNS name=* to alt_name=* in the GNS node, add the source:alt_name=GNS tag to that node, and then merge both nodes with the Merge tool, moving the resulting node only if needed.
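The two cases above determine the name and source tags of the merged node. A minimal sketch of that outcome is shown below; the function name is hypothetical, the names are compared exactly (a mapper would also account for spelling variants), and all other tags are left to the JOSM merge dialog. The sample names are for illustration only.

```python
def gns_merge_tags(unmil_name, gns_name):
    """Return the name and source tags expected after merging a GNS duplicate."""
    if unmil_name == gns_name:
        # Case 1: same name, so both sources back the same name=* value.
        return {"name": unmil_name, "source": "UNMIL;GNS"}
    # Case 2: the UNMIL name stays as name=*, the GNS name becomes alt_name=*.
    return {
        "name": unmil_name,
        "alt_name": gns_name,
        "source": "UNMIL",
        "source:alt_name": "GNS",
    }


print(gns_merge_tags("Kumah", "Kumah"))
print(gns_merge_tags("Kumah", "Kuma"))
```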
Other non-GNS nodes
If the node has the source=survey tag, we will then keep the OSM node name=* as main name, and put the UNMIL one as alt_name=*. For other cases, use common sense, and write a fixme=* or note=* tag, if needed.
In case of serious doubt, we will leave the node in its original position, add a fixme=* tag to it and leave a comment in the comments box when saving the task in the Tasking Manager, so that it will be carefully reviewed during the validation process.
Nodes inside big towns and cities have to be retagged as place=suburb or place=neighbourhood, depending on their extent.
Nodes that are refugee camps should have the refugee=yes tag added.
For any further questions, you can contact the HOT mailing list.