User:Fanfouer/References
References are existing values, primarily those visible on the ground, read and used on a wide variety of features including roads, buildings, markers, etc.
From the very beginning, OSM has not assigned references of its own but has reused those defined by someone else.
They are useful to make links between OSM data and external data sets.
Mappers mostly add references with the ref=* key as they collect them in the field or carefully import public data.
However, values read in the field don't necessarily match how operators' databases are filled (or vice versa). Such differences may prevent reliable and relevant linkage between OSM and third parties.
More elaborate strategies have to be defined to preserve both the values read on the ground and those usable for data linkage. Some solutions are shown in this documentation.
Nonetheless, for the sake of Verifiability, it is important to keep ref=* values simple and readable on the ground (or confirmed by public data) before setting up the specific references described below.
Reference schemes
A reference is intended to clearly identify assets and provide a reliable way to distinguish them within a given extent. That extent is as important to define as the reference itself.
Values should be unique on the ground within the chosen extent and concise enough to ease both field and data maintenance and to avoid confusion.
Reference values may or may not carry meaningful information, mainly depending on the needs of operators and their ecosystem.
For instance, a bus stop reference may embed a city identifier, which allows quality checks on bus stop locations.
The OSM community doesn't build its own reference strategies but reuses many existing schemes.
References are existing values assigned by someone else, and OSM integrates them as they are.
This makes data linkable across different databases and enables more powerful consistency checks.
Building such schemes often relies on standards that provide guidelines consistent with what is stated above. The challenge is often to find which standard applies and to hope its documentation is not under copyright.
Such conventions often avoid spaces, accents and mixed lower and upper case, which prevents typing mistakes in repeated occurrences and improves reliability.
Finally, a similar syntax is defined for OSM key names and values.
For instance, ref=B152 fits these principles better than name=Foo bar's Swimming pool.
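The conventions above (no spaces, no accents, no mixed case) can be checked mechanically. The following is a minimal sketch in Python. The function name `convention_issues` is hypothetical, and the three checks are only one reading of the conventions mentioned here, not a rule enforced by OSM.

```python
import unicodedata


def convention_issues(value: str) -> list[str]:
    """Return the conventions from the text that a reference value breaks."""
    issues = []
    if any(ch.isspace() for ch in value):
        issues.append("contains spaces")
    # Decompose characters so that accents appear as separate combining marks.
    decomposed = unicodedata.normalize("NFD", value)
    if any(unicodedata.combining(ch) for ch in decomposed):
        issues.append("contains accents")
    letters = [ch for ch in value if ch.isalpha()]
    if any(ch.islower() for ch in letters) and any(ch.isupper() for ch in letters):
        issues.append("mixes lower and upper case")
    return issues
```

Under these assumptions, `convention_issues("B152")` returns an empty list, while `convention_issues("Foo bar's Swimming pool")` reports both spaces and mixed case.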
Dealing with completion
As OpenStreetMap intends to document an ever-changing world, reaching completeness on a particular topic is hard, and so is gathering references.
Third parties may want to compare their asset management repositories with OSM and find differences, but this can be tough due to the lack of references both on the ground and in the database.
Convincing companies or public operators to number their assets and display those numbers in public space is a very long-term effort, so data linkage should be possible even when references are missing.
This documentation intends to provide solutions that rely primarily on references and use the OSM id as a fallback when nothing else is available.
Obviously, such usage of OSM data, including comparison checks and linkage in databases, should be done with all due respect to the ODbL licence.
Track create/updates from OSM
Here is a proposed process to maintain efficient linkage between OSM and a third party database, propagating the geometry and tags of created and updated features.
Any appropriate specific reference is used to make a link between OSM and the third party database.
If no reference is known, the OSM id is used as a fallback. Known limitations of relying on OSM identifiers are mitigated here by tracking deletions, described below.
The key point is to conflate features properly, especially ways and polygons, to be sure the association between the OSM identifier and the external repository is accurate.
All procedures mentioned here can be adapted to the specific case you intend to address, particularly the precision thresholds or the detection patterns in geometries.
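As an illustration of the link-then-propagate logic, here is a minimal Python sketch. The names `link_key` and `propagate`, the feature dictionaries and the in-memory `external` store are all assumptions made for the example. The conflation step is deliberately left out, since it depends on the geometries and thresholds of each use case.

```python
def link_key(feature: dict, ref_key: str = "ref") -> str:
    """Prefer the specific reference; fall back to the OSM id when it is unknown."""
    ref = feature["tags"].get(ref_key, "").strip()
    if ref:
        return f"ref={ref}"
    return f"osm={feature['type']}/{feature['id']}"


def propagate(osm_features: list[dict], external: dict[str, dict]) -> tuple[list[str], list[str]]:
    """Copy geometry and tags into the external store and report what changed."""
    created, updated = [], []
    for feature in osm_features:
        key = link_key(feature)
        record = {"geometry": feature["geometry"], "tags": feature["tags"]}
        if key not in external:
            created.append(key)      # feature not linked yet
        elif external[key] != record:
            updated.append(key)      # geometry or tags differ from the stored copy
        external[key] = record
    return created, updated
```

Passing another key as `ref_key` (a country-specific one, for example) lets the same sketch link on a more specific reference than ref=*.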
Track deletions from OSM
Tracking deletions follows the same logic as above.
Any appropriate reference is used to make a link between OSM and the third party database.
If no reference is known, the OSM identifier is used as a fallback.
All procedures mentioned here can be adapted to the specific case you intend to address.
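Deletion tracking can be sketched as a comparison between the links already stored and the keys found in a fresh OSM extract. In this Python sketch, the function name `missing_links` and the shape of `links` (link key mapped to an external identifier) are assumptions. A link built on an OSM id that disappears may only mean the feature was redrawn under a new id, so the result is better read as a list of candidates for review than as confirmed deletions.

```python
def missing_links(links: dict[str, str], current_keys: set[str]) -> dict[str, str]:
    """Return the stored links whose key no longer appears in the current OSM extract."""
    return {
        key: external_id
        for key, external_id in links.items()
        if key not in current_keys
    }
```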
Country specific references
There is wide use of ref:* as a namespace prefix. Since 2011, a country code has been included, as in ref:FR:*, where keys are clearly country-specific. ref=* is mostly used for values readable on the ground. More specific references can document more elaborate strategies, including formal or meaning-carrying values deduced from what is read on the ground.
For instance, let's say all fire hydrants of a country have a national reference, but one can only read the local reference on the ground, bound to the city the hydrants are installed in. National references should be deduced from local ones.
ref=* will hold raw values, basically what is read on the ground (not nationally unique), and ref:country:hydrant=* will hold deduced, cleaned, validated and nationally unique values.
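The hydrant example can be sketched in Python as follows. The composition rule (city code, hyphen, cleaned local reference) is purely hypothetical, as are the function name `national_hydrant_tags` and the sample values. A real national scheme would define its own rule.

```python
def national_hydrant_tags(local_ref: str, city_code: str, country: str) -> dict[str, str]:
    """Keep the raw value in ref=* and add a deduced national reference.

    Hypothetical rule: the national value is the city code, a hyphen, then the
    local reference with whitespace removed and letters upper-cased.
    """
    cleaned = "".join(local_ref.split()).upper()
    return {
        "ref": local_ref,  # raw value, exactly as read on the ground
        f"ref:{country}:hydrant": f"{city_code}-{cleaned}",
    }
```

With these made-up inputs, `national_hydrant_tags("pi 12", "12345", "FR")` keeps ref=pi 12 and adds ref:FR:hydrant=12345-PI12.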
You can find more local repositories of specific refs by following the links below.
See also
- Verifiability principle