Automated edits/TTmechanicalupdates/Fix issue with duplicated inner polygons in United States
Who
TomTom team using TTmechanicalupdates bot account.
The team can be contacted at OSM@tomtom.com.
Why
Based on the Osmose Rule 1170 Class 1 "Double inner polygon" (the geometry of the multipolygon inner ring is duplicated: one is in a relation but without a tag and another has tags but is not part of the relation), we have detected approximately 20,000 such issues in the United States of America alone.
61% of the USA issues are related to data sourced from the National Hydrography Dataset (NHD). According to the description of NHD data on the OSM wiki, those polygons were usually imported via JOSM.
Example
This example shows the relation with ID=1397683, which has 24 members. One of them, the way with ID=97163738 (hereinafter referred to as the Inner Ring Way), is duplicated by the way with ID=97138927 (hereinafter referred to as the Duplicating Way). The Duplicating Way uses the same nodes as the Inner Ring Way but is not a member of the relation. In addition, the Inner Ring Way has no tags while the Duplicating Way does, which suggests that the Duplicating Way is the one that should be a member of relation ID=1397683.
Algorithm
The bot takes violations from Osmose (rule id 1170, class 1) as input data. For each violation, data from OSM is fetched and violations are verified one more time.
Violations with common way ids (separate violations in Osmose, with repeating way id) are grouped into one changeset.
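The grouping step can be pictured as finding connected sets of violations that share a way id. The sketch below is only an illustration, not the bot's actual code: the input shape (a mapping from a violation id to the set of way ids it mentions) and the function name `group_violations` are assumptions made for this example.

```python
from collections import defaultdict


def group_violations(violations):
    """Group violations that share at least one way id (directly or transitively).

    `violations` is a hypothetical mapping: violation id -> set of way ids.
    Returns a sorted list of sorted lists of violation ids.
    """
    # Union-find over violation ids.
    parent = {vid: vid for vid in violations}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    # Remember which violation first mentioned each way id.
    first_seen = {}
    for vid, way_ids in violations.items():
        for way_id in way_ids:
            if way_id in first_seen:
                union(vid, first_seen[way_id])
            else:
                first_seen[way_id] = vid

    groups = defaultdict(list)
    for vid in violations:
        groups[find(vid)].append(vid)
    return sorted(sorted(group) for group in groups.values())


# Made-up ids: violations 1, 2 and 3 are chained through ways 11 and 12.
example = {1: {10, 11}, 2: {11, 12}, 3: {12, 13}, 4: {30}}
grouped = group_violations(example)
```

Each resulting group would then be uploaded within the same changeset, so that ways referenced by several violations are not edited in separate, conflicting uploads.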
The following verifiers are executed:
- all ways are closed,
- all ways have the same nodes (the direction in which the way was digitized and the starting node can differ),
- the duplicating way should have tags*,
- the duplicating way is not a member of any relation,
- inner ring ways should have no tags,
- inner ring ways are a member of only 1 relation (the one from the Osmose violation),
- optional: the relation and the duplicating way should have a required "source" tag (e.g., "source=NHD")**.
*If there are multiple duplicating ways, the violation is skipped.
**By default this verifier is skipped; the source tag is not checked.
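The verifiers above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the bot's implementation: ways are represented as plain dictionaries with hypothetical keys `nodes`, `tags` and `relations`, and the function names are invented for this example. The node comparison treats two closed ways as equal when one is a rotation of the other, in either direction.

```python
def is_closed(nodes):
    """A way is closed when it ends on the node it started from."""
    return len(nodes) > 1 and nodes[0] == nodes[-1]


def same_ring(a, b):
    """True if two closed ways use the same node cycle,
    ignoring the starting node and the direction."""
    if not (is_closed(a) and is_closed(b)) or len(a) != len(b):
        return False
    ring_a, ring_b = a[:-1], b[:-1]  # drop the repeated closing node
    n = len(ring_a)
    for candidate in (ring_b, ring_b[::-1]):
        doubled = candidate + candidate  # every rotation is a slice of this
        if any(doubled[i:i + n] == ring_a for i in range(n)):
            return True
    return False


def passes_verifiers(inner_ways, duplicating_way, relation_id,
                     relation_tags=None, required_source=None):
    """Apply the checks listed above to one violation (hypothetical data shape)."""
    ways = inner_ways + [duplicating_way]
    if not all(is_closed(w["nodes"]) for w in ways):
        return False
    if not all(same_ring(w["nodes"], duplicating_way["nodes"]) for w in inner_ways):
        return False
    # The duplicating way must carry tags and belong to no relation.
    if not duplicating_way["tags"] or duplicating_way["relations"]:
        return False
    # Inner ring ways must be untagged and belong only to the violating relation.
    for w in inner_ways:
        if w["tags"] or w["relations"] != [relation_id]:
            return False
    # Optional source check, skipped unless a required value is given.
    if required_source is not None:
        if (relation_tags or {}).get("source") != required_source:
            return False
        if duplicating_way["tags"].get("source") != required_source:
            return False
    return True


# Made-up node ids and tags, for illustration only.
inner = {"nodes": [1, 2, 3, 4, 1], "tags": {}, "relations": [100]}
dup = {"nodes": [3, 2, 1, 4, 3], "tags": {"natural": "water"}, "relations": []}
ok = passes_verifiers([inner], dup, relation_id=100)
```

Doubling the node list and scanning for a matching slice keeps the rotation check short; for very large rings a canonical rotation (for example, starting at the smallest node id) would avoid the quadratic comparison.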
When a violation is confirmed by the bot, data modification is performed.
Data modification is understood as:
Basic scenario:
- copying tags from the way that does not belong to any relation to the way that is a member of the violating relation,
- removing the way that does not belong to any relation.
Three or more duplicated ways scenario:
- removing inner ring ways which belong to relations,
- assigning a duplicating way to those relations.
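The two scenarios can be expressed as a small list of planned operations. The sketch below is an assumption-laden illustration rather than the bot's code: the operation names (`set_tags`, `delete_way`, `replace_member`), the dictionary shape of a way, and the function names are all hypothetical, and nothing here talks to the OSM API.

```python
def plan_basic(inner_way, duplicating_way):
    """Basic scenario: move the tags onto the relation member,
    then drop the free-standing copy."""
    return [
        ("set_tags", inner_way["id"], dict(duplicating_way["tags"])),
        ("delete_way", duplicating_way["id"]),
    ]


def plan_multi(inner_ways, duplicating_way):
    """Three or more duplicated ways: keep the tagged duplicating way,
    put it into each relation in place of the untagged member,
    then remove the untagged inner ring ways."""
    ops = []
    for way in inner_ways:
        for rel_id in way["relations"]:
            ops.append(("replace_member", rel_id, way["id"], duplicating_way["id"]))
        ops.append(("delete_way", way["id"]))
    return ops


# Made-up ids and tags, for illustration only.
inner = {"id": 1, "tags": {}, "relations": [100]}
dup = {"id": 2, "tags": {"natural": "water"}, "relations": []}
basic_ops = plan_basic(inner, dup)

inner_b = {"id": 3, "tags": {}, "relations": [200]}
multi_ops = plan_multi([inner, inner_b], dup)
```

Building a plan first and applying it afterwards makes it possible to review or log what a changeset will contain before anything is uploaded.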
The bot source code is available here.
Test Run
Before running the bot on the whole of the United States, we will run the automated updates on a smaller area. For this, we've selected Louisiana, where we have 5,542 cases logged by the Osmose rule.
Bot runs
To make sure that the system is not overloaded, we plan to run the bot in parts based on Osmose regions. Below you can see the proposed order:
Order | Osmose region | Count of issues* |
---|---|---|
1 | usa_louisiana - TEST RUN | 5542 |
2 | usa_north_carolina | 1908 |
3 | usa_south_carolina | 1291 |
4 | usa_north_dakota | 1280 |
5 | usa_michigan | 887 |
6 | usa_massachusetts | 727 |
7 | usa_georgia | 705 |
8 | usa_rhode_island | 603 |
9 | usa_california_merced | 451 |
10 | usa_connecticut | 410 |
11 | usa_iowa | 375 |
12 | usa_illinois | 350 |
13 | usa_kansas | 336 |
14 | usa_colorado | 318 |
15 | usa_washington | 256 |
16 | usa_california_madera | 208 |
17 | usa_south_dakota | 202 |
18 | usa_california_stanislaus | 192 |
19 | usa_florida | 177 |
20 | usa_california_san_luis_obispo | 163 |
21 | usa_alabama | 158 |
22 | usa_missouri | 156 |
23 | usa_oregon | 153 |
24 | usa_california_sacramento | 147 |
25 | usa_california_tulare | 132 |
26 | usa_minnesota | 126 |
27 | usa_virginia | 115 |
28 | usa_california_butte | 106 |
29 | usa_california_siskiyou | 99 |
30 | usa_california_plumas | 98 |
31 | usa_california_fresno | 96 |
32 | usa_new_hampshire | 96 |
33 | usa_indiana | 95 |
34 | usa_oklahoma | 88 |
35 | usa_ohio | 84 |
36 | usa_tennessee | 79 |
37 | usa_california_santa_barbara | 78 |
38 | usa_wisconsin | 78 |
39 | usa_texas | 72 |
40 | usa_california_solano | 72 |
41 | usa_arizona | 67 |
42 | usa_california_el_dorado | 65 |
43 | usa_mississippi | 62 |
44 | usa_california_colusa | 54 |
45 | usa_california_lassen | 53 |
46 | usa_california_san_joaquin | 49 |
47 | usa_pennsylvania | 49 |
48 | usa_montana | 47 |
49 | usa_california_monterey | 46 |
50 | usa_new_jersey | 46 |
51 | usa_california_sutter | 46 |
52 | usa_california_kern | 45 |
53 | usa_new_york | 44 |
54 | usa_california_modoc | 43 |
55 | usa_maine | 42 |
56 | usa_utah | 41 |
57 | usa_california_orange | 38 |
58 | usa_california_santa_cruz | 38 |
59 | usa_maryland | 38 |
60 | usa_california_yolo | 36 |
61 | usa_california_contra_costa | 33 |
62 | usa_alaska | 30 |
63 | usa_california_sierra | 30 |
64 | usa_california_lake | 30 |
65 | usa_district_of_columbia | 27 |
66 | usa_california_amador | 26 |
67 | usa_california_mariposa | 25 |
68 | usa_california_los_angeles | 24 |
69 | usa_california_kings | 22 |
70 | usa_new_mexico | 21 |
71 | usa_california_glenn | 17 |
72 | usa_california_napa | 15 |
73 | usa_kentucky | 14 |
74 | usa_nevada | 12 |
75 | usa_delaware | 12 |
76 | indonesia_east_nusa_tenggara | 12 |
77 | usa_california_san_bernardino | 11 |
78 | usa_idaho | 11 |
79 | usa_california_san_benito | 10 |
80 | usa_west_virginia | 9 |
81 | usa_vermont | 9 |
82 | usa_california_humboldt | 8 |
83 | usa_california_shasta | 8 |
84 | usa_california_yuba | 8 |
85 | usa_california_santa_clara | 6 |
86 | usa_arkansas | 6 |
87 | usa_nebraska | 5 |
88 | usa_california_riverside | 4 |
89 | usa_california_marin | 3 |
90 | usa_california_trinity | 3 |
91 | usa_puerto_rico | 2 |
92 | usa_california_ventura | 2 |
93 | usa_california_tehama | 2 |
94 | usa_california_del_norte | 1 |
95 | usa_california_alameda | 1 |
96 | usa_california_san_mateo | 1 |
97 | usa_wyoming | 1 |
98 | usa_california_mendocino | 1 |
99 | usa_california_alpine | 1 |
100 | usa_california_san_francisco | 1 |
101 | usa_california_placer | 1 |
102 | usa_hawaii | 1 |
n/a | usa_california_nevada | 0 |
n/a | usa_california_calaveras | 0 |
n/a | usa_northern_mariana_islands | 0 |
n/a | usa_guam | 0 |
n/a | usa_california_san_diego | 0 |
n/a | usa_american_samoa | 0 |
n/a | indonesia_west_nusa_tenggara | 0 |
n/a | usa_california_mono | 0 |
n/a | usa_california_tuolumne | 0 |
n/a | usa_california_inyo | 0 |
n/a | usa_california_imperial | 0 |
n/a | usa_virgin_islands | 0 |
n/a | usa_california_sonoma |
*Count taken from Osmose on 8 February 2022. The number of violations can differ from day to day, as features in OSM are edited constantly.
Discussion
This automated action will be announced and discussed on the talk-us mailing list. We invite everyone to join the conversation and share feedback.
This is the link to the notification: https://lists.openstreetmap.org/pipermail/talk-us/2022-February/021602.html
Opt-out
To opt out of this automated update, please write an e-mail (in English) to TTmechanicalupdates@groups.tomtom.com describing which area or source version should be excluded from the update scope and why.
When
Full run (without Louisiana)
The runs and the analysis of the results were performed on 8 and 9 March 2022.
Test run (Louisiana)
The test run was completed on 1 March 2022.
Outcome
Full run details
Scope: USA
Start Date: 8 Mar 2022
General summary of the full bot run:
Opened changesets | Total violations | Fixed violations | Not fixed violations |
---|---|---|---|
459 | 20,102 | 19,821 | 281 |
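As a quick consistency check on the summary figures, the arithmetic can be reproduced in a few lines. The variable names are chosen for this example only; the numbers are those from the table above.

```python
total_violations = 20102
fixed_violations = 19821

# Violations that could not be fixed automatically.
not_fixed = total_violations - fixed_violations

# Share of violations fixed, as a percentage rounded to one decimal place.
fix_rate_percent = round(100 * fixed_violations / total_violations, 1)

print(not_fixed, fix_rate_percent)
```

The difference matches the "Not fixed violations" column (281), and the fix rate comes out at about 98.6%.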
Below you can see the results of the full run per region:
Run No. | Region | Opened changesets | Total violations | Fixed violations | Found duplicates* | Fixed duplicates | Filtered out by verifiers** | Others rejected*** |
---|---|---|---|---|---|---|---|---|
1 | usa_louisiana (Test Run) | 110 | 5542 | 5459 | 82 | 0 | 0 | 1 |
2 | usa_north_carolina | 38 | 1891 | 1880 | 6 | 0 | 1 | 4 |
3 | usa_south_carolina | 27 | 1291 | 1283 | 10 | 2 | 0 | 0 |
4 | usa_north_dakota | 26 | 1280 | 1280 | 0 | 0 | 0 | 0 |
5 | usa_michigan | 18 | 887 | 873 | 12 | 0 | 1 | 1 |
6 | usa_massachusetts | 15 | 731 | 720 | 8 | 0 | 0 | 3 |
7 | usa_georgia | 15 | 736 | 735 | 0 | 0 | 1 | 0 |
8 | usa_rhode_island | 13 | 610 | 602 | 8 | 0 | 0 | 0 |
9 | usa_california_merced | 10 | 451 | 451 | 0 | 0 | 0 | 0 |
10 | usa_connecticut | 9 | 408 | 407 | 0 | 0 | 0 | 1 |
11 | usa_iowa | 8 | 374 | 374 | 0 | 0 | 0 | 0 |
12 | usa_illinois | 7 | 356 | 328 | 28 | 0 | 0 | 0 |
13 | usa_kansas | 8 | 383 | 383 | 0 | 0 | 0 | 0 |
14 | usa_colorado | 7 | 318 | 302 | 16 | 0 | 0 | 0 |
15 | usa_washington | 6 | 266 | 250 | 14 | 0 | 0 | 2 |
16 | usa_california_madera | 6 | 208 | 208 | 0 | 0 | 0 | 0 |
17 | usa_south_dakota | 5 | 202 | 202 | 0 | 0 | 0 | 0 |
18 | usa_california_stanislaus | 4 | 192 | 192 | 0 | 0 | 0 | 0 |
19 | usa_florida | 5 | 221 | 212 | 6 | 0 | 0 | 3 |
20 | usa_california_san_luis_obispo | 4 | 163 | 163 | 0 | 0 | 0 | 0 |
21 | usa_alabama | 4 | 158 | 158 | 0 | 0 | 0 | 0 |
22 | usa_missouri | 4 | 156 | 155 | 0 | 0 | 1 | 0 |
23 | usa_oregon | 4 | 153 | 151 | 2 | 0 | 0 | 0 |
24 | usa_california_sacramento | 0 | 146 | 146 | 0 | 0 | 0 | 0 |
25 | usa_california_tulare | 3 | 132 | 132 | 0 | 0 | 0 | 0 |
26 | usa_minnesota | 4 | 127 | 124 | 4 | 2 | 1 | 0 |
27 | usa_virginia | 3 | 120 | 119 | 0 | 0 | 0 | 1 |
28 | usa_california_butte | 3 | 106 | 104 | 2 | 0 | 0 | 0 |
29 | usa_california_siskiyou | 2 | 99 | 99 | 0 | 0 | 0 | 0 |
30 | usa_california_plumas | 2 | 98 | 98 | 0 | 0 | 0 | 0 |
31 | usa_california_fresno | 2 | 96 | 96 | 0 | 0 | 0 | 0 |
32 | usa_new_hampshire | 2 | 96 | 96 | 2 | 0 | 0 | 0 |
33 | usa_indiana | 2 | 95 | 93 | 0 | 0 | 0 | 2 |
34 | usa_oklahoma | 2 | 88 | 88 | 0 | 0 | 0 | 0 |
35 | usa_ohio | 2 | 84 | 83 | 0 | 0 | 1 | 0 |
36 | usa_tennessee | 2 | 79 | 68 | 0 | 0 | 1 | 10 |
37 | usa_california_santa_barbara | 2 | 72 | 72 | 0 | 0 | 0 | 0 |
38 | usa_wisconsin | 2 | 87 | 87 | 0 | 0 | 0 | 0 |
39 | usa_texas | 2 | 72 | 70 | 2 | 0 | 0 | 0 |
40 | usa_california_solano | 2 | 72 | 72 | 0 | 0 | 0 | 0 |
41 | usa_arizona | 2 | 67 | 67 | 0 | 0 | 0 | 0 |
42 | usa_california_el_dorado | 2 | 65 | 65 | 0 | 0 | 0 | 0 |
43 | usa_mississippi | 2 | 62 | 61 | 0 | 0 | 1 | 0 |
44 | usa_california_colusa | 2 | 54 | 54 | 0 | 0 | 0 | 0 |
45 | usa_california_lassen | 2 | 53 | 53 | 0 | 0 | 0 | 0 |
46 | usa_california_san_joaquin | 1 | 48 | 48 | 0 | 0 | 0 | 0 |
47 | usa_pennsylvania | 1 | 49 | 49 | 0 | 0 | 0 | 0 |
48 | usa_montana | 1 | 47 | 47 | 0 | 0 | 0 | 0 |
49 | usa_california_monterey | 1 | 46 | 44 | 0 | 0 | 1 | 1 |
50 | usa_new_jersey | 1 | 23 | 23 | 0 | 0 | 0 | 0 |
51 | usa_california_sutter | 1 | 46 | 46 | 0 | 0 | 0 | 0 |
52 | usa_california_kern | 1 | 44 | 44 | 0 | 0 | 0 | 0 |
53 | usa_new_york | 2 | 77 | 77 | 0 | 0 | 0 | 0 |
54 | usa_california_modoc | 1 | 43 | 43 | 0 | 0 | 0 | 0 |
55 | usa_maine | 1 | 42 | 40 | 0 | 0 | 0 | 2 |
56 | usa_utah | 1 | 42 | 9 | 32 | 32 | 0 | 1 |
57 | usa_california_orange | 1 | 38 | 38 | 0 | 0 | 0 | 0 |
58 | usa_california_santa_cruz | 1 | 38 | 36 | 2 | 0 | 0 | 0 |
59 | usa_maryland | 1 | 38 | 37 | 0 | 0 | 1 | 0 |
60 | usa_california_yolo | 1 | 36 | 36 | 0 | 0 | 0 | 0 |
61 | usa_california_contra_costa | 1 | 33 | 33 | 0 | 0 | 0 | 0 |
62 | usa_alaska | 1 | 30 | 30 | 0 | 0 | 0 | 0 |
63 | usa_california_sierra | 1 | 30 | 30 | 0 | 0 | 0 | 0 |
64 | usa_california_lake | 1 | 30 | 30 | 0 | 0 | 0 | 0 |
65 | usa_district_of_columbia | 2 | 61 | 53 | 8 | 0 | 0 | 0 |
66 | usa_california_amador | 1 | 26 | 26 | 0 | 0 | 0 | 0 |
67 | usa_california_mariposa | 1 | 25 | 25 | 0 | 0 | 0 | 0 |
68 | usa_california_los_angeles | 1 | 24 | 24 | 0 | 0 | 0 | 0 |
69 | usa_california_kings | 1 | 22 | 22 | 0 | 0 | 0 | 0 |
70 | usa_new_mexico | 1 | 21 | 21 | 0 | 0 | 0 | 0 |
71 | usa_california_glenn | 1 | 17 | 17 | 0 | 0 | 0 | 0 |
72 | usa_california_napa | 1 | 13 | 13 | 0 | 0 | 0 | 0 |
73 | usa_kentucky | 1 | 14 | 14 | 0 | 0 | 0 | 0 |
74 | usa_nevada | 1 | 12 | 12 | 0 | 0 | 0 | 0 |
75 | usa_delaware | 1 | 12 | 12 | 0 | 0 | 0 | 0 |
76 | indonesia_east_nusa_tenggara | 1 | 12 | 12 | 0 | 0 | 0 | 0 |
77 | usa_california_san_bernardino | 1 | 11 | 11 | 0 | 0 | 0 | 0 |
78 | usa_idaho | 1 | 11 | 11 | 0 | 0 | 0 | 0 |
79 | usa_california_san_benito | 1 | 10 | 10 | 0 | 0 | 0 | 0 |
80 | usa_west_virginia | 1 | 9 | 9 | 0 | 0 | 0 | 0 |
81 | usa_vermont | 1 | 9 | 8 | 0 | 0 | 1 | 1 |
82 | usa_california_humboldt | 1 | 8 | 8 | 0 | 0 | 0 | 0 |
83 | usa_california_shasta | 1 | 8 | 8 | 0 | 0 | 0 | 0 |
84 | usa_california_yuba | 1 | 8 | 8 | 0 | 0 | 0 | 0 |
85 | usa_california_santa_clara | 1 | 6 | 6 | 0 | 0 | 0 | 0 |
86 | usa_arkansas | 1 | 6 | 6 | 0 | 0 | 0 | 0 |
87 | usa_nebraska | 1 | 5 | 5 | 0 | 0 | 0 | 0 |
88 | usa_california_riverside | 1 | 4 | 4 | 0 | 0 | 0 | 0 |
89 | usa_california_marin | 1 | 3 | 3 | 0 | 0 | 0 | 0 |
90 | usa_california_trinity | 1 | 3 | 3 | 0 | 0 | 0 | 0 |
91 | usa_puerto_rico | 1 | 2 | 2 | 0 | 0 | 0 | 0 |
92 | usa_california_ventura | 1 | 2 | 2 | 0 | 0 | 0 | 0 |
93 | usa_california_tehama | 1 | 2 | 2 | 0 | 0 | 0 | 0 |
94 | usa_california_del_norte | 1 | 1 | 1 | 0 | 0 | 0 | 0 |
95 | usa_california_alameda | 1 | 1 | 1 | 0 | 0 | 0 | 0 |
96 | usa_california_san_mateo | 1 | 1 | 1 | 0 | 0 | 0 | 0 |
97 | usa_wyoming | 1 | 1 | 1 | 0 | 0 | 0 | 0 |
98 | usa_california_mendocino | 1 | 1 | 1 | 0 | 0 | 0 | 0 |
99 | usa_california_alpine | 1 | 1 | 1 | 0 | 0 | 0 | 0 |
100 | usa_california_san_francisco | 1 | 1 | 1 | 0 | 0 | 0 | 0 |
101 | usa_california_placer | 1 | 1 | 1 | 0 | 0 | 0 | 0 |
102 | usa_hawaii | 1 | 1 | 1 | 0 | 0 | 0 | 0 |
*Found duplicates - cases where there are more than 2 duplicated ways
**Filtered out by verifiers - cases that were rejected because they did not pass all the algorithm criteria
***Others rejected - other reasons for rejection, usually incomplete data that cannot easily be resolved automatically
Test run details
Scope: USA, Region: usa_louisiana
Start Date: 1 March 2022
Violations source date: 1 March 2022
Total violation count: 5,542
Uploaded fixes: 5,459
Source verifier rejected: 0
Filtered out due to duplicated way id: 82
Incomplete data (inner and duplicated had tags): 1
Total opened changesets: 110 (117973580, 117973610, 117973642, 117973669, 117973698, 117973721, 117973748, 117973776, 117973817, 117973847, 117973878, 117973907, 117973935, 117973960, 117973987, 117974018, 117974047, 117974073, 117974101, 117974126, 117974153, 117974177, 117974205, 117974233, 117974251, 117974284, 117974315, 117974344, 117974384, 117974411, 117974442, 117974462, 117974487, 117974526, 117974552, 117974576, 117974607, 117974637, 117974668, 117974703, 117974730, 117974756, 117974775, 117974799, 117974834, 117974861, 117974887, 117974913, 117974940, 117974971, 117974997, 117975025, 117975047, 117975066, 117975106, 117975126, 117975151, 117975174, 117975202, 117975221, 117975247, 117975270, 117975295, 117975324, 117975352, 117975363, 117975389, 117975416, 117975440, 117975456, 117975480, 117975502, 117975521, 117975558, 117975602, 117975631, 117975660, 117975688, 117975715, 117975741, 117975763, 117975791, 117975819, 117975847, 117975871, 117975900, 117975920, 117975935, 117975960, 117975990, 117976007, 117976045, 117976078, 117976108, 117976138, 117976164, 117976197, 117976219, 117976242, 117976266, 117976297, 117976322, 117976348, 117976372, 117976397, 117976419, 117976440, 117976470, 117976501, 117976515)
Total time of run: approx. 82 minutes