Automated edits/TTmechanicalupdates/Fix issue with duplicated inner polygons in United States

Who

TomTom team using TTmechanicalupdates bot account.

The team can be contacted at OSM@tomtom.com.

Why

Based on the Osmose Rule 1170 Class 1 "Double inner polygon" (the geometry of the multipolygon inner ring is duplicated: one is in a relation but without a tag and another has tags but is not part of the relation), we have detected approximately 20,000 such issues in the United States of America alone.

61% of the USA issues are related to data sourced from the National Hydrography Dataset (NHD). According to the description of NHD data on the OSM wiki, those polygons are usually imported via JOSM.

Example

This example shows relation ID=1397683, which has 24 members. One member, way ID=97163738 (hereinafter referred to as the Inner Ring Way), is duplicated by way ID=97138927 (hereinafter referred to as the Duplicating Way), which uses the same nodes as the Inner Ring Way but is not a member of the relation. In addition, the Inner Ring Way has no tags while the Duplicating Way is tagged, which suggests that the Duplicating Way should be a member of relation ID=1397683.

Algorithm

The bot takes violations from Osmose (rule id 1170, class 1) as input data. For each violation, the current data is fetched from OSM and the violation is verified again.

Violations that share a way id (separate violations in Osmose that repeat the same way id) are grouped into one changeset.
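
The grouping step can be pictured as finding connected components among violations that share way ids. The following is only a minimal sketch of that idea, not the bot's actual code; the function name `group_violations` and the input shape (a mapping from a violation id to the set of way ids it mentions) are assumptions made for illustration.

```python
from collections import defaultdict


def group_violations(violations):
    """Group violations that share at least one way id.

    `violations` is a hypothetical mapping: violation id -> set of way ids.
    Returns a sorted list of sorted lists of violation ids, one list per group.
    """
    parent = {vid: vid for vid in violations}

    def find(vid):
        # Follow parent links to the group representative (with path halving).
        while parent[vid] != vid:
            parent[vid] = parent[parent[vid]]
            vid = parent[vid]
        return vid

    first_seen = {}  # way id -> first violation that mentioned it
    for vid, way_ids in violations.items():
        for way_id in way_ids:
            if way_id in first_seen:
                # Same way id seen before: merge the two groups.
                parent[find(vid)] = find(first_seen[way_id])
            else:
                first_seen[way_id] = vid

    groups = defaultdict(list)
    for vid in violations:
        groups[find(vid)].append(vid)
    return sorted(sorted(group) for group in groups.values())
```

For example, violations `a` (ways 1, 2), `b` (ways 2, 3) and `d` (ways 3, 5) would end up in one group because they are linked through ways 2 and 3, while a violation `c` that only mentions way 4 stays on its own.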

The following verifiers are executed:

  • all ways are closed,
  • all ways have the same nodes (the direction of digitization and the starting node may differ),
  • the duplicating way should have tags*,
  • the duplicating way is not a member of any relation,
  • inner ring ways should have no tags,
  • inner ring ways are a member of only 1 relation (the one from the Osmose violation),
  • optional: the relation and the duplicating way should have a required "source" tag (e.g., "source=NHD")**.

*If there are multiple duplicating ways, these violations are skipped.

**By default this verifier is skipped: the source tag is not checked.
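
The verifiers listed above could be sketched roughly as follows. This is an illustrative sketch only, not the bot's implementation: the function names, the dictionary keys (`nodes`, `tags`, `relations`) and the requirement of at least four node references for a closed ring are all assumptions. The least obvious part is the "same nodes" check, which has to ignore both the starting node and the direction of the ring.

```python
def is_closed(nodes):
    # Assumption: a closed ring repeats its first node at the end
    # and needs at least 4 node references (a triangle).
    return len(nodes) >= 4 and nodes[0] == nodes[-1]


def same_ring(a, b):
    """True if closed ways a and b use the same node cycle,
    regardless of starting node and direction."""
    if not (is_closed(a) and is_closed(b)) or len(a) != len(b):
        return False
    ring_a, ring_b = a[:-1], b[:-1]  # drop the repeated closing node
    for candidate in (ring_b, ring_b[::-1]):  # forward and reversed
        for shift in range(len(candidate)):
            if candidate[shift:] + candidate[:shift] == ring_a:
                return True
    return False


def verify(inner, duplicate, relation_id, required_source=None, relation_tags=None):
    """Hypothetical check of one violation.

    `inner` and `duplicate` are dicts with assumed keys:
    'nodes' (list of node ids), 'tags' (dict), 'relations' (list of relation ids).
    """
    checks = [
        same_ring(inner["nodes"], duplicate["nodes"]),  # closed, same nodes
        bool(duplicate["tags"]),                        # duplicating way is tagged
        not duplicate["relations"],                     # ...and in no relation
        not inner["tags"],                              # inner ring way is untagged
        inner["relations"] == [relation_id],            # member of only that relation
    ]
    if required_source is not None:  # optional verifier, off by default
        checks.append(duplicate["tags"].get("source") == required_source)
        checks.append((relation_tags or {}).get("source") == required_source)
    return all(checks)
```

The rotation check is quadratic in the number of nodes, which keeps the sketch short; a production implementation might prefer a linear-time comparison for very large rings.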

When a violation is confirmed by the bot, data modification is performed.

Data modification is understood as:

Basic scenario:

  • copying the tags from the way that does not belong to any relation to the way that is a member of the violating relation,
  • removing the way that does not belong to any relation.
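
As a rough illustration of the basic scenario, the sketch below copies the tags onto the relation member and reports which way should be deleted. The function name `apply_basic_fix` and the dictionary layout (`id`, `tags`) are assumptions for this example, and the tag values in the usage note are invented; the way ids are the ones from the example section.

```python
def apply_basic_fix(inner, duplicate):
    """Hypothetical basic fix.

    Returns a copy of the inner ring way carrying the duplicating way's tags,
    plus the id of the duplicating way, which is to be removed.
    The input dicts are not modified.
    """
    updated_inner = dict(inner, tags=dict(duplicate["tags"]))
    return updated_inner, duplicate["id"]
```

Called with an untagged inner ring way `{"id": 97163738, "tags": {}}` and a duplicating way `{"id": 97138927, "tags": {"natural": "water"}}`, this would return the inner ring way with `natural=water` and mark way 97138927 for deletion, so the relation keeps its existing member.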

Three or more duplicated ways scenario:

  • removing inner ring ways which belong to relations,
  • assigning a duplicating way to those relations.
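
The scenario with three or more duplicated ways works the other way round: the tagged duplicating way takes over the relation memberships and the untagged inner ring ways are removed. The sketch below is one possible reading of that description, not the bot's code; the function name `apply_multi_fix`, the member layout (`type`, `ref`, `role`) and the ids used in the usage note are assumptions.

```python
def apply_multi_fix(relations, inner_way_ids, duplicate_id):
    """Hypothetical fix for three or more duplicated ways.

    `relations` maps relation id -> list of member dicts with assumed keys
    'type', 'ref' and 'role'. Every membership of an inner ring way is
    re-pointed to the duplicating way, keeping its role.
    Returns (updated relations, sorted ids of inner ring ways to remove).
    """
    inner_way_ids = set(inner_way_ids)
    updated = {}
    for rel_id, members in relations.items():
        updated[rel_id] = [
            dict(m, ref=duplicate_id)
            if m["type"] == "way" and m["ref"] in inner_way_ids
            else m
            for m in members
        ]
    return updated, sorted(inner_way_ids)
```

With two relations that each contain a different untagged copy of the same ring (say ways 2 and 3) and a tagged duplicating way 9, both relations would end up referencing way 9, and ways 2 and 3 would be listed for removal.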

The bot source code is available here.

Test Run

Before running the bot on the whole of the United States, we will run the automated updates on a smaller area. For this, we've selected Louisiana, where we have 5,542 cases logged by the Osmose rule.

Bot runs

To make sure that the system is not overloaded, we plan to run the bot in parts based on Osmose regions. Below you can see the proposed order:

Order Osmose region Count of issues*
1 usa_louisiana - TEST RUN 5542
2 usa_north_carolina 1908
3 usa_south_carolina 1291
4 usa_north_dakota 1280
5 usa_michigan 887
6 usa_massachusetts 727
7 usa_georgia 705
8 usa_rhode_island 603
9 usa_california_merced 451
10 usa_connecticut 410
11 usa_iowa 375
12 usa_illinois 350
13 usa_kansas 336
14 usa_colorado 318
15 usa_washington 256
16 usa_california_madera 208
17 usa_south_dakota 202
18 usa_california_stanislaus 192
19 usa_florida 177
20 usa_california_san_luis_obispo 163
21 usa_alabama 158
22 usa_missouri 156
23 usa_oregon 153
24 usa_california_sacramento 147
25 usa_california_tulare 132
26 usa_minnesota 126
27 usa_virginia 115
28 usa_california_butte 106
29 usa_california_siskiyou 99
30 usa_california_plumas 98
31 usa_california_fresno 96
32 usa_new_hampshire 96
33 usa_indiana 95
34 usa_oklahoma 88
35 usa_ohio 84
36 usa_tennessee 79
37 usa_california_santa_barbara 78
38 usa_wisconsin 78
39 usa_texas 72
40 usa_california_solano 72
41 usa_arizona 67
42 usa_california_el_dorado 65
43 usa_mississippi 62
44 usa_california_colusa 54
45 usa_california_lassen 53
46 usa_california_san_joaquin 49
47 usa_pennsylvania 49
48 usa_montana 47
49 usa_california_monterey 46
50 usa_new_jersey 46
51 usa_california_sutter 46
52 usa_california_kern 45
53 usa_new_york 44
54 usa_california_modoc 43
55 usa_maine 42
56 usa_utah 41
57 usa_california_orange 38
58 usa_california_santa_cruz 38
59 usa_maryland 38
60 usa_california_yolo 36
61 usa_california_contra_costa 33
62 usa_alaska 30
63 usa_california_sierra 30
64 usa_california_lake 30
65 usa_district_of_columbia 27
66 usa_california_amador 26
67 usa_california_mariposa 25
68 usa_california_los_angeles 24
69 usa_california_kings 22
70 usa_new_mexico 21
71 usa_california_glenn 17
72 usa_california_napa 15
73 usa_kentucky 14
74 usa_nevada 12
75 usa_delaware 12
76 indonesia_east_nusa_tenggara 12
77 usa_california_san_bernardino 11
78 usa_idaho 11
79 usa_california_san_benito 10
80 usa_west_virginia 9
81 usa_vermont 9
82 usa_california_humboldt 8
83 usa_california_shasta 8
84 usa_california_yuba 8
85 usa_california_santa_clara 6
86 usa_arkansas 6
87 usa_nebraska 5
88 usa_california_riverside 4
89 usa_california_marin 3
90 usa_california_trinity 3
91 usa_puerto_rico 2
92 usa_california_ventura 2
93 usa_california_tehama 2
94 usa_california_del_norte 1
95 usa_california_alameda 1
96 usa_california_san_mateo 1
97 usa_wyoming 1
98 usa_california_mendocino 1
99 usa_california_alpine 1
100 usa_california_san_francisco 1
101 usa_california_placer 1
102 usa_hawaii 1
n/a usa_california_nevada 0
n/a usa_california_calaveras 0
n/a usa_northern_mariana_islands 0
n/a usa_guam 0
n/a usa_california_san_diego 0
n/a usa_american_samoa 0
n/a indonesia_west_nusa_tenggara 0
n/a usa_california_mono 0
n/a usa_california_tuolumne 0
n/a usa_california_inyo 0
n/a usa_california_imperial 0
n/a usa_virgin_islands 0
n/a usa_california_sonoma

*Counts taken from Osmose on February 8, 2022 (the number of violations can differ from day to day, as features in OSM are edited constantly).
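
The order in the table above follows the issue counts in descending order, with regions that have no issues left unscheduled. A minimal sketch of how such a run plan could be derived is shown below; the function name `plan_runs` and the input mapping are assumptions, and this is not necessarily how the team produced the table.

```python
def plan_runs(counts):
    """Hypothetical run planner.

    `counts` maps an Osmose region name to its issue count (or None if unknown).
    Returns (order, region, count) tuples, largest count first.
    Regions with a count of 0 or None are not scheduled.
    """
    scheduled = [(region, count) for region, count in counts.items() if count]
    # sorted() is stable, so regions with equal counts keep their input order.
    scheduled = sorted(scheduled, key=lambda item: -item[1])
    return [(order, region, count)
            for order, (region, count) in enumerate(scheduled, start=1)]
```

For instance, `plan_runs({"usa_michigan": 887, "usa_louisiana": 5542, "usa_guam": 0, "usa_north_carolina": 1908})` puts usa_louisiana first, followed by usa_north_carolina and usa_michigan, and leaves out usa_guam.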

Discussion

This automated action will be announced and discussed on the talk-us mailing list. We invite everyone to join the conversation and share feedback.

This is the link to the notification: https://lists.openstreetmap.org/pipermail/talk-us/2022-February/021602.html

Opt-out

To opt out of this automated update, please write an e-mail (in English) to TTmechanicalupdates@groups.tomtom.com describing which area or source version should be excluded from the update scope and why.

When

Full run (without Louisiana)

The runs and results analysis were performed between 8 and 9 March 2022.

Test run (Louisiana)

The test run was completed on 1 March 2022.

Outcome

Full run details

Scope: USA

Start Date: 8 Mar 2022

General summary of the full bot run:

Opened changesets: 459

Total violations: 20,102

Fixed violations: 19,821

Not fixed violations: 281
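
The summary figures above can be cross-checked with a little arithmetic: the not-fixed count is the difference between total and fixed violations, and the share of fixed violations follows from the same two numbers. The helper below is a hypothetical sketch (the name `summarize` is not from the bot) that only uses the totals reported on this page.

```python
def summarize(total, fixed):
    """Return (not fixed count, percentage fixed rounded to one decimal)."""
    not_fixed = total - fixed
    fixed_percent = round(100 * fixed / total, 1)
    return not_fixed, fixed_percent


# Full run totals reported above: 20,102 violations, 19,821 fixed.
print(summarize(20102, 19821))  # (281, 98.6)
```

In other words, the full run fixed about 98.6% of the violations it processed.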

Below you can see the results of the full run per region:

Run No. Region Opened changesets Total violations Fixed violations Found duplicates* Fixed duplicates Filtered out by verifiers** Others rejected***
1 usa_louisiana (Test Run) 110 5542 5459 82 0 0 1
2 usa_north_carolina 38 1891 1880 6 0 1 4
3 usa_south_carolina 27 1291 1283 10 2 0 0
4 usa_north_dakota 26 1280 1280 0 0 0 0
5 usa_michigan 18 887 873 12 0 1 1
6 usa_massachusetts 15 731 720 8 0 0 3
7 usa_georgia 15 736 735 0 0 1 0
8 usa_rhode_island 13 610 602 8 0 0 0
9 usa_california_merced 10 451 451 0 0 0 0
10 usa_connecticut 9 408 407 0 0 0 1
11 usa_iowa 8 374 374 0 0 0 0
12 usa_illinois 7 356 328 28 0 0 0
13 usa_kansas 8 383 383 0 0 0 0
14 usa_colorado 7 318 302 16 0 0 0
15 usa_washington 6 266 250 14 0 0 2
16 usa_california_madera 6 208 208 0 0 0 0
17 usa_south_dakota 5 202 202 0 0 0 0
18 usa_california_stanislaus 4 192 192 0 0 0 0
19 usa_florida 5 221 212 6 0 0 3
20 usa_california_san_luis_obispo 4 163 163 0 0 0 0
21 usa_alabama 4 158 158 0 0 0 0
22 usa_missouri 4 156 155 0 0 1 0
23 usa_oregon 4 153 151 2 0 0 0
24 usa_california_sacramento 0 146 146 0 0 0 0
25 usa_california_tulare 3 132 132 0 0 0 0
26 usa_minnesota 4 127 124 4 2 1 0
27 usa_virginia 3 120 119 0 0 0 1
28 usa_california_butte 3 106 104 2 0 0 0
29 usa_california_siskiyou 2 99 99 0 0 0 0
30 usa_california_plumas 2 98 98 0 0 0 0
31 usa_california_fresno 2 96 96 0 0 0 0
32 usa_new_hampshire 2 96 96 2 0 0 0
33 usa_indiana 2 95 93 0 0 0 2
34 usa_oklahoma 2 88 88 0 0 0 0
35 usa_ohio 2 84 83 0 0 1 0
36 usa_tennessee 2 79 68 0 0 1 10
37 usa_california_santa_barbara 2 72 72 0 0 0 0
38 usa_wisconsin 2 87 87 0 0 0 0
39 usa_texas 2 72 70 2 0 0 0
40 usa_california_solano 2 72 72 0 0 0 0
41 usa_arizona 2 67 67 0 0 0 0
42 usa_california_el_dorado 2 65 65 0 0 0 0
43 usa_mississippi 2 62 61 0 0 1 0
44 usa_california_colusa 2 54 54 0 0 0 0
45 usa_california_lassen 2 53 53 0 0 0 0
46 usa_california_san_joaquin 1 48 48 0 0 0 0
47 usa_pennsylvania 1 49 49 0 0 0 0
48 usa_montana 1 47 47 0 0 0 0
49 usa_california_monterey 1 46 44 0 0 1 1
50 usa_new_jersey 1 23 23 0 0 0 0
51 usa_california_sutter 1 46 46 0 0 0 0
52 usa_california_kern 1 44 44 0 0 0 0
53 usa_new_york 2 77 77 0 0 0 0
54 usa_california_modoc 1 43 43 0 0 0 0
55 usa_maine 1 42 40 0 0 0 2
56 usa_utah 1 42 9 32 32 0 1
57 usa_california_orange 1 38 38 0 0 0 0
58 usa_california_santa_cruz 1 38 36 2 0 0 0
59 usa_maryland 1 38 37 0 0 1 0
60 usa_california_yolo 1 36 36 0 0 0 0
61 usa_california_contra_costa 1 33 33 0 0 0 0
62 usa_alaska 1 30 30 0 0 0 0
63 usa_california_sierra 1 30 30 0 0 0 0
64 usa_california_lake 1 30 30 0 0 0 0
65 usa_district_of_columbia 2 61 53 8 0 0 0
66 usa_california_amador 1 26 26 0 0 0 0
67 usa_california_mariposa 1 25 25 0 0 0 0
68 usa_california_los_angeles 1 24 24 0 0 0 0
69 usa_california_kings 1 22 22 0 0 0 0
70 usa_new_mexico 1 21 21 0 0 0 0
71 usa_california_glenn 1 17 17 0 0 0 0
72 usa_california_napa 1 13 13 0 0 0 0
73 usa_kentucky 1 14 14 0 0 0 0
74 usa_nevada 1 12 12 0 0 0 0
75 usa_delaware 1 12 12 0 0 0 0
76 indonesia_east_nusa_tenggara 1 12 12 0 0 0 0
77 usa_california_san_bernardino 1 11 11 0 0 0 0
78 usa_idaho 1 11 11 0 0 0 0
79 usa_california_san_benito 1 10 10 0 0 0 0
80 usa_west_virginia 1 9 9 0 0 0 0
81 usa_vermont 1 9 8 0 0 1 1
82 usa_california_humboldt 1 8 8 0 0 0 0
83 usa_california_shasta 1 8 8 0 0 0 0
84 usa_california_yuba 1 8 8 0 0 0 0
85 usa_california_santa_clara 1 6 6 0 0 0 0
86 usa_arkansas 1 6 6 0 0 0 0
87 usa_nebraska 1 5 5 0 0 0 0
88 usa_california_riverside 1 4 4 0 0 0 0
89 usa_california_marin 1 3 3 0 0 0 0
90 usa_california_trinity 1 3 3 0 0 0 0
91 usa_puerto_rico 1 2 2 0 0 0 0
92 usa_california_ventura 1 2 2 0 0 0 0
93 usa_california_tehama 1 2 2 0 0 0 0
94 usa_california_del_norte 1 1 1 0 0 0 0
95 usa_california_alameda 1 1 1 0 0 0 0
96 usa_california_san_mateo 1 1 1 0 0 0 0
97 usa_wyoming 1 1 1 0 0 0 0
98 usa_california_mendocino 1 1 1 0 0 0 0
99 usa_california_alpine 1 1 1 0 0 0 0
100 usa_california_san_francisco 1 1 1 0 0 0 0
101 usa_california_placer 1 1 1 0 0 0 0
102 usa_hawaii 1 1 1 0 0 0 0

*Found duplicates - cases where there are more than 2 duplicated ways

**Filtered out by verifiers - cases rejected because they do not pass all the algorithm criteria

***Others rejected - other reasons for rejection, usually incomplete data that cannot easily be resolved automatically

Test run details

Scope: USA, Region: usa_louisiana

Start Date: 1 March 2022

Violations source date: 1 March 2022

Total violation count: 5,542

Uploaded fixes: 5,459

Source verifier rejected: 0

Filtered out due to duplicated way id: 82

Incomplete data (both the inner ring way and the duplicating way had tags): 1


Total opened changesets: 110 (117973580, 117973610, 117973642, 117973669, 117973698, 117973721, 117973748, 117973776, 117973817, 117973847, 117973878, 117973907, 117973935, 117973960, 117973987, 117974018, 117974047, 117974073, 117974101, 117974126, 117974153, 117974177, 117974205, 117974233, 117974251, 117974284, 117974315, 117974344, 117974384, 117974411, 117974442, 117974462, 117974487, 117974526, 117974552, 117974576, 117974607, 117974637, 117974668, 117974703, 117974730, 117974756, 117974775, 117974799, 117974834, 117974861, 117974887, 117974913, 117974940, 117974971, 117974997, 117975025, 117975047, 117975066, 117975106, 117975126, 117975151, 117975174, 117975202, 117975221, 117975247, 117975270, 117975295, 117975324, 117975352, 117975363, 117975389, 117975416, 117975440, 117975456, 117975480, 117975502, 117975521, 117975558, 117975602, 117975631, 117975660, 117975688, 117975715, 117975741, 117975763, 117975791, 117975819, 117975847, 117975871, 117975900, 117975920, 117975935, 117975960, 117975990, 117976007, 117976045, 117976078, 117976108, 117976138, 117976164, 117976197, 117976219, 117976242, 117976266, 117976297, 117976322, 117976348, 117976372, 117976397, 117976419, 117976440, 117976470, 117976501, 117976515)

Total time of run: approx. 82 minutes