User:Sanderd17/Areas
Some thoughts on my side regarding The Future of Areas.
An area type referencing nodes or ways?
The first question is whether an area type should reference nodes or ways. Supporting both makes it difficult for the tools that have to interpret it, IMO, and the only gain is that the data gets a bit smaller. But since both bandwidth and storage space are becoming cheap rather rapidly, and good compression algorithms handle textual data very well, data size is not a problem in my eyes.
When we support only nodes, we get serious problems with the big multipolygons that exist, such as the big lakes, administrative borders, ...
So here's my proposal:
<area id="..." ...>
<tag k="..." v="..." />
<ring>
<way ref="1" reversed="false" />
<way ref="2" reversed="true"/>
...
</ring>
<ring>
<way ref="3" reversed="true" />
<way ref="4" reversed="true"/>
...
</ring>
</area>
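To show how little a data consumer would need to read this format, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. The id, the tag values and the function name parse_area are made up for illustration; only the element and attribute names come from the proposal above.

```python
import xml.etree.ElementTree as ET

# Sample document in the proposed format. The id and the tag are
# illustrative placeholders, not real OSM data.
SAMPLE = """
<area id="1">
  <tag k="natural" v="water" />
  <ring>
    <way ref="1" reversed="false" />
    <way ref="2" reversed="true" />
  </ring>
  <ring>
    <way ref="3" reversed="true" />
    <way ref="4" reversed="true" />
  </ring>
</area>
"""


def parse_area(xml_text):
    """Return (tags, rings); each ring is a list of (way_id, reversed) pairs."""
    root = ET.fromstring(xml_text.strip())
    tags = {t.get("k"): t.get("v") for t in root.findall("tag")}
    rings = []
    for ring in root.findall("ring"):
        members = []
        for way in ring.findall("way"):
            flag = way.get("reversed")
            # The proposal requires an explicit "true" or "false" on every way.
            if flag not in ("true", "false"):
                raise ValueError(f"bad reversed flag: {flag!r}")
            members.append((int(way.get("ref")), flag == "true"))
        rings.append(members)
    return tags, rings


tags, rings = parse_area(SAMPLE)
```

The ring order and the order of ways inside a ring are preserved, which is what the rest of this proposal relies on.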
Referencing dependency
First of all, I think we should go for area objects referencing ways. Since OSM has the history of referencing only in one direction, and only top to bottom (i.e. ways reference nodes, not the other way around), we should keep the same approach with areas.
Importance of orientation
Closed Rings
Next to that, the rings are coded explicitly in the data, and must be closed rings. I.e. the ways forming a ring should together form a closed, non-self-intersecting loop. That means that the last node of a way and the first node of the next way in the ring should be the same (after any reversal is applied), and the last way should end where the first way starts. This is the first part where way orientation is important. The "reversed" flags have to be set correctly to close the rings.
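The closure rule can be sketched as a small check. This is only an illustration: the function name ring_nodes and the data layout (a ring as a list of (way_id, reversed) pairs, ways as a dict from way id to node ids) are assumptions of mine, and the check covers connectivity only, not self-intersection.

```python
def ring_nodes(ring, way_nodes):
    """Walk a ring and return its node ids in traversal order.

    ring: list of (way_id, reversed) pairs, in ring order.
    way_nodes: dict mapping way_id -> list of node ids (hypothetical layout).
    Raises ValueError when two consecutive ways do not share an end node,
    or when the ring does not end where it started.
    """
    nodes = []
    for way_id, is_reversed in ring:
        seq = way_nodes[way_id]
        if is_reversed:
            seq = seq[::-1]  # apply the "reversed" flag
        if nodes and nodes[-1] != seq[0]:
            raise ValueError(f"way {way_id} does not connect to the previous way")
        # Skip the shared node when appending to an existing chain.
        nodes.extend(seq if not nodes else seq[1:])
    if not nodes or nodes[0] != nodes[-1]:
        raise ValueError("ring is not closed")
    return nodes


example_ways = {1: [10, 11, 12], 2: [10, 13, 12]}
closed = ring_nodes([(1, False), (2, True)], example_ways)
```

With both flags set to "false" the same two ways would fail the check, because way 2 starts at node 10 while way 1 ends at node 12.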
Determining outer vs inner
When you have a sphere, the outer and inner of a polygon on that sphere are not topological terms. You can stretch the sphere on one side, and shrink it on another side, so the outer and inner are switched, while the data is still topologically equal.
Suppose you have a bounding box, and a way that's part of an area crosses your bounding box (see image). Even when you know this way is classified as the "outer" of a polygon, there's no way you can decide what part of the bounding box you can colour. Since lots of people edit in bounding boxes, it's nice if the editor shows the right colour on the right side of the way. Some data users also use per-country extracts (because the complete dataset is too big). Since there are some quite big and important polygons that cross country borders (think of some great lakes), it's also important that they know what's the inner and outer part of an area.
That's where we have our second use of the directionality of ways. After the "reversed" flags are applied, you can say that the outer ring is counter-clockwise. In other words, the area to render stays left of all ways.
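For a complete ring, the orientation can be checked with the shoelace formula. The sketch below treats coordinates as flat x/y values (x = longitude, y = latitude), which is an assumption that ignores the spherical and border-crossing cases discussed in this proposal; the function names are mine.

```python
def signed_area(points):
    """Shoelace formula for a closed ring given as [(x, y), ...] with first == last.

    Positive result: counter-clockwise, so the area lies left of the direction
    of travel. Negative result: clockwise.
    """
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:]):
        total += x1 * y2 - x2 * y1
    return total / 2.0


def is_counter_clockwise(points):
    return signed_area(points) > 0


square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
```

Note that a renderer working with partial data would not need this check at all: with the "left of the way" rule it can fill on the left side of each way segment it has, which is the point of encoding orientation in the data.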
The coordinate borders - Crossing the 180th meridian
I proposed a way to handle coordinate border problems on ways here: https://trac.openstreetmap.org/ticket/1612#comment:18
This way of representing areas would work well with areas crossing the coordinate borders. The same as with regular bounding boxes, the correct part to render can be found easily. Even an area that would wrap the poles could be handled correctly. As not only the date line can be used to connect to a full ring, all sides of the world rectangle can be followed as if it were a normal bounding box with partial data. This will avoid nasty ways to work around the poles or the date line.
Big areas
For big areas, it won't differ a lot from the current multipolygons. The main difference is that the data should be unambiguous, even in the case of partial downloads (only downloading some ways + the complete area object). So it becomes a lot easier to manage and big areas are less of a problem. Say you have an area consisting of 2000 ways (which is a very big lake), in textual XML data, you need about 40 bytes per way referenced by the area. In total, the download for that area object (without downloading the ways or nodes) would be about 80 kB, which isn't a problem at all.
The only really big areas are coastlines. There are currently about 762 000 ways tagged with natural=coastline. We don't have to put all of them into a single area object though. We can put them in area objects per land mass. The biggest land area would probably be Africa-Asia-Europe (AAE). Sadly, I have no way to predict how many connected pieces of coastline this would need without downloading the data (and I don't really want to download it). America and Antarctica are also rather big pieces, and there's a huge number of islands that can have their own area object. Let's say we put all the coastline segments in a single area object. This will be a severe over-estimation, but might be more correct in the future, when the coastline gets refined more and more (thus gets more nodes and more ways).
In total, the coastline area object would be about 30.5 MB big in that case. Honestly, that's a bit much. 30 MB costs quite some time to download on the fly. So users with a normal connection editing near the coast will be set back because of this.
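The arithmetic behind these estimates, as a small Python sketch. The 40 bytes per referenced way and the way counts are the rough figures used in the text; the function name is mine.

```python
BYTES_PER_WAY_REF = 40  # rough size of one <way ref="..." reversed="..."/> line


def area_object_size(n_ways, bytes_per_ref=BYTES_PER_WAY_REF):
    """Approximate size in bytes of an area object that references n_ways ways."""
    return n_ways * bytes_per_ref


lake_bytes = area_object_size(2000)        # very big lake
coast_bytes = area_object_size(762_000)    # every natural=coastline way
print(lake_bytes / 1000, "kB")             # 80.0 kB
print(coast_bytes / 1_000_000, "MB")       # 30.48 MB
```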
The only ways to solve this are arbitrary splits of land areas, or keeping the current tag on every way.
Consistency checks
The database should only do XML validation, as it always does. Ensure that every way in an area has a reversed flag set to "true" or "false", ensure that the ways in an area exist, ...
The other checks should happen by the editor, and by external tools.
- No rings of the same area should share nodes or cross each other (this also means that one way can appear only once per area object).
- No rings of the same orientation in the same area should be directly inside each other (though they can be nested when one ring of a different orientation is between the two other).
- The outer-most rings (when seeing the world as a rectangle rather than a sphere) should be ordered counter-clockwise.
Areas that do not obey these rules should not get rendered at all. This will help with assuring the areas are correct. Tags on circular ways are, from now on, never considered to be area tags, and should not get rendered as area.
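Two of these editor-side checks are cheap to sketch. The code below assumes the same hypothetical layout as before (a ring as a list of (way_id, reversed) pairs, and per-ring node id lists); it does not cover crossing or nesting, which need real geometry.

```python
from collections import Counter


def duplicate_ways(rings):
    """Return the way ids that appear more than once in one area object."""
    counts = Counter(way_id for ring in rings for way_id, _ in ring)
    return sorted(w for w, n in counts.items() if n > 1)


def shared_nodes(ring_node_lists):
    """Return the node ids used by more than one ring of the same area."""
    first_ring_of = {}
    shared = set()
    for index, nodes in enumerate(ring_node_lists):
        for node in set(nodes):
            if node in first_ring_of and first_ring_of[node] != index:
                shared.add(node)
            first_ring_of.setdefault(node, index)
    return sorted(shared)
```

An editor or external validator could refuse to render an area as soon as either function returns a non-empty list.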
Editing operations
Editors should allow for quick creation of small areas, but also provide a good interface to modify and create big areas. Since using the data with this format seems easy enough, it's time to investigate the editability too.
Creating a new area
- There's a possibility for an area drawing tool. This tool can only be used for simple areas (i.e. areas that consist of one outer way). The area tool will create the way, and ensure the way is closed when finished. It will also automatically create an area object with the correct orientation of the way, and select that area object in order to add tags. This tool will make it possible to quickly add small areas (like simple buildings).
- There should also be a more involved area editor, comparable to a relation editor, where ways can be selected and added to a new area. The editor should compose rings and partial rings (which happens in case of incomplete data, or unfinished modifications), show those rings, and allow the user to reverse the orientation of complete rings (which means negating their "reversed" flags and reversing their order of occurrence in the ring).
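Reversing the orientation of a complete ring, as described in the last item, is a one-liner. A sketch, assuming a ring is stored as a list of (way_id, reversed) pairs (my layout, not part of the proposal):

```python
def reverse_ring(ring):
    """Reverse a ring: flip every "reversed" flag and reverse the member order."""
    return [(way_id, not is_reversed) for way_id, is_reversed in reversed(ring)]
```

For example, the ring [(1, False), (2, True)] becomes [(2, False), (1, True)], and applying the function twice gives back the original ring.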
Splitting ways
When splitting ways, the editor should check the areas of which this way was a part, and add the new way to them in the right position. Even with partial data, the right position can always be found, as it will be right after or before the existing way, and you can tell which by checking the connecting node and the "reversed" flags. The "reversed" flag of the new way part should be the same as the "reversed" flag of the existing way.
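A sketch of that rule, assuming a ring is a list of (way_id, reversed) pairs. The function and parameter names are hypothetical; new_part_follows_old says whether the new way holds the nodes after the split point, seen in the old way's own direction.

```python
def add_split_part(ring, old_id, new_id, new_part_follows_old=True):
    """Insert the new way created by a split next to the way it was split from.

    The new way copies the old way's "reversed" flag. If the ring traverses
    the old way backwards (reversed is True), a part that follows the old way
    in way direction comes before it in ring order, and vice versa.
    """
    for i, (way_id, is_reversed) in enumerate(ring):
        if way_id == old_id:
            insert_after = new_part_follows_old != is_reversed
            pos = i + 1 if insert_after else i
            return ring[:pos] + [(new_id, is_reversed)] + ring[pos:]
    raise KeyError(old_id)
```

Only the ring entry of the old way is needed, so this works on partial data as well.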
Reversing orientation of ways
Simply check to which areas the ways belong, and negate their "reversed" flag.
Selecting areas
When selecting ways, you should see a list of areas that way belongs to, and be able to edit tags on those areas. You should also be able to directly select areas by clicking inside the area.
Adding/removing ways on areas
There should not be a complete list of all areas you downloaded in the UI (like the list of relations in JOSM). Such a list would be hard to navigate, certainly when all simple areas (such as buildings) would be listed there. Instead, the following workflow would help a lot:
- select the ways you want to add to (or remove from) an area
- press on a button (or hotkey) to get the "area editor"
- areas in the editor light up
- you click the right area and you get in the "area editor"
- there you can add your current selection to the existing (partial) rings. Either on automatic positions (the editor will compare ring end nodes) or manually between the ways of the different rings. Note that ways can only appear once per area, so ways that are already part of the area should not be in the "to-be-added" list. You can also remove your current selection, or select individual ways or rings to be removed. The editor should always try to find the most correct position to add a way. Since most people download things per bounding box, it means that they will have the surrounding ways downloaded when they want to add a way in that area.
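The automatic positioning in the last step can be sketched by comparing end nodes. This is only an illustration under my own assumptions: a partial ring is an open chain of (way_id, reversed) pairs together with its current start and end node, and the uniqueness check looks at this chain only, while a real editor would check the whole area.

```python
def attach_way(chain, chain_start, chain_end, way_id, first_node, last_node):
    """Try to attach a way to either end of a partial ring.

    Returns (new_chain, new_start, new_end), or None when the way touches
    neither end and has to be placed manually.
    """
    if any(w == way_id for w, _ in chain):
        raise ValueError("a way can appear only once per area")
    if first_node == chain_end:    # append in its own direction
        return chain + [(way_id, False)], chain_start, last_node
    if last_node == chain_end:     # append, traversed backwards
        return chain + [(way_id, True)], chain_start, first_node
    if last_node == chain_start:   # prepend in its own direction
        return [(way_id, False)] + chain, first_node, chain_end
    if first_node == chain_start:  # prepend, traversed backwards
        return [(way_id, True)] + chain, last_node, chain_end
    return None


result = attach_way([(1, False)], 10, 12, way_id=2, first_node=10, last_node=12)
```

In this example the new way ends where the chain ends, so it is appended with "reversed" set to true, and the start and end node become equal, meaning the ring is now closed.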