PT TEC Wallonie BE Import
About
This page documents the import of public transport data from TEC. TEC released the data under a Creative Commons Attribution 4.0 International (CC BY 4.0) license (http://opendata.awt.be/dataset/tec).
This dataset contains all the stops and the timetables for buses and trams in Wallonia (and some in Flanders, the Brussels region and the German-speaking region).
Import Plan Outline
Goals
The goal of the import is to use this dataset in mapping activities. We are not attempting a blind import: all data has to be reviewed by a human before it can appear in the OSM data.
The nodes have to be moved to the right locations and the names need to be double checked.
Schedule
There is no fixed schedule. It takes however long it takes. Updated data coming in from upstream is continuously integrated.
Import Data
Background
Import Type
Each stop and each route is vetted before it gets added to OSM. No automatic import is to take place. The root element of the generated OSM file carries an extra upload="no" attribute:
<osm version="0.6" upload="no" generator="Python script">
which causes an additional message from JOSM if somebody tries to upload all of the data at once.
The file is only meant as an aid to facilitate adding/integrating stops manually, not for automatic upload/import.
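To illustrate the upload="no" safeguard, here is a minimal Python sketch that builds such a root element with the standard library. The layout of the stop dictionaries and the function name build_osm_document are assumptions made for this example; this is not the actual conversion script.

```python
import xml.etree.ElementTree as ET


def build_osm_document(stops):
    """Build an <osm> tree whose root element carries upload="no"."""
    root = ET.Element(
        "osm",
        {"version": "0.6", "upload": "no", "generator": "Python script"},
    )
    for index, stop in enumerate(stops, start=1):
        node = ET.SubElement(
            root,
            "node",
            {
                # Negative ids mark objects that do not exist in OSM yet.
                "id": str(-index),
                "lat": str(stop["lat"]),
                "lon": str(stop["lon"]),
            },
        )
        for key, value in stop["tags"].items():
            ET.SubElement(node, "tag", {"k": key, "v": value})
    return root


# Hypothetical stop, only to show the shape of the input.
example_stops = [
    {"lat": 50.0, "lon": 4.0, "tags": {"highway": "bus_stop", "operator": "TEC"}},
]
osm_root = build_osm_document(example_stops)
osm_xml = ET.tostring(osm_root, encoding="unicode")
```

The attribute lives on the root element, so it applies to the whole file rather than to individual stops.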
Data Preparation
Data Reduction & Simplification
The technical nitty gritty for converting the data can be found here:
https://wiki.openstreetmap.org/wiki/WikiProject_Belgium/De_Lijndata
I'm adapting what is found there to the peculiarities of the data coming from TEC. De Lijn provided a dump of their tables. TEC provides a more 'intelligent' format, which unfortunately makes it harder to load into PostGIS.
The latest version of the resulting osm file can be found here:
https://dl.dropboxusercontent.com/u/42418402/TEC.osm.zip
In this file, all stops that are not yet in OSM get an odbl=new tag. This has nothing to do with the ODbL license, but JOSM removes these tags automatically before it uploads the data.
The file contains all stops. For each stop, a route_ref has been calculated from the timetable information that is part of the data. To select a group of stops in order to add route relations in the next step, this search expression (a regular expression) can be used:
RR route_ref="(^|.+;)26(;.+|$)" inview odbl=new
26 gets replaced with the route number you want to work on.
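To see why the expression is anchored on both sides, the same pattern can be tried out in Python. This is only a sketch: the helper names are made up for the example, and it assumes that JOSM applies the pattern to the tag value in much the same way as re.search does.

```python
import re


def route_ref_pattern(route_number):
    """Return the pattern from the search expression for one route number."""
    return re.compile(r"(^|.+;)" + re.escape(route_number) + r"(;.+|$)")


def serves_route(route_ref, route_number):
    """True if route_number is one of the ';'-separated values in route_ref."""
    return route_ref_pattern(route_number).search(route_ref) is not None


# The number must be a whole entry of the list, not a substring of one.
matches = [ref for ref in ["26", "1;26;708", "126", "26a", "1;260"]
           if serves_route(ref, "26")]
```

The two groups make sure the number is preceded by the start of the value or a semicolon and followed by a semicolon or the end of the value, so 126, 26a and 260 are not selected when working on route 26.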
Then copy/paste the selected stops to your work layer and reposition them one by one, checking the names for abbreviations that weren't converted properly and adding zone information.
In order to add the route relations, the member stops need to be uploaded first, then the file needs to be saved and a script needs to run to update the local DB.
After that, createOSMrouterelations.py can be used to create the route relations, one for each sequence of stops in a different order. In the case of telescopic lines, only the longest sequence of stops gets a route relation.
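The rule for telescopic lines can be sketched as follows. This is not the code of createOSMrouterelations.py. It assumes that a variant counts as telescopic when its stops appear as one uninterrupted run inside a longer variant, and the function names are hypothetical.

```python
def is_contiguous_subsequence(short, long):
    """True if the stops of `short` appear as one uninterrupted run in `long`."""
    n = len(short)
    return any(long[i:i + n] == short for i in range(len(long) - n + 1))


def keep_longest_sequences(sequences):
    """Drop duplicates and variants that are contained in a longer variant."""
    unique = []
    for seq in sequences:
        seq = tuple(seq)
        if seq not in unique:
            unique.append(seq)

    kept = []
    for seq in unique:
        covered = any(
            len(other) > len(seq) and is_contiguous_subsequence(seq, other)
            for other in unique
        )
        if not covered:
            kept.append(seq)
    return kept


# Hypothetical stop identifiers: one full run, a shortened run, the way back.
variants = [
    ["A", "B", "C", "D"],
    ["B", "C"],
    ["D", "C", "B", "A"],
    ["A", "B", "C", "D"],
]
longest = keep_longest_sequences(variants)
```

Here the shortened variant B, C is dropped, while the opposite direction is kept because its stops are in a different order.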
Tagging Plans
tag | value |
---|---|
highway | bus_stop |
name | ongoing effort to expand abbreviations automatically and to streamline/generalise others like Eglise -> Église, Ecole -> École, Chssée -> Chaussée |
operator | TEC |
ref | internal ref number of TEC |
zone | 4 digits instead of the 2 visible on the poles |
route_ref | 1;3;20a;708 |
public_transport | platform |
bus | yes |
tram | yes (where applicable) |
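The name clean-up mentioned in the table could look roughly like the sketch below. Only the three replacements listed above are taken from this page; the function name and the whole-word matching are assumptions, and the real script may work differently.

```python
import re

# Replacements named in the tagging table; a real list would be longer.
REPLACEMENTS = {
    "Eglise": "Église",
    "Ecole": "École",
    "Chssée": "Chaussée",
}

# Match whole words only, so that longer words are left alone.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(word) for word in REPLACEMENTS) + r")\b"
)


def normalise_stop_name(name):
    """Expand or correct the known abbreviations in a stop name."""
    return _PATTERN.sub(lambda match: REPLACEMENTS[match.group(1)], name)


cleaned = normalise_stop_name("Chssée de Bruxelles")
```

Matching on word boundaries keeps the substitution from touching words that merely start with one of the listed forms. Every result still needs a human check, as described in the goals.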
Remarks
When a stop is served by more than one operator (common in the Brussels region), one node per operator is used to facilitate automated QA. Unfortunately TEC is divided into entities, which all assign their own ref codes. So to keep things manageable, each of these entities is considered a separate operator. All these stops are combined in a stop_area relation.
In Brussels this leads to 4 nodes for the same stop, when the stop is served by MIVB/STIB, De Lijn, TEC Brabant-Wallon and TEC Charleroi. It looks a bit odd, but nodes are cheap.
tag | value |
---|---|
from | Bruxelles Midi |
name | TEC W Bruxelles Midi - Waterloo |
operator | TEC |
ref | W |
route | bus (or tram) |
to | Waterloo |
via | needs to be added manually |
ways: get no roles and form an ordered sequence from beginning to end (they need to be added manually, although I did write a script that runs inside JOSM and can find the nearest way to a stop)
stops: get a platform role automatically; this needs to be changed to a more correct role where needed, for stops where one can only board or get off.
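As a rough sketch of how the route relation tags and member roles above fit together, the following Python builds them for one variant. The function name and the returned structure are hypothetical. The type=route tag is the standard one for OSM route relations and does not come from the table, and listing stops before ways is an assumption.

```python
def build_route_relation(operator, ref, origin, destination, mode,
                         stop_ids, way_ids=()):
    """Return tags and members for one route variant as plain Python data."""
    tags = {
        "type": "route",  # standard OSM tag, not listed in the table above
        "route": mode,    # "bus" or "tram"
        "operator": operator,
        "ref": ref,
        "from": origin,
        "to": destination,
        "name": f"{operator} {ref} {origin} - {destination}",
        # "via" is left out on purpose: it needs to be added manually.
    }
    # Stops get the platform role; ways get an empty role and keep their order.
    members = [("node", stop_id, "platform") for stop_id in stop_ids]
    members += [("way", way_id, "") for way_id in way_ids]
    return {"tags": tags, "members": members}


# Placeholder ids; real stops must already be uploaded, as explained earlier.
relation = build_route_relation(
    "TEC", "W", "Bruxelles Midi", "Waterloo", "bus", stop_ids=[-1, -2]
)
```

The platform roles are only a starting point and still need to be corrected by hand for stops where one can only board or get off.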
tag | value |
---|---|
name | Bruxelles Midi - Waterloo |
operator | TEC |
ref | W |
route_master | bus |
type | route_master |
Changeset Tags
source = TEC;Bing2011
Data Transformation
https://wiki.openstreetmap.org/wiki/WikiProject_Belgium/De_Lijndata
Data Transformation Results
Data Merge Workflow
Tedious manual labour
Team Approach
If people want to join in, send me a message (Polyglot) and I'll explain what you need to know during a few hangouts. This usually takes several hours...
Workflow
Dedicated upload account
Every stop gets vetted. The data from upstream serves as one of many references; integration/conflation is manual labour.
Conflation
Conflation has to be done by each individual contributor. It's better to let a human decide on this.
QA
For the stops I have a script which generates output in wiki format where names and route_refs are compared. JOSM RC is used to make it easy to upload them.
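A comparison of this kind could be sketched as below. This is not the actual QA script: the input dictionaries (keyed by the TEC ref), the column layout and the function names are all assumptions, and route_refs are compared as sets so that a different ordering is not reported.

```python
def route_set(route_ref):
    """Split a ';'-separated route_ref into a set of line numbers."""
    return {part for part in route_ref.split(";") if part}


def qa_wiki_table(upstream, osm):
    """Return a MediaWiki table listing name and route_ref differences."""
    lines = ['{| class="wikitable"', "! ref !! field !! upstream !! OSM"]
    for ref in sorted(upstream):
        if ref not in osm:
            continue  # stops that are not in OSM yet are not compared
        up, cur = upstream[ref], osm[ref]
        if up["name"] != cur["name"]:
            lines += ["|-", f"| {ref} || name || {up['name']} || {cur['name']}"]
        if route_set(up["route_ref"]) != route_set(cur["route_ref"]):
            lines += [
                "|-",
                f"| {ref} || route_ref || {up['route_ref']} || {cur['route_ref']}",
            ]
    lines.append("|}")
    return "\n".join(lines)


# Hypothetical refs and values, only to show the shape of the data.
upstream_stops = {
    "X1": {"name": "Église", "route_ref": "1;3"},
    "X2": {"name": "École", "route_ref": "20a"},
}
osm_stops = {
    "X1": {"name": "Eglise", "route_ref": "3;1"},
    "X2": {"name": "École", "route_ref": "20a"},
}
report = qa_wiki_table(upstream_stops, osm_stops)
```

Only stops with a difference end up in the table, which keeps the wiki page short enough to work through by hand.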
For routes, it's work in progress. QA and maintenance on them would be a lot easier if it were possible to use route segments.