User:Sjwhitak/RIPTA Import Plan
The RIPTA Import is an import of the RIPTA GTFS dataset, which covers the bus routes in Rhode Island. The automatic import is complete as of 2024-03-07. The bus stops are now being added to the bus routes manually.
Goals
Add the bus stops in Rhode Island so I don't have to use Google Maps to find when the next bus arrives.
Schedule
I could finish this in a few days. I just want confirmation that my procedure is solid, and I don't want to mess anything up on OpenStreetMap.
Import Data
Background
Data source site: https://www.ripta.com/mobile-applications/
Data license: Public domain.
Hello, I am interested in working with the GIS data related to the bus stops throughout Rhode Island using the GTFS data found here: https://www.ripta.com/mobile-applications/ Mainly, importing the bus routes, stops, and schedules into OpenStreetMaps. I was directed to this email from [redacted], hopefully this is a good contact to email.
Since this is a public service, this data is presumably public domain, but I would like confirmation to make sure I'm not doing anything I'm not supposed to. Could I have confirmation this data is publicly available? And if it is stated somewhere publicly (such as a website) could you point that out to me? That would be nice for records' sake.
Awesome that you want to put RIPTA data into OpenStreetMaps! Yes, this data is publicly available and you’re welcome to use it.
In case you’re not aware, RIPTA makes planned service changes generally three times a year (January, June, and August/September) and when we make those changes we also update the GTFS data on our website. Most of these changes are not drastic – adding or removing a bus stop, small route deviations, retiming of schedules, etc. – but to be most accurate you may need to occasionally update the data or at least note the publish date.
Import Type
I do not trust my coding to automate updates and this will simply be a one-time update. If there are significant changes that RIPTA has made to the bus routes, then I can manually plot each one.
Data Preparation
Data Reduction & Simplification
The only risk in this import is that GTFS-OSM-Validator may miss bus stops that are already mapped in OSM. Its automatic detection range for bus stops is hard to tune: set too large, it blurs nearby stops together; set too small, it drops every match. To mitigate this risk, I will not rely on GTFS-OSM-Validator to find pre-existing OSM bus stops and will instead check every bus stop manually in JOSM. There are 3629 bus stops in total, and I have no problem checking each one in JOSM.
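A plain distance test can help prioritize which stops to look at first during the manual check. The following is a minimal sketch, assuming the GTFS stops and the existing OSM stops are already available as `(lat, lon)` pairs. The function names and the 30 m threshold are illustrative assumptions and are not part of GTFS-OSM-Validator or JOSM.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearby_osm_stops(gtfs_stop, osm_stops, max_dist_m=30.0):
    """Return (distance, osm_stop) pairs within max_dist_m, nearest first.

    max_dist_m is an assumed threshold; tune it to the local stop spacing.
    """
    lat, lon = gtfs_stop
    hits = [(haversine_m(lat, lon, olat, olon), (olat, olon))
            for olat, olon in osm_stops]
    return sorted((d, stop) for d, stop in hits if d <= max_dist_m)
```

A GTFS stop with no nearby OSM stop is a likely new node, while one with a close neighbour deserves a careful look in JOSM before uploading.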
Changeset Tags
Unfortunately, I am unfamiliar with what a changeset is, but these are the tags that I am adding to each bus stop:
Key | Value |
---|---|
bus | yes |
public_transport | platform |
gtfs_stop_code | code provided by GTFS |
name | bus stop provided by GTFS |
route_ref | bus route provided by GTFS |
network | RIPTA |
network:wikidata | Q7320944 |
network:wikipedia | en:Rhode Island Public Transit Authority |
highway | bus_stop |
operator | Rhode Island Public Transit Authority |
I am unsure what the purpose of `gtfs_stop_code` and `gtfs_id` is, but they are used in GTFS, so I presume there is a purpose for them.
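For context, the GTFS specification defines `stop_id` as the identifier that is unique within the feed and `stop_code` as the short code shown to riders. The tag table above can be expressed as a small function that builds the tags for one stop. This is only a sketch: `stop_tags` is a hypothetical helper, the input is assumed to be one row of the standard GTFS `stops.txt` file as a dict, and joining several routes with `;` follows the usual OSM multi-value convention.

```python
def stop_tags(stop_row, route_refs):
    """Build the OSM tag dict for one GTFS stops.txt row (hypothetical helper)."""
    tags = {
        "bus": "yes",
        "public_transport": "platform",
        "highway": "bus_stop",
        "name": stop_row["stop_name"],
        "network": "RIPTA",
        "network:wikidata": "Q7320944",
        "network:wikipedia": "en:Rhode Island Public Transit Authority",
        "operator": "Rhode Island Public Transit Authority",
    }
    # stop_code is optional in GTFS, so only tag it when present.
    if stop_row.get("stop_code"):
        tags["gtfs_stop_code"] = stop_row["stop_code"]
    # Several routes serving one stop are joined with ';'.
    if route_refs:
        tags["route_ref"] = ";".join(route_refs)
    return tags


# Made-up example row; not taken from the RIPTA feed.
example = stop_tags(
    {"stop_id": "100", "stop_code": "100", "stop_name": "Example St & Main St"},
    ["1", "3"],
)
```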
Data Transformation
The Python script inserts the OSM-specific tags at each bus stop. This script may fail on other GTFS formats, but it worked for me with RIPTA's.
```python
# This code runs through the RIPTA-GTFS dataset after running through gtfs-osm-sync
# and updates tags for OSM.
import xmltodict
from itertools import compress


def specific_route(nodes, route):
    """Return a boolean mask of the nodes whose 'ref' tag value contains `route`.

    The mask can be passed to itertools.compress to select those nodes.
    """
    N = len(nodes)
    route_mask = [False] * N
    for i in range(N):
        for tag in nodes[i]['tag']:
            if 'ref' in tag.values():
                if route in tag['@v']:
                    route_mask[i] = True
    return route_mask


def remove_dupes(list_of_dicts):
    # https://stackoverflow.com/a/41704996
    """Source: answer from wim
    """
    list_of_unique_dicts = []
    for dict_ in list_of_dicts:
        if dict_ not in list_of_unique_dicts:
            list_of_unique_dicts.append(dict_)
    return list_of_unique_dicts


def update_nodes(nodes):
    for node in nodes:
        # Remove white space from lat, lon
        node['@lat'] = node['@lat'].strip()
        node['@lon'] = node['@lon'].strip()
        # Search for broken/extra tags and remove them
        # or modify them.
        rm = []
        stop_id = -1
        for i in range(len(node['tag'])):
            if 'network' in node['tag'][i].values():
                node['tag'][i] = {'@k': 'network', '@v': 'RIPTA'}
            if 'gtfs_id' in node['tag'][i].values():
                rm.append(i)
            # In my previous import, I used `ref_route` instead of `route_ref`
            # which messed with things.
            if 'ref_route' in node['tag'][i].values():
                rm.append(i)
            if 'ref' in node['tag'][i].values():
                node['tag'][i]['@k'] = "route_ref"
            if 'gtfs_stop_code' in node['tag'][i].values():
                # Adjust to "gtfs:stop_id" according to KevinMapsThings
                node['tag'][i]['@k'] = "gtfs:stop_id"
                stop_id = node['tag'][i]['@v']
        if len(rm) > 0:
            # Delete from the end so earlier indices stay valid.
            for r in sorted(rm, reverse=True):
                del node['tag'][r]
        # Add OSM-specific tags.
        node['tag'].append({'@k': 'network:wikidata',
                            '@v': 'Q7320944'})
        node['tag'].append({'@k': 'network:wikipedia',
                            '@v': 'en:Rhode Island Public Transit Authority'})
        node['tag'].append({'@k': 'network:short',
                            '@v': 'RIPTA'})
        node['tag'].append({'@k': 'highway',
                            '@v': 'bus_stop'})
        node['tag'].append({'@k': 'operator',
                            '@v': 'Rhode Island Public Transit Authority'})
        node['tag'].append({'@k': 'operator:short',
                            '@v': 'RIPTA'})
        node['tag'].append({'@k': 'operator:wikidata',
                            '@v': 'Q7320944'})
        node['tag'].append({'@k': 'ref',
                            '@v': stop_id})
        node['tag'].append({'@k': 'gtfs:feed',
                            '@v': 'US-RI-RIPTA'})
        # If previous tags already exist, remove the duplicates.
        node['tag'] = remove_dupes(node['tag'])
    return nodes


with open('test.osc') as f:
    data = xmltodict.parse(f.read())
# I do not want to delete any nodes, so just in case I did, ignore them.
# I do not want to modify/create relations, so ignore those too.
data['osmChange']['delete'] = None
data['osmChange']['create']['relation'] = None
data['osmChange']['modify']['relation'] = None
# Add tags
data['osmChange']['create']['node'] = update_nodes(data['osmChange']['create']['node'])
data['osmChange']['modify']['node'] = update_nodes(data['osmChange']['modify']['node'])
# Save to XML file
with open('test2.osc', 'w') as f:
    f.write(xmltodict.unparse(data, pretty=True))
with open('test2.osc', 'r') as f:
    x = f.readlines()
# My stupidity from the previous edit required me removing this empty tag.
y = [line for line in x if '<tag k="route_ref" v=""></tag>' not in line]
with open('test3.osc', 'w') as f:
    f.writelines(y)
```
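Before opening the result in JOSM, it may be worth checking that every node in the output actually carries the expected tags. The following sketch uses only the standard library and is independent of the script above. `find_incomplete_nodes` and the `REQUIRED_KEYS` set are assumptions chosen for illustration, and the sample XML is made up rather than taken from the real `.osc` file.

```python
import xml.etree.ElementTree as ET

# Assumed minimum set of keys; adjust to the tags the import should carry.
REQUIRED_KEYS = {"highway", "name", "network", "operator", "gtfs:stop_id"}


def find_incomplete_nodes(osc_text, required=REQUIRED_KEYS):
    """Map node id -> sorted list of required keys that the node lacks."""
    root = ET.fromstring(osc_text)
    problems = {}
    for action in ("create", "modify"):
        for node in root.findall(f"./{action}/node"):
            keys = {tag.get("k") for tag in node.findall("tag")}
            missing = sorted(required - keys)
            if missing:
                problems[node.get("id")] = missing
    return problems


# Made-up sample with one node that is missing several tags.
SAMPLE = """<osmChange version="0.6">
  <create>
    <node id="-1" lat="41.8240" lon="-71.4128">
      <tag k="highway" v="bus_stop"/>
      <tag k="name" v="Example Stop"/>
    </node>
  </create>
  <modify/>
</osmChange>"""

report = find_incomplete_nodes(SAMPLE)
```

An empty report would suggest the tag step ran on every node; any entry points at a node to inspect by hand.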
Data Merge Workflow
Team Approach
Solo, unless someone cares to help.
Workflow
The import is a multi-step, manual process.
- Download the RIPTA GTFS data from the RIPTA website.
- Use GTFS-OSM-Validator to compare the GTFS bus stops with current bus stops already on OSM.
- After exporting to a `.osc` file, I use a Python script I wrote (see User:Sjwhitak/RIPTA_Import_Plan#Appendix) to update the tags to follow OSM standards for bus stops.
  - NOTE: This was done so I didn't have to manually insert each of these tags in with JOSM. There might be some automated method in JOSM, but I couldn't find it.
- In JOSM, zoom in to every bus stop to see if I missed anything before uploading to OSM.
- Finally, connect the route lines that were laid out by njtbusfan.
  - NOTE: I personally don't know how to do this in JOSM so I'll do it manually in OSM.
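The first step of the workflow, reading the downloaded GTFS data, can be sketched with the standard library alone. This assumes the download is a regular GTFS zip archive containing `stops.txt`. `read_gtfs_table` is a hypothetical helper name, and no particular file name for the RIPTA download is assumed.

```python
import csv
import io
import zipfile


def read_gtfs_table(zip_source, member="stops.txt"):
    """Read one table from a GTFS zip as a list of dicts keyed by column name.

    zip_source may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(zip_source) as zf:
        with zf.open(member) as raw:
            # utf-8-sig drops a leading byte-order mark if the feed has one.
            text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
            return list(csv.DictReader(text))
```

Each returned row then exposes the standard GTFS columns such as `stop_id`, `stop_name`, `stop_lat`, and `stop_lon`, which makes it straightforward to count the stops or compare them against what is already mapped.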
BONUS: RIPTA publishes bus times on its website for each route, and this data is also in the GTFS dataset. However, GTFS-OSM-Validator does not handle bus times, so I will need to either write a second Python script to handle them or extend GTFS-OSM-Validator to do so.
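As a starting point for such a script, the sketch below groups departure times by stop. It assumes the rows come from the standard GTFS `stop_times.txt` file, already parsed into dicts with `stop_id` and `departure_time` columns. The function names are hypothetical, and how the times would be represented in OSM is left open.

```python
from collections import defaultdict


def to_seconds(hms):
    """Convert a GTFS 'HH:MM:SS' time to seconds after the service day starts.

    GTFS allows hours of 24 or more for trips that run past midnight.
    """
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def departures_by_stop(stop_time_rows):
    """Group departure times by stop_id, sorted chronologically."""
    by_stop = defaultdict(list)
    for row in stop_time_rows:
        departure = (row.get("departure_time") or "").strip()
        if departure:  # skip rows without a timed departure
            by_stop[row["stop_id"]].append(departure)
    return {stop_id: sorted(times, key=to_seconds)
            for stop_id, times in by_stop.items()}
```

Sorting by seconds rather than by string keeps a departure such as `25:10:00` after `08:05:00` and also handles times written without a leading zero.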
I have the JOSM reverter plugin, so I can use that to revert changes.
Current state (2024-03-07)
- All bus stops are imported
- All bus routes are added
- Bus stops are NOT connected to bus routes yet
Status | Bus routes |
---|---|
Not completed | R, QX, 6, 9x, 10, 12x, 13, 14, 29, 33, 76, 87 |
Bus route cleaned | 1, 17, 18, 19, 20, 27, 28, 30, 31, 32, 34, 35, 40, 50, 51, 54, 55, 56, 57, 58, 59x, 63, 64, 65x, 66, 67, 71, 72, 73, 75, 78, 80, 92, 95x, 301 |
Completed | 21, 22, 23, 24L, 60, 61x |
Add to OSM | 3, 4, 16, 68, 69, 88, 89, 203, 204, 231, 242, 281, 282, BB, PVD |
Remove from OSM | 3A, 3B, 8x, 49, 62 |
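The "Add to OSM" and "Remove from OSM" rows amount to a set comparison between the routes in the GTFS feed and the routes already in OSM. A minimal sketch, assuming both are available as collections of route names; `compare_routes` is a hypothetical helper and the inputs below are a small illustrative subset, not the full lists.

```python
def compare_routes(gtfs_routes, osm_routes):
    """Return (to_add, to_remove) as sorted lists of route names."""
    gtfs, osm = set(gtfs_routes), set(osm_routes)
    to_add = sorted(gtfs - osm)      # in the GTFS feed but not yet in OSM
    to_remove = sorted(osm - gtfs)   # in OSM but no longer in the GTFS feed
    return to_add, to_remove


# Illustrative subset only.
to_add, to_remove = compare_routes({"1", "3", "4"}, {"1", "3A", "3B"})
```

Re-running a comparison like this after each RIPTA service change would show which routes need attention without rechecking the whole table.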
See also
The post to the community forum was sent on 2023-12-19 and can be found here