Generate GTFS Shapes User's Guide

Created by Melinda Morang, Esri
Contact: mmorang@esri.com

Copyright 2017 Esri
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0. Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

What this tool does

The optional GTFS shapes.txt file contains the actual on-street paths taken by transit vehicles in your system. A good shapes.txt file is important in order for GTFS-based routing apps to display transit routes correctly on the map. Read more about the shapes.txt file in the GTFS reference doc.
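As a rough illustration of what shapes.txt holds, the Python sketch below groups the file's points by shape_id and orders them by shape_pt_sequence. The field names come from the GTFS reference. The function name and the sample rows are made up for this example and are not part of the toolbox.

```python
import csv
import io
from collections import defaultdict


def read_shapes(shapes_file):
    """Return {shape_id: [(lat, lon), ...]} with points ordered by shape_pt_sequence."""
    points = defaultdict(list)
    for row in csv.DictReader(shapes_file):
        points[row["shape_id"]].append(
            (int(row["shape_pt_sequence"]),
             float(row["shape_pt_lat"]),
             float(row["shape_pt_lon"]))
        )
    # Sort each shape's points by sequence number, then drop the sequence value.
    return {
        shape_id: [(lat, lon) for _, lat, lon in sorted(pts)]
        for shape_id, pts in points.items()
    }


# Made-up sample data standing in for a real shapes.txt file.
SAMPLE_SHAPES = """shape_id,shape_pt_lat,shape_pt_lon,shape_pt_sequence
A,34.0502,-117.1900,2
A,34.0500,-117.1950,1
B,34.1000,-117.2000,1
"""

shapes = read_shapes(io.StringIO(SAMPLE_SHAPES))
print(shapes["A"])
```

To try this on a real dataset, pass an open file instead of the sample string, for example open("shapes.txt", newline="", encoding="utf-8-sig"). The utf-8-sig encoding is a guess that guards against a byte order mark at the start of the file.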

The Generate GTFS Shapes toolbox produces a new shapes.txt file for your GTFS dataset or allows you to edit an existing one. The toolbox is targeted primarily toward transit agencies seeking to improve their GTFS datasets.

To create an entirely new shapes.txt file from scratch, you give the tool a valid, existing GTFS dataset, and the tool creates a new shapes.txt file and updates the shape_id field in trips.txt and the shape_dist_traveled field in stop_times.txt. Step 1 of the tool creates a feature class with good estimates for the on-street paths used in your transit system. You can edit this feature class using your own knowledge to ensure that it represents the correct paths. Then, you can use Step 2 of the tool to update your GTFS files to include this shape information.

To edit one or more existing shapes, you give the tool a valid, existing GTFS dataset with a shapes.txt file and choose the shape(s) you want to edit. The chosen shapes will be drawn in the map, where you can edit them. Then, you can use Step 2 of the tool to update your shapes.txt file and the relevant shape_dist_traveled field entries in stop_times.txt.

The tools will not overwrite any existing GTFS files. You can choose the output location for the new files, and you can compare them with the old ones before manually replacing the old ones with the updated ones.
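One simple way to start comparing old and new files is to check which columns the new file has gained. The sketch below is only an illustration of that idea. The function name and the sample file contents are hypothetical, and it looks only at the header row, not at the values.

```python
import csv
import io


def added_columns(old_file, new_file):
    """Return the column names present in new_file's header but not in old_file's."""
    old_header = next(csv.reader(old_file))
    new_header = next(csv.reader(new_file))
    return [column for column in new_header if column not in old_header]


# Made-up stand-ins for an original trips.txt and the updated copy.
OLD_TRIPS = "route_id,service_id,trip_id\nR1,WEEKDAY,T1\n"
NEW_TRIPS = "route_id,service_id,trip_id,shape_id\nR1,WEEKDAY,T1,1\n"

print(added_columns(io.StringIO(OLD_TRIPS), io.StringIO(NEW_TRIPS)))
```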

Software requirements

Data requirements

Getting started

Workflow

This tool has four steps:

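  1. Create a reasonable estimate of your transit shapes by running one of the following tools:

    Step 1: Generate Shapes with Network Analyst
    Step 1: Generate Shapes with ArcGIS Online
    Step 1: Generate Shapes with Straight Lines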

    OR choose one or more existing shapes from your existing shapes.txt file to draw in the map using the Step 1: Update Existing Shapes tool.

  2. Make whatever edits you need to make to your transit shape feature class using the editing tools in ArcMap or ArcGIS Pro.

  3. Run the tool called Step 2: Generate new GTFS text files to generate a shapes.txt file and add the appropriate shape-related fields to your trips.txt and stop_times.txt files.

  4. Review the output, and, if satisfied, replace your existing GTFS files with the new ones.

How to get the best results

This tool has many options, and the behavior differs slightly depending on the version of ArcGIS you're using. Here are my recommendations for getting the best estimated routes, in order of preference.

  1. Use the ArcGIS Online version of Step 1 either in ArcMap 10.3 or higher or ArcGIS Pro. The ArcGIS Online routing service uses high-quality network data, and the tool configures the analysis settings optimally for this application.

  2. If you don't have access to the ArcGIS Online routing services, but you have ArcMap 10.3 or higher, use ArcMap (not ArcGIS Pro) to run the Network Analyst version of Step 1. Make sure to use a high-quality network dataset, such as StreetMap Premium, and choose good network restrictions and other settings.

  3. If you can't do either of the above, or you don't have a high quality network dataset, you might be better off using the Straight Line version of Step 1 and manually editing all the shapes to match the streets. The Network Analyst version will not work well if you have poor-quality network data, and the results will be significantly worse if you're using an ArcMap version prior to 10.3 because new functionality that became available in 10.3 is used to improve the estimated results.

Having good-quality GTFS data is also important in obtaining good results. You should make sure your GTFS stops are in the correct locations. If you have followed the above guidelines and continue to have widespread problems, you should consider editing your GTFS stop locations to place them closer to the correct positions on the streets. The Edit GTFS Stop Locations tool can help you generate a corrected GTFS stops.txt file for your dataset.

Note: the above recommendations apply primarily if you are trying to generate shapes for bus routes. The Straight Line version of Step 1 will be most valuable for estimating shapes for transit modes that do not use the streets, such as subways.

Running Step 1: Generate Shapes with Network Analyst

Step 1: Generate Shapes with Network Analyst uses your GTFS schedule information and the Network Analyst Route solver to produce a feature class showing the most probable geographic paths taken by transit vehicles in your system.

This step will take some time to run for large transit systems. Smaller transit systems should only take a few minutes.

To run this tool, you must have a good network dataset of streets that covers the area served by your transit agency. If you do not have the Network Analyst extension and an adequate network dataset, you can generate shapes that follow the streets by using the Step 1: Generate Shapes with ArcGIS Online tool, or you can generate simple straight-line estimates for your route shapes with the Step 1: Generate Shapes with Straight Lines version of this tool.

This tool will produce significantly better results in ArcMap version 10.3 or higher.

Screenshot of tool dialog

Inputs

Outputs

A file geodatabase with the name and location you specified will be created and will contain the following files:

Running Step 1: Generate Shapes with ArcGIS Online

Step 1: Generate Shapes with ArcGIS Online uses your GTFS schedule information and the ArcGIS Online route service to produce a feature class showing the most probable geographic paths taken by transit vehicles in your system.

This version of Step 1 requires ArcMap 10.3 or higher or ArcGIS Pro.

ArcGIS Online's route service is available for most parts of the world. If you are uncertain whether the route service covers the geographic location served by your transit system, check the ArcGIS Online Network Dataset Coverage map.

To use this tool, you must be signed in to an ArcGIS Online account with routing privileges and sufficient credits. Talk to your organization's ArcGIS Online administrator if you need help checking or setting up your account. This tool will generate one ArcGIS Online route per shape in your GTFS data. So, if your transit system has 100 unique shapes, the tool will solve 100 routes using ArcGIS Online. As of this writing, "Simple Routes" cost 0.04 credits each, so the total number of credits incurred by the tool would be 4. Please refer to the ArcGIS Online Service Credits Overview page for more detailed and up-to-date information. The number of shapes to be generated will be at minimum equal to the number of unique route_id values in your routes.txt file. Most datasets have more shapes than routes because routes can include trips with different sequences of stops.
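If you want a rough credit estimate before running the tool, the sketch below counts distinct stop sequences per route and multiplies by the per-route rate quoted above. This is an approximation under stated assumptions: it treats each unique combination of route_id and ordered stop list as one shape, which may not match exactly how the tool groups trips, and the function names and sample data are made up for illustration.

```python
from collections import defaultdict

CREDITS_PER_SIMPLE_ROUTE = 0.04  # rate quoted in this guide; check current pricing


def count_stop_patterns(trips, stop_times):
    """Count distinct (route_id, ordered stop sequence) combinations.

    trips: iterable of dicts with trip_id and route_id
    stop_times: iterable of dicts with trip_id, stop_id, and stop_sequence
    """
    stops_by_trip = defaultdict(list)
    for st in stop_times:
        stops_by_trip[st["trip_id"]].append((int(st["stop_sequence"]), st["stop_id"]))

    patterns = set()
    for trip in trips:
        ordered = tuple(stop for _, stop in sorted(stops_by_trip[trip["trip_id"]]))
        patterns.add((trip["route_id"], ordered))
    return len(patterns)


def estimate_credits(num_shapes, credits_per_route=CREDITS_PER_SIMPLE_ROUTE):
    """Estimated credits for solving one route per shape."""
    return num_shapes * credits_per_route


# Made-up sample: three trips on one route, two distinct stop sequences.
trips = [
    {"trip_id": "T1", "route_id": "R1"},
    {"trip_id": "T2", "route_id": "R1"},
    {"trip_id": "T3", "route_id": "R1"},
]
stop_times = [
    {"trip_id": "T1", "stop_id": "S1", "stop_sequence": "1"},
    {"trip_id": "T1", "stop_id": "S2", "stop_sequence": "2"},
    {"trip_id": "T2", "stop_id": "S1", "stop_sequence": "1"},
    {"trip_id": "T2", "stop_id": "S2", "stop_sequence": "2"},
    {"trip_id": "T3", "stop_id": "S1", "stop_sequence": "1"},
]

num_shapes = count_stop_patterns(trips, stop_times)
print(num_shapes, round(estimate_credits(num_shapes), 2))
```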

Note: If your transit lines have a large number of stops, it may not be possible to generate an on-street route shape using ArcGIS Online because the ArcGIS Online route service limits the total number of stops allowed per route (150 as of this writing - check the route service documentation for the latest information). Shapes that exceed the stop limit will be estimated by connecting the stops with straight lines, and the tool will show a warning telling you which shape_id values have encountered this problem.
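To get an early idea of whether your data will hit this limit, you could count the stops per trip in stop_times.txt. The sketch below is a minimal example of that check. The limit constant repeats the figure quoted above and may be out of date, and the function name and sample rows are hypothetical.

```python
from collections import Counter

MAX_STOPS_PER_ROUTE = 150  # limit quoted in this guide; may change


def trips_over_stop_limit(stop_times, limit=MAX_STOPS_PER_ROUTE):
    """Return the sorted trip_id values that visit more stops than the limit."""
    counts = Counter(st["trip_id"] for st in stop_times)
    return sorted(trip_id for trip_id, n in counts.items() if n > limit)


# Made-up sample: one trip with 151 stop_times rows and one with 3.
sample_stop_times = [{"trip_id": "long"}] * 151 + [{"trip_id": "short"}] * 3

print(trips_over_stop_limit(sample_stop_times))
```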

This tool will take some time to run for large transit systems. Smaller transit systems should only take a few minutes.

Note: If you don't or can't use ArcGIS Online, you can instead generate shapes that follow the streets by using the Step 1: Generate Shapes with Network Analyst tool, or you can generate simple straight-line estimates for your route shapes with the Step 1: Generate Shapes with Straight Lines version of this tool.

Screenshot of tool dialog

Inputs

Outputs

A file geodatabase with the name and location you specified will be created and will contain the following files:

Running Step 1: Generate Shapes with Straight Lines

This version of Step 1 does not use a street network to estimate your route shapes. It generates shapes by drawing a straight line between connected stops instead of tracing the pattern of the streets. You should only use this version of Step 1 if you do not have the Network Analyst extension or the ability to use ArcGIS Online, or wish to simply generate straight-line estimates for your route shapes.
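Conceptually, a straight-line shape is just the list of stop locations in visiting order. The sketch below shows that idea by turning an ordered list of stop coordinates into shapes.txt-style rows. It is not the tool's own code, and the function name and coordinates are made up for illustration.

```python
def straight_line_shape(shape_id, ordered_stops):
    """Build shapes.txt-style rows that connect stops with straight lines.

    ordered_stops: list of (stop_lat, stop_lon) tuples in visiting order.
    """
    return [
        {
            "shape_id": shape_id,
            "shape_pt_lat": lat,
            "shape_pt_lon": lon,
            "shape_pt_sequence": sequence,
        }
        for sequence, (lat, lon) in enumerate(ordered_stops, start=1)
    ]


# Made-up stop coordinates for a three-stop line.
rows = straight_line_shape("1", [(34.05, -117.19), (34.06, -117.18), (34.07, -117.18)])
for row in rows:
    print(row)
```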

Screenshot of tool dialog

Inputs

Outputs

A file geodatabase with the name and location you specified will be created and will contain the following files:

Running Step 1: Update Existing Shapes

Run this version of Step 1 if your GTFS dataset already has a shapes.txt file and you just want to update one or more of the existing shapes. With this tool, you can select which shapes you want to update, and it will create a feature class with these shapes as they currently appear in your shapes.txt file.

Screenshot of tool dialog

Inputs

Outputs

A file geodatabase with the name and location you specified will be created and will contain the following files:

Editing your Shapes

Before using Step 2 to generate your shapes.txt file, you should examine your shapes in the map and make any necessary edits. You can use the ArcMap or ArcGIS Pro editing tools to edit your shapes. Your workflow will be something like this:

For detailed information on editing in ArcGIS, read about editing in the ArcMap Help or the ArcGIS Pro Help.

Tips for editing

To view and edit one shape at a time, apply a Definition Query to the Shapes layer based on the shape_id field. For example, use "shape_id = '2'" to display only the shape for shape_id 2. All others will be hidden. You can do the same thing with the Stops_wShapeIDs feature class to see only the stops associated with that shape. Learn how to apply a Definition Query in the ArcMap Help or ArcGIS Pro Help.
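If you need to show several shapes at once, you can type the query by hand or build it with a few lines of Python. The helper below is a hypothetical convenience, not part of the toolbox. It assumes shape_id is stored as a text field, as in the example query above, and it only builds the query string for you to paste into the layer's Definition Query.

```python
def shape_definition_query(shape_ids, field="shape_id"):
    """Build a SQL where clause selecting the given shape_id values (text field assumed)."""
    if not shape_ids:
        raise ValueError("Provide at least one shape_id")
    # Double any single quotes so they are treated as literal characters in SQL.
    quoted = ["'{}'".format(str(s).replace("'", "''")) for s in shape_ids]
    if len(quoted) == 1:
        return "{} = {}".format(field, quoted[0])
    return "{} IN ({})".format(field, ", ".join(quoted))


print(shape_definition_query(["2"]))
print(shape_definition_query(["2", "5"]))
```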

If you want to visualize the estimated Bearing calculated for each stop, you can choose an arrow for the Stops_wShapeIDs layer symbology and apply a rotation to this symbol based on the Bearing field in the data. Learn how to apply a rotation to a point symbol in the ArcMap Help or ArcGIS Pro Help.

Common Shape problems and how to fix them

The route makes small diversions into side streets

The stop is along the main road, but the stop's latitude and longitude place it slightly closer to a side road. Consequently, the stop snapped to the side road, and the bus had to turn into the side road to visit the stop and then make a U-turn or drive around the block to return to the main road. You can edit these out easily using the Reshape Features Tool.

If this is a widespread problem in your data, see the How to get the best results section, and consider tweaking the Bearing parameters.

Editing tips

The route doubles back on itself

Although the stop should be on the right side of the road, the GTFS stop lat/lon location or the network dataset street location is slightly off, putting the stop on the wrong (left) side of the road. This means that the bus had to make a U-turn to reach the stop, so the shape doubled back on itself. Alternatively, the bus might have had to travel around the block in a big loop to turn around and visit the stop.

These situations are sometimes hard to identify because you can't see the areas where the shape line overlaps itself (see picture). You might not realize you have these problems until you get errors about misordered vertices when running Step 2 of the tool. To detect problems like this, follow the shape's vertices in order, either by viewing them with the vertex editing tool or by saving them to a feature class with the Feature Vertices to Points geoprocessing tool.
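If you export the vertices, one rough way to look for doubled-back sections is to walk them in order and flag any vertex where the line nearly reverses direction. The sketch below is an assumption-laden illustration rather than the check Step 2 performs. It treats coordinates as planar x/y values, the 170-degree threshold is arbitrary, and all names are made up.

```python
import math


def segment_angle(p, q):
    """Angle in degrees (0-360) of the segment from p to q, using x/y coordinates."""
    return math.degrees(math.atan2(q[1] - p[1], q[0] - p[0])) % 360


def angle_difference(a, b):
    """Smallest absolute difference between two angles, in degrees (0-180)."""
    diff = abs(a - b) % 360
    return 360 - diff if diff > 180 else diff


def find_reversals(vertices, threshold=170):
    """Return indexes of vertices where the line turns back by at least threshold degrees."""
    reversals = []
    for i in range(1, len(vertices) - 1):
        incoming = segment_angle(vertices[i - 1], vertices[i])
        outgoing = segment_angle(vertices[i], vertices[i + 1])
        if angle_difference(incoming, outgoing) >= threshold:
            reversals.append(i)
    return reversals


# Made-up shape: travels east, doubles back one block, then turns north.
vertices = [(0, 0), (1, 0), (2, 0), (1, 0), (1, 1)]
print(find_reversals(vertices))
```

A flagged vertex is only a hint. A legitimate out-and-back section along the same road would be flagged too, so check each one in the map.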

Once you identify the problems, it's fairly easy to use the Reshape Features Tool to edit them out. If this is a widespread problem in your data, see the How to get the best results section.

Sometimes, the bus legitimately travels the same road in both directions. When the lines overlap exactly, it causes problems in Step 2 of the tool. The best way to handle this situation is to use the Edit Vertices Tool to slightly separate the lines going in either direction so that they no longer overlap.

Editing tips

The route diverts wildly from the expected path

Sometimes the route will take a completely different path than expected. If you're using the Network Analyst version of Step 1, it could be that your network dataset isn't well connected or the street data isn't good in this area.

More likely, however, one or more of your stops near the diversion did not snap to the correct street edge. The route thought the stop was located on some other street, so it made a large diversion to reach it. The most likely cause is an incorrect Bearing estimate for one or more stops. If this is a widespread problem in your data, consider tweaking the Bearing parameters.

The figure below shows an example. In this instance, the second stop (circled) has an incorrect bearing estimate because the route turns a corner. The stop has located on the highway ramp because this is the closest street feature that has an angle similar to the stop's estimated bearing. In order to reach this ramp, the route has to divert wildly.

To correct this problem, the tool was run with a smaller value for the 'Maximum angle difference for bearing calculation (degrees)' parameter. Because the difference between the angle from the first stop to the second stop and the angle from the second stop to the third stop was greater than the maximum allowed for bearing estimation, a bearing was not used when locating the second stop. Instead, it located on the closest street. You can see this in the second image below. The resulting route is still not "correct", but it is more easily corrected than the first estimate.

Editing tips

Editing tips

Fine-tuning results with the Bearing parameters

The ArcGIS Online and Network Analyst versions of Step 1 contain two parameters, Bearing tolerance (degrees) and Maximum angle difference for bearing calculation (degrees), that can be used to fine-tune the estimated shapes.

When you run a network analysis in ArcGIS, such as solving a Route, your input points must "locate" on the network. Each stop your route will visit has a latitude and longitude, but since this location will rarely coincide exactly with the streets in the network, the closest appropriate point on the network will be considered the stop's network location. The Network Analyst documentation explains this concept in more detail.

Normally, your transit stops would locate on the closest non-restricted street feature in your network, and the Network Analyst Route solver will find an optimal route visiting those locations. However, sometimes the GTFS stop locations fall closer to a side street than to the main road where your transit route actually travels, so the Route solver incorrectly locates these stops on the side street and then creates transit shapes that enter side streets and make U-turns or travel around blocks.

Fortunately, if you know the approximate direction of travel at each transit stop, Network Analyst can use this information to make a better guess about which network edge the stop should locate on. The ArcGIS Online or Network Analyst versions of Step 1 will calculate an estimated bearing for each transit stop based on the angle between the stop and the previous and next stops, and the estimated bearing will be used when locating the stop on the network. The resulting shapes are usually much better with this method with fewer small diversions into side streets.

Bearing diagram

Although the tool defaults for Bearing tolerance (degrees) and Maximum angle difference for bearing calculation (degrees) are reasonable, you can adjust these parameters to attempt to achieve better results.

Bearing tolerance (degrees) refers to the maximum allowed angle between the stop's estimated direction of travel and the angle of the network edge the stop could locate on. If the angles differ more than the Bearing tolerance, then Network Analyst assumes that this is not the correct network edge to locate the stop on, and it will continue searching other nearby network edges for a more appropriate one. Bearing tolerance is explained thoroughly in the Network Analyst documentation.

In the diagram below, the blue triangle represents the angular area that falls within the specified Bearing tolerance from the road. If the calculated Bearing at the stop falls within this triangle (such as the green arrow on the left), then the stop will locate on this street. If, on the other hand, the calculated Bearing angle is greater than this tolerance (like the red arrow on the right), then the stop will not locate on this street because the angle does not match the stop's Bearing adequately, and Network Analyst will continue searching other nearby streets until one is found whose angle matches the stop's bearing more closely.

Bearing diagram

A smaller Bearing tolerance angle means that the stop's Bearing must match more closely with the angle of the street, so stops are less likely to be located incorrectly on side streets. However, they may also be less likely to locate on the correct nearby street, so you may see a larger number of stops located incorrectly on streets much further away because Network Analyst had to search very far away in order to find a street with a matching angle.

A larger Bearing tolerance makes it more likely that stops will incorrectly locate on side streets instead of the correct main road.
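The trade-off described above can be illustrated with a toy example. The sketch below follows this guide's description of Bearing tolerance, not Network Analyst's actual locating logic, which is more involved. It picks the nearest candidate street whose angle is within the tolerance of the stop's bearing, and the street names, distances, and angles are invented.

```python
def angle_difference(a, b):
    """Smallest absolute difference between two angles, in degrees (0-180)."""
    diff = abs(a - b) % 360
    return 360 - diff if diff > 180 else diff


def pick_edge(stop_bearing, candidate_edges, bearing_tolerance):
    """Return the name of the nearest edge whose angle is within the tolerance.

    candidate_edges: list of (name, distance, edge_angle) tuples.
    Returns None if no candidate matches.
    """
    for name, _distance, edge_angle in sorted(candidate_edges, key=lambda e: e[1]):
        if angle_difference(stop_bearing, edge_angle) <= bearing_tolerance:
            return name
    return None


# Made-up candidates: a close side street at a right angle to the direction
# of travel, and a slightly more distant main road that matches it well.
edges = [("side street", 5.0, 0.0), ("main road", 12.0, 95.0)]

print(pick_edge(90.0, edges, bearing_tolerance=30))   # moderate tolerance
print(pick_edge(90.0, edges, bearing_tolerance=100))  # very large tolerance
print(pick_edge(90.0, edges, bearing_tolerance=2))    # very small tolerance
```

With the moderate tolerance the stop skips the side street and lands on the main road. The very large tolerance accepts the side street, and the very small one rejects both nearby streets, which in a real network would send the search farther away.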

The Maximum angle difference for bearing calculation (degrees) parameter is another way to fine-tune tool output. The Bearing for each stop is estimated by averaging the angles between that stop and the previous stop and next stop along the route. When the route follows a relatively straight road, this angle is a good representation of the bearing. However, if the route goes around a corner, makes a U-turn, follows a very twisty road, or diverts into a parking lot or side road, then the average angle is not a good estimate of actual bearing, and using this estimate can cause the stop to locate far away from where it should and worsen the quality of the tool output. Consequently, the tool is configured to NOT use a bearing estimate if the difference between the angle from the previous stop and the angle to the next stop is greater than the number specified in this parameter. In this situation, the stop will revert to the normal locating behavior and will snap to the closest non-restricted network edge.

In the diagram below, the angle difference is represented by the angle A. When A is larger, the estimated Bearing at the current stop is less likely to be accurate. The Maximum angle difference for bearing calculation (degrees) parameter sets a maximum value for A after which the Bearing estimate will no longer be used.

Bearing diagram

A smaller value for this parameter is more restrictive: stops must be in a straighter line in order for the bearing estimate to be used. This will reduce large, unexpected route diversions due to incorrect Bearing estimates, but it may increase the number of stops that are incorrectly located on side streets.
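The bearing estimate and the cutoff described in this section can be sketched in a few lines of Python. This is my own simplified illustration of the idea, not the tool's code. It assumes stops are given as planar (x, y) coordinates, measures bearings clockwise from north, and uses made-up parameter values. The first and last stops of a route have only one neighbor and are not handled here.

```python
import math


def compass_bearing(p, q):
    """Bearing in degrees clockwise from north from p to q; points are (x, y)."""
    return math.degrees(math.atan2(q[0] - p[0], q[1] - p[1])) % 360


def estimate_bearing(prev_stop, stop, next_stop, max_angle_difference):
    """Average the incoming and outgoing bearings at a stop.

    Returns None when the two bearings differ by more than
    max_angle_difference, meaning no bearing should be used for this stop.
    """
    incoming = compass_bearing(prev_stop, stop)
    outgoing = compass_bearing(stop, next_stop)
    # Signed difference in the range [-180, 180), so 350 and 10 differ by 20.
    signed = (outgoing - incoming + 180) % 360 - 180
    if abs(signed) > max_angle_difference:
        return None
    return (incoming + signed / 2) % 360


# Made-up stops: a straight northbound run, then a right-angle turn to the east.
straight = estimate_bearing((0, 0), (0, 1), (0, 2), max_angle_difference=65)
corner_strict = estimate_bearing((0, 0), (0, 1), (1, 1), max_angle_difference=65)
corner_loose = estimate_bearing((0, 0), (0, 1), (1, 1), max_angle_difference=100)
print(straight, corner_strict, corner_loose)
```

In the corner case, the stricter setting discards the estimate, so the stop would fall back to the normal locating behavior, while the looser setting keeps an averaged bearing that points diagonally across the corner.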

Running Step 2: Generate new GTFS text files

Step 2: Generate new GTFS text files creates or updates a shapes.txt file based on the feature classes you created in Step 1 and edited. It also creates or updates the shape_id field in trips.txt and the shape_dist_traveled field in stop_times.txt. This tool does not overwrite any of your existing GTFS data. You can review the new files before deleting the originals.

This tool will run very quickly for small GTFS datasets or if you are only updating a few shapes, but it may take significantly longer for larger datasets and many shapes.
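To give a feel for what a shape_dist_traveled value represents, the sketch below accumulates the distance along a shape's points using the haversine formula. It is only an illustration under stated assumptions: the tool's own distance calculation and units may differ, the Earth radius is a common approximation, and the function names and coordinates are made up. GTFS does not prescribe the units, only that they match between shapes.txt and stop_times.txt.

```python
import math

EARTH_RADIUS_M = 6371000  # mean Earth radius in meters (approximation)


def haversine_m(p, q):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def cumulative_distances(points):
    """Return the running distance along the shape at each point, starting at 0."""
    total = 0.0
    distances = [0.0] if points else []
    for p, q in zip(points, points[1:]):
        total += haversine_m(p, q)
        distances.append(total)
    return distances


# Made-up shape points in sequence order.
shape_points = [(34.05, -117.19), (34.06, -117.19), (34.06, -117.18)]
for point, dist in zip(shape_points, cumulative_distances(shape_points)):
    print(point, round(dist, 1))
```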

Screenshot of tool dialog

Inputs

Outputs

Troubleshooting & potential pitfalls