Create rail timetable graphs in R

Posted: | Tags: til train

Timetable graph for the MerwedeLingelijn on Monday between 06:00 and 10:00

Timetable or schedule graphs visualise railway traffic on a route during a set time. When these graphs are shared online they typically look like they’ve been created using jTrainGraph, although I found it very difficult to get started with. I attempted to use both Google Sheets and LibreOffice Calc with some degree of success, but only achieved what I wanted by using R. This was made possible by a single StackOverflow question from 2011.

TLDR: The commented source code with a sample timetable as a CSV file is available at the bottom of the page. You can skip the next few sections and jump straight there.

Formatting the timetable

The timetable file is written in CSV, which allows you to create it in any spreadsheet program. The first column lists each station name, and the second column lists the cumulative distance of each station from the first one, starting from 0. Each station appears on two consecutive rows to denote an arrival and a departure time. Each column after the first two denotes a train, with its times of arrival and departure at each station in the format HH:MM:SS.

Looking at the sample file, the times can be ascending or descending as you move from top to bottom, depending on the direction the train is moving. For example, train 7112 travels from Gorinchem to Dordrecht, starting at 06:30:00 and ending at 06:54:00. Since we’ve decided to put Dordrecht at the top, the times for this train descend from top to bottom. Filling in the times from the bottom up, the first mention of the station name denotes the arrival time and the second mention denotes the departure time from the station. An example of a train going in the opposite direction is number 7117, starting at Dordrecht at 06:01:00 and arriving at Gorinchem at 06:25:00. In this case we fill in the times from top to bottom, and the first and second mention of each station name still denote the arrival and the departure.

If a train does not stop at a station, you can still enter the passing time twice, once for arrival and once for departure. If these times are not known you can leave the cells for that station blank. In the sample file, trains 7112 to 7133 in the first few columns have blank cells, while all trains after that have every time cell filled.
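To make the layout concrete, here is a small sketch that reads a three-station excerpt of the sample file (two trains only) from an inline string. The variable names csv_text, mini and blank are only for illustration, and because this is an excerpt the distances do not start at 0.

```r
# Excerpt of the sample timetable: two rows per station, one column per train.
csv_text <- "Stations,Distance,7112,7217
Sliedrecht,10.3,06:43:00,05:57:00
Sliedrecht,10.3,06:42:00,05:58:00
Hardinxveld - Blauwe Zoom,12.9,,06:00:00
Hardinxveld - Blauwe Zoom,12.9,,06:01:00
Hardinxveld - Giessendam,14.3,06:39:00,06:02:00
Hardinxveld - Giessendam,14.3,06:39:00,06:03:00"

mini <- read.csv(text = csv_text, check.names = FALSE,
                 stringsAsFactors = FALSE)

# Blank cells in a column of times are read as empty strings.
blank <- mini[["7112"]] == ""

# Stations where train 7112 has no time filled in.
print(mini$Stations[blank])
```

Train 7112 runs towards Dordrecht, so its times decrease from top to bottom, while 7217 runs the other way and its times increase.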

Note: Creating the timetable in a spreadsheet allows you to use calculations to fill in the columns fairly quickly.
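The same kind of calculation can be sketched in R instead of a spreadsheet. In the sample file, train 7114 runs 30 minutes after train 7112, so one column can be derived from the other. The names t_7112, parsed and t_7114 are made up for this example, and the date and the UTC time zone are arbitrary placeholders used only to do the arithmetic.

```r
# First three time cells of train 7112 in the sample file.
t_7112 <- c("06:54:00", "06:50:00", "06:49:00")

# Attach an arbitrary date so the strings can be parsed as date-times.
parsed <- as.POSIXct(paste("2024-03-11", t_7112), tz = "UTC")

# Add 30 minutes (in seconds) and format back to HH:MM:SS.
t_7114 <- format(parsed + 30 * 60, "%H:%M:%S")

print(t_7114)
```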

Walkthrough of the code

To start, I’m an absolute beginner in R. There may be errors in the code, edge cases unaccounted for, or improvements that can be made. If you have any suggestions, comment on the GitHub Gist or reply on Mastodon.

To begin, the CSV file is read in with check.names set to FALSE, as the train series names used here start with a digit. If it is left at the default of TRUE, the names are prepended with an X. Next, the names of the train series are saved: the first two column names are dropped and the remaining headings are stored in train_series.

readfile <- read.csv("merwedelingelijn-timetable.csv", check.names = FALSE)

# get train series labels, removes first two columns
train_series <- names(readfile)
train_series <- train_series[-1]
train_series <- train_series[-1]
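A quick way to see the effect of check.names is to read the same header both ways. This sketch uses a one-row inline CSV rather than the real file, and the names with_check and without_check are only for illustration. It also shows that the two label-dropping steps can be written as a single negative index.

```r
csv_text <- "Stations,Distance,7112,7117
Dordrecht,0,06:54:00,06:01:00"

# Default behaviour: column names are made syntactically valid.
with_check <- read.csv(text = csv_text)

# Column names are kept exactly as written in the file.
without_check <- read.csv(text = csv_text, check.names = FALSE)

print(names(with_check))
print(names(without_check))

# Drop the first two names (Stations and Distance) in one step.
train_series <- names(without_check)[-(1:2)]
print(train_series)
```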

This for loop does most of the data transformation. It appends to three vectors that start out empty (see the full listing at the bottom of the page). id_order holds the train series label for each recorded stop, distances_order holds the distance of the station for each of those stops, and times holds the matching time. It’s a little complicated to understand, but if you print out each vector and walk through them with the timetable side by side you’ll get the picture.

To work with the times in R I have to prepend each one with a date; I’ve picked the date based on the route times. You could instead include the date in the correct format within the CSV to begin with and rewrite that line to skip the paste() operation altogether.

for (i in seq_along(train_series)) {
  station_number <- 1L
  for (station in readfile$Stations) {
    if (readfile[station_number, i + 2] != "") {
      id_order <- append(id_order, train_series[i])
      times <- append(times,
                      paste("2024-03-11", readfile[station_number, i + 2]))
      distances_order <- append(distances_order, readfile[station_number, 2L])
    }
    station_number <- station_number + 1L
  }
}
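To see what the loop produces, it can help to run the same logic on a tiny excerpt and print the three vectors side by side. The sketch below uses three stations and two trains copied from the sample file. It iterates over row numbers with seq_len() instead of keeping a separate counter, which is a small variation on the loop above rather than the original code.

```r
csv_text <- "Stations,Distance,7112,7217
Sliedrecht,10.3,06:43:00,05:57:00
Sliedrecht,10.3,06:42:00,05:58:00
Hardinxveld - Blauwe Zoom,12.9,,06:00:00
Hardinxveld - Blauwe Zoom,12.9,,06:01:00
Hardinxveld - Giessendam,14.3,06:39:00,06:02:00
Hardinxveld - Giessendam,14.3,06:39:00,06:03:00"

mini <- read.csv(text = csv_text, check.names = FALSE,
                 stringsAsFactors = FALSE)
train_series <- names(mini)[-(1:2)]

id_order <- c()
distances_order <- c()
times <- c()
for (i in seq_along(train_series)) {
  for (row in seq_len(nrow(mini))) {
    cell <- mini[row, i + 2]
    # Skip blank cells: the train has no recorded time at this station.
    if (cell != "") {
      id_order <- append(id_order, train_series[i])
      times <- append(times, paste("2024-03-11", cell))
      distances_order <- append(distances_order, mini[row, 2])
    }
  }
}

# One row per recorded stop: four for 7112, six for 7217.
print(data.frame(series = id_order, times = times,
                 distances = distances_order))
```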

The times vector is converted to date-times with the POSIXct and POSIXt classes, which later allows us to apply custom breaks and labels to the time axis. Next, the train series, time and distance objects created in the previous steps are combined into a single data frame to be used by ggplot.

# parse the timestamps as CET wall-clock times, independent of the
# machine's own time zone; the result has classes POSIXct and POSIXt
time <- as.POSIXct(times, tz = "CET")

df <- data.frame(series = id_order,
                 times = time,
                 distances = distances_order)
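A quick check of what the conversion produces, using two timestamps for train 7112 from the sample file. This sketch passes tz directly to as.POSIXct(), which is one way to make sure the strings are read as CET wall-clock times regardless of the time zone of the machine running the script.

```r
times <- c("2024-03-11 06:30:00", "2024-03-11 06:54:00")

# Parse the strings as CET wall-clock times.
time <- as.POSIXct(times, tz = "CET")

print(class(time))
print(attr(time, "tzone"))
print(format(time, "%H:%M"))

# Journey time from Gorinchem to Dordrecht for train 7112, in minutes.
minutes <- as.numeric(difftime(time[2], time[1], units = "mins"))
print(minutes)
```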

The plot is grouped by train series, which also defines the line colour. For each of the distance values on the y-axis a label is applied to display the station name instead. The x-axis is divided into 30-minute increments labelled with the hour and minute. The final two lines add the titles to the graph.

ggplot(df, aes(x = times, y = distances, group = series)) +
  geom_line(aes(colour = series)) +
  scale_y_continuous(
    breaks = unique(readfile$Distance), labels = unique(readfile$Stations)
  ) +
  scale_x_datetime(breaks = "30 min", date_labels = "%H:%M") +
  ggtitle("MerwedeLingelijn trains") +
  labs(colour = "Train series", y = "Station", x = "Time")
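To get a feel for what 30-minute breaks with "%H:%M" labels look like, the sketch below builds a comparable sequence in base R. This is only an illustration. ggplot2 works out its own break positions from the data, and the window of 06:00 to 10:00 and the names window_start, window_end, breaks and labels are assumptions made for this example.

```r
window_start <- as.POSIXct("2024-03-11 06:00:00", tz = "CET")
window_end <- as.POSIXct("2024-03-11 10:00:00", tz = "CET")

# One break every 30 minutes across the window.
breaks <- seq(window_start, window_end, by = "30 min")

# The same hour and minute format used for the axis labels.
labels <- format(breaks, "%H:%M")
print(labels)
```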

Code

The files can also be viewed as a GitHub Gist.

timetable-graph.R

library(ggplot2)

readfile <- read.csv("merwedelingelijn-timetable.csv", check.names = FALSE)

# get train series labels, removes first two columns
train_series <- names(readfile)
train_series <- train_series[-1]
train_series <- train_series[-1]

# collect one entry per non-blank time cell, train by train
# id_order holds the train series label of each stop
# distances_order holds the distance of the station at each stop
# times holds the time of each stop, prefixed with a date
id_order <- c()
distances_order <- c()
times <- c()
for (i in seq_along(train_series)) {
  station_number <- 1L
  for (station in readfile$Stations) {
    if (readfile[station_number, i + 2] != "") {
      id_order <- append(id_order, train_series[i])
      times <- append(times,
                      paste("2024-03-11", readfile[station_number, i + 2]))
      distances_order <- append(distances_order, readfile[station_number, 2L])
    }
    station_number <- station_number + 1L
  }
}

# parse the timestamps as CET wall-clock times, independent of the
# machine's own time zone; the result has classes POSIXct and POSIXt
time <- as.POSIXct(times, tz = "CET")

df <- data.frame(series = id_order,
                 times = time,
                 distances = distances_order)

ggplot(df, aes(x = times, y = distances, group = series)) +
  geom_line(aes(colour = series)) +
  scale_y_continuous(
    breaks = unique(readfile$Distance), labels = unique(readfile$Stations)
  ) +
  scale_x_datetime(breaks = "30 min", date_labels = "%H:%M") +
  ggtitle("MerwedeLingelijn trains") +
  labs(colour = "Train series", y = "Station", x = "Time")

merwedelingelijn-timetable.csv

Stations,Distance,7112,7114,7116,7117,7118,7119,7120,7121,7122,7123,7124,7125,7127,7129,7131,7133,7214,7216,7217,7218,7219,7220,7221,7222,7223,7224,7225,7226,7227,7228,7229,7231,7233
Dordrecht,0,06:54:00,07:24:00,07:54:00,06:01:00,08:24:00,06:31:00,08:54:00,07:01:00,09:24:00,07:31:00,09:54:00,08:01:00,08:31:00,09:01:00,09:31:00,10:01:00,07:09:00,07:39:00,05:46:00,08:09:00,06:16:00,08:39:00,06:46:00,09:09:00,07:16:00,09:39:00,07:46:00,10:09:00,08:16:00,10:39:00,08:46:00,09:16:00,09:46:00
Dordrecht,0,06:54:00,07:24:00,07:54:00,06:01:00,08:24:00,06:31:00,08:54:00,07:01:00,09:24:00,07:31:00,09:54:00,08:01:00,08:31:00,09:01:00,09:31:00,10:01:00,07:09:00,07:39:00,05:46:00,08:09:00,06:16:00,08:39:00,06:46:00,09:09:00,07:16:00,09:39:00,07:46:00,10:09:00,08:16:00,10:39:00,08:46:00,09:16:00,09:46:00
Dordrecht - Stadspolders,3.4,06:50:00,07:20:00,07:50:00,06:04:00,08:20:00,06:34:00,08:50:00,07:04:00,09:20:00,07:34:00,09:50:00,08:04:00,08:34:00,09:04:00,09:34:00,10:04:00,07:05:00,07:35:00,05:49:00,08:05:00,06:19:00,08:35:00,06:49:00,09:05:00,07:19:00,09:35:00,07:49:00,10:05:00,08:19:00,10:35:00,08:49:00,09:19:00,09:49:00
Dordrecht - Stadspolders,3.4,06:49:00,07:19:00,07:49:00,06:05:00,08:19:00,06:35:00,08:49:00,07:05:00,09:19:00,07:35:00,09:49:00,08:05:00,08:35:00,09:05:00,09:35:00,10:05:00,07:04:00,07:34:00,05:50:00,08:04:00,06:20:00,08:34:00,06:50:00,09:04:00,07:20:00,09:34:00,07:50:00,10:04:00,08:20:00,10:34:00,08:50:00,09:20:00,09:50:00
Sliedrecht Baanhoek,7.7,06:46:00,07:16:00,07:46:00,06:09:00,08:16:00,06:39:00,08:46:00,07:09:00,09:16:00,07:39:00,09:46:00,08:09:00,08:39:00,09:09:00,09:39:00,10:09:00,07:01:00,07:31:00,05:54:00,08:01:00,06:24:00,08:31:00,06:54:00,09:01:00,07:24:00,09:31:00,07:54:00,10:01:00,08:24:00,10:31:00,08:54:00,09:24:00,09:54:00
Sliedrecht Baanhoek,7.7,06:46:00,07:16:00,07:46:00,06:09:00,08:16:00,06:39:00,08:46:00,07:09:00,09:16:00,07:39:00,09:46:00,08:09:00,08:39:00,09:09:00,09:39:00,10:09:00,07:01:00,07:31:00,05:54:00,08:01:00,06:24:00,08:31:00,06:54:00,09:01:00,07:24:00,09:31:00,07:54:00,10:01:00,08:24:00,10:31:00,08:54:00,09:24:00,09:54:00
Sliedrecht,10.3,06:43:00,07:13:00,07:43:00,06:12:00,08:13:00,06:42:00,08:43:00,07:12:00,09:13:00,07:42:00,09:43:00,08:12:00,08:42:00,09:12:00,09:42:00,10:12:00,06:58:00,07:28:00,05:57:00,07:58:00,06:27:00,08:28:00,06:57:00,08:58:00,07:27:00,09:28:00,07:57:00,09:58:00,08:27:00,10:28:00,08:57:00,09:27:00,09:57:00
Sliedrecht,10.3,06:42:00,07:12:00,07:42:00,06:13:00,08:12:00,06:43:00,08:42:00,07:13:00,09:12:00,07:43:00,09:42:00,08:13:00,08:43:00,09:13:00,09:43:00,10:13:00,06:57:00,07:27:00,05:58:00,07:57:00,06:28:00,08:27:00,06:58:00,08:57:00,07:28:00,09:27:00,07:58:00,09:57:00,08:28:00,10:27:00,08:58:00,09:28:00,09:58:00
Hardinxveld - Blauwe Zoom,12.9,,,,,,,,,,,,,,,,,06:55:00,07:25:00,06:00:00,07:55:00,06:30:00,08:25:00,07:00:00,08:55:00,07:30:00,09:25:00,08:00:00,09:55:00,08:30:00,10:25:00,09:00:00,09:30:00,10:00:00
Hardinxveld - Blauwe Zoom,12.9,,,,,,,,,,,,,,,,,06:55:00,07:25:00,06:01:00,07:55:00,06:30:00,08:25:00,07:00:00,08:55:00,07:30:00,09:25:00,08:00:00,09:55:00,08:30:00,10:25:00,09:00:00,09:30:00,10:00:00
Hardinxveld - Giessendam,14.3,06:39:00,07:09:00,07:39:00,06:16:00,08:09:00,06:46:00,08:39:00,07:16:00,09:09:00,07:46:00,09:39:00,08:16:00,08:46:00,09:16:00,09:46:00,10:16:00,06:53:00,07:23:00,06:02:00,07:53:00,06:32:00,08:23:00,07:02:00,08:53:00,07:32:00,09:23:00,08:02:00,09:53:00,08:32:00,10:23:00,09:02:00,09:32:00,10:02:00
Hardinxveld - Giessendam,14.3,06:39:00,07:09:00,07:39:00,06:16:00,08:09:00,06:46:00,08:39:00,07:16:00,09:09:00,07:46:00,09:39:00,08:16:00,08:46:00,09:16:00,09:46:00,10:16:00,06:53:00,07:23:00,06:03:00,07:53:00,06:33:00,08:23:00,07:03:00,08:53:00,07:33:00,09:23:00,08:03:00,09:53:00,08:33:00,10:23:00,09:03:00,09:33:00,10:03:00
Boven – Hardinxveld,17.3,06:36:00,07:06:00,07:36:00,06:19:00,08:06:00,06:49:00,08:36:00,07:19:00,09:06:00,07:49:00,09:36:00,08:19:00,08:49:00,09:19:00,09:49:00,10:19:00,06:49:00,07:19:00,06:06:00,07:49:00,06:36:00,08:19:00,07:06:00,08:49:00,07:36:00,09:19:00,08:06:00,09:49:00,08:36:00,10:19:00,09:06:00,09:36:00,10:06:00
Boven – Hardinxveld,17.3,06:35:00,07:05:00,07:35:00,06:20:00,08:05:00,06:50:00,08:35:00,07:20:00,09:05:00,07:50:00,09:35:00,08:20:00,08:50:00,09:20:00,09:50:00,10:20:00,06:48:00,07:18:00,06:06:00,07:48:00,06:36:00,08:18:00,07:06:00,08:48:00,07:36:00,09:18:00,08:06:00,09:48:00,08:36:00,10:18:00,09:06:00,09:36:00,10:06:00
Gorinchem,23.6,06:30:00,07:00:00,07:30:00,06:25:00,08:00:00,06:55:00,08:30:00,07:25:00,09:00:00,07:55:00,09:30:00,08:25:00,08:55:00,09:25:00,09:55:00,10:25:00,06:43:00,07:13:00,06:12:00,07:43:00,06:42:00,08:13:00,07:12:00,08:43:00,07:42:00,09:13:00,08:12:00,09:43:00,08:42:00,10:13:00,09:12:00,09:42:00,10:12:00
Gorinchem,23.6,06:30:00,07:00:00,07:30:00,06:25:00,08:00:00,06:55:00,08:30:00,07:25:00,09:00:00,07:55:00,09:30:00,08:25:00,08:55:00,09:25:00,09:55:00,10:25:00,06:36:00,07:06:00,06:14:00,07:36:00,06:44:00,08:06:00,07:14:00,08:36:00,07:44:00,09:06:00,08:14:00,09:36:00,08:44:00,10:06:00,09:14:00,09:44:00,10:14:00
Arkel,28.4,,,,,,,,,,,,,,,,,06:32:00,07:02:00,06:19:00,07:32:00,06:49:00,08:02:00,07:19:00,08:32:00,07:49:00,09:03:00,08:19:00,09:32:00,08:49:00,10:02:00,09:19:00,09:49:00,10:19:00
Arkel,28.4,,,,,,,,,,,,,,,,,06:32:00,07:02:00,06:19:00,07:32:00,06:49:00,08:02:00,07:19:00,08:32:00,07:49:00,09:02:00,08:19:00,09:32:00,08:49:00,10:02:00,09:19:00,09:49:00,10:19:00
Leerdam,35.8,,,,,,,,,,,,,,,,,06:26:00,06:56:00,06:25:00,07:26:00,06:55:00,07:56:00,07:25:00,08:26:00,07:55:00,08:56:00,08:25:00,09:26:00,08:55:00,09:56:00,09:25:00,09:55:00,10:25:00
Leerdam,35.8,,,,,,,,,,,,,,,,,06:25:00,06:55:00,06:26:00,07:25:00,06:56:00,07:55:00,07:26:00,08:25:00,07:56:00,08:55:00,08:26:00,09:25:00,08:56:00,09:55:00,09:26:00,09:56:00,10:26:00
Beesd,42.9,,,,,,,,,,,,,,,,,06:19:00,06:49:00,06:31:00,07:19:00,07:01:00,07:49:00,07:31:00,08:19:00,08:01:00,08:49:00,08:31:00,09:19:00,09:01:00,09:49:00,09:31:00,10:01:00,10:31:00
Beesd,42.9,,,,,,,,,,,,,,,,,06:19:00,06:49:00,06:32:00,07:19:00,07:02:00,07:49:00,07:32:00,08:19:00,08:02:00,08:49:00,08:32:00,09:19:00,09:02:00,09:49:00,09:32:00,10:02:00,10:32:00
Geldermalsen,49.4,,,,,,,,,,,,,,,,,06:14:00,06:44:00,06:37:00,07:14:00,07:07:00,07:44:00,07:37:00,08:14:00,08:07:00,08:44:00,08:37:00,09:14:00,09:07:00,09:44:00,09:37:00,10:07:00,10:37:00
Geldermalsen,49.4,,,,,,,,,,,,,,,,,06:14:00,06:44:00,06:37:00,07:14:00,07:07:00,07:44:00,07:37:00,08:14:00,08:07:00,08:44:00,08:37:00,09:14:00,09:07:00,09:44:00,09:37:00,10:07:00,10:37:00
