Transport for NSW believes a failed network switch caused the hour-long communications outage yesterday, compounded by the system’s failure to automatically switch to a backup network.
The blackout halted all trains at stations, because the communications network is essential for contact between drivers, guards and the rail network’s management center.
Sydney Trains CEO Matt Longland told a press conference this morning that the network in question carries radio transmissions from central control to train drivers via 200 base stations.
Longland said the system had operated “reliably since 2016” and that this “is the first incident of its kind.”
When the problem first arose around 2:45 p.m. Wednesday, “staff here at rail operations sought to do a remote system reset,” Longland said.
“They watched that process for about five minutes,” he said.
“When they realized that this was not possible and the impact on the entire network, we activated our crisis management plan.”
Longland said an investigation would look into why an automatic failover to a redundant system did not occur.
“The investigation will really focus on why the system could not automatically [fail over], as it should have, in an incident like this,” he said.
“The system has the redundancy to automatically switch to a backup. That should have happened immediately… [but] did not happen.
“We have a secondary backup, which is a secondary data center running in parallel, that we were able to move to in case of a major issue.”
The passive backup, at Homebush, was mobilized and run in parallel with the main system, but Longland said the production workload was never switched over to it because a fix was found.
Longland said the performance of the replacement network switch is being monitored.
The investigation will also examine Sydney Trains’ use of incident response technology supplied by Frequentis.
So far, Longland added, “there is no suggestion” that a cybersecurity incident caused the problems.