Network Reliability and Human Factors
By Chris Bastian
SVP, Engineering and CTO for SCTE•ISBE
Telecommunication networks are getting more complex. Speeds are increasing, network functions such as subscriber management, security and load balancing are being virtualized, and more services, including those addressing SmartHome and Smart Cities, are being overlaid onto the network.
Consider the human factor in this complexity equation. Humans have always played a large role in deploying, maintaining, and monitoring the network. Human-computer interaction (HCI) initially consisted of entering rudimentary commands to program the network elements. For voice services, this programming was called switch translations. For example, entering “alw-card” on the command-line interface would activate a card in a device. Over time, command scripting became popular as a way to “bulk load” hundreds of commands at one time into multiple devices. Ironically, this also created new risks to network reliability. Scripting saved time, but if one line or even one character was incorrect, bulk loading that error simultaneously to hundreds if not thousands of devices could cause widespread outages.
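To make the bulk-load risk concrete, here is a minimal sketch of two common safeguards: checking every script line before anything is sent, and applying the script to a small "canary" group of devices before the rest. It is an illustration under stated assumptions, not a description of any particular operator's tooling. The names KNOWN_COMMANDS, validate_script, staged_rollout, apply and healthy are hypothetical, as is the argument shown after “alw-card”; only the command name itself comes from the text above.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical allow-list of recognised commands (assumption for illustration).
KNOWN_COMMANDS = {"alw-card"}


def validate_script(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, text) for every line whose command is not recognised."""
    errors = []
    for number, raw in enumerate(lines, start=1):
        text = raw.strip()
        if not text:
            continue  # blank lines are ignored
        command = text.split()[0]
        if command not in KNOWN_COMMANDS:
            errors.append((number, text))
    return errors


def staged_rollout(
    script: List[str],
    devices: Iterable[str],
    apply: Callable[[str, List[str]], None],   # hypothetical: pushes the script to one device
    healthy: Callable[[str], bool],            # hypothetical: post-change health check
    canary_count: int = 1,
) -> List[str]:
    """Apply the script to a few canary devices first; continue only if they stay healthy.

    Returns the list of devices that were actually updated.
    """
    errors = validate_script(script)
    if errors:
        # Reject the whole script before any device is touched.
        raise ValueError(f"script rejected: {errors}")

    device_list = list(devices)
    canaries, rest = device_list[:canary_count], device_list[canary_count:]

    updated = []
    for device in canaries:
        apply(device, script)
        updated.append(device)

    if not all(healthy(device) for device in canaries):
        return updated  # stop here: the error reached only the canaries

    for device in rest:
        apply(device, script)
        updated.append(device)
    return updated
```

With a check like this, a one-character typo such as “alw-crad” is caught before it reaches any device, and an error that passes validation but harms a device is contained to the canary group rather than the whole network.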
As network complexity grows, humans will continue to play a large role, although that role will keep evolving. Yesterday’s switch translation engineer has become today’s network automation engineer. The job now consists of configuring, managing and monitoring the end-to-end network, ensuring connectivity for the many and varied services being carried, resolving major network incidents quickly, and focusing on root-cause analysis.
Network automation assists the human engineer by performing repetitive tasks without human intervention. Automation has been applied to all aspects of network design, deployment, maintenance and monitoring. It can install and update network routes. It can upgrade software. And for testing, it allows many more test cases to be run than manual methods do. With automation in any form, it is critical that the procedure be thoroughly quality-assured before it is deployed.
Whether manually entering a command or authoring an automated process, humans still account for a significant number of network outages. Anecdotally, there is an old saying in network operations that the days with the fewest network incidents are the days when network moratoriums are in effect. Moratoriums generally are imposed around such events as holidays and the Super Bowl. In other words, when humans aren’t touching the network, it runs the smoothest. A survey conducted by Dimensional Research in 2017 indicates the magnitude of human-caused outages. Of the 315 network experts surveyed, 97% responded that humans cause network outages, and 45% responded that humans frequently cause the outages, or cause most or all of them.
Network operations will always require some level of human intervention to oversee automation, and that human interaction with the network deserves focused attention. Sufficient training must be provided to the workforce, and standardized methods of procedure must be followed. SCTE•ISBE is launching a Human Factors working group, whose mission will be to publish operational best practices that minimize network downtime and degradations due to human interaction. Please visit www.scte.org/standards to learn more.
SCTE•ISBE Cable-Tec Expo® 2019, scheduled for Monday, Sept. 30 through Thursday, Oct. 3 in New Orleans, will have a speaker track focused on Operational Transformation. Sessions will address operators’ need to rethink how they train and inform the workforce to install, maintain and repair increasingly complex technologies such as DAA, FDX, 5G and Wi-Fi 6.