Redundancy
Jul. 10th, 2013 12:06 pm
So Toronto just made it through an unusually heavy rainstorm, which is being treated as broadly comparable to Hurricane Hazel (well, actually, no: it broke Hazel's record for one day of rain at Pearson (126 mm), but Hazel dropped 285 mm over two days).
Several of the problems which followed showed the importance of redundancy (in the engineer's sense), or, rather, what happens when you don't have it.
Redundancy gets a bad name outside the engineering community. It's used as a euphemism for "unnecessary". This is especially true when applied to employment contexts ("terminating redundant employees"), but even calling, for example, a set of capital resources "redundant" is frequently a prelude to arguing that they should be dispensed with.
However, redundant systems are important. In the computer world, and especially in health care, transportation control systems, and financial services, redundancy is taken as an absolute requirement: the only debate is usually over whether a system needs a hot backup or merely a secondary system which can be turned on in the event that the primary fails. Secondary sites are usually distant from primary sites (in some jurisdictions some types of secondary sites are required by law to be more than 500 miles from the primary site).
If you have no redundancy in employment contexts, then all sorts of problems show up sooner or later. I know people who are unwilling to take the time off they're entitled to (vacations, time in lieu for overtime, etc.) because their employers provide no real coverage for the tasks they perform. If a workplace runs on a "lean" basis, all it takes is a single outbreak of contagious disease to bring its functioning to a screeching halt, and the stresses associated with overwork will eventually bring in their own revenges.
Redundancy is a core concept in risk management. You need to have more staff in a health care system than are needed for its normal functioning, or you're screwed the next time a serious epidemic comes along (or even the next time there's a confluence of unusual disasters, like an unusual number of collisions on a holiday weekend). Some risks are more likely than others -- we can be pretty certain there will be new influenza varieties every few years, but a meteor strike large enough to level even a square mile in a populated area is relatively unlikely -- and that affects where you put your redundant resources.
Or consider the Quebec ice storm of 1998. Many high voltage transmission lines failed during that storm, but the ones which did not tended to be older, not newer -- they had been built back when structures tended (by today's standards) to be "overbuilt", rather than just being constructed to a cheaper standard which would handle normal but not extreme fluctuations.
Lack of redundancy is currently a very big deal in Toronto's transit system. GO basically saturates Union Station's capacity at rush hour, and the Yonge subway line normally runs at or above capacity at rush hour on weekdays. The TTC has some spare capacity in busses, but that buffer is currently decreasing -- ridership growth is occurring faster than the fleet catches up, although there is a large order for articulated busses in the works, beginning in 2014. The Downtown Relief Line is a high priority because it will provide additional redundancy for the Yonge line.
But one thing that became very obvious on Monday, when the subway system shut down, was that there is no real redundancy at all as regards the subway. Theoretically there is, of course -- the streetcars and busses run on a grid, and if you want to get from, say, St. Clair Station to Dundas West Station, you can as an individual take the St. Clair West streetcar to Bathurst, the Bathurst bus to Bathurst Station, the Bathurst streetcar to Dundas Street, and the Dundas streetcar to Dundas West Station. But the surface network has no capacity to handle more than a tiny sliver of the demand when both the Yonge-University and Bloor-Danforth lines shut down, and travel times are very much longer. People who normally got home at 7:30 were getting home at 10:30 or later on Monday.
Or consider the water treatment system. Part of the problem was handling the volume surge in the water coming in (a capacity redundancy issue), but a second contributing factor was a dependency on power from the grid. Unlike hospitals, which maintain backup generators in case of a blackout, the stations seem to have no alternative.
The power distribution system does have redundancy built into the switching system, but the transformer station network is less resilient than is desirable: two days later there are still areas without power and the downtown is running in a "reduced power" mode, because it doesn't take much in the way of disruption to reduce capacity below the required level. (It's also not hot backup: if a single station blows, the system can compensate by routing around the problem, but it usually takes several hours to do so.)
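The point about capacity margins can be put as a back-of-the-envelope check: does the system still meet peak demand after losing its largest single component? The sketch below is only an illustration of that idea. The function name and all of the figures are hypothetical, not actual Toronto Hydro numbers.

```python
def survives_single_failure(station_capacities, peak_demand):
    """Return True if peak demand can still be met after losing any one station."""
    total = sum(station_capacities)
    # The worst single failure is the loss of the largest station.
    return total - max(station_capacities) >= peak_demand


# Hypothetical capacities and demand in megawatts, chosen only for illustration.
stations = [400, 350, 300, 250]
peak = 1000

print(sum(stations) >= peak)                    # True: fine on a normal day
print(survives_single_failure(stations, peak))  # False: one failure drops it below demand
```

A system like this looks adequate in normal operation and has no margin at all once a single station is out, which is the situation described above.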
One of the major problems with the current municipal government is that under Ford it has avoided funding systems at the level required to provide redundancy (or even to handle growth -- the TTC in theory is supposed to keep crowding below defined levels, but on some routes it does not have the funds actually to maintain those standards as ridership growth outstrips projections). This is not new -- many previous municipal governments have skimped on projects which provide benefit only at infrequent times (flood control being an obvious example) -- but it's worrying, especially in a context where severe weather events are likely to be increasingly frequent due to global climate change.