I recall an enormous problem at RBC nearly two years ago that was all over the news, when customers’ transactions were lost. RBC literally couldn’t tell customers how much money was in their bank accounts. They have solved that now, with duplicated systems. It’s a salutary reminder that while we consider new marketing and other fun things, it’s all for naught if we don’t keep the basics working properly. The complexity of banks’ product, marketing and channel systems requires careful design and contingency plans.
“Sometimes when you do an upgrade on the weekend you wonder: am I going to have a bad Monday?” Fenwick said. (Two years ago, RBC suffered a processing disruption during a change to its software systems that snarled thousands of withdrawals, deposits and other transactions across Canada.)
All the applications in the dual-active system have to meet a recovery time objective of 60 seconds and a recovery point objective of less than two hours, Fenwick said. The recovery time objective refers to how long it would take to flip the handling of application requests from the first system to the second. The recovery point objective refers to how up to date the information that gets flipped to the second system will be. Depending on the volume of queries at the time of disruption, both measures could be far less than their limits, Fenwick added.
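To make the two measures concrete, here is a minimal sketch in Python of how one might check a single failover against the targets quoted above. Only the 60-second and two-hour figures come from the article; the `FailoverEvent` record, its field names and the helper functions are hypothetical and are not a description of RBC's actual systems.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Targets quoted in the article; everything else in this sketch is hypothetical.
RTO = timedelta(seconds=60)  # maximum time to flip requests to the second system
RPO = timedelta(hours=2)     # data on the second system must be less than this stale


@dataclass
class FailoverEvent:
    disruption_at: datetime       # when the first system stopped serving requests
    serving_resumed_at: datetime  # when the second system took over
    last_replicated_at: datetime  # newest data already present on the second system


def recovery_time(event: FailoverEvent) -> timedelta:
    """How long application requests went unserved."""
    return event.serving_resumed_at - event.disruption_at


def recovery_point(event: FailoverEvent) -> timedelta:
    """How far behind the second system's data was at the moment of disruption."""
    return event.disruption_at - event.last_replicated_at


def meets_objectives(event: FailoverEvent) -> bool:
    """True if the failover took at most 60 seconds and the data was under two hours stale."""
    return recovery_time(event) <= RTO and recovery_point(event) < RPO


# Example: a disruption at 9:00, takeover 45 seconds later, data replicated up to 8:15.
event = FailoverEvent(
    disruption_at=datetime(2006, 1, 9, 9, 0, 0),
    serving_resumed_at=datetime(2006, 1, 9, 9, 0, 45),
    last_replicated_at=datetime(2006, 1, 9, 8, 15, 0),
)
print(recovery_time(event), recovery_point(event), meets_objectives(event))
```

In this made-up example the switch takes 45 seconds and the second system is 45 minutes behind, so both measures fall inside their limits, which matches Fenwick's point that the actual figures can be well under the objectives.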
It’s like traffic on a highway. It doesn’t matter how many lanes you have: “if a tractor-trailer jackknifes you, there’s not much you can do,” she said, describing the demands that bring down a system as drive-by queries.
Source: IT Business
tags: backup, technology+reliability
