EE Times Europe posted an article about a serious outage at T-Mobile. Most German mobile voice, text and data services were out for 6 hours. The failure was caused by "the Home Location Register, a data base that correlates individual SIM cards and phone numbers." They lost 2 of the 3 servers that provide the Home Location Register service, and that brought the whole works down.
The "Home Location Register service" is the fundamental wireless network identity service. As each mobile phone joins the network, the SIM that securely identifies the phone is used as a reference to look up your account details: are you a customer, how are you billed, what services are available. No SIM-to-account correlation means no ability to provide services or to bill for minutes. This has to be one of the more critical services for a cell phone provider.
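To make that dependency concrete, here is a toy sketch in Python of a SIM-to-account lookup spread across a few replicas. Everything in it is made up for illustration (the `Subscriber` record, the `HlrReplica` class, `lookup_subscriber`, `can_attach`, and the sample identifiers); it is not how T-Mobile's system, or any real Home Location Register, is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subscriber:
    # Hypothetical account record; real registers hold far more than this.
    imsi: str            # subscriber identity stored on the SIM
    phone_number: str
    services: tuple      # e.g. ("voice", "sms", "data")


class ReplicaDown(Exception):
    """Raised when a replica (or every replica) cannot answer."""


class HlrReplica:
    """One hypothetical server holding a copy of the SIM-to-account table."""

    def __init__(self, records, up=True):
        self.records = records
        self.up = up

    def lookup(self, imsi):
        if not self.up:
            raise ReplicaDown()
        # Returns None when the SIM is not a known customer.
        return self.records.get(imsi)


def lookup_subscriber(replicas, imsi):
    """Ask each replica in turn; raise ReplicaDown if none can answer."""
    for replica in replicas:
        try:
            return replica.lookup(imsi)
        except ReplicaDown:
            continue
    raise ReplicaDown("no replica reachable")


def can_attach(replicas, imsi):
    """A phone gets service only if the lookup succeeds and finds an account."""
    try:
        return lookup_subscriber(replicas, imsi) is not None
    except ReplicaDown:
        return False
```

In this toy version a single surviving replica is enough to keep phones attaching. The real outage evidently wasn't that forgiving, since losing 2 of 3 servers took everything down, but the basic point is the same: once the lookup can't be answered, nothing downstream of it works.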
What caught my eye was this: "It took a long time to get hold of the system engineers to locate and eliminate the server problem: The respective experts could not be reached via their mobile phone connection."
Somehow that seems amusing to me. I'm sure it wasn't amusing to T-Mobile; it was probably more like a nightmare. The cell phone service cratering was hard to fix because the cell phone network had cratered. Redundancy seems like the most notable issue here, in two ways. First, only 3 servers handled SIM correlation, and loss of that service brought everything down - there should have been more backup servers. Second, experts were urgently needed during off hours to fix things, but they couldn't easily be reached until things were fixed - there should have been a backup contact method that didn't rely on the cell phone network. Of course, like most disaster response problems, this is much more obvious in hindsight.