Chapter 6

Self-Healing

Szabolcs Nováczki, Volker Wille, Osman Yilmaz, Seppo Hämäläinen and Henning Sanneck

Cellular networks are very large and extremely complicated systems, and in systems of this size and complexity faults are not uncommon. Faults can appear in several functional areas of a complex cellular network; however, the most critical domain from a fault management viewpoint is the Radio Access Network (RAN). Every base station is responsible for serving a dedicated part of the coverage area with little, if any, redundancy. If one of these network elements is not capable of fulfilling its responsibilities due to the presence of a fault, no other entity can offer service until the fault is rectified. During the resulting period of degraded performance, users do not experience services with acceptable availability, reliability or quality-of-service, which may cause serious revenue loss for the operator.

The problem for the network operator is, first of all, that there is a huge number of network elements (i.e. base stations), each of which can go into a state of degradation. Such degradations manifest themselves in variations of several KPIs and in raised alarms, which are not easily mapped to a specific cause. These facts imply that considerable manual workload is required to manage this part of the system, because troubleshooting engineers need to analyse performance data continuously. ‘Manual work’ often also means that the degradation time period mentioned ...
