Ajay Gupta
3 min read · Feb 29, 2020

Three Pillars of Telecom Big Data Analytics

Telecom services have reached almost everyone today: nearly everybody subscribes to one or more of them. Most telecom vendors cover large geographical areas, with massive infrastructure deployed across those areas to provide Internet services to a very large set of users.

Telecom therefore offers one of the biggest opportunities for applying big data analytics to primary business goals, such as keeping existing subscribers happy and acquiring new ones. With that in mind, I highlight three main pillars of big data analytics in the telecom domain.

Network Analytics: To provide Internet access, a typical telecom deployment consists of a variety of network nodes that transfer data over wired or wireless links. Network Analytics revolves around analyzing parameters related to the performance, health or configuration of these nodes. Since a moderate to large telecom deployment usually contains millions of nodes of many different types, and each node can expose several parameters, network analytics at this scale requires a big data analytics ecosystem.

A suitably designed Network Analytics pipeline provides quick insights, at periodic intervals, into the performance and health of network nodes, allowing the telecom vendor to take suitable actions to maintain the desired quality of experience on the network. Example actions include tuning specific nodes, or planning new nodes when the existing ones become saturated.
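As a minimal sketch of the kind of periodic check described above, the snippet below averages a utilization parameter per node and flags nodes that look saturated. The node names, the sample values and the 0.85 threshold are all hypothetical, chosen only for illustration; a real deployment would read these parameters from its own data lake and run the aggregation as a distributed job.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical samples: (node_id, interval, utilization as a fraction of capacity).
SAMPLES = [
    ("node-a", 0, 0.62), ("node-a", 1, 0.71), ("node-a", 2, 0.66),
    ("node-b", 0, 0.91), ("node-b", 1, 0.95), ("node-b", 2, 0.93),
    ("node-c", 0, 0.20), ("node-c", 1, 0.25), ("node-c", 2, 0.22),
]

# Assumed cut-off; in practice this would be tuned per deployment.
SATURATION_THRESHOLD = 0.85


def mean_utilization(samples):
    """Average the utilization samples of each node over the reporting window."""
    per_node = defaultdict(list)
    for node_id, _interval, utilization in samples:
        per_node[node_id].append(utilization)
    return {node_id: mean(values) for node_id, values in per_node.items()}


def saturated_nodes(samples, threshold=SATURATION_THRESHOLD):
    """Return the nodes whose average utilization exceeds the threshold."""
    averages = mean_utilization(samples)
    return sorted(node for node, avg in averages.items() if avg > threshold)


if __name__ == "__main__":
    print(saturated_nodes(SAMPLES))
```

Nodes returned by `saturated_nodes` would be the candidates for tuning or for capacity planning.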

Spatial Analytics: Spatial analytics is very important in a telecom deployment that covers large geographical areas. It empowers a telecom vendor to calculate the geographical distribution of important telecom parameters such as coverage, capacity and user density. This distribution is usually obtained by spatially transforming or aggregating the parameters of interest at the desired geographical coordinates, or within geographical tiles or grids. Spatial analytics also makes it possible to correlate different parameters spatially, and the results can be put to further use in multiple ways.

To perform spatial analytics, one needs a data set in which each record is tagged to a geographical coordinate (represented by latitude and longitude), a geographical tile (represented by a tile ID and zoom) or a geographical grid (represented by a grid corner coordinate).
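To make the tagging and aggregation concrete, here is a small sketch that maps a latitude/longitude pair to a tile and then averages a parameter per tile. It assumes the common Web Mercator XYZ ("slippy map") tiling scheme, which is only one possible way to define a tile ID; the article does not say which scheme is used. The record layout and the signal values are hypothetical.

```python
import math
from collections import defaultdict
from statistics import mean


def latlon_to_tile(lat_deg, lon_deg, zoom):
    """Map a coordinate to (x, y) tile indices in the Web Mercator XYZ scheme."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom
    lat_rad = math.radians(lat_deg)
    x = math.floor((lon_deg + 180.0) / 360.0 * n)
    y = math.floor((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that edge values (for example lon = 180) stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def mean_by_tile(records, zoom):
    """Average a measured value over every tile that contains records.

    `records` is an iterable of (lat, lon, value) tuples (hypothetical layout).
    The result is keyed by (zoom, x, y).
    """
    per_tile = defaultdict(list)
    for lat, lon, value in records:
        x, y = latlon_to_tile(lat, lon, zoom)
        per_tile[(zoom, x, y)].append(value)
    return {tile: mean(values) for tile, values in per_tile.items()}


if __name__ == "__main__":
    # Hypothetical signal-strength measurements in dBm.
    measurements = [(10.0, 10.0, -80.0), (20.0, 20.0, -100.0), (-10.0, -10.0, -70.0)]
    print(mean_by_tile(measurements, zoom=1))
```

A higher zoom gives smaller tiles and a finer-grained picture, at the cost of fewer records per tile.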

A suitably designed Spatial Analytics pipeline can identify problematic geographical areas, for example areas with bad coverage, saturated capacity or high user density. The telecom vendor can then address these areas through routine tuning and optimization, or by provisioning additional telecom gear there.

User Analytics: Users' mobile devices are attached to the telecom infrastructure in either idle or connected mode. In connected mode, the user device establishes an active voice or data session. User Analytics is typically performed on the data related to these sessions to derive metrics about users' experiences and behaviors.

There can be millions of users on a mid to large size telecom network. These users start voice and data sessions intermittently throughout the day, producing a large number of sessions on the telecom infrastructure every day. Therefore, like network analytics, user analytics requires a big data analytics ecosystem.

A user's voice experience can go bad when a voice session drops unexpectedly or experiences muting intervals. Similarly, a user's data experience can go bad when a data session experiences high latency or slow speeds. With user analytics at hand, telecom vendors can aggregate relevant parameters from the user data to calculate user experience metrics at regular intervals. Further, the user data can be correlated with relevant network or spatial data to find the cause of a problematic experience, which can then be addressed through suitable actions.
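One simple experience metric of the kind described above is the fraction of a user's voice sessions that dropped. The sketch below computes it from session records; the `VoiceSession` record and its fields are hypothetical stand-ins for whatever the real session data contains.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceSession:
    """Hypothetical, minimal voice session record."""
    user_id: str
    dropped: bool  # True if the session ended unexpectedly


def drop_rate_by_user(sessions):
    """Return, for each user, the fraction of voice sessions that dropped."""
    totals = defaultdict(int)
    drops = defaultdict(int)
    for session in sessions:
        totals[session.user_id] += 1
        if session.dropped:
            drops[session.user_id] += 1
    return {user: drops[user] / totals[user] for user in totals}


if __name__ == "__main__":
    sessions = [
        VoiceSession("u1", False), VoiceSession("u1", True),
        VoiceSession("u1", False), VoiceSession("u1", False),
        VoiceSession("u2", False), VoiceSession("u2", False),
    ]
    print(drop_rate_by_user(sessions))
```

Users with a high drop rate could then be joined against the network or spatial data for the same period to look for a cause.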

A typical network, spatial or user analytics pipeline built on a big data ecosystem starts with periodic ingestion of the corresponding data into a big data lake in suitable formats, either in real time or in batches. After ingestion, one or more MapReduce or Spark jobs are scheduled, periodically or in real time, to run the desired analytics on the ingested data and produce the required metrics.
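The ingest-then-aggregate flow can be illustrated with a toy, single-machine example. The sketch below is plain Python standing in for a MapReduce or Spark job, not an actual job for either framework: it ingests a small CSV batch, maps each row to a key and a partial aggregate, and reduces by key to get an average. The column names and values are hypothetical.

```python
import csv
import io
from functools import reduce
from itertools import groupby
from operator import itemgetter

# Hypothetical ingested batch in CSV form.
RAW_BATCH = """node_id,latency_ms
n1,20
n2,50
n1,40
n2,70
n1,30
"""


def ingest(raw_text):
    """Parse a raw CSV batch into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def map_phase(rows):
    """Emit (key, (sum, count)) pairs, one per input row."""
    return [(row["node_id"], (float(row["latency_ms"]), 1)) for row in rows]


def reduce_phase(pairs):
    """Group the pairs by key, combine the partial aggregates, return averages."""
    averages = {}
    ordered = sorted(pairs, key=itemgetter(0))  # stands in for the shuffle step
    for key, group in groupby(ordered, key=itemgetter(0)):
        total, count = reduce(
            lambda a, b: (a[0] + b[0], a[1] + b[1]),
            (value for _key, value in group),
        )
        averages[key] = total / count
    return averages


if __name__ == "__main__":
    print(reduce_phase(map_phase(ingest(RAW_BATCH))))
```

Carrying a (sum, count) pair through the reduce step, rather than an average, keeps the combine operation associative, which is what lets a distributed framework merge partial results from different workers in any order.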

If you are a Spark developer who has handled FetchFailed exceptions, writes Spark pseudo code in terms of transformations and actions, and always finds scope for optimization in existing Spark jobs, connect with me to explore the opportunities in the field of telecom big data analytics.

Ajay Gupta

Leading Data Engineering Initiatives @ Jio, Apache Spark Specialist, Author, LinkedIn: https://www.linkedin.com/in/ajaywlan/