We live in an era of data deluge, where pervasive interconnected sensors collect massive amounts of information on every bit of our lives. Accordingly, propelled by emergent social networking services as well as high-definition streaming platforms, communication networks have evolved from specialized research and tactical transmission systems to large-scale and highly complex interconnections of intelligent devices that carry massive volumes of multimedia traffic. While Big Data can definitely be perceived as a big blessing, big challenges also arise with large-scale network data. The sheer volume of data often makes it impossible to run analytics using a central processor and storage unit, so distributed processing with parallelized multi-processors is preferred, while the data themselves are stored in the cloud. As many sources continuously generate data in real time, analytics must often be performed “on-the-fly,” without an opportunity to revisit past entries. Due to their disparate origins, the resultant network datasets are often incomplete and include a sizeable portion of missing entries. Overall, Big Data present challenges in which resources such as time, space, and energy are intertwined in complex ways with data resources. As “netizens” demand a seamless networking experience with not only higher speeds but also resilience to failures and malicious cyber-attacks, ample opportunities for data-driven signal processing (SP) research arise.
This tutorial seeks to provide an overview of ongoing research on novel models applicable to a wide range of Big Data analytics problems arising in, e.g., dynamic network monitoring, as well as on algorithms and architectures to handle the practical challenges, while revealing fundamental limits and insights on the mathematical trade-offs involved.