Scalable Algorithms for Large and Dynamic Networks: Reducing Big Data for Small Computations

In this paper we summarize recent research on a novel characterization of large-scale real-life informational networks that can be leveraged to speed up network analytics computations by orders of magnitude. First, using publicly available data, we show that informational networks not only satisfy well-known principles such as the small-world property and variants of the power-law degree distribution, but also exhibit the geometric property of large-scale negative curvature, also referred to as hyperbolicity. We then provide examples of large-scale physical networks that consistently lack this property, showing that hyperbolicity is not a universal feature of real-life networks. We document how hyperbolicity leads to unusually high centrality in informational networks, and then describe an approximation of hyperbolic networks that leverages this property. We provide evidence that the proposed approximation achieves high fidelity in applications such as distance approximation while speeding up computation by a factor of 1000 or more. Finally, we discuss two applications of our proposed linear-time distance approximation for informational networks: one for personalized ranking and the other for clustering. These algorithms, and many more yet to be developed, take full advantage of our proposed tree approximation of hyperbolic networks and further demonstrate its power and utility.
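
To make the idea of a tree-based distance approximation concrete, the following is a minimal sketch and not necessarily the construction used in the paper. It assumes an unweighted, undirected, connected graph given as an adjacency dictionary, uses the highest-degree node as a stand-in for a high-centrality root, and measures distances along a breadth-first-search tree from that root. The function names (`bfs_tree`, `tree_distance`, `approx_distance_oracle`) are hypothetical. Building the tree takes time linear in the number of nodes and edges; the value returned for a pair of nodes is an upper bound on their true shortest-path distance.

```python
from collections import deque


def bfs_tree(adj, root):
    """Return (parent, depth) maps of a BFS tree of `adj` rooted at `root`."""
    parent = {root: None}
    depth = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in depth:  # first visit gives the BFS depth
                parent[v] = u
                depth[v] = depth[u] + 1
                queue.append(v)
    return parent, depth


def tree_distance(parent, depth, u, v):
    """Length of the unique path between u and v inside the BFS tree."""
    dist = 0
    # Lift the deeper endpoint until both are at the same depth.
    while depth[u] > depth[v]:
        u = parent[u]
        dist += 1
    while depth[v] > depth[u]:
        v = parent[v]
        dist += 1
    # Climb both endpoints together until they meet at a common ancestor.
    while u != v:
        u = parent[u]
        v = parent[v]
        dist += 2
    return dist


def approx_distance_oracle(adj):
    """Build the tree once; return a function answering distance queries.

    Assumption: the highest-degree node serves as a proxy for a
    high-centrality root. Other centrality measures could be substituted.
    """
    root = max(adj, key=lambda n: len(adj[n]))
    parent, depth = bfs_tree(adj, root)
    return lambda u, v: tree_distance(parent, depth, u, v)
```

In this sketch each query costs time proportional to the tree depth, which stays small in small-world networks. The estimate is exact whenever some shortest path between the two nodes runs along tree edges, and it overestimates otherwise, for example for two neighbors of the root that are also directly connected to each other.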