HAGP: A Hub-Centric Asynchronous Graph Processing Framework for Scale-Free Graph

Graph structures, which are often used to model relationships between data items, have drawn increasing attention. Graph datasets from many important domains are scale-free. Scale-free graphs contain hubs: vertices whose degree is much larger than the average. Hubs can cause load imbalance, poor scalability, and high communication overhead when such graphs are processed on distributed-memory systems. In this paper, we design an asynchronous graph processing framework for distributed memory that treats the hubs as a separate class of vertices; we call this the hub-centric idea. Specifically, we propose a hub-duplicate graph partitioning method to balance the workload and reduce communication overhead. We also propose an efficient asynchronous state synchronization method for the duplicates. In addition, a priority scheduling strategy is applied to further reduce communication overhead.
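To make the hub-duplicate idea concrete, the following is a minimal sketch and not the partitioning method of HAGP itself. It rests on several assumptions that are not stated in the abstract: vertices are integer ids, a hub is any vertex whose degree exceeds `factor` times the average degree, each edge is placed by hashing a non-hub endpoint modulo the number of partitions, and a hub is copied into every partition that holds one of its edges. The names `find_hubs`, `hub_duplicate_partition`, and `factor` are hypothetical.

```python
from collections import defaultdict


def find_hubs(edges, factor=2.0):
    """Return vertices whose degree exceeds factor * average degree.

    The threshold rule is an assumption made for illustration only.
    """
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    if not degree:
        return set()
    avg = sum(degree.values()) / len(degree)
    return {v for v, d in degree.items() if d > factor * avg}


def hub_duplicate_partition(edges, num_parts, factor=2.0):
    """Assign edges to partitions and duplicate hubs where they are needed.

    Assumed placement rule: an edge goes to the partition of its non-hub
    endpoint (or of its first endpoint when both or neither are hubs).
    """
    hubs = find_hubs(edges, factor)
    parts = [{"edges": [], "hub_copies": set()} for _ in range(num_parts)]
    for u, v in edges:
        # Prefer a non-hub endpoint as the anchor so that a hub's edges
        # are spread over many partitions instead of piling up on one.
        anchor = v if (u in hubs and v not in hubs) else u
        p = anchor % num_parts
        parts[p]["edges"].append((u, v))
        # Keep a local duplicate of every hub touched by this edge.
        for w in (u, v):
            if w in hubs:
                parts[p]["hub_copies"].add(w)
    return hubs, parts


# A star centered on vertex 0 with eight leaves, plus one extra edge.
edges = [(0, i) for i in range(1, 9)] + [(1, 2)]
hubs, parts = hub_duplicate_partition(edges, num_parts=2)
print(sorted(hubs))                      # [0]
print([len(p["edges"]) for p in parts])  # [4, 5]
```

In this toy graph the single hub's eight edges are split evenly between the two partitions, and each partition keeps its own copy of the hub. Those copies are what an asynchronous state synchronization scheme would then have to keep consistent.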