SHAHED: A MapReduce-based system for querying and visualizing spatio-temporal satellite data

SHAHED: A MapReduce-based system for querying and visualizing spatio-temporal satellite data Remote sensing data collected by satellites are now made publicly available by several space agencies. This data is very useful for scientists pursuing research in several applications including climate change, desertification, and land use change. The benefit of this data comes from its richness as it provides an archived history for over 15 years of satellite observations for natural phenomena such as temperature and vegetation. Unfortunately, the use of such data is very limited due to the huge size of archives (> 500TB) and the limited capabilities of traditional applications. This paper introduces SHAHED; a MapReduce-based system for querying, visualizing, and mining large scale satellite data. SHAHED considers both the spatial and temporal aspects of the data to provide efficient query processing at large scale. The core of SHAHED is composed of four main components. The uncertainty component recovers missing data in the input which comes from cloud coverage and satellite mis-alignment. The indexing component provides a novel multi-resolution quad-tree-based spatio-temporal index structure, which indexes satellite data efficiently with minimal space overhead. The querying component answers selection and aggregate queries in real-time using the constructed index. Finally, the visualization component uses MapReduce programs to generate heat map images and videos for user queries. A set of experiments running on a live system deployed on a cluster of machines show the efficiency of the proposed design. All the features supported by SHAHED are made accessible through an easy to use Web interface that hides the complexity of the system and provides a nice user experience.