MapReduce is a popular data analytics framework that is widely used in both industry and scientific research. Despite this popularity, several obstacles stand in the way of using it to develop some commercial and scientific data analysis applications. iNFORMER, “A MapReduce-like Data-Intensive Processing Framework for Native Data Storage and Formats,” addresses many of the drawbacks of using MapReduce on scientific data.

The framework allows MapReduce-like applications to be executed over data stored in a native data format, without first loading the data into the framework. This addresses a major limitation of existing MapReduce-like implementations, which require the data to be loaded into specialized file systems, e.g., the Hadoop Distributed File System (HDFS). The overheads and additional data management processes required for this loading and translation step can prevent MapReduce from being used in many commercial and scientific environments.
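To make the idea of processing data in place more concrete, here is a minimal Python sketch of a MapReduce-like mean computed directly over a raw binary file. It is not iNFORMER's API. The file layout (a flat sequence of little-endian float64 values) and the names `split_ranges`, `map_chunk`, `combine`, and `mean_of_file` are assumptions made purely for illustration.

```python
import os
import struct
from functools import reduce

# Assumed native layout: a flat file of little-endian float64 records.
ITEM = struct.Struct("<d")


def split_ranges(path, n_chunks):
    """Divide the file into record-aligned (byte_offset, record_count) ranges."""
    n_records = os.path.getsize(path) // ITEM.size
    per_chunk = max(1, -(-n_records // n_chunks))  # ceiling division, at least 1
    return [
        (start * ITEM.size, min(per_chunk, n_records - start))
        for start in range(0, n_records, per_chunk)
    ]


def map_chunk(path, offset, count):
    """Map step: read one range where it already lives and emit (sum, count)."""
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read(count * ITEM.size)
    values = [v for (v,) in ITEM.iter_unpack(data)]
    return (sum(values), len(values))


def combine(a, b):
    """Reduce step: merge two partial (sum, count) results."""
    return (a[0] + b[0], a[1] + b[1])


def mean_of_file(path, n_chunks=4):
    """Run the map step over every range, then reduce the partial results."""
    partials = [
        map_chunk(path, offset, count)
        for offset, count in split_ranges(path, n_chunks)
    ]
    total, count = reduce(combine, partials, (0.0, 0))
    return total / count if count else float("nan")
```

Each map call seeks to its own byte range, so no copy of the data into a separate file system is needed. In this sketch the map calls run sequentially; because each range is independent, they could be handed to separate workers.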
