Apache Falcon: InMobi’s Contribution to the Open Source Community for Data Management

Falcon is a data management and process orchestration platform that InMobi built and made open source through the Apache Software Foundation. It lets end consumers quickly onboard their data, along with the associated processing and management tasks, onto Hadoop clusters. Data management on Hadoop encompasses data motion, process orchestration, lifecycle management, and data discovery, among other concerns. Falcon addresses these concerns, and creates additional opportunities, by building on existing components within the Hadoop ecosystem rather than reinventing the wheel. Falcon makes data management on Hadoop easier through a declarative mechanism: users of the platform simply declare infrastructure endpoints, data sets, and processing rules. These declarative configurations are written so that the dependencies between the configured entities are described explicitly. Knowing how the entities depend on one another is what allows Falcon to orchestrate and manage the various data management functions.
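To make the idea of explicitly declared dependencies concrete, here is a minimal Python sketch. It is not Falcon's entity format or API: the entity names (`primary-cluster`, `raw-clicks`, and so on), the `depends_on` field, and both helper functions are hypothetical, chosen only to show how an orchestrator could derive a safe ordering from declarations alone.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical declarations: an infrastructure endpoint, two data sets,
# and a processing rule. The names and the "depends_on" field are
# illustrative assumptions, not Falcon's actual schema.
entities = {
    "primary-cluster": {"type": "cluster", "depends_on": []},
    "raw-clicks": {"type": "feed", "depends_on": ["primary-cluster"]},
    "clean-clicks": {"type": "feed", "depends_on": ["primary-cluster"]},
    "clean-clicks-job": {
        "type": "process",
        "depends_on": ["primary-cluster", "raw-clicks", "clean-clicks"],
    },
}


def deployment_order(entities):
    """Return entity names ordered so each one follows everything it depends on."""
    # Reject references to entities that were never declared.
    for name, spec in entities.items():
        for dep in spec["depends_on"]:
            if dep not in entities:
                raise KeyError(f"{name} depends on undeclared entity {dep}")
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    graph = {name: set(spec["depends_on"]) for name, spec in entities.items()}
    # static_order() raises graphlib.CycleError if the declarations are circular.
    return list(TopologicalSorter(graph).static_order())


def dependents_of(entities, target):
    """Return the names of entities that directly reference `target`."""
    return sorted(
        name for name, spec in entities.items() if target in spec["depends_on"]
    )


if __name__ == "__main__":
    print(deployment_order(entities))
    print(dependents_of(entities, "raw-clicks"))
```

Because the dependencies are part of the declarations, the same graph answers two questions: in what order entities can be set up, and which entities are affected when a given data set changes.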