Over the past few years, distributed data caches and cluster-based distributed data environments have become popular in large-scale IT systems. Hadoop now provides data scalability for Facebook and many other large systems. Apache Hadoop is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Customers are now looking to migrate their on-premises Hadoop systems to public cloud environments, but they need to minimize risk, business disruption, and cost.
This paper examines the functional and performance differences between two Hadoop migration tools: WANdisco LiveMigrator and Cloudera Backup and Disaster Recovery (BDR).