Taming the Data Lake: The HPCC Systems Open Source Big Data Platform

A “Data Lake” is an architecture and methodology for the continuous management of complex data. It stores data in its raw format, which gives greater agility in data exploration. As each piece of data enters the lake, it receives a unique identifier and a set of extended metadata tags that make it readily available for manipulation and analysis. In contrast, a “Data Warehouse” stores data in a predefined format for faster delivery of analysis results.
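
The contrast between the two storage styles can be illustrated with a minimal Python sketch. It is not HPCC Systems code and does not reflect how the platform is implemented; the in-memory `lake` and `warehouse` containers, the function names, and the sample schema are all hypothetical, chosen only to show raw storage with an identifier and metadata tags next to storage in a predefined format.

```python
import json
import uuid

# Hypothetical in-memory "lake": raw payloads are kept as-is, keyed by a unique ID.
lake = {}

def ingest_raw(payload: bytes, tags: dict) -> str:
    """Store the payload untouched, together with a unique ID and metadata tags."""
    record_id = str(uuid.uuid4())
    lake[record_id] = {"raw": payload, "tags": dict(tags)}
    return record_id

def find_by_tag(key: str, value: str) -> list:
    """Return the IDs of records whose metadata tag `key` equals `value`."""
    return [rid for rid, rec in lake.items() if rec["tags"].get(key) == value]

# Hypothetical "warehouse": rows must match a predefined schema when written.
WAREHOUSE_SCHEMA = {"customer_id": int, "amount": float}
warehouse = []

def load_row(row: dict) -> None:
    """Reject any row that does not match the predefined schema."""
    if set(row) != set(WAREHOUSE_SCHEMA):
        raise ValueError("row does not match the predefined schema")
    for column, expected_type in WAREHOUSE_SCHEMA.items():
        if not isinstance(row[column], expected_type):
            raise ValueError(f"column {column!r} must be {expected_type.__name__}")
    warehouse.append(row)

# The lake accepts the record in whatever shape it arrives.
rid = ingest_raw(
    b'{"customer_id": 7, "amount": "12.50", "note": "free text"}',
    {"source": "orders", "format": "json"},
)

# Structure is applied only when the data is read and explored.
parsed = json.loads(lake[rid]["raw"])

# The warehouse accepts the record only after it is shaped to the schema.
load_row({"customer_id": parsed["customer_id"], "amount": float(parsed["amount"])})
```

In this sketch the lake keeps the free-text note and the string-typed amount exactly as received, while the warehouse row carries only the columns and types fixed in advance.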

HPCC Systems offers the best of both worlds: it combines the fast information delivery of a Data Warehouse with the ability to explore data as if it were in a Data Lake. To work with large datasets, HPCC Systems uses a distributed data architecture and a parallel processing methodology. Enterprises are adopting data lake technology to manage their rapidly growing internal datasets and to solve complex problems through data analysis, with the goal of improving their relationships with customers and suppliers.
