Here’s a detailed article from Cloudera about how to use Spark Streaming to build a near-real-time dashboard.
Cloudera is rebuilding machine learning for Hadoop with Oryx zite.to/1fOWtNC
One of the challenges when analyzing data is having to move it from place to place. A common approach is to extract data from a data warehouse and analyze it on a local PC or laptop. Once the analysis or model build is complete, the logic has to be transferred back to the data warehouse so it can be put into production. This process can be slow and error-prone.
MADlib is an open-source library of analytic and machine learning functions that run in-database. Once the library is installed on your database, you no longer need to move the data. Instead, you move your logic to the data.
Check out this post from Cloudera: How-to: Use MADlib Pre-built Analytic Functions with Impala.
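To make the "move your logic to the data" idea concrete, here is a minimal sketch in Python. It is not MADlib and not Impala: SQLite stands in for the data warehouse, the `measurements` table and the function names are made up for illustration, and the regression is a hand-written aggregate query rather than a prebuilt library function. The point is only the contrast between pulling every row out and letting the database do the arithmetic.

```python
import sqlite3


def build_demo_connection():
    """Create an in-memory SQLite database with a small, hypothetical table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE measurements (x REAL, y REAL)")
    # Points that lie exactly on the line y = 2x + 1.
    conn.executemany(
        "INSERT INTO measurements (x, y) VALUES (?, ?)",
        [(1, 3), (2, 5), (3, 7), (4, 9), (5, 11)],
    )
    return conn


def least_squares(n, sx, sy, sxy, sxx):
    """Closed-form simple linear regression from five summary values."""
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept


def fit_by_extracting(conn):
    """Move the data to the logic: fetch every row, then compute locally."""
    rows = conn.execute("SELECT x, y FROM measurements").fetchall()
    n = len(rows)
    sx = sum(x for x, _ in rows)
    sy = sum(y for _, y in rows)
    sxy = sum(x * y for x, y in rows)
    sxx = sum(x * x for x, _ in rows)
    return least_squares(n, sx, sy, sxy, sxx)


def fit_in_database(conn):
    """Move the logic to the data: the database returns one summary row."""
    summary = conn.execute(
        "SELECT COUNT(*), SUM(x), SUM(y), SUM(x * y), SUM(x * x) "
        "FROM measurements"
    ).fetchone()
    return least_squares(*summary)


if __name__ == "__main__":
    conn = build_demo_connection()
    print("extract then fit:", fit_by_extracting(conn))
    print("fit in database: ", fit_in_database(conn))
```

Both functions give the same line, but the second one transfers a single row of five numbers no matter how large the table is. An in-database library takes this further by packaging such computations as ready-made functions.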
Cloudera has a post on their blog by one of the engineers from Rapleaf about transitioning their infrastructure from MySQL to Hadoop and the benefits it offered for scaling.
Cloudera has formed an integration alliance with Greenplum. Cloudera will integrate their distribution of Hadoop with Greenplum’s Chorus product. Read more at ZDNet.
As the open source software movement continues to strengthen, questions abound about where the opportunities lie to create commercially viable solutions. Red Hat did it with Linux. Can Cloudera do it with Hadoop? Read this GigaOm article.
The guys from Cloudera put together the following executive overview of what Hadoop can do for big data.
Cloudera founder, Jeff Hammerbacher, has a new data book available – Beautiful Data: The Stories Behind Elegant Data Solutions