Apache Spark is a platform for processing big data in memory, in both batch and streaming modes. In-memory processing can be much faster than the disk-based MapReduce processing offered by traditional Hadoop installations. Here’s what Cloudera has to say about Spark:
Use cases: Apache Spark supports batch, streaming, and interactive analytics on all your data, enabling historical reporting, interactive analysis, data mining, real-time insights.
In his letter, Cloudera CEO Tom Reilly made a few interesting points.
$900 million round of funding
Cloudera secured a $900 million round of funding earlier this year, one of the largest ever in enterprise software. The majority of the investment came from Intel. Tom Reilly calls out chip-level encryption as a security outcome of the Intel relationship. Cloudera now has over 800 team members.
Acquisition of Gazzang and DataPad
Gazzang reportedly enables the industry’s first and only fully secure and regulation-compliant Hadoop platform. DataPad has created a Python-based framework that simplifies data processing and analysis with Cloudera Enterprise.
The Cloudera partnerships listed are with Microsoft Azure, MongoDB, EMC, and Teradata.