Frequently asked questions about Cloudera
- What does Cloudera do?
- How do I engage with Cloudera?
- Ha-What? What is Hadoop?
- I know what Hadoop is, but what about all those other things - like Pig, Flume, Sqoop, etc.?
- Who is Doug Cutting and what does he do at Cloudera?
- What is a "data scientist"?
- I have a lot of data, but how do I know if it's "Big Data?"
- What does "Ask Bigger Questions" mean?
- What solutions does Cloudera offer?
- What is CDH?
- What is Cloudera Enterprise RTQ?
- How do I get Cloudera products?
- What kinds of companies are using Cloudera solutions today?
- Why is an open source distribution important?
- Do you help people learn how to use Hadoop?
- Where can I find more information about Cloudera University and the courses offered?
- What is the Cloudera Connect Partner Program?
1. What does Cloudera do?
Cloudera is the leader in Apache Hadoop-based software and services and offers a powerful new data platform that enables enterprises and organizations to look at all their data — structured as well as unstructured — and ask bigger questions for unprecedented insight at the speed of thought.
Cloudera was founded by some of the top minds in Big Data. Our Chief Architect, Doug Cutting, is a founder of Hadoop. Cloudera enhances the storage and processing technologies originally developed by the world’s biggest web companies. Today, Cloudera is the market leader in Hadoop with tens of thousands of nodes under management, as well as the top contributor of code to the Hadoop ecosystem. Markets include financial services, government, telecommunications, media, web, advertising, retail, energy, bioinformatics, pharma/healthcare, university research, oil and gas, gaming and more.
2. How do I engage with Cloudera?
Is your organization looking to implement a Big Data/Hadoop solution within your data infrastructure? Are you a potential technology partner? Are you looking for classroom or private training? Are you a reporter working on a Big Data story? The information below will help you connect with the right Cloudera resource.
Request a Quote: http://www.cloudera.com/content/cloudera/en/about/contact-us/contact-form.html
Training and Certification: http://university.cloudera.com/
Private Training: http://go.cloudera.com/privatetrainingrequest.html
Media Inquiries: firstname.lastname@example.org
3. Ha-What? What is Hadoop?
Doug Cutting, Cloudera’s Chief Architect and Chairman of the Apache Software Foundation, helped create Apache Hadoop, 100% open source software that enables distributed parallel processing of huge amounts of data across inexpensive, industry-standard servers. Those servers both store and process the data, and a Hadoop cluster can scale without limits.
With Hadoop, no data is too big. And in today’s hyper-connected world where people and businesses are creating more and more data every day, Hadoop’s ability to grow virtually without limits means businesses and organizations can now unlock potential value from all their data.
Today, Cloudera is the market leader in Hadoop deployments with tens of thousands of nodes under management, as well as the top contributor of code back to the Hadoop ecosystem.
4. I know what Hadoop is, but what about all those other things - like Pig, Flume, Sqoop, etc.?
The Apache Software Foundation supports many open source projects that are designed to add functionality and capabilities to Hadoop. For more information about each project, we encourage you to read our blog at http://www.cloudera.com/blog and review detailed project information at http://www.apache.org.
5. Who is Doug Cutting and what does he do at Cloudera?
Doug is a founder of numerous successful open source projects, including Lucene, Nutch and Hadoop. Doug joined Cloudera in 2009 from Yahoo!, where he was a key member of the team that built and deployed a production Hadoop storage and analysis cluster for mission-critical business analytics. He serves as Cloudera’s Chief Architect and as Chairman of the Apache Software Foundation.
6. What is a "data scientist"?
Data science has been around for decades but is now becoming more mainstream. Data scientists may well become the developers of the next generation of enterprise systems as analytics, Big Data and BI come together.
Cloudera’s Chief Scientist Jeff Hammerbacher and LinkedIn’s DJ Patil coined the term and created teams of workers who were focused on data science. Today, data scientists are building recommendation systems and solving science problems by digging deep into and across data sets for correlations and insights that can be used to help develop new products and solutions.
Cloudera recently announced a new training course and certification program for Data Scientists. The Cloudera Certified Apache Hadoop Professional Data Scientist certification program will be the first of its kind to test actual system skills by administering both a written and performance-based exam.
7. I have a lot of data, but how do I know if it's "Big Data?"
Every company that has data likely has "Big Data," and it grows continuously. Big Data is any type of data, including structured and unstructured data such as log files, customer service information, retail data, text, database information and so on. All of this data can now be analyzed in aggregate, across types and formats, to help make more informed business decisions and drive new solutions.
8. What does "Ask Bigger Questions" mean?
Cloudera’s vision for the next generation of data management describes a world where real-time interactivity with data at the speed of thought enables data-driven enterprises to Ask Bigger Questions and gain bigger answers, translating into significant business value.
Enterprises struggle to store and manage Big Data, which by definition exceeds the capacity of current systems, and the reason is clear: those legacy systems were designed decades ago, long before "Big Data" was front and center in the collective imagination. But the development of Apache Hadoop changes the economics of data completely. For the first time, the technology exists to efficiently store, manage, and analyze virtually unlimited amounts of data. 100% open source and engineered to run on industry-standard hardware, Hadoop scales virtually without limits and handles any type of data, no matter how it’s encoded or formatted and whether it’s structured or unstructured. And now with Cloudera, Hadoop is enterprise-proven, already delivering on the promise of Big Data across numerous industries and hundreds of use cases.
9. What solutions does Cloudera offer?
Cloudera Enterprise Core is the most comprehensive solution for Hadoop and the foundation for Big Data in the data center. Comprising CDH, Cloudera’s 100% open source distribution of Hadoop and related projects; Cloudera Manager, our industry-leading management suite; and Cloudera Support for operational confidence, it includes everything your organization needs to get on the fastest path to value with Hadoop. Add Cloudera Real-Time Accelerators to work with Hadoop at the speed of thought.
Cloudera Enterprise Free offers the essential evaluation kit for Hadoop in the enterprise. It includes CDH and Cloudera Manager Free Edition, a free version of the Cloudera Manager application.
Cloudera Manager simplifies deployment, configuration, diagnostics and reporting for CDH and is available in two versions. Choose Cloudera Manager Free Edition to install, configure and perform basic management of a CDH cluster. For critical business use, you want Cloudera Manager, which includes many additional features that are not available in Cloudera Manager Free Edition and enables complete, end-to-end management for CDH clusters of any size.
10. What is CDH?
CDH is Cloudera’s 100% open source distribution of Hadoop and related projects, built specifically to meet enterprise demands. CDH delivers the core elements of Hadoop – scalable storage and distributed computing – as well as all of the necessary enterprise capabilities such as security, high availability and integration with a broad range of hardware and software solutions.
Available for free download, CDH is the world’s most widely deployed distribution of Hadoop and is currently run at scale in production environments across a broad range of industries and use cases.
11. What is Cloudera Enterprise RTQ?
Powered by the open source Impala technology pioneered by Cloudera, Cloudera Enterprise RTQ enables companies to do real-time, interactive data analysis in seconds for true speed-of-thought analytics while storing and managing data at a fraction of the cost of traditional systems. Cloudera Enterprise RTQ is available as an add-on feature to Cloudera Enterprise. With this major advance, Cloudera is rewriting the way enterprises do data management.
12. How do I get Cloudera products?
CDH is available for free download at www.cloudera.com/downloads or can be installed automatically via Cloudera Manager.
Companies looking to implement Cloudera Enterprise can speak with a member of the Cloudera sales team for help in determining the best solution to fit their Big Data needs. Call 1-888-789-1488, option 1, or visit http://info.cloudera.com/Contact.html.
13. What kinds of companies are using Cloudera solutions today?
More than half of the Fortune 50 are utilizing CDH today. With tens of thousands of nodes under management across customers in financial services, government, telecommunications, media, web, advertising, retail, energy, bioinformatics, pharma/healthcare, university research, oil and gas and gaming, CDH is the most widely used Hadoop distribution in the world.
To find out more about the companies using CDH today, please visit our customer page: www.cloudera.com/customers/
14. Why is an open source distribution important?
Since its inception in 2008, Cloudera has been strongly committed to a community-driven, open source Apache Hadoop distribution because of its tangible benefits to customers, such as freedom from lock-in. That said, these benefits are just “ante into the game” when deploying a complex and strategic open source platform like Hadoop.
In addition to offering the benefits of open source, Cloudera has led the way in working with customers to ensure that their performance, availability, security, and recoverability needs are met in the open source platform and delivered in our open source distribution. With far more customers in production and far more enterprise experience than any other Hadoop vendor, Cloudera is uniquely qualified to understand these needs and has proven to be the best partner for meeting them.
Cloudera is also a proud supporter of Hadoop, Data Science, and Big Data user groups and meetups worldwide.
15. Do you help people learn how to use Hadoop?
Yes! Cloudera has the longest-running Hadoop training program, with over 15,000 people trained worldwide. In 2009, Cloudera was the first to develop an official curriculum for training and certifying developers and system administrators on Apache Hadoop. Cloudera University provides training and certification globally for developers, administrators and managers on Apache Hadoop and many related projects, including Apache Hive, Apache Pig and Apache HBase. Cloudera’s certification program, which comprises Cloudera Certified Developer for Apache Hadoop (CCDH) and Cloudera Certified Administrator for Apache Hadoop (CCAH), provides industry-recognized credentials that are increasingly being required by organizations hiring individuals to work on their Hadoop-based projects.
Cloudera University’s global training schedule in 2012 includes more than 250 courses in ten countries. Cloudera is planning to deliver training next year in over 25 countries and expand its certification program to provide credentials across the entire spectrum of Hadoop-related technologies including the recently announced Data Scientist training and certification program.
In addition to Cloudera University’s training and certification, Cloudera also offers free, online resources for learning about Hadoop and Hadoop-related projects. Users can find white papers, webinars and videos to help advance their understanding of Hadoop. Cloudera's Developer Center is also a great resource for technical implementation and administration best practices.
16. Where can I find more information about Cloudera University and the courses offered?
Cloudera University has detailed course information, schedules and registration links for all training and certification programs offered for Hadoop, as well as all free online training resources.
17. What is the Cloudera Connect Partner Program?
The Cloudera Connect Partner Program is more than 400 companies strong and is designed to champion partner advancement and solution development for the Apache Hadoop ecosystem. Together, we can promote a broader, stronger presence by combining your product and services expertise with our leading Apache Hadoop-based technology, support and partner enablement. The program has more members than any other partner program, and Cloudera is the only Hadoop provider with a Technology Certification program, which ensures consistency, reliability and tight integration with enterprise environments.
The Cloudera Connect Partner Program is a multi-track program designed for independent software vendors (ISVs), independent hardware vendors (IHVs), systems integrators (SIs), value-added resellers (VARs) and Training Organizations that are entering or already part of the Apache Hadoop data management and analytics ecosystem.
We welcome partners who want to participate at either our baseline level as Cloudera Connect Members or at an advanced level as Cloudera Connect Authorized Partners.
For detailed information on our tracks and levels, please see the Cloudera Connect Partner Program Guide.