There is an incredible volume of free resources online to help you learn the Hadoop ecosystem. I’ve gone through these resources, Webinars, and VM tutorials myself, and they will teach you the basics of the Hadoop landscape.
A large portion of these resources is not vendor-specific. The fundamentals of the ecosystem (HDFS, MapReduce, Hive, Pig, Sqoop, Flume, etc.) are the same across all vendors. Each vendor then puts its own twist on the administration toolset and adds its own extended capabilities.
Cloudera Essentials for Apache Hadoop
This is a 7-part recorded Webinar series.
Here is where you can download the Cloudera VM image:
A vast library of Webinars covering a multitude of subject areas. Among these is a good walk-through of setting up the “Word Count” MapReduce program in Java (the “Hello World” program of Hadoop):
- Hadoop Basics
- Cluster Management (this will be MapR-centric)
- Hadoop Use Cases
- Hadoop Admin
- Hadoop Developer
- Hadoop Business User
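For readers who have not yet seen the “Word Count” program mentioned above, here is a minimal sketch of its logic in plain Java. It deliberately does not use the Hadoop API. The class name `WordCountSketch` and the methods `map`, `shuffle`, `reduce`, and `wordCount` are made up for illustration, and they only imitate in memory the three phases a real MapReduce job runs across a cluster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: an in-memory imitation of the Word Count job.
// None of these names come from the Hadoop API.
public class WordCountSketch {

    // "Map" phase: emit a (word, 1) pair for every word in one input line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String token : line.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                pairs.add(Map.entry(token, 1));
            }
        }
        return pairs;
    }

    // "Shuffle" phase: group all emitted values by their key (the word).
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    // "Reduce" phase: sum the grouped values for a single word.
    static int reduce(String word, List<Integer> counts) {
        int sum = 0;
        for (int count : counts) {
            sum += count;
        }
        return sum;
    }

    // Runs the three phases in order over a list of input lines.
    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : shuffle(pairs).entrySet()) {
            result.put(group.getKey(), reduce(group.getKey(), group.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("Hello Hadoop", "Hello World");
        System.out.println(wordCount(lines)); // prints {hadoop=1, hello=2, world=1}
    }
}
```

In a real Hadoop job the map and reduce steps would live in separate Mapper and Reducer classes and the framework would handle the grouping, so treat this only as a way to get comfortable with the idea before following the Webinar walk-through.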
Here is where you can download the MapR VM image/Sandbox:
For the Oracle / PL/SQL programmers who are new to Java, here are links to a couple of docs from Peter Koletzke of Quovera which help bridge the gap. These were presented at ODTUG 2013 (as well as at RMOUG in 2012). They have nothing to do with Hadoop specifically, but they do help the non-Java programmer understand how Java programs are structured. Note that not all Hadoop programming requires Java. Several tools, such as Hive and Pig, generate MapReduce jobs for you. As the ecosystem tools continue to evolve, our need for custom Java will dwindle, but it’s still good to understand, even if just for a custom UDF here and there.
“Introduction to Java – PL/SQL Developers Take Heart”
White paper – http://www.quovera.com/whitepapers/downloads/PAPERPKoletzke.IntroductionTo.pdf
Presentation – http://www.quovera.com/whitepapers/downloads/PKoletzke.IntroductionTo.pdf
This is a VM image which also comes with several hands-on Hadoop tutorials:
There are thousands of additional resources and videos online concerning Hadoop and MapReduce. I recommend searching YouTube for the key Hadoop ecosystem terms: MapReduce, Hadoop, HDFS, Hive, etc. Sometimes seeing and hearing the same concept described in several different ways helps.
While not Hadoop, this is also a competing Big Data platform. They have a set of VM downloads (Aster Express) with extensive free tutorials which provide great insight into Big Data analytics in general.
VM images – http://www.asterdata.com/downloads/
Tutorials – http://www.asterdata.com/download_aster_express/tutorial.php