Hadoop Real-World Solutions Cookbook

About This Book

Solutions to common problems when working in the Hadoop environment
Recipes for (un)loading data, analytics, and troubleshooting
In-depth code examples demonstrating various analytic models, analytic solutions, and common best practices

Who This Book Is For

This book is ideal for developers who want a better understanding of Hadoop application development and its associated tools, and for developers who understand Hadoop conceptually but want practical examples of real-world applications.

What You Will Learn

Data ETL, compression, serialization, and import/export
Simple and advanced aggregate analysis
Graph analysis
Machine learning
Troubleshooting and debugging
Scalable persistence
Cluster administration and configuration

In Detail

This book helps developers become more comfortable and proficient at solving problems in the Hadoop space. Readers will become familiar with a wide variety of Hadoop-related tools and with best practices for implementation.

Hadoop Real-World Solutions Cookbook will teach readers how to build solutions using tools such as Apache Hive, Pig, MapReduce, Mahout, Giraph, HDFS, Accumulo, Redis, and Ganglia.

Hadoop Real-World Solutions Cookbook provides in-depth explanations and code examples. Each chapter contains a set of recipes that pose, then solve, technical challenges, and the recipes can be completed in any order. A recipe breaks a single problem down into discrete steps that are easy to follow. The book covers (un)loading to and from HDFS, graph analytics with Giraph, batch data analysis using Hive, Pig, and MapReduce, machine learning approaches with Mahout, debugging and troubleshooting MapReduce, and columnar storage and retrieval of structured data using Apache Accumulo.

Hadoop Real-World Solutions Cookbook will give readers the examples they need to apply Hadoop technology to their own problems.

Table of Contents

1: Hadoop Distributed File System – Importing and Exporting Data
3: Extracting and Transforming Data
4: Performing Common Tasks Using Hive, Pig, and MapReduce
5: Advanced Joins
6: Big Data Analysis
7: Advanced Big Data Analysis
8: Debugging
9: System Administration
10: Persistence Using Apache Accumulo


