Entry Level Hadoop Developer and Training


Title:

Entry Level Hadoop Developer and Training

Job ID:

6928

Location:

Woburn, MA 

Classification:

Manufacturing/Operations

Salary:

Posted By:

Xperttech

Job Type:

Full time

Posted:

05/30/2014

Start Date:

06/21/2014

Job Function:

Developer

Telephone:

781-325-9151

Job Description:

We are hiring 40 entry-level Hadoop developers for our joint venture with a startup company. Because Hadoop is a new technology that few people know yet, we will be conducting two months of free, intensive in-class Hadoop training, starting June 21st, 2014 in Boston, MA. We will provide free accommodation for our out-of-state students. This intensive training will also prepare you for a Hadoop certification.

 

You need to know the basics of Java to enroll in this program.

 

This is a very exciting opportunity: we are expecting senior executives from Fortune 100 companies and professors from well-known colleges to be our guest instructors. We see a great future for Hadoop developers and want you to be one of them. We will even sponsor your immigration status.

Below is the course content we will be teaching you!

 

LINUX Introduction

File Handling

Text Processing

System Administration

Archival

Network

 

Core Java Training

Introduction

String

Exception Handling

 

 

INTRODUCTION TO BIG DATA – HADOOP

Big Data (What, Why, Who) – The 3+ Vs – Overview of the Hadoop Ecosystem – Role of Hadoop in Big Data – Overview of Other Big Data Systems – Who Is Using Hadoop – Hadoop Integration into Existing Software Products – Current Scenario in the Hadoop Ecosystem – Installation – Configuration – Use Cases of Hadoop (Healthcare, Retail, Telecom)

 

HDFS

Concepts – Architecture – Data Flow (File Read, File Write) – Fault Tolerance – Shell Commands – Java-Based API – Archives – Coherency – Data Integrity – Role of the Secondary NameNode

 

MAPREDUCE

Theory – Data Flow (Map – Shuffle – Reduce) – mapred vs. mapreduce APIs – Programming (Mapper, Reducer, Combiner, Partitioner) – Writables – InputFormat – OutputFormat – Streaming API using Python – Inherent Failure Handling using Speculative Execution – Magic of the Shuffle Phase – File Formats – Sequence Files
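The Map – Shuffle – Reduce data flow above can be sketched with a minimal Python word count in the style of the Streaming API. This is an illustrative sketch, not course material: the mapper and reducer below mimic what Hadoop Streaming would run over stdin/stdout, with the shuffle phase simulated locally by a sort.

```python
from itertools import groupby

def mapper(lines):
    """Map phase: emit a (word, 1) pair for every word, as a streaming
    mapper would write tab-separated key/value lines to stdout."""
    for line in lines:
        for word in line.strip().split():
            yield word.lower(), 1

def reducer(pairs):
    """Reduce phase: pairs arrive grouped by key (the shuffle phase
    guarantees sorted input); sum the counts for each word."""
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)

if __name__ == "__main__":
    data = ["big data big wins", "data flows"]
    # Simulate the shuffle: sort mapper output by key before reducing.
    shuffled = sorted(mapper(data))
    for word, count in reducer(shuffled):
        print(f"{word}\t{count}")
```

Under real Hadoop Streaming the same two functions would be split into separate mapper and reducer scripts reading stdin, and the framework would perform the sort-and-group between them.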

 

ADVANCED MAPREDUCE PROGRAMMING

Counters (Built-In and Custom) – Custom InputFormat – Distributed Cache – Joins (Map-Side, Reduce-Side) – Sorting – Performance Tuning – GenericOptionsParser – ToolRunner – Debugging (LocalJobRunner)
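Custom counters from the list above have a simple mechanism under Hadoop Streaming: a task increments a counter by writing a line of the form `reporter:counter:<group>,<name>,<amount>` to stderr. A minimal sketch (the counter group and record format are made up for illustration):

```python
import sys

def map_with_counter(lines, err=sys.stderr):
    """Streaming-style mapper that skips malformed records and reports
    them through a custom counter instead of failing the task. Hadoop
    Streaming parses 'reporter:counter:...' lines written to stderr."""
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 2:
            err.write("reporter:counter:Quality,MalformedRecords,1\n")
            continue
        key, value = parts
        yield key, value

if __name__ == "__main__":
    sample = ["a\t1", "broken-record", "b\t2"]
    for key, value in map_with_counter(sample):
        print(f"{key}\t{value}")
```

In the Java API the equivalent is `context.getCounter(group, name).increment(1)`; either way the totals show up in the job's counter report.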

 

ADMINISTRATION

Multi-Node Cluster Setup using AWS Cloud Machines – Hardware Considerations – Software Considerations – Commands (fsck, job, dfsadmin) – Schedulers in the JobTracker – Rack Awareness Policy – Balancing – NameNode Failure and Recovery – Commissioning and Decommissioning a Node – Compression Codecs

 

HBASE

Introduction to NoSQL – CAP Theorem – Classification of NoSQL – HBase and RDBMS – HBase and HDFS – Architecture (Read Path, Write Path, Compactions, Splits) – Installation – Configuration – Role of ZooKeeper – HBase Shell – Java-Based APIs (Scan, Get, Other Advanced APIs) – Introduction to Filters – Row Key Design – MapReduce Integration – Performance Tuning – What's New in HBase 0.98 – Backup and Disaster Recovery – Hands-On

 

HIVE

Architecture – Installation – Configuration – Hive vs. RDBMS – Tables – DDL – DML – UDFs – UDAFs – Partitioning – Bucketing – MetaStore – Hive–HBase Integration – Hive Web Interface – Hive Server (JDBC, ODBC, Thrift) – File Formats (RCFile, ORCFile) – Other SQL-on-Hadoop Systems

 

PIG

Architecture – Installation – Hive vs. Pig – Pig Latin Syntax – Data Types – Functions (Eval, Load/Store, String, DateTime) – Joins – Pig Server – Macros – UDFs – Performance – Troubleshooting – Commonly Used Functions

 

SQOOP

Architecture – Installation – Commands (import, hive-import, eval, HBase import, import-all-tables, export) – Connectors to Existing Databases and Data Warehouses

 

FLUME

Why Flume? – Architecture – Configuration (Agents) – Sources (Exec, Avro, NetCat) – Channels (File, Memory, JDBC, HBase) – Sinks (Logger, Avro, HDFS, HBase, File Roll) – Contextual Routing (Interceptors, Channel Selectors) – Introduction to Other Aggregation Frameworks
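A Flume agent wires the pieces above together in a properties file. The sketch below is illustrative (the agent name `a1`, component names, port, and HDFS path are all arbitrary): it connects a NetCat source to an HDFS sink through a memory channel.

```properties
# Components of an agent named "a1" (names are arbitrary).
a1.sources  = r1
a1.channels = c1
a1.sinks    = k1

# NetCat source: listens on a TCP port, one event per line.
a1.sources.r1.type = netcat
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 44444

# Memory channel: buffers events between source and sink.
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000

# HDFS sink: writes events under a time-bucketed path.
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = /flume/events/%Y-%m-%d
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.useLocalTimeStamp = true

# Wire the source and sink to the channel.
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
```

Swapping the channel type to `file` or the sink to `logger` is a one-line change, which is why the source/channel/sink split in the syllabus matters.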

 

OOZIE

Architecture – Installation – Workflow – Coordinator – Actions (MapReduce, Hive, Pig, Sqoop) – Introduction to Bundles – Mail Notifications

 

HADOOP 2.0

Limitations in Hadoop 1.0 – HDFS Federation – High Availability in HDFS – HDFS Snapshots – Other Improvements in HDFS 2 – Introduction to YARN (aka MR2) – Limitations in MR1 – Architecture of YARN – MapReduce Job Flow in YARN – Introduction to the Stinger Initiative and Tez – Backward Compatibility with Hadoop 1.x

 

SOLR

Introduction to Information Retrieval – Common Use Cases – Introduction to Solr and Lucene – Installation – Concepts (Cores, Schema, Documents, Fields, Inverted Index) – Configuration – CRUD Operation Requests and Responses – Java-Based APIs – Introduction to SolrCloud

 

Cloudera/MapR certification assistance will be provided!

Please do not hesitate to contact me if you have any more questions.

 

Company Info
Xperttech

