BIG DATA ANALYTICS USING SPARK SYLLABUS

 


UNIT I:

Introduction to Big Data: What is Big Data, Characteristics, Data in the Warehouse and Data in Hadoop, Why is Big Data Important, When to Consider a Big Data Solution, Applications.


Introduction to Hadoop: Hadoop definition, Application development in Hadoop, The building blocks of Hadoop: NameNode, DataNode, Secondary NameNode, JobTracker and TaskTracker.

 

UNIT II:

Introduction to Spark: What is Apache Spark, Why Spark when Hadoop is there, Spark features, Spark components, Spark program flow, Spark ecosystem, Differences between implementing programs in the Hadoop and Spark programming environments.

 

UNIT III:

Spark Fundamentals: Using the spark-in-action VM, Using the Spark shell and writing a first Spark program, Basic RDD actions and transformations.

Spark SQL: Working with DataFrames, Using SQL commands, Saving and loading DataFrames.

 

UNIT IV:

Streaming in Spark: Writing Spark Streaming applications, Using external data sources, Structured Streaming.

Spark MLlib: Introduction to Machine Learning, Definition of Machine Learning, Machine Learning with Spark.

UNIT V:

Graph Representation in MapReduce: Graph processing with Spark, Spark GraphX, GraphX features, Graph examples, Graph algorithms: Shortest Path Algorithm.


TEXT BOOKS:


2.      Spark in Action, Petar Zecevic and Marko Bonaci, Manning Publications, 2016.

3.      Learning Spark, Holden Karau, A. Konwinski, et al., O'Reilly Publications.
