Technology Course

Apache Spark Certification Training

Course Type: Certification | Study Mode: Online
Keywords: spark download | spark tutorial | spark streaming | spark sql | scala programming | scala list | scala language
Course Provider:

Course Detail


Introduction to Scala for Apache Spark. Learning Objectives: In this module, you will learn the basics of Scala required for programming Spark applications, including basic constructs such as variable types, control structures, and collections. Topics: What is Scala?, Why Scala for Spark?, Scala in other frameworks, introduction to the Scala REPL, basic Scala operations, variable types in Scala, control structures in Scala, foreach loop, functions, procedures, collections in Scala (Array, ArrayBuffer, Map, Tuples, Lists), and more.
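As a rough illustration of the constructs this module lists, here is a minimal sketch in plain Scala (no Spark required). The object name `ScalaBasicsSketch` and the helpers `parity` and `report` are hypothetical names chosen for this example; they are not taken from the course material.

```scala
import scala.collection.mutable.ArrayBuffer

object ScalaBasicsSketch {
  // A function: if/else is an expression in Scala, so it yields the result directly.
  def parity(n: Int): String = if (n % 2 == 0) "even" else "odd"

  // A "procedure": a method that returns Unit and is called for its side effect.
  def report(message: String): Unit = println(message)

  def main(args: Array[String]): Unit = {
    val language = "Scala" // val: immutable binding
    var counter = 0        // var: can be reassigned
    counter += 1

    val fixed = Array(1, 2, 3)                     // fixed-size array
    val growable = ArrayBuffer(1, 2)               // resizable buffer
    growable += 3
    val ports = Map("http" -> 80, "https" -> 443)  // immutable map
    val pair = ("spark", 2)                        // tuple of (String, Int)
    val topics = List("variables", "control structures", "collections")

    // foreach with an anonymous function
    topics.foreach(topic => report(s"$language topic: $topic"))

    // for loop over an array
    for (n <- fixed) report(s"$n is ${parity(n)}")

    val httpsPort = ports("https")
    report(s"counter=$counter, buffer size=${growable.size}, https=$httpsPort, first of pair=${pair._1}")
  }
}
```

The same statements can also be typed one at a time into the Scala REPL to see the inferred type of each value.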

Curriculum Overview

OOP and Functional Programming in Scala. Learning Objectives: In this module, you will learn about object-oriented programming and functional programming techniques in Scala. Topics: Class in Scala, getters and setters, custom getters and setters, properties with only getters, auxiliary constructor, primary constructor, singletons, companion objects, extending a class, overriding methods, traits as interfaces, layered traits, functional programming, higher-order functions, anonymous functions, and more.
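Several of these topics fit into one short sketch. The names `Greeter`, `Learner`, `OopFpSketch`, and `applyTwice` are hypothetical and exist only for illustration; the example assumes nothing beyond the Scala standard library.

```scala
// A trait used as an interface, with one abstract member and one default method.
trait Greeter {
  def name: String
  def greet(): String = s"Hello, $name"
}

// Primary constructor: the parameter list on the class itself.
class Learner(val name: String, private var _score: Int) extends Greeter {
  // Auxiliary constructor: must call another constructor first.
  def this(name: String) = this(name, 0)

  // Custom getter and setter; the setter ignores negative values.
  def score: Int = _score
  def score_=(value: Int): Unit = { if (value >= 0) _score = value }

  // Overriding a method inherited from the trait.
  override def greet(): String = super.greet() + "!"
}

// Companion object: a singleton that shares the class name and acts as a factory.
object Learner {
  def apply(name: String): Learner = new Learner(name)
}

object OopFpSketch {
  // Higher-order function: takes another function as a parameter.
  def applyTwice(f: Int => Int, x: Int): Int = f(f(x))

  def main(args: Array[String]): Unit = {
    val learner = Learner("Asha") // calls Learner.apply
    learner.score = 10            // calls score_=
    learner.score = -5            // rejected by the custom setter
    println(learner.greet())
    println(learner.score)
    println(applyTwice(x => x * 2, 3)) // anonymous function as an argument
    println(List(1, 2, 3).map(_ + 1))  // placeholder syntax for an anonymous function
  }
}
```

In the Scala 2 REPL, a class and its companion object need to be entered together (for example with `:paste`) so that they are treated as companions.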

Area Of Studies

Introduction to Big Data and Apache Spark. Learning Objectives: In this module, you will understand what big data is, the challenges associated with it, and the different frameworks available. The module also includes a first-hand introduction to Spark. Topics: Introduction to big data, challenges with big data, batch vs. real-time big data analytics, batch analytics (Hadoop ecosystem overview), real-time analytics options, streaming data (Spark), in-memory data (Spark), What is Spark?, Spark ecosystem, modes of Spark, Spark installation demo, overview of Spark on a cluster, Spark Standalone cluster, Spark Web UI.

Spark Common Operations. Learning Objectives: In this module, you will learn how to invoke the Spark Shell and use it for various common operations. Topics: Invoking the Spark Shell, creating the Spark Context, loading a file in the shell, performing basic operations on files in the Spark Shell, overview of SBT, building a Spark project with SBT, running a Spark project with SBT, local mode, Spark mode, caching overview, distributed persistence.

Playing with RDDs. Learning Objectives: In this module, you will learn about RDDs, one of the fundamental building blocks of Spark, and the related manipulations for implementing business logic. Topics: RDDs, transformations in RDD, actions in RDD, loading data in RDD, saving data through RDD, key-value pair RDD, MapReduce and pair RDD operations, Spark and Hadoop integration (HDFS), Spark and Hadoop integration (YARN), handling sequence files, Partitioner.
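The MapReduce-style shape of pair operations (map each record to a key-value pair, then combine values per key) can be previewed without a cluster. The sketch below is a local analogue using plain Scala collections, not the Spark RDD API; `PairStyleWordCount` and `wordCount` are hypothetical names for this example.

```scala
object PairStyleWordCount {
  // Counts words with ordinary Scala collections, following the same
  // map -> group by key -> combine values shape used with key-value pairs.
  def wordCount(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.split("\\s+"))   // one record per word
      .filter(_.nonEmpty)         // drop empty tokens from leading whitespace
      .map(word => (word, 1))     // build (key, value) pairs
      .groupBy(_._1)              // gather the pairs that share a key
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) } // combine per key

  def main(args: Array[String]): Unit = {
    val counts = wordCount(Seq("spark makes rdds", "rdds are lazy"))
    counts.toSeq.sortBy(_._1).foreach(println)
  }
}
```

In Spark itself the equivalent pipeline would typically run on an RDD and use `reduceByKey` for the per-key combination, with the work distributed across partitions rather than held in a single in-memory collection.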
Spark Streaming and MLlib. Learning Objectives: In this module, you will learn about the major APIs that Spark offers. You will work with Spark Streaming, which makes it easy to build scalable, fault-tolerant streaming applications, and MLlib, Spark's machine learning library. Topics: Spark Streaming architecture, first Spark Streaming program, transformations in Spark Streaming, fault tolerance in Spark Streaming, checkpointing, parallelism level, machine learning with Spark, data types, algorithms (statistics, classification and regression, clustering, collaborative filtering).
GraphX, Spark SQL and Performance Tuning in Spark. Learning Objectives: In this module, you will learn about Spark SQL, which is used to process structured data with SQL queries, and GraphX, which is used for graphs and graph-parallel computation. You will also learn various ways to optimize performance in Spark. Topics: Analyzing Hive and Spark SQL architecture, SQLContext in Spark SQL, working with DataFrames, implementing an example for Spark SQL, integrating Hive and Spark SQL, support for JSON and Parquet file formats, implementing data visualization in Spark, loading of data, Hive queries through Spark, testing tips in Scala, performance tuning tips in Spark, shared variables (broadcast variables and accumulators).
A Complete Project on Apache Spark. Learning Objectives: In this module, you will work on a live Spark project where you can apply what you learned in the previous modules hands-on and solve a real-time use case. Problem Statement: Design a system to replay real-time transactions in HDFS using Spark. Technologies Used: 1. Spark Streaming 2. Kafka (for messaging) 3. HDFS (for storage) 4. Core Spark API (for aggregation)
Entry Requirements

Who Should Go For This Course?

1. Big Data enthusiasts. 2. Software Architects, Engineers and Developers. 3. Data Scientists and Analytics professionals.

Other Information
Reviewer Name & Review Content
Roshan Bagde: I joined Edureka for the "Apache Spark and Scala" course. The course instructor is very knowledgeable and describes the course in detail. All the presentations are very good. The LMS is truly great, and the good thing is that you get lifetime access to the course. Edureka support is excellent: they provide quick resolution of any issue related to your course, which saves a lot of time. If you are looking for Big Data courses, Edureka is the right place.
Ajai Singh: I am a Sr. Database Engineer working for a leading energy management company in Boston, USA. I am attending a very useful Edureka course on Apache Spark and Scala; the course contents are really very good.
Ganapathy Govindan: I love Edureka courses because the content keeps evolving (trending technologies keep changing, so an online training provider needs to deliver the latest content). The course started with Hadoop 1 and has now evolved to Hadoop 2 (YARN). It started with Spark 1.2 and has now evolved to 1.6. The best part is that students get the latest content at no extra cost. Thanks to Edureka for lifetime access. Edureka has top-notch faculty who deliver the course very interactively and share all their experience. Edureka courses mean learning from experts while sitting at home, without any commuting. Edureka support is excellent: for any question or any help, they are available 24/7. To test it, just call their support at 2 o'clock at night; you will get an immediate response.
Viresh Dagade: I am thankful to Edureka, which is one of the best educational organizations. I have completed two highly rated courses (Big Data and Hadoop, Spark and Scala). I am now doing well with what I learned; after getting certified in Big Data and Hadoop, I am getting many offers from MNCs. After the great experience of learning Hadoop, I am keen to enroll in the Data Science course. I hope to get the same learning experience I had in my previous courses. I heartily thank Edureka for helping me build my career. The overall team (trainers, support team, online support team) is the best.
Kavitha Veluri: The course is well designed and explains the concepts in detail with many real-world examples.
Sai Manoj: "Very good. Helps improve our skills.
1. Good course material and structured modules in each course
2. Helpful instructors even during the class / off-the class help
3. Good blend of business examples / technical content in the slides.
4. Quotes of good examples in the usage of technical concepts in a real-time scenario
5. Good real-time examples and work out in the class."
Revathi: I have learned a lot from this training, and you really helped me understand all the concepts very well. Thanks again.