Build Data Engineering Pipelines using SQL, Python and Spark

Description

In this course, you will learn the Data Engineering essentials needed to build data pipelines using SQL, Python, and Spark.

 

About Data Engineering

 

Data Engineering is the practice of processing data to meet downstream needs. It involves building different kinds of pipelines, such as batch pipelines and streaming pipelines. All roles related to data processing are consolidated under Data Engineering. Conventionally, these roles were known as ETL Development, Data Warehouse Development, and so on.
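To make the idea of a batch pipeline concrete, here is a minimal sketch in Python. It is an illustration only, not material from the course: the sample data, the table name completed_orders, and the function names are all hypothetical, and the standard-library sqlite3 module stands in for Postgres so that the example runs without any setup.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; a real pipeline would read from files or a source database.
RAW_ORDERS = """order_id,status,amount
1,COMPLETE,100.0
2,CANCELED,50.0
3,COMPLETE,25.5
"""


def extract(raw_text):
    """Extract: parse raw CSV text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def transform(rows):
    """Transform: keep completed orders and cast fields to proper types."""
    return [
        (int(row["order_id"]), float(row["amount"]))
        for row in rows
        if row["status"] == "COMPLETE"
    ]


def load(conn, records):
    """Load: write the cleaned records into a target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS completed_orders "
        "(order_id INTEGER PRIMARY KEY, amount REAL)"
    )
    conn.executemany("INSERT INTO completed_orders VALUES (?, ?)", records)
    conn.commit()


def run_pipeline(conn, raw_text=RAW_ORDERS):
    """Run extract, transform, and load in order, then summarize the target table."""
    load(conn, transform(extract(raw_text)))
    return conn.execute(
        "SELECT COUNT(*), SUM(amount) FROM completed_orders"
    ).fetchone()


if __name__ == "__main__":
    # sqlite3 in-memory database is an assumption used here in place of Postgres.
    connection = sqlite3.connect(":memory:")
    print(run_pipeline(connection))
```

Each stage is a separate function, so any one of them can be swapped out (for example, loading into Postgres or transforming with Spark) without changing the others.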

 

Course Details

 

You will learn Data Engineering essentials such as SQL and programming using Python and Spark. Here is the detailed agenda for the course.

 

Database Essentials – SQL using Postgres

 

Getting Started with Postgres

 

Basic Database Operations (CRUD: Insert, Select, Update, and Delete)

 

Writing Basic SQL Queries (Filtering, Joins, and Aggregations)

 

Creating Tables and Indexes

 

Partitioning Tables and Indexes

 

Predefined Functions (String Manipulation, Date Manipulation, and other functions)

 

Writing Advanced SQL Queries

 

Programming Essentials using Python

 

Perform Database Operations

 

Getting Started with Python

 

Basic Programming Constructs

 

Predefined Functions

 

Overview of Collections – list and set

 

Overview of Collections – dict and tuple

 

Manipulating Collections using loops

 

Understanding Map Reduce Libraries

 

Overview of Pandas Libraries

 

Database Programming – CRUD Operations

 

Database Programming – Batch Operations

 

Setting up Single Node Cluster for Practice

 

Setup Single Node Hadoop Cluster

 

Setup Hive and Spark on Single Node Cluster

 

Introduction to Hadoop ecosystem

 

Overview of HDFS Commands

 

Data Engineering using Spark SQL

 

Getting Started with Spark SQL

 

Basic Transformations

 

Managing Tables – Basic DDL and DML

 

Managing Tables – DML and Partitioning

 

Overview of Spark SQL Functions

 

Windowing Functions

 

Data Engineering using Spark Data Frame APIs

 

Data Processing Overview

 

Processing Column Data

 

Basic Transformations – Filtering, Aggregations, and Sorting

 

Joining Data Sets

 

Windowing Functions – Aggregations, Ranking, and Analytic Functions

 

Spark Metastore Databases and Tables

 

Desired Audience

 

Here is the desired audience for this course.

 

College students and entry-level professionals who want hands-on expertise in Data Engineering. This course provides enough skills to face interviews for entry-level data engineer roles.

 

Experienced application developers who want to gain expertise in Data Engineering.

 

Conventional Data Warehouse Developers, ETL Developers, Database Developers, and PL/SQL Developers who want to gain the skills to transition into successful Data Engineers.

 

Testers who want to improve their ability to test Data Engineering applications.

 

Any other hands-on IT professional who wants to learn Data Engineering through hands-on practice.

 

Prerequisites

 

Logistics

 

Computer with a decent configuration (at least 4 GB RAM; 8 GB is highly desired)

 

A dual-core processor is required; quad-core is highly desired

 

Chrome Browser

 

High-Speed Internet

 

Desired Background

 

Engineering or Science Degree

 

Ability to use a computer

 

Knowledge or working experience with databases and any programming language is highly desired

 

Training Approach

 

Here are the details related to the training approach.

 

It is self-paced with reference material, code snippets, and videos provided as part of Udemy.

 

You can either use the environment we provide or set up your own using Docker on AWS, GCP, or the platform of your choice.

 

We recommend completing 2 modules every week, spending 4 to 5 hours per week.

 

We highly recommend completing the exercises at the end of each module to ensure that you meet all of its key objectives.

 

Support will be provided through Udemy Q&A.

 

The course is designed so that you can self-evaluate as you go and confirm whether you have acquired the skills.

 

Here is the approach we recommend for taking this course.

 

The course is hands-on with thousands of tasks, so you should practice as you go through it.

 

You should also spend time understanding the concepts. If you do not understand a concept, we recommend moving on and coming back to the topic later.

 

Go through the consolidated exercises and see whether you can solve the problems.

 

Make sure to follow the order we have defined as part of the course.

 

After every section or module, make sure to solve the exercises. We have provided enough information to validate the output.

 

By the end of the course, you should be able to conclude that you have mastered the essential skills related to SQL, Python, and Spark.

 

Who this course is for:

Computer Science or IT students, or other graduates with a passion for getting into IT

Data Warehouse Developers who want to transition to Data Engineering roles

ETL Developers who want to transition to Data Engineering roles

Database or PL/SQL Developers who want to transition to Data Engineering roles

BI Developers who want to transition to Data Engineering roles

QA Engineers who want to learn about Data Engineering

Application Developers who want to gain Data Engineering skills
