Learn how to use Python and Spark 3.0 (PySpark) for Data Engineering and Data Analytics on Big Data Cloud Platforms
Description
The key objectives of this course are as follows:
- Learn Spark Architecture
- Learn Spark Execution Concepts
- Learn Spark Transformations and Actions using the Structured API
- Learn Spark Transformations and Actions using the RDD (Resilient Distributed Datasets) API
- Learn how to set up your own local PySpark Environment
- Learn how to interpret the Spark Web UI
- Learn how to interpret the DAG (Directed Acyclic Graph) of a Spark Execution
The PySpark project that we are going to build together covers the following steps (a code sketch follows the list):
Sales Data
- Create a Spark Session
- Read a CSV file into a Spark DataFrame
- Learn to Infer a Schema
- Select data from the Spark DataFrame
- Produce analytics showing the topmost sales orders per Region and Country
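The snippet below is a minimal sketch of these project steps, not the course's exact code. It assumes a hypothetical CSV file named `sales_data.csv` with illustrative columns `Region`, `Country`, `Order ID`, and `Total Revenue`; the dataset used in the course may differ.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.window import Window

# Create a Spark Session
spark = SparkSession.builder.appName("SalesData").getOrCreate()

# Read a CSV file into a Spark DataFrame, letting Spark infer the schema
sales = (
    spark.read
    .option("header", True)
    .option("inferSchema", True)
    .csv("sales_data.csv")  # hypothetical file path
)
sales.printSchema()

# Select the columns of interest from the DataFrame
orders = sales.select("Region", "Country", "Order ID", "Total Revenue")

# Topmost sales order per Region and Country: rank orders by revenue
# within each (Region, Country) group and keep the highest-ranked row
w = Window.partitionBy("Region", "Country").orderBy(F.col("Total Revenue").desc())
top_orders = (
    orders
    .withColumn("rank", F.row_number().over(w))
    .filter(F.col("rank") == 1)
    .drop("rank")
)
top_orders.show()
```

A window function is used here so that each (Region, Country) group keeps only its single highest-revenue order; a groupBy with an aggregate would lose the other order columns.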
Who this course is for:
- Python Developers who wish to learn how to use the language for Data Engineering and Analytics with PySpark
- Aspiring Data Engineering and Analytics Professionals
- Data Scientists / Analysts who wish to learn an analytical processing strategy that can be deployed over a big data cluster
- Data Managers who want to gain a deeper understanding of managing data over a cluster
Course link: https://www.udemy.com/course/introduction-to-python-for-big-data-engineering-with-pyspark/