This Professional Certificate is for anyone who wants to develop job-ready skills, tools, and a portfolio for an entry-level data engineer position. Throughout the self-paced online courses, you will immerse yourself in the role of a data engineer and acquire the essential skills you need to work with a range of tools and databases to design, deploy, and manage structured and unstructured data.
By the end of this Professional Certificate, you will be able to explain and perform the key tasks required in a data engineering role. You will use the Python programming language and Linux/UNIX shell scripts to extract, transform, and load (ETL) data. You will work with relational database management systems (RDBMS) and query data using SQL statements, and you will work with NoSQL databases and unstructured data. You will be introduced to Big Data and work with Big Data engines such as Hadoop and Spark. You will gain experience creating data warehouses and using Business Intelligence (BI) tools to analyze data and extract insights.
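As a taste of the ETL work described above, here is a minimal sketch in Python using only the standard library; the source data, table, and field names are hypothetical, invented for illustration:

```python
import csv
import io
import sqlite3

# Hypothetical raw data standing in for a file extracted from a source system.
RAW_CSV = """id,name,salary
1,Alice,70000
2,Bob,65000
3,Carol,72000
"""

def extract(text):
    """Extract: parse CSV text into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: cast types and normalize the name field."""
    return [(int(r["id"]), r["name"].upper(), int(r["salary"])) for r in rows]

def load(records):
    """Load: write the transformed records into a SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE staff (id INTEGER, name TEXT, salary INTEGER)")
    conn.executemany("INSERT INTO staff VALUES (?, ?, ?)", records)
    return conn

conn = load(transform(extract(RAW_CSV)))
print(conn.execute("SELECT COUNT(*), MAX(salary) FROM staff").fetchone())
# → (3, 72000)
```

The same extract-transform-load pattern scales up from this toy pipeline to the Airflow and Kafka tooling covered later in the program.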
This program does not require any prior data engineering or programming experience.
This program is ACE® recommended: when you complete it, you can earn up to 12 college credits.
Applied Learning Project
Throughout this Professional Certificate, you will complete hands-on labs and projects to help you gain practical experience with Python, SQL, relational databases, NoSQL databases, Apache Spark, building data pipelines, managing databases, and working with data warehouses. Projects:
Design a relational database to help a coffee franchise improve operations.
Use SQL to query census, crime, and school demographic data sets.
Write a Bash shell script on Linux that backs up changed files.
Set up, test, and optimize a data platform that contains MySQL, PostgreSQL, and IBM Db2 databases.
Analyze road traffic data to perform ETL and create a pipeline using Airflow and Kafka.
Design and implement a data warehouse for a solid-waste management company.
Move, query, and analyze data in MongoDB, Cassandra, and Cloudant NoSQL databases.
Train a machine learning model by creating an Apache Spark application.
Design, deploy, and manage an end-to-end data engineering platform.
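To illustrate the kind of logic behind the incremental-backup project above (which the course implements as a Bash script), here is a minimal sketch of the same idea in Python; the directory names and helper functions are hypothetical, not taken from the course materials:

```python
import hashlib
import shutil
from pathlib import Path

def file_hash(path):
    """Return the SHA-256 digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def backup_changed(src, dst):
    """Copy files from src to dst only when missing or changed.

    Returns the list of file names that were copied on this run.
    """
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for f in src.iterdir():
        if not f.is_file():
            continue
        target = dst / f.name
        # Copy only when the backup is missing or its contents differ.
        if not target.exists() or file_hash(target) != file_hash(f):
            shutil.copy2(f, target)
            copied.append(f.name)
    return copied
```

On a first run every file is copied; a second run copies nothing until a source file actually changes, which is the essence of an incremental backup.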