Google Cloud Data Engineer

Live Online (VILT) & Classroom Corporate Training Course

Master Google Cloud's data engineering tools and prepare for the Google Cloud Professional Data Engineer certification. This course covers data storage, processing, and machine learning integration on Google Cloud.

  • CloudLabs
  • Projects
  • Assignments
  • 24x7 Support
  • Lifetime Access

Overview

The Google Cloud Data Engineer certification training prepares learners to design, build, maintain, and troubleshoot data processing systems on Google Cloud. This course covers data management, processing, and machine learning on Google Cloud’s powerful data infrastructure, equipping participants to make data-driven decisions by leveraging Google Cloud solutions.

Objectives

At the end of the Google Cloud Data Engineer training course, participants will be able to:

  • Design and build data processing systems on Google Cloud
  • Process batch and streaming data by implementing autoscaling data pipelines on Cloud Dataflow
  • Derive business insights from extremely large datasets using Google BigQuery
  • Train, evaluate, and make predictions with machine learning models using TensorFlow and Cloud ML
  • Leverage unstructured data using Spark and ML APIs on Cloud Dataproc
  • Enable instant insights from streaming data

Prerequisites

  • Basic knowledge of SQL and data modeling.
  • Familiarity with general cloud computing concepts.
  • Experience with data warehousing or data pipelines.
  • Fundamental understanding of programming (Python or Java recommended).
  • Interest in using data to solve business problems and drive decisions.

Course Outline

Google Cloud Dataproc Overview
  • Creating and managing clusters
  • Leveraging custom machine types and preemptible worker nodes
  • Scaling and deleting clusters
Running Dataproc Jobs
  • Running Pig and Hive jobs
  • Separation of storage and compute
Integrating Dataproc with Google Cloud Platform
  • Customizing clusters with initialization actions
  • BigQuery support
Making Sense of Unstructured Data with Google’s Machine Learning APIs
  • Google’s Machine Learning APIs
  • Common ML Use Cases
  • Invoking ML APIs

Serverless Data Analysis with Google BigQuery and Cloud Dataflow

Serverless Data Analysis with BigQuery
  • What is BigQuery?
  • Queries and Functions
  • Loading data into BigQuery
  • Exporting data from BigQuery
  • Nested and repeated fields
  • Querying multiple tables
  • Performance and pricing
Serverless, Autoscaling Data Pipelines with Dataflow
  • The Beam programming model
  • Data pipelines in Beam Python
  • Data pipelines in Beam Java
  • Scalable Big Data processing using Beam
  • Incorporating additional data
  • Handling stream data
  • GCP Reference architecture

Serverless Machine Learning with TensorFlow on Google Cloud Platform

Getting Started with Machine Learning
  • What is machine learning (ML)?
  • Effective ML: concepts, types
  • ML datasets: generalization
Building ML Models with TensorFlow
  • Getting started with TensorFlow
  • TensorFlow graphs and loops + lab
  • Monitoring ML training
Scaling ML Models with Cloud ML
  • Why Cloud ML?
  • Packaging up a TensorFlow model
  • End-to-end training
Feature Engineering
  • Creating good features
  • Transforming inputs
  • Synthetic features
  • Preprocessing with Cloud ML

Building Resilient Streaming Systems on Google Cloud Platform

Architecture of Streaming Analytics Pipelines
  • Stream data processing: Challenges
  • Handling variable data volumes
  • Dealing with unordered/late data
Ingesting Variable Volumes
  • What is Cloud Pub/Sub?
  • How it works: Topics and Subscriptions
Implementing Streaming Pipelines
  • Challenges in stream processing
  • Handling late data: watermarks, triggers, accumulation
Streaming Analytics and Dashboards
  • Streaming analytics: from data to decisions
  • Querying streaming data with BigQuery
  • What is Google Data Studio?
High Throughput and Low-Latency with Bigtable
  • What is Cloud Spanner?
  • Designing Bigtable schema
  • Ingesting into Bigtable