Stream & Unify Data Schemas with CDC

This course is part of Real-Time, Real Fast: Kafka & Spark for Data Engineers Specialization

Instructors: Starweaver

3 modules

Gain insight into a topic and learn the fundamentals.

Intermediate level

Recommended experience

5 hours to complete

Flexible schedule

Learn at your own pace

What you'll learn

Explain CDC fundamentals (binlog/WAL) and schema evolution strategies.
Configure a Schema Registry pipeline locally using Debezium and Kafka.
Use streaming SQL (Flink/ksqlDB) to map, cast, and merge divergent schemas into a canonical model.

Details to know

Shareable certificate

Add to your LinkedIn profile

Assessments

5 assignments (AI graded)

Taught in English

Build your subject-matter expertise

This course is part of the Real-Time, Real Fast: Kafka & Spark for Data Engineers Specialization

When you enroll in this course, you'll also be enrolled in this Specialization.

There are 3 modules in this course

Imagine deploying schema changes with confidence—knowing your pipeline will handle them gracefully, consumers will stay healthy, and your data will stay consistent. That's the difference between hoping your CDC pipeline works and knowing it will. In this course you will learn how to build a working, vendor‑neutral CDC pipeline and a single, unified table from evolving source schemas. Starting with Debezium streaming changes from Postgres/MySQL into Kafka, you will use Schema Registry to enforce compatibility, then apply streaming SQL in Flink (or ksqlDB) to map, cast, and merge divergent fields into a canonical model. Finally, you will persist results to an Apache Iceberg table and query it instantly with Trino. Along the way, you’ll learn practical strategies to manage schema drift, choose compatibility modes (backward/full), and avoid breaking downstream consumers. Everything runs locally with Docker so you can reproduce it anywhere and take the same patterns to your cloud stack later.
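The first stage described above, Debezium streaming changes from Postgres into Kafka, is usually started by registering a connector with the Kafka Connect REST API. The following is a minimal sketch of the request body only, not the course's own configuration. It assumes Debezium 2.x property names and the Confluent Avro converter; the host names, credentials, table names, and the helper `build_postgres_connector` are hypothetical placeholders.

```python
import json


def build_postgres_connector(name, host, dbname, tables, registry_url):
    """Build the JSON body for a Kafka Connect `POST /connectors` request."""
    return {
        "name": name,
        "config": {
            "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
            "plugin.name": "pgoutput",           # logical decoding plugin built into Postgres 10+
            "database.hostname": host,
            "database.port": "5432",
            "database.user": "postgres",         # placeholder credentials
            "database.password": "postgres",     # placeholder credentials
            "database.dbname": dbname,
            "topic.prefix": name,                # topics become <prefix>.<schema>.<table>
            "table.include.list": ",".join(tables),
            # Serialize keys and values as Avro so Schema Registry can enforce compatibility.
            "key.converter": "io.confluent.connect.avro.AvroConverter",
            "key.converter.schema.registry.url": registry_url,
            "value.converter": "io.confluent.connect.avro.AvroConverter",
            "value.converter.schema.registry.url": registry_url,
        },
    }


payload = build_postgres_connector(
    name="shop",
    host="postgres",
    dbname="shop",
    tables=["public.customers", "public.orders"],
    registry_url="http://schema-registry:8081",
)
print(json.dumps(payload, indent=2))
```

In a local Docker setup, this body would typically be posted to the Connect worker (commonly on port 8083), after which one topic per included table appears in Kafka.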

This course is designed for engineers working with Kafka, Debezium, and streaming SQL who need reliable schema evolution and canonical modeling skills. Learners should be familiar with basic SQL and Docker, and have some exposure to Kafka or streaming concepts. By the end of the course, you will be able to implement a small end‑to‑end CDC pipeline that streams from a source database and unifies evolving schemas into a single queryable table.

Deploy a local Debezium, Kafka, Schema Registry, and Flink/ksqlDB stack to observe row-level changes in real-time. Intentionally modify the source schema, then employ streaming SQL to map, cast, and coalesce fields into a canonical table. Perform upserts using stable keys and verify the data is correctly stored in Iceberg. By the conclusion, you will have established an operational CDC loop and a unified, queryable dataset.
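The map, cast, and coalesce step is written in streaming SQL in the course. The same logic can be sketched in plain Python to show what each operation does. This is an illustration under assumed field names, not the course's schema: the "old" rows use `cust_id`, `amount`, and `email_address`, the "new" rows use `customer_id`, `amount_cents`, and `email`, and the helpers `first_present` and `to_canonical` are hypothetical.

```python
from decimal import Decimal


def first_present(row, *names):
    """COALESCE: return the first non-null value among the named fields."""
    for name in names:
        value = row.get(name)
        if value is not None:
            return value
    return None


def to_canonical(row):
    """Map divergent field names and types onto one canonical row shape."""
    customer_id = first_present(row, "customer_id", "cust_id")   # map renamed key
    if customer_id is None:
        raise ValueError("row has no customer key")
    cents = row.get("amount_cents")
    if cents is None and row.get("amount") is not None:
        # Cast a decimal amount in currency units to integer cents.
        cents = int(Decimal(str(row["amount"])) * 100)
    return {
        "customer_id": str(customer_id),                          # cast to one key type
        "amount_cents": None if cents is None else int(cents),
        "email": first_present(row, "email", "email_address"),   # coalesce old and new fields
    }


old_shape = {"cust_id": 7, "amount": "12.50", "email_address": "a@example.com"}
new_shape = {"customer_id": "7", "amount_cents": 1300, "email": None,
             "email_address": "a@example.com"}

# Upsert by stable key: a later row for the same customer replaces the earlier one.
table = {}
for row in (old_shape, new_shape):
    canonical = to_canonical(row)
    table[canonical["customer_id"]] = canonical

print(table)
```

Casting the key to a single type before the upsert matters: without it, `7` and `"7"` would land as two different rows.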

What's included

4 videos, 2 readings, 1 assignment

4 videos (37 minutes total)

Introduction and Welcome (4 minutes)
CDC to Analytics: Complete Architecture Overview (11 minutes)
Data Flow Deep Dive: Source to Lakehouse (12 minutes)
Live Build: Unify Schemas with Streaming SQL (10 minutes)

2 readings (10 minutes total)

Welcome to the Course: Course Overview (5 minutes)
Schema Evolution Additional Resources (5 minutes)

1 assignment (30 minutes total)

Hands On Learning (HOL): CDC Basics & Safe Schema Evolution (30 minutes)

Learn to prevent consumer disruptions by enforcing compatibility at both the subject and global levels. We will deliberately deploy an incompatible schema, observe the failure, and proceed safely using defaults and transitive modes. Implement practical safeguards such as CI schema checks, DLQs, alerts, and lag probes to ensure issues are promptly identified and contained. The emphasis is on repeatable recovery, not heroics.
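To make the "incompatible schema" failure concrete, here is a deliberately simplified sketch of a backward-compatibility check on Avro-style record schemas. It is not Schema Registry's implementation: it only looks at added fields and treats any type change as incompatible, whereas real Avro resolution also allows some type promotions. The function names and the `Customer` schemas are hypothetical.

```python
def backward_problems(old_schema, new_schema):
    """Why can a reader on new_schema NOT decode data written with old_schema?"""
    old_fields = {f["name"]: f for f in old_schema["fields"]}
    problems = []
    for field in new_schema["fields"]:
        old = old_fields.get(field["name"])
        if old is None:
            # Old records lack this field, so the reader needs a default to fill it in.
            if "default" not in field:
                problems.append(f"added field '{field['name']}' has no default")
        elif old["type"] != field["type"]:
            # Simplification: any type change is flagged, including legal promotions.
            problems.append(f"field '{field['name']}' changed type")
    return problems


def transitive_problems(history, candidate):
    """Transitive mode: check the candidate against every earlier version, not just the latest."""
    report = {}
    for version, old in enumerate(history, start=1):
        problems = backward_problems(old, candidate)
        if problems:
            report[version] = problems
    return report


V1 = {"type": "record", "name": "Customer", "fields": [
    {"name": "id", "type": "long"},
    {"name": "email", "type": "string"},
]}
BAD = {"type": "record", "name": "Customer", "fields": V1["fields"] + [
    {"name": "phone", "type": "string"},                               # no default
]}
GOOD = {"type": "record", "name": "Customer", "fields": V1["fields"] + [
    {"name": "phone", "type": ["null", "string"], "default": None},   # nullable with default
]}

print(transitive_problems([V1], BAD))
print(transitive_problems([V1], GOOD))
```

A check of this kind could run in CI before a schema is registered, so the incompatible change fails in review and not in a consumer.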

What's included

3 videos, 1 reading, 1 assignment

Develop a robust canonical model encompassing naming conventions, data types and units, nullability, and soft delete mechanisms, and store it in Iceberg on MinIO utilizing streaming upserts. Perform immediate queries with Trino and employ time-travel features for validation or debugging regressions. The project involves constructing a denormalized “latest per customer” view for analytical purposes, as well as discussing partitioning strategies, equality deletes, and data compaction. Participants will acquire scalable patterns suitable for deployment from laptops to cloud environments.
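The "latest per customer" view with soft deletes can be sketched as a keyed upsert over Debezium-style change events. The envelope fields `op`, `before`, `after`, and `ts_ms` follow Debezium's event format; everything else is an assumption for illustration, including the `customer_id` and `order_id` columns, the `is_deleted` and `updated_ms` bookkeeping fields, and the use of `ts_ms` for ordering (a real pipeline would more likely rely on partition order or a source log position).

```python
def apply_change(view, event):
    """Upsert one change event into a dict keyed by customer_id."""
    op = event["op"]                                    # c=create, u=update, d=delete, r=snapshot read
    row = event["before"] if op == "d" else event["after"]
    key = row["customer_id"]
    current = view.get(key)
    if current is not None and current["updated_ms"] > event["ts_ms"]:
        return                                          # late event: keep the newer state
    if op == "d":
        # Soft delete: keep the last known row and flag it. The before image
        # may carry only the key, so prefer the row already in the view.
        base = current if current is not None else row
        view[key] = {**base, "is_deleted": True, "updated_ms": event["ts_ms"]}
    else:
        view[key] = {**row, "is_deleted": False, "updated_ms": event["ts_ms"]}


events = [
    {"op": "c", "before": None, "after": {"customer_id": "7", "order_id": "A1"}, "ts_ms": 1},
    {"op": "u", "before": None, "after": {"customer_id": "7", "order_id": "A3"}, "ts_ms": 3},
    {"op": "u", "before": None, "after": {"customer_id": "7", "order_id": "A2"}, "ts_ms": 2},  # arrives late
    {"op": "c", "before": None, "after": {"customer_id": "9", "order_id": "B1"}, "ts_ms": 4},
    {"op": "d", "before": {"customer_id": "9"}, "after": None, "ts_ms": 5},
]

view = {}
for event in events:
    apply_change(view, event)

live_customers = sorted(k for k, r in view.items() if not r["is_deleted"])
print(view)
print(live_customers)
```

Keeping the deleted row with a flag, instead of dropping it, lets analytical queries filter on `is_deleted` while still being able to audit what was removed and when.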

What's included

4 videos, 1 reading, 3 assignments

4 videos (36 minutes total)

Canonical Schema Basics (7 minutes)
Streaming SQL Patterns: Casts, Coalesce, Upserts, Joins (13 minutes)
Persist & Query with Iceberg + Trino (12 minutes)
Recap + Next Steps (3 minutes)

1 reading (5 minutes total)

Iceberg Essentials for Stream Sinks (5 minutes)

3 assignments (120 minutes total)

Stream & Unify Data Schemas with CDC (30 minutes)
Hands On Learning (HOL): Building the Latest Customer Orders (30 minutes)
Project: From CDC Streams to Trusted Customer Orders (60 minutes)

Instructors

Starweaver

Coursera

573 courses, 1,166,983 learners

Offered by

Coursera
