What Does a Data Warehouse Architect Do?
October 4, 2024
Elevate your coding skills with data engineering. Use big data for decision-making, analysis, AI, and machine learning.
Instructors: Kennedy Behrman
3,837 already enrolled
(38 reviews)
Recommended experience
Intermediate level
Experience in working with Python, Git for version control, Docker for containerization and Kubernetes for deployment and scaling; also a strong foundation in linear algebra and statistics.
Create scalable big data pipelines (Hadoop, Spark, Snowflake, Databricks) for efficient data handling.
Build machine learning workflows (PySpark, MLflow) on Databricks for seamless model development and deployment.
Implement DataOps/DevOps to streamline data engineering processes.
Formulate and communicate data-driven insights and narratives through impactful visualizations with Python and data storytelling.
Learn how to use data engineering to leverage big data for business strategy, data analysis, or machine learning and AI. By completing this course series, you'll gain the knowledge and proficiency required to build efficient data pipelines, manage platforms like Hadoop, Spark, Snowflake, Databricks, and Kubernetes, and tell stories with data through visualization. You will explore foundational big data concepts, distributed computing with Spark, Snowflake’s architecture, Databricks’ machine learning capabilities, Python techniques for data visualization, and critical methodologies like DataOps.
This course series is designed for software engineers, developers, researchers, and data scientists who want to strengthen their specialization in data science or machine learning, as well as for professionals who are interested in pursuing a career as a data-focused software engineer, data scientist, or data engineer working in cloud, machine learning, business intelligence, or another field.
Applied Learning Project
The Specialization features a capstone project focused on using Databricks’ API to replicate an existing project. This provides hands-on experience working with Databricks to build a portfolio-ready data solution. You will apply Python to a variety of data engineering tasks.
Create scalable data pipelines (Hadoop, Spark, Snowflake, Databricks) for efficient data handling.
Optimize data engineering with clustering and scaling to boost performance and resource use.
Build ML solutions (PySpark, MLflow) on Databricks for seamless model development and deployment.
Implement DataOps and DevOps practices for continuous integration and deployment (CI/CD) of data-driven applications, including automating processes.
Master virtualization, containerization, and Docker, including Dockerfile creation and multi-container orchestration with Compose and Airflow.
Develop expertise in Kubernetes core concepts, cluster architecture, and deployment using cloud environments, GitHub Codespaces, and AI-driven tools.
Navigate data scenarios by mastering containerization, deploying apps, and addressing production issues with cloud orchestration and SRE practices.
Apply Python, spreadsheets, and BI tooling proficiently to create visually compelling and interactive data visualizations.
Formulate and communicate data-driven insights and narratives through impactful visualizations and data storytelling.
Assess and select the most suitable visualization tools and techniques to address organizational data needs and objectives.
Duke University has about 13,000 undergraduate and graduate students and a world-class faculty helping to expand the frontiers of knowledge. The university has a strong commitment to applying knowledge in service to society, both near its North Carolina campus and around the world.
The course series takes approximately 5 months to complete.
The course series is designed to be completed in the order outlined here on this page.
Note that the Specialization Certificate does not represent official academic credit from the partner institution offering the course. Duke cannot provide a transcript for your completion of the Specialization; however, we encourage you to share your Coursera completion certificate with your employer and community to demonstrate your completion of the course series.
This Specialization teaches learners how to create and scale data pipelines for big data using Hadoop, Spark, Snowflake, and Databricks, build machine learning workflows with PySpark and MLflow, implement DataOps/DevOps to streamline data engineering processes, and develop data visualizations with Python.
This course is completely online, so there’s no need to show up to a classroom in person. You can access your lectures, readings and assignments anytime and anywhere via the web or your mobile device.
If you subscribed, you get a 7-day free trial during which you can cancel at no penalty. After that, we don’t give refunds, but you can cancel your subscription at any time. See our full refund policy.
Yes! To get started, click the course card that interests you and enroll. You can enroll and complete the course to earn a shareable certificate, or you can audit it to view the course materials for free. When you subscribe to a course that is part of a Specialization, you’re automatically subscribed to the full Specialization. Visit your learner dashboard to track your progress.
Yes. In select learning programs, you can apply for financial aid or a scholarship if you can’t afford the enrollment fee. If financial aid or a scholarship is available for your learning program selection, you’ll find a link to apply on the description page.
When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. If you only want to read and view the course content, you can audit the course for free. If you cannot afford the fee, you can apply for financial aid.