Coursera Project Network

Data Analysis Using Pyspark

Instructor: Ahmad Varasteh

14,801 already enrolled

Included with Coursera Plus

Learn, practice, and apply job-ready skills with expert guidance
4.4

(289 reviews)

Intermediate level

1.5 h
Learn at your own pace
Hands-on learning

What you'll learn

  • Learn how to set up Google Colab for distributed data processing

  • Learn to apply different queries to your dataset to extract useful information

  • Learn how to visualize this information using matplotlib

Details to know

Shareable certificate

Add to your LinkedIn profile

Taught in English
No downloads or installation required

Only available on desktop

Learn, practice, and apply job-ready skills in less than 2 hours

  • Receive training from industry experts
  • Gain hands-on experience solving real-world job tasks
  • Build confidence using the latest tools and technologies

About this Guided Project

Learn step-by-step

In a video that plays in a split-screen with your work area, your instructor will walk you through these steps:

  1. Prepare Google Colab for distributed data processing

  2. Mount Google Drive in the Google Colab environment

  3. Import the first file of the dataset (1 GB) into a PySpark DataFrame

  4. Apply queries to extract useful information from the data

  5. Import the second file of the dataset (3 MB) into a PySpark DataFrame

  6. Join the two DataFrames and prepare the result for more advanced queries

  7. Visualize the query results using matplotlib

Recommended experience

Learners should be familiar with the Python programming language and Spark, and have some experience working in the Google Colab environment.

Instructor

Instructor ratings
4.3 (11 ratings)
Ahmad Varasteh
Coursera Project Network
24 courses, 61,654 learners

How you'll learn

  • Skill-based, hands-on learning

    Practice new skills by completing job-related tasks.

  • Expert guidance

    Follow along with pre-recorded videos from experts using a unique side-by-side interface.

  • No downloads or installation required

    Access the tools and resources you need in a pre-configured cloud workspace.

  • Available only on desktop

    This Guided Project is designed for laptops or desktop computers with a reliable Internet connection, not mobile devices.

Learner reviews

4.4

289 reviews

  • 5 stars: 63.32%

  • 4 stars: 24.22%

  • 3 stars: 8.65%

  • 2 stars: 1.73%

  • 1 star: 2.07%

Showing 3 of 289

  • AM: 4 stars, reviewed on Jan 29, 2021

  • DM: 5 stars, reviewed on Nov 14, 2020

  • AA: 4 stars, reviewed on Jan 22, 2022
