Build a job-ready portfolio with these five beginner-friendly data analysis projects.
If you’re getting ready to launch a new career as a data analyst, chances are you’ve encountered an age-old dilemma. Job listings ask for experience, but how do you get experience if you’re looking for your first data analyst job? This is where your data analyst portfolio comes in.
The projects you include in your portfolio demonstrate your skills and experience—even if it’s not from a previous data analytics job—to hiring managers and interviewers. Populating your portfolio with the right projects can go a long way toward building confidence that you’re the right person for the job, even without previous work experience.
In this article, we’ll discuss five types of projects you should include in your portfolio, especially if you’re just starting out. Don't yet have one? In the Google Data Analytics Professional Certificate, you'll demonstrate your proficiency in portfolio-ready projects so you can showcase your work to future employers.
As an aspiring data analyst, you’ll want to demonstrate a few key skills in your portfolio. These data analytics project ideas reflect tasks that are fundamental to many data analyst roles.
While you’ll find no shortage of excellent (and free) public data sets on the internet, you might want to show prospective employers that you’re able to find and scrape your own data as well. Plus, knowing how to scrape web data means you can find and use data sets that match your interests, regardless of whether or not they’ve already been compiled.
If you know some Python, you can use tools like Beautiful Soup or Scrapy to crawl the web for interesting data. If you don’t know how to code, don’t worry. You’ll also find several tools that automate the process (many offer a free trial), like Octoparse or ParseHub.
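To give a feel for what scraping code does, here is a minimal sketch that pulls the rows out of an HTML table. It is an illustration under stated assumptions rather than a recommended setup: it uses `html.parser` from Python's standard library in place of Beautiful Soup or Scrapy so that it runs without installing anything, and the sample page, the `TableScraper` class, and the `scrape_table` helper are all hypothetical names invented for this example.

```python
from html.parser import HTMLParser

# Hypothetical page content. A real project would download this from a
# site whose terms of service allow scraping.
SAMPLE_HTML = """
<table>
  <tr><th>Title</th><th>Location</th></tr>
  <tr><td>Data Analyst</td><td>Chicago</td></tr>
  <tr><td>Junior Analyst</td><td>Remote</td></tr>
</table>
"""


class TableScraper(HTMLParser):
    """Collect the text of every table row as a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.rows = []         # finished rows
        self._row = None       # cells of the row currently being read
        self._in_cell = False  # True while inside a <td> or <th>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._in_cell:
            self._row[-1] += data

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._in_cell:
            self._row[-1] = self._row[-1].strip()
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def scrape_table(html):
    """Parse an HTML string and return its table rows."""
    parser = TableScraper()
    parser.feed(html)
    return parser.rows


rows = scrape_table(SAMPLE_HTML)
header, records = rows[0], rows[1:]
print(header)
print(records)
```

With Beautiful Soup the same idea is usually shorter, because the library finds tags for you. The overall shape stays the same: fetch a page, locate the elements you care about, and collect their text into rows you can save.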
If you’re unsure where to start, here are some websites with interesting data options to inspire your project:
Wikipedia
Job portals
Example web scraping project: Todd W. Schneider of Wedding Crunchers scraped some 60,000 New York Times wedding announcements published from 1981 to 2016 to measure the frequency of specific phrases.
Tip: Anytime you’re scraping data from the internet, remember to respect and abide by each website’s terms of service. Limit your scraping activities so as not to overwhelm a company’s servers, and always cite your sources when you present your data findings in your portfolio.
A significant part of your role as a data analyst is cleaning data to prepare it for analysis. Data cleaning (also called data scrubbing) is the process of removing incorrect and duplicate data, managing any holes in the data, and ensuring consistent formatting.
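The three cleaning tasks named above (removing duplicates, handling holes, and enforcing consistent formatting) can be sketched in a few lines. This is a minimal example that assumes a tiny made-up CSV with `city` and `visits` columns. The data, the column names, and the `clean_rows` helper are hypothetical. It uses only the standard library's `csv` module, though many analysts would reach for a dataframe library instead.

```python
import csv
import io

# Hypothetical "dirty" export: stray whitespace, mixed capitalization,
# a duplicate row, and a missing value.
RAW_CSV = """city,visits
 Boston ,120
boston,120
CHICAGO,
Denver,95
"""


def clean_rows(raw_text):
    """Return de-duplicated rows with consistent formatting."""
    cleaned = []
    seen = set()
    for row in csv.DictReader(io.StringIO(raw_text)):
        # Consistent formatting: trim whitespace, normalize capitalization.
        city = row["city"].strip().title()
        # Holes: record a missing number as None instead of guessing a value.
        visits_text = (row["visits"] or "").strip()
        visits = int(visits_text) if visits_text else None
        # Duplicates: skip rows that match one already kept.
        key = (city, visits)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"city": city, "visits": visits})
    return cleaned


cleaned = clean_rows(RAW_CSV)
for record in cleaned:
    print(record)
```

Note that the duplicate is only detected after formatting is normalized, which is why the formatting step runs first. How to treat the missing value (drop the row, fill it in, or leave it flagged) is a judgment call worth explaining in your portfolio write-up.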
As you look for a data set to practice cleaning, look for one that includes multiple files gathered from multiple sources without much curation. Some sites where you can find “dirty” data sets to work with include:
CDC Wonder
Data.gov
World Bank
Data.world
/r/datasets
Example data cleaning project: This Medium article outlines how data analyst Raahim Khan cleaned a set of daily-updated statistics on trending YouTube videos.
Learn how to collect, clean, sort, evaluate, and visualize data with the Meta Data Analyst Professional Certificate.
Data analysis is all about answering questions with data. Exploratory data analysis, or EDA for short, helps you explore what questions to ask. You can do it separately from or in conjunction with data cleaning. Either way, you’ll want to accomplish the following during these early investigations:
Ask lots of questions about the data.
Discover the underlying structure of the data.
Look for trends, patterns, and anomalies in the data.
Test hypotheses and validate assumptions about the data.
Think about what problems you could potentially solve with the data.
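As a rough illustration of the first few steps, the sketch below summarizes a single column of numbers and flags values that look anomalous. The page-view counts are invented for the example, the `summarize` and `find_outliers` helpers are hypothetical names, and the 1.5 × interquartile range cutoff is a common rule of thumb rather than the only way to define an outlier.

```python
import statistics

# Hypothetical daily page-view counts. Replace with a column from your own data.
daily_views = [102, 98, 110, 105, 99, 101, 480, 97, 103, 100]


def summarize(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }


def find_outliers(values, k=1.5):
    """Flag values more than k * IQR below the first or above the third quartile."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


summary = summarize(daily_views)
outliers = find_outliers(daily_views)
print(summary)
print("Possible anomalies:", outliers)
```

A large gap between the mean and the median, as in this sample, is often the first hint that a few extreme values are present. Whether a flagged value is an error or a genuinely interesting event is exactly the kind of question EDA is meant to raise.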
Example exploratory data analysis project: This data analyst took an existing dataset on American universities from Kaggle in 2013 and used it to explore what makes students prefer one university over another.
An EDA project is an excellent time to take advantage of the wealth of public datasets available online. Here are 10 fun and free datasets to get you started in your explorations.
1. National Centers for Environmental Information: Dig into the world’s largest provider of weather and climate data.
2. World Happiness Report 2021: What makes the world’s happiest countries so happy?
3. NASA: If you’re interested in space and earth science, see what you can find among the tens of thousands of public datasets made available by NASA.
4. US Census: Learn more about the people and economy of the United States with the latest census data from 2020.
5. FBI Crime Data Explorer (CDE): Explore crime data collected by more than 18,000 law enforcement agencies.
6. World Health Organization COVID-19 Dashboard: Track the latest coronavirus numbers by country or WHO region.
7. Latest Netflix Data: This Kaggle dataset (updated in April 2021) includes movie data broken down into 26 attributes.
8. Google Books Ngram: Download the raw data from the Google Books Ngram to explore phrase trends in books published from 1960 to 2015.
9. NYC Open Data: Discover New York City through its many publicly available datasets on topics ranging from the Central Park squirrel population to motor vehicle collisions.
10. Yelp Open Dataset: See what you can find while exploring this collection of Yelp user reviews, check-ins, and business attributes.
Sentiment analysis, typically performed on textual data, is a technique in natural language processing (NLP) for determining whether data is neutral, positive, or negative. It may also be used to detect a particular emotion based on a list of words and their corresponding emotions (known as a lexicon).
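The lexicon idea can be shown with a very small sketch: look up each word in a list of scored words, add up the scores, and label the text by the sign of the total. The eight-word lexicon, the sample reviews, and the `classify` function below are all hypothetical and far too small for real use. The sketch also ignores negation ("not good"), sarcasm, and context, which real NLP tools attempt to handle.

```python
import re

# Tiny hypothetical lexicon: word -> score (+1 positive, -1 negative).
LEXICON = {
    "great": 1, "love": 1, "excellent": 1, "good": 1,
    "bad": -1, "terrible": -1, "hate": -1, "boring": -1,
}


def classify(text, lexicon=LEXICON):
    """Label text as positive, negative, or neutral by summing word scores."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(lexicon.get(word, 0) for word in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


reviews = [
    "I love this movie, the acting is excellent",
    "Terrible plot and boring characters",
    "It arrived on Tuesday",
]
for review in reviews:
    print(classify(review), "-", review)
```

Swapping in a published lexicon and running the function over scraped reviews would turn this into a small but complete portfolio project.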
This type of analysis works well with public review sites and social media platforms, where people are likely to offer public opinions on various subjects.
To get started exploring what people feel about a certain topic, you can start with sites like:
Amazon (product reviews)
Rotten Tomatoes (movie reviews)
News sites
Example sentiment analysis project: This blog post on Towards Data Science explores the use of linguistic markers in Tweets to help diagnose depression.
Learn how to use Google Cloud for sentiment analysis from Google itself in their short, interactive project Entity and Sentiment Analysis with the Natural Language API.
Humans are visual creatures, which makes data visualization a powerful tool for transforming data into a compelling story to encourage action. Great visualizations are not only fun to create, but they also have the power to make your portfolio look beautiful.
Example data visualization project: Data analyst Hannah Yan Han visualizes the skill level required for 60 different sports to determine which are the toughest.
You don’t need to pay for advanced visualization software to start creating stellar visuals either. These are just a few of the free visualization tools you can use to start telling a story with data:
1. Tableau Public: Tableau ranks among the most popular visualization tools. Use the free version to transform spreadsheets or files into interactive visualizations (here are some examples from April 2021).
2. Google Charts: This gallery of interactive charts and data visualization tools makes it easy to embed visualizations within your portfolio using HTML and JavaScript code. A robust Guides section walks you through the creation process.
3. Datawrapper: Copy and paste your data from a spreadsheet or upload a CSV file to generate charts, maps, or tables—no coding required. The free version allows you to create unlimited visualizations to export as PNG files.
4. D3 (Data-Driven Documents): With a bit of technical know-how, you can do a ton with this JavaScript library.
5. RAW Graphs: This open source web app makes it easy to turn spreadsheets or CSV files into a range of chart types that might otherwise be difficult to produce. The app even provides sample data sets for you to experiment with.
There’s nothing wrong with populating your portfolio with mini projects highlighting individual skills. But if you’ve scraped the web for your own data, you might also consider using that same data to complete an end-to-end project. To do this, take the data you scraped and apply the main steps of data analysis to it—clean, analyze, and interpret.
This can show a potential employer that you have the essential skills of a data analyst and know how they fit together.
Another great way to build some portfolio-ready projects is through a project-based online course. Here are some of the most recommended courses on Coursera:
Complete projects to add to your portfolio with the Google Data Analytics Professional Certificate. When you complete the program, you'll also get access to career resources.
Deepen and demonstrate your Python capabilities with the University of Michigan's Python for Everybody Specialization.
Practice using Power BI, a common data analysis tool used to transform data into insights with custom reports and dashboards, with the Microsoft Power BI Data Analyst Professional Certificate.
There are many great books for those just starting out in data analytics. The following three books, in particular, offer accessible introductions to key aspects of the field:
Data Analytics Made Accessible by Dr. Anil Maheshwari
Numsense! Data Science for the Layman: No Math Added by Annalyn Ng and Kenneth Soo
Python for Everybody: Exploring Data in Python 3 by Dr. Charles Russell Severance
To supplement their reading, beginners may also consider taking the online Python for Everybody Specialization offered by the University of Michigan and taught by Dr. Severance himself.
Data visualization is the process of representing data graphically. Common forms include graphs, charts, and diagrams that make otherwise abstract data sets easier to understand. Today, data visualization is considered a key skill in the world of data analytics.
Beginning data analysts should make sure they have a solid technical understanding of Structured Query Language (SQL), Microsoft Excel, and either R or Python. Additionally, they should be able to think critically, present confidently, and know how to tell their data’s story visually. Read more about these and other key data analyst skills.
Editorial Team
This content has been made available for informational purposes only. Learners are advised to conduct additional research to ensure that courses and other credentials pursued meet their personal, professional, and financial goals.