End to End Beginner’s Guide To Become Data Analyst

Harsha Kumavat
6 min readApr 6, 2023

--

Hello Everyone, If you want to become data analyst then this blog is for you. This article will help you to understand who is data analyst, what work they do, what skills are required along with free resources to acquire those skills, portfolio projects, certifications, cheat sheets, internships, resume making & interview questions. Let’s get started.

A data analyst is a person who collects, processes, and analyses large datasets in order to uncover insights, trends, and patterns. They use various tools and techniques to organize and interpret data, as well as present their findings to help businesses or organizations make informed decisions. Data analysts work in many different industries, including healthcare, finance, marketing, and technology.

A data analyst is someone who examines information such as numbers or words to discover patterns or meaning. It’s similar to being a detective, but instead of solving mysteries, they use data to solve problems.

Example: Assume a retailer wanted to know what their most popular toy was. A data analyst would analyze all of the sales data and determine which toy was sold the most, and then inform the store so that they could ensure they had enough of that toy in stock.

Let’s understand this steps in detail:

Define the problem: Define and understand the problem that needs to be solved.

Collect the data: Collect the data required for analysis.

Data Cleaning: Data cleaning is the process of preparing data for analysis by removing/filling missing values, remove duplicates, handling errors, and so on.

Data Analysis & Visualization: Analyze the data to understand the underlying pattern, trends, characteristics etc. by visualizing them to get better insights.

Interpret the results: After analyzing the data, the next step is to interpret the results and make recommendations or steps that can be implemented based on the analysis to solve the problem.

Communicate the results: Present your findings to stakeholders through a presentation, report, dashboard, and so on.

Let’s understand the data analysis process using a real world scenario:

So, imagine you’re a data analyst at a retail company, and you’ve been tasked with analyzing sales data to identify trends and opportunities to increase revenue. The procedure is outlined below:

Define the Problem: At first you will understand the problem you’re trying to solve. So In this particular scenarios you are trying to identify trends and opportunities to increase revenue.

Collect the data: Collect relevant data from various sources. In this case, you might gather sales data from your company’s database, as well as external data such as industry reports and so on.

Data Cleaning: After collecting the data, you need to clean and prepare it for analysis. This process includes removing/filling missing values, remove duplicates, handling errors and so on.

Data Analysis & Visualization: Now it’s time to analyze the data. Analyze the data to identify patterns and trends in the data, creating visualizations to help you better understand the data.

Interpret the results: Once you have analyzed the data, you need to interpret the results. What patterns and trends you identified? Will your analysis helps in solving the problem? Are there any specific opportunities to increase revenue that you identified? Make recommendations based on your analysis that can be implemented to solve the problem.

Communicate the results: Finally, you need to communicate your findings to the relevant stakeholders. This process involves creating reports or presentations or dashboards that clearly explain your analysis and recommendations.

To be a successful data analyst, one must have both technical and soft skills. The most important technical and soft skills are listed below:

Resources to learn technical skills are listed below:

Excel

Topics:

  • Basic formulas: SUM, AVERAGE, MEAN, MEDIAN, PRODUCT, CONCATENATE
  • Advance formulas: VLOOKUP, INDEX, MATCH, IF, COUNTIF, SUMIF
  • Conditional Formatting
  • Basic Charts, filters, slicers & timeline
  • Pivot tables and pivot charts
  • Ignore VBA, Macros, etc.

Videos:

Reading material:

Projects:

Statistics

Topics:

  • Measures of Central tendency(Mean, Median, Mode)
  • Measures of Variability (Range, Variance, Standard Deviation, Percentiles, Quartiles, IQR)
  • Sampling techniques
  • Correlation & regression analysis
  • Probability distributions
  • Basic principles of hypothesis testing, confidence intervals.

Videos:

Reading material:

SQL

Topics:

  • Basics of relational databases, schemas, tables, different database types, SQL data types.
  • SQL commands: DDL, DML, DCL, TCL, DQL.
  • Basic Queries: Select, Limit, Where, Operators, Distinct, Aliasing, Comments, Aggregate functions, Group by, Having, Order by, Joins, Union, Union All.
  • Functions: String functions, Date functions, Numeric functions, Advanced functions.
  • Advanced Queries: Advanced filtering( Like, wildcard operators, between, IN), cte ,views, temporary tables, window functions( over and partition by, row_number(), rank(), dense_rank(), lead(), lag()), subqueries, stored procedures and functions.

Videos:

Reading material:

Projects:

Alex the Analyst: There are 2 guided projects they are SQL data exploration and data cleaning in SQL.

Adam Finer: This project helps you from creating a SQL database, to querying and exploring the data, to visualize the data.

For detailed SQL Roadmap for Data Science view my blog: https://harshakumavat.medium.com/sql-for-data-science-roadmap-d8b867de3100

Python

Topics:

  • Variables, Strings, Lists, Dictionaries, Tuples, Sets, If condition, for & while loops, Operators, functions, Modules, file handling, classes and objects, exception handling.
  • Libraries: Pandas, Numpy,Matplotlib & Seaborn.
  • Pandas: DataFrame basics, read/write csv, excel and JSON files, data manipulation and analysis- Group by, pivot, Concatenate, Merge.
  • Matplotlib & Seaborn: Basic charts( Bar, Line, area pie, boxplot, histogram, heatmap)

Videos:

Reading material:

Projects:

Internships

Internships with detailed information are listed in blog: https://harshakumavat.medium.com/virtual-data-science-internships-d7624fe0a945

Connect with me on LinkedIn if you have any questions. I’m delighted to assist!!!

LinkedIn: www.linkedin.com/in/harshakumavat2000

Please leave a comment and a clap to show your support for my blog. Follow me on medium for more blogs like this.

--

--

Harsha Kumavat
Harsha Kumavat

Written by Harsha Kumavat

As a Data Engineer, I love building data pipelines and optimizing datasets to extract valuable insights. LinkedIn: www.linkedin.com/in/ harshakumavat2000.

Responses (7)