Having worked in data analytics for more than 3 years, I can say that data visualization plays a huge role in providing insightful recommendations. I have always wished I had quick notes to prepare for interviews, whether I am attending one or recruiting analysts. So I put together some notes on the plots I have worked with over the years. Below are some of the most widely used plots in data analytics and data science. These are just quick notes on when each one is predominantly used.
Scatter plot:
Bi-variate analysis
Trends and patterns between two variables
Understand the relationship between two variables
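A scatter plot shows the relationship visually; a common way to put a number on it is the Pearson correlation coefficient. Below is a minimal sketch using only the Python standard library. The function name `pearson_r` and the ad spend and sales figures are made up for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 points)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Made-up illustrative data: ad spend vs. sales
ad_spend = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 5]
r = pearson_r(ad_spend, sales)
print(round(r, 3))  # 0.775, a fairly strong positive relationship
```

A value near +1 or -1 corresponds to points lying close to a straight line on the scatter plot, while a value near 0 suggests no linear relationship.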
Pair plot:
Multiple scatter plots to show how more than 2 variables relate to each other
With a little tweaking, such as adding regression lines, it shows how strongly the variables are related to each other
Heat map:
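Color-coded visual where the color intensity reflects the value of a measure
Correlation matrix: to understand positive & negative correlation between pairs of variables
Confusion matrix: to see how well a model performed, based on counts such as TP (true positives) and TN (true negatives)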
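The numbers that a confusion-matrix heat map colors can be counted directly from actual and predicted labels. This is a small sketch under assumed inputs: the helper `confusion_counts` and the label lists are hypothetical, and the positive class is assumed to be `1`.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count TP, TN, FP and FN for a binary classifier."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            # Predicted positive: correct if the actual label is positive too
            counts["TP" if a == positive else "FP"] += 1
        else:
            # Predicted negative: a miss if the actual label was positive
            counts["FN" if a == positive else "TN"] += 1
    return counts


# Made-up labels for illustration
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

counts = confusion_counts(actual, predicted)
# Rows are actual classes (negative, positive); columns are predicted classes
matrix = [[counts["TN"], counts["FP"]],
          [counts["FN"], counts["TP"]]]
accuracy = (counts["TP"] + counts["TN"]) / len(actual)
print(matrix)    # [[3, 1], [1, 3]]
print(accuracy)  # 0.75
```

The `matrix` grid is what a heat map would render, with darker cells on the diagonal indicating a model that gets most predictions right.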
Bar plot:
Bi-variate as well as multivariate analysis
Compare numbers or frequencies across categorical variables
Example: pre- vs post-campaign performance
Count plot:
Find the most frequent category from the groups
Uni-variate analysis (e.g. age or gender frequency)
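The frequencies a count plot draws as bar heights can be tallied with `collections.Counter` from the standard library. The `genders` list below is made-up sample data.

```python
from collections import Counter

# Made-up sample of a categorical column
genders = ["F", "M", "F", "F", "M", "F"]

counts = Counter(genders)  # one tally per category
most_common_category, frequency = counts.most_common(1)[0]
print(counts["F"], counts["M"])            # 4 2
print(most_common_category, frequency)     # F 4
```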
Stacked bar plot:
Categorizing and comparing the parts of a whole
Example: out of the male and female groups, how many survived?
Line chart:
Analyze trends and patterns over a period of time or across a specific variable
Example: time series analysis plot
Box plot:
Useful as one of the important steps in data preparation
To find outliers in data
Shows the interquartile range (IQR)
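The IQR and the outlier cut-offs that a box plot draws can be sketched with the standard library. This assumes the common 1.5 × IQR rule for the whiskers; the helper name `iqr_outliers` and the data are made up, and note that quartile conventions differ slightly between tools, so other libraries may report slightly different quartiles for the same data.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return (iqr, lower_fence, upper_fence, outliers) using the k * IQR rule."""
    # quantiles(..., n=4) returns the three quartile cut points Q1, Q2, Q3
    q1, _median, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return iqr, lower_fence, upper_fence, outliers


# Made-up data with one obviously extreme value
data = [12, 15, 14, 10, 13, 16, 14, 95]
iqr, lower_fence, upper_fence, outliers = iqr_outliers(data)
print(outliers)  # [95]
```

Points outside the two fences are the ones a box plot typically draws as individual dots beyond the whiskers.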
Histogram:
Allows us to analyze data distribution
To find skewness: whether the data is positively or negatively skewed
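The skew you read off a histogram can also be checked numerically. Below is a minimal sketch of the moment-based skewness coefficient (third central moment divided by the second central moment raised to the power 1.5). The function name `skewness` and the sample values are made up for illustration.

```python
def skewness(values):
    """Moment-based skewness: m3 / m2 ** 1.5 (population moments)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 values")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        raise ValueError("skewness is undefined for constant data")
    return m3 / m2 ** 1.5


# Made-up samples for illustration
right_tailed = [1, 2, 2, 3, 3, 3, 10]    # long tail to the right
left_tailed = [-v for v in right_tailed]  # mirror image, long tail to the left

print(skewness(right_tailed) > 0)  # True: positively skewed
print(skewness(left_tailed) < 0)   # True: negatively skewed
```

A value close to 0 suggests a roughly symmetric distribution.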
Gauge chart:
Predominantly used as a goal/target indicator
Shows whether the target has been achieved or how far the metric is from it
Pie chart:
Proportion of each category out of the total
A visual variant of the pie chart is the doughnut chart
Figure: doughnut chart (top) and pie chart (bottom)
Funnel chart:
Understand how data flows through the successive stages of a process
Most widely used in analyzing campaign & website performance
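What a funnel chart conveys is the drop-off between consecutive stages. A minimal sketch of that calculation follows. The stage names, counts, and the helper `step_conversion` are all made up for illustration.

```python
def step_conversion(stages):
    """Conversion rate from each stage to the next, as (label, rate) pairs."""
    rates = []
    for (prev_name, prev_count), (name, count) in zip(stages, stages[1:]):
        # Guard against an empty previous stage to avoid dividing by zero
        rate = count / prev_count if prev_count else 0.0
        rates.append((f"{prev_name} -> {name}", rate))
    return rates


# Made-up email campaign funnel
funnel = [
    ("Emails sent", 10000),
    ("Opened", 2500),
    ("Clicked", 500),
    ("Purchased", 50),
]

rates = step_conversion(funnel)
for label, rate in rates:
    print(f"{label}: {rate:.0%}")
# Emails sent -> Opened: 25%
# Opened -> Clicked: 20%
# Clicked -> Purchased: 10%
```

The stage with the lowest conversion rate is usually where the funnel chart narrows most sharply.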
Table chart with indicators:
A simple display of data with KPI (Key Performance Indicator) metrics
Helps show the rate of performance (e.g. the chart below shows the weekly movement of Open Rate)
Tree map:
Hierarchical structure of the data, shown as nested rectangles with a parent-child hierarchy
Provides insightful analysis based on color and size
Bonus: Difference between heat map and tree map:
In a heat map, one measure can be assigned to the color and another measure can be assigned to the size. The layout is similar to a table, with values encoded as colors.
In a tree map, 1 or more dimensions and up to 2 measures are used to build the chart. The bigger the node, the greater the value. The nested rectangles are ordered by size from the top left corner of the chart to the bottom right corner, with the largest at the top left and the smallest at the bottom right.