Essential Chart Types for Data Visualization


Charts are an essential part of working with data, as they condense large amounts of data into an easy-to-understand format. Visualizations can bring out insights for someone looking at the data for the first time, as well as convey findings to others who won’t see the raw data. There are countless chart types out there, each with different use cases. Often, the most difficult part of creating a data visualization is figuring out which chart type is best for the task at hand.

Your choice of chart type will depend on multiple factors. What are the types of metrics, features, or other variables that you plan on plotting? Who is the audience that you plan on presenting to – is it just an initial exploration for yourself, or are you presenting to a broader audience? What is the kind of conclusion that you want the reader to draw?

In this article, we’ll provide an overview of essential chart types that you’ll see most frequently offered by visualization tools. With these charts, you will have a broad toolkit to be able to handle your data visualization needs. Guidance on when to select each one based on use case is covered in a follow-up article.

The Foundational Four

In his book Show Me the Numbers, Stephen Few suggests four major encodings for numeric values, indicating positional value via bars, lines, points, and boxes. So we’ll start off with four basic chart types, one for each of these value-encoding means.

Bar chart

This bar chart shows the number of purchases made by different user types

In a bar chart, values are indicated by the length of bars, each of which corresponds with a measured group. Bar charts can be oriented vertically or horizontally; vertical bar charts are sometimes called column charts. Horizontal bar charts are a good option when you have a lot of bars to plot, or the labels on them require additional space to be legible.

Line chart

This line chart shows changes in a currency exchange rate over time

Line charts show changes in value across continuous measurements, such as those made over time. Movement of the line up or down helps bring out positive and negative changes, respectively. It can also expose overall trends, to help the reader make predictions or projections for future outcomes. Multiple line charts can also give rise to other related charts like the sparkline or ridgeline plot.

Scatter plot

This scatter plot demonstrates a moderate linear correlation between two numeric variables

A scatter plot displays values on two numeric variables using points positioned on two axes: one for each variable. Scatter plots are a versatile demonstration of the relationship between the plotted variables—whether that correlation is strong or weak, positive or negative, linear or non-linear. Scatter plots are also great for identifying outlier points and possible gaps in the data.
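The strength and direction of a linear relationship like the one in the example chart are often summarized with a correlation coefficient. The sketch below computes Pearson's r in plain Python; the function name and the sample data are made up for illustration, and the coefficient only captures linear association, so it complements the plot rather than replacing it.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (proportional to the covariance)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: two numeric variables measured on five items
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 5, 4, 5]
r = pearson_r(x_values, y_values)
print(f"r = {r:.2f}")
```

Values of r near 1 or -1 indicate a strong linear relationship, while values near 0 indicate a weak one. A curved pattern can still produce a low r, which is one reason to look at the scatter plot itself.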

Box plot

This box plot compares the distribution of a numeric variable for three levels of a categorical variable

A box plot uses boxes and whiskers to summarize the distribution of values within measured groups. The positions of the box and whisker ends show the regions where the majority of the data lies. We most commonly see box plots when we have multiple groups to compare to one another; other charts with more detail are preferred when we have only one group to plot.
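As a rough illustration of what the box and whiskers encode, the sketch below computes the summary values for one group. It assumes one common convention (box at the quartiles, whiskers reaching the most extreme points within 1.5 times the interquartile range); visualization tools differ in both the quartile method and the whisker rule. The function name and data are hypothetical, and `statistics.quantiles` requires Python 3.8 or later.

```python
import statistics

def box_plot_stats(values):
    """Summary values for one box, assuming the 1.5 * IQR whisker convention."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme data points that lie inside the fences
    inside = [v for v in values if low_fence <= v <= high_fence]
    outliers = sorted(v for v in values if v < low_fence or v > high_fence)
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }

# Hypothetical measurements for a single group
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
stats = box_plot_stats(data)
print(stats)
```

In this made-up sample, the value 30 falls beyond the upper fence, so it would be drawn as a separate outlier point rather than stretching the whisker.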

Tables and single values

Single statistics can be reported as they are rather than as a chart

Before moving on to other chart types, it’s worth taking a moment to appreciate the option of just showing the raw numbers. In particular, when you only have one number to show, just displaying the value is a sensible approach to depicting the data. When exact values are of interest in an analysis, you can include them in an accompanying table or through annotations on a graphical visualization.

Common Variations

Additional chart types can come about from changing the ways encodings are used, or by including additional encodings. Secondary encodings like area, shape, and color can be useful for adding additional variables to more basic chart types.

Histogram

This histogram shows the distribution of response times to a ticketing system, grouped by hours

If the groups depicted in a bar chart are actually continuous numeric ranges, we can push the bars together to generate a histogram. Bar lengths in histograms typically correspond to counts of data points, and their patterns demonstrate the distribution of variables in your data. A different chart type, like a line chart, tends to be used when the vertical value is not a frequency count.
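The counting step behind a histogram can be sketched in a few lines. This is a minimal version that assumes equal-width bins which include their left edge and exclude their right edge; the function name, parameters, and response times are invented for illustration.

```python
def histogram_counts(values, bin_start, bin_width, n_bins):
    """Count how many values fall into each equal-width bin."""
    counts = [0] * n_bins
    for v in values:
        idx = int((v - bin_start) // bin_width)
        # Values outside the covered range are ignored in this sketch
        if 0 <= idx < n_bins:
            counts[idx] += 1
    return counts

# Hypothetical response times in hours, grouped into one-hour bins
response_hours = [0.5, 1.2, 1.7, 2.1, 2.4, 2.9, 3.3, 5.8]
counts = histogram_counts(response_hours, bin_start=0, bin_width=1, n_bins=6)
print(counts)  # [1, 2, 3, 1, 0, 1]
```

Each count becomes the height of one bar. Changing `bin_width` can noticeably change the apparent shape of the distribution, so it is worth trying more than one value.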

Stacked bar chart

This stacked bar chart shows revenue by store location, divided by department

One modification of the standard bar chart, the stacked bar chart, divides each bar into multiple smaller bars based on values of a second grouping variable. This allows you to not only compare primary group values like in a regular bar chart, but also illustrate a relative breakdown of each group’s whole into its constituent parts.
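Stacking comes down to giving each sub-bar a starting offset equal to the running total of the sub-bars beneath it. The sketch below computes those offsets; the function name, store names, and revenue figures are hypothetical.

```python
def stack_segments(groups):
    """Map each primary group to a list of (sub_group, bottom, top) segments."""
    layout = {}
    for primary, parts in groups.items():
        bottom = 0
        segments = []
        for sub_group, value in parts.items():
            # Each segment starts where the previous one ended
            segments.append((sub_group, bottom, bottom + value))
            bottom += value
        layout[primary] = segments
    return layout

# Hypothetical revenue by store location and department
revenue = {
    "Downtown": {"Clothing": 40, "Toys": 25, "Home": 35},
    "Airport": {"Clothing": 20, "Toys": 10, "Home": 15},
}
layout = stack_segments(revenue)
for store, segments in layout.items():
    print(store, segments)
```

The top of the last segment in each stack is the primary group total. Only the bottom segment shares a common baseline across bars, which is why sub-groups higher in the stack are harder to compare.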

Grouped bar chart

This grouped bar chart shows new quarterly revenue divided by representative

If, on the other hand, the sub-bars were placed side-by-side into clusters instead of kept in their stacks, we would obtain the grouped bar chart. The grouped bar chart does not allow for comparison of primary group totals, but does a much better job of allowing for comparison of the sub-groups.
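To make the side-by-side layout concrete, here is a small sketch of how bar centers could be computed for each cluster. It assumes primary groups sit at integer positions 0, 1, 2, and so on, and that each cluster occupies 80% of the space between them; both choices, and the function name, are illustrative only.

```python
def grouped_bar_positions(n_groups, n_subgroups, cluster_width=0.8):
    """Return the bar width and the center position of every bar, per cluster."""
    bar_width = cluster_width / n_subgroups
    positions = []
    for group_index in range(n_groups):
        # Left edge of the cluster centered on this group's integer position
        left_edge = group_index - cluster_width / 2
        centers = [left_edge + (s + 0.5) * bar_width for s in range(n_subgroups)]
        positions.append(centers)
    return bar_width, positions

# Hypothetical layout: 2 quarters, each with bars for 2 representatives
bar_width, positions = grouped_bar_positions(n_groups=2, n_subgroups=2)
print(bar_width, positions)
```

Because every bar starts from the same baseline, sub-group values can be compared directly, both within a cluster and across clusters.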

Dot plot

This dot plot shows differences in performance for different experimental conditions

A dot plot is like a bar chart in that it indicates values for different categorical groupings, but encodes values based on a point’s position rather than a bar’s length. Dot plots are useful when you need to compare across categories, but the zero baseline is not informative or useful. You can also think of a dot plot as a line chart with the line removed, so that it can be used with unordered categories rather than just continuous or ordered variables.

Area chart

This area chart shows number of daily trips, divided by user type

An area chart starts with the same foundation as a line chart – value points connected by line segments – but adds in a concept from the bar chart with shading between the line and a baseline. This chart is most often seen when combined with the concept of stacking, to show not only how a total has changed over time, but also how its components’ contributions have changed.

Dual-axis chart

This dual-axis bar+line chart shows number of new customers and average acquisition cost over time

Dual-axis charts overlay two different charts with a shared horizontal axis, but potentially different vertical axis scales (one for each component chart). This can be useful to show a direct comparison between the two sets of vertical values, while also including the context of the horizontal-axis variable. It is common to use different base chart types, like the bar and line combination, to reduce confusion of the different axis scales for each component chart.

Bubble chart

This bubble chart shows the relationship between three numeric variables by x-position, y-position, and point size

Another way of showing the relationship between three variables is through modification of a scatter plot. When a third variable is categorical, points can use different shapes or colors to indicate group membership. If the data points are ordered in some way, points can also be connected with line segments to show the sequence of values. When the third variable is numeric in nature, that is where the bubble chart comes in. A bubble chart builds on the base scatter plot by having the third variable’s value determine the size of each point.
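One detail worth care is how the third variable maps to point size. A common recommendation, assumed here rather than taken from this article, is to make bubble area proportional to the value, which means the radius grows with the square root of the value. The function name, default maximum radius, and data below are hypothetical.

```python
import math

def bubble_radii(values, max_radius=20.0):
    """Scale radii so that bubble AREA is proportional to each value.

    Assumes all values are non-negative and at least one is positive.
    """
    largest = max(values)
    return [max_radius * math.sqrt(v / largest) for v in values]

# Hypothetical third-variable values for three points
sizes = [25, 100, 400]
radii = bubble_radii(sizes)
print(radii)
```

With this scaling, a value 4 times larger gets a bubble with 4 times the area but only twice the radius. Scaling the radius directly would exaggerate differences between points.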

Density curve

This density curve shows a smoothed distribution, built by adding a small amount of area around each data point

The density curve, or kernel density estimate, is an alternative to the histogram for showing distributions of data. Rather than collecting data points into frequency bins, each data point contributes a small amount of area centered on its value, and these contributions add up to form the density curve. While density curves may imply some data values that do not exist, they can be a good way to smooth out noise in the data to get an understanding of the distribution signal.
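The idea of each point contributing a little area can be sketched directly. The version below assumes a Gaussian kernel, which is a common choice but not the only one, and takes the bandwidth (the width of each contribution) as a hand-picked parameter. The function name and sample values are hypothetical.

```python
import math

def make_density(data, bandwidth):
    """Return a function estimating density at x using Gaussian kernels."""
    def density(x):
        total = 0.0
        for point in data:
            z = (x - point) / bandwidth
            # Standard normal curve centered on this data point
            total += math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
        # Normalize so the area under the whole curve is 1
        return total / (len(data) * bandwidth)
    return density

# Hypothetical sample and bandwidth
samples = [1.0, 1.5, 2.0, 4.0]
density = make_density(samples, bandwidth=0.5)

# Evaluate the curve on a grid from -3.0 to 8.0 in steps of 0.1
grid = [i * 0.1 for i in range(-30, 81)]
curve = [density(x) for x in grid]
```

A larger bandwidth gives a smoother curve that may hide real structure, while a smaller one follows individual points more closely. Note that the curve is positive between 2.0 and 4.0 even though the sample has no values there, which is the sense in which a density curve can imply data that does not exist.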

Violin plot

This violin plot compares the distribution of a numeric variable for three levels of a categorical variable

An alternative to the box plot’s approach to comparing value distributions between groups is the violin plot. In a violin plot, each set of box and whiskers is replaced with a density curve built around a central baseline. This can provide a better comparison of data shapes between groups, though this does lose out on comparisons of precise statistical values. A frequent variation for violin plots is to include box-style markings on top of the violin plot to get the best of both worlds.

Heatmap

This heatmap shows new revenue by quarter and representative

The heatmap presents a grid of values based on two variables of interest. The axis variables can be numeric or categorical; the grid is created by dividing each variable into ranges or levels like a histogram or bar chart. Grid cells are colored based on value, often with darker colors corresponding with higher values. A heatmap can be an interesting alternative to a scatter plot when there are a lot of data points to plot, but the point density makes it difficult to see the true relationship between variables.
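For two numeric variables, building the grid amounts to a two-dimensional version of histogram binning. The sketch below counts points per cell, assuming square cells of equal size; the function name, parameters, and points are invented for illustration.

```python
def grid_counts(points, x_min, y_min, cell_size, n_cols, n_rows):
    """Count points per grid cell; the result is indexed as grid[row][col]."""
    grid = [[0] * n_cols for _ in range(n_rows)]
    for x, y in points:
        col = int((x - x_min) // cell_size)
        row = int((y - y_min) // cell_size)
        # Points outside the grid are ignored in this sketch
        if 0 <= col < n_cols and 0 <= row < n_rows:
            grid[row][col] += 1
    return grid

# Hypothetical (x, y) observations binned into a 2 x 2 grid
points = [(0.2, 0.3), (0.8, 0.1), (1.5, 0.5), (1.7, 1.2), (0.4, 1.9), (1.1, 1.6)]
grid = grid_counts(points, x_min=0, y_min=0, cell_size=1, n_cols=2, n_rows=2)
print(grid)  # [[2, 1], [1, 2]]
```

Each count would then be mapped to a color. For categorical axes, the same structure applies, with category levels in place of numeric ranges and a sum or average in place of a count where that suits the data.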

Specialist Charts

There are plenty of additional charts out there that encode data in other ways for particular use cases. Xenographics includes a collection of some fanciful charts that have been driven by very particular purposes. Still, some of these charts have use cases that are common enough that they can be considered essential to know.

Pie chart

This pie chart shows share of votes for candidates following an election

You might be surprised to see pie charts being sequestered here in the ‘specialist’ section, considering how commonly they are utilized. However, pie charts use an uncommon encoding, depicting values as areas sliced from a circular form. Since a pie chart typically lacks value markings around its perimeter, it is usually difficult to get a good idea of exact slice sizes. However, the pie chart and its cousin the donut plot excel at telling the reader that the part-to-whole comparison should be the main takeaway from the visualization.
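The part-to-whole encoding can be made explicit by converting each value into a share of the total and then into a slice angle. This is a minimal sketch; the function name, candidate names, and vote counts are hypothetical.

```python
def pie_slices(counts):
    """Return (label, share, start_angle, end_angle) per slice, in degrees."""
    total = sum(counts.values())
    slices = []
    start = 0.0
    for label, count in counts.items():
        share = count / total
        sweep = 360.0 * share
        slices.append((label, share, start, start + sweep))
        start += sweep
    return slices

# Hypothetical vote counts
votes = {"Candidate A": 50, "Candidate B": 30, "Candidate C": 20}
slices = pie_slices(votes)
for label, share, start, end in slices:
    print(f"{label}: {share:.0%} of votes, {start:.0f} to {end:.0f} degrees")
```

Printing the share next to each label, as done here, is one way to compensate for the difficulty of reading exact slice sizes from the chart alone.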

Funnel chart

This funnel chart shows conversion rates from impressions through clicks

A funnel chart is often seen in business contexts where visitors or users need to be tracked in a pipeline flow. The chart shows how many users make it to each stage of the tracked process from the width of the funnel at each stage division. The tapering of the funnel helps to sell the analogy, but can muddle what the true conversion rates are. A bar chart can often fulfill the same purpose as a funnel chart, but with a cleaner representation of data.
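Since the tapered shape can obscure the actual rates, it can help to compute them directly. The sketch below derives the conversion rate for each stage, both relative to the previous stage and relative to the first stage; the function name, stage names, and counts are made up for illustration.

```python
def funnel_rates(stages):
    """Return (name, count, rate_vs_previous, rate_vs_first) for each stage."""
    first_count = stages[0][1]
    previous_count = first_count
    rows = []
    for name, count in stages:
        rows.append((name, count, count / previous_count, count / first_count))
        previous_count = count
    return rows

# Hypothetical pipeline counts
stages = [
    ("Impressions", 10000),
    ("Clicks", 1200),
    ("Sign-ups", 300),
    ("Purchases", 60),
]
rates = funnel_rates(stages)
for name, count, step_rate, overall_rate in rates:
    print(f"{name}: {count} ({step_rate:.1%} of previous, {overall_rate:.1%} of first)")
```

These are the same numbers a bar chart of stage counts would convey, and labeling the bars with them gives the reader the exact rates.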

Bullet chart

This bullet chart shows pageviews and downloads against goal benchmarks

The bullet chart enhances a single bar with additional markings for how to contextualize that bar’s value. This usually means a perpendicular line showing a target value, but also background shading to provide additional performance benchmarks. Bullet charts are usually used for multiple metrics, and are more compact to render than other types of more fanciful gauges.

Map-based plots

This choropleth shows how many people live in each state of the United States

There are a number of families of specialist plots grouped by usage, but we’ll close this article out by touching upon one of them: map-based or geospatial plots. When values in a dataset correspond to actual geographic locations, it can be valuable to plot them on some kind of map. A common example of this type of plot is the choropleth, like the one above. This takes a heatmap approach to depicting value through the use of color, but instead of values being plotted in a grid, they are filled into regions on a map.