When you need to visualize a large number of data points, a scatter plot is one of the best tools for evaluating trends and relationships. A scatter plot, also known as a scatter chart or a scatter graph, is a good visualization tool when you don’t need to understand each individual data point, but you want to see how datasets relate to each other or what trends are present.
A scatter plot charts the relationship between two variables, one on the x-axis and one on the y-axis. Each x,y pair is charted as a single point. To build a scatter plot you need two columns of data, one supplying the values for each axis.
The resulting plot points reveal trends in the data, with the shape of the plot points telling a story behind your dataset. A scatter plot can show trends, clusters, patterns, and relationships at the same time, in one chart.
When Should You Use a Scatter Plot?
Scatter plots really shine when they are used for large datasets. While visualizing trends and groups of data, they also allow you to find outliers that warrant further investigation and show the general distribution of the data.
There are several ways scatter plots can visualize relationships within your data. A scatter plot will show a positive or negative correlation in the data (with both values increasing, or one increasing while the other decreases). Other relationships that can be revealed include exponential increases or decreases and linear trends. A scatter plot can also show no correlation between the data.
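One way to put a number on the positive, negative, or absent correlation described above is the Pearson correlation coefficient, which ranges from -1 (perfectly negative) through 0 (no linear relationship) to 1 (perfectly positive). The following is a minimal sketch using only the Python standard library. The function name pearson_r and the price and sales figures are hypothetical, chosen for illustration, and Pearson's r only measures linear relationships, so it would understate an exponential one.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient for two paired columns."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length columns with at least 2 rows")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and of squared deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a column is constant")
    return cov / sqrt(var_x * var_y)

# Hypothetical data: price per unit and units sold at that price.
prices = [10, 12, 14, 16, 18]
units_sold = [200, 180, 150, 130, 100]
r = pearson_r(prices, units_sold)
print(f"r = {r:.3f}")
```

With these made-up figures r comes out very close to -1, which matches what the scatter plot would show: units sold fall steadily as the price rises.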
Using a scatter plot in your business intelligence allows you to highlight the normal tendency of the values of your data. This can be helpful in answering business questions, some as simple as, “Does reducing the price of a product increase sales?” By charting product prices and the number of units sold at each price, you can see how the two variables relate to each other.
Scatter plots can also be useful in identifying if there is a relationship between two variables. For example, you could ask the question, “Are my employees more productive in the office when they use their PTO?” A scatter plot can show if there is a positive, negative, or negligible relationship between the number of days of PTO an employee uses and the number of customer calls resolved, software bugs updated, or prospecting leads found.
Using a Scatter Plot in Different Business Scenarios
You can use scatter plots to help inform business decisions for a wide range of use cases across departments, including:
Develop user personas. A scatter plot can be used to see clusters of data. This can be helpful when developing user personas based on user characteristics. For example, chart the total value of a user’s purchases against basic demographic data like age. You will be able to see if there is a correlation between the two variables and if the data falls into meaningful groups.
Track product performance. Use a scatter plot to visualize how a suite of products is performing. Plot the number of units sold against the revenue each product produces. This can help you understand if products are priced correctly and if there is any variance in revenue across product lines.
Identify high-spending departments. You can use a scatter chart to plot out how much each department spent over the last year. You will be able to quickly identify outliers with high or low spending and see if different departments group together.
Plot out sales trends. A scatter plot is useful for understanding if two variables may be related. For example, a coffee shop could track sales of iced coffee vs. the temperature outside to see if there is any relationship between sales and temperature. If there is a trend, they could use this data to better plan for future high-volume sales days.
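Outliers like the high-spending or low-spending departments mentioned above can also be flagged numerically before you look at the chart. The sketch below uses the common 1.5 x IQR rule of thumb, which is an assumption on our part rather than a rule any particular BI tool applies. The function name iqr_outliers, the department names, and the spending figures are all hypothetical.

```python
from statistics import quantiles

def iqr_outliers(values_by_label, k=1.5):
    """Return the labels whose value lies more than k * IQR outside the quartiles."""
    # quantiles(..., n=4) returns the three quartile cut points (Q1, median, Q3).
    q1, _, q3 = quantiles(values_by_label.values(), n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return {
        label: value
        for label, value in values_by_label.items()
        if value < low or value > high
    }

# Hypothetical annual spend per department, in thousands of dollars.
spend = {
    "Finance": 40,
    "HR": 42,
    "Support": 45,
    "Sales": 47,
    "Engineering": 50,
    "Operations": 52,
    "Marketing": 120,
}
flagged = iqr_outliers(spend)
print(flagged)
```

With these made-up figures only Marketing is flagged. A flagged value is a prompt for further investigation, not proof that anything is wrong.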
Using a Scatter Plot in a Dashboard
While scatter plots are useful in identifying trends and outliers, you’ll likely need other chart types to support a BI dashboard. When you identify an outlier in your data, you can use another chart to support additional investigation into why it may have occurred.
For example, use a scatter plot to chart out the performance of a suite of products. If one product has a high number of units sold but a lower overall revenue, it will require further investigation. Use a multi-line chart to track the number of units sold and the average price of the product over the last 12 months to see if there are any discrepancies.
Use a bubble chart to dig deeper on a small cluster of data, using the size of the bubbles to add a third variable and see if it affects trends in your chart. Or, use a pie chart to see what portion of the overall revenue each of the top five products had compared to the rest of the product lines.
Best Practices for Using a Scatter Plot
There are some key points to consider when utilizing a scatter plot to visualize your metrics.
Remember that correlation is not causation.
Just because the points on your graph show a general trend doesn’t mean one variable causes the other, or that you are interpreting the data correctly. Scatter plots are excellent at showing trends and relationships, but before creating your scatter plot, make sure the two variables can plausibly be related. You will also need to look at other data points and other data visualizations to account for additional variables that may affect the data.
Consider using trend lines.
Often the trends in your scatter plot will be easy to understand. But sometimes it will be harder to see general trends. Including a trend line can help readers interpret the data and understand relationships between the variables. The trend line shows the mathematical best fit for the data, which makes correlations and predictive trends easier to identify.
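As an illustration of what a best-fit trend line involves, here is a minimal sketch of a straight-line fit by ordinary least squares, using only the Python standard library. Least squares is one common definition of "best fit" and is our assumption here. Charting tools may offer other fits, such as exponential or polynomial. The function name fit_trend_line and the temperature and sales figures are hypothetical.

```python
def fit_trend_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length columns with at least 2 rows")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("cannot fit a line when every x value is the same")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    # The least-squares line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: outside temperature (°F) and iced coffees sold that day.
temps = [60, 65, 70, 75, 80, 85]
iced_coffee_sales = [22, 30, 41, 48, 61, 68]
slope, intercept = fit_trend_line(temps, iced_coffee_sales)
predicted_at_90 = slope * 90 + intercept
print(f"sales ≈ {slope:.2f} * temp + {intercept:.2f}")
```

With these made-up figures the line predicts about 78 iced coffees on a 90-degree day. Extrapolating beyond the observed range like this assumes the linear trend continues, which the data alone cannot confirm.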
Be conscious of overplotting.
Scatter plots lend themselves to charting a large number of data points and identifying trends. However, you can have too many data points. When a chart is overplotted, the dots overlap and form a densely packed blob, making it difficult to see the relationship between the variables. To avoid this, you can break the data into smaller segments and plot each on its own chart. Or you can plot only a subset of the data points, small enough to reduce overlap while still getting value from each individual point.
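Both remedies for overplotting, splitting the data into segments and plotting fewer points, can be applied before the data reaches the chart. The sketch below shows one possible approach using only the Python standard library. The function names, the "north" and "south" segments, the 500-point cap, and the randomly generated rows are all hypothetical. Uniform random sampling is only one way to thin the data, and it can drop rare outliers.

```python
import random
from collections import defaultdict

def split_by_segment(rows):
    """Group (segment, x, y) rows into one list of (x, y) points per segment."""
    groups = defaultdict(list)
    for segment, x, y in rows:
        groups[segment].append((x, y))
    return dict(groups)

def downsample(points, max_points, seed=0):
    """Return at most max_points points, chosen uniformly at random."""
    if len(points) <= max_points:
        return list(points)
    # A fixed seed keeps the chart stable between refreshes.
    return random.Random(seed).sample(points, max_points)

# Hypothetical dataset: 10,000 rows of (region, x, y).
rng = random.Random(42)
rows = [
    (rng.choice(["north", "south"]), rng.random(), rng.random())
    for _ in range(10_000)
]

by_region = split_by_segment(rows)
sampled = {region: downsample(pts, 500) for region, pts in by_region.items()}
for region, pts in sampled.items():
    print(region, len(pts))
```

Each region can then be plotted on its own chart with at most 500 points.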
Scatter plots are a great way to visualize trends in large datasets. They can be a powerful tool for understanding correlational trends and predicting future paths. To learn more about other charts and data visualization techniques, check out our Learn Center.