
Intro

Two new Data Science charts, the confusion matrix and the correlation matrix, can save you time when identifying exceptions and next steps in your business analysis.

A correlation matrix shows how variables in your DataSet relate to each other. Learn more about correlation matrices.

A confusion matrix summarizes the predictive performance of a machine learning classification model. Learn more about confusion matrices.

Correlation Matrix
correlation matrix.jpg
Confusion Matrix
prediction v actual example.jpg

Explore these charts by the following topics:

Correlation Matrix

A correlation matrix is a table showing the correlation coefficients between pairs of variables. Each variable (Xi) in the table is correlated with each of the other variables in the table, which lets you see which pairs have the highest correlation. The diagonal of the matrix is always a set of ones because the correlation between a variable and itself is always 1. You can fill in the upper-right triangle, but its values repeat those in the lower-left triangle. In other words, a correlation matrix is symmetric.

One example of a correlation matrix is the relationship between a student’s exam score and several other variables, such as “hours of study”, “IQ score”, and “hours of sleeping”. Here is an example table with these variables:
correlation table.jpg
In table form, the data is hard to analyze for correlation. A correlation matrix chart does much of the work for you. Here is a correlation matrix with the identity diagonal and the top half of the chart (duplicate data) removed.
example matrix.jpg
The correlation matrix chart helps you see positive and negative correlations, and which relationships have higher and lower correlation, more easily.
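To make the two properties described above concrete (ones on the diagonal, symmetry across it), here is a minimal sketch in plain Python. It is not how the chart is implemented. The sample values and the helper names `pearson` and `correlation_matrix` are made up for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Correlate every column with every other column (and with itself)."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical sample data, invented for this sketch.
data = {
    "exam_score": [62, 71, 78, 85, 93],
    "hours_of_study": [2, 4, 5, 7, 9],
    "hours_of_sleep": [9, 8, 7, 7, 5],
}

matrix = correlation_matrix(data)
for name, row in matrix.items():
    # Each printed row is one line of the matrix, rounded for readability.
    print(name, [round(row[other], 2) for other in matrix])
```

Each variable paired with itself yields 1, and swapping the two variables in a pair yields the same coefficient, which is why half of the grid can be dropped without losing information.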

Power a Correlation Matrix

A correlation matrix requires at least two numeric columns to populate the x and y axes. You cannot use text columns in a correlation matrix. Choose columns whose relationships you want to examine. In Analyzer, you can choose the columns containing the data for your correlation matrix. For more information about choosing data columns, see Applying DataSet Columns to Your Chart. In the example below, cost, profit, and sales columns from overall Sales data are correlated with one another. You can see how the three inputs correlate with one another using the scale key below the chart.
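If you prepare data outside the product, a quick pre-check along these lines can confirm that a DataSet has enough numeric columns. This is only a sketch of the rule stated above, not how Analyzer validates columns. The column names, values, and function names are hypothetical.

```python
from numbers import Number


def numeric_columns(columns):
    """Return the names of non-empty columns whose values are all numbers."""
    return [
        name
        for name, values in columns.items()
        # bool is a subclass of int in Python, so it is excluded explicitly.
        if values and all(isinstance(v, Number) and not isinstance(v, bool) for v in values)
    ]


def check_correlation_inputs(columns):
    """Raise if fewer than two numeric columns are available."""
    usable = numeric_columns(columns)
    if len(usable) < 2:
        raise ValueError("a correlation matrix needs at least two numeric columns")
    return usable


# Hypothetical sales data; "region" is text and is therefore skipped.
sales = {
    "region": ["East", "West", "North"],
    "cost": [120.0, 95.5, 143.2],
    "profit": [30.0, 41.5, 22.8],
    "sales": [150.0, 137.0, 166.0],
}

usable = check_correlation_inputs(sales)
print(usable)  # ['cost', 'profit', 'sales']
```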
example correlation matrix.jpg

Customize a Correlation Matrix

You can customize the appearance of a correlation matrix by editing its chart properties. For information about all available chart properties, see Chart Properties. Unique properties of a correlation matrix include the following:

Property | Description | Example
General > Positive Color | The color displayed for positive values |
General > Zero Color | The color displayed for ‘zero’ values | image
General > Negative Color | The color displayed for negative values |
General > Cell Border Color | The color of the cell borders |
General > Show Half Grid | Displays the matrix as only a half-grid | show half grid.png
General > Half Grid Position | If a half-grid is selected, this property changes its positioning |
General > Exclude Identity | Removes identity data from the matrix | exclude identity.png
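As a rough illustration of what the half-grid and identity options do to the underlying values, the sketch below blanks out cells of a small matrix. It assumes the lower-left triangle is the half that is kept, which is an assumption of this example and not something the properties guarantee. The function name `half_grid` and the coefficient values are hypothetical.

```python
def half_grid(matrix, exclude_identity=False):
    """Keep the lower-left triangle of a square matrix; blank other cells with None.

    When exclude_identity is True, the diagonal is blanked as well.
    """
    size = len(matrix)
    result = []
    for i in range(size):
        row = []
        for j in range(size):
            keep = j < i or (j == i and not exclude_identity)
            row.append(matrix[i][j] if keep else None)
        result.append(row)
    return result


# Hypothetical correlation coefficients for three variables.
corr = [
    [1.0, 0.8, -0.3],
    [0.8, 1.0, 0.1],
    [-0.3, 0.1, 1.0],
]

for row in half_grid(corr):
    print(row)

for row in half_grid(corr, exclude_identity=True):
    print(row)
```

Because the matrix is symmetric, the blanked cells only repeat values that remain visible in the kept half.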

Power a Confusion Matrix

A confusion matrix requires the following:
  • A ‘Predicted Value’ measure or dimension — the predicted value defaults to the y-axis
  • An ‘Actual Value’ measure or dimension — the actual value defaults to the x-axis
A confusion matrix chart accepts binary input values: 1/0, yes/no, or true/false. In Analyzer, you can choose the columns containing the data for your confusion matrix. For more information about choosing data columns, see our article about Applying DataSet Columns to Your Chart. In the example below, Employee Retention data is used to show how the predicted value of employees being left out of projects compares with the actual number of employees who leave a company.
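The four quadrants of a confusion matrix are counts of how predicted and actual values pair up. The sketch below tallies them from two lists of binary labels in any of the accepted forms. It is an illustration only, not the chart's implementation. The labels and the function names `to_bool` and `confusion_counts` are hypothetical.

```python
def to_bool(value):
    """Map 1/0, yes/no, or true/false (any letter case) to a Python bool."""
    text = str(value).strip().lower()
    if text in {"1", "yes", "true"}:
        return True
    if text in {"0", "no", "false"}:
        return False
    raise ValueError(f"not a binary label: {value!r}")


def confusion_counts(predicted, actual):
    """Count true/false positives and negatives for paired binary labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for p, a in zip(predicted, actual):
        p, a = to_bool(p), to_bool(a)
        if p and a:
            counts["TP"] += 1  # predicted positive, actually positive
        elif p:
            counts["FP"] += 1  # predicted positive, actually negative
        elif a:
            counts["FN"] += 1  # predicted negative, actually positive
        else:
            counts["TN"] += 1  # predicted negative, actually negative
    return counts


# Hypothetical labels: did the model predict the employee would leave, and did they?
predicted = ["yes", "no", "yes", "yes", "no", "no"]
actual = ["yes", "no", "no", "yes", "yes", "no"]

counts = confusion_counts(predicted, actual)
print(counts)  # {'TP': 2, 'FP': 1, 'FN': 1, 'TN': 2}
```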

Customize a Confusion Matrix

You can customize the appearance of a confusion matrix by editing its chart properties. For information about all available chart properties, see Chart Properties. Unique properties of a confusion matrix include the following:

Property | Description | Example
General > Actual Position | Determines the position of the ‘Actual’ values and label |
General > Actual Predicted Color | The fill color for the Actual and Predicted text sections |
General > Positive Label | The text to be used for the positive column label | positive label.jpg
General > Negative Label | The text to be used for the negative column label | negative label.jpg
General > Positive Position | Determines whether the positive column/row will be shown first or last |
General > Positive Negative Color | The fill color for the Positive and Negative text sections |
General > Value Label | The text used for the value label |
General > True Positive Color | Fill color for the True Positive quadrant |
General > True Negative Color | Fill color for the True Negative quadrant |
General > False Positive Color | Fill color for the False Positive quadrant |
General > False Negative Color | Fill color for the False Negative quadrant |
General > Border Width | Width of the border around each section | confusion border width.jpg
General > Border Color | The color for the border drawn around each section |

FAQ

What specific equation is being used to calculate the correlation coefficients?
Pearson correlation coefficients are calculated.
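For reference, the standard textbook definition of the Pearson correlation coefficient for two variables with n paired observations is shown below. The symbols are chosen here for illustration: x_i and y_i are the paired values, and x-bar and y-bar are their means. This article does not say whether the chart applies any additional handling, such as for missing values.

```latex
r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
              {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The result always lies between -1 and 1, and a variable paired with itself gives exactly 1.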