Statistical Charts

Statistical charts are graphical tools and techniques for displaying data and the output of data analysis in pictorial form. They include boxplots, scatterplots, pie charts, bar charts, histograms, probability plots, residual plots, and others. They provide insight into a data set and help with testing assumptions, model selection, relationship identification, outlier detection, estimator selection, and more. Moreover, choosing an appropriate statistical chart provides a convincing means of communicating the underlying messages present in the data to others.

Statistical charts have four main objectives:

  1. To explore the content of a data set.
  2. To find the structure of a data set.
  3. To check assumptions in statistical models.
  4. To communicate the results of an analysis.

Examples 

Histogram         

Histogram example

A histogram is a statistical chart that represents the distribution of a quantitative variable. Its purpose is to show the shape of that distribution, for example to judge whether it resembles a particular probability distribution, by displaying the frequency of observations that fall within each range of values (bin). Unlike bar charts, histograms have no space between adjacent columns, because the bins cover a continuous range of values.

For this specific example, the distribution of “Waiter Arrival Time” is roughly normal (bell-shaped). Because a roughly normal distribution is symmetric, its mean and median are approximately equal, and both can be estimated from the graph as the value near the center of the peak.
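The counting behind a histogram can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the method used to produce the figure: the function name `histogram_counts`, the bin settings, and the arrival times are all made up for the example.

```python
from statistics import mean, median


def histogram_counts(values, bin_start, bin_width, n_bins):
    """Count how many values fall into each of n_bins equal-width bins.

    Bin i covers [bin_start + i*bin_width, bin_start + (i+1)*bin_width);
    the last bin also includes its right edge.
    """
    counts = [0] * n_bins
    upper = bin_start + n_bins * bin_width
    for v in values:
        if v < bin_start or v > upper:
            continue  # value lies outside the plotted range
        i = min(int((v - bin_start) // bin_width), n_bins - 1)
        counts[i] += 1
    return counts


# Hypothetical arrival times in minutes (invented data, roughly bell-shaped).
arrival_times = [2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 7, 7, 8]
counts = histogram_counts(arrival_times, bin_start=2, bin_width=1, n_bins=7)

# Text rendering: one row per bin, one '#' per observation.
for i, c in enumerate(counts):
    print(f"{2 + i:>2}-{3 + i:<2} {'#' * c}")

# For symmetric data the mean and the median coincide near the peak.
print("mean:", mean(arrival_times), "median:", median(arrival_times))
```

With symmetric data like this, the tallest row sits in the middle and the mean and median agree, which is why both can be read off a roughly normal histogram.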

Scatterplot 

Scatter plot example

A scatterplot is a graphical representation used to examine the relationship between two quantitative variables. It consists of a horizontal axis (x-axis) and a vertical axis (y-axis). X is usually the explanatory variable, the one that might be related to the response, and Y is the response variable. Each point in a scatterplot represents one observation.

This example shows a positive correlation between height and weight, which means that, in general, as height increases, the weight of a person increases as well.
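The strength of the relationship seen in a scatterplot is often summarized with the Pearson correlation coefficient, which ranges from -1 to 1. The sketch below computes it from scratch; the helper name `pearson_r` and the height and weight values are hypothetical and are not taken from the figure.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical observations: one (height, weight) pair per person.
heights_cm = [150, 160, 165, 170, 175, 180, 190]
weights_kg = [52, 58, 63, 66, 74, 79, 88]

r = pearson_r(heights_cm, weights_kg)
print(f"r = {r:.3f}")  # a value near +1 indicates a strong positive linear relationship
```

A positive value of r matches an upward-sloping cloud of points, and a negative value matches a downward-sloping one. The coefficient only measures linear association, so the plot itself should still be inspected.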

Pareto Chart

Pareto Chart example

A Pareto chart combines a bar chart and a line chart. The bars are arranged in descending order of height, and the line shows the cumulative total. The left vertical axis shows the frequency of occurrence or another important measure. The right vertical axis shows the cumulative percentage of the total number of occurrences. Lastly, the horizontal axis shows the categories of a categorical variable. Because the most frequent category is the tallest bar and appears first, a Pareto chart is very useful for highlighting the most important factor or for deciding which problems need attention first. The categories on the x-axis must be mutually exclusive and exhaustive, so that every observation is counted exactly once.
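The two series plotted in a Pareto chart, the sorted bar heights and the cumulative percentage line, can be derived as in the following sketch. The function name `pareto_table` and the complaint categories are invented for illustration.

```python
from collections import Counter


def pareto_table(observations):
    """Return (category, count, cumulative_percent) rows, most frequent first."""
    counts = Counter(observations)
    total = sum(counts.values())
    rows = []
    running = 0
    for category, count in counts.most_common():  # descending by count
        running += count
        rows.append((category, count, 100.0 * running / total))
    return rows


# Hypothetical complaint log: each entry belongs to exactly one category.
complaints = (
    ["late order"] * 12 + ["cold food"] * 5 + ["wrong item"] * 2 + ["billing"] * 1
)

rows = pareto_table(complaints)
for category, count, cumulative in rows:
    print(f"{category:<12} {count:>3} {cumulative:>6.1f}%")
```

The counts give the bar heights on the left axis and the third column gives the line on the right axis. Since the categories are exclusive and exhaustive, the cumulative percentage always ends at 100.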

Bubble Chart

Bubble Chart example

 

A bubble chart is a chart used to compare data points that have three numeric values each. Data points are represented by bubbles on a regular XY scatterplot. Each data point is plotted using its x value, its y value, and a third value that sets the size of the bubble. Thus, each data point has the form (x, y, z), where z is the size of the bubble. Bubble charts are therefore like XY scatterplots, except that each point has an additional data value associated with it, which is represented by the size of its bubble. Encoding this third value as size makes it easy to compare observations.
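One design choice when drawing bubbles is how z maps to size. A common convention, assumed here rather than stated in the text above, is to make the bubble area proportional to z, so the radius grows with the square root of z. The sketch below uses a hypothetical helper `bubble_radii` and invented data, and assumes all z values are non-negative with a positive maximum.

```python
from math import sqrt


def bubble_radii(z_values, max_radius=20.0):
    """Map z values to radii so that bubble AREA is proportional to z.

    Assumes every z is non-negative and the largest z is positive.
    The largest z gets max_radius (an arbitrary drawing unit).
    """
    z_max = max(z_values)
    return [max_radius * sqrt(z / z_max) for z in z_values]


# Hypothetical data points in the form (x, y, z).
points = [(1.0, 2.0, 25), (2.0, 3.5, 100), (3.0, 1.5, 400)]

radii = bubble_radii([z for _, _, z in points])
for (x, y, z), radius in zip(points, radii):
    print(f"x={x}, y={y}, z={z}, radius={radius:.1f}")
```

Scaling the radius directly by z would instead make a bubble with four times the value look sixteen times larger in area, which can exaggerate differences between observations.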

These are only a few examples of the statistical charts used for data visualization. Many other statistical charts exist, each effective in certain situations. Not all statistical charts have to follow these formats; a chart can be designed creatively to suit the kind of data you have!