Be able to create and interpret basic bar graphs, dotplots, pie charts, histograms, stem-and-leaf plots, line charts, and time-series diagrams.

Slide 3.2-

Definition

The distribution of a variable refers to the way its values are spread over all possible values. We can summarize a distribution in a table or show a distribution visually with a graph.
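As a minimal sketch of summarizing a distribution in a table, the snippet below counts how often each value occurs and converts the counts to relative frequencies. The function name `frequency_table` and the list of grades are assumptions made up for illustration; they are not the data from Table 3.1.

```python
from collections import Counter

def frequency_table(values):
    """Return {value: (frequency, relative_frequency)} for a list of data values."""
    counts = Counter(values)
    total = len(values)
    return {value: (n, n / total) for value, n in counts.items()}

# Hypothetical essay grades, invented for illustration only.
grades = ["A", "B", "C", "C", "B", "C", "D", "A", "C", "F"]
table = frequency_table(grades)

for grade in sorted(table):
    freq, rel = table[grade]
    print(f"{grade}: frequency={freq}, relative frequency={rel:.0%}")
```

The same table can then be shown visually with any of the graphs described in the rest of this section.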

Bar Graphs, Dotplots, and Pareto Charts

A bar graph is one of the simplest ways to picture a distribution. Bar graphs are commonly used for qualitative data.

Each bar represents the frequency (or relative frequency) of one category: the higher the frequency, the longer the bar. The bars can be either vertical or horizontal.

Let’s create a vertical bar graph from the essay grade data in Table 3.1.

• Because the highest frequency is 9 (the frequency for C grades), we chose to make the vertical scale run from 0 to 10. This ensures that even the tallest bar does not quite touch the top of the graph.

• The graph should not be too short or too tall. In this case, it looks about right to choose a total height of 5 centimeters (as shown in the text), which is convenient because it means that each centimeter of height corresponds to a frequency of 2.

• The height of each bar should be proportional to its frequency. For example, because each centimeter of height corresponds to a frequency of 2, the bar representing a frequency of 4 should have a height of 2 centimeters.

• Because the data are qualitative, the widths of the bars have no special meaning, and there is no reason for them to touch each other. We therefore draw them with uniform widths.
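The scaling rule in the bullets above can be sketched as a short calculation. Only the frequency of 9 for C grades, the 0 to 10 scale, and the 5-centimeter height come from the slides; the other frequencies (including which grade has a frequency of 4) and the function name `bar_heights_cm` are assumptions for illustration.

```python
def bar_heights_cm(frequencies, scale_max, total_height_cm):
    """Convert frequencies to bar heights, given the top of the scale and the graph height."""
    cm_per_unit = total_height_cm / scale_max  # 5 cm / 10 = 0.5 cm per unit of frequency
    return {category: freq * cm_per_unit for category, freq in frequencies.items()}

# C = 9 is from the slide; the remaining values are hypothetical.
frequencies = {"A": 4, "B": 7, "C": 9, "D": 3, "F": 2}
heights = bar_heights_cm(frequencies, scale_max=10, total_height_cm=5)

for grade, height in heights.items():
    print(f"{grade}: frequency {frequencies[grade]} -> bar height {height} cm")
```

With this scale, a frequency of 4 gives a 2-centimeter bar, and the tallest bar (frequency 9) stays just below the 5-centimeter top of the graph.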

Important Labels for Graphs

Title/caption: The graph should have a title or caption (or both) that explains what is being shown and, if applicable, lists the source of the data.

Vertical scale and label: Numbers along the vertical axis should clearly indicate the scale. The numbers should line up with the tick marks—the marks along the axis that precisely locate the numerical values. Include a label that describes the variable shown on the vertical axis.

Important Labels for Graphs (cont.)

Horizontal scale and label: The categories should be clearly indicated along the horizontal axis. (Tick marks may not be necessary for qualitative data, but should be included for quantitative data.) Include a label that describes the variable shown on the horizontal axis.

Legend: If multiple data sets are displayed on a single graph, include a legend or key to identify the individual data sets.

A dotplot is a variation on a bar graph in which we use dots rather than bars to represent the frequencies. Each dot represents one data value.

Figure 3.2 Dotplot for the essay grade data in Table 3.1.
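To illustrate the idea that each dot stands for one data value, here is a rough text-only sketch of a dotplot. The function name `text_dotplot` and the short list of grades are assumptions for illustration, not the data behind Figure 3.2, and the sketch assumes single-character category labels.

```python
from collections import Counter

def text_dotplot(values, categories):
    """Build a text dotplot: one '*' per data value, stacked above its category."""
    counts = Counter(values)
    top = max(counts.values(), default=0)
    rows = []
    # Work from the highest stack level down to level 1.
    for level in range(top, 0, -1):
        rows.append(" ".join("*" if counts[c] >= level else " " for c in categories))
    rows.append(" ".join(categories))
    return "\n".join(rows)

# Hypothetical grades, invented for illustration only.
sample = ["A", "B", "B", "C", "C", "C"]
plot = text_dotplot(sample, ["A", "B", "C"])
print(plot)
```

Counting the dots above a category gives its frequency, which is the same information a bar graph encodes as bar length.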

A bar graph in which the bars are arranged in frequency order is often called a Pareto chart.

TIME OUT TO THINK

Would it be practical to make a dotplot for the population data in Figure 3.3? Would it make sense to make a Pareto chart for data concerning SAT scores? Explain.

Definitions

A bar graph consists of bars representing frequencies (or relative frequencies) for particular categories. The bar lengths are proportional to the frequencies.

A dotplot is similar to a bar graph, except each individual data value is represented with a dot.

A Pareto chart is a bar graph with the bars arranged in frequency order. Pareto charts make sense only for data at the nominal level of measurement.
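The only step that separates a Pareto chart from an ordinary bar graph is the ordering of the bars. The sketch below sorts categories from highest to lowest frequency and prints a rough text bar for each. The function name `pareto_order` and the complaint counts are hypothetical, chosen only to give nominal-level data with no natural order.

```python
def pareto_order(frequencies):
    """Return (category, frequency) pairs sorted from highest to lowest frequency."""
    return sorted(frequencies.items(), key=lambda item: item[1], reverse=True)

# Hypothetical nominal-level data, invented for illustration only.
complaints = {"Late delivery": 12, "Damaged item": 30, "Wrong item": 7, "Billing error": 19}
ordered = pareto_order(complaints)

for category, freq in ordered:
    print(f"{category:<14} {'#' * freq} ({freq})")
```

Reordering like this is harmless for nominal data, but it would scramble categories that already have a meaningful order, such as letter grades or score ranges.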

Pie Charts

Pie charts are usually used to show relative frequency distributions. A circular pie represents the total relative frequency of 100%.

Figure 3.5 Party affiliations of registered voters in Rochester County

Definition

A pie chart is a circle divided so that each wedge represents the relative frequency of a particular category. The wedge size is proportional to the relative frequency. The entire pie represents the total relative frequency of 100%.
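Because wedge size is proportional to relative frequency, each wedge angle is the relative frequency times 360 degrees. The sketch below works this out. The function name `wedge_angles` and the voter counts are assumptions for illustration; they are not the data behind Figure 3.5.

```python
def wedge_angles(frequencies):
    """Return {category: (relative_frequency, wedge_angle_in_degrees)}."""
    total = sum(frequencies.values())
    return {cat: (f / total, 360 * f / total) for cat, f in frequencies.items()}

# Hypothetical counts of registered voters, invented for illustration only.
voters = {"Party A": 40, "Party B": 35, "Other": 25}
wedges = wedge_angles(voters)

for party, (rel, angle) in wedges.items():
    print(f"{party}: {rel:.0%} of the pie, {angle:.0f} degrees")
```

The relative frequencies add up to 100% and the angles add up to 360 degrees, which is why the wedges fill the circle exactly.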

A bar graph in which the bars have a natural order and the bar widths have specific meaning is called a histogram.

The bars in a histogram touch each other because there are no gaps between the categories.
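To show why histogram bars touch, the sketch below sorts quantitative values into equal-width bins that cover the scale with no gaps, so each bar starts exactly where the previous one ends. The function name `bin_counts`, the bin width of 10, and the data values are assumptions for illustration only.

```python
def bin_counts(values, start, width, n_bins):
    """Count values in consecutive bins [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_bins:  # values outside the binned range are ignored
            counts[index] += 1
    return counts

# Hypothetical quantitative data, invented for illustration only.
data = [3, 7, 12, 15, 18, 21, 22, 29, 35, 38]
counts = bin_counts(data, start=0, width=10, n_bins=4)

for i, count in enumerate(counts):
    low = i * 10
    print(f"{low:>2} to under {low + 10:<2} {'#' * count}")
```

Here the bar width (10 units) has a specific meaning, and the bins follow the natural numerical order of the data.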

The stem-and-leaf plot (or stemplot) looks somewhat like a histogram turned sideways, except in place of bars we see a listing of data for each category.

Figure 3.9 Stem-and-leaf plot for the energy use data from Table 3.3. (Column labels: Stem, Leaves.)

Another type of stem-and-leaf plot lists the individual data values. For example, the first row shows the data values 0.3 and 0.7.

Figure 3.10 Stem-and-leaf plot showing numerical data, in this case the per-person carbon dioxide emissions from Table 3.11.

Definitions

A histogram is a bar graph showing a distribution for quantitative data (at the interval or ratio level of measurement); the bars have a natural order and the bar widths have specific meaning.

A stem-and-leaf plot (or stemplot) is somewhat like a histogram turned sideways, except in place of bars we see a listing of data.
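As a rough sketch of how a numerical stem-and-leaf plot is built, the code below splits each value into a stem (the whole-number part) and a leaf (the tenths digit). It assumes non-negative values recorded to one decimal place. The function name `stem_and_leaf` is an assumption, and apart from 0.3 and 0.7 (the first-row values mentioned in the slides) the data are hypothetical.

```python
def stem_and_leaf(values):
    """Return {stem: [leaves]} where stem is the whole-number part and leaf the tenths digit."""
    tenths = sorted(round(v * 10) for v in values)
    # Include every stem between the smallest and largest so empty rows still appear.
    rows = {stem: [] for stem in range(tenths[0] // 10, tenths[-1] // 10 + 1)}
    for t in tenths:
        rows[t // 10].append(t % 10)
    return rows

# 0.3 and 0.7 appear in the slides; the other values are hypothetical.
emissions = [0.7, 0.3, 1.8, 1.2, 3.1]
rows = stem_and_leaf(emissions)

for stem, leaves in rows.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

Keeping the empty row for stem 2 preserves the shape of the distribution, which is what makes the plot resemble a histogram turned sideways.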

TIME OUT TO THINK

What additional information would you need to create a stem-and-leaf plot for the ages of actresses when they won Academy Awards (from Table 3.12, page 106)? What would the stem-and-leaf plot look like?

Line Charts

Definition

A line chart shows a distribution of quantitative data as a series of dots connected by lines. For each dot, the horizontal position is the center of the bin it represents and the vertical position is the frequency value for the bin.
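The definition above can be sketched by computing the dot positions directly: each dot sits at the center of its bin horizontally and at the bin's frequency vertically. The function name `line_chart_points`, the bin width, and the frequencies are assumptions for illustration, not the energy use data.

```python
def line_chart_points(counts, start, width):
    """Return (bin_center, frequency) pairs for equal-width bins beginning at start."""
    return [(start + (i + 0.5) * width, count) for i, count in enumerate(counts)]

# Hypothetical bin frequencies for bins 0-10, 10-20, 20-30, 30-40.
frequencies = [2, 3, 3, 2]
points = line_chart_points(frequencies, start=0, width=10)

for center, freq in points:
    print(f"dot at x={center}, y={freq}")
```

Connecting consecutive dots with straight line segments gives the line chart; overlaying the bars for the same bins shows that each dot lands at the top center of its histogram bar.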

Figure 3.12 Line chart for the energy use data, with a histogram overlaid for comparison.