3.2 Picturing Distributions of Data





Definition

  • The distribution of a variable refers to the way its values are spread over all possible values. We can summarize a distribution in a table or show a distribution visually with a graph.
  • Discuss the shapes of the distributions displayed by these graphs.

http://mediamatters.org/

  • Tue, Mar 22, 2005 12:41pm ET
  • CNN.com posted misleading graph showing poll results on Schiavo case
  • In presenting the results of a CNN/USA Today/Gallup poll, CNN.com used a visually distorted graph that falsely conveyed the impression that Democrats far outnumber Republicans and Independents in thinking the Florida state court was right to order Terri Schiavo's feeding tube removed. In fact, a majority of all three groups agrees with the court's decision, and the gap between Democrats on one hand and Republicans and Independents on the other is within the poll's margin of error.
  • According to the poll, conducted March 18-20, when asked if they "agree[d] with the court's decision to have the feeding tube removed," 62 percent of Democratic respondents agreed, compared to 54 percent of Republicans, and 54 percent of Independents. But these results were displayed along a very narrow scale of 10 percentage points, and thus appeared to show a large gap between Democrats and Republicans/Independents:
  • Laid out in this manner, the graph suggests that the gap between the two groups is overwhelming, rather than only 8 percentage points, within the poll's margin of error of +/- 7 percentage points. Also, this presentation obscures the poll's finding that majorities of all the groups sampled approved of the removal of Schiavo's feeding tube. A more accurate presentation of the poll's findings would have looked like this:
  • A reader tip from "Scott" contributed to this item. Thanks, and keep them coming mm-tips@mediamatters.org.
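The distortion described in this item can be checked with a little arithmetic. The sketch below compares the ratio of bar heights a reader sees with the ratio of the percentages themselves. The article says only that the scale spanned 10 percentage points, so the axis endpoints of 53 and 63 are an assumption, and the function names are made up for illustration.

```python
def drawn_height_ratio(a, b, axis_min):
    """Ratio of the drawn bar heights when the vertical axis starts at axis_min."""
    return (a - axis_min) / (b - axis_min)


def intervals_overlap(a, b, margin):
    """True if the ranges a +/- margin and b +/- margin share at least one value."""
    return abs(a - b) <= 2 * margin


democrats, republicans = 62, 54   # percent agreeing, from the poll
margin_of_error = 7               # percentage points, from the poll

# Assumed truncated axis running from 53 to 63 (endpoints are not in the article).
truncated = drawn_height_ratio(democrats, republicans, axis_min=53)
# Axis starting at zero: bar heights are proportional to the percentages.
zero_based = drawn_height_ratio(democrats, republicans, axis_min=0)

print(f"truncated axis:  Democratic bar is {truncated:.1f} times as tall")
print(f"zero-based axis: Democratic bar is {zero_based:.2f} times as tall")
print("ranges overlap:", intervals_overlap(democrats, republicans, margin_of_error))
```

Under these assumptions an 8-point gap is drawn as a ninefold difference in bar height, while a zero-based axis shows bars of nearly equal height.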

Number of People that Own a Pet

  • What is misleading about this graph?

How to Lie with Graphs: The NY Times as Real Estate Case Study

  • October 3, 2007 – 11:07 am | by Erik Hersman
  • The New York Times just released a new graph showing the housing bubble. The only problem is, they have intentionally skewed the way the chart reads to make their bubble look even bigger and more extreme.
  • Nat Torkington points this out:
  • “In effect, they’ve zoomed in on the area from 100-150 and magnified the growth in the last 15 years.”
  • We very well might be in a housing bubble, but that doesn’t excuse the NY Times’ creation of a misleading and overly sensational chart.
  • The Calculated Risk blog breaks down the errors and omissions even further and then shows what the graph should really look like if the NY Times wasn’t intentionally trying to magnify the negatives:
  • Important Labels for Graphs
  • Title/caption: Always have a title. Use a caption and list the source if necessary.
  • Vertical scale and label: Numbers along the vertical axis should clearly indicate the scale. Be sure to label the axis.
  • Copyright © 2009 Pearson Education, Inc.
  • Horizontal scale and label: The categories and/or numbers should be indicated with a label and scale.
  • Legend: Include a legend when multiple data sets appear on the same graph.
  • Bar graphs are commonly used for qualitative data.
  • Each bar represents the frequency (or relative frequency) of one category. The bars can be either vertical or horizontal.
  • Let’s create a vertical bar graph from the essay grade data in Table 3.1.
  • Because the highest frequency is 9 (the frequency for C grades), we chose to make the vertical scale run from 0 to 10. This ensures that even the tallest bar does not quite touch the top of the graph.
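As a rough sketch of a vertical bar graph with a scale that runs one step past the tallest bar, the code below tallies frequencies and draws them as text. Table 3.1 is not reproduced in these notes, so the grade list is hypothetical; only the C frequency of 9 comes from the text. The function name is an assumption.

```python
from collections import Counter

# Hypothetical essay grades. Only the C count of 9 comes from the text.
grades = list("A" * 4 + "B" * 7 + "C" * 9 + "D" * 3 + "F" * 2)
categories = ["A", "B", "C", "D", "F"]

frequency = Counter(grades)

# Run the vertical scale from 0 to one more than the highest frequency,
# so the tallest bar does not touch the top of the graph.
axis_max = max(frequency.values()) + 1


def vertical_bar_graph(frequency, categories, axis_max):
    """Return a text bar graph as a list of lines, top row first."""
    lines = []
    for level in range(axis_max, 0, -1):
        row = "".join(" # " if frequency[c] >= level else "   " for c in categories)
        lines.append(f"{level:>2} |{row}")
    lines.append("   +" + "---" * len(categories))
    lines.append("    " + "".join(f" {c} " for c in categories))
    return lines


print("Essay grades (hypothetical data): frequency by grade")
print("\n".join(vertical_bar_graph(frequency, categories, axis_max)))
```

The top row of the output (level 10) is empty, which is the text equivalent of leaving space above the tallest bar.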

Bar Graph

  • A bar graph can be used to show how a whole is divided into parts, but it can also compare quantities that are not parts of a whole.
  • A dotplot is a variation on a bar graph in which we use dots rather than bars to represent the frequencies. Each dot represents one data value.
  • Figure 3.2 Dotplot for the essay grade data in Table 3.1.
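A minimal text sketch of a dotplot follows, drawn sideways for simplicity. The grade list is hypothetical (Table 3.1 is not reproduced here) and the function name is an assumption. The point it illustrates is that the number of dots equals the number of data values.

```python
from collections import Counter


def dotplot(values, categories):
    """Return one text row per category, with one dot ('o') per data value."""
    counts = Counter(values)
    return [f"{c} | " + "o" * counts[c] for c in categories]


# Hypothetical essay grades, not the values from Table 3.1.
grades = list("A" * 4 + "B" * 7 + "C" * 9 + "D" * 3 + "F" * 2)

rows = dotplot(grades, ["A", "B", "C", "D", "F"])
print("\n".join(rows))
```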
  • Pareto charts were invented by Italian economist Vilfredo Pareto (1848-1923). Pareto is best known for developing methods of analyzing income distributions, but his most important contributions probably were in developing new ways of applying mathematics and statistics to economic analysis.
  • Pie Charts
  • Pie charts are usually used to show relative frequency distributions. Pie charts are used almost exclusively for qualitative data.
  • Figure 3.5 Party affiliations of registered voters in Rochester County
  • Why is a pie chart not the correct graph to use to represent this data?
  • Definition
  • A pie chart is a circle divided so that each wedge represents the relative frequency of a particular category. The wedge size is proportional to the relative frequency. The entire pie represents the total relative frequency of 100%.
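Because each wedge is proportional to its relative frequency, the wedge angle is the relative frequency times 360 degrees. The sketch below works this out for made-up counts; the numbers are hypothetical and are not the values behind Figure 3.5, and the function name is an assumption.

```python
def pie_wedges(counts):
    """Map each category to (relative frequency, wedge angle in degrees)."""
    total = sum(counts.values())
    return {
        category: (count / total, 360 * count / total)
        for category, count in counts.items()
    }


# Hypothetical counts of registered voters by party.
voters = {"Democrat": 450, "Republican": 350, "Independent": 200}

wedges = pie_wedges(voters)
for party, (rel_freq, angle) in wedges.items():
    print(f"{party:<12} {rel_freq:6.1%} {angle:6.1f} degrees")
```

The relative frequencies add up to 100% and the angles add up to 360 degrees, which is what the definition of the whole pie requires.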

Definition

  • A histogram is a bar graph showing a distribution for quantitative data (at the interval or ratio level of measurement); the bars have a natural order and the bar widths have specific meaning.
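To illustrate how a histogram groups quantitative data into ordered bins of equal width, here is a small sketch. The scores are hypothetical, and the function name and its parameters (`start`, `width`, `n_bins`) are assumptions chosen for this example.

```python
def histogram_counts(values, start, width, n_bins):
    """Count the values in each bin [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_bins:      # ignore values outside the binned range
            counts[index] += 1
    return counts


# Hypothetical quantitative data, for example exam scores.
scores = [52, 58, 61, 64, 67, 70, 71, 73, 75, 78, 79, 82, 85, 88, 94]

counts = histogram_counts(scores, start=50, width=10, n_bins=5)
for i, count in enumerate(counts):
    low = 50 + 10 * i
    # Bin labels assume whole-number data, so each bin ends at low + 9.
    print(f"{low}-{low + 9} | " + "#" * count)
```

Unlike the categories of a bar graph, the bins cannot be reordered, and each bar's width corresponds to a range of 10 units.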
  • Figure 3.10 Stem-and-leaf plot showing numerical data—in this case, the per person carbon dioxide emissions from Table 3.11.
  • The stem-and-leaf plot (or stemplot) looks somewhat like a histogram turned sideways, except in place of bars we see a listing of data for each category.
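A stem-and-leaf plot can be sketched in a few lines. The example assumes non-negative whole numbers, with the tens as the stem and the ones digit as the leaf; the data are hypothetical rather than the emissions values of Table 3.11, and the function name is an assumption.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return stemplot lines: stem = tens, leaves = ones digits in order."""
    groups = defaultdict(list)
    for v in sorted(values):
        groups[v // 10].append(v % 10)
    lines = []
    # Include every stem between the smallest and largest, even if empty.
    for stem in range(min(groups), max(groups) + 1):
        leaves = "".join(str(leaf) for leaf in groups.get(stem, []))
        lines.append(f"{stem:>2} | {leaves}")
    return lines


# Hypothetical data values.
data = [23, 12, 47, 30, 15, 28, 21, 34, 23]
plot = stem_and_leaf(data)
print("\n".join(plot))
```

Each row lists the individual data values for its stem, so the length of a row plays the role of a histogram bar.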
  • Line Charts
  • Definition
  • A line chart shows a distribution of quantitative data as a series of dots connected by lines. For each dot, the horizontal position is the center of the bin it represents and the vertical position is the frequency value for the bin.
  • Figure 3.12 Line chart for the energy use data, with a histogram overlaid for comparison.
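The dots of a line chart can be computed directly from the bins of a histogram: each dot sits at the center of its bin, at a height equal to the bin's frequency. The bin counts below are hypothetical (not the energy use data of Figure 3.12), and the function name is an assumption.

```python
def line_chart_points(start, width, counts):
    """Return one (bin center, frequency) pair per bin."""
    return [(start + width * (i + 0.5), count) for i, count in enumerate(counts)]


# Hypothetical frequencies for five bins of width 10 starting at 50.
counts = [2, 3, 6, 3, 1]

points = line_chart_points(start=50, width=10, counts=counts)
for center, frequency in points:
    print(f"x = {center:5.1f}, y = {frequency}")
```

Connecting these points in order with straight lines gives the line chart; overlaying the bars gives a comparison like the one in the figure.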
  • Definition
  • A histogram or line chart in which the horizontal axis represents time is called a time-series diagram.
  • Figure 3.14 Time-series diagram for the homicide rate data of Table 3.14.
  • Do you see any trend in the homicide rate?

Time-series diagram

  • A line graph shows behavior over time.
  • Time is always on the horizontal axis.
  • Look for an overall pattern (trend).
  • Look for patterns that repeat at known regular intervals (seasonal variations).
  • Look for any striking deviations that might indicate unusual occurrences.
  • Look at the stock prices of the Procter & Gamble Company over time.
  • http://finance.aol.com/charts/the-procter-and-gamble-company/pg/nys
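One common way to separate an overall trend from seasonal variation is a moving average taken over one full seasonal cycle. This technique is not part of the notes above; it is offered as a sketch, with hypothetical quarterly values and an assumed function name.

```python
def moving_average(series, window):
    """Average each run of `window` consecutive values to smooth the series."""
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]


# Hypothetical quarterly values: a pattern that repeats every four
# quarters on top of a steady rise.
quarterly = [10, 14, 8, 12, 14, 18, 12, 16, 18, 22, 16, 20]

# A window of 4 covers one full year, so the seasonal ups and downs cancel.
trend = moving_average(quarterly, window=4)
print(trend)
```

The raw values rise and fall within each year, but the smoothed values increase steadily, which makes the trend easier to see.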

Caution – Scale of Graph

  • Graphs are one of the most effective ways to communicate using data.
  • A good graph reveals an overall pattern or trend that may be impossible to detect from a table.
  • The visual impression of a graph is much stronger than the impression made by data in numerical form.
  • In line graphs, changing the scale might change the impression about the trend.
  • Look at the following two line graphs.

    • Both of these graphs show the number of unmarried couples (in thousands) from 1978 to 1998.
    • Changing only the scale gives a different impression.
    • According to the first graph, the increase in the number of unmarried couples has been gradual; according to the second graph, the increase is quite steep.
    • Be very careful about the scale of a line graph and the visual impression it might give.

