Statisticians gain information about a particular situation by collecting data for random variables.
Types of Data (variables)
Qualitative variables: variables that can be placed into distinct categories according to some characteristic or attribute.
E.g.: Gender, color, religion, workplace, etc.
Quantitative variables: variables that are numerical in nature and can be ordered or ranked.
A quantitative variable may be one of two kinds:
Discrete variable: a variable that can be counted or for which there is a fixed set of values. Example: the number of children in a family, the number of students in a class, etc.
Continuous variable: a variable that can be measured on a continuous scale, with the result depending on the precision of the measuring instrument or the accuracy of the observer. A continuous variable can assume all values between any two specific values. Example: temperatures, heights, weights, time taken, etc.
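As a small illustration of the distinction, the sketch below contrasts counted values with a measured value recorded at different precisions. The counts and the "exact" height are made-up numbers used only for demonstration.

```python
# Discrete: obtained by counting, so every value is a whole number.
children_per_family = [0, 2, 1, 3, 2]  # hypothetical counts

# Continuous: obtained by measuring; what gets recorded depends on
# the precision of the measuring instrument.
true_height_cm = 172.4638  # hypothetical "exact" height

# The same height as recorded by instruments reading to 0, 1 and 2 decimals.
recorded = {decimals: round(true_height_cm, decimals) for decimals in (0, 1, 2)}

all_counts_are_whole = all(isinstance(c, int) for c in children_per_family)

print(all_counts_are_whole)
print(recorded)
```

The height can take any value between, say, 172 and 173 cm, whereas a family cannot have 2.5 children.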
Variables can be classified by how they are categorized, counted or measured. Data/variables can be classified according to the LEVEL OF MEASUREMENT as follows:
Nominal Level Data: classifies data (persons/objects) into two or more categories. Whatever the basis for classification, a person can only be in one category, and members of a given category have a common set of characteristics.
The lowest level of measurement.
No ranking/order can be placed on the data
E.g.: Gender (Male/Female), Type of school (Public/Private), etc.
Ordinal Level Data: classifies data into categories that can be ranked; however, precise differences between the ranks do not exist.
This type of measuring scale puts the data/subjects in order from highest to lowest, or from most to least. It does not indicate how much higher or how much better. Intervals between ranks are not equal.
E.g.: Letter grades (A, B, C, D, E, F); a man's build (small, medium, or large). Large variation exists among the individuals in each class.
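The following sketch illustrates why ordinal data supports ranking but not arithmetic on the ranks. The list of grades and both numeric codings are hypothetical; any coding that preserves the order A > B > ... > F would serve equally well.

```python
# Two arbitrary numeric codings that both preserve the order A > B > ... > F.
GRADE_RANK = {"A": 6, "B": 5, "C": 4, "D": 3, "E": 2, "F": 1}
GRADE_RANK_ALT = {"A": 100, "B": 50, "C": 20, "D": 10, "E": 5, "F": 1}

grades = ["C", "A", "F", "B", "A", "D"]  # hypothetical letter grades

# Ranking is meaningful: both codings give the same order, highest first.
ranked = sorted(grades, key=GRADE_RANK.get, reverse=True)
ranked_alt = sorted(grades, key=GRADE_RANK_ALT.get, reverse=True)

# Differences are not meaningful: the "gap" between A and B depends
# entirely on which arbitrary coding was chosen.
gap_ab = GRADE_RANK["A"] - GRADE_RANK["B"]
gap_ab_alt = GRADE_RANK_ALT["A"] - GRADE_RANK_ALT["B"]

print(ranked)
print(ranked == ranked_alt)
print(gap_ab, gap_ab_alt)
```

Because the order survives any order-preserving recoding while the gaps do not, only the order carries information at the ordinal level.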
Interval Level Data: has all the characteristics of the nominal and ordinal scales, but in addition it is based upon predetermined equal intervals. It has no true zero point (ratios between numbers on the scale are not meaningful). E.g.:
Achievement tests, aptitude tests, IQ tests. There is a meaningful one-point difference between an IQ of 110 and an IQ of 111.
The Fahrenheit scale is a clear example of the interval scale of measurement. Thus, 60 degrees Fahrenheit or -10 degrees Fahrenheit represent interval data. Elevation measured relative to sea level is another example of an interval scale. With each of these scales there are direct, measurable quantities with equality of units. In addition, zero does not represent the absolute lowest value. Rather, it is a point on the scale with numbers both above and below it (for example, -10 degrees Fahrenheit).
Ratio Level Data: possesses all the characteristics of the interval scale and in addition has a meaningful (true) zero point. True ratios exist when the same variable is measured on two different members of the population.
The highest, most precise level of measurement.
E.g.: Weight, number of calls received, height.
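One way to see the difference between interval and ratio data is to check whether a ratio survives a change of units. The sketch below uses made-up temperatures and weights; the helper function names are illustrative only.

```python
import math

def fahrenheit_to_celsius(f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (f - 32) * 5 / 9

def pounds_to_kilograms(lb):
    """Convert a weight from pounds to kilograms."""
    return lb * 0.45359237

# Interval data: the ratio of two temperatures changes with the scale,
# so "60 degrees is twice as hot as 30 degrees" is not a meaningful claim.
temp_ratio_f = 60 / 30
temp_ratio_c = fahrenheit_to_celsius(60) / fahrenheit_to_celsius(30)

# Differences, however, compare consistently on both scales.
diff_ratio_f = (60 - 30) / (40 - 30)
diff_ratio_c = (fahrenheit_to_celsius(60) - fahrenheit_to_celsius(30)) / (
    fahrenheit_to_celsius(40) - fahrenheit_to_celsius(30)
)

# Ratio data: weight has a true zero, so the ratio is the same in any unit.
weight_ratio_lb = 200 / 100
weight_ratio_kg = pounds_to_kilograms(200) / pounds_to_kilograms(100)

print(temp_ratio_f, temp_ratio_c)        # 2.0 versus approximately -14
print(diff_ratio_f, diff_ratio_c)        # both approximately 3
print(weight_ratio_lb, weight_ratio_kg)  # both approximately 2
print(math.isclose(weight_ratio_lb, weight_ratio_kg))
```

The temperature ratio flips from 2 to about -14 simply because the zero point moved, whereas the weight ratio stays at 2 because zero pounds and zero kilograms both mean "no weight".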
3.1.3 Data collection and Sampling Techniques
Sampling is the process of selecting a number of individuals for a study in such a way that the individuals represent the larger group from which they were selected.
The purpose of sampling is to use a sample to gain information about a population.
Random Sampling: subjects are selected by random numbers.
Systematic Sampling: Subjects are selected by taking every kth subject after the first subject is randomly selected from 1 through k.
Stratified Sampling: The population is divided into groups (strata), and subjects are randomly selected from within each group.
E.g.: We divide the population into 5 groups, then take subjects from each group to form our sample.
Cluster Sampling: Subjects are selected by using an intact group that is representative of the population.
E.g.: We divide the population into 5 groups, then take 2 whole groups as our sample. That means 2 groups of subjects represent all 5 groups of subjects.
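The four techniques above can be sketched in a few lines of Python using the standard `random` module. This is a minimal illustration under stated assumptions: the population of 100 numbered subjects, the 5 equal groups, the sample sizes, and the function names are all hypothetical choices made for the example.

```python
import random

def simple_random_sample(population, n, rng):
    """Random sampling: pick n subjects using random numbers."""
    return rng.sample(population, n)

def systematic_sample(population, k, rng):
    """Systematic sampling: random start among the first k, then every kth."""
    start = rng.randrange(k)  # index 0..k-1, i.e. position 1..k
    return population[start::k]

def stratified_sample(strata, n_per_stratum, rng):
    """Stratified sampling: randomly select subjects from every group."""
    sample = []
    for group in strata:
        sample.extend(rng.sample(group, n_per_stratum))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    """Cluster sampling: randomly pick whole groups and keep all members."""
    chosen = rng.sample(clusters, n_clusters)
    return [subject for group in chosen for subject in group]

rng = random.Random(0)            # fixed seed so the run is repeatable
population = list(range(1, 101))  # hypothetical subjects numbered 1..100
groups = [population[i:i + 20] for i in range(0, 100, 20)]  # 5 groups of 20

srs = simple_random_sample(population, 10, rng)
sys_sample = systematic_sample(population, 10, rng)
strat = stratified_sample(groups, 4, rng)
clus = cluster_sample(groups, 2, rng)

print(len(srs), len(sys_sample), len(strat), len(clus))
```

Note the contrast in the last two functions: stratified sampling takes some subjects from all 5 groups, while cluster sampling takes all subjects from only 2 groups.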
A) Classify each set of data as discrete or continuous.
1) The number of suitcases lost by an airline.
2) The height of corn plants.
3) The number of ears of corn produced.
4) The number of green M&M's in a bag.
5) The time it takes for a car battery to die.
6) The production of tomatoes by weight.
B) Identify the following as nominal level, ordinal level, interval level, or ratio level data.
1) Percentage scores on a Math exam.
2) Letter grades on an English essay.
3) Flavors of yogurt.
4) Instructors classified as: Easy, Difficult or Impossible.
5) Employee evaluations classified as: Excellent, Average, Poor.
7) Political parties.
8) Commuting times to school.
9) Years (AD) of important historical events.
10) Ages (in years) of statistics students.
11) Ice cream flavor preference.
12) Amount of money in savings accounts.
13) Students classified by their reading ability: Above average, Below average, Normal.
3.2 HISTOGRAMS, FREQUENCY POLYGONS AND OGIVES
For 108 randomly selected college applicants, the following frequency distribution for entrance exam scores was obtained.