Jeffrey May – January 28, 2004

Rather than attempt a lengthy review of the literature to determine what information means to others, this short essay will define what information means to me before taking the Theory of Information Seminar given by Dr. Gurpreet Dhillon. This essay will undoubtedly serve as a learning tool when I compare my view of information now with my view of it after the seminar is finished.
There is no doubt in my mind that information means many different things to many different people. For example, as I have crossed disciplines from engineering to being an Instructor and PhD student in the Information Systems Department at VCU, information has taken on two completely different meanings. As an engineer, information was anything that needed to be transferred as a signal from one place to another, or any output from any type of measuring device. I remember problems that asked how much information could be transferred across a line, or what the amplitude of a wave signal was given a particular information source. Of course, being an engineer, it made complete sense to use the word information for these types of problems because I was not concerned with semantics. I was only concerned with the technical solution. Yet, having now entered a career as an Information Systems scientist, I know that what the engineering world was calling information often should have been called data.
As I have progressed through the Information Systems Department at VCU I have learned that a distinction should be made between information and data. Information usually but not always results from processing or conditioning data using various statistical and data manipulation techniques. In other words, information is used to reveal the true meaning of data so that accurate and timely decisions can be made. Most of the time raw data alone is useless until it is processed into meaningful information. For example, a water plant operator spends her day collecting data on various concentrations of constituents in the water. In most water plants, this data is then given to the manager where it is conditioned and manipulated to determine if the water is safe for drinking. The information gained from this data manipulation then leads to decisions such as increasing a certain chemical concentration in the water or backwashing a particular filter. Without the timely manipulation of this original data into information, the water plant would not be capable of producing safe drinking water for the public.
However, it can be argued that not all data is required to be manipulated before it becomes information used to make decisions. That is, sometimes data and information are exactly the same thing. For example, if one measures the temperature outside to be 15°F, it probably does not require data manipulation to make the decision to wear a coat. Similarly, if a student scores 40% on an exam, it probably does not require data manipulation for the professor to make the decision to give that student an F for that test. Yet, what if the average score of the entire class for this particular test was 30%? In this case, data manipulation would have been required to determine that the student who scored 40% probably did not deserve an F. Without this data manipulation, giving the student an F based on the raw data alone would have been a poor decision. The point here is that assuming too quickly that raw data automatically leads to accurate decisions can cause problems. In other words, information or what is thought to be information can be misleading.
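The exam example can be sketched in a few lines of Python. This is only an illustration: the function name `score_in_context` and the list of class scores are hypothetical, chosen so that the class average comes out to the 30% mentioned above.

```python
from statistics import mean


def score_in_context(score, class_scores):
    """Pair a raw score (data) with its relation to the class mean (information)."""
    class_mean = mean(class_scores)
    return {
        "score": score,
        "class_mean": class_mean,
        "difference": score - class_mean,
        "above_mean": score > class_mean,
    }


# Hypothetical class scores (percent); their mean is 30.
class_scores = [40, 20, 30, 25, 35]

result = score_in_context(40, class_scores)
print(result)
```

Taken alone, the raw score of 40 suggests a failing grade. Once it is compared with the class mean, the same number shows a student who performed above the class average, which is the kind of manipulation the paragraph above argues is sometimes needed before a decision is made.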
In no arena is it more apparent that information can be misleading than in politics. For example, political polls conducted to determine a particular opinion of the masses are sometimes used by politicians to help garner votes for a particular campaign or to pass a law. These polls are often rather suspicious because they are conducted by phone and tend to survey a population of citizens that has been predetermined to answer the questions the way the politicians want them to be answered. The scientific rigor of these polls is therefore not evaluated, nor does it matter to the politicians whether or not the information is misleading. It only matters that the information be what the politicians want it to be. The point is that decision makers must always question what is meaningful and what is not meaningful information.
One way to determine the usefulness or validity of information is to examine the types of decision predicates used to create this information.¹ In other words, how objective or subjective is the data? Although all data can be useful, one can argue that information created from this data is more reliable if it is based on objective rather than subjective type decision predicates.
Examples of objective type decision predicates include data that come from either clinical or empirical sources. A clinical predicate would be something that can be directly measured. Error in the data is found with either the tool or the operator. If the operator of a measuring device is positive that no errors were made in either operation or calibration of the instrument, then the data that is extracted can be heavily relied on for leading to accurate information to make meaningful decisions.
An empirical predicate would be any data that is extracted from some type of experiment. Error in the data is not only found with the tool and operator, but also the type of statistical inference made with such data. In other words, there is usually some type of scaling or extrapolation technique used to extract empirical data. As a result, the statistical and internal validity of the data has to be heavily questioned before one can greatly rely on any decisions that would be made from the information created from empirical data. However, if an experiment is deemed to be valid, then the data can lead to very accurate information.
Information that comes from subjective type decision predicates such as rhetorical or normative data sources leads to more questionable information. A rhetorical predicate would be any data that comes from inputs crafted to support a particular position or conclusion. Obviously, politicians rely heavily on rhetorical type predicates for making decisions. However, it should not be concluded that information based on rhetorical type predicates is incorrect, only that it is questionable and that further evidence should be gathered if possible before making any final decisions.
A normative predicate would be any data that is rooted in ideology or sentiment. A normative decision predicate often results from value-driven judgments. Managers of corporations are sometimes guilty of making decisions based on information that has resulted from normative type predicates due to corporate culture. That is, decisions are made because information is more a product of corporate culture than it is of rational thinking and sound logic. Again, these types of decisions are very questionable in terms of their accuracy.
In conclusion, this essay defines information to be the meaningful product of data used to make accurate and timely decisions. Information usually but not always is produced from conditioning data using various statistical and data manipulation techniques. Because information is the key element in decision-making, this essay then discusses the need for decision makers to examine the correctness of any information used to make decisions by considering the types of data sources that produced this information. In other words, even though information is defined as the meaningful product of data, this does not mean that everything presented as information is meaningful.
1 Source: Discussion from a seminar on Intelligent Agents, Info 790, Fall, 2003, Virginia Commonwealth University, Instructor: Dr. John Sutherland.