[to be published in the proceedings of the 8th ISKO conference: London: 13-16 July 2004;
many thanks for precious suggestions to Rossana Morriello, Riccardo Ridi, Aida Slavic, Lorena Zuccolo]
University of Pavia. Mathematics department. Library
Naturalism vs. pragmatism in knowledge organization
Abstract: Several authors remark that categories used in languages, including indexing ones, are affected by cultural biases, and do not reflect reality in an objective way. Hence knowledge organization would essentially be determined by pragmatic factors. However, human categories are connected with the structure of reality through biological bonds, and this allows for a naturalistic approach too. Naturalism has been adopted by Farradane in proposing relational categories, and by Dahlberg and the CRG in applying the theory of integrative levels to general classification schemes. The latter is especially relevant for possible developments in making the structure of schemes independent from disciplines, and in applying it to digital information retrieval.
Organizing the whole of knowledge in a single consistent scheme is an old dream. But can such consistency result from founding the scheme on some natural order of the world, or should we rather resign ourselves to the fact that categories must be determined only by pragmatic factors? Remarks on the partiality and relativism of categories used in knowledge organization systems are a sound warning to avoid socio-cultural bias and discrimination (Bowker & Star, 1999; Olson, 2002). Should we also conclude from this that no set of categories is better than any other?
In a naturalistic approach, any knowledge element is considered as part of one general picture, that is, of our general representation of the world as we know it. This representation is obviously far from complete, and will keep improving and changing, without ever reaching any conclusion. Still, modern scientific research deals with knowledge more and more in terms of a unitary frame, with no fixed border between disciplines, nor any domain being separate and independent from the others. This is quite in contrast with traditional divisions of disciplines, on which most knowledge organization schemes are based.
Therefore, one can wonder about the possibility of a naturalistic representation of knowledge, and its application to document organization. A look at past attempts at a naturalistic approach in constructing artificial languages, including indexing ones, as well as criticisms of them, may throw some light on the problem, before discussing the role of naturalism in contemporary KO research.
2: Philosophical languages
In the 17th century, René Descartes wished for a “philosophical language”, able to reflect the order of all human thoughts (Descartes, 1953). Such a project was actually pursued by Lodwick, Dalgarno (1968), Wilkins (1968) and others, who devised languages where each letter of a word expressed a concept with a given degree of specificity (Eco, 1995), much like digits in the DDC. Although their systems looked very rational, the categories in their semantic trees were obviously biased by the culture of the time. This was well shown by Borges (1999) in a famous quotation:
«Once we have defined Wilkins’ procedure, it is time to examine a problem which could be impossible or at least difficult to postpone: the value of this four-level table which is the base of the language. Let us consider the eighth category, the category of stones. Wilkins divides them into common (silica, gravel, schist), modics (marble, amber, coral), precious (pearl, opal), transparent (amethyst, sapphire) and insolubles (chalk, arsenic). Almost as surprising as the eighth, is the ninth category. This one reveals to us that metals can be imperfect (cinnabar, mercury), artificial (bronze, brass), recremental (filings, rust) and natural (gold, tin, copper). Beauty belongs to the sixteenth category; it is a living brood fish, an oblong one.»
According to Borges, the same kind of bias could be found even in a modern bibliographic classification such as UDC (actually his examples refer to 3-digit classes for religion, which were inherited from the schema of DDC; in UDC they were later changed into a more balanced division). Generally speaking, such examples suggest that we should be skeptical about the absolute knowledge value of any language, including classifications:
«it is clear that there is no classification of the Universe not being arbitrary and full of conjectures. [...] If there is a universe, it’s aim is not conjectured yet; we have not yet conjectured the words, the definitions, the etymologies, the synonyms, from the secret dictionary of God.»
3: International auxiliary languages
In the late 19th century, several projects of international auxiliary languages (IALs) spread, the most successful being Esperanto. Their approach was mainly pragmatic, as they tried to reduce the grammatical complexity and vocabulary diversity of natural languages, so that they could be easily learned by anyone.
In recent decades, much work has been done on the construction and experimentation of international logical languages, such as Loglan and Lojban, featuring unusual grammars based on predicate logic (Cooke Brown, 1989; Cowan, 1997). Using them is supposed to help people think in a neater and more neutral way, as a test of the Sapir-Whorf hypothesis, which claims that thought is influenced by linguistic categories. Along similar lines, in a fictional context Elgin (1985) imagined a language, called Láadan, privileging feminine categories and aesthetics in both grammar and phonology, in opposition to the social dominance of masculine ones.
Recent studies have shown some ways in which people having different native languages tend to classify things in different ways, though it is arguable whether these details are really important in perception overall (Motluk, 2002).
According to Black (1968), «the conception of an “ideal language”, perfectly conforming to the nature of reality, is a will-o’-the-wisp that leads nowhere except into futility», due to limitations intrinsic to language. In Eco’s view, both philosophical and international languages are to be interpreted as attempts at the perennial utopia of a perfect language: indeed, any artificial language would be no more than an expression of what is relevant to the spirit of its time (Eco, 1995). In this sense, a sign of the importance of entertainment in our society is provided by the current popularity of fantasy languages from “Star trek” and “The lord of the rings” (Klingon and Quenya respectively; the former can boast a native speaker, a translation of “Hamlet”, and a Google interface).
The kinds of artificial languages mentioned above share some interesting properties with indexing languages for document retrieval.
Borges’s accusation of cultural relativity is echoed by contemporary authors concerned with the cultural and epistemological context in which knowledge organization systems have been developed. Hjørland & Albrechtsen (1999) notice how, in bibliographic classifications, «an organization reflecti[ng] disciplinary organization and thus human interests reflects historist and pragmatic views of knowledge». According to these authors, knowledge organization cannot be neutral, as it is necessarily an expression of some epistemological assumption – quite in the same way as the perfect language is illusory in Eco’s view. The rationalistic approaches implied in 20th century classifications are said to have shown important limitations, so a more pragmatic approach could be adopted in the organization of knowledge for the specific needs of its users (Hjørland & Albrechtsen, 1999, p. 134-135).
A similar perspective is adopted by the authors of the EDAMOK project, which aims at building a distributed knowledge management system able to translate one scheme into another according to user preferences (Bonifacio et al., 2002). One can notice that, although it is claimed that no scheme is preferred to any other, it is still necessary to choose a central scheme (WordNet in this case) to act as a bridge language for translation, and this must itself be based on some given categorical structure.
Another shared property is that the fortune of a language depends more on contingent factors than on its absolute merits. Successful languages, like Esperanto among IALs, LCC and DDC among bibliographic classifications, originated more than a century ago, so they cannot be framed according to the most recent developments of research in their fields. On the other hand, they have the key advantages of being used and documented worldwide. Therefore, knowledge organizers can choose the pragmatic option by adopting the most widespread scheme, rather than the technically best one (Gnoli, 2003a).
A naturalistic approach can try to found the principles for knowledge organization either on the categories of perception (epistemology), or on the structure of reality itself as we know it (ontology). Most 20th century philosophy has emphasized the epistemological approach, while forgetting the ontological one or mistaking it for the former (Poli, 1996; 2001).
The problem of the relativity of categories in perception was clearly faced by Bertalanffy (1955), with reference to his general theory of systems. One factor of relativity is the difference between cultures that have developed independently for a long time, so that they classify the world in different ways. Another, more basic factor is that, just like any other animal species, Man is able to perceive certain elements of the real world, which are relevant for his life, but is unaware of many others not perceived by his organs (like ultrasound, which is perceived by bats instead).
On the other hand, the basic categories of perception and thinking (i.e. Kant’s a-priori) must be founded in some way on features of the real world: otherwise, they would have been disadvantageous for life, hence would not have evolved. This argument of hypothetical realism, having its foundation in biological evolution, is supported by Campbell (1974), Lorenz (1977) and Popper (1974). Hypothetical realism is a strong reason for trusting the categories of human knowledge as being strictly related to the structure of reality, though not reflecting it in a perfect and complete way. So, according to Bertalanffy’s perspectivism, languages do use different ways to describe reality, but converge towards it.
The categories of perception were considered as a good foundation for knowledge organization by Farradane, who searched for the psychological bases of indexing. In his view, «classification is not some part of an external “reality” waiting to be discovered; it is an intellectual operation upon mental entities or concepts» (Farradane, 1961). By combining the basic relational categories distinct / not distinct / undefined and temporary / fixed / undefined, he identified 9 basic operators, by which subjects could be coordinated through a special notation (Farradane, 1952). Farradane’s approach was also naturalistic in that he claimed that classification should grow inductively from elementary concepts, just as science does, rather than being based on a-priori divisions. His writings express an optimistic confidence in the scientific approach, reminiscent of that of positivist thinkers.
Another scientific approach has been undertaken by Dahlberg. According to her, the units of classification systems are concepts, with their characteristics and relations. Relations are grouped into formal, categorial, and material, and mostly correspond to Aristotle’s categories. Concepts are formed in our mind by perception of the world, and are trustworthy on the basis of hypothetical realism (Dahlberg, 1978).
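The count of nine follows from crossing the two three-valued categories. A minimal sketch in Python, which only enumerates the combinations: the value labels are those given above, while the variable names are illustrative assumptions, and Farradane's own operator names and notation are not reproduced here.

```python
from itertools import product

# The two relational categories named in the text, three values each.
# The constant names are assumptions made for this sketch.
DISTINCTNESS = ("distinct", "not distinct", "undefined")
DURATION = ("temporary", "fixed", "undefined")

# Each basic operator corresponds to one pairing of values.
combinations = list(product(DISTINCTNESS, DURATION))

for distinctness, duration in combinations:
    print(f"{distinctness} / {duration}")

print(len(combinations))  # 3 x 3 = 9
```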
6: Naturalistic ontology: integrative levels
As for the ontological side, Dahlberg acknowledged as relevant the notion that reality is structured into a series of integrative levels – those of forms, atoms, molecules, crystals, cells, organisms, populations, etc. – each based on the lower ones but also showing new emergent properties not shared with them.
The notion of integrative levels is behind the sequence of main classes both of Dahlberg’s International Coding Classification, and of Bliss Bibliographic Classification (BC2). This idea, though not identified with a precise philosophical school, has appeared in the work of various philosophers since the 19th century. In the 20th century it was especially developed in the German-speaking area by Hartmann (1964), who was a source for Dahlberg, and in the English-speaking area by Feibleman (1954), who was a source for the British Classification Research Group (CRG). The CRG discussed in particular the details and problems of applying integrative levels to a new general classification scheme, and also produced a first draft of it (CRG, 1969). Unfortunately, that was not further developed, as the CRG got involved in other projects such as PRECIS and BC2.
Application of integrative levels in knowledge organization was criticized (Huckaby, 1972), especially for fitting only scientific disciplines but not humanistic ones. CRG members acknowledged that the levels of artifacts and “mentefacts” were less clearly recognized than those of the physical world. However, this does not prevent one from placing technological, psychological, and even spiritual levels above the other ones, as indeed happens in Hartmann’s system (Gnoli & Poli, submitted).
Therefore, integrative levels have not been fully explored yet, nor tested for real use in a knowledge organization system, apart from main class order of some schemes. However, as a naturalistic structuring principle, they can produce several interesting features.
One of them is that they can be applied to phenomena rather than disciplines. In such a scheme, each phenomenon – e.g. “horse” – would have an objective place of unique definition, as Farradane calls it (CRG, 1969) – in this example the level of animals – whatever the disciplines dealing with it – genetics, zoology, veterinary medicine, history of transports, military science, horse-racing... So the phenomenon would not be scattered into several disciplines, as happens in general classifications by disciplines. This would offer a possible solution to the increase of multidisciplinary documents, which is discussed as one of the major issues in recent classification research (Beghtol, 1998; Hjørland & Albrechtsen, 1999).
Any document subject contains concepts referring to one or more phenomena. Such concepts could be combined in a fixed, predictable order, following the inverse of the natural sequence of levels (that is, of the chronological, evolutionary and historical sequence of their appearance), just as facets in a faceted classification are combined according to the standard citation order. E.g., “genetics of the horse” would be expressed in the sequence horses – genes, as horses are defined at a higher level than genes. Since horses and genes belong to different levels, their combination is analogous to a phase relation or to the subject device in disciplinary classifications (Ranganathan, 1967, p. 346; Gatto, pers. comm.).
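The ordering rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abridged list of levels and the assignment of each phenomenon to a level (e.g. genes to molecules) are hypothetical, chosen to reproduce the “horses – genes” example, and do not come from any published scheme.

```python
# Hypothetical, abridged sequence of integrative levels, lowest first.
LEVELS = ["forms", "atoms", "molecules", "cells", "organisms", "populations"]

# Illustrative assignment of phenomena to levels (an assumption of this sketch).
PHENOMENON_LEVEL = {
    "genes": "molecules",
    "horses": "organisms",
    "herds": "populations",
}


def citation_order(concepts):
    """Return the concepts sorted from the highest level down to the lowest."""
    rank = {name: position for position, name in enumerate(LEVELS)}
    return sorted(concepts, key=lambda c: rank[PHENOMENON_LEVEL[c]], reverse=True)


# "Genetics of the horse": the higher-level phenomenon is cited first.
print(" - ".join(citation_order(["genes", "horses"])))  # horses - genes
```

Whatever order the indexer happens to list the concepts in, the resulting string is the same, which is what makes the combination predictable.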
A second relevant feature of this kind of system is that the notation for each concept can be used in a string, as if it were a keyword, separated from the other component concepts simply by a blank space (as happens in special schemes such as Mathematics Subject Classification). Here, however, a strong unifying property will be the sequence of levels. Such a feature would be especially suitable for retrieval in a digital environment, because any concept could be treated as a single word (Gnoli, 2003b). An alternative option would be to specify several kinds of relations, as was done in the CRG draft. This would allow a more sophisticated retrieval, but at the price of adding complexity to the notation, which, as experience with users shows, is not a desirable property (Zuccolo, pers. comm.).
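How blank-separated notation could serve word-based retrieval can be sketched as follows. The class marks (“m5h”, “k3g”, “t2b”), the document titles and the function names are all invented for illustration and belong to no existing scheme or system; the sketch only assumes that each concept is one token in a blank-separated string.

```python
# Hypothetical class marks and titles, invented for this sketch.
documents = {
    "Genetics of the horse": "m5h k3g",
    "Horse breeding in history": "t2b m5h",
    "Gene regulation": "k3g",
}


def build_index(docs):
    """Map each notation token to the set of titles whose string contains it."""
    index = {}
    for title, notation in docs.items():
        for token in notation.split():  # concepts are separated by blank spaces
            index.setdefault(token, set()).add(title)
    return index


def search(index, *tokens):
    """Return the titles whose notation contains every requested token."""
    hits = [index.get(token, set()) for token in tokens]
    return set.intersection(*hits) if hits else set()


index = build_index(documents)
print(sorted(search(index, "m5h")))         # every document about horses
print(sorted(search(index, "m5h", "k3g")))  # horses combined with genes
```

Because each concept is a separate token, an ordinary keyword index is enough; expressing kinds of relations between concepts would require additional symbols and parsing.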
A third notable feature is, so to speak, more philosophical: in such a scheme, it could be made syntactically impossible to attribute properties typical of higher levels, e.g. “soul”, to lower phenomena, e.g. a rock or a place, as happens in the vitalistic fallacies recognized by Hartmann (1964) and Lorenz (1977). Beliefs such as animism could instead be described at the higher level of religious and philosophical schools. While this fact could satisfy scientific users, it makes clear that even a naturalistic classification would be influenced by the state of knowledge at the time in which it has been conceived. So it would not make obsolete the problem of finding clever ways to provide for the representation of future knowledge not yet discovered.
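One way to picture such a constraint is a check applied whenever a compound subject is built. The Python sketch below is an approximation under stated assumptions: the four level names, their ranks and the placement of each concept are hypothetical simplifications (they are not Hartmann's strata nor the CRG draft), and a validation at construction time stands in for a true syntactic restriction in the notation.

```python
# Hypothetical level ranks, lowest first; names and ranks are assumptions.
LEVEL_RANK = {"matter": 0, "life": 1, "mind": 2, "culture": 3}

# Illustrative placement of concepts on levels (also an assumption).
CONCEPT_LEVEL = {
    "rock": "matter",
    "mass": "matter",
    "horse": "life",
    "person": "mind",
    "soul": "mind",
    "animism": "culture",
}


def attribute(phenomenon, prop):
    """Build the string 'phenomenon prop', rejecting higher-level properties."""
    phenomenon_rank = LEVEL_RANK[CONCEPT_LEVEL[phenomenon]]
    property_rank = LEVEL_RANK[CONCEPT_LEVEL[prop]]
    if property_rank > phenomenon_rank:
        raise ValueError(
            f"'{prop}' belongs to a higher level than '{phenomenon}'"
        )
    return f"{phenomenon} {prop}"


print(attribute("horse", "mass"))  # allowed: the property sits at a lower level
try:
    attribute("rock", "soul")      # rejected: "soul" sits above "rock"
except ValueError as err:
    print("rejected:", err)
```

The table of levels used by the check is itself a product of the knowledge of its time, which is the limitation noted above.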
The cases presented above show that some dialectic between naturalism and pragmatism has always been present in knowledge organization. Although pragmatic needs must always be kept in mind, it seems that a naturalistic approach is not completely utopian. Reference to the structure of reality, as modeled by ontology, could act as a unifying criterion for general schemes, instead of treating each discipline as a separate universe, as done in classifications by disciplines, including faceted ones. Room seems to be left for development and experimentation in this direction.
References
Beghtol Clare, 1998. Knowledge domains: multidisciplinarity and bibliographic classification systems. “Knowledge organization”, 25, n. 1-2, p. 1-12.
Bertalanffy Ludwig von, 1955. An essay on the relativity of categories. “Philosophy of science”, 22, p. 243-263. Re-published in: “General systems”, 7: 1962, p. 71-83.
Black Max, 1968. Can language reflect reality? In: The labyrinth of language. London: Pall Mall. Re-published in: The artificial language lab, Richard K. Harrison ed.
Bonifacio Matteo, Bouquet Paolo & Cuel Roberta, 2002. The role of classification(s) in distributed knowledge management. In: Proceedings of the 6th International conference on Knowledge-based intelligent information engineering systems and allied technologies (KES2002), Amsterdam, 2002.
Borges Jorge Luis, 1999. The analytical language of John Wilkins; translated by Lilia Graciela Vázquez. In: Alamut. Orig. ed.: El idioma analítico de John Wilkins, in: Otras inquisiciones, Buenos Aires: Sur, 1952.
Bowker Geoffrey C. & Star Susan Leigh, 1999. Sorting things out: classification and its consequences. Cambridge (Mass.): MIT.
Campbell Donald T., 1974. Evolutionary epistemology. In: The philosophy of K.R. Popper. La Salle: Open Court, p. 413-463.
Cooke Brown James, 1989. Loglan 1: a logical language. 4th ed. Gainesville: The Loglan institute.
Cowan John Woldemar, 1997. The complete Lojban language. Fairfax: The Logical language group.
CRG, 1969. Classification and information control. London: Library association.
Dahlberg Ingetraut, 1978. Ontical structures and universal classification. Bangalore: Sarada Ranganathan endowment for library science.
Dalgarno George, 1968. Ars signorum. Menston: Scolar.
Descartes René, 1953. Lettre au père Marin Mersenne, 20 novembre 1629. In: Oeuvres et lettres. Paris: Gallimard.
Eco Umberto, 1995. The search for the perfect language. Oxford: Blackwell.
Farradane Jason E.L., 1952. A scientific theory of classification and indexing: further considerations. “Journal of documentation”, 8, n. 2, p. 73-92.
Farradane Jason E.L., 1961. Fundamental fallacies and new needs in classification. In: The Sayers memorial volume, D.J. Foskett & B.I. Palmer eds. London: Library association, p. 120-135.
Feibleman James K., 1954. The integrative levels in nature. “British journal for the philosophy of science”, 1954, 5. Also in: Focus on information, Barbara Kyle ed. London: ASLIB, 1965, p. 27-41.
Foskett Douglas J., 1978. The theory of integrative levels and its relevance to the design of information systems. “ASLIB proceedings”, 30, n. 6, p. 202-208.
Gnoli Claudio, 2003a. La classificazione come investimento nella qualità dell’informazione. “AIB-web. Contributi”.
Gnoli Claudio, 2003b. Coordinazione, ordine di citazione e livelli integrativi in ambiente digitale. “Bibliotime”, 6, n. 1.
Gnoli Claudio & Poli Roberto. Levels of reality and levels of representation. Submitted for publication.
Hartmann Nicolai, 1964. Der Aufbau der realen Welt. Berlin: De Gruyter.
Hjørland Birger & Albrechtsen Hanne, 1999. An analysis of some trends in classification research. “Knowledge organization”, 26, n. 3, p. 131-139.
Huckaby Sarah A.S., 1972. An enquiry into the theory of integrative levels as the basis for a generalized classification scheme. “Journal of documentation”, 28, n. 2, p. 97-106.
Lorenz Konrad Z., 1977. Behind the mirror: a search for a natural history of human knowledge. London: Methuen. Orig. ed.: Die Rückseite des Spiegels, München: Piper, 1973.
Motluk Alison, 2002. You are what you speak. “New scientist”, 176, n. 2371, p. 34.
Olson Hope, 2002. The power to name: locating the limits of subject representation in libraries. Dordrecht: Kluwer.
Poli Roberto, 1996. Ontology for knowledge organization. In: Knowledge organization and change: proceedings of 4th International ISKO conference, Washington (DC), July 15-18 1996, R. Green ed. Frankfurt: Indeks, p. 313-319.
Poli Roberto, 2001. The basic problem of the theory of levels of reality. “Axiomathes”, 12, n. 3-4, p. 261-283.
Popper Karl R., 1974. Objective knowledge: an evolutionary approach. Oxford: Clarendon.
Ranganathan Shiyali Ramamrita, 1967. Prolegomena to library classification. 3rd ed. Bangalore: Sarada Ranganathan endowment for library science.
Wilkins John, 1968. An essay towards a real character and a philosophical language, 1668. Menston: Scolar.