Why did mathematicians switch their definition of a function from Bernoulli’s definition to the one given in the modern set-theoretic view?
**-Matt Insall**
This essay discusses several topics that may have played a role in the choice, made by many mathematicians in the late nineteenth century and by almost all of them in the twentieth, to abandon the definition Bernoulli used for the concept of a function and to adopt the very abstract, set-theoretic one we now teach our graduate students.
Let me ``play'' with history a bit, by discussing the possibility of writing some fiction around this historical question. I shall imagine that I am a mathematician who has discovered that Bernoulli's definition does not work for me, and that I wish to demonstrate to the mathematical community why they should not accept his definition either. To do so, I would need to write a paper in which I present something a mathematician of Bernoulli's time would refer to as a function, but which fails to satisfy Bernoulli's definition of a function. (In fact, the best of all possible worlds for showing that the mathematical community should abandon Bernoulli's definition would be to find a work of Bernoulli in which my example, or a very similar example, is given, and in which Bernoulli refers to it as a function. In this case, I will have provided an example of a ``Bernoulli function'' that fails to be a ``Bernoulli function''.) But let us imagine that, during Bernoulli's lifetime, I am also writing my work down in a style that will not exist until the twentieth century.
I will begin with the Bernoulli concept of a function, but I shall, in a very twentieth-century manner, write my definition out explicitly, and number it, so it may be easily recalled at a later time, and so that careful delineations may be made.
Definition 1: Let X and Y be sets, and let f be a relation between X and Y, that is, a subset of the Cartesian product of X with Y. Then f is a ``Bernoulli function'' provided that there is an expression formed from variables and constants that gives a rule for one to decide for a given x in X which y in Y is paired with x in the relation f, and there is only one such y for each x in X for which there is such a y in Y.
I would also write a definition of function that pleases me more for my specific application, and I would call such things ``functions'', and show that every ``Bernoulli function'' is a ``function'' (according to my definition). Thus, in such a paper, I might write the following:
Definition 2: Let X and Y be sets, and let f be a relation between X and Y, that is, a subset of the Cartesian product of X with Y. Then f is a ``function'' provided that for each x in X, there is at most one element y of Y such that x is related to y via f.
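For finite sets, the condition in definition 2 can be checked mechanically. The following Python sketch is only an illustration: the name is_function and the sample sets are assumptions made for this example, and a relation is represented as a set of (x, y) pairs.

```python
def is_function(relation, X, Y):
    """Check definition 2 on finite sets.

    relation is a set of (x, y) pairs with x in X and y in Y.
    Returns True when each x in X is related to at most one y in Y.
    """
    images = {}
    for x, y in relation:
        if x not in X or y not in Y:
            raise ValueError("relation is not a subset of X x Y")
        images.setdefault(x, set()).add(y)
    # "At most one": an x with no partner at all is allowed.
    return all(len(ys) <= 1 for ys in images.values())


# Hypothetical sample data, chosen only for illustration.
X = {1, 2, 3}
Y = {"a", "b"}
partial = {(1, "a"), (2, "b")}       # 3 has no partner: still a function
two_valued = {(1, "a"), (1, "b")}    # 1 has two partners: not a function
```

Note that no ``expression'' appears anywhere in this check, which is exactly the difference between definition 2 and definition 1.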
Now, as I seem to have heard, the expressions Fourier wanted to use to define functions were Sine and Cosine series. I also heard that he boasted that any function from the reals to the reals can be so expressed, and this prompted another mathematician (Was it Dirichlet or Lagrange or Cauchy? I cannot remember.) to present an example of a function from the reals to the reals which has an everywhere divergent cosine series and an everywhere divergent sine series, or of a function f whose sine and cosine series converge everywhere to some function other than f. Others on this list may be able to confirm or correct my account, or fill in some details. The example given satisfies definition 2, but not definition 1. Or does it? I will show that there is a sense in which Bernoulli's definition is equivalent to mine in a trivial, but wholly unsatisfactory, way. To do so requires that I investigate what I would mean by ``expression'' in Bernoulli's definition. This has been looked at quite a bit in the twentieth century.
If we check through the work of Bernoulli's time, I think we will find that the term ``expression'' was used quite loosely. I approach it from a logic or universal algebra perspective. In logic, one considers the symbols of a language in a very formal way. Similarly, in universal algebra, language is highly formalized, so that there is actually a definition for the term ``expression'', and in both logic and universal algebra, the objects that are called ``expressions'' model quite accurately what one means when one uses that term in everyday mathematics. Putting all this together, I can ``show'' that if one accepts Bernoulli's definition of function for Sine and Cosine series, and then one is faced with an example of a specific function, which we shall call ``the strange function'', that is not representable as a Sine series or as a Cosine series, then one may easily demonstrate that the strange function is a Bernoulli function. (Statements of theorems will be in some cases somewhat informal, as was done in the days of Bernoulli.)
Theorem 1: Let X be the set of real numbers, and let S denote the strange function. Let L denote the language for analysis in which Sine and Cosine series are studied, in such a way that it is reasonable to refer to any function that has either a Sine series representation or a Cosine series representation as a Bernoulli function, and let E denote the collection of expressions of the language L. Then there is a language L' for analysis, with collection of expressions E', having the following properties:
(i) Every expression in E is also in E'.
(ii) The strange function is a Bernoulli function in the language L', meaning that in E', there is an expression that defines the strange function.
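The proof of this theorem is quite obvious to a beginning logic student, I would expect, so I shall not write it out in a very formal way. I'll briefly outline how it goes: Let the symbol S that denotes the strange function be appended to the language L as a constant, and then refer to the newly obtained language as the language L'. Every expression of L is still an expression of L', which gives (i), and the new symbol S is itself an expression in E' that defines the strange function, which gives (ii).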
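The construction in the proof can be mimicked in a toy setting. The Python sketch below is an illustration under loud assumptions: the class Language, the tuple encoding of expressions, and the function strange_stand_in are all hypothetical, and the stand-in is merely a placeholder for the strange function, not the strange function itself. The new symbol is treated here as a one-place symbol applied to the variable x.

```python
import math


class Language:
    """A toy language: named symbols together with their interpretations."""

    def __init__(self, symbols):
        self.symbols = dict(symbols)

    def extend(self, name, interpretation):
        """Return L', which is L with one more symbol appended by fiat."""
        new_symbols = dict(self.symbols)
        new_symbols[name] = interpretation
        return Language(new_symbols)

    def is_expression(self, expr):
        """An expression is the variable 'x', a number, or (symbol, subexpr, ...)."""
        if isinstance(expr, (int, float)) or expr == "x":
            return True
        if isinstance(expr, tuple) and expr and isinstance(expr[0], str):
            return expr[0] in self.symbols and all(
                self.is_expression(e) for e in expr[1:]
            )
        return False

    def evaluate(self, expr, x):
        """Evaluate an expression of this language at the real number x."""
        if not self.is_expression(expr):
            raise ValueError("not an expression of this language")
        if isinstance(expr, (int, float)):
            return expr
        if expr == "x":
            return x
        head, *args = expr
        return self.symbols[head](*(self.evaluate(a, x) for a in args))


def strange_stand_in(t):
    """Placeholder only: NOT the strange function of the essay."""
    return 1.0 if t >= 0 else -1.0


# L: a small fragment of a "language of analysis" (an assumption for the demo).
L = Language({
    "sin": math.sin,
    "cos": math.cos,
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
})
L_prime = L.extend("S", strange_stand_in)

old_expr = ("add", ("sin", "x"), ("mul", 2, ("cos", "x")))  # in E, hence in E'
new_expr = ("S", "x")                                        # in E' but not in E
```

Here old_expr is accepted by both L and L_prime, matching property (i), while new_expr is accepted only by L_prime, matching property (ii). The triviality of extend is the point: nothing about the new symbol has been explained, it has only been named.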
Now, there is a philosophically burning point here that has been overlooked. In particular, the language L' is obtained by fiat, and in some sense this is unsatisfying. But, in fact, it seems to me that this theorem highlights a different kind of troubling feature of the Bernoulli definition of a function, one that is philosophically related to the classical problems that resemble the Sorites problems. In particular, definition 1 is ambiguous. I can argue that it should have appeared overly ambiguous even at the time of Bernoulli to those who cared about reducing ambiguity. Indeed, the mere observation that someone as brilliant as Fourier missed out on the construction of the strange function indicates to me that he fell victim to the ambiguity of definition 1, which was essentially the inherent ambiguity of the notion of a function as it was used at that time, and as it still is used in many books outside rigorous mathematics.
Consider the problem that is mentioned in many first-year calculus books, for which an answer is given, but a solution is never presented:
``If possible, express the antiderivative of e^(-x^2) in terms of elementary functions.''
In fact, calculus books point out that this problem is insoluble, in the sense that there is no ``expression'' for the antiderivative of e^(-x^2) in terms of elementary functions. The text we now use for calculus (Stewart) includes a section in which the students form a new language, for computing and approximating integrals, that is analogous to the language L' in theorem 1 above, if the ``language of elementary functions'' is taken as L. The corresponding theorem would merely indicate that it is possible to be coherent when writing such a section into a calculus text, because the question of whether there is a solution to a problem is relative to the tools at hand. (Given only the language of analysis of elementary functions, the ``collection of expressions'' does not include the antiderivative of e^(-x^2), even though in that language, one may fairly easily demonstrate that such an antiderivative exists. Given the powerful language of analysis of trigonometric series, many applied problems can be solved, and perhaps Fourier would have been considered to be correct had he said ``every function applicable to some physical process can be expressed in a Sine or Cosine series'', for I do not think that the strange function appears very often as anything more than a curiosity in the physical sciences.)
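A familiar instance of such an enlarged language is the error function erf, a named symbol that satisfies the identity: the integral of e^(-t^2) from 0 to x equals (sqrt(pi)/2) erf(x). The Python sketch below compares a direct numerical approximation of the integral with the value obtained through math.erf from the standard library. The function names and the choice of Simpson's rule are assumptions made for illustration; this is not taken from any particular textbook.

```python
import math


def integrate_gaussian(x, n=1000):
    """Approximate the integral of exp(-t^2) from 0 to x by composite Simpson."""
    if n % 2:
        n += 1  # Simpson's rule needs an even number of subintervals
    h = x / n
    total = math.exp(0.0) + math.exp(-x * x)  # endpoint terms
    for i in range(1, n):
        t = i * h
        weight = 4 if i % 2 else 2            # interior weights alternate 4, 2
        total += weight * math.exp(-t * t)
    return total * h / 3


def antiderivative_via_erf(x):
    """The same quantity written with the added symbol erf."""
    return math.sqrt(math.pi) / 2 * math.erf(x)
```

For moderate x the two values agree closely, yet only the second is an ``expression'', and only because erf was admitted to the language.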
Now, if we take Bernoulli seriously, but grant him the courtesy due to anyone who is human, namely that he had something worthwhile in mind even if he did not properly express it, then we might say that what Bernoulli meant by the term ``expression'' was something like the meaning we have used above, but with the fundamental building blocks of his ``language of analysis'' being the same ones that were used by, for instance, Baire, in classifying functions according to their level of complexity as limits of functions ``previously described''. In this case, I would say that he would count the strange function as given by an expression, for pointwise limits were taken all the time, and, as I recall, the strange function is constructed using those very tools of analysis, such as pointwise limits of sequences of functions defined on the real line.
Goedel and Cohen each dealt this problem a tremendous blow. Goedel did so with his discovery of the axiom of constructibility. In that study, he proved that if the theory ZF is consistent, then so is ZFC+GCH. But he went further, and designed a set-theoretic universe in which every set is given by expressions in terms of previously defined sets. The axiom of constructibility, denoted V=L, states that every set is constructible. His actual result was that ZF+V=L is consistent, as long as ZF is consistent. In models of ZF+V=L, every set, relation and function is given by an expression, so this brings definitions 1 and 2 together completely. However, Cohen showed that if ZF is consistent, then so is ZF+not(V=L). That is, it is also okay to assume that definitions 1 and 2 describe different functions. Now the ``expressions'' are not at all what Bernoulli and Fourier envisioned. They are developed in terms of transfinite ordinals, and a well-ordering of the universe ensues. This makes the assumption that everything is given by an expression take on a new, unintended meaning, by creating a setting in which many counterintuitive results can be proved, because of the Generalized Continuum Hypothesis (GCH) and the definable well-ordering of the universe of sets. Goedel and Cohen, and now other mathematicians, such as Woodin, have decided that the axiom of constructibility goes too far, and, for example, Woodin suggests that the Continuum Hypothesis (CH) should be violated in such a way that the cardinality of the continuum is aleph_2.
I take a different stance. I accept ZFC, but I strongly reject the axiom of constructibility. I contend that in some sense, ``most'' functions (and sets and relations, etc.) are not given by expressions.
In a sense, I deny the axiom of constructibility ``locally everywhere''. In particular, I suggest that the cardinality of the continuum should be aleph_{2^aleph_0}. This is as large as it can theoretically be, and this means that the types of subsets of the real line are as varied as they can possibly be, in any model of ZFC.