8  Functions

If \(X\) and \(Y\) are sets, a function from (or on) \(X\) to (or into) \(Y\) is a relation \(f\) such that \(\text{dom } f = X\) and such that for each \(x\) in \(X\) there is a unique element \(y\) in \(Y\) with \((x, y) \in f\). The uniqueness condition can be formulated explicitly as follows: if \((x, y) \in f\) and \((x, z) \in f\), then \(y = z\). For each \(x\) in \(X\), the unique \(y\) in \(Y\) such that \((x, y) \in f\) is denoted by \(f(x)\). For functions this notation and its minor variants supersede the others used for more general relations; from now on, if \(f\) is a function, we shall write \(f(x) = y\) instead of \((x, y) \in f\) or \(xfy\). The element \(y\) is called the value that the function \(f\) assumes (or takes on) at the argument \(x\); equivalently we may say that \(f\) sends or maps or transforms \(x\) into \(y\). The words map or mapping, transformation, correspondence, and operator are among some of the many that are sometimes used as synonyms for function. The symbol

\[ f: X \rightarrow Y \]

is sometimes used as an abbreviation for “\(f\) is a function from \(X\) to \(Y\).” The set of all functions \(X\) to \(Y\) is a subset of the power set \(\mathcal{P}(X \times Y)\); it will be denoted by \(Y^X\).

The connotations of activity suggested by the synonyms listed above make some scholars dissatisfied with the definition according to which function does not do anything but merely is. This dissatisfaction is reflected in a different use of the vocabulary: function is reserved for the undefined object that is somehow active, and the set of ordered pairs that we have called the function is then called the graph of the function. It is easy to find examples of functions in the precise set-theoretic sense of the word in both mathematics and everyday life; all we have to look for is information, not necessarily numerical, in tabulated form. One example is a city directory; the arguments of the function are, in this case, the inhabitants of the city, and the values are their addresses.

For relations in general, and hence for functions in particular, we have defined the concepts of domain and range. The domain of a function \(f\) from \(X\) into \(Y\) is, by definition, equal to \(X\), but its range need not be equal to \(Y\); the range consists of those elements \(y\) of \(Y\) for which there exists an \(x\) in \(X\) such that \(f(x) = y\). If the range of \(f\) is equal to \(Y\), we say that \(f\) maps \(X\) onto \(Y\). If \(A\) is a subset of \(X\), we may want to consider the set of all those elements \(y\) of \(Y\) for which there exists an \(x\) in the subset \(A\) such that \(f(x) = y\). This subset of \(Y\) is called the image of \(A\) under \(f\) and is frequently denoted by \(f(A)\). The notation is bad but not catastrophic. What is bad about it is that if \(A\) happens to be both an element of \(X\) and a subset of \(X\) (an unlikely situation, but far from an impossible one), then the symbol \(f(A)\) is ambiguous. Does it mean the value of \(f\) at \(A\) or does it mean the set of values of \(f\) at the elements of \(A\)? Following normal mathematical custom, we shall use the bad notation, relying on context, and, on the rare occasions when it is necessary, adding verbal stipulations, to avoid confusion. Note that the image of \(X\) itself is the range of \(f\); the “onto” character of \(f\) can be expressed by writing \(f(X) = Y\).

If \(X\) is a subset of a set \(Y\), the function \(f\) defined by \(f(x) = x\) for each \(x\) in \(X\) is called the inclusion map (or embedding, or the injection) of \(X\) into \(Y\). The phrase “the function \(f\) defined by …” is a very common one in such contexts. It is intended to imply, of course, that there does indeed exist a unique function satisfying the stated condition. In the special case at hand this is obvious enough; we are being invited to consider the set of all those ordered pairs \((x, y)\) in \(X \times Y\) for which \(x = y\). Similar considerations apply in every case, and, following normal mathematical practice, we shall usually describe a function by describing its value \(y\) at each argument \(x\). Such a description is sometimes longer and more cumbersome than a direct description of the set (of ordered pairs) involved, but, nevertheless, most mathematicians regard the argument-value description as more perspicuous than any other.

The inclusion map of \(X\) into \(X\) is called the identity map on \(X\). (In the language of relations, the identity map on \(X\) is the same as the relation of equality in \(X\).) If, as before, \(X \subset Y\), then there is a connection between the inclusion map of \(X\) into \(Y\) and the identity map on \(Y\); that connection is a special case of a general procedure for making small functions out of large ones. If \(f\) is a function from \(Y\) to \(Z\), say, and if \(X\) is a subset of \(Y\), then there is a natural way of constructing a function \(g\) from \(X\) to \(Z\); define \(g(x)\) to be equal to \(f(x)\) for each \(x\) in \(X\). The function \(g\) is called the restriction of \(f\) to \(X\), and \(f\) is called an extension of \(g\) to \(Y\); it is customary to write \(g = f \mid X\). The definition of restriction can be expressed by writing \((f \mid X)(x) = f(x)\) for each \(x\) in \(X\); observe also that \(\text{ran }(f \mid X) = f(X)\). The inclusion map of a subset of \(Y\) is the restriction to that subset of the identity map on \(Y\).

Here is a simple but useful example of a function. Consider any two sets \(X\) and \(Y\), and define a function \(f\) from \(X \times Y\) onto \(X\) by writing \(f(x,y) = x\). (The purist will have noted that we should have written \(f((x, y))\) instead of \(f(x, y)\), but nobody ever does.) The function \(f\) is called the projection from \(X \times Y\) onto \(X\); if, similarly, \(g(x, y) = y\), then \(g\) is the projection from \(X \times Y\) onto \(Y\). The terminology here is at variance with an earlier one, but not badly. If \(R = X \times Y\), then what was earlier called the projection of \(R\) onto the first coordinate is, in the present language, the range of the projection \(f\).

A more complicated and correspondingly more valuable example of a function can be obtained as follows. Suppose \(R\) is an equivalence relation in \(X\), and let \(f\) be the function from \(X\) onto \(X/R\) defined by \(f(x) = x/R\). The function \(f\) is sometimes called the canonical map from \(X\) to \(X/R\).

If \(f\) is an arbitrary function, from \(X\) onto \(Y\), then there is a natural way of defining an equivalence relation \(R\) in \(X\); write \(aRb\) (where \(a\) and \(b\) are in \(X\)) in case \(f(a) = f(b)\). For each element \(y\) of \(Y\), let \(g(y)\) be the set of all those elements \(x\) in \(X\) for which \(f(x) = y\). The definition of \(R\) implies that \(g(y)\) is, for each \(y\), an equivalence class of the relation \(R\); in other words, \(g\) is a function from \(Y\) onto the set \(X/R\) of all equivalence classes of \(R\). The function \(g\) has the following special property: if \(u\) and \(v\) are distinct elements of \(Y\), then \(g(u)\) and \(g(v)\) are distinct elements of \(X/R\). A function that always maps distinct elements onto distinct elements is called one-to-one (usually a one-to-one correspondence). Among the examples above the inclusion maps are one-to-one, but, except in some trivial special cases, the projections are not. (Exercise: what special cases?)

To introduce the next aspect of the elementary theory of functions we must digress for a moment and anticipate a tiny fragment of our ultimate definition of natural numbers. We shall not find it necessary to define all the natural numbers now; all we need is the first three of them. Since this is not the appropriate occasion for lengthy heuristic preliminaries, we shall proceed directly to the definition, even at the risk of temporarily shocking or worrying some readers. Here it is: we define \(0\), \(1\), and \(2\) by writing

\[ 0 = \emptyset , \: 1=\{ \emptyset \} , \: \text{and} \: 2 = \{ \emptyset , \{ \emptyset \} \} . \]

In other words, \(0\) is empty, \(1\) is the singleton \(\{ 0 \}\), and \(2\) is the pair \(\{ 0, 1 \}\). Observe that there is some method in this apparent madness; the number of elements in the sets \(0\), \(1\), or \(2\) (in the ordinary everyday sense of the word) is, respectively, zero, one, or two.

If \(A\) is a subset of a set \(X\), the characteristic function of \(A\) is the function \(\chi\) from \(X\) to \(2\) such that \(\chi(x) = 1\) or \(0\) according as \(x \in A\) or \(x \in X - A\). The dependence of the characteristic function of \(A\) on the set \(A\) may be indicated by writing \(\chi_A\) instead of \(\chi\). The function that assigns to each subset \(A\) of \(X\) (that is, to each element of \(\mathcal{P}(X)\)) the characteristic function of \(A\) (that is an element of \(2^X\)) is a one-to-one correspondence between \(\mathcal{P}(X)\) and \(2^X\). (Parenthetically: instead of the phrase “the function that assigns to each \(A\) in \(\mathcal{P}(X)\) the element \(\chi_A\) in \(2^X\)” it is customary to use the abbreviation “the function \(A \rightarrow \chi_A\).” In this language, the projection from \(X \times Y\) onto \(X\), for instance, may be called the function \((x, y) \rightarrow x\), and the canonical map from a set \(X\) with a relation \(R\) onto \(X/R\) may be called the function \(x \rightarrow x/R\).)

Exercise 8.1 \((i)\) \(Y^{\emptyset}\) has exactly one element, namely \(\emptyset\), wheter \(Y\) is empty or not, and \((ii)\) if \(X\) is not empty, then \(\emptyset^X\) is empty.