15 The Axiom Of Choice

For the deepest results about partially ordered sets we need a new set-theoretic tool; we interrupt the development of the theory of order long enough to pick up that tool.

We begin by observing that a set is either empty or it is not, and, if it is not, then, by the definition of the empty set, there is an element in it. This remark can be generalized. If \(X\) and \(Y\) are sets, and if one of them is empty, then the Cartesian product \(X \times Y\) is empty. If neither \(X\) nor \(Y\) is empty, then there is an element \(x\) in \(X\), and there is an element \(y\) in \(Y\); it follows that the ordered pair \((x, y)\) belongs to the Cartesian product \(X \times Y\), so that \(X \times Y\) is not empty. The preceding remarks constitute thes cases \(n = 1\) and \(n = 2\) of the following assertion: if \(\{ X_{i} \}\) is a finite sequence of sets, for \(i\) in \(n\), say, then a necessary and sufficient condition that their Cartesian product be empty is that at least one of them be empty. The assertion is easy to prove by induction on \(n\). (The case \(n = 0\) leads to a slippery argument about the empty function; the uninterested reader may start his induction at \(1\) instead of \(0\).)

The generalization to infinite families of the non-trivial part of the assertion in the preceding paragraph (necessity) is the following important principle of set theory.

Axiom 15.1 (Axiom of choice) The Cartesian product of a non-empty family of non-empty sets is non-empty.

In other words: if \(\{ X_{i} \}\) is a family of non-empty sets indexed by a non-empty set \(I\), then there exists a family \(\{ x_{i} \}\), \(i \in I\), such that \(x_{i} \in X_{i}\) for each \(i\) in \(I\).

Suppose that \(\mathcal{C}\) is a non-empty collection of non-empty sets. We may regard \(\mathcal{C}\) as a family, or, to say it better, we can convert \(\mathcal{C}\) into an indexed set just by using the collection \(\mathcal{C}\) itself in the role of the index set and using the identity mapping on \(\mathcal{C}\) in the role of the indexing. The axiom of choice then says that the Cartesian product of the sets of \(\mathcal{C}\) has at least one element. An element of such a Cartesian product is, by definition, a function (family, indexed set) whose domain is the index set (in this case \(\mathcal{C}\)) and whose value at each index belongs to the set bearing that index. Conclusion: there exists a function \(f\) with domain \(\mathcal{C}\) such that if \(A \in \mathcal{C}\), then \(f(A) \in A\). This conclusion applies, in particular, in case \(\mathcal{C}\) is the collection of all non-empty subsets of a non-empty set \(X\). The assertion in that case is that there exists a function \(f\) with domain \(\mathcal{P}(X) - \{ \emptyset \}\) such that if \(A\) is in that domain, then \(f(A) \in A\). In intuitive language the function \(f\) can be described as a simultaneous choice of an element from each of many sets; this is the reason for the name of the axiom. (A function that in this sense “chooses” an element out of each non-empty subset of a set \(X\) is called choice function for \(X\).) We have seen that if the collection of sets we are choosing from is finite, then the possibility of simultaneous choice is an easy consequence of what we knew before the axiom of choice was even stated; the role of the axiom is to guarantee that possibility in infinite cases.

The two consequences of the axiom of choice in the preceding paragraph (one for the power set of a set and the other for more general collections of sets) are in fact just reformulations of that axiom. It used to be considered important to examine, for each consequence of the axiom of choice, the extent to which the axiom is needed in the proof of the consequence. An alternative proof without the axiom of choice spelled victory; a converse proof, showing that the consequence is equivalent to the axiom of choice (in the presence of the remaining axioms of set theory) meant honorable defeat. Anything in between was considered exasperating. As a sample (and an exercise) we mention the assertion that every relation includes a function with the same domain. Another sample: if \(\mathcal{C}\) is a collection of pairwise disjoint non-empty sets, then there exists a set \(A\) such that \(A \cap C\) is a singleton for each \(C\) in \(\mathcal{C}\). Both these assertions are among the many known to be equivalent to the axiom of choice.

As an illustration of the use of the axiom of choice, consider the assertion that if a set is infinite, then it has a subset equivalent to \(\omega\). An informal argument might run as follows. If \(X\) is infinite, then, in particular, it is not empty (that is, it is not equivalent to \(0\)); hence it has an element, say \(x_{0}\). Since \(X\) is not equivalent to \(1\), the set \(X - \{ x_{0} \}\) is not empty; hence it has an element, say \(x_{1}\). Repeat this argument ad infinitum; the next step, for instance, is to say that \(X - \{ x_{0}, x_{1} \}\) is not empty, and, therefore, it has an element, say \(x_{2}\). The result is an infinite sequence \(\{ x_{n} \}\) of distinct elements of \(X\); q.e.d. This sketch of a proof at least has the virtue of being honest about the most important idea behind it; the act of choosing an element from a non-empty set was repeated infinitely often. The mathematician experienced in the ways of the axiom of choice will often offer such an informal argument; his experience enables him to see at a glance how to make it precise. For our purposes it is advisable to take a longer look.

Let \(f\) be a choice function for \(X\); that is, \(f\) is a function from the collection of all non-empty subsets of \(X\) to \(X\) such that \(f(A) \in A\) for all \(A\) in the domain of \(f\). Let \(\mathcal{C}\) be the collection of all finite subsets of \(X\). Since \(X\) is infinite, it follows that if \(A \in \mathcal{C}\), then \(X - A\) is not empty, and hence that \(X - A\) belongs to the domain of \(f\). Define a function \(g\) from \(\mathcal{C}\) to \(\mathcal{C}\) by writing \(g(A) = A \cup \{ f(X - A) \}\). In words: \(g(A)\) is obtained by adjoining to \(A\) the element that \(f\) chooses from \(X - A\). We apply the recursion theorem to the function \(g\); we may start it rolling with, for instance, the set \(\emptyset\). The result is that there exists a function \(U\) from \(\omega\) into \(\mathcal{C}\) such that \(U(0) = \emptyset\) and \(U(n^{+}) = U(n) \cup \{ f(X - U(n)) \}\) for every natural number \(n\). Assertion: if \(v(n) = f(X - U(n))\), then \(v\) is a one-to-one correspondence from \(\omega\) to \(X\), and hence, indeed, \(\omega\) is equivalent to some subset of \(X\) (namely the range of \(v\)). To prove the assertion, we make a series of elementary observations; their proofs are easy consequences of the definitions. First: \(v(n) \notin U(n)\) for all \(n\). Second: \(v(n) \in U(n^{+})\) for all \(n\). Third: if \(n\) and \(m\) are natural numbers and \(n \le m\), then \(U(n) \subset U(m)\). Fourth: if \(n\) and \(m\) are natural numbers and \(n < m\), then \(v(n) \neq v(m)\). (Reason: \(v(n) \in U(m)\) but \(v(m) \notin U(m)\).) The last observation implies that \(v\) maps distinct natural numbers onto distinct elements of \(X\); all we have to remember is that of any two distinct natural numbers one of them is strictly smaller than the other.

The proof is complete; we know now that every infinite set has a subset equivalent to \(\omega\). This result, proved here not so much for its intrinsic interest as for an example of the proper use of the axiom of choice, has an interesting corollary. The assertion is that a set is infinite if and only if is equivalent to proper subset of itself. The “if” we already know; it says merely that a finite set cannot be equivalent to a proper subset. To prove the “only if,” suppose that \(X\) is infinite, and let \(v\) be a one-to-one correspondence from \(\omega\) into \(X\). If \(x\) is in the range of \(v\), say \(x = v(n)\), write \(h(x) = v(n^{+})\); if \(x\) is not in the range of \(v\), write \(h(x)= x\). It is easy to verify that \(h\) is a one-to-one correspondence from \(X\) into itself. Since the range of \(h\) is a proper subset of \(X\) (it does not contain \(v(0)\)), the proof of the corollary is complete. The assertion of the corollary was used by Dedekind as the very definition of infinity.