Set Theory And Russell's Paradox

In a previous post, we looked at notation for functions and sets. In that post, we knowingly skirted around the definitions of some key terms, and here we’re going to look at why this was necessary.

Mandelbrot Set Fractal Pattern

To recap, here are the key terms from our last discussion that we did not define rigorously, along with an informal definition of each:

set – A collection of items
element – An individual item to be considered for set inclusion
is a member of – A phrase denoted \(\in\) which indicates that a particular element belongs in a particular set

Notice how difficult it is to define set, element, and “is a member of” without using the terms themselves.

What we are pondering here is how to go about doing set theory. If we are not careful about how we define sets, elements, and set membership, then we will encounter paradoxes that break the set theory framework we are trying to construct.

What Qualifies As A Set?

Georg Cantor was a pioneer of set theory in the late 1800’s

Allowing any collection of items to qualify as a set is problematic – so much so that this approach is called naive set theory.

To do set theory rigorously, we’d need to have the following things settled (as well as other axioms we will not discuss here):

A rigorous definition of set, element, and \( \in \); or an agreement to let these be undefined notions
Given any entity \(S\), to be able to determine if \(S\) is a well-formed set
Given any entity \(x\), to be able to determine if \(x\) is an element within the universe of discourse
Given any element \(a\) and any set \(A\), to be able to determine unambiguously whether \(a \in A \)

In this post (and in future posts in this series) we will let set, element, and \(\in\) be undefined notions that are commonly understood; and we’ll also assume the rest of the above criteria are true on assumption. While this does not constitute a formal set theory, making these assumptions will allow us to explore the world of set theory efficiently.

Note in the criteria above we use the term “well-formed set”. This is the key term that we are focusing on in this post. As we will see, one should be careful about which collections of items are allowed to be called sets.

The Set Of All Sets

If any collection of items is allowed constitute a set, then certainly collections of sets would qualify as sets. The logical conclusion to this line of thought is that the collection of all sets would itself be a set.

Let’s suppose that \(\Omega\) is the set of all sets; and let’s consider the question, “Is \(\Omega\) a member of itself?” In other words, is \( \Omega \in \Omega \)? In naive set theory, the answer is yes. Since we are considering \(\Omega\) a set, it follows that it belongs to the “set of all sets” and therefore we have \(\Omega \in \Omega\).

Once we consider arbitrary collections of sets to be sets themselves, we are very close to a paradox. That is, we are close to stumbling upon a statement that can not possibly be true and cannot possibly be false within our system.

The Liar Paradox

Before diving head-first into Russell’s Paradox, we are going to look at a more basic example which does not require any set theory.

Logical paradoxes are a phenomenon that require one’s brain to “go back and forth” in order to experience the contradiction in order to fully understand it.

The Liar’s Paradox is the simple statement, “This statement is false.” If we call this statement \(P\) for simplicity, then asking whether \(P\) is true or false is a great way to experience what a paradox feels like.

Let’s start by supposing \(P\) is true. If this is the case, then by its own assertion \(P\) is false. This is a contradiction, so \(P\) cannot be true.

On the other hand, suppose that \(P\) is false. Then looking again at the assertion that \(P\) makes, we see that this is a true assertion; and therefore \(P\) is true.

Re-read the above paragraphs if necessary in order to experience what the paradox feels like. We are asked to assign a truth value (i.e. true or false) to a statement but we soon realize that neither value is consistent. If \(P\) is true, this implies \(P\) is false; and if \(P\) is false, this implies \(P\) is true.

The Liar’s Paradox is an example that proves we can not expect to always be able to assign a truth value to every statement, even in a formal mathematical context. It’s a famous example that Kurt Gödel used as an analogy to explain his famous Incompleteness Theorem.

Overall, this simple paradox is part of a bigger historical realization that any formal axiomatic system will contain true statements that cannot be proven and also false statements that cannot be disproven.

Sets That Contain Themselves

Moving back to sets, within naive set theory we can come up with lots of “sets” that would ostensibly belong to themselves similar to the example \(\Omega\) above.

For example, let \(\mathcal{D}\) be the set of all dogs. Surely \(\mathcal{D} \notin \mathcal{D}\), for the set of all dogs is not itself a dog.

However, consider the complement of \(\mathcal{D}\), denoted \(\overline{\mathcal{D}}\), and defined as the set of all items not in \(\mathcal{D}\).

Under our naive definition of the word set, we allow that \(\overline{\mathcal{D}}\) is a set; and we also have \(\overline{\mathcal{D}} \in \overline{\mathcal{D}}\) since the “set of all things that are not dogs” is itself not a dog.

Russell’s Paradox

Bertrand Russell in 1916

In 1901, the field of formal set theory was relatively new to mathematics; and the pioneers in the field were essentially doing naive set theory. This is when Bertrand Russell published his famous paradox that showed everyone that naive set theory needed to be re-worked and made more rigorous.

Here is the paradox: Let \(\mathcal{R}\) be defined as the set of all sets which are not members of themselves; and consider the statement \(\mathcal{R} \in \mathcal{R}\).

Notice that the paradox only applies in naive set theory, for in formal set theory we would not allow \(\mathcal{R}\) to be considered a set.

Now, let’s consider both possible ways to determine whether \(\mathcal{R} \in \mathcal{R}\).

Option 1: \(\mathcal{R} \in \mathcal{R}\)

In this case, the set \(\mathcal{R}\) is a member of itself per our assumption. But the defining characteristic of membership in \(\mathcal{R}\) is that its elements are sets which do not belong to themselves; so this implies \(\mathcal{R} \notin \mathcal{R}\).

Option 2: \(\mathcal{R} \notin \mathcal{R}\)

In this case, we are assuming \(\mathcal{R}\) is not an element of itself. But this means that \(\mathcal{R} \in \mathcal{R}\), since by not containing itself as an element it in fact meets the criteria for membership in \(\mathcal{R}\).

And that is Russell’s Paradox!

Logically we have only two options, either \(\mathcal{R} \in \mathcal{R}\) or \(\mathcal{R} \notin \mathcal{R}\). But when we consider each option we see that \(\mathcal{R} \in \mathcal{R}\) implies \(\mathcal{R} \notin \mathcal{R}\) and also \(\mathcal{R} \notin \mathcal{R}\) implies \(\mathcal{R} \in \mathcal{R}\).

Conclusion

This post is part of a series which has the intent to relate mathematics to computer programming and to explore code via the lens of math. I have no idea how Russell’s Paradox (or paradoxes in general) relate to programming, but what we have seen is that word choice and definition choice are extremely important in any system.

One cannot simply say that “A set is a collection of items” and expect to stand on solid theoretical ground. However, the amount of background work necessary to have a formal set theory is so staggering that it does make sense to assume a somewhat naive definition of the word set in order to begin discussing them.

In math, understanding the underlying axioms (and sometimes the underlying definitions) is akin to understanding the inner workings of how a computer compiles or interprets the code one is writing. In both cases, one cannot get very far if they constantly question how everything works.

Imagine someone making a trip in a car and constantly investigating everything they touch in an effort to understand the underlying mechanics of the car. One can either drive the car or one can understand how the car works, but it is extremely inefficient to do both at the same time if the goal is to move from point \(A\) to point \(B\).

In the end, having a deep understanding of a system’s underlying structure is virtually a different area of study than the system itself, which makes it very difficult to simultaneously work within a system and also master its foundational rules and assumptions.

Does this mean that we are no longer going to talk about sets because we don’t fully understand all of the structure underlying set theory? Of course not! But we will use caution regarding the entities which we classify as a set, for otherwise we’d be setting ourselves up for paradoxical situations.