# Sets and functionsSet Operations

## Complement

Given a set describing a grocery list and a subset describing the set of items we've already purchased, the set we might be most interested in constructing from and is the set of items which are in but not in . This is called the **complement** of with respect to .

The complement of the set of groceries in the cart with respect to the set of groceries on the list is a meaningful set because those are the items

**Definition** (Complement)

If and are sets and , then the complement of with respect to , denoted or , is the set of all elements in that are not in . That is,

With the blob-and-point visualization:

Since is not part of the notation , we will usually only use that notation when the intended containing set is clear from context.

Sometimes you grab some items at the grocery store which were not on your list. Likewise, the notation may be used regardless of whether is a subset of . In this case, we use a different term: the **set difference** is defined to be the set of elements which are in which are not in .

**Exercise**

Suppose and . Find the complement of with respect to . It has

*Solution.* The complement is , since 1, 3, and 5 are the elements of which are not in .

**Exercise**

Suppose , , and . Then =

Is the assumption that necessary for the problem to be well-specified?

*Solution.* Since has 55 elements and has 13, then there are elements in which are not in .

The assumption is necessary, since if some of the elements of were not in , would be larger.

## Union

If two members of your household supplied you with grocery lists as you were about to go to the store, then the first thing you might want to do is produce a combined grocery list. This set operation is called taking the *union*.

**Definition** (Union)

The **union** of two sets and , denoted , is the set containing all the elements of and all the elements of and no other elements. In other words, if and only if either or .

**Exercise**

Let and . Find . It has

*Solution.* Listing all the elements of and all elements of and eliminating duplicates, we get

## Intersection

You realize that you and your partner inadvertently *both* made grocery lists and went grocery shopping the same afternoon. You want to know the items on both lists, because

The set of items which are in both sets is called the *intersection* of the two sets.

**Definition** (Intersection)

The **intersection** of two sets and , denoted , is the set consisting of elements that are in both and . In other words, if and only if and .

**Exercise** Let and . Then has

## Set operations with more sets

The union and intersection operations may be applied to any number of sets. Suppose are sets—the union of these sets can be expressed as .

More compactly,

Similarly, we can take the intersection of an arbitrary number of sets:

## Disjoint sets

Often we will want to specify whether two sets have any elements in common. For example, if is the set of vegetables you are interested in, and is the set of vegetables that your partner is interested in, then whether and have any overlap determines whether you will need to prepare separate vegetable dishes.

**Definition** (Disjoint)

Two sets and are **disjoint** if they do not have any elements in common.

In other words, and are disjoint if .

This definition extends to an arbitrary number of sets. We say that the sets are **pairwise disjoint** if any pair is disjoint (in other words, if whenever ).

**Exercise**

Find three sets , , and which have , but for which all of the intersections , , and are nonempty.

*Solution.* We can take , , and . These sets are pairwise *non*-disjoint, but there are no elements common to all three sets.

## Partitions

Suppose you're part of a group of shoppers working together to purchase the items on a single grocery list. A good idea is to *partition* the set of items you want to purchase into smaller sets so that each person can purchase only the items on their own set.

**Definition** (Partition)

A **partition** of a set is a collection of non-empty sets such that

and are disjoint.

**Exercise**

Find a partition of into three sets. Is there a partition of into *six* sets?

*Solution.* There are many partitions of into three sets. For example,

It is not possible to partition into six sets, because each set must have at least one element, and no pair of the sets can have any element in common.

## Cartesian Products

Suppose we perform an experiment which consists of flipping a coin and rolling a standard six-sided die. The outcome of the coin flip is an element of the set , and the outcome of the die roll is an element of the set . The set of all possible outcomes of the experiment is the set with the following elements.

(H,1) | (H,2) | (H,3) | (H,4) | (H,5) | (H,6) |

(T,1) | (T,2) | (T,3) | (T,4) | (T,5) | (T,6) |

We call this 12-element set the **Cartesian product** of and .

**Definition** (Cartesian Product)

If and are sets, then the **Cartesian product** of and is defined by

Likewise, if are sets, then

This set operation is ubiquitous in probability and data science applications, since it corresponds to the common act of combining multiple pieces of information into an ordered pair, an ordered triple, or a higher-order tuple.

For example, a patient data record might be an ordered quintuple of the form (first name, last name, date of birth, height, blood pressure reading). This record is in , where is the set of all *strings* (sequences of characters), is the set of all dates, is the set of positive length measures, and is the set of possible blood pressure readings.

**Exercise**

If and , then

*Solution.* In the coin-and-die example, the cardinality of the Cartesian product was 12, which is equal to the product of the cardinalities of the original sets. We listed the elements of in a way which suggests why this is the case: the elements of can always be arranged in a by grid.

Therefore, .

## Exercises

**Exercise**

Select the most appropriate set theory term for each of the following real-world scenarios.

- You have a list of patients which have a particular risk factor and a second list of patients who have another risk factor. You want to identify the patients with both risk factors.
- Your company is merging with another company and you want to combine your customer database with their customer database to get a collection of all of the customer records.
- You have a table containing information about all of the Champions League goals this year, and you want to look at the ones which were not scored by Ronaldo.
- You have 68 clients to call, and you want to split them among your four salespeople.

**Exercise**

Establish the first and third of the following four identities. Use the following strategy: show that the left-hand side is a subset of the right-hand side and vice versa. To demonstrate that , consider an element of and—assuming only that —apply reasoning to conclude that it must be in as well.

*Solution.* If an element is in , then it is in and it is either in or . This implies that either (i) and , or (ii) and . In other words, either or . In other words, . Therefore, the left-hand side is a

Conversely, if , then either or . In the former case, it is true that and that . Therefore, in this case. Similarly, in the latter case, we have and . Therefore, in this case as well. So the right-hand side is also a

For the third identity, note that if

then it is not true that is in the union of the 's. In other words, must be in *none* of the 's. This means that for each , the element is in its complement. It follows by the definition of intersection that

Similarly, if

then is in none of the 's, which in turn means that it is not in the union of the 's. Thus, is in the complement of the union of the 's.