Sets: union, intersect and setdiff
There are three essential functions for manipulating sets. The principles are easy to see if we work with an example of two sets:
setA<-c("a", "b", "c", "d", "e")
setB<-c("d", "e", "f", "g")
Make a mental note of what the two sets have in common, and what is unique to each.
The union of two sets is everything in the two sets taken together, with elements that are common to both sets counted only once:
union(setA,setB)
[1] "a" "b" "c" "d" "e" "f" "g"
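Duplicates are also dropped when they occur within a single vector, not just across the two. The following is a minimal sketch; the vectors x and y are made up for illustration and are not part of the running example.

```r
# Hypothetical vectors, each containing a repeated element
x <- c("a", "a", "b")
y <- c("b", "c", "c")

# union() returns each distinct value once, in order of first appearance
union(x, y)
# [1] "a" "b" "c"
```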
The intersection of two sets is the material that they have in common:
intersect(setA,setB)
[1] "d" "e"
Note, however, that the difference between two sets is order-dependent. It is the material that is in the first-named set but not in the second-named set. Thus setdiff(A,B) generally gives a different answer from setdiff(B,A). For our example,
setdiff(setA,setB)
[1] "a" "b" "c"
setdiff(setB,setA)
[1] "f" "g"
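If you want the elements that belong to exactly one of the two sets, regardless of order, one option is to combine the two differences. This is only a sketch: the name symDiff is chosen here for illustration and is not a built-in function or part of the original example.

```r
setA <- c("a", "b", "c", "d", "e")
setB <- c("d", "e", "f", "g")

# Elements found in one set but not the other (hypothetical name symDiff)
symDiff <- union(setdiff(setA, setB), setdiff(setB, setA))
symDiff
# [1] "a" "b" "c" "f" "g"
```

Swapping setA and setB changes the order of the result but not its contents.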
Thus, it should be the case that setdiff(setA,setB) plus intersect(setA,setB) plus setdiff(setB,setA) is the same as the union of the two sets. Let's check:
all(c(setdiff(setA,setB),intersect(setA,setB),setdiff(setB,setA))==union(setA,setB))
[1] TRUE
There is also a built-in function setequal for testing whether two sets are equal:
setequal(c(setdiff(setA,setB),intersect(setA,setB),setdiff(setB,setA)),union(setA,setB))
[1] TRUE
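The check with all and == works in the example above only because the elements happen to come out in the same order on both sides. The == operator compares position by position, whereas setequal ignores order. A small sketch with two made-up vectors (x and y are not part of the original example) shows the difference:

```r
# Same elements, different order (hypothetical vectors)
x <- c("a", "b", "c")
y <- c("c", "b", "a")

# Element-by-element comparison: positions 1 and 3 differ
all(x == y)
# [1] FALSE

# Comparison as sets: order is irrelevant
setequal(x, y)
# [1] TRUE
```

For comparing sets, setequal is therefore usually the safer choice.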
You can use %in% for comparing sets. The result is a logical vector whose length matches that of the vector on the left:
setA %in% setB
[1] FALSE FALSE FALSE  TRUE  TRUE
setB ...
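Because %in% returns a logical vector, it can be used as a subscript to pick out elements of the left-hand vector. As a sketch using the two example sets, this recovers the intersection and the difference found earlier:

```r
setA <- c("a", "b", "c", "d", "e")
setB <- c("d", "e", "f", "g")

# Elements of setA that are also in setB (same as intersect(setA, setB) here)
setA[setA %in% setB]
# [1] "d" "e"

# Elements of setA that are not in setB (same as setdiff(setA, setB) here)
setA[!(setA %in% setB)]
# [1] "a" "b" "c"
```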