*Credit: Jürgen Hermann, Horst Hansen*

The simplest approach is clear, fast, and general (it works for any sequence, not just strings, and for any container on which you can test for membership, not just sets):

    def containsAny(seq, aset):
        """ Check whether sequence seq contains ANY of the items in aset. """
        for c in seq:
            if c in aset:
                return True
        return False
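As a quick illustration of that generality, the following sketch repeats the function (so that the block is self-contained) and calls it with several kinds of arguments. The sample values are made up purely for illustration:

```python
def containsAny(seq, aset):
    """ Check whether sequence seq contains ANY of the items in aset. """
    for c in seq:
        if c in aset:
            return True
    return False

# seq can be any iterable; aset can be anything that supports "in"
print(containsAny("hello", "xyz"))            # False: string vs. string
print(containsAny([1, 2, 3], set([3, 4])))    # True: list vs. set
print(containsAny((1, 2), {2: "two"}))        # True: tuple vs. dict (tests the keys)
print(containsAny(iter("abc"), ["c", "d"]))   # True: iterator vs. list
```

Membership tests on a list or string take time proportional to the container's length, so passing a real set (or a dict) as `aset` is likely to be the faster choice when `aset` is large.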

You can gain a little speed by moving to a higher-level, more sophisticated approach, based on the `itertools` standard library module, essentially expressing the same approach in a different way:

    import itertools
    def containsAny(seq, aset):
        for item in itertools.ifilter(aset.__contains__, seq):
            return True
        return False

Most problems related to sets are best handled by using the `set` built-in type introduced in Python 2.4 (if you're using Python 2.3, you can use the equivalent `sets.Set` type from the Python Standard Library). However, there are exceptions. Here, for example, a pure set-based approach would be something like:

    def containsAny(seq, aset):
        return bool(set(aset).intersection(seq))

However, with this approach, every item in `seq` inevitably has to be examined. The functions in this recipe's Solution, on the other hand, "short-circuit": they return as soon as they know the answer. They must still check every item in `seq` when the answer is `False` (we could never affirm that no item in `seq` is a member of `aset` without examining all the items, of course). But when the answer is `True`, we often learn about that very soon, namely as soon as we examine one item that *is* a member of `aset`. Whether this matters at all is very data-dependent, of course. It will make no practical difference when `seq` is short, or when the answer is typically `False`, but it may be extremely important for a very long `seq` (when the answer can typically be soon determined to be `True`).
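To make the difference concrete, here is a minimal sketch that counts how many items of `seq` each approach actually consumes. The helper `counted` and the name `containsAnyBySet` are hypothetical, introduced only for this illustration, and the data is chosen so that the very first item is a match:

```python
def containsAny(seq, aset):
    for c in seq:
        if c in aset:
            return True
    return False

def containsAnyBySet(seq, aset):
    # the pure set-based variant, renamed here so both can coexist
    return bool(set(aset).intersection(seq))

def counted(iterable, counter):
    """ Yield the items of iterable, recording in counter[0] how many were consumed. """
    for item in iterable:
        counter[0] += 1
        yield item

aset = set([0])
loop_count = [0]
set_count = [0]
containsAny(counted(range(100000), loop_count), aset)
containsAnyBySet(counted(range(100000), set_count), aset)
print(loop_count[0])   # 1: the loop returns at the first match
print(set_count[0])    # 100000: intersection consumes the whole sequence
```

With data in which no item matches, both counters would end up at 100000, which is the case where short-circuiting buys nothing.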

The first version of `containsAny` presented in the recipe has the advantage of simplicity and clarity: it expresses the fundamental idea with total transparency. The second version may appear to be "clever", and that is not a complimentary adjective in the Python world, where simplicity and clarity are core values. However, the second version is well worth considering, because it shows a higher-level approach, based on the `itertools` module of the standard library. Higher-level approaches are most often preferable to lower-level ones (although the issue is moot in this particular case). `itertools.ifilter` takes a predicate and an iterable, and yields the items in that iterable that satisfy the predicate. Here, as the predicate, we use `aset.__contains__`, the bound method that is internally called when we code `in aset` for membership testing. So, if `ifilter` yields anything at all, it yields an item of `seq` that is also a member of `aset`, so we can `return True` as soon as this happens. If we get to the statement following the `for`, it must mean the `return True` never executed, because no items of `seq` are members of `aset`, so we can `return False`.
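Note that `itertools.ifilter` exists only in Python 2. Assuming you are running Python 3 instead, where the built-in `filter` is lazy in the same way `ifilter` was, the same idea can be sketched as follows. The `any`-based variant (under the hypothetical name `containsAnyBuiltin`) is an alternative spelling added for comparison, not part of the recipe:

```python
def containsAny(seq, aset):
    # In Python 3 the built-in filter returns a lazy iterator,
    # so the loop body runs as soon as the first match is found
    for item in filter(aset.__contains__, seq):
        return True
    return False

def containsAnyBuiltin(seq, aset):
    # any() also stops consuming the generator at the first true result
    return any(item in aset for item in seq)

vowels = set("aeiou")
print(containsAny("hello", vowels))          # True
print(containsAny("rhythm", vowels))         # False
print(containsAnyBuiltin("hello", vowels))   # True
```

Both variants short-circuit, so the choice between them is mostly a matter of taste.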

If your application needs some function such as `containsAny` to check whether a string (or other sequence) contains any members of a set, you may also need such variants as:

    def containsOnly(seq, aset):
        """ Check whether sequence seq contains ONLY items in aset. """
        for c in seq:
            if c not in aset:
                return False
        return True

`containsOnly` is the same function as `containsAny`, but with the logic turned upside-down. Other apparently similar tasks don't lend themselves to short-circuiting (they intrinsically need to examine all items) and so are best tackled by using the built-in type `set` (in Python 2.4; in 2.3, you can use `sets.Set` in the same way):

    def containsAll(seq, aset):
        """ Check whether sequence seq contains ALL the items in aset. """
        return not set(aset).difference(seq)

If you're not accustomed to using the `set` (or `sets.Set`) method `difference`, be aware of its semantics: for any `set` *a*, `a.difference(b)` (just like `a-set(b)`) returns the set of all elements of `a` that are not in `b`. For example:

    >>> L1 = [1, 2, 3, 3]
    >>> L2 = [1, 2, 3, 4]
    >>> set(L1).difference(L2)
    set([])
    >>> set(L2).difference(L1)
    set([4])

which hopefully helps explain why:

    >>> containsAll(L1, L2)
    False
    >>> containsAll(L2, L1)
    True

(In other words, don't confuse `difference` with another method of `set`, `symmetric_difference`, which returns the set of all items that are in *either* argument and not in the other.)
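A small side-by-side sketch may help keep the two methods apart. The sample sets are made up for illustration, and the results are sorted into lists only so that the printed output does not depend on set ordering or on the Python version's `repr` of sets:

```python
a = set([1, 2, 3])
b = set([3, 4])

# difference is one-sided: items of the receiver that are not in the argument
print(sorted(a.difference(b)))            # [1, 2]
print(sorted(b.difference(a)))            # [4]

# symmetric_difference is two-sided: items in exactly one of the two sets
print(sorted(a.symmetric_difference(b)))  # [1, 2, 4]
print(sorted(b.symmetric_difference(a)))  # [1, 2, 4]
```

Because `difference` is one-sided, the order of the arguments to `containsAll` matters, as the interactive session with `L1` and `L2` shows.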

When you're dealing specifically with (plain, *not* Unicode) strings for both `seq` and `aset`, you may not need the full generality of the functions presented in this recipe, and may want to try the more specialized approach explained in Recipe 1.10 based on strings' method `translate` and the `string.maketrans` function from the Python Standard Library. For example:

    import string
    notrans = string.maketrans('', '')  # identity "translation"
    def containsAny(astr, strset):
        return len(strset) != len(strset.translate(notrans, astr))
    def containsAll(astr, strset):
        return not strset.translate(notrans, astr)

This somewhat tricky approach relies on `strset.translate(notrans, astr)` being the subsequence of `strset` that is made of characters not in `astr`. When that subsequence has the same length as `strset`, no characters have been removed by `strset.translate`, therefore no characters of `strset` are in `astr`. Conversely, when the subsequence is empty, all characters have been removed, so all characters of `strset` are in `astr`. The `translate` method keeps coming up naturally when one wants to treat strings as sets of characters, because it's speedy as well as handy and flexible; see Recipe 1.10 for more details.
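The two-argument `translate` call and `string.maketrans` are specific to Python 2 plain strings. Assuming you are on Python 3, where `str.translate` takes a single table, a comparable sketch builds a deletion table with the three-argument form of `str.maketrans`. This is an adaptation of the same idea, not the recipe's own code:

```python
def containsAny(astr, strset):
    # Table that deletes every character occurring in astr
    delete_astr = str.maketrans('', '', astr)
    # If anything was deleted, some character of strset is in astr
    return len(strset) != len(strset.translate(delete_astr))

def containsAll(astr, strset):
    delete_astr = str.maketrans('', '', astr)
    # If nothing is left, every character of strset is in astr
    return not strset.translate(delete_astr)

print(containsAny("hello world", "xyzw"))   # True: 'w' occurs in astr
print(containsAny("hello world", "xyz"))    # False
print(containsAll("hello world", "low"))    # True
print(containsAll("hello world", "lowz"))   # False: 'z' is missing
```

As in the original, neither function short-circuits: `translate` always processes all of `strset`.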

These two sets of approaches to the recipe's tasks have very different levels of generality. The earlier approaches are very general: not at all limited to string processing, they make rather minimal demands on the objects you apply them to. The approach based on the `translate` method, on the other hand, works only when both `astr` and `strset` are strings, or *very* closely mimic plain strings' functionality. Not even Unicode strings suffice, because the `translate` method of Unicode strings has a signature that is different from that of plain strings: a single argument (a `dict` mapping code numbers to Unicode strings or `None`) instead of two (both strings).
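As a sketch of that single-argument signature, the following uses Python 3's `str`, which behaves like Python 2's Unicode strings in this respect (running it on Python 3 is an assumption of this example). Mapping a code number to `None` deletes the character; mapping it to a string replaces it:

```python
strset = "aeiou"
astr = "hello"

# dict.fromkeys gives every key the value None, so each character
# of astr is mapped to "delete"
table = dict.fromkeys(map(ord, astr))
print(strset.translate(table))   # aiu: the vowels that do not occur in "hello"

# A value may also be a replacement string
print("abc".translate({ord("b"): None, ord("c"): "CC"}))   # aCC
```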

See also: Recipe 1.10; documentation for the `translate` method of strings and Unicode objects, and `maketrans` function in the `string` module, in the *Library Reference* and *Python in a Nutshell*; ditto for documentation of built-in `set` (Python 2.4 only), modules `sets` and `itertools`, and the special method `__contains__`.
