5. COMPUTATIONAL EXPERIENCE

Given a set of k training samples (x, y) ∈ T (for example, each x a sentence and each y a bit string of y_jp values), the goal is to find a set of weights w to maximize an objective function with two parts: one term R(w) incorporates constraints and minimizes errors on the training set, and one term Ω(w) favors simple weighting schemes to avoid overfitting to T. In the training phase the learning algorithm iterates to modify the weight set w incrementally. At each iteration a query is sent to the quantum annealing hardware to find an optimal solution to a QUBO instance (see Section 3.2.1) and thereby improve w.
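The shape of such a training step can be sketched in code. The toy construction below, in the spirit of a boosting-style formulation, builds a QUBO whose binary variables w_i select weak classifiers: the squared-error term plays the role of R(w) and a per-classifier penalty lam plays the role of Ω(w). The function names, and the brute-force solver standing in for the annealing hardware, are illustrative assumptions, not the authors' implementation.

```python
import itertools
import numpy as np

def build_qubo(H, y, lam):
    """Build a QUBO matrix Q for selecting weak classifiers.

    H   : (k, n) array, H[s, i] = output (+/-1) of weak classifier i on sample s.
    y   : (k,) array of +/-1 labels.
    lam : regularization strength (the Omega(w) term), penalizing each
          selected classifier to favor sparse weight sets.

    Minimizing w^T Q w over binary w is equivalent (up to a constant) to
    minimizing sum_s (sum_i w_i H[s, i] - y_s)^2 + lam * sum_i w_i.
    """
    k, n = H.shape
    Q = np.zeros((n, n))
    for i in range(n):
        # diagonal: quadratic self-term plus linear terms (w_i^2 = w_i for binary w_i)
        Q[i, i] = np.dot(H[:, i], H[:, i]) - 2 * np.dot(H[:, i], y) + lam
        for j in range(i + 1, n):
            # off-diagonal: pairwise correlation between classifiers i and j
            Q[i, j] = 2 * np.dot(H[:, i], H[:, j])
    return Q

def brute_force_qubo(Q):
    """Stand-in for the annealing hardware: exhaustively minimize w^T Q w."""
    n = Q.shape[0]
    best_w, best_e = None, float("inf")
    for bits in itertools.product([0, 1], repeat=n):
        w = np.array(bits)
        e = w @ Q @ w
        if e < best_e:
            best_w, best_e = w, e
    return best_w

# Classifier 0 agrees with y on every sample; classifier 1 does not.
H = np.array([[1, 1], [-1, 1], [1, -1]])
y = np.array([1, -1, 1])
w = brute_force_qubo(build_qubo(H, y, 1.0))
print(w)  # selects only the accurate classifier: [1 0]
```

In the scheme described above, the call to `brute_force_qubo` is the step replaced by a query to the quantum annealer at each iteration.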

The authors point out that theirs is an unusual objective function because it has quadratic terms, which allows it to account for pairwise correlations between elements of the test set. Their experiments test whether the introduction of a quadratic objective function yields better classifiers. They compare their approach to an alternative method involving support vector machines (SVM) that does not incorporate the quadratic interaction term.

Their evaluation of the classifier looks at two input classes, one with labels based on solutions to random satisfiability problems, and one from a standard test set in image recognition. Maximum instance sizes are |x| = 20, |y| = 34 in the random problems and |x| = 294, |y| = 6 for the image data. The quality of a learning algorithm is evaluated in terms of relative Hamming error, which measures the distance between true assignments and those returned by the classifier on the evaluation set R. On these tests their learning algorithm produced classifiers with 28 and 10 percent better relative Hamming error than those from the SVM approach.
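Relative Hamming error can be computed as in the sketch below. This uses one common definition (mismatched bits divided by total bits over the evaluation set); the source does not spell out the exact normalization used, so take the details as an assumption.

```python
def relative_hamming_error(y_true, y_pred):
    """Fraction of bit positions where the predicted label strings
    disagree with the true ones, over the whole evaluation set."""
    total = 0
    errors = 0
    for t, p in zip(y_true, y_pred):
        if len(t) != len(p):
            raise ValueError("label strings must have equal length")
        total += len(t)
        errors += sum(tb != pb for tb, pb in zip(t, p))
    return errors / total

# one wrong bit out of eight total
print(relative_hamming_error(["1010", "0011"], ["1110", "0011"]))  # 0.125
```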

More discussion of this use of D-Wave quantum annealers to train classiﬁers may be found

in Nevin et al. [76], Nevin et al. [77], and Nevin et al. [78].

5.1.2 FINDING RAMSEY NUMBERS

Suppose you have a guest list for a party of N people. For given m and n, you want to determine whether there is a subgroup of size m who are all mutually acquainted, forming a clique, or a subgroup of size n who are all strangers, forming an independent set. (If so, your party will be a flop because large cliques and large groups of strangers do not mingle well.)

A Ramsey number R(m, n) is a threshold value for N such that if N ≥ R(m, n) it must hold that all parties of size N will either contain m mutual acquaintances or n mutual strangers. The computation of R(m, n) is a notoriously difficult problem; for example, with n, m ≥ 3 only nine Ramsey numbers are known. Besides party planning, the problem has important applications in number theory, combinatorics, and complexity theory (see Roberts [90]).
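The definition can be checked by brute force for the smallest nontrivial case. The sketch below exhaustively verifies that every graph on 6 vertices contains a 3-clique or a 3-independent set, while some graph on 5 vertices (the 5-cycle) contains neither, which establishes R(3, 3) = 6. The function names are illustrative, and this exhaustive search is feasible only for tiny N, which is exactly why the general problem is so hard.

```python
from itertools import combinations, product

def has_clique_or_indep(adj, n_vertices, m, n):
    """True if the graph has an m-clique or an n-independent set."""
    for group in combinations(range(n_vertices), m):
        if all(adj[a][b] for a, b in combinations(group, 2)):
            return True  # found an m-clique
    for group in combinations(range(n_vertices), n):
        if all(not adj[a][b] for a, b in combinations(group, 2)):
            return True  # found an n-independent set
    return False

def ramsey_holds(N, m, n):
    """Check by exhaustion over all 2^(N choose 2) graphs on N vertices
    that each contains an m-clique or an n-independent set."""
    edges = list(combinations(range(N), 2))
    for bits in product([0, 1], repeat=len(edges)):
        adj = [[0] * N for _ in range(N)]
        for (a, b), bit in zip(edges, bits):
            adj[a][b] = adj[b][a] = bit
        if not has_clique_or_indep(adj, N, m, n):
            return False  # counterexample graph found
    return True

print(ramsey_holds(6, 3, 3))  # True:  every party of 6 has one or the other
print(ramsey_holds(5, 3, 3))  # False: the 5-cycle avoids both, so R(3, 3) = 6
```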

The complexity of this problem is unknown: Schaefer [95] shows that a more general version of the decision problem (for general subgraphs) is coNP^NP-complete, that is, complete for the second level of the polynomial hierarchy.

Gaitan and Clark [41] show that the problem of computing Ramsey numbers belongs in the quantum complexity class QMA (see Section 2.4). They describe a quantum annealing algorithm to compute Ramsey numbers, as follows. The first step is to cast the computation of R(m, n) as an optimization problem. The party list can be represented by a graph G on N vertices and M
