16 Equivalence and Noninferiority Tests
Power calculations are similar to those for the Δ paradigm, with some
important differences.
Note that under the condition that $\mu_1 = (1 - p_a)\mu_2$, for given values of
$p_a$ and $p_0$,
$$E\left[\bar{X}_1 - (1 - p_0)\bar{X}_2\right] = (1 - p_a)\mu_2 - (1 - p_0)\mu_2 = (p_0 - p_a)\mu_2$$
As a result, the noncentrality parameter for the test statistic is
$$\frac{(p_0 - p_a)\,\mu_2\,\sqrt{n}}{\sigma\sqrt{1 + (1 - p_0)^2}}$$
Thus, in order to compute the probability of rejecting the null, estimates
of both σ and $\mu_2$ are required. Figure 2.7 shows the power curve for the
example, assuming $\mu_2 = 100$ and σ = 2.8. In this example,
Pr{Reject $H_0$ | $p_a$ = 0.0835} ≈ 0.051.
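The noncentrality parameter and the resulting rejection probability can be sketched numerically. The following is a minimal illustration, not the book's computation. It assumes equal group sizes, with n = 15 per group as a hypothetical value (the sample size is not stated in this section), p0 = 0.05 and a critical value of 1.701 as in the confidence limit example. It also assumes that the null is rejected when the one-sided upper confidence limit is at least zero. A normal approximation stands in for the noncentral t-distribution, so the result only roughly agrees with the quoted 0.051.

```python
import math
from statistics import NormalDist


def noncentrality(p0, pa, mu2, sigma, n):
    """(p0 - pa) * mu2 * sqrt(n) / (sigma * sqrt(1 + (1 - p0)^2))."""
    return (p0 - pa) * mu2 * math.sqrt(n) / (sigma * math.sqrt(1.0 + (1.0 - p0) ** 2))


def approx_power(p0, pa, mu2, sigma, n, t_crit):
    """Normal approximation to Pr{Reject H0 | pa}.

    ASSUMPTION: H0 is rejected when the standardized statistic is at least
    -t_crit (upper confidence limit >= 0). The statistic is treated as
    normal with mean equal to the noncentrality parameter and unit variance.
    """
    ncp = noncentrality(p0, pa, mu2, sigma, n)
    return 1.0 - NormalDist().cdf(-t_crit - ncp)


# mu2 = 100 and sigma = 2.8 come from the text; n = 15 is hypothetical.
ncp = noncentrality(p0=0.05, pa=0.0835, mu2=100.0, sigma=2.8, n=15)
power = approx_power(p0=0.05, pa=0.0835, mu2=100.0, sigma=2.8, n=15, t_crit=1.701)
print(f"noncentrality = {ncp:.4f}, approximate power = {power:.4f}")
```

An exact calculation would replace the normal distribution with a noncentral t-distribution with the appropriate degrees of freedom.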
Condence interval formulation:
The expression
Power
1
0.9
0.8
0.7
0.6
0.5
0.4
0.3
0.2
0.1
0
0.04 0.05 0.06 0.07
pa
0.08 0.09
FIGURE 2.7
Test 2.3, power curve for equivalence of two means—percent paradigm.
17Means
XpXtSE1
1021
()
−− +
−β
is a one-sided 100(1 − β) percent upper condence limit for μ
1
– (1 – p
0
)μ
2
.
From the example, the 95 percent upper confidence limit for
$\mu_1 - (1 - p_0)\mu_2$ is
$$\bar{X}_1 - (1 - p_0)\bar{X}_2 + t_{1-\beta}\,SE = 93.5 - (0.95)(100.0) + (1.701)(0.997) \approx 0.197.$$
Computational considerations:
SAS code
libname stuff 'H:\Personal Data\Equivalence & Noninferiority\Programs & Output';
data calc;
  set stuff.d20121104_test_2_3_prop;
run;
/* Summary statistics: means, standard deviations, and sample sizes */
proc means data = calc;
  var X1 X2 p0 beta;
  output out = onemean MEAN = xbar1 xbar2 p0val betaprob
         STD = sd1 sd2 N = n1 n2;
run;
data outcalc;
  set onemean;
  /* Standard error of xbar1 - (1-p0)*xbar2 */
  se = sqrt(sd1**2/n1 + ((1-p0val)**2)*sd2**2/n2);
  /* Satterthwaite-type approximate degrees of freedom */
  w1 = ((sd1**2/n1)/(sd1**2/n1 + sd2**2/n2))**2/(n1-1);
  w2 = ((sd2**2/n2)/(sd1**2/n1 + sd2**2/n2))**2/(n2-1);
  dfe = 1/(w1 + w2);
  /* Despite its name, lowlim holds the one-sided upper confidence limit */
  lowlim = xbar1 - (1-p0val)*xbar2 + tinv(1-betaprob,dfe)*se;
run;
proc print data = outcalc; /* has vars xbar1 xbar2 p0val betaprob n1 n2 se lowlim */
run;
The MEANS Procedure
Variable  Label   N       Mean    Std Dev    Minimum     Maximum
-----------------------------------------------------------------
X1        X1     25  8.8799300  1.0305582  7.3220509  11.6006410
X2        X2     25  9.6829027  1.0045857  7.4892100  12.0170228
p0        p0     25  0.1000000          0  0.1000000   0.1000000
beta      beta   25  0.0500000          0  0.0500000   0.0500000
-----------------------------------------------------------------
The SAS System 06:27 Sunday, November 4, 2012 2
Obs _TYPE_ _FREQ_ xbar1 xbar2 p0val betaprob sd1
1 0 25 8.88 9.68 0.10 0.05 1.03
Obs sd2 n1 n2 se w1 w2 dfe lowlim
1 1.00 25 25 0.27419 0.010955 .009891787 47.9688 0.62520
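As a cross-check on the listing, the same quantities can be recomputed from the summary statistics printed by PROC MEANS. The following is a minimal sketch, not part of the original program. The function name is hypothetical, and the critical value 1.6772 is an assumption: it is a table value for the 95th percentile of a t-distribution with about 48 degrees of freedom, standing in for the tinv call.

```python
import math


def upper_limit_from_summary(xbar1, xbar2, sd1, sd2, n1, n2, p0, t_crit):
    """Return (se, dfe, limit) using the same formulas as the SAS data step."""
    v1 = sd1 ** 2 / n1  # variance of the first sample mean
    v2 = sd2 ** 2 / n2  # variance of the second sample mean
    se = math.sqrt(v1 + (1.0 - p0) ** 2 * v2)
    w1 = (v1 / (v1 + v2)) ** 2 / (n1 - 1)
    w2 = (v2 / (v1 + v2)) ** 2 / (n2 - 1)
    dfe = 1.0 / (w1 + w2)  # approximate degrees of freedom
    limit = xbar1 - (1.0 - p0) * xbar2 + t_crit * se
    return se, dfe, limit


# Inputs copied from the PROC MEANS output; t_crit is an assumed table value.
se, dfe, limit = upper_limit_from_summary(
    xbar1=8.8799300, xbar2=9.6829027,
    sd1=1.0305582, sd2=1.0045857,
    n1=25, n2=25, p0=0.10, t_crit=1.6772,
)
print(f"se = {se:.5f}, dfe = {dfe:.4f}, limit = {limit:.4f}")
```

The results should agree with the se, dfe, and lowlim columns of the printed output to within rounding of the critical value.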
JMP Data Table and formulas (Figures 2.8, 2.9, and 2.10)
FIGURE 2.8
Test 2.3, JMP screen 1.
FIGURE 2.9
Test 2.3, JMP screen 2.
FIGURE 2.10
Test 2.3, JMP screen 3.
Test 2.4 Single Mean (Two-Sided)
Parameters:
μ = population mean
σ = population standard deviation
$\mu_L$ = lower acceptable limit for μ (L for “low”)
$\mu_H$ = upper acceptable limit for μ (H for “high”)
n = sample size
1 − β = power to reject the null if $\mu = \mu_L$ or $\mu = \mu_H$
Hypotheses:
$H_0$: $\mu < \mu_L$ OR $\mu > \mu_H$
$H_1$: $\mu_L \le \mu \le \mu_H$
Data:
$\bar{X}$ = sample mean
S = sample standard deviation
n = sample size
Critical value(s):
If
$$\bar{X} + t_{1-\beta}\frac{S}{\sqrt{n}} \ge \mu_L$$
and
$$\bar{X} - t_{1-\beta}\frac{S}{\sqrt{n}} \le \mu_H$$
where $t_{1-\beta}$ = 100*(1 − β) percentile of a central t-distribution with
n − 1 degrees of freedom, then reject $H_0$.
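The decision rule above translates directly into a short function. The following is a minimal sketch with hypothetical data. The function name and all numeric inputs are illustrative assumptions, and the critical value is supplied by the caller (here 1.711, a table value for the 95th percentile of a t-distribution with 24 degrees of freedom).

```python
import math


def reject_null_single_mean(xbar, s, n, mu_low, mu_high, t_crit):
    """Apply the Test 2.4 rule: reject H0 when
    xbar + t*s/sqrt(n) >= mu_low and xbar - t*s/sqrt(n) <= mu_high."""
    half_width = t_crit * s / math.sqrt(n)
    upper = xbar + half_width
    lower = xbar - half_width
    return upper >= mu_low and lower <= mu_high


# Hypothetical inputs: n = 25, s = 1.0, acceptable limits (9.5, 10.5).
T_CRIT = 1.711  # assumed table value, t(0.95, 24 df)
inside = reject_null_single_mean(10.0, 1.0, 25, 9.5, 10.5, T_CRIT)
far_below = reject_null_single_mean(8.9, 1.0, 25, 9.5, 10.5, T_CRIT)
print(inside, far_below)
```

In the first call both conditions hold, so the null is rejected. In the second the upper limit falls short of the lower acceptable limit, so the null is not rejected.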
Discussion:
This is an implementation of the two one-sided tests, or TOST, philosophy
of Schuirmann (1987). That is, we do not split β in half, but apply all of this
risk to each side of the hypothetical interval, ($\mu_L$, $\mu_H$). The
reasoning for using TOST is twofold:
1. The “OR” in the null hypothesis statement is an exclusive “or.” That
is, μ cannot be both less than $\mu_L$ and greater than $\mu_H$.
2. Inasmuch as failing to reject the null is not the desired state, reducing
β by splitting it in half would be a less conservative criterion
than not splitting the risk.
As a result, the power calculations are identical to those of the single mean
(one-sided) test.
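The effect of splitting β can be illustrated with a small numerical comparison. The following is a minimal sketch with hypothetical data, not taken from the book. It assumes n = 25 and uses table values for a t-distribution with 24 degrees of freedom: 1.711 for the 95th percentile (β = 0.05 applied to each side) and 2.064 for the 97.5th percentile (β split in half).

```python
import math


def rejects(xbar, s, n, mu_low, mu_high, t_crit):
    """Rule from Test 2.4: reject H0 when the upper limit reaches mu_low
    and the lower limit does not exceed mu_high."""
    half_width = t_crit * s / math.sqrt(n)
    return (xbar + half_width >= mu_low) and (xbar - half_width <= mu_high)


# Hypothetical sample: mean 9.1, standard deviation 1.0, n = 25,
# acceptable limits (9.5, 10.5).
T_UNSPLIT = 1.711  # assumed table value, t(0.95, 24 df)
T_SPLIT = 2.064    # assumed table value, t(0.975, 24 df)

unsplit_rejects = rejects(9.1, 1.0, 25, 9.5, 10.5, T_UNSPLIT)
split_rejects = rejects(9.1, 1.0, 25, 9.5, 10.5, T_SPLIT)
print(unsplit_rejects, split_rejects)
```

With the larger critical value the interval around the sample mean is wider, so the same data reject the null under the split-β rule but not under the unsplit rule. This is the sense in which splitting β is the less conservative criterion.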
