18 2. SINGLE QUEUES
In fact, for D/D/1, it is easy to see that
,L= 0 and W = 0 for any λ ≤ μ.
Here, the existence of a steady state for λ = μ illustrates the impact of determinism in arrival times.
We continue the discussion of StreamJoins  and introduce SleepingDisks  and StorageAvail-
For limited compute power (Sec. 5.2.1), the streams are sampled, resulting in effective arrival
rates of λ
. If the streams are Poisson and the tuples are sampled independently with
a ﬁxed probability, then the Split Property of Theorem 0.16 says that the sampled streams are
However, it appears from λ
= μ in Eq. (13) that the authors assume the stream arrivals
and service times are deterministic; otherwise, the average number of tuples in memory can
grow without bound, like in Fig. 2.4, so no amount of memory will be enough for the case of
“unlimited memory”. From Eq. (2.9), we see that even for M/D/1, lim
n =∞.Note that data
streams like stock price movements, news stories and network measurements are not likely to
have deterministic inter-arrival times.
For unlimited memory, one could use Little’s Law to calculate the memory requirement from
and the time a tuple spends in memory. However, this time depends on the service time
distribution (not just the average μ — compare Eq. (2.5) and Eq. (2.9)). The tuple processing
time includes probing, insertion and invalidation — which vary from tuple to tuple — so the
service time is not deterministic.
In the case of limited memory and unlimited computing resources (Sec. 5.2.2), if the streams
are not deterministic, then a burst of arrivals can cause the buffer to overﬂow and tuples may
be lost, so the arrival rates at the join may no longer be λ
. Similarly, if the computing
resources are also limited (Sec. 5.2.3), then the sampled streams may not have the intended λ
Fig. 6 shows the system for this paper. The input parameters include arrival rate α
time as speciﬁed by Exp(t
) and Var(t
), transition time T
, power P
, and P
main performance measures are response time (Fig. 7), transaction rate (Fig. 12a) and energy
per transaction (Fig. 12b).
2.4. DISCUSSION 19
The main issues are: (i) the tradeoff between performance and energy and (ii) adaptive disk
layout in response to dynamic workload changes.
Note the use of Little’s Law (Eq. (2.3)) and the Pollaczek-Khinchin formula in deriving R
The expressions for R
use the residual life introduced in the next chapter.
Fig. 1 shows the StarFish system, consisting of a host element HE, storage elements SEs,
and the network that connects them. The data replication through multiple SEs is for fault
tolerance, and the varied location distances from the HE seek to balance high resilience and
The model analyses the reliability and availability issues.When a model is meant to study rare
events, simulation models can take too long to generate those events, so there is little choice
but to use an analytical model.
The input parameters are the number of SEs N, the write quorum Q, the failure rate λ and
recovery rate μ. Note that unlike an open queue (e.g., M/G/1) it is possible that λ ≥ μ,so
ρ ≥ 1: that just means that the machine is more often down than up.The performance metrics
are availability A(Q, N), and reliability as measured by the probability that there is no data
As Table 1 shows, availability is measured in nines, e.g., six nines is 99.9999% availability. The
precision suggested by such multiple signiﬁcant ﬁgures may look impressive, but one must not
forget the underlying assumptions: in this model, for example, the six nines are calculated with
exponential distributions. An earthquake can also break the links to two or more SEs (StarFish
considers network disconnection to be a site failure), so failures may not be independent.