10.4. Understanding the Performance of Collections

Problem

When choosing a collection for an application where performance is extremely important, you want to choose the right collection for the algorithm.

Solution

In many cases, you can reason about the performance of a collection by understanding its basic structure. For instance, a List is a singly linked list. It’s not indexed, so if you need to access the one-millionth element of a List as list(1000000), that will be slower than accessing the one-millionth element of an Array, because the Array is indexed, whereas accessing the element in the List requires traversing the length of the List.

In other cases, it may help to look at the tables. For instance, Table 10-13 shows that the append operation on a Vector is eC, “effectively constant time.” As a result, I know I can create a large Vector in the REPL very quickly like this:

var v = Vector[Int]()
for (i <- 1 to 50000) v = v :+ i

However, as the table shows, the append operation on a List requires linear time, so attempting to create a List of the same size takes a much (much!) longer time.

With permission from EFPL, the tables in this recipe have been reproduced from scala-lang.org.

Before looking at the performance tables, Table 10-12 shows the performance characteristic keys that are used in the other tables that follow.

Table 10-12. Performance characteristic keys for the subsequent tables

Key

Description

C

The operation takes (fast) constant time.

eC

The operation takes effectively ...

Get Scala Cookbook now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.