Chapter 4. Good Actor Design
Understanding how to build good actor systems begins at the small scale. The large-scale structures that we build are often the source of our greatest successes or failures in any software project, regardless of whether it uses Akka. But if we don't first understand the smaller building blocks that go into those larger structures, we run the risk of small mistakes propagating throughout the system in a way that makes the code unmaintainable. This problem isn't unique to Akka. If you try to write code using a functional language without first understanding the basics of functional programming, you will run into similar problems.
In this chapter, we talk about how to avoid such issues and apply good principles and practices to building your actors and your system.
Starting Small
As just mentioned, understanding how to build good actor systems must begin at the small scale. We can talk about how to build and connect large-scale actor constructs together, but if we lack a good design at the small scale, we can still easily end up with bottlenecks, blocking operations, and spaghetti code that can cause the application to fail.
The most common mistakes people make when working with Akka happen at the small scale rather than at the architectural level. Even though these problems tend to be smaller in scope than architectural problems, when compounded together they can become significant. Whether it is accidentally exposing mutable state, or closing over that mutable state in a future, it is easy to make mistakes that can render your actor system incomprehensible. You need to remember that although actors address some of the issues you encounter when building concurrent systems, they work only when you use them correctly. If you use them incorrectly, they can create just as many concurrency bugs as any other system. Actors don't solve every problem, either. Just as with any concurrent system, you need to watch out for bottlenecks, race conditions, and other problems that might come up.
Think about our sample project management system. In that application, we have separated out bounded contexts like the scheduling service, the people management service, and the project management service. But when we deploy the system we discover something unexpected. The people management service, which was supposed to be a very simple data management service, is not living up to expectations. We are losing data unexpectedly. Digging deeper, we can see that the system was built using a mix of different concurrency paradigms. There are actors and futures mixed together without consideration for what that means. As a result, there are race conditions. We are closing over mutable state that is then accessed from multiple threads. We have a series of concurrency problems that could have been avoided.
Meanwhile, on closer inspection, it's evident that our scheduling service isn't handling the load. We expected to have to scale it, due to the fairly complex logic it involves, but now it appears that it is running so poorly that it is jeopardizing the project. We can mitigate the problem by scaling, but even then it might prove to be too slow. Investigating further, it's clear that the application is only capable of handling a small number of concurrent projects. It turns out we have introduced blocking operations that are consuming all the threads, and this in turn is crippling the application.
There are patterns and principles in Akka that you can use to help prevent problems like these. The key is knowing when and where to apply those patterns. When used incorrectly, the patterns can create just as many problems as they solve. But if you apply them correctly, they will give you a strong foundation upon which to build the rest of your system.
Encapsulating State in Actors
One of the key features of actors is their ability to manage state in a thread-safe manner. This is an important feature to understand, but you also need to be aware that there is more than one way to manage that state, and each option might have its uses, depending on the situation. It is also valuable to know those patterns so that if you look into someone else's code you can recognize the pattern and understand the motivation behind it.
Encapsulating State by Using Fields
Perhaps the simplest option for managing state is to keep it in mutable private fields inside the actor. This is relatively straightforward to implement. It will be familiar to you if you are accustomed to an imperative style of programming. It is also one of the more common ways mutable state is represented in the examples people provide in books, courses, or online.
In our project scheduling domain, we have the concept of a person. A person consists of many different data fields that might be altered. This includes data like first name, last name, business roles, and so on. You can update these fields individually or as a group. A basic implementation of this might look like the following:
    import akka.actor.Actor

    object Person {
      // Commands understood by the Person actor.
      // Role is a domain type defined elsewhere in the project.
      case class SetFirstName(name: String)
      case class SetLastName(name: String)
      case class AddRole(role: Role)
    }

    class Person extends Actor {
      import Person._

      // Mutable fields are safe here: only the actor itself touches them,
      // one message at a time.
      private var firstName: Option[String] = None
      private var lastName: Option[String] = None
      private var roles: Set[Role] = Set.empty

      override def receive: Receive = {
        case SetFirstName(name) => firstName = Some(name)
        case SetLastName(name) => lastName = Some(name)
        case AddRole(role) => roles = roles + role
      }
    }
The basic format will be familiar to those accustomed to object-oriented programming and writing getters and setters in languages like Java. If you're more functional-minded, the presence of mutable state might make you cringe a little, but within the context of an actor, mutable state is safe because it is protected by the Actor Model. If you dig into the standard Scala libraries, you will find that mutable state is actually quite common in what are otherwise pure functions. The mutable state in that case is isolated to within the function and is therefore not shared among different threads. Here, we have isolated the mutable state to within an actor, and due to the single-threaded illusion, that state is protected from concurrent access.
This approach is a good one to use when you are trying to model something simple. If only a few mutable fields or values are involved and the logic that mutates those fields is minimal, it might be a good candidate for this type of approach.
The problem with this approach is that as the complexity increases, it becomes unmanageable. In our example of an actor representing a person, what happens when it expands and you need to keep state for additional information like the address, phone number, and so on? You are adding more fields to the actor, all of which are mutable. They are still safe, but as the list grows, the actor begins to look ugly. Consider as well what happens if the logic around those fields is nontrivial, for example, if you need validation logic when you set a person's address. Your actor is becoming bloated. The more you add, the worse it will become, as illustrated here:
    class Person extends Actor {
      import Person._

      private var firstName: Option[String] = None
      private var lastName: Option[String] = None
      private var address: Option[String] = None
      private var phoneNumber: Option[String] = None
      private var roles: Set[Role] = Set.empty

      // SetAddress, SetPhoneNumber, and the validate methods are assumed to be
      // defined alongside the other messages; they are not shown here.
      override def receive: Receive = {
        case SetFirstName(name) => firstName = Some(name)
        case SetLastName(name) => lastName = Some(name)
        case SetAddress(newAddress) =>
          validateAddress(newAddress)
          // The pattern variable must not shadow the field it is assigned to.
          address = Some(newAddress)
        case SetPhoneNumber(newPhoneNumber) =>
          validatePhoneNumber(newPhoneNumber)
          phoneNumber = Some(newPhoneNumber)
        case AddRole(role) => roles = roles + role
      }
    }
Another issue with this is that actors are inherently more difficult to test than other constructs in Scala because they are concurrent, and this concurrency introduces complexity into your tests. As the complexity of the code increases, so too does the complexity of the tests. Soon, you might find yourself in a situation in which your test code is becoming difficult to understand and tough to maintain, and that poses a problem.
One way to solve this problem is to break the logic out of the actor itself. You can create traits that can be stacked onto the actor. These traits can contain all the logic to mutate the values as appropriate. The actor then becomes little more than a concurrency mechanism. All of the logic has been extracted to helper traits that can be written and tested as pure functions. Within the actor, you then need worry only about the mutability and concurrency aspects. Take a look:
    trait PhoneNumberValidation {
      def validatePhoneNumber(phoneNumber: Option[String]) = {
        ...
      }
    }

    trait AddressValidation {
      def validateAddress(address: Option[String]) = {
        ...
      }
    }

    class Person extends Actor with PhoneNumberValidation with AddressValidation {
      ...
    }
This technique makes it possible for you to simplify the actor. It also reduces the number of things you need to test concurrently. However, it does nothing to reduce the number of fields that are being mutated or the tests that need to happen around the mutation of those fields. Still, it is a valuable tool to have in your toolbox to keep your actors from growing too large and unmanageable.
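Because a validation trait holds no actor state, it can be exercised without an actor system. The following is a minimal sketch of that idea, not the project's actual validation. The rule itself (an optional leading plus sign followed by 7 to 15 digits, ignoring spaces and dashes), the plain String parameter, and the Either return type are all assumptions made for illustration.

```scala
// Hypothetical validation trait: pure logic, no actor state, no Akka dependency.
trait PhoneNumberValidation {
  // Assumed rule for illustration: optional leading '+', then 7 to 15 digits,
  // with spaces and dashes ignored. Returns the normalized number or an error.
  def validatePhoneNumber(phoneNumber: String): Either[String, String] = {
    val normalized = phoneNumber.filterNot(c => c == ' ' || c == '-')
    if (normalized.matches("""\+?\d{7,15}""")) Right(normalized)
    else Left(s"Invalid phone number: $phoneNumber")
  }
}

// The trait can be mixed into a plain object and called like any other function.
object PhoneNumberValidationCheck extends PhoneNumberValidation {
  def main(args: Array[String]): Unit = {
    println(validatePhoneNumber("+1 555-010-9999"))
    println(validatePhoneNumber("not a number"))
  }
}
```

An actor that mixes in the same trait would call validatePhoneNumber from its receive block and would remain responsible only for storing the accepted value.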
So how can you reduce some of that mutable state? How can you take an actor that has many mutable parts and reduce it to a smaller number? There are a few options. One is to reevaluate the structure. Maybe you didn't actually want a single actor for all of the fields. In some cases, it might be beneficial to separate this into multiple actors, each managing a single field or group of fields. Whether you should do this depends largely on the domain. This is less of a technical question and more of a domain-driven design (DDD) question. You need to look at the domain and determine whether the fields in question are typically mutated as a group or as a single field. Are they logically part of a single entity in the domain or do they exist as separate entities? The answer will help you decide whether to split the state across additional actors.
But what if you determine that they are logically part of the same entity and therefore the same actor? Is it possible to reduce some of this mutable state within a single actor?
Encapsulating State by Using "State" Containers
In the previous example, you can extract some of the logic from the Person actor into a separate, nonactor-based PersonalInformation class. A class or set of classes like this is much more type safe, easier to test, and a good deal more functional. This leaves you with just a single var to store the entire state rather than having multiple ones, as shown here:
    object Person {
      case class PersonalInformation(
        firstName: Option[FirstName] = None,
        lastName: Option[LastName] = None,
        address: Option[Address] = None,
        phoneNumber: Option[PhoneNumber] = None,
        roles: Set[Role] = Set.empty)
    }

    class Person extends Actor {
      import Person._

      // The only mutable field: it always holds an immutable snapshot.
      private var personalInformation = PersonalInformation()

      override def receive: Receive = ...
    }
This is an improvement. You now have a PersonalInformation class that is much easier to test. You can put any validation logic that you want into that state object and call methods on it as necessary. You can write the logic in an immutable fashion, and you can test the logic as pure functions. In fact, if later you decide that you don't want to use an actor, you could eliminate the actor altogether without having to change much (or anything) about PersonalInformation. You could even choose not to declare PersonalInformation within the companion object for the actor. You could opt instead to move that logic to another location in the same package or move it to a different package or module.
This is a nice technique to use when the logic of your model becomes complicated. It provides the means for you to extract your domain logic completely so that it is encapsulated in the PersonalInformation object and your actor becomes pure infrastructure. This reduces everything to a single var that is still protected by the single-threaded illusion.
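To make the testing benefit concrete, here is a small sketch of a state container whose update methods are pure functions. It is a simplified stand-in rather than the chapter's actual class: the fields use plain String values instead of the FirstName and LastName wrapper types, Role is defined locally, and the method names withFirstName, withLastName, and addRole are hypothetical.

```scala
// Hypothetical stand-in for the domain's Role type.
final case class Role(name: String)

// Simplified state container: plain Strings replace the domain wrapper types.
final case class PersonalInformation(
    firstName: Option[String] = None,
    lastName: Option[String] = None,
    roles: Set[Role] = Set.empty) {

  // Each method returns a new immutable copy and leaves the receiver untouched.
  def withFirstName(name: String): PersonalInformation = copy(firstName = Some(name))
  def withLastName(name: String): PersonalInformation = copy(lastName = Some(name))
  def addRole(role: Role): PersonalInformation = copy(roles = roles + role)
}

object PersonalInformationCheck {
  def main(args: Array[String]): Unit = {
    val empty = PersonalInformation()
    val updated = empty.withFirstName("Ada").addRole(Role("developer"))
    println(updated)
    println(empty) // the original value is unchanged
  }
}
```

Inside the actor, a handler would then shrink to a single reassignment of the var, for example personalInformation = personalInformation.withFirstName(name), while all of the logic stays testable without an actor system.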
But what if you want to eliminate that var completely? Can you do that?
Encapsulating State by Using become
Another approach to maintaining this state is to use the become feature to store it. This particular technique also falls more in line with the idea discussed earlier that behavior and state changes are in fact one and the same thing. Using this technique, you can eliminate the var completely without sacrificing much in the way of performance. Let's take a look at the code:
    object Person {
      case class PersonalInformation(
        firstName: Option[FirstName] = None,
        lastName: Option[LastName] = None,
        address: Option[Address] = None,
        phoneNumber: Option[PhoneNumber] = None,
        roles: Set[Role] = Set.empty)
    }

    class Person extends Actor {
      import Person._

      override def receive: Receive = updated(PersonalInformation())

      // The current state travels as a parameter of the behavior itself.
      private def updated(personalInformation: PersonalInformation): Receive = {
        case SetFirstName(firstName: FirstName) =>
          context.become(updated(personalInformation.copy(firstName = Some(firstName))))
        ...
      }
    }
This completely eliminates any need for a var. Now, rather than altering the state by manipulating that var, you are instead altering it by changing the behavior, so that the next message will see a different PersonalInformation than the previous one. This is nice because it has a slightly more "functional" feel to it (no more vars), but it also maps better to our understanding of the Actor Model. Remember, in the Actor Model, state and behavior are the same thing. Here, we can see how that is reflected in code.
Be wary though. Although this technique has its uses and can be quite valuable, particularly when building finite-state machines, the code is more complex and takes more work to understand. It isn't as straightforward as manipulating a var. It might be harder to follow what is happening in this code if you need to debug it, especially if there are other behavior transitions involved.
So, when should you use this approach rather than using a var? This technique is best saved for cases in which you have multiple behaviors, and more specifically, when those behaviors can contain different state objects. Let's modify the example slightly. In the example, most of the values are set as Options. This is because when you create your Person, those values might not yet be initialized, but they might be initialized later. So, you need a way to capture that fact. But what if, on analyzing the domain, you realize that a person was always created with all of that information present? How could you take advantage of your actor to capture that fact? Here's one way:
    object Person {
      case class PersonalInformation(
        firstName: FirstName,
        lastName: LastName,
        address: Address,
        phoneNumber: PhoneNumber,
        roles: Set[Role] = Set.empty)

      case class Create(personalInformation: PersonalInformation)
    }

    class Person extends Actor {
      import Person._

      // Initial state: nothing but Create is accepted.
      override def receive: Receive = {
        case Create(personalInformation) =>
          context.become(created(personalInformation))
      }

      // Created state: the personal information is always present.
      private def created(personalInformation: PersonalInformation): Receive = {
        ...
      }
    }
In this example, there are two new states. There is an initial state, and a created state. In the initial state, the personal information has not been provided. The actor therefore accepts only one command (Create). After you transition to the created state, you can go back to handling the other messages, which allows you to set individual fields. This eliminates the need for optional values because your actor can exist in only one of two states: it is either unpopulated or fully populated. In this case, the state object is valid only in specific states; in this example, it's the "created" state. You could use a var and set it as an option to capture this fact, but then you would need to constantly check for the value of that option to ensure that it is what you expect. By capturing the state by using become you ensure that you handle only message types that are valid in the current state. This also ensures that the state is always what you expect without requiring additional checks.
When there are multiple behavior transitions, and each transition might carry different state, it is often better to capture that state by using become because in this case it simplifies the logic rather than making it more complex. It reduces the cognitive overhead.
Using combinations of become, mutable fields, and functional state objects, you can provide rich actors that accurately capture the application's domain. Immutable state objects make it possible for you to create rich domain constructs that can be fully tested independently of your actors. You can use become to ensure that your actors can exist only in a set of valid states. You don't need to deal with creating default values for things that are of no concern. And when the actors are simple enough, you can code them by using more familiar constructs like mutable fields with immutable collections.
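The two-state design can be fleshed out into something runnable. The sketch below assumes Akka classic actors (the akka-actor library) and simplifies the domain: plain String fields replace the wrapper types, and the GetInformation query is a hypothetical message added only so that the demo can observe the state.

```scala
import akka.actor.{Actor, ActorSystem, Props}
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.Await
import scala.concurrent.duration._

object Person {
  // Simplified: plain Strings stand in for the domain wrapper types.
  final case class PersonalInformation(firstName: String, lastName: String)
  final case class Create(personalInformation: PersonalInformation)
  final case class SetFirstName(firstName: String)
  case object GetInformation // hypothetical query, added for the demo
}

class Person extends Actor {
  import Person._

  // Initial state: only Create is handled; every other message is unhandled.
  override def receive: Receive = {
    case Create(info) => context.become(created(info))
  }

  // Created state: the state is a parameter of the behavior, never an Option.
  private def created(info: PersonalInformation): Receive = {
    case SetFirstName(name) => context.become(created(info.copy(firstName = name)))
    case GetInformation     => sender() ! info
  }
}

object PersonDemo {
  def main(args: Array[String]): Unit = {
    import Person._
    val system = ActorSystem("person-demo")
    implicit val timeout: Timeout = Timeout(3.seconds)
    try {
      val person = system.actorOf(Props(new Person))
      person ! SetFirstName("Ignored") // not valid yet: the person is not created
      person ! Create(PersonalInformation("Grace", "Hopper"))
      person ! SetFirstName("Grace B.")
      println(Await.result(person ? GetInformation, 3.seconds))
    } finally {
      Await.result(system.terminate(), 10.seconds)
    }
  }
}
```

The first SetFirstName is never applied because the initial behavior does not handle it, which is the point of the design: messages that make no sense in the current state cannot touch the state at all.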
Mixing Futures with Actors
Futures are an effective way to introduce concurrency into a system. But when you combine them with Akka, you need to be careful. Futures provide one model of concurrency, whereas Akka provides a different model. Each model has certain design principles that it adheres to, and those principles don't always work well together. You should always try to avoid mixing concurrency models within a single context, but sometimes you might find yourself in a situation in which it is required. For those times, you need to be careful to follow the design patterns that allow you to use futures with actors safely.
The heart of the problem is that you are combining two very different models of concurrency. Futures treat concurrency quite differently from actors. They don't respect the single-threaded illusion that actors provide. It therefore becomes easy to break that illusion when using futures with actors. Let's look at some very simple ways by which we can break the single-threaded illusion, beginning with this example:
    import scala.concurrent.Future

    trait AvailabilityCalendarRepository {
      def find(resourceId: Resource): Future[AvailabilityCalendar]
    }
This is a very simple use of a future. In this case, we have chosen a future to account for the fact that the repository can access a database, and that might take time and can fail. This is an appropriate use of a future and by itself it doesn't create any issues.
The problem in this case arises when you try to use the repository in the context of an actor. Suppose that in order to bring the usage of this repository into your actor system, you decide to create an Actor wrapper around it. Our initial naive implementation might look like the following:
    import akka.actor.Actor

    object AvailabilityCalendarWrapper {
      case class Find(resourceId: Resource)
    }

    class AvailabilityCalendarWrapper(calendarRepository: AvailabilityCalendarRepository)
      extends Actor {
      import AvailabilityCalendarWrapper._
      import context.dispatcher // ExecutionContext for the future callback

      override def receive: Receive = {
        case Find(resourceId) =>
          calendarRepository
            .find(resourceId)
            .foreach(result => sender() ! result) // WRONG! The sender may have changed!
      }
    }
This seems fairly simple. The code receives a Find message, extracts the resourceId, and makes a call to the repository. That result is then returned to the sender. The problem here is that it breaks the single-threaded illusion. The foreach in this code is operating on a future. This means that when the callback runs, it is potentially running on a different thread. The actor might have moved on and be processing another message by the time that foreach runs. And that means the sender might have changed. It might not be the actor that you were expecting.
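The same effect can be reproduced without Akka, using only the standard library. In this sketch a local var named currentSender stands in for the actor's sender(), and a Promise stands in for the repository call; both names are hypothetical. One callback reads the var when it runs, and the other uses a value captured before the callback was registered.

```scala
import scala.concurrent.{Await, Promise}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ClosingOverMutableState {
  // Returns (what the callback saw by reading the mutable variable,
  //          what the callback saw by using an eagerly captured snapshot).
  def run(): (String, String) = {
    var currentSender = "first-sender" // stands in for the actor's sender()
    val lookup = Promise[String]()      // stands in for the repository call

    val snapshot = currentSender        // captured before the callback exists
    val viaVar = lookup.future.map(result => s"$result -> $currentSender")
    val viaSnapshot = lookup.future.map(result => s"$result -> $snapshot")

    currentSender = "second-sender"     // the "actor" moves on to another message
    lookup.success("calendar")          // only now does the future complete

    (Await.result(viaVar, 3.seconds), Await.result(viaSnapshot, 3.seconds))
  }

  def main(args: Array[String]): Unit = println(run())
}
```

The callback that reads the variable replies to "second-sender", even though the request came from "first-sender". In a real actor the interleaving depends on timing, which makes the bug intermittent and much harder to track down than in this deterministic sketch.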
There are multiple ways by which you can fix this problem. You could create a temporary value to hold the correct sender reference, such as in the following:
    case Find(resourceId) =>
      val replyTo = sender()
      calendarRepository
        .find(resourceId)
        .foreach(result => replyTo ! result)
This theoretically fixes the issue. The replyTo value will be set to the value of sender() at the point at which you are interested in it and it won't change after that. But this isn't ideal for a number of reasons. What if you weren't just accessing the sender? What if there were other mutable state that you needed to manipulate? How would you deal with that? There is a yet more fundamental problem with the preceding code. Without looking at the signature of the repository, how can you verify that you are working with a future? That foreach could just as easily be operating on an Option or a collection. And although you can switch to using a for comprehension or change to using the onComplete callback, it still doesn't really highlight the fact that this operation has become multithreaded.
A better solution is to use the Pipe pattern in Akka. With the Pipe pattern you can take a future and "pipe" the result of that future to another actor. The result is that the actor will receive the result of that future at some point or a Status.Failure if the future fails to complete. If you alter the previous code to use the Pipe pattern, it could look like the following:
    import akka.actor.Actor
    import akka.pattern.pipe

    object AvailabilityCalendarWrapper {
      case class Find(resourceId: Resource)
    }

    class AvailabilityCalendarWrapper(calendarRepository: AvailabilityCalendarRepository)
      extends Actor {
      import AvailabilityCalendarWrapper._
      import context.dispatcher // ExecutionContext required by pipeTo

      override def receive: Receive = {
        case Find(resourceId) =>
          // sender() is evaluated now; the reply is sent when the future completes.
          calendarRepository.find(resourceId).pipeTo(sender())
      }
    }
There are multiple advantages to using the Pipe pattern. First, this pattern allows you to maintain the single-threaded illusion. Because the result of the future is delivered to an actor as just another message, the actor that receives it processes it like any other message, with no concurrent access to its state. In this example, the sender is resolved when you call pipeTo, but the message isn't sent until the future completes. This means that the sender will remain correct even though the actor might move on to process other messages.
The other benefit of this pattern is that it is explicit. There is no question when you look at this code as to whether concurrency is happening. It's evident just by looking at the pipeTo that this operation will complete in the future rather than immediately. That is the very definition of the Pipe pattern. You don't need to click through to the signature of the repository to know that there is a future involved.
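Both outcomes of a piped future can be observed from the receiving side. The sketch below assumes Akka classic actors. Lookup, Found, LookupActor, and Client are hypothetical names invented for the demo, and the find method is a stand-in for a repository call that fails for negative ids.

```scala
import akka.actor.{Actor, ActorRef, ActorSystem, Props, Status}
import akka.pattern.pipe
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.duration._

// Hypothetical messages for the demo.
final case class Lookup(id: Int)
final case class Found(id: Int)

class LookupActor extends Actor {
  import context.dispatcher // ExecutionContext required by pipeTo

  // Stand-in for a repository call: fails for negative ids.
  private def find(id: Int): Future[Found] =
    if (id >= 0) Future.successful(Found(id))
    else Future.failed(new IllegalArgumentException(s"unknown id $id"))

  override def receive: Receive = {
    // sender() is evaluated here, while the Lookup message is being processed.
    case Lookup(id) => find(id).pipeTo(sender())
  }
}

// Sends one Lookup and reports, through the promise, what came back.
class Client(lookup: ActorRef, id: Int, outcome: Promise[String]) extends Actor {
  lookup ! Lookup(id) // sent with this actor as the sender

  override def receive: Receive = {
    case Found(foundId)        => outcome.success(s"found $foundId")
    case Status.Failure(cause) => outcome.success(s"failed: ${cause.getMessage}")
  }
}

object PipeDemo {
  def run(id: Int): String = {
    val system = ActorSystem("pipe-demo")
    try {
      val lookup = system.actorOf(Props(new LookupActor))
      val outcome = Promise[String]()
      system.actorOf(Props(new Client(lookup, id, outcome)))
      Await.result(outcome.future, 3.seconds)
    } finally Await.result(system.terminate(), 10.seconds)
  }

  def main(args: Array[String]): Unit = {
    println(run(7))
    println(run(-1))
  }
}
```

The client handles the failure as an ordinary message, inside its own receive block, so no callback ever runs against its state from another thread.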
But let's take it a step further. What if you didn't want to immediately send the result to the sender? What if you want to first perform some other operations on it, such as modifying the data so that it adheres to a different format? How does this affect this example? Again, let's look at the naive approach first:
    import akka.actor.Actor
    import akka.pattern.pipe

    object AvailabilityCalendarWrapper {
      case class Find(resourceId: Resource)
      case class ModifiedCalendar(...)
    }

    class AvailabilityCalendarWrapper(calendarRepository: AvailabilityCalendarRepository)
      extends Actor {
      import AvailabilityCalendarWrapper._
      import context.dispatcher // ExecutionContext for map and pipeTo

      private def modifyResults(calendar: AvailabilityCalendar): ModifiedCalendar = {
        // Perform various modifications on the calendar
        ModifiedCalendar(...)
      }

      override def receive: Receive = {
        case Find(resourceId) =>
          calendarRepository
            .find(resourceId)
            .map(result => modifyResults(result)) // runs outside the actor's thread
            .pipeTo(sender())
      }
    }
As before, at first glance this looks OK. The code is still using the Pipe pattern so the sender is safe. But we have introduced a .map on the future. Within this .map we are calling a function. The problem now is that, again, .map is operating within a separate thread. This too creates an opportunity to break the single-threaded illusion. Initially the code might be entirely safe. Later, however, someone might decide to access the sender within modifyResults, or you might decide to access some other mutable state in the wrapper actor. Because it is inside of a function in the AvailabilityCalendarWrapper, you might assume that it is safe to access that mutable state. It is not obvious that this function is actually being called within a future. It is far too easy to accidentally make a change to this code without realizing that you were operating in a separate execution context. So how do you solve that? Here's one way:
    import akka.actor.Actor
    import akka.pattern.pipe

    object AvailabilityCalendarWrapper {
      case class Find(resourceId: Resource)
      case class ModifiedCalendar(...)
    }

    class AvailabilityCalendarWrapper(calendarRepository: AvailabilityCalendarRepository)
      extends Actor {
      import AvailabilityCalendarWrapper._
      import context.dispatcher // ExecutionContext required by pipeTo

      override def receive: Receive = {
        case Find(resourceId) =>
          // Pipe to self, passing the original sender along as the message's sender.
          calendarRepository.find(resourceId).pipeTo(self)(sender())
        case AvailabilityCalendar(...) =>
          // Perform various modifications on the calendar
          sender() ! ModifiedCalendar(...)
      }
    }
This example eliminates the .map and goes back to calling pipeTo directly on the future returned by the repository. This time, though, we pipe back to self and include the sender as a secondary argument in the pipeTo. This allows you to maintain the link to the original sender. Again, this is more explicit about what portions of the operation are happening in the future and what portions are happening in the present. There is no longer a possibility of breaking the single-threaded illusion.
Sometimes, you might find yourself in a situation in which the future returns something too generic, like an Integer or a String. In this case, simply using the Pipe pattern by itself can make the code confusing. Your actor is going to need to receive a simple type that is not very descriptive. It would be better if you could enrich the type in some way, perhaps wrapping it in a message. In this case, it is appropriate to use a .map on the future for the purpose of converting the message to the appropriate wrapper, as long as you keep that operation as simple and as isolated as possible. For example:
    ourFuture.map(result => Wrapper(result)).pipeTo(self)
The only operation we perform within the .map is to construct the wrapper type. Nothing more. This type of .map on a future is considered to be low risk enough that you can use it within an actor.
The truth is that even this code introduces some risk. Where is Wrapper defined? Does it access mutable state somewhere during its construction? It is still possible to make a mistake with this code, but as long as you are following best practices, it will be highly unlikely. For that reason this code is generally considered acceptable.
You need to keep in mind that within an actor there are many things that might be considered mutable state. The sender is certainly one form of mutable state. A var or a mutable collection is also mutable state. The actor's context object also contains mutable state. Calling context.become, for example, makes use of mutable state. For that matter, an immutable val that is declared in the parameters of a receive method could be mutable state because the receive method can change. Even access to a database that supports locking still constitutes mutable state. Although it might be "safe," it is in fact mutating some state of the system.
In general, you should favor pure functions within actors wherever possible. These pure functions will never read or write any of the state of the actor, instead relying on the parameters passed in, and returning any modified values from the function. These can then be safely used to modify the mutable state within the actor. Pure functions can also be extracted out of the actor itself, which can ensure that you donât access any mutable state in the actor itself. This makes it possible for you to use those functions in a thread-safe manner within the context of an actor.
When you do find yourself using impure functions within an actor, you need to ensure that you do so within the context of the single-threaded illusion. To do so, you should always prefer the Pipe pattern when working with futures. Sometimes, it can be beneficial to create a wrapper like you did for the repository. This wrapper isolates the future so that you only worry about it in one place and the rest of the system can ignore it. But even better is when you can avoid the future completely and use other techniques in its place.
Ask Pattern and Alternatives
The Ask pattern is a common pattern in Akka. Its usage is very simple and it serves very specific use cases. Let's quickly review how it works.
Suppose that you want to make use of the "wrapper" actor that was introduced earlier. You want to send the Find message, and then you need the result of that. To do this using the Ask pattern, you might have something like the following:
    import akka.pattern.ask
    import akka.util.Timeout
    import scala.concurrent.duration._

    implicit val timeout: Timeout = Timeout(5.seconds)

    // availabilityCalendar is the ActorRef of the wrapper actor.
    val resultFuture =
      (availabilityCalendar ? AvailabilityCalendarWrapper.Find(resourceId))
        .mapTo[AvailabilityCalendar]
In this example, you are using the Ask pattern to obtain a Future[Any]. The code then uses the mapTo function to convert that future into a Future[AvailabilityCalendar].
This is a very helpful pattern when you have an operation for which you need an outcome, success or failure, within a certain amount of time. In the event of a failure, you will receive a failed future. In the event that the operation takes too long or doesn't complete, you will also receive a failed future. This can be very useful for situations in which a user is waiting on the other end. The Ask pattern therefore can be quite common when building web apps or REST APIs or other applications that have an agreed-upon time limit.
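The timeout case can be seen in isolation with an actor that never answers. This sketch assumes Akka classic actors; SilentActor and the "timed out" fallback value are hypothetical, and the timeout is shortened to 200 milliseconds so that the demo finishes quickly.

```scala
import akka.actor.{Actor, ActorSystem, Props}
import akka.pattern.{ask, AskTimeoutException}
import akka.util.Timeout
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

// Hypothetical actor that swallows every message and never replies.
class SilentActor extends Actor {
  override def receive: Receive = { case _ => () }
}

object AskTimeoutDemo {
  def run(): String = {
    val system = ActorSystem("ask-demo")
    import system.dispatcher // ExecutionContext for recover
    implicit val timeout: Timeout = Timeout(200.millis)
    try {
      val silent = system.actorOf(Props(new SilentActor))
      // The ask future fails with AskTimeoutException once the timeout elapses.
      val reply: Future[String] =
        (silent ? "ping").mapTo[String].recover {
          case _: AskTimeoutException => "timed out"
        }
      Await.result(reply, 3.seconds)
    } finally Await.result(system.terminate(), 10.seconds)
  }

  def main(args: Array[String]): Unit = println(run())
}
```

A web endpoint would typically translate that failure into an error response for the waiting user rather than into a fallback string.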
Problems with Ask
You need to be careful with this approach, though. On its own, there is nothing wrong with it, and in many cases it represents exactly what you want. The problems arise when you begin to overuse this pattern. The Ask pattern has a lot of familiarity. It bears similarity to the way functional programming works. You call a function with some set of parameters, and that function returns a value. That's what we are doing here, but instead of a function, we are talking to an actor. But where the functional model emphasizes a request/response pattern, the Actor Model usually works better with a "Tell, Don't Ask" approach.
Let's explore an example that shows how overusing the Ask pattern can break down. For this example, we will leave our domain behind so that we can focus on just the problem.
Suppose that you have a series of actors. Each actor has the job of performing some small task and then passing the result to the next actor in a pipeline. At the end, you want some original actor to receive the results of the processing. Let's also assume that the entire operation needs to happen within a 5-second window. Let's take a look at the code:
    import akka.actor.{Actor, Props}
    import akka.pattern.{ask, pipe}
    import akka.util.Timeout
    import scala.concurrent.duration._

    case class ProcessData(data: String)
    case class DataProcessed(data: String)

    // processData stands for each stage's own work and is not shown.
    class Stage1 extends Actor {
      import context.dispatcher // ExecutionContext required by pipeTo

      val nextStage = context.actorOf(Props(new Stage2()))

      override def receive: Receive = {
        case ProcessData(data) =>
          val processedData = processData(data)
          implicit val timeout: Timeout = Timeout(5.seconds)
          (nextStage ? ProcessData(processedData)).pipeTo(sender())
      }
    }

    class Stage2 extends Actor {
      import context.dispatcher

      val nextStage = context.actorOf(Props(new Stage3()))

      override def receive: Receive = {
        case ProcessData(data) =>
          val processedData = processData(data)
          implicit val timeout: Timeout = Timeout(5.seconds)
          (nextStage ? ProcessData(processedData)).pipeTo(sender())
      }
    }

    class Stage3 extends Actor {
      override def receive: Receive = {
        case ProcessData(data) =>
          val processedData = processData(data)
          sender() ! DataProcessed(processedData)
      }
    }
This code will certainly work and get the job done, but there are a few oddities in the solution. The first issue is the timeouts. Each stage, except for the last, has a 5-second timeout, and that is the problem. The first stage waits up to 5 seconds for its reply. But presumably the second stage will consume some of those 5 seconds. Maybe it consumes 1 second. Therefore, when the second stage sends a message to the third, you don't really want a 5-second timeout anymore; you want a 4-second timeout. If the third stage takes 4.5 seconds, the second stage's Ask would succeed, but the entire process would still fail because the first timeout of 5 seconds would be exceeded.
The underlying problem is that only the first timeout, the original 5 seconds, is important to the application. It is critical that this operation succeeds within the 5-second window. All the other timeouts after that are irrelevant. Whether those stages take 10 seconds or 2 seconds doesn't matter on its own; only that single 5-second timeout has value. Because we have overused the Ask pattern here, we are forced to introduce timeouts at each stage of the process. These arbitrary timeouts create confusion in the code. It would be better if we could modify the solution so that only the first timeout is necessary.
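To make the arithmetic concrete, here is a small sketch that uses only scala.concurrent.duration. The timings are the hypothetical figures from the discussion above (1 second for the second stage, 4.5 seconds for the third), and the names TimeoutBudget, stage2Time, and stage3Time are invented for illustration; nothing here talks to Akka.

```scala
import scala.concurrent.duration._

object TimeoutBudget {
  // Hypothetical processing times, taken from the example figures in the text
  val stage2Time: FiniteDuration = 1.second
  val stage3Time: FiniteDuration = 4500.millis

  // Every Ask in the pipeline was given the same 5-second timeout
  val askTimeout: FiniteDuration = 5.seconds

  // The Ask from Stage2 to Stage3 waits only for Stage3's own work
  val innerAskSucceeds: Boolean = stage3Time <= askTimeout

  // The Ask from Stage1 to Stage2 waits for Stage2 and everything after it
  val outerAskSucceeds: Boolean = (stage2Time + stage3Time) <= askTimeout

  // What Stage2 would have had to pass along to stay inside the real deadline
  val remainingBudget: FiniteDuration = askTimeout - stage2Time
}
```

With these numbers the inner Ask completes in time while the outer one does not, so the inner 5-second timeout never protected anything the application cares about.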
The key here is that every time you introduce a timeout into the system, you need to think about whether that timeout is necessary. Does it have meaning? If the timeout expires, is there some action that can be taken to correct the problem? Do you need to inform someone in the event of a failure? If you answer "yes" to these questions, the Ask pattern might be the right solution. But if there is already a timeout at another level to handle this case, perhaps you should try to avoid the Ask pattern and instead allow the existing mechanism to handle the failure for you.
Accidental Complexity
There is another problem with the code in the previous example. There is a complexity involved in using the Ask pattern. Behind the scenes, it is creating a temporary actor that is responsible for waiting for the results. This temporary actor is cheap, but not free. We end up with something like that shown in Figure 4-1.
In the figure, you see the details happening inside the seemingly simple Ask.
We have created a more complex message flow than we really need. We have introduced at least two temporary actors that must be maintained by the system. Granted, that happens behind the scenes without our intervention, but it still consumes resources. And if you look at the diagram and try to reason about it, you can see that there is additional complexity here that you don't really want. In reality, what you would prefer is to bypass the middle actors and send the response directly to the sender. What you really want is to Tell, not Ask.
Alternatives to Ask
There are multiple ways that you could fix this pipeline. One option,
given the trivial nature of the example, would be to use the forward
method. By forwarding the message rather than using the Tell operator,
you ensure that the final stage in the pipeline has a reference to
the original sender. This eliminates the need for the timeouts and
it would eliminate the temporary actors acting as middlemen, as demonstrated here:
    class Stage1() extends Actor {
      val nextStage = context.actorOf(Props(new Stage2()))

      override def receive: Receive = {
        case ProcessData(data) =>
          val processedData = processData(data)
          nextStage.forward(ProcessData(processedData))
      }
    }
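The listing above shows only the first stage. As a rough end-to-end sketch of the same idea, the following wires three stages together with forward and uses a single Ask at the edge of the system, so that only one timeout exists. The Stage and FinalStage classes, the string tags they append, and the run helper are assumptions made for illustration, and the stages receive their successor through the constructor rather than creating it as a child.

```scala
import akka.actor.{Actor, ActorRef, ActorSystem, Props}
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.Await
import scala.concurrent.duration._

object ForwardPipeline {
  case class ProcessData(data: String)
  case class DataProcessed(data: String)

  // Hypothetical intermediate stage: tags the data, then forwards it.
  // forward keeps the original sender, so no Ask or timeout is needed here.
  class Stage(tag: String, next: ActorRef) extends Actor {
    def receive: Receive = {
      case ProcessData(data) => next.forward(ProcessData(s"$data>$tag"))
    }
  }

  // The last stage replies straight to the original sender.
  class FinalStage extends Actor {
    def receive: Receive = {
      case ProcessData(data) => sender() ! DataProcessed(s"$data>done")
    }
  }

  // Runs one message through the pipeline from outside the actor system.
  // Blocking with Await is for demonstration only.
  def run(input: String): DataProcessed = {
    val system = ActorSystem("forward-sketch")
    try {
      val last = system.actorOf(Props(new FinalStage))
      val middle = system.actorOf(Props(new Stage("s2", last)))
      val first = system.actorOf(Props(new Stage("s1", middle)))
      // The only timeout in the whole flow
      implicit val timeout: Timeout = Timeout(5.seconds)
      Await.result((first ? ProcessData(input)).mapTo[DataProcessed], timeout.duration)
    } finally system.terminate()
  }
}
```

Only one temporary Ask actor is created in this version, and the reply from FinalStage goes to it directly rather than hopping back through each stage.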
Another approach to eliminating the Ask would be to pass a replyTo
actor along as part of the messages. You could include a reference to the
actor that is expecting the response. Then, at any stage of the pipeline
you would have access to that actor to send it the results or perhaps to
back out of the pipeline early if necessary:
    case class ProcessData(data: String, replyTo: ActorRef)
    case class DataProcessed(data: String)

    class Stage1() extends Actor {
      val nextStage = context.actorOf(Props(new Stage2()))

      override def receive: Receive = {
        case ProcessData(data, replyTo) =>
          val processedData = processData(data)
          nextStage ! ProcessData(processedData, replyTo)
      }
    }

    class Stage2 extends Actor {
      val nextStage = context.actorOf(Props(new Stage3()))

      override def receive: Receive = {
        case ProcessData(data, replyTo) =>
          val processedData = processData(data)
          nextStage ! ProcessData(processedData, replyTo)
      }
    }

    class Stage3 extends Actor {
      override def receive: Receive = {
        case ProcessData(data, replyTo) =>
          val processedData = processData(data)
          replyTo ! DataProcessed(processedData)
      }
    }
A final approach would be to pass a Promise as part of the message. Here, you would send the Promise through the pipeline so that the final stage of the pipe could simply complete the Promise. The top level of the chain would then have access to the future that Promise is completing, and you could resolve the future and deal with it appropriately (using pipeTo). The following example shows how to do this:
    case class ProcessData(data: String, response: Promise[String])

    class Stage1() extends Actor {
      val nextStage = context.actorOf(Props(new Stage2()))

      override def receive: Receive = {
        case ProcessData(data, response) =>
          val processedData = processData(data)
          // A plain tell: no timeout is involved at this stage
          nextStage ! ProcessData(processedData, response)
      }
    }

    class Stage2 extends Actor {
      val nextStage = context.actorOf(Props(new Stage3()))

      override def receive: Receive = {
        case ProcessData(data, response) =>
          val processedData = processData(data)
          nextStage ! ProcessData(processedData, response)
      }
    }

    class Stage3 extends Actor {
      override def receive: Receive = {
        case ProcessData(data, response) =>
          val processedData = processData(data)
          // Completing the Promise resolves the future held by the originator
          response.complete(Success(processedData))
      }
    }
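The listing above shows only the stages, not the side that creates the Promise and waits on its future. The following is a minimal sketch of that originating side, written without Akka: plain functions running on the global execution context stand in for the three actors, and the names PromisePipeline, stage1 through stage3, and run are assumptions for illustration. Keep in mind that a Promise is an in-memory object, so this style suits actors living in the same JVM.

```scala
import scala.concurrent.{Await, ExecutionContext, Future, Promise}
import scala.concurrent.duration._
import scala.util.Success

object PromisePipeline {
  import ExecutionContext.Implicits.global

  case class ProcessData(data: String, response: Promise[String])

  // Stand-ins for the actors: each stage does its work asynchronously
  // and hands the message, Promise included, to the next stage.
  def stage1(msg: ProcessData): Unit = {
    Future(stage2(msg.copy(data = msg.data + ">s1")))
    ()
  }

  def stage2(msg: ProcessData): Unit = {
    Future(stage3(msg.copy(data = msg.data + ">s2")))
    ()
  }

  // The final stage completes the Promise instead of replying to anyone.
  def stage3(msg: ProcessData): Unit = {
    msg.response.complete(Success(msg.data + ">s3"))
    ()
  }

  // The originator owns the Promise, so it alone decides how long to wait.
  def run(input: String): String = {
    val response = Promise[String]()
    stage1(ProcessData(input, response))
    Await.result(response.future, 5.seconds)
  }
}
```

Inside an actor you would normally avoid Await and instead pipe response.future to whoever needs the result; the blocking call here only keeps the sketch short.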
Each of these approaches is valid, and depending on the situation, one might fit better than the other. The point is to realize that you should use Ask only in very specific circumstances. Ask is designed to be used for situations in which either you need to communicate with an actor from outside the system, or you need to use a request/response-style approach and a timeout is desirable. If you find you need to introduce a timeout into a situation that doesn't warrant one, you are probably using the wrong approach.
Commands Versus Events
Messages to and from actors can be broken down into two broad categories: commands and events.
A command is a request for something to happen in the future, which might or might not happen; for instance, an actor might choose to reject or ignore a command if it violates some business rule.
An event is something that records an action that has already taken place. It's in the past and can't be changed; other actors or elements of the system can react to it or not, but they cannot modify it.
It is often helpful to break down the actor's protocol (the set of classes and objects that represents the messages this actor understands and emits) into commands and events so that it is clear what is inbound and what is outbound from this actor. Keep in mind, of course, that an actor could consume both commands and events, and emit both as well.
You can often think of an actor's message protocol as its API. You are defining the inputs and outputs to the actor. The inputs are usually defined as commands, whereas the outputs are usually defined as events. Let's look at a quick example:
    object ProjectScheduler {
      case class ScheduleProject(project: Project)
      case class ProjectScheduled(project: Project)
    }
Here, we have defined a very simple message protocol for the ProjectScheduler actor. ScheduleProject is a command. We are instructing the ProjectScheduler to perform an operation. That operation is not yet complete and therefore could fail. On the other hand, ProjectScheduled is an event. It represents something that happened in the past. It can't fail because it has already happened. ScheduleProject is the input to the actor and ProjectScheduled is the output. If we weren't using actors and we wanted to represent this functionally, it could look something like this:
    class ProjectScheduler {
      def execute(command: ScheduleProject): ProjectScheduled = ...
    }
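One way to make the split between commands and events visible in the types is to give each category a marker trait. The sketch below does that for a runnable variant of the functional form. The Project fields, the Command and Event traits, the ProjectRejected event, and the validation rule are all assumptions added for illustration; they are not part of the protocol defined in this chapter.

```scala
object SchedulerSketch {
  // Hypothetical domain type; the real Project is defined elsewhere
  final case class Project(name: String, estimatedDays: Int)

  // Inbound messages: requests that may still be rejected
  sealed trait Command
  final case class ScheduleProject(project: Project) extends Command

  // Outbound messages: facts about what has already happened
  sealed trait Event
  final case class ProjectScheduled(project: Project) extends Event
  final case class ProjectRejected(project: Project, reason: String) extends Event

  // A command goes in and an event comes out. A command that breaks a
  // (made-up) business rule produces a rejection event instead.
  def execute(command: Command): Event = command match {
    case ScheduleProject(p) if p.estimatedDays <= 0 =>
      ProjectRejected(p, "estimate must be positive")
    case ScheduleProject(p) =>
      ProjectScheduled(p)
  }
}
```

Because both traits are sealed, the compiler can warn when a handler forgets a command, and a reader can tell at a glance which messages flow into the actor and which flow out.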
Understanding the difference between commands and events in a system is important. It can help you to eliminate dependencies. As a small example, consider the following frequently asked question: why is ProjectScheduled part of the ProjectScheduler's message protocol rather than part of the protocol for the actor that receives it? The simple answer is that it is an event rather than a command. The more complete answer, though, is that it avoids a bidirectional dependency. If we moved it to the message protocol of the actor that receives it, the sender would need to know about the ProjectScheduler's message protocol, and the ProjectScheduler would need to know about the sender's message protocol. By keeping the commands and the resulting events in the same message protocol, the ProjectScheduler doesn't need to know anything about the sender. This helps to keep your application decoupled.
A common practice, as just illustrated, is to put the message protocol for the actor into its companion object. Sometimes, it is desirable to share a message protocol among multiple actors. In this case, a protocol object can be created to contain the messages instead, as shown here:
    object ProjectSchedulerProtocol {
      case class ScheduleProject(project: Project)
      case class ProjectScheduled(project: Project)
    }
Constructor Dependency Injection
When you create Props for your actor, you can include an explicit reference to another actor. This is simple to do, and is often a good approach.
Here is an example of direct injection of another ActorRef when an actor is created:
    val peopleActor: ActorRef = ...
    val projectsActor: ActorRef = system.actorOf(ProjectsActor.props(peopleActor))
However, if your actors have a number of references, or the order of instantiation means that the necessary reference isn't available until after the receiving actor is already constructed, you must look for other means.
actorSelection via Path
One option is to refer to the destination actor via its path, as opposed to its reference. This requires that you know the path, of course, possibly accessing it via a static value in the destination actor's companion object.
A path can be resolved only to an actorSelection, however, not to an actual actor reference. If you want the actor reference, you must send the Identify message to the selection and take the reference from the ActorIdentity reply. Alternatively, you can simply send your messages to the actor selection directly.
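As a rough illustration of the Identify round trip, the sketch below resolves a path from outside the actor system, which is one of the places where an Ask is reasonable. The Echo actor, the paths, and the resolve and demo helpers are assumptions made for this example; inside an actor you would more likely handle the ActorIdentity reply in receive than block on it.

```scala
import akka.actor.{Actor, ActorIdentity, ActorRef, ActorSystem, Identify, Props}
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.Await
import scala.concurrent.duration._

object SelectionSketch {
  // Hypothetical destination actor that echoes whatever it receives
  class Echo extends Actor {
    def receive: Receive = { case msg => sender() ! msg }
  }

  // Sends Identify to the selection and reads the reference out of the
  // ActorIdentity reply. None means no actor was found at that path.
  def resolve(system: ActorSystem, path: String): Option[ActorRef] = {
    implicit val timeout: Timeout = Timeout(3.seconds)
    // The path doubles as the correlation id that comes back in the reply
    val reply = (system.actorSelection(path) ? Identify(path)).mapTo[ActorIdentity]
    Await.result(reply, timeout.duration).ref
  }

  // Resolves one path that exists and one that does not.
  def demo(): (Boolean, Boolean) = {
    val system = ActorSystem("selection-sketch")
    try {
      system.actorOf(Props(new Echo), "echo")
      (resolve(system, "/user/echo").isDefined, resolve(system, "/user/missing").isDefined)
    } finally system.terminate()
  }
}
```

Once resolve returns a reference, you are holding a verified ActorRef and can stop using the selection.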
Actor selections have their own problems, though. An actor selection is unverified until you send the Identify message. This means that there might or might not be an actor on the other end, so when you send a message to a selection, there is no guarantee that anything will receive it. In addition, a selection can include wildcards. This means that there might actually be more than one actor on the other end, and when you send a message you are in fact broadcasting it to every actor that matches. This might not be what you want.
Actor selections, when not used carefully, can also lead to increased dependencies. Often, if you are using an actor selection, it implies that you have not carefully thought about your actor hierarchy. You are falling back to an actor selection because you need to reference an actor that is part of a different hierarchy. This creates coupling between different areas of your actor system and makes them more difficult to break apart or distribute later. You can think of this as being similar to littering your code with singleton objects, which is usually considered a bad practice.
Actor reference as a message
A better way to handle more complex dependencies between actors is to send the actor reference as a message to the actor that needs it. For example, you effectively "introduce yourself" to the actor that needs to send messages to you: you wrap an actor reference in a specific message type (so that the destination actor's receive method can distinguish it), and the destination actor holds on to the passed value for later use.
This technique is essentially what we introduced when we used a replyTo actor in the message protocol when we were discussing the Ask pattern. The replyTo is a way to introduce one actor to another so that the first actor knows where to send messages.
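A minimal sketch of such an introduction might look like the following. The Register, Publish, and Notification messages, the Publisher and Listener actors, and the demo helper are hypothetical names chosen for illustration. In this sketch the wiring code performs the introduction; an actor could equally send Register(self) on its own behalf.

```scala
import akka.actor.{Actor, ActorRef, ActorSystem, Props}
import scala.concurrent.{Await, Promise}
import scala.concurrent.duration._

object IntroductionSketch {
  // Register is the dedicated "introduction" type that wraps the reference
  final case class Register(listener: ActorRef)
  final case class Publish(text: String)
  final case class Notification(text: String)

  class Publisher extends Actor {
    // References received through introductions, held for later use
    private var listeners = Set.empty[ActorRef]

    def receive: Receive = {
      case Register(listener) => listeners += listener
      case Publish(text)      => listeners.foreach(_ ! Notification(text))
    }
  }

  // Demo-only listener that exposes what it received through a Promise
  class Listener(seen: Promise[String]) extends Actor {
    def receive: Receive = {
      case Notification(text) =>
        seen.trySuccess(text)
        ()
    }
  }

  def demo(): String = {
    val system = ActorSystem("introduction-sketch")
    try {
      val seen = Promise[String]()
      val publisher = system.actorOf(Props(new Publisher))
      val listener = system.actorOf(Props(new Listener(seen)))
      // Both messages come from this thread, so Register is delivered first
      publisher ! Register(listener)
      publisher ! Publish("hello")
      Await.result(seen.future, 3.seconds)
    } finally system.terminate()
  }
}
```

The Publisher never looks anything up by path; it learns about the Listener only when the introduction arrives.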
This method demands a little more thought, requiring you to design your hierarchy in such a way that the necessary actors can talk to one another. This additional care is good because it means that you are thinking about the overall structure of the application, rather than just worrying about the individual actors. But you need to be careful here, as well. If you find that you're spending too much effort passing actor references around through messages, it might still mean that you have a poorly defined hierarchy.
Conclusion
In summary, by keeping in mind a few basic principles and techniques, you can create actors that retain cohesion, avoid unnecessary blocking and complexity, and allow the data flow within your actor system to be as smooth and concurrent as possible.
Now that you have seen how individual actors can best be structured, we will raise our level of abstraction to consider the flow of data through the entire system, and how correct actor interactions support that flow.