Chapter 4. Orchestrate Anything
Now we’ll shift our attention to what problems process automation can solve for you. This chapter shows that workflow engines can orchestrate anything, especially:
- Software components
- Decisions
- Humans
- RPA bots and physical devices
But what is orchestration? It’s a loaded term with different meanings for different people. For example, in the cloud native community, orchestration is often connected to container management, which is what tools like Kubernetes are doing. In the process automation space, orchestration really means coordination.
Looking back at the BPMN examples earlier in the book, you could say that the workflow engine orchestrates the tasks contained in the models. And as these tasks might call some external services, you could also say the process orchestrates these services. Whenever you add human tasks to the mix, the workflow engine orchestrates the humans. While this sounds a bit odd, it is actually accurate (if you prefer, you can replace orchestrate with coordinate).
In this chapter we’ll use the example of a small telecommunications company. Whenever a customer wants a new mobile phone contract, the customer’s data has to be saved into four different systems: the CRM system, the billing system, the system to provision the SIM card, and the system to register the SIM card and phone number in the network.
To improve the onboarding process for new customers, the company uses a workflow engine. Depending on the situation at hand, each task within the onboarding process might involve:
- Calling a software component
- Evaluating a decision using a decision engine
- A human doing the work manually
- An RPA bot steering some graphical user interface
Each of these options is discussed in more detail in the next sections.
A quick note for the impatient: Chapter 8 will dive into choreography, another approach to automating processes. You don’t need this knowledge to apply orchestration, so we can safely postpone it until you understand more about process automation, but it will be handy to help you better understand the spectrum of solution approaches.
Orchestrate Software
We’ll start with what we as tech folks like most: orchestrating software. A workflow engine can basically orchestrate anything that has an API.
Let’s assume the onboarding process looks like Figure 4-1.
Whenever there is a new customer order, a new instance of the onboarding process is started. The new customer is saved in the CRM and billing systems in parallel. Only if both are successful is the SIM card provisioning triggered and the SIM registered in the network. The service tasks are wired up to API calls, as you saw earlier in this book.
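As a rough sketch, wiring one of these service tasks to an API call with a job-worker-based engine (the Camunda Cloud/Zeebe Java client is used here as an example) could look like the following; the job type and the createCustomerInCrm helper are assumptions for illustration:
import io.camunda.zeebe.client.ZeebeClient;   // package name depends on your client version
import java.util.Map;

public class CrmWorker {
    public static void main(String[] args) {
        ZeebeClient client = ZeebeClient.newClientBuilder()
            .gatewayAddress("localhost:26500")     // assumed local gateway address
            .usePlaintext()
            .build();

        client.newWorker()
            .jobType("create-customer-in-crm")     // assumed job type configured on the service task
            .handler((jobClient, job) -> {
                Map<String, Object> customer = job.getVariablesAsMap();
                createCustomerInCrm(customer);     // hypothetical call wrapping the CRM system's API
                jobClient.newCompleteCommand(job.getKey()).send().join();
            })
            .open();
        // ... keep the application running so the worker can poll for jobs ...
    }

    static void createCustomerInCrm(Map<String, Object> customer) {
        // call the CRM system's API here
    }
}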
This leads to a fully automated process, also known as straight-through processing (STP). This has big advantages over manual processing:
- You save manual labor and reduce your operational spend on this process. At the same time, you increase your capability to scale, as the process can now handle more load.
- You reduce the potential for human error by making sure the data is always transferred correctly.
Different architecture patterns exist, which influence the way you operate the workflow engine and design your process. We’ll look at the most important ones in the next sections: service-oriented architecture, microservices, and functions.
Service-Oriented Architecture (SOA) Services
A typical SOA blueprint is illustrated in Figure 4-2. These blueprints advocate for a central BPM platform containing the workflow engine, which then communicates with the services via a central enterprise service bus (ESB). This centralized infrastructure is the typical pain point and leads to a lot of problems, as described in “Centralized SOA and the ESB”.
This kind of architecture is typically not the architecture of choice for new projects. Of course, there are good reasons to distribute business logic into multiple services, but ideas around microservices are the more modern way to look at it, avoiding failures of the SOA era.
If you are working in a SOA environment, you can still be successful. Make sure that you avoid the issues around centralized tooling and be extra cautious about ownership of process definitions—for example, every business process model needs to be owned by a development team that cares about business logic, and should not be owned by a central BPM team. We’ll discuss this further in “Decentralized Engines”.
Microservices
The movement around microservices took a lot of lessons about SOA into account and defined what some see as SOA 2.0. Sam Newman provides a useful definition in his book Building Microservices (O’Reilly): microservices are “small, autonomous services that work together.”
Regarding them being small, the most important thing to know is that microservices are clearly scoped and focused. A microservice is purpose-built to solve a specific domain problem. Chapter 7 will dive more into the boundaries between services and processes.
To understand the autonomy aspect of a microservice, suppose that your team is fully empowered to own a microservice around SIM card provisioning. You can freely choose your tech stack (typically, as long as you stay within the boundaries of your enterprise architecture) and your team deploys and operates that service itself. This allows you to implement or change the service at your own discretion (as long as you don’t break the API). You don’t have to ask other people to do anything for you, or join a release train. This will make your team fast in delivering changes and actually also increase motivation, as owning their service makes team members truly feel empowered.
Applying the microservice architectural style does have an impact on process automation. Automating one business process typically involves multiple microservices. With SOA, the view was that an orchestration process “outside” of the services was required to piece them together. The microservices style doesn’t allow business logic outside of the microservices, which means that the collaboration between them is described within the microservices themselves.
For example, a customer onboarding microservice owns the business logic around onboarding, which includes the onboarding business process. The team implementing the microservice can decide to use a workflow engine and BPMN to automate that process, which then orchestrates other microservices. The decision is internal to the microservice and not visible from the outside; it is an implementation detail.
Communication between the microservices is done via APIs, and not through the BPM platform, as was the case with SOA. This scenario is sketched in Figure 4-3.
In microservices communities, the argument is often made to not use orchestration, but to let microservices collaborate in an event-driven way. We’ll table this question for now and discuss it in Chapter 8.
Serverless Functions
Microservices might be small, but you can disassemble your architecture into even smaller pieces: functions.
A serverless function is similar to a stateless function in your favorite programming language, but operated in a hosted cloud infrastructure. This means you don't have to provide an environment for the function to run in yourself. A serverless function takes some input and produces some output, but needs to be completely self-contained. For example, you can't hold any data that survives the current invocation (unless you store it in some external resource). Serverless is popular because it promises elastic scalability: you don't pay for computational resources when your functions aren't being used, and when your traffic skyrockets, those resources are automatically scaled up to handle it.
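As a minimal sketch (assuming Java on AWS Lambda; the input and output shapes and the provisioning call are illustrative), such a self-contained function might look like this:
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import java.util.Map;

public class ProvisionSimCardFunction implements RequestHandler<Map<String, Object>, Map<String, Object>> {

    @Override
    public Map<String, Object> handleRequest(Map<String, Object> customer, Context context) {
        // completely self-contained: everything needed arrives as input,
        // and no state survives this invocation
        String simCardId = provisionSimCard(customer);   // hypothetical call to the provisioning system
        return Map.of("simCardId", simCardId);
    }

    private String provisionSimCard(Map<String, Object> customer) {
        // ... call the SIM card provisioning system ...
        return "sim-4711";
    }
}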
But having a bunch of functions raises the question of how they interact to fulfill a goal. Suppose you want to use this approach for customer onboarding. You implement one function to add the customer to the CRM system, one to add them to the billing system, one for provisioning the SIM card, and so on.
The simplest way to provide the onboarding functionality would be to create a combined function that includes or calls the other functions:
async function onboardCustomer(customer) {
  const crmPromise = createCustomerInCrm(customer);          // 2 seconds
  const billingPromise = createCustomerInBilling(customer);  // 100 ms
  await Promise.all([crmPromise, billingPromise]);           // wait for both to finish
  const simCard = await provisionSimCard(customer);          // 1 second
  await registerSim(simCard);                                // 4 seconds
}
// --> 7 seconds runtime for onboardCustomer
While this looks simple, it has severe downsides. First, it only works if all of the functions are available and return fast results. Otherwise, you can easily end up with a customer created in CRM and billing that never gets a SIM card because the last function crashed. Additionally, this solution accumulates latency, as indicated in the previous code snippet. Even if a longer response time isn’t a problem, it will add up on your cloud bill, as serverless providers charge for the computing time consumed by your function.
So, a combined function is best avoided. Instead, most projects use their cloud provider’s messaging capabilities to create a chain of functions. Imagine it like this:
// callback function registered for message "customerOnboardingRequest"
function onboardCustomer(customer) {
  // ... do business logic ...
  send('createCustomerInCrmRequest');
}

// callback function registered for message "createCustomerInCrmRequest"
function createCustomerInCrm(customer) {
  // ... do business logic ...
  send('createCustomerInBillingRequest');
}

// callback function registered for message "createCustomerInBillingRequest"
function createCustomerInBilling(customer) {
  // ... do business logic ...
}
This way, you get rid of the one expensive combined function and make your code more resilient. The message queue will remember what to do next even if a function’s code fails.
But now you may end up with problems similar to those associated with batches and streaming: you don’t have end-to-end visibility of your chain, you have no single point where you can adjust it, and it is hard to understand and resolve failures. To mitigate these problems (which will be explained in more detail in “Limitations of Other Implementation Options”), you can use a workflow engine to orchestrate your functions. To do this, you will need a workflow engine that runs as a managed service. This means that the workflow engine itself is also a serverless resource for you.
In the onboarding example, the team responsible for developing the customer onboarding function can also define the process model, as visualized in Figure 4-4. In this process model, every service task is glued to a function call. How this is technically done depends on your exact cloud environment; typical examples are native function calls, HTTP calls via an API gateway, or messages. Your workflow engine of choice might also provide prebuilt connectors you can use (one of the examples where connectors, introduced in “Using Prebuilt Connectors”, make a lot of sense).
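For example, gluing the "create customer in CRM" task to a function exposed via an API gateway could be as simple as an HTTP call from the task's glue code; the endpoint URL and payload handling below are assumptions for illustration:
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class CreateCustomerInCrmGlueCode {

    // Invoked by the service task; customerJson comes from the process variables
    String callFunction(String customerJson) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create("https://example.execute-api.eu-central-1.amazonaws.com/prod/createCustomerInCrm"))  // assumed API gateway URL
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(customerJson))
            .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();   // result can be written back to the process instance
    }
}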
Whenever the team deploys the onboarding function, it also needs to deploy the process model on the workflow engine, which can probably be automated.
Every major cloud provider today has stateful function orchestration capabilities in its platform (AWS Step Functions, Azure Durable Functions, GCP Cloud Workflows). Unfortunately, they all miss important workflow engine functionality as described in this book. Specifically, none of them uses BPMN, which leads to limited language power (see “Workflow Patterns”) and no or very poor visualization capabilities (see “Benefits of Graphical Process Visualizations”).
So there is additional value in leveraging BPMN-based workflow engines to orchestrate functions, which is a very promising area to explore. You’ll find an executable example using Camunda Cloud and AWS Lambda on the book’s website.
Modular Monoliths
Not every company is able or willing to dump its monolith in favor of fine-grained systems like microservices or functions. In fact, there is even a growing trend toward embracing the monolith for some of its advantages. Because a monolith is not a distributed system, it doesn’t have to constantly fight with remote communication or consistency issues. And you can still apply modularization strategies so that any changes only affect small parts of the code.
A monolith can be perfectly fine if it solves your problem, which often has a lot to do with your internal organization and size. A development team of 10 people might very well master a monolith, but struggle with the added complexity around working on 100 microservices. On the other hand, an organization with a thousand developers might not be productive if it builds and releases one single monolith.
The interesting observation with regard to processes is that you can still apply the practices described in this book within your monolith. You will (hopefully) structure your monolith in a meaningful way, for example by forming components, sorting the code into packages, and creating interfaces for important services. To design executable processes, you simply orchestrate these internal components—for example, this might translate to using local method calls instead of remote calls. The workflow engine itself can be embedded as a library into your monolith. Process definitions simply become one additional resource in the source code of the monolith. This is visualized in Figure 4-5.
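To sketch what this might look like with an embeddable engine (Camunda 7 is used here as an example; the BillingService component is an assumption), a service task could be bound to a delegate that makes a plain local method call:
import org.camunda.bpm.engine.delegate.DelegateExecution;
import org.camunda.bpm.engine.delegate.JavaDelegate;

// Delegate attached to the "create customer in billing" service task;
// it calls a component inside the monolith via a local method call, no remote communication involved.
public class CreateCustomerInBillingDelegate implements JavaDelegate {

    private final BillingService billingService;  // hypothetical internal component of the monolith

    public CreateCustomerInBillingDelegate(BillingService billingService) {
        this.billingService = billingService;
    }

    @Override
    public void execute(DelegateExecution execution) {
        String customerId = (String) execution.getVariable("customerId");
        billingService.createCustomer(customerId);  // local call into the billing module
    }
}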
This way, you can add the benefits of using a workflow engine (long-running capabilities with state management, visibility of the process) without losing the benefits of a monolith (not having a distributed system). Adding a workflow engine typically does not have much impact on performance. Of course, this depends on the tool you choose and the architecture you set up, but even with a workflow engine operated as its own service, the overhead can be minimal (like with a database, which is also a remote service that is consumed).
Furthermore, having a workflow engine might give you the possibility to deploy changed process models without redeploying the whole monolith. That alone is sometimes enough motivation for introducing a workflow engine to a large monolith.
Deconstructing the Monolith
While a modular monolith can be a valid solution, many companies are on a migration path, moving away from monoliths and toward a more fine-grained architecture. Process automation can help with this journey. Imagine you have the telco monolith from the last section in place, but want to change the customer onboarding procedure. Instead of squeezing the process into your monolith, you can instead take the opportunity to create a (micro)service to do the onboarding.
To do this, you have to create APIs for the services required, which means that you start to add facades to your existing monolith. At the same time, you have to remove hardwired connections between components; for example, the CRM component should no longer directly call the billing component for new customers, as you want to control this connection via the new (micro)service. Figure 4-6 visualizes this approach.
These projects are typically not easy to tackle. And while this might feel like putting lipstick on a pig, it is a first step in the right direction toward deconstructing the monolith and increasing agility. If you keep doing this for every process you touch, you will decrease the monolith’s footprint slowly over time, in favor of a more fine-grained architecture. The most successful architecture transformation I’ve seen did exactly this: the developers did not do a sudden transformation, but kept migrating, one step at a time, with discipline and endurance. The first steps were hardly visible, but after five years, a huge difference could be seen.
Orchestrate Decisions
Let’s extend the onboarding example to first validate the customer order by invoking some decision logic or business rules. The resulting process is shown in Figure 4-7.
A decision involves deriving a result (output) from given facts (input) on the basis of defined logic. While this decision logic could be executed by a human, it often makes sense to automate it, especially in automated processes. Of course, it could simply be hardcoded, but there are certain characteristics that justify the use of specific tooling.
First, decision logic is important business logic and needs to be understood by business stakeholders. And compared to process control flows, decision logic changes much more rapidly, so it is vital for business agility to be able to easily change this logic. Whenever you learn about a good reason to not validate certain customer orders, you want to adjust the decision logic right away before you onboard more customers with high-risk profiles. You definitely want to avoid situations where nobody really knows the decision logic because it is buried in tons of code that was written years ago.
On top of that, you gain visibility into decision instances, so that you can understand why a certain customer order was successfully validated or not.
This is the domain of decision automation. The core software components here are decision engines, which take decision logic expressed in a model and apply it to make decisions based on the given input. These engines typically can also version decision models and store a history of decisions that have been made. You might recognize some similarity to workflow engines, but decisions are not long-running; they can be made in one atomic step.
Decision Model and Notation (DMN)
As with BPMN for business processes, there is a globally adopted standard available for decisions: Decision Model and Notation (DMN). It is close to BPMN, and they’re often used alongside one another.
Let’s take a quick look at what DMN can do. The two concepts I want to focus on in this book are:
- Decision tables: These are used to define decision logic. Years of experience with various formats have shown that tables are a great way to express decision logic and business rules.
- Expression language: In order to automate decisions, you have to express logic in a format the computer understands. At the same time, you want to end up with decision logic that can be read by non-programmers. This is why DMN defined FEEL, the friendly-enough expression language that is executable, but also human-readable. As mentioned in Chapter 2, some workflow engines also use FEEL within BPMN processes, for example to decide which path to take in a process flow.
Let’s look at an example. Assume you want to decide if you can onboard the customer automatically. For this, you create the DMN model visualized in Figure 4-8.
You will use certain data points as input: namely, the payment type, some scoring for the customer’s neighborhood, and the monthly payment associated with the contract. This will result in an output, which in this example is one Boolean field that indicates whether a manual check is necessary.
Every row in such a table is one rule. The cells on the input side contain the rules or expressions and will resolve to true or false. Checks included in this example are paymentType == "invoice" and monthlyPayment < 25. These expressions are formed by combining the information in the table header with the exact cell value.
Most real-life examples are as simple as the one shown here, but it is also possible to create more sophisticated expression logic using FEEL. To give you an idea, the following expressions are all possible:
Party.Date < date("2021-01-01")
Party.NumberOfGuests in [25..100]
not( Party.Cancelled )
In a DMN table you can have as many input columns as you want. The expressions are connected using a logical AND. If all expressions resolve to true, one says the rule "fires."
A DMN table can control what happens in this case. This is the hit policy you can see at the top of Figure 4-8. In the example, it is “first”; this means that the first rule (starting from the top of the table) that fires will determine the result. So in this case, if a customer selected “prepaid,” the result is clear in the first row: a manual check does not need to be performed. Other hit policies could be that you expect only one rule to fire because there is no overlap, or that you sum up the results of all firing rules, e.g., to sum up risk scores.
While the example table has only one output column, you can have as many as you want.
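To make the structure tangible, a simplified version of such a decision table could look roughly as follows. Only the first rule and the "first" hit policy are taken from the description above; the remaining rows and values are purely illustrative, not taken from Figure 4-8:

Hit policy: First

| paymentType | customerRegionScore | monthlyPayment | manualCheckNecessary |
| "prepaid"   | -                   | -              | false                |
| "invoice"   | < 50                | >= 25          | true                 |
| "invoice"   | -                   | < 25           | false                |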
Under the hood, a DMN decision table is stored as an XML file, like a BPMN process. Typical decision engines parse that decision model and then provide an API to make decisions, as shown in the following pseudocode:
input = Map.putValue("paymentType", "invoice")
           .putValue("customerRegionScore", 34)
           .putValue("monthlyPayment", 30);

decisionDefinition = dmnEngine.parseDecision('automaticProcessing.dmn');
output = dmnEngine.evaluateDecision(decisionDefinition, input);
output.get('manualCheckNecessary');
This pseudocode uses a decision engine in a stateless way. It parses a file and then evaluates the decision directly. While this is very lightweight, you might want to leverage some further capabilities of a decision engine, like versioning of the decision models or keeping a history of decisions. So your code might look more like this:
input = Map.putValue("paymentType", "invoice")
           .putValue("customerRegionScore", 34)
           .putValue("monthlyPayment", 30);

output = dmnEngine.evaluateDecision('automaticProcessing', input);
output.get('manualCheckNecessary');
Decisions in a Process Model
Decision engines can of course be used standalone. While there are good cases for doing that, this book focuses on decisions in the context of process automation. In that context, decisions can be hooked into a process.
In BPMN there is even a specific “business rule” task type available for this. It is called a business rule task instead of a decision task for historical reasons, as these tools were called business rule engines at the time BPMN was standardized; today, the industry speaks of decision engines.
While the business rule task defines that a decision shall be made by a decision engine, it does not specify what this means on a technical level. So, you can write your own glue code to invoke the decision engine of your choice.
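For illustration, a sketch of such glue code using Camunda's standalone DMN engine might look roughly like this; the decision key and variable names mirror the earlier pseudocode, and error handling is omitted:
import org.camunda.bpm.dmn.engine.DmnDecision;
import org.camunda.bpm.dmn.engine.DmnDecisionTableResult;
import org.camunda.bpm.dmn.engine.DmnEngine;
import org.camunda.bpm.dmn.engine.DmnEngineConfiguration;
import java.util.Map;

public class OrderValidationGlueCode {

    private final DmnEngine dmnEngine = DmnEngineConfiguration
        .createDefaultDmnEngineConfiguration()
        .buildEngine();

    public boolean isManualCheckNecessary(Map<String, Object> customerData) {
        // parse the decision model shipped with the application
        DmnDecision decision = dmnEngine.parseDecision("automaticProcessing",
            getClass().getResourceAsStream("/automaticProcessing.dmn"));

        // evaluate the decision table with the process data as input
        DmnDecisionTableResult result = dmnEngine.evaluateDecisionTable(decision, customerData);
        Boolean manualCheckNecessary = result.getSingleResult().getSingleEntry();
        return manualCheckNecessary;
    }
}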
An alternative is to use vendor-specific extensions. For example, Camunda provides a BPMN workflow engine and a DMN decision engine, and has integrated them under the hood. This means that you can simply refer to a decision in the process model. In operations, audit information about why a decision was made is then also available directly from the history of the process instance.
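With Camunda 7, for instance, such a reference might look roughly like this in the BPMN XML (the task id and result variable name are illustrative):
<bpmn:businessRuleTask id="validateCustomerOrder" name="Validate customer order"
    camunda:decisionRef="automaticProcessing"
    camunda:resultVariable="manualCheckNecessary"
    camunda:mapDecisionResult="singleEntry" />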
Decision automation with DMN is a great way to improve business–IT collaboration and increase agility as decision logic gets easier to change. DMN is a great supplement to BPMN, as automating decisions helps to automate tasks within processes.
Orchestrate Humans
Of course, not every process is fully automated, even if most companies try to automate their processes to the highest possible extent. There are three typical reasons to let humans work on tasks:
- With automation, you often need to have human task management as a fallback. Humans can easily work on the 10% of nonstandard cases that would be too expensive to automate, or deal with exceptional situations.
- Human task management is often a first step toward automation. It allows you to quickly develop, roll out, and verify a process model, perhaps with only human tasks. Then you can increase automation by "replacing" humans with machines task by task.
- Humans continue to play a role in more creative areas of processes, such as handling rare cases or making decisions. Removing repetitive tasks by automating them will not only increase their capacity for doing this, but will also remove friction between manual and automated work.
Please be aware that your business department is unlikely to talk about “orchestrating humans”; the more common (and psychologically acceptable) term is human task management.
A version of the onboarding process that uses human task management could look like Figure 4-10.
Even if the tasks themselves are not automated, using the workflow engine to automate the control flow still has a lot of benefits, especially if you compare it with the most likely alternative—passing around new contracts via email, with different people adding data to all these systems manually. For example:
- You can make sure that no customer order gets lost or stuck, thereby increasing reliability in your services.
- You can control the sequence of tasks. For example, you could parallelize entering the CRM and billing data, but still make sure that both need to finish before anything is provisioned. This speeds up your overall processing time.
- You can make sure that the right data is attached to a process instance, so everybody involved always has everything they need right at hand.
- You can monitor cycle times and SLAs, making sure that no customer order hangs for too long. You can also analyze more systematically where you can make improvements, which helps you increase efficiency.
- You will get some KPIs around your processes, for example about the number of customer orders, types of contracts, and so on.
Tip
Business departments might not talk about workflow engines, orchestration, or human task management at all, even if this technology is working in the background. For example, take the approval of incoming invoices. Maybe a manager has a user interface showing all open invoices, where they can easily approve them so that they get paid; someone else will do the actual paying. This is a user experience you might be familiar with from accounting tools. But in the background, there might still be a workflow engine with a process model at play, so the list of invoices to be approved may in reality be a list of human tasks created from process instances. In this case, neither the process model nor the human tasks are obvious from a business perspective.
We’ll discuss some interesting aspects of human task management in the next sections.
Task Assignment
One important question is who should perform a particular task. Most workflow products provide a life cycle for every human task out of the box, like the one shown in Figure 4-11.
This example allows you to differentiate between candidate people and assigned people. Any candidate might do the task, like “somebody from the sales team” or “Joe, Mary, Rick, or Sandy.” The first of these candidates to start the work claims the task, and only then is it assigned to them personally. This claiming avoids two people working on the same task by coincidence. A task can be delegated when the assigned person wants somebody else to resolve (part of) the work. When they finish, it is passed back to the assignee. This is different from reassigning the work, which means handing over the task to another person, who then is fully responsible for completing the work at hand.
As a general rule, you should route human tasks in your process to groups of people instead of specific individuals (e.g., “the sales team”). This not only eases assignment rules, but also accommodates new hires, departures, vacations, sick leave, etc. Of course, there can be exceptions, like if a certain region has a dedicated salesperson assigned to it.
Please note that not all humans in your process have to be employees of your company. You can also assign work to customers—for example, asking them to upload missing documents.
In BPMN, the assignment of people is controlled by attributes of every user task. Here is an example:
<bpmn:userTask id="checkPayment" name="Check payment">
  <bpmn:potentialOwner>
    <bpmn:resourceAssignmentExpression>
      <bpmn:formalExpression>sales</bpmn:formalExpression>
    </bpmn:resourceAssignmentExpression>
  </bpmn:potentialOwner>
</bpmn:userTask>
Additional Tool Support
Some tools provide additional capabilities around notifications, timeout handling and escalation, vacation management, or replacement rules. These capabilities can typically be configured as attributes of tasks and are as such not graphically visible in the process model.
It is a good idea to leverage these capabilities and not manually model these aspects into each and every process. So while you might be tempted to model an email reminder about work that has been waiting in the queue for too long via BPMN, please avoid it if your tool can do that out of the box using a simple configuration option. This will make your models easier to create, read, and understand, as you can see in Figure 4-12.
Supporting human task management is its own challenge for workflow engines. In addition to the vendor needing to provide graphical user interfaces for end users, the engine also needs to support extensive capabilities around filtering and querying of tasks.
While this might sound easy at first, it can become quite complicated if you need to deal with thousands of employees working with millions of tasks on a daily basis. You also face the challenge of providing flexible query possibilities without allowing single users to bring the performance down for the whole company. How this is implemented varies between vendors, but it is definitely a very different type of workload than microservices doing task after task after task.
The User Interface of User Tasks
The workflow engine is controlling the process. It knows for every process instance what the next activity is that the human needs to perform. But the human needs to know this too! So the workflow engine needs a way to communicate with real people.
One approach is to use the tasklist application provided by your vendor, as introduced in “Tasklist Applications”. These tools typically allow end users to filter their tasks. To make this useful, they might need to blend in business data, as end users want to see not only the task name, but also business data like order IDs, products applied for, or the applicant's name.
Another important aspect is what kinds of task forms are supported. Some products allow the creation of only basic forms, by defining simple attributes. Others provide their own form modeler. Some allow you to embed HTML or to use custom forms like a one-pager in your custom web application, or a form created by a dedicated form builder application. Keep in mind that you’ll often need to blend data from the process with domain data from entities referenced in the task, in a single form, as shown in Figure 4-13—this results in better usability for your users.
Using your workflow vendor’s tasklist application can be a good way to get started quickly. You can immediately build a prototype for your process and click through it, probably even to verify the process model with business stakeholders. Most people are much better at understanding a process model if they can role play using real-life forms, instead of reading a formal model.
But there are also situations where you have requirements for a more customized way of involving humans. For example, you might use email, chat, or voice interaction. The workflow engine could send an email to a person who needs to do something. This email contains all relevant information for that person to do the task at hand. When they are finished, they can indicate that either by replying to the email or by clicking a link in the email.
Two other common scenarios are using a third-party tasklist application and developing a completely customized user interface. Let’s briefly explore both options.
Using an external tasklist application
The workflow engine can invoke the API of an external application, as visualized in Figure 4-14. This might be a tasklist app that is already widely adopted in the company, from the likes of SAP or Siebel, or some very broad application like Trello or Wunderlist. I’ve also seen one customer using screens on the mainframe to handle open tasks, as this was the way all clerks did their daily work. Tasklists might also be referred to as job lists, to-do lists, or inboxes.
Whatever form it takes, this application gives the user possibilities to see all open tasks, to indicate that they’ve started working on a task, and to mark tasks as completed. The status gets reported back to the workflow engine. When implementing such an integration, you will need to take care of:
- Creating tasks in the tasklist application whenever a process instance enters a user task (see the sketch below)
- Completing user tasks in the workflow engine and moving on in the process when the user is finished
- Canceling tasks, triggered either by the workflow engine or the user in the UI
- Transferring business data to be edited into the to-do application, and vice versa
It has also proved to be a good idea to think about a problem detection mechanism in case the two systems diverge, for example because of inconsistencies caused by failures with remote calls.
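As a minimal sketch of the first integration point, assuming the external tasklist exposes a simple REST endpoint for creating to-do items (the URL and payload are hypothetical):
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ExternalTasklistClient {

    // Called whenever a process instance enters a user task:
    // create a matching to-do item in the external tasklist and return its id,
    // which is needed later to complete or cancel the item
    String createExternalTask(String taskName, String businessKey) throws Exception {
        String payload = "{\"title\": \"" + taskName + "\", \"reference\": \"" + businessKey + "\"}";
        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create("https://tasklist.example.com/api/todos"))   // hypothetical tasklist API
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(payload))
            .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}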
Using a third-party app is often the best fit when there is an existing tasklist application that is already rolled out to employees, as it allows them to continue using a familiar application. They might not even notice that a workflow engine is at play or that a product has been replaced under the hood. In that case, issues of authentication and authorization are often already solved.
Building a customized tasklist application
If you need a more customized experience than the vendor’s tasklist application can deliver, you can develop a bespoke application yourself. This can be adapted to your needs without compromise. You have freedom of choice among development frameworks and programming languages, and tasks inside your custom application can follow your style guide and usability concepts. This is often done if you embed workflow tooling into your own software product, or if you want to roll out your tasklist to hundreds or thousands of users and efficiency in the UI is important.
This approach also allows you to satisfy very special requirements. For example, you might face a situation where you have several user tasks that are heavily interdependent from a business point of view and should therefore be completed in one step by the same person. Imagine a document input management process where you decided to manage each document with a separate process instance, but present mailings consisting of several such documents as a bundled task to the user. An example is shown in Figure 4-15.
In one real-life project I was involved in, this approach allowed an organization’s employees to work much more efficiently. This kind of grouping is no problem with a customized tasklist, but might not be doable with out-of-the-box applications.
Orchestrate RPA Bots
Let's switch our attention from orchestrating humans to orchestrating bots—robotic process automation (RPA) bots, to be precise. RPA is a solution for dealing with legacy applications that do not offer an API, as many older systems were developed at a time when there was not such a big need for connectivity. RPA tools automate the control of existing graphical user interfaces. Big topics are screen scraping, image processing, OCR, and robots steering GUIs. It's like the Windows macro recorder on steroids.
RPA has experienced rapid growth recently and has become a huge market recognized by analysts.
Suppose your billing system is very old and does not provide any kind of API. You can use the RPA tool of your choice to automate the data entry for your onboarding process. In RPA lingo this is called a bot. How this bot is developed depends on the specific tool, but typically you record GUI interactions and edit the steps the bot needs to take in the RPA’s GUI, like “click this button” and “enter text in this text field.” An example is shown in Figure 4-16.
It is important to note that the bot should implement one function only. In terms of the BPMN process, the bot is just another way to implement one service task, as shown in Figure 4-17.
Of course, bots are always much more brittle than a real API call, so whenever possible you should prefer to use an API. But unfortunately, real life is full of obstacles. The system might not provide the API you need, or you might be facing a shortage of development resources. Suppose entering data in the billing system is getting delayed because of people being overloaded with onboarding work. The business department needs to solve this problem quickly, as customers are starting to cancel their orders due to the long delays (which causes even more manual work, leading to a very unfortunate downward spiral). But IT is buried in other urgent work, so they cannot do this integration right away.
Developing an RPA bot can be a good way for the business department to move forward quickly without the need for IT, which is beneficial for the company at this stage. But you need to keep in mind that bots are hard to maintain and depend on user interfaces that might change quickly—and if the RPA solutions and bots are not governed or operated by IT, this can lead to architecture problems down the road.
So in this example, you should directly plan for replacing the bot with a real API. I’ve even seen organizations that require projects to report technical debt whenever they introduce a new RPA bot to make sure this is addressed later.
You can tackle some of the problems around the brittleness of bots by keeping human tasks as a fallback in case there are errors within the RPA bots. This allows you to concentrate on automating the 80% of cases and route the exceptions to a human, as shown in Figure 4-18.
Now, there is one risk you should be aware of. As you saw in Figure 4-16, an RPA flow is also a kind of process model. This can lead some companies to try to automate core business processes with RPA tools, especially if they suffer from limited bandwidth in IT. Unfortunately, this does not work out.
Warning
RPA is not meant to automate core business processes. Using the RPA tool as a low code process automation platform is a trap. Using RPA flows to automate whole business processes has severe downsides and risks. All the disadvantages of low-code apply here, and additionally RPA flows quickly become a wild mix of granularities, containing business process flow logic as well as control sequences for the user interface.
The workflow engine should always be the primary driver that controls the overall process, and calls RPA bots whenever it needs to integrate with a resource that cannot be called via an API for whatever reason.
RPA is applied in one step of the process. As soon as you can switch to an API, you should do that. The beauty of this architecture is that you might not even need to change your process model: simply call the API instead of the RPA bot.
Orchestrate Physical Devices and Things
But let’s not stop with RPA bots. We can also orchestrate physical devices, like real lab robots.
Technically speaking, orchestrating devices boils down to orchestrating software, as devices are integrated via APIs. Still, there are some specific nuances to it. In particular there is a common pattern with regard to emerging use cases around the Internet of Things (IoT), where a myriad of devices connect to the internet and produce data. This data can lead to actions, which then might involve orchestration.
Let’s understand that by looking at a use case around airplane maintenance. Assume that an airplane produces a constant stream of sensor data—for example, the current oil pressure. A stream processor could derive some actual knowledge out of that measurement, such as an oil pressure that is too low. This is another stream of data. But now we have to act on that insight, and schedule maintenance at the next possible opportunity. This is where a process starts, because now we care that the mechanic looks into the insight within a defined time frame, decides how to handle the issue, and schedules the appropriate maintenance actions. This is visualized in Figure 4-19.
The transition from a passive stream to a process reacting to data in the stream is very interesting. In a concrete real-life project, a stateful connector might be developed that starts a process instance for a mechanic only once for every insight. If the oil pressure keeps being reported as too low for the same hardware, this does not start additional process instances. If the oil pressure goes back to normal, this insight is routed to the existing process instance, so that this process instance can take action. For instance, it might simply be canceled as the maintenance is no longer necessary.
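A rough sketch of such connector logic, again using the Camunda Cloud (Zeebe) Java client as an example, might look like the following; the process id, the message name, and the in-memory bookkeeping are assumptions for illustration:
import io.camunda.zeebe.client.ZeebeClient;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class OilPressureConnector {

    private final ZeebeClient client;
    private final Set<String> startedHardwareIds = ConcurrentHashMap.newKeySet();  // simplistic in-memory bookkeeping

    public OilPressureConnector(ZeebeClient client) {
        this.client = client;
    }

    public void onInsight(String hardwareId, String insightType) {
        if ("OIL_PRESSURE_TOO_LOW".equals(insightType) && startedHardwareIds.add(hardwareId)) {
            // first insight for this piece of hardware: start exactly one process instance
            client.newCreateInstanceCommand()
                .bpmnProcessId("maintenance")              // assumed process id
                .latestVersion()
                .variables(Map.of("hardwareId", hardwareId))
                .send().join();
        } else if ("OIL_PRESSURE_BACK_TO_NORMAL".equals(insightType)) {
            // route the follow-up insight to the already running process instance
            client.newPublishMessageCommand()
                .messageName("OilPressureBackToNormal")    // assumed message name in the process model
                .correlationKey(hardwareId)
                .send().join();
            startedHardwareIds.remove(hardwareId);
        }
    }
}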
Conclusion
This chapter demonstrated that workflow engines can orchestrate anything, from software to decisions to bots and devices. This should help you understand what kinds of problems process automation can solve. Of course, in real life the use cases overlap, so processes typically involve a mix of components. To implement an end-to-end process you might need to orchestrate humans, RPA bots, SOA services, microservices, decisions, functions, and other software components, all within the same process.
Note that some people do not talk about orchestration, but rather about human task management and straight-through processing. This is a subtle point based in the psychology of terminology.