While it is not a direct instance management technique, throttling enables you to restrain client connections and the load they place on your service. You need throttling because software systems are not elastic, as shown in Figure 4-7.
That is, you cannot keep increasing the load on the system and expect an infinite, gradual decline in its performance, as if stretching chewing gum. Most systems will initially handle the increase in load well, but then begin to yield and abruptly snap and break. All software systems behave this way, for reasons that are beyond the scope of this book and are related to queuing theory and the overhead inherent in managing resources. This snapping, inelastic behavior is of particular concern when there are spikes in load, as shown in Figure 4-8.
Even if a system is handling a nominal load well (the horizontal line in Figure 4-8), a spike may push it beyond its design limit, causing it to snap and resulting in the clients experiencing a significant degradation in their level of service. Spikes can also pose a challenge in terms of the rate at which the load grows, even if the absolute level reached would not otherwise cause the system problems.
Throttling enables you to avoid maxing out your service and the underlying resources it allocates and uses. When throttling is engaged, if the settings you configure are exceeded, WCF will automatically place the pending callers in a queue and serve them out of the queue in order. If a client's call timeout expires while its call is pending in the queue, the client will get a TimeoutException. Throttling is inherently an unfair technique, because those clients whose requests are buffered will see a degradation in their level of service. However, in this case, it is better to be smart than just: if all the callers in the spike are allowed in, that will be fair, but all callers will then see a significant drop in the level of service as the system snaps. Throttling therefore makes sense when the area under the spike is relatively small compared with the area under the entire load graph, implying that the probability of the same caller being queued successively is very low. Every once in a while, in response to a spike, some callers will get buffered, but the system as a whole will still function well. Throttling does not work well when the load increases to a new level and remains constant at that level for a long time (as shown in Figure 4-9). In that case, all it does is defer the problems a bit, eventually causing all callers to time out. Such a system should be designed from the ground up to handle the higher level of load.
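As noted above, a client whose call waits in the throttling queue past its own timeout receives a TimeoutException. On the client side, that timeout is governed by the binding's SendTimeout property. The following is a minimal sketch of guarding against such a timeout (the MyContractClient proxy class and the endpoint address are hypothetical):
NetTcpBinding binding = new NetTcpBinding();
binding.SendTimeout = TimeSpan.FromSeconds(10); //Give up after 10 seconds

MyContractClient proxy = new MyContractClient(binding,
                  new EndpointAddress("net.tcp://localhost:8000/MyService"));
try
{
   proxy.MyMethod();
   proxy.Close();
}
catch(TimeoutException)
{
   //The call did not complete in time, for example because it spent
   //too long in the service's throttling queue
   proxy.Abort();
}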
Throttling is done per service type; that is, it affects all instances of the service and all its endpoints. This is done by associating the throttle with every channel dispatcher the service uses.
WCF lets you control some or all of the following service consumption parameters:
- Maximum number of concurrent sessions
Indicates the overall number of outstanding clients that can have a transport session with the service. In plain terms, this represents the maximum overall number of outstanding clients using TCP, IPC, or either of the WS bindings (with reliability, security, or both). Because the connectionless nature of a basic HTTP connection implies a very short transport session that exists only for the duration of the call, this number usually has no effect on clients using the basic binding or a WS binding without a transport session; such clients are instead limited by the maximum allowed number of concurrent calls. The default value is 10.
- Maximum number of concurrent calls
Limits the total number of calls that can currently be in progress across all service instances. This number should usually be kept at 1 to 3 percent of the maximum number of concurrent sessions. The default value is 16.
- Maximum number of concurrent instances
Controls the total number of concurrently alive contexts. The default value is 26. How instances map to contexts is a product of the instance context management mode, as well as context and instance deactivation. With a per-session service, the maximum number of instances is both the total number of concurrently active instances and the total number of concurrent sessions. When instance deactivation is employed, there may be far fewer instances than contexts, and yet clients will be blocked if the number of contexts has reached the maximum number of concurrent instances. With a per-call service, the number of instances is actually the same as the number of concurrent calls. Consequently, the maximum number of instances with a per-call service is the lesser of the configured maximum concurrent instances and maximum concurrent calls. The value of this parameter is ignored with a singleton service, since it can only have a single instance anyway.
Warning
Throttling is an aspect of hosting and deployment. When you design a service, you should make no assumptions about throttling configuration—always assume your service will bear the full brunt of the client's load. This is why, although it is fairly easy to write a throttling behavior attribute, WCF does not offer one.
Administrators typically configure throttling in the config file. This enables you to throttle the same service code differently over time or across deployment sites. The host can also programmatically configure throttling based on some runtime decisions.
Example 4-20 shows how to configure throttling in the host config file. Using the behaviorConfiguration tag, you add to your service a custom behavior that sets the throttled values.
Example 4-20. Administrative throttling
<system.serviceModel>
   <services>
      <service name = "MyService" behaviorConfiguration = "ThrottledBehavior">
         ...
      </service>
   </services>
   <behaviors>
      <serviceBehaviors>
         <behavior name = "ThrottledBehavior">
            <serviceThrottling
               maxConcurrentCalls     = "500"
               maxConcurrentSessions  = "10000"
               maxConcurrentInstances = "100"
            />
         </behavior>
      </serviceBehaviors>
   </behaviors>
</system.serviceModel>
The host process can programmatically throttle the service based on some runtime parameters. You can only configure the throttle programmatically before the host is opened. Although the host can override the throttling behavior found in the config file by removing it and adding its own, you typically should provide a programmatic throttling behavior only when there is no throttling behavior in the config file.
The ServiceHostBase class offers the Description property of the type ServiceDescription:
public abstract class ServiceHostBase : ...
{
   public ServiceDescription Description
   {get;}
   //More members
}
The service description, as its name implies, is a description of the service, with all its aspects and behaviors. ServiceDescription contains a property called Behaviors of the type KeyedByTypeCollection<I>, with IServiceBehavior as the generic parameter.
Example 4-21 shows how to set the throttled behavior programmatically.
Example 4-21. Programmatic throttling
ServiceHost host = new ServiceHost(typeof(MyService));

ServiceThrottlingBehavior throttle;
throttle = host.Description.Behaviors.Find<ServiceThrottlingBehavior>();

if(throttle == null)
{
   throttle = new ServiceThrottlingBehavior();
   throttle.MaxConcurrentCalls     = 12;
   throttle.MaxConcurrentSessions  = 34;
   throttle.MaxConcurrentInstances = 56;
   host.Description.Behaviors.Add(throttle);
}
host.Open();
First, the hosting code verifies that no service throttling behavior was provided in the config file. This is done by calling the Find<T>() method of KeyedByTypeCollection<I>, using ServiceThrottlingBehavior as the type parameter. ServiceThrottlingBehavior is defined in the System.ServiceModel.Description namespace:
public class ServiceThrottlingBehavior : IServiceBehavior
{
   public int MaxConcurrentCalls
   {get;set;}
   public int MaxConcurrentSessions
   {get;set;}
   public int MaxConcurrentInstances
   {get;set;}
   //More members
}
If the returned throttle is null, the hosting code creates a new ServiceThrottlingBehavior, sets its values, and adds it to the behaviors in the service description.
Using C# 3.0 extension methods, you can extend ServiceHost (or any subclass of it, such as ServiceHost<T>) to automate the code in Example 4-21, as shown in Example 4-22.
Example 4-22. Extending ServiceHost to handle throttling
public static class ServiceThrottleHelper
{
   public static void SetThrottle(this ServiceHost host,
                                  int maxCalls,int maxSessions,int maxInstances)
   {
      ServiceThrottlingBehavior throttle = new ServiceThrottlingBehavior();
      throttle.MaxConcurrentCalls     = maxCalls;
      throttle.MaxConcurrentSessions  = maxSessions;
      throttle.MaxConcurrentInstances = maxInstances;

      host.SetThrottle(throttle);
   }
   public static void SetThrottle(this ServiceHost host,
                                  ServiceThrottlingBehavior serviceThrottle,
                                  bool overrideConfig)
   {
      if(host.State == CommunicationState.Opened)
      {
         throw new InvalidOperationException("Host is already opened");
      }
      ServiceThrottlingBehavior throttle =
                  host.Description.Behaviors.Find<ServiceThrottlingBehavior>();

      if(throttle == null)
      {
         host.Description.Behaviors.Add(serviceThrottle);
         return;
      }
      if(overrideConfig == false)
      {
         return;
      }
      host.Description.Behaviors.Remove(throttle);
      host.Description.Behaviors.Add(serviceThrottle);
   }
   public static void SetThrottle(this ServiceHost host,
                                  ServiceThrottlingBehavior serviceThrottle)
   {
      host.SetThrottle(serviceThrottle,false);
   }
}
ServiceThrottleHelper offers the SetThrottle() method, which accepts the throttle to use and a Boolean flag indicating whether or not to override the configured values, if present. The default value (using an overloaded version of SetThrottle()) is false. SetThrottle() verifies that the host hasn't been opened yet, using the State property of the CommunicationObject base class. If it is required to override the configured throttle, SetThrottle() removes it from the description. The rest of Example 4-22 is similar to Example 4-21. Here is how to use ServiceHost<T> to set a throttle programmatically:
ServiceHost<MyService> host = new ServiceHost<MyService>();
host.SetThrottle(12,34,56);
host.Open();
Tip
The InProcFactory<T> class presented in Chapter 1 was similarly extended to streamline throttling.
Service developers can read the throttle values at runtime, for diagnostic and analytical purposes. For a service instance to access its throttle properties from its dispatcher at runtime, it must first obtain a reference to the host from the operation context.
The host base class ServiceHostBase offers the read-only ChannelDispatchers property:
public abstract class ServiceHostBase : CommunicationObject,...
{
   public ChannelDispatcherCollection ChannelDispatchers
   {get;}
   //More members
}
ChannelDispatchers is a strongly typed collection of ChannelDispatcherBase objects:
public class ChannelDispatcherCollection : SynchronizedCollection<ChannelDispatcherBase>
{...}
Each item in the collection is of the type ChannelDispatcher. ChannelDispatcher offers the ServiceThrottle property:
public class ChannelDispatcher : ChannelDispatcherBase
{
   public ServiceThrottle ServiceThrottle
   {get;set;}
   //More members
}
public sealed class ServiceThrottle
{
   public int MaxConcurrentCalls
   {get;set;}
   public int MaxConcurrentSessions
   {get;set;}
   public int MaxConcurrentInstances
   {get;set;}
}
ServiceThrottle contains the configured throttle values:
class MyService : ...
{
   public void MyMethod()  //Contract operation
   {
      ChannelDispatcher dispatcher = OperationContext.Current.
                              Host.ChannelDispatchers[0] as ChannelDispatcher;
      ServiceThrottle serviceThrottle = dispatcher.ServiceThrottle;

      Trace.WriteLine("Max Calls = "     + serviceThrottle.MaxConcurrentCalls);
      Trace.WriteLine("Max Sessions = "  + serviceThrottle.MaxConcurrentSessions);
      Trace.WriteLine("Max Instances = " + serviceThrottle.MaxConcurrentInstances);
   }
}
Note that the service can only read the throttle values and has no way of affecting them. If the service tries to set the throttle values, it will get an InvalidOperationException.
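For instance, given that statement, an attempt such as the following sketch would be expected to throw at runtime:
class MyService : ...
{
   public void MyMethod()  //Contract operation
   {
      ChannelDispatcher dispatcher = OperationContext.Current.
                              Host.ChannelDispatchers[0] as ChannelDispatcher;
      ServiceThrottle serviceThrottle = dispatcher.ServiceThrottle;

      //Reading the throttle values is allowed, but assigning a new limit
      //at runtime results in an InvalidOperationException
      serviceThrottle.MaxConcurrentCalls = 40;
   }
}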
Again, you can streamline the throttle lookup via ServiceHost<T>. First, add a Throttle property of the type ServiceThrottle:
public class ServiceHost<T> : ServiceHost
{
   public ServiceThrottle Throttle
   {
      get
      {
         if(State == CommunicationState.Created)
         {
            throw new InvalidOperationException("Host is not opened");
         }
         ChannelDispatcher dispatcher = OperationContext.Current.
                              Host.ChannelDispatchers[0] as ChannelDispatcher;
         return dispatcher.ServiceThrottle;
      }
   }
   //More members
}
Then, use ServiceHost<T> to host the service and use the Throttle property to access the configured throttle:
//Hosting code
ServiceHost<MyService> host = new ServiceHost<MyService>();
host.Open();

class MyService : ...
{
   public void MyMethod()
   {
      ServiceHost<MyService> host = OperationContext.Current.
                                        Host as ServiceHost<MyService>;
      ServiceThrottle serviceThrottle = host.Throttle;
      ...
   }
}
Tip
You can only access the Throttle property of ServiceHost<T> after the host is opened, because the dispatcher collection is initialized only after that point.
When you use the TCP and IPC bindings, you can also configure the maximum number of connections for a particular endpoint in the binding itself. Both the NetTcpBinding and the NetNamedPipeBinding offer the MaxConnections property:
public class NetTcpBinding : Binding,...
{
   public int MaxConnections
   {get;set;}
}
public class NetNamedPipeBinding : Binding,...
{
   public int MaxConnections
   {get;set;}
}
On the host side, you can set that property either programmatically or using a config file:
<bindings>
   <netTcpBinding>
      <binding name = "TCPThrottle" maxConnections = "25"/>
   </netTcpBinding>
</bindings>
The maximum number of connections defaults to 10.
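A programmatic counterpart to the config entry above might look like this sketch (the IMyContract contract and the endpoint address are assumptions):
ServiceHost host = new ServiceHost(typeof(MyService));

NetTcpBinding binding = new NetTcpBinding();
binding.MaxConnections = 25;

host.AddServiceEndpoint(typeof(IMyContract),binding,
                        "net.tcp://localhost:8000/MyService");
host.Open();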
When both the binding-level throttle and the service-behavior throttle set a maximum number of connections, WCF chooses the lesser of the two.