Now let’s look at one way to make our subscribers faster. A common use case for pub-sub is distributing large data streams, like market data coming from stock exchanges. A typical setup would have a publisher connected to a stock exchange, taking price quotes and sending them out to a number of subscribers. If there were only a handful of subscribers, we could use TCP. With a larger number of subscribers, we’d probably use reliable multicast, i.e., PGM.
Let’s imagine our feed has an average of 100,000 100-byte messages a second. That’s a typical rate after filtering out market data that we don’t need to send on to subscribers. Now we decide to record a day’s data (maybe 250 GB in 8 hours), and then replay it to a simulation network (i.e., a small group of subscribers). While 100K messages a second is easy for a ØMQ application, we want to replay it much faster.
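A quick back-of-envelope check of those numbers (raw payload only, ignoring framing and protocol overhead) shows why the recording is in the 250 GB ballpark:

```python
# Illustrative arithmetic only: raw payload volume of the feed described above.
MSGS_PER_SEC = 100_000   # average message rate
MSG_BYTES = 100          # average message size
HOURS = 8                # length of the trading day

bytes_per_sec = MSGS_PER_SEC * MSG_BYTES          # 10 MB/s of payload
total_bytes = bytes_per_sec * 3600 * HOURS        # one full day's worth
total_gb = total_bytes / 1e9

print(f"{bytes_per_sec / 1e6:.0f} MB/s, {total_gb:.0f} GB over {HOURS} hours")
```

That works out to roughly 288 GB of raw payload, which is the same order of magnitude as the ~250 GB figure above once you allow for the averages being rough.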
So we set up our architecture with a bunch of boxes—one for the publisher and one for each subscriber. These are well-specified boxes—8 cores, 12 for the publisher.
And as we pump data into our subscribers, we notice two things:
When we do even the slightest amount of work with a message, it slows down our subscribers to the point where they can’t catch up with the publisher again.
We’re hitting a ceiling, at both the publisher and the subscribers, of around 6M messages a second, even after careful optimization and TCP tuning.
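To see why even slight per-message work hurts so much, it helps to translate that ceiling into a time budget per message (a sketch, using the ~6M figure from above):

```python
# Illustrative arithmetic only: per-message time budget at the observed ceiling.
CEILING_MSGS_PER_SEC = 6_000_000

ns_per_msg = 1e9 / CEILING_MSGS_PER_SEC  # nanoseconds available per message
print(f"{ns_per_msg:.0f} ns per message")
```

At around 167 ns per message on a single thread, there is almost no headroom: a few cache misses or a small amount of parsing per message is enough to fall behind the publisher.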
The first thing we have to do is break our subscriber into a multithreaded ...