System and data architecting

This section covers strategies for improving overall system performance and data indexing performance, and for making the most of available storage space.

Hot-Warm architecture

For time-series data, including Twitter and other social media data as well as data from Logstash, Elastic recommends setting up what they have dubbed a Hot-Warm architecture. This setup divides the cluster's nodes into three groups.

Master nodes

Ideally, dedicate three nodes as master nodes that do not store data or fulfill queries. These machines don't need to be very powerful; they just perform cluster management operations.
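As a rough illustration of what a dedicated master node looks like in configuration terms, the Python sketch below builds the relevant elasticsearch.yml entries. It assumes the boolean role settings (node.master, node.data, node.ingest) and the discovery.zen.minimum_master_nodes quorum setting used by Elasticsearch releases before 7.0; newer releases configure roles differently, so check the documentation for the version in use. The helper name dedicated_master_settings is hypothetical and exists only for this example.

```python
def dedicated_master_settings(master_eligible_count=3):
    """Build elasticsearch.yml-style settings for one dedicated master node.

    Assumes pre-7.0 setting names; the function name is illustrative only.
    """
    # A majority of master-eligible nodes must agree, which guards against
    # split-brain: floor(n / 2) + 1.
    quorum = master_eligible_count // 2 + 1
    return {
        "node.master": True,    # eligible to be elected master
        "node.data": False,     # holds no shards
        "node.ingest": False,   # runs no ingest pipelines
        "discovery.zen.minimum_master_nodes": quorum,
    }


settings = dedicated_master_settings()
for key, value in settings.items():
    # Render in the "key: value" form used by elasticsearch.yml.
    print(f"{key}: {str(value).lower()}")
```

With three master-eligible nodes the quorum works out to two, so the cluster can lose one master node and still elect a master.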

Hot nodes

Hot nodes hold the most recent data indices. All data writes are directed at these machines, and they are likely the most frequently queried nodes. ...
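One way to keep new indices on the hot nodes is shard allocation filtering: each hot node is tagged with a custom attribute, and new indices are created with a setting that requires that attribute. The sketch below shows the two pieces side by side. The attribute name box_type is an arbitrary label chosen for illustration, the helper names are hypothetical, and the node.attr. prefix assumes Elasticsearch 5.0 or later.

```python
import json


def hot_node_settings(attr_name="box_type"):
    """elasticsearch.yml entry that tags a node as 'hot' (attribute name is an assumption)."""
    return {f"node.attr.{attr_name}": "hot"}


def hot_index_settings(attr_name="box_type"):
    """Index settings body that requires shards to live on nodes tagged 'hot'."""
    return {
        "settings": {
            f"index.routing.allocation.require.{attr_name}": "hot",
        }
    }


# Node side: goes into elasticsearch.yml on every hot node.
print(json.dumps(hot_node_settings(), indent=2))

# Index side: could be sent when creating an index or placed in an index template.
print(json.dumps(hot_index_settings(), indent=2))
```

Placing the index-side setting in an index template would mean every newly created time-series index starts on the hot nodes without any per-index work.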

