CHAPTER 6 Splayed and Partitioned Tables
6.1 INTRODUCTION
Neither the kdb+ instance nor the q code lives in isolation. In a kdb+tick setup, the data may be captured straight into a kdb+ instance. In other cases, we may obtain the data as flat files from an external source and choose to use q to process them. It is clear why we may make this choice: the database is an excellent environment for dealing with large volumes of, more often than not, high-frequency data, and q is a powerful tool for analysing this type of data.
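As a minimal sketch of processing a flat file in q, consider the following. The file name trades.csv, its columns, and the sample values are hypothetical and serve only as an illustration; the sketch first writes the file so that it is self-contained.

```q
/ Write a small, hypothetical CSV file (made-up values) so that the sketch is self-contained
`:trades.csv 0: ("date,sym,price,size";"2020.01.02,AAPL,300.35,100";"2020.01.02,MSFT,160.62,200";"2020.01.03,AAPL,297.43,150")

/ Read the file back into a table:
/ "DSFJ" gives the column types (date, symbol, float, long);
/ enlist "," means comma-delimited, with column names taken from the first row
trades:("DSFJ";enlist ",") 0: `:trades.csv

/ A simple aggregation over the loaded table
select avgPrice:avg price, volume:sum size by sym from trades
```

The type string must contain one character per column in the file; a blank in place of a type character skips the corresponding column.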
kdb+'s efficiency extends to dealing with files: while we could use, for example, Python or R to preprocess the flat files before exporting the data to a kdb+ instance, we would take a performance hit compared to kdb+'s native methods for reading files. It is true that Python and R offer richer toolkits for dealing with file system objects, but simplicity and speed speak in favour of kdb+.
However, there are situations when we may still choose an external tool to preprocess the data. We may choose to apply Python's regular expressions and library functions. Alternatively, we may need to unzip or otherwise decompress the files before reading them from kdb+. We may need to call some shell commands. Luckily, we can do all this without leaving the kdb+ process: kdb+ offers us tools for calling shell commands and external programs from q scripts.
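One such tool is q's system function, which passes a command to the operating system shell and returns its standard output as a list of strings. The sketch below assumes a Unix-like environment with gzip and gunzip on the path; the file name sample.csv and its contents are hypothetical.

```q
/ Create a small, hypothetical CSV file to work with
`:sample.csv 0: ("sym,price";"AAPL,300.35";"MSFT,160.62")

/ Compress it with an external program (assumes gzip is installed);
/ this replaces sample.csv with sample.csv.gz
system "gzip -f sample.csv"

/ The output of a shell command comes back as a list of strings
files:system "ls"

/ Decompress the file again before reading it from q
system "gunzip -f sample.csv.gz"

/ Load the restored file: symbol and float columns, header in the first row
t:("SF";enlist ",") 0: `:sample.csv
```

If the external command exits with an error, system signals an error in q, so a script may want to wrap such calls in protected evaluation.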
In this chapter we shall discuss kdb+'s concepts of splayed and partitioned tables and also show how to ...