CHAPTER 6 Splayed and Partitioned Tables

6.1 INTRODUCTION

Neither the kdb+ instance nor the q code lives in isolation. In a kdb+tick setup, the data may be captured straight into a kdb+ instance. In other cases, we may obtain the data as flat files from an external source and choose to use q to process them. The reason for this choice is clear: the database is an excellent environment for dealing with large volumes of data, which are more often than not high-frequency, and q is a powerful tool for analysing this type of data.

kdb+'s efficiency extends to dealing with files: while we could use, for example, Python or R to preprocess the flat files before exporting the data to a kdb+ instance, we would take a performance hit compared to kdb+'s native methods for reading files. Python and R do offer richer toolkits for dealing with file system objects, but simplicity and speed speak in favour of kdb+.

However, there are situations when we may still choose an external tool to preprocess the data. We may choose to apply Python's regular expressions and library functions. Alternatively, we may need to unzip or otherwise decompress the files before reading them from kdb+, or we may need to call some shell commands. Luckily, we can do all this without leaving the kdb+ process: kdb+ offers us tools for calling shell commands and external programs from q scripts.
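As a rough illustration of calling an external program from q, the sketch below compresses a small CSV file and then reads it back by piping the output of `gunzip` into q's native CSV parser. The file name `trades.csv`, its columns, and the column types are hypothetical and chosen only for this example; a Unix-like shell with `gzip` and `gunzip` on the path is assumed.

```q
/ Write a tiny CSV file and compress it, only so that the sketch is self-contained.
/ The file name and columns are hypothetical.
`:trades.csv 0: ("sym,price,size";"AAPL,101.5,200";"MSFT,55.25,100")
system "gzip -f trades.csv"

/ system runs a shell command and returns its standard output as a list of strings,
/ one string per line. Here the decompressed lines are passed straight to 0:,
/ which parses them as comma-delimited text with a header row.
/ "SFJ" = symbol, float, long (assumed types for the three columns).
trades:("SFJ";enlist ",") 0: system "gunzip -c trades.csv.gz"
```

Passing the decompressed lines directly to `0:` avoids writing a temporary uncompressed file; whether that is preferable depends on the size of the data, since the whole output is held in memory.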

In this chapter we shall discuss kdb+'s concepts of splayed and partitioned tables and also show how to ...


Machine Learning and Big Data with kdb+/q by Jan Novotny, Paul A. Bilokon, Aris Galiotos, Frederic Deleze
