Chapter 9: Data Clustering
Introduction
Data clustering has long preoccupied researchers seeking to categorize data sets by observable characteristics that can drive investment decisions. One may argue that traditional portfolio analysis is itself a form of clustering, since it answers the question of how to compose baskets of securities for optimal overall portfolio performance. Today's Data Science delivers advanced methods for classifying data based on distributional characteristics, geometry, and other factors. Data clustering is one of the prominent areas of today's Data Science and is poised to make a deep impact in finance in the near future.
Data clustering is just beginning to take root in Finance; current applications are few and far between. However, the potential of clustering is enormous, as the following applications to open problems illustrate:
- Pre-hedging in execution: find the most similar instrument.
- Selling in a crisis: again, sell the most similar liquid instrument.
- Loan ratings: given existing ratings for public companies, quickly find the most similar rated companies for target private loans.
- Consumer ratings: based on online behavior, assign consumers to credit buckets with known credit ratings.
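Each of the applications above reduces to a similarity search: given a target, find its closest match among known candidates. A minimal sketch of that idea follows, assuming that similarity is measured by the Pearson correlation of return series. That measure, the function name `most_similar`, and the synthetic data are illustrative assumptions, not the method developed in this chapter.

```python
import numpy as np

def most_similar(target, candidates):
    """Return (name, correlation) of the candidate series that has the
    highest Pearson correlation with the target series.

    Correlation is an assumed similarity measure chosen for illustration.
    """
    best_name, best_corr = None, -np.inf
    for name, series in candidates.items():
        corr = np.corrcoef(target, series)[0, 1]
        if corr > best_corr:
            best_name, best_corr = name, corr
    return best_name, best_corr

# Synthetic daily returns, for illustration only (not market data).
rng = np.random.default_rng(0)
illiquid = rng.normal(0.0, 0.01, 250)
liquid = {
    "A": illiquid + rng.normal(0.0, 0.002, 250),   # tracks the target closely
    "B": rng.normal(0.0, 0.01, 250),               # unrelated to the target
    "C": -illiquid + rng.normal(0.0, 0.002, 250),  # moves opposite to the target
}

name, corr = most_similar(illiquid, liquid)
print(name, round(corr, 3))
```

In this toy setup the search picks instrument "A", the only candidate built to track the target. A practical version would likely need a richer similarity measure than a single correlation number.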
In this chapter, we consider clustering in a novel portfolio management context: creating sound portfolios that include illiquid instruments, alleviating the traditional hurdles such instruments pose and expanding the range of instruments available for investment. We show ...