In-database analytics is a broad term that describes the processing of data within its repository. In many of the earlier R examples, data was extracted from a data source and loaded into R. One advantage of in-database analytics is that the need for movement of the data into an analytic tool is eliminated. Also, by performing the analysis within the database, it is possible to obtain almost real-time results. Applications of in-database analytics include credit card transaction fraud detection, product recommendations, and web advertisement selection tailored for a particular user.
A popular open-source database is PostgreSQL. This name references an important in-database analytic language known as Structured Query Language (SQL). This chapter examines basic as well as advanced topics in SQL. The provided examples of SQL code were tested against Greenplum database 220.127.116.11, which is based on PostgreSQL 8.2.15. However, the presented concepts are applicable to other SQL environments.
A relational database, part of a Relational Database Management System (RDBMS), organizes data in tables with established relationships between the tables. Figure 11-1 shows the relationships between five tables used to store details about orders placed at an e-commerce retailer.
The table orders ...