SAS for R Users

by Ajay Ohri
September 2019
Content level: Beginner to intermediate
208 pages
3h 17m
English
Wiley
Content preview from SAS for R Users

7 Using SQL with SAS and R

7.1 What is SQL?

SQL (Structured Query Language) is a language for querying and modifying data in relational database management systems (RDBMS). SQL is also used outside traditional databases, for example in Apache Hive, Python, and PySpark. In Python, the pandasql package allows you to query pandas DataFrames using SQL syntax. In Spark, the entry point into SQL functionality is the SQLContext class (from Spark 2.0 onward, SparkSession serves as the unified entry point). The Apache Hive™ data warehouse software facilitates reading, writing, and managing large datasets residing in distributed storage using SQL.
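To make "querying and modifying data" concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `employees` table, its columns, and the sample rows are invented for illustration and do not come from the book.

```python
import sqlite3

# In-memory database; table name, columns, and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employees (name, salary) VALUES (?, ?)",
    [("Ana", 52000.0), ("Ben", 48000.0), ("Chloe", 61000.0)],
)

# Modify data: give one employee a raise.
conn.execute(
    "UPDATE employees SET salary = salary + 2000 WHERE name = ?", ("Ben",)
)

# Query data: employees earning at least 50000, highest salary first.
high_earners = conn.execute(
    "SELECT name, salary FROM employees "
    "WHERE salary >= 50000 ORDER BY salary DESC"
).fetchall()
print(high_earners)

conn.close()
```

The same SELECT and UPDATE statements would look essentially the same in other SQL environments; only the surrounding connection code changes.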

7.1.1 Basic Terminology

A database is a collection of information that is organized so that it can be easily accessed, managed and updated.

A relational database is a set of tables from which data can be accessed or reassembled in many different ways without having to reorganize the database tables.
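As a rough illustration of reassembling data in different ways without reorganizing the tables, the sketch below builds two small tables with Python's built-in sqlite3 module and queries them in two different shapes. The `customers` and `orders` tables and their contents are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two hypothetical tables linked by customer_id.
conn.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    name TEXT,
    city TEXT
);
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(customer_id),
    amount REAL
);
INSERT INTO customers VALUES (1, 'Ana', 'Lisbon'), (2, 'Ben', 'Leeds');
INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
""")

# View 1: one row per order, with the customer's name attached.
orders_with_names = conn.execute(
    "SELECT o.order_id, c.name, o.amount "
    "FROM orders o JOIN customers c ON c.customer_id = o.customer_id "
    "ORDER BY o.order_id"
).fetchall()

# View 2: the same tables, reassembled as total spend per city.
totals_by_city = conn.execute(
    "SELECT c.city, SUM(o.amount) "
    "FROM customers c JOIN orders o ON o.customer_id = c.customer_id "
    "GROUP BY c.city ORDER BY c.city"
).fetchall()

print(orders_with_names)
print(totals_by_city)
conn.close()
```

Both results come from the same stored tables; only the query changes, not the table layout.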

7.1.2 CAP Theorem

The CAP theorem states that a distributed database system can provide at most two of the following three guarantees at the same time: consistency, availability, and partition tolerance.

  • Consistency: Every read receives the most recent write or an error.
  • Availability: Every request receives a (non-error) response, without the guarantee that it contains the most recent write.
  • Partition tolerance: The system continues to operate despite an arbitrary number of messages being dropped (or delayed) by the network between nodes.
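The trade-off described in these three definitions can be sketched with a toy two-replica store. This is not a real database or any library's API; the `TinyStore` class, its `mode` flag, and its methods are hypothetical names introduced only to show how a system behaves when it prefers consistency ("CP") versus availability ("AP") during a network partition.

```python
class TinyStore:
    """Toy store with two replicas, A and B. Writes arrive at A.

    mode="CP": during a partition, refuse writes (consistent, not available).
    mode="AP": during a partition, accept writes at A only
               (available, but B may serve stale data).
    """

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a = None              # value held by replica A
        self.b = None              # value held by replica B
        self.partitioned = False   # True when A and B cannot communicate

    def write(self, value):
        if not self.partitioned:
            # Healthy network: replicate to both nodes.
            self.a = value
            self.b = value
        elif self.mode == "CP":
            # Cannot replicate, so return an error rather than diverge.
            raise RuntimeError("write rejected: replicas are partitioned")
        else:
            # AP: stay available, accept the write on A only.
            self.a = value

    def read_from_a(self):
        return self.a

    def read_from_b(self):
        return self.b


cp = TinyStore("CP")
cp.write(1)
cp.partitioned = True
try:
    cp.write(2)
    cp_write_failed = False
except RuntimeError:
    cp_write_failed = True

ap = TinyStore("AP")
ap.write(1)
ap.partitioned = True
ap.write(2)

print(cp_write_failed, cp.read_from_b())      # CP: error, B still current
print(ap.read_from_a(), ap.read_from_b())     # AP: B is stale
```

In the CP run the write fails but every read still reflects the latest accepted write; in the AP run every request succeeds but replica B returns an older value.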

ACID (Atomicity, Consistency, Isolation, Durability) is a set of properties of database transactions intended to guarantee validity even in the event ...
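As one hedged illustration of the atomicity property, the sketch below uses Python's built-in sqlite3 module to attempt a transfer that fails halfway and is rolled back as a unit. The `accounts` table, the `transfer` helper, and the balances are assumptions invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK constraint forbids negative balances.
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, balance REAL CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)", [("alice", 100.0), ("bob", 20.0)]
)
conn.commit()


def transfer(conn, src, dst, amount):
    """Hypothetical helper: move `amount` from src to dst in one transaction."""
    # `with conn` commits if the block succeeds and rolls back on an exception.
    with conn:
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (amount, dst),
        )
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?",
            (amount, src),
        )


try:
    # The debit would make alice's balance negative, so the second UPDATE fails.
    transfer(conn, "alice", "bob", 500.0)
except sqlite3.IntegrityError:
    pass

# The credit to bob was rolled back together with the failed debit.
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(balances)
conn.close()
```

Either both updates take effect or neither does, which is the behavior atomicity is meant to guarantee.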


ISBN: 9781119256410