Data Analytics with Hadoop

by Benjamin Bengfort, Jenny Kim
June 2016
Intermediate to advanced
286 pages
8h 9m
English
O'Reilly Media, Inc.

Glossary

accessible

In the context of a computing cluster, a node is accessible if it is reachable through the network. In other contexts, a tool or library is accessible if it is easily accessed and understood by particular groups.

accumulator

A shared variable to which only associative operations, such as addition, may be applied. The term is used particularly in Spark; the MapReduce equivalent is called a counter. Because operations that are both associative and commutative are order independent, accumulators stay consistent in a distributed environment no matter the order in which updates arrive.
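
The order independence described above can be illustrated without a cluster. The following is a minimal sketch in plain Python, not Spark's accumulator implementation: the helper `accumulate` and the four-way split into partitions are assumptions made for illustration. It folds each partition locally, merges the partial results, and then repeats the computation with the partitions and their contents shuffled.

```python
import random
from functools import reduce
from operator import add

def accumulate(partitions, op, zero):
    """Fold each partition locally, then merge the partial results.

    `accumulate` is a hypothetical helper, not part of Spark or MapReduce.
    """
    partials = [reduce(op, part, zero) for part in partitions]
    return reduce(op, partials, zero)

values = list(range(1, 101))

# Split the values into four partitions, as a cluster might.
partitions = [values[i::4] for i in range(4)]
total_in_order = accumulate(partitions, add, 0)

# Shuffle the order within each partition and the order of the partitions.
shuffled = [random.sample(part, len(part)) for part in partitions]
random.shuffle(shuffled)
total_shuffled = accumulate(shuffled, add, 0)

print(total_in_order, total_shuffled)
```

Both totals agree because addition is both associative and commutative. An operation such as subtraction would not give a stable result under the same shuffling.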

actions and transformations

See transformations and actions.

agent

A service, usually a background process, that runs routinely on behalf of a user and performs tasks independently. Flume agents are the building blocks of data flows: they ingest and wrangle data, moving it from a source to a channel and eventually to a sink.
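
The source, channel, and sink stages can be sketched as a toy pipeline. This is an illustration in plain Python only; real Flume agents run on the JVM and are defined through configuration rather than code like this. The names `source`, `sink`, and `run_agent`, and the dictionary event format, are all assumptions.

```python
from collections import deque

def source(raw_lines):
    """Hypothetical source: turn non-empty raw lines into events."""
    for line in raw_lines:
        line = line.strip()
        if line:
            yield {"body": line}

def sink(channel, store):
    """Hypothetical sink: drain the channel into a destination list."""
    while channel:
        store.append(channel.popleft()["body"])

def run_agent(raw_lines):
    """Move events from the source, through a channel, to the sink."""
    channel = deque()  # buffers events between the source and the sink
    store = []         # stands in for the final destination
    for event in source(raw_lines):
        channel.append(event)
    sink(channel, store)
    return store

delivered = run_agent(["GET /index.html\n", "\n", "POST /login\n"])
print(delivered)
```

The channel decouples the two ends, so the source and the sink do not need to run at the same rate.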

anonymous functions

Functions that are not bound to an identifier (variable name). These functions are typically constructed at runtime and passed as arguments to higher-order functions. They can also be used to easily create closures. Anonymous functions are passed to Spark operations to define their behavior. See also closure and lambda function.
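
As a brief illustration, Python writes anonymous functions with the `lambda` keyword. In the sketch below the built-in `map` and `filter` stand in for Spark operations so that it runs without a cluster; `make_multiplier` is a hypothetical helper that shows a lambda forming a closure.

```python
def make_multiplier(factor):
    # The returned lambda closes over `factor`, forming a closure.
    return lambda x: x * factor

numbers = [1, 2, 3, 4, 5]

# Anonymous functions passed directly to higher-order functions.
squares = list(map(lambda x: x * x, numbers))
evens = list(filter(lambda x: x % 2 == 0, numbers))

# A closure created at runtime and then passed as an argument.
triple = make_multiplier(3)
tripled = list(map(triple, numbers))

print(squares, evens, tripled)
```

In PySpark the same lambdas would be passed to RDD methods such as `map` and `filter`.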

application programming interface (API)

A collection of routines, protocols, or interfaces that specify how software components should interact. The MapReduce API specifies interfaces for constructing Mapper, Reducer, and Job subclasses that define MapReduce behavior. Similarly, ...
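
To make the idea of an API as a set of interfaces concrete, here is a minimal single-process sketch in Python. It is not Hadoop's Java MapReduce API: the base classes `Mapper` and `Reducer`, their method signatures, and the `run_job` function are all hypothetical. They only illustrate how subclasses can supply behavior behind a fixed interface.

```python
from abc import ABC, abstractmethod
from collections import defaultdict

class Mapper(ABC):
    """Hypothetical interface: subclasses define the map behavior."""
    @abstractmethod
    def map(self, key, value):
        """Yield zero or more (key, value) pairs for one input record."""

class Reducer(ABC):
    """Hypothetical interface: subclasses define the reduce behavior."""
    @abstractmethod
    def reduce(self, key, values):
        """Yield zero or more (key, value) pairs for one group of values."""

class WordCountMapper(Mapper):
    def map(self, key, value):
        for word in value.split():
            yield word.lower(), 1

class SumReducer(Reducer):
    def reduce(self, key, values):
        yield key, sum(values)

def run_job(mapper, reducer, records):
    """Run map, group by key, and reduce in a single process."""
    groups = defaultdict(list)
    for key, value in records:
        for out_key, out_value in mapper.map(key, value):
            groups[out_key].append(out_value)
    results = {}
    for key in sorted(groups):
        for out_key, out_value in reducer.reduce(key, groups[key]):
            results[out_key] = out_value
    return results

lines = [(0, "the quick brown fox"), (1, "the lazy dog")]
counts = run_job(WordCountMapper(), SumReducer(), lines)
print(counts)
```

Because `run_job` depends only on the two interfaces, any other `Mapper` or `Reducer` subclass can be substituted without changing it.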

ISBN: 9781491913734