Hadoop for Finance Essentials by Rajiv Tiwari

Chapter 5. Getting Started

Traditional data platforms follow a standard paradigm: they take data feeds from multiple sources, load them into a staging area, transform them, and load the results into the final data warehouse used by business intelligence tools.

In this chapter, I will explain how to develop a big data platform on Hadoop using a similar paradigm.

I will cover the full data life cycle of a project on a risk and regulatory big data platform:

  • Data collection: ingesting data from multiple sources, scheduled using Oozie or Informatica
  • Data transformation: transforming data using Hive, Pig, and Java MapReduce
  • Data analysis: integrating BI tools with Hadoop
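
To make the three stages concrete before the tooling is introduced, here is a minimal sketch in plain Python rather than Hadoop. It is only an illustration of the collect, transform, and analyze flow: the function names, the feed names, the record fields (`counterparty`, `exposure`), and the limit value are all hypothetical and do not come from the book's project.

```python
from collections import defaultdict


def collect(feeds):
    """Data collection: merge records from several source feeds into one staging list."""
    staging = []
    for source_name, records in feeds.items():
        for record in records:
            # Tag each record with the feed it came from.
            staging.append({**record, "source": source_name})
    return staging


def transform(staging):
    """Data transformation: total the exposure per counterparty."""
    totals = defaultdict(float)
    for row in staging:
        totals[row["counterparty"]] += row["exposure"]
    return dict(totals)


def analyze(results, limit):
    """Data analysis: list counterparties whose total exposure exceeds the limit."""
    return sorted(name for name, total in results.items() if total > limit)


# Hypothetical feeds standing in for upstream source systems.
feeds = {
    "trading_system": [
        {"counterparty": "A", "exposure": 100.0},
        {"counterparty": "B", "exposure": 40.0},
    ],
    "loan_system": [
        {"counterparty": "A", "exposure": 25.0},
    ],
}

staging = collect(feeds)
results = transform(staging)
breaches = analyze(results, limit=50.0)
```

On a Hadoop platform each of these functions would be replaced by the corresponding tool: a scheduled ingestion job for `collect`, a Hive, Pig, or MapReduce job for `transform`, and a BI tool query for `analyze`.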

This chapter will again be a bit more technical with architecture, data flow diagrams, ...
