Hadoop Data Warehousing with Hive at StrataConf

Date: This event took place live on February 26, 2013

Presented by: Dean Wampler

Room 224, Santa Clara Convention Center

In this hands-on tutorial, you'll learn how to use Hive for Hadoop-based data warehousing. You'll also learn some tricks of the trade and how to handle known issues.

Writing Hive Queries

We'll spend most of the tutorial using a series of hands-on exercises with actual Hive queries, so you can learn by doing. We'll go over all the main features of Hive's query language, HiveQL, and how Hive works with data in Hadoop.

Advanced Techniques

Hive is very flexible about the formats of data files, the "schema" of records and so forth. We'll discuss options for customizing these and other aspects of your Hive and data cluster setup. We'll briefly examine how you can write Java user-defined functions (UDFs) and other plugins that extend Hive for data formats that aren't supported natively.

Hive in the Hadoop Ecosystem

We'll learn Hive's place in the Hadoop ecosystem, such as how it compares to other available tools. We'll discuss installation and configuration issues that ensure the best performance and ease of use in a real production cluster. In particular, we'll discuss how to create Hive's separate "metadata" store in a traditional relational database, such as MySQL. We'll offer tips on data formats and layouts that improve performance in various scenarios.

More information about this event is available at: http://strataconf.com/strata2013/public/schedule/detail/26899
