If Apache Hive is part of your Data Lake, adding HCatalog is a handy way to deal with a diverse technology ecosystem (especially data processing tools) and a wide variety of storage formats.
HCatalog is a table and storage management layer for Hadoop that enables users with different data processing tools, such as Pig and MapReduce, to more easily read and write data on the grid. HCatalog’s table abstraction presents users with a relational view of data in the Hadoop Distributed File System (HDFS), so users need not worry about where their data is stored or in what format: RCFile, text files, SequenceFiles, or ORC files.
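To make the idea of a table abstraction concrete, here is a minimal toy sketch in Python. It does not use HCatalog or any Hadoop API; `ToyCatalog`, `register`, `read_table`, and the CSV and JSON-lines formats are hypothetical stand-ins chosen only for illustration. The point it tries to show is that callers ask for a table by name, while the catalog's metadata decides where the files live and which reader decodes them.

```python
import csv
import json
import tempfile
from pathlib import Path


def read_csv_rows(path):
    """Decode a CSV file with a header row into a list of dicts."""
    with open(path, newline="") as f:
        return [dict(row) for row in csv.DictReader(f)]


def read_jsonl_rows(path):
    """Decode a JSON-lines file (one object per line) into a list of dicts."""
    with open(path) as f:
        return [json.loads(line) for line in f if line.strip()]


# Hypothetical format registry: format name -> reader function.
READERS = {"csv": read_csv_rows, "jsonl": read_jsonl_rows}


class ToyCatalog:
    """Toy stand-in for a metadata layer: table name -> (location, format)."""

    def __init__(self):
        self._tables = {}

    def register(self, name, path, fmt):
        if fmt not in READERS:
            raise ValueError(f"unsupported format: {fmt}")
        self._tables[name] = (Path(path), fmt)

    def read_table(self, name):
        # The caller never sees the path or the storage format.
        path, fmt = self._tables[name]
        return READERS[fmt](path)


def demo():
    """Register two tables stored in different formats and read both by name."""
    with tempfile.TemporaryDirectory() as tmp:
        users_path = Path(tmp) / "users.csv"
        users_path.write_text("id,name\n1,ada\n2,lin\n")

        clicks_path = Path(tmp) / "clicks.jsonl"
        clicks_path.write_text('{"id": 1, "page": "/home"}\n')

        catalog = ToyCatalog()
        catalog.register("users", users_path, "csv")
        catalog.register("clicks", clicks_path, "jsonl")

        # Same call for both tables, regardless of how they are stored.
        return catalog.read_table("users"), catalog.read_table("clicks")


if __name__ == "__main__":
    users, clicks = demo()
    print(users)
    print(clicks)
```

In this sketch, changing a table's storage format means updating its catalog entry rather than the code that reads it, which is the kind of decoupling the table abstraction described above is meant to provide.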
HCatalog supports reading and writing files ...