Processing different file and compression types in Impala
Impala reads files stored in HDFS, and these files can be of various types. Some are written to HDFS directly from their source; others are the output of MapReduce, Pig, or another application running on Hadoop.
Impala supports only a subset of the file types found on Hadoop; however, it covers the most popular Big Data file formats, which lets it serve a wide range of user requests. If Impala cannot read an input file type, you can combine Hive and Impala by performing the following steps:
- Use the CREATE TABLE statement in the Hive shell to create the table with input data.
- Use the Impala shell with the INVALIDATE METADATA ...
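
The two steps above can be sketched as a pair of command-line invocations. This is a minimal illustration, not a definitive procedure: the table name web_logs, the single STRING column, the SEQUENCEFILE format, and the HDFS path are all hypothetical, chosen only to make the example concrete. It assumes the hive and impala-shell clients are installed, and it only builds and prints the commands rather than running them.

```python
import shlex

# Hypothetical names, used only for illustration.
TABLE = "web_logs"
LOCATION = "/user/example/web_logs"


def hive_create_table_cmd(table, location):
    """Build a `hive -e` invocation that creates an external table over existing files."""
    ddl = (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (line STRING) "
        f"STORED AS SEQUENCEFILE LOCATION '{location}'"
    )
    return ["hive", "-e", ddl]


def impala_invalidate_cmd(table):
    """Build an `impala-shell -q` invocation that makes Impala reload the table's metadata."""
    return ["impala-shell", "-q", f"INVALIDATE METADATA {table}"]


# Step 1 runs in Hive, step 2 in Impala; order matters.
commands = [hive_create_table_cmd(TABLE, LOCATION), impala_invalidate_cmd(TABLE)]

for cmd in commands:
    # shlex.join quotes each argument so the line can be pasted into a shell.
    print(shlex.join(cmd))
```

To run the steps instead of printing them, each list in commands could be passed to subprocess.run(cmd, check=True) on a host where both clients can reach the cluster.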