Hive has an optional component known as HiveServer or HiveThrift that allows access to Hive over a single port. Thrift is a software framework for scalable cross-language services development. See http://thrift.apache.org/ for more details. Thrift allows clients using languages including Java, C++, Ruby, and many others, to programmatically access Hive remotely.
The CLI is the most common way to access Hive. However, the design of the CLI can make it difficult to use programmatically. The CLI is a fat client; it requires a local copy of all the Hive components and configuration as well as a copy of a Hadoop client and its configuration. Additionally, it works as an HDFS client, a MapReduce client, and a JDBC client (to access the metastore). Even with the proper client installation, having all of the correct network access can be difficult, especially across subnets or datacenters.
To get started with HiveServer, start it in the background using the service knob of bin/hive:
$ bin/hive --service hiveserver &
Starting Hive Thrift Server
A quick way to confirm that HiveServer is running is to use the netstat command to check whether port 10000 is open and listening for connections:
$ netstat -nl | grep 10000
tcp 0 0 :::10000 :::* LISTEN
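The same check can also be scripted, which is handy when a job should wait for HiveServer before connecting. The following is a minimal sketch, not part of Hive itself: the helper name is_port_listening and the host "localhost" are assumptions made for illustration, and only port 10000 comes from the text. It simply attempts a TCP connection, so it confirms that something is listening on the port, not that the listener is HiveServer.

```python
import socket


def is_port_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened.

    Hypothetical helper for illustration; it only proves that some
    process accepts connections on the port.
    """
    try:
        # create_connection raises OSError on refusal, timeout, or DNS failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Port 10000 is the one checked with netstat; "localhost" is an assumption.
    status = "listening" if is_port_listening("localhost", 10000) else "not reachable"
    print(f"Port 10000 is {status}")
```

Unlike netstat, which must run on the server itself, this check can be run from a remote client machine, so it also exercises the network path between the client and the server.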
(Some whitespace was removed from the netstat output.)
As mentioned, HiveServer uses Thrift. Thrift provides an interface language. With the interface, the Thrift compiler generates code that creates network ...