
Hive Configuration Over Hadoop Platform

admin 05:21:am Feb 26 2018

The Apache Hive™ data warehouse software facilitates querying and managing large data sets that reside in distributed storage. Hive provides a mechanism to project structure onto this data and to query it using a SQL-like language called HiveQL. HiveQL also lets traditional map/reduce programmers plug in their custom mappers and reducers when it is inconvenient or inefficient to express that logic in HiveQL itself.

Today we will look at how to install and configure Hive, and explore the advantages of SQL-like queries on the Hadoop platform.
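To make "projecting structure onto data" a little more concrete, here is a minimal sketch of what a HiveQL script might look like once Hive is installed. The table name page_views, its columns, and the HDFS path /data/page_views are hypothetical and chosen only for illustration. The shell wrapper saves the query to a temporary file and runs it with hive -f only if the hive command is available.

```shell
# Write a small HiveQL script to a temporary file.
# page_views, its columns, and /data/page_views are hypothetical names.
query_file="$(mktemp)"
cat > "$query_file" <<'EOF'
-- Describe tab-separated files already stored in HDFS as a table.
CREATE EXTERNAL TABLE IF NOT EXISTS page_views (
  user_id STRING,
  url     STRING,
  view_ts STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/data/page_views';

-- Query the files with SQL-like syntax instead of a hand-written job.
SELECT url, COUNT(*) AS views
FROM page_views
GROUP BY url
ORDER BY views DESC
LIMIT 10;
EOF

# Run the script only if the hive CLI is installed and on the PATH.
if command -v hive >/dev/null 2>&1; then
  hive -f "$query_file"
else
  echo "hive not found on PATH; query saved to $query_file"
fi
```

The CREATE EXTERNAL TABLE statement only describes files that are already in HDFS; it does not copy or convert them, which is what lets the same data be queried with SQL-like syntax.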

Pre-requisites:

  • Before installing Hive, make sure Hadoop is running on your cluster. If it is not, set that up first!

  • Download Hive from the Hive downloads page.
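A quick way to check the first prerequisite is sketched below. It assumes the hadoop command is on your PATH, which the article does not require (it calls bin/hadoop from the Hadoop directory instead), and the have_cmd helper is a hypothetical name introduced just for this example.

```shell
# Report whether a command is available on the PATH.
# have_cmd is a hypothetical helper used only in this sketch.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd hadoop; then
  echo "hadoop: found at $(command -v hadoop)"
  # A simple listing of the HDFS root shows whether the cluster responds.
  if hadoop fs -ls / >/dev/null 2>&1; then
    echo "HDFS is reachable"
  else
    echo "HDFS is not reachable; check that Hadoop is running"
  fi
else
  echo "hadoop: NOT found on PATH"
fi
```

If the listing fails, fix the Hadoop setup before continuing, because the later steps create directories in HDFS.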

Steps to configure Hive:

  • First, extract the hive-<version>.gz file.

  • Now, go to Hive directory:

    cd path/to/hive
  • Now run the following commands one by one:

    export HIVE_HOME="$(pwd)"
    export PATH=$HIVE_HOME/bin:$PATH
    export HADOOP_HOME=/path/to/hadoop/
  • Now, create the /tmp and /user/hive/warehouse directories in HDFS.

    To do that, go to the Hadoop directory:

    cd path/to/hadoop/
  • And, run the following commands:

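    # -p creates missing parent directories and does not fail if the
    # directory already exists (Hadoop 2.x and later)
    bin/hadoop fs -mkdir -p /tmp
    bin/hadoop fs -mkdir -p /user/hive/warehouse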
    bin/hadoop fs -chmod g+w /tmp
    bin/hadoop fs -chmod g+w /user/hive/warehouse
  • Now, set the Hive home:

    export HIVE_HOME=/path/to/hive

    Congratulations, you are done with the configuration!

  • To start Hive, go to the Hive home:

    cd /path/to/hive
  • And run the command:

    bin/hive
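The steps above can be collected into one script. This is a minimal sketch rather than a definitive installer: /path/to/hive and /path/to/hadoop are the article's placeholders and must be replaced with real locations, the run helper and the DRY_RUN variable are hypothetical additions for this example, and -mkdir -p assumes Hadoop 2.x or later. By default the script only prints the commands it would run.

```shell
# Placeholder install locations from the article; replace with real paths.
HIVE_HOME="${HIVE_HOME:-/path/to/hive}"
HADOOP_HOME="${HADOOP_HOME:-/path/to/hadoop}"
export HIVE_HOME HADOOP_HOME
export PATH="$HIVE_HOME/bin:$PATH"

# DRY_RUN=1 (the default here) only prints each command;
# set DRY_RUN=0 to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

# run is a hypothetical helper: print the command in dry-run mode,
# otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Create the HDFS directories Hive uses and make them group-writable.
for dir in /tmp /user/hive/warehouse; do
  run "$HADOOP_HOME/bin/hadoop" fs -mkdir -p "$dir"
  run "$HADOOP_HOME/bin/hadoop" fs -chmod g+w "$dir"
done

# Start the Hive command-line shell.
run "$HIVE_HOME/bin/hive"
```

Review the printed commands first, then run the script again with DRY_RUN=0 once the paths are correct.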

Hive will now start. If this article helped, please share your feedback below.
