
Sqoop-Importing Data into Hive

Importing Data into Hive:

Until now we have used the import tool to bring data into HDFS. Similarly, if a Hive metastore is associated with the HDFS cluster, we can import data into Hive, with Sqoop creating the table for us. We use the '--hive-import' argument for this.

If the Hive table already exists, we can use the '--hive-overwrite' option to overwrite it. Internally, after importing the data into HDFS, Sqoop generates a 'CREATE TABLE' statement that defines the columns with Hive types and then loads the data into Hive's warehouse directory.
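As a sketch, a re-import that replaces the existing Hive table could look like this (it reuses the 'employee' table and MySQL connection details from the example below; the password is a placeholder):

```shell
# Re-import the employee table, replacing any existing rows in the Hive table.
# --hive-overwrite truncates the target Hive table before loading.
# Connection details and the **** password are placeholders.
bin/sqoop import \
  --connect jdbc:mysql://localhost:3306/mysql \
  --username root --password **** \
  --table employee \
  --hive-import --hive-overwrite \
  -m 1
```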

Hive will produce errors if the database rows contain Hive's default delimiter characters, such as '\n', '\r', or '\01'. In that case we can use the --hive-drop-import-delims option to drop those characters during the import.
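A hedged sketch of such an import, stripping those delimiter characters from string fields (same assumed connection details as the example below):

```shell
# Drop \n, \r and \01 from string-valued fields so they cannot collide
# with Hive's default line and field delimiters.
# Connection details and the **** password are placeholders.
bin/sqoop import \
  --connect jdbc:mysql://localhost:3306/mysql \
  --username root --password **** \
  --table employee \
  --hive-import \
  --hive-drop-import-delims \
  -m 1
```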

Alternatively, we can use the --hive-delims-replacement option to replace those characters with a user-defined string on import. We can also import into Hive partitions by using the --hive-partition-key and --hive-partition-value arguments.
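The two options can be sketched together as follows. The partition column 'city' and value 'hyderabad' are illustrative assumptions, not columns from the tutorial's table:

```shell
# Replace problem delimiter characters with a space instead of dropping
# them, and load the rows into a specific Hive partition.
# 'city'/'hyderabad' are hypothetical; connection details are placeholders.
bin/sqoop import \
  --connect jdbc:mysql://localhost:3306/mysql \
  --username root --password **** \
  --table employee \
  --hive-import \
  --hive-delims-replacement ' ' \
  --hive-partition-key city \
  --hive-partition-value hyderabad \
  -m 1
```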

Importing the data to Hive:

hdadmin@ubuntu:~/sqoop-1.4.5-cdh5.3.2$ bin/sqoop import --connect jdbc:mysql://localhost:3306/mysql --username root --password **** --table employee --hive-import -m 1



hdadmin@ubuntu:~$ hdfs dfs -cat /user/hive/warehouse/employee/part-m-00000

1#sai

2#kishore

3#chandu

4#gopal

5#nanda



With the help of the '--hive-import' argument, the data is loaded directly from the source database into Hive.
