Welcome to Vigges Developer Community - Open, Learning, Share

Bulk Hive table creation in Google Dataproc

I am very new to Google Cloud Platform, and I am doing a POC to move a Hive application (tables and jobs) to Google Dataproc. The data has already been moved to Google Cloud Storage.

Is there a built-in way to create all the Hive tables in Dataproc in bulk, instead of creating them one by one at the Hive prompt?



1 Answer


Dataproc supports the Hive job type, so you can submit the statements with the gcloud command:

gcloud dataproc jobs submit hive --cluster=CLUSTER \
   -e 'create table t1 (id int, name string); create table t2 ...;'

or, with the statements saved in a script file:

gcloud dataproc jobs submit hive --cluster=CLUSTER -f create_tables.hql

You can also SSH into the master node, then use beeline to execute the script:

beeline -u jdbc:hive2://localhost:10000 -f create_tables.hql
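
Both the -f form and beeline take a single script, so if the table definitions are kept as one file per table, they need to be merged first. A minimal sketch of that step, assuming a hypothetical directory with one .hql file per table (the directory layout and the build_script name are illustrative, not part of Dataproc or Hive):

```shell
#!/bin/sh
# Sketch: merge per-table DDL files into one script for a single Hive job.
# Assumption: the input directory holds one .hql file per table
# (hypothetical layout). Keep the output file outside that directory so
# the glob below does not pick it up.

build_script() {
    ddl_dir="$1"
    out_file="$2"

    # Start from an empty output file.
    : > "$out_file" || return 1

    # The shell expands the glob in sorted order, so files are merged
    # alphabetically by name.
    for f in "$ddl_dir"/*.hql; do
        # Skip the unexpanded pattern when no .hql files exist.
        [ -e "$f" ] || continue
        cat "$f" >> "$out_file" || return 1
        # Add a newline so the next file starts on its own line.
        printf '\n' >> "$out_file"
    done
}

# Run only when a directory is passed on the command line, e.g.:
#   sh build_script.sh ddl create_tables.hql
if [ "$#" -ge 1 ]; then
    build_script "$1" "${2:-create_tables.hql}"
fi
```

The resulting create_tables.hql can then be passed to either command above. Each statement in the source files still has to end with a semicolon, since the sketch only concatenates the files and does not check the HiveQL.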
