Hive Performance Tuning in Big Data Hadoop

1: Use Tez

set hive.execution.engine=tez;

2: Use ORCFile

CREATE TABLE A_ORC (customerID int, name string, age int, address string) STORED AS ORC tblproperties ("orc.compress" = "SNAPPY");
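
Creating the ORC table does not move any data into it. As a minimal sketch, assuming a plain-text source table named A (a hypothetical name, not defined above) with the same four columns, existing rows could be copied across like this:

```sql
-- Assumption: table A exists and has the same columns as A_ORC.
-- Hive rewrites the rows in ORC format with Snappy compression on insert.
INSERT OVERWRITE TABLE A_ORC
SELECT customerID, name, age, address
FROM A;
```

Queries would then read from A_ORC instead of A to benefit from the columnar layout.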

3: Use Vectorization

set hive.vectorized.execution.enabled = true;

set hive.vectorized.execution.reduce.enabled = true;
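
Vectorization processes rows in batches rather than one at a time, and it has traditionally depended on the table being stored as ORC. One way to check whether a query is actually vectorized is to inspect its plan. The sketch below assumes a Hive release that supports EXPLAIN VECTORIZATION (added in Hive 2.3) and reuses the A_ORC table from tip 2; the aggregate query itself is only an illustration.

```sql
set hive.vectorized.execution.enabled = true;

-- Assumption: Hive 2.3 or later; on older releases, plain EXPLAIN can be used
-- and the plan checked for "Execution mode: vectorized".
EXPLAIN VECTORIZATION
SELECT age, COUNT(*) AS customers
FROM A_ORC
GROUP BY age;
```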

4: Use Cost-Based Query Optimization

set hive.cbo.enable=true;

set hive.compute.query.using.stats=true;

set hive.stats.fetch.column.stats=true;

set hive.stats.fetch.partition.stats=true;
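
The cost-based optimizer can only make good choices if table and column statistics have been collected. A minimal sketch, reusing the A_ORC table from tip 2 as the example target:

```sql
-- Table-level statistics (row count, file count, size).
ANALYZE TABLE A_ORC COMPUTE STATISTICS;

-- Column-level statistics (distinct values, min/max, nulls) for all columns.
ANALYZE TABLE A_ORC COMPUTE STATISTICS FOR COLUMNS;
```

Statistics may need to be recomputed after large loads so that the optimizer does not plan against stale numbers.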

5: Write good SQL
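
As one illustration of what this can mean in practice, the sketch below finds the most recent click per user in two ways. The table clicks and its columns (ts, userid, sessionID) are hypothetical and not part of the examples above. The first variant scans the table twice and joins the results; the second uses a window function and typically avoids the second scan and the join.

```sql
-- Hypothetical table: clicks(ts BIGINT, userid STRING, sessionID STRING)

-- Variant 1: sub-query plus self-join; clicks is read twice.
SELECT c.userid, c.sessionID, c.ts
FROM clicks c
JOIN (
  SELECT userid, MAX(ts) AS max_ts
  FROM clicks
  GROUP BY userid
) latest
  ON c.userid = latest.userid AND c.ts = latest.max_ts;

-- Variant 2: window function; clicks is read once.
SELECT userid, sessionID, ts
FROM (
  SELECT userid, sessionID, ts,
         RANK() OVER (PARTITION BY userid ORDER BY ts DESC) AS rnk
  FROM clicks
) ranked
WHERE rnk = 1;
```

Both variants return the same rows, including ties on the latest timestamp; which one runs faster on a given cluster is worth confirming with EXPLAIN and a timed run.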



Copyright ©2022 coderraj.com. All Rights Reserved.