Home >> Big Data Hadoop >> Hive Joins in Big Data Hadoop

Hive Joins in Big Data Hadoop

Only equality joins, outer joins, and left semi joins are supported in Hive. Also, more than two tables can be joined in Hive.

Hive does not support join conditions that are not equality conditions as it is very difficult to express such conditions as a map/reduce job.

Allowed Equality Joins

SELECT a.* FROM a JOIN b ON (a.id = b.id)

SELECT a.* FROM a JOIN b ON (a.id = b.id AND a.department = b.department)

Disallowed Joins

SELECT a.* FROM a JOIN b ON (a.id <> b.id)

Multiple Tables can be joined in the same query

SELECT a.val, b.val, c.val FROM a JOIN b ON (a.key = b.key1) JOIN c ON (c.key = b.key2)

Largest Table LAST:

Map/reduce stage of the join, the last table in the sequence is streamed through the reducers where as the others are buffered.It helps to reduce the memory needed in the reducer for buffering the rows for a particular value of the join key by organizing the tables such that the largest tables appear last in the sequence.

SELECT a.val, b.val, c.val FROM a JOIN b ON (a.key = b.key1) JOIN c ON (c.key = b.key1)

Post Your Comment

Next Questions
Hive Joins cont
Hive Built-in Operators
Hive Data Types
Hive Built-in Functions
Hive Explode
Hive SORT BY vs ORDER By
HIve DISTRIBUTE BY vs CLUSTER BY
Hive HUE Editor
Hive vs Impala
Hive Data & Schema
Hive Partitioning
Hive Bucketing
Hive File Format
Hive Engine
Hive Vectorization
Hive User Defined Function
Hive How to Write a User Defined Function
Hive User Defined Aggregate Functions
Hive Performance Tuning
Hive Rank and Over
Hive SERDE
Hive Directed Acyclic Graph
Hive with Sqoop
How to save hive query output in csv using python
Hive How To Convert External table to Internal table or vice-versa

Copyright ©2022 coderraj.com. All Rights Reserved.