Hive User Defined Aggregate Functions in Big Data Hadoop

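UDAF stands for User Defined Aggregate Function. A UDAF takes values from many rows and returns a single result.

Built-in aggregate functions of this kind include COUNT, AVG, SUM, MIN, and MAX.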
Create a Java class that extends org.apache.hadoop.hive.ql.exec.UDAF.
Create an inner class that implements org.apache.hadoop.hive.ql.exec.UDAFEvaluator.

Implement the following five methods in the inner class:

init() – This method initializes the evaluator and resets its internal state. A typical implementation assigns a fresh state object (for example, new Column()) to indicate that no values have been aggregated yet.

iterate() – This method is called every time there is a new value to be aggregated. The evaluator should update its internal state with the result of the aggregation (for a sum, it adds the value to the running total). It returns true to indicate that the input was valid.

terminatePartial() – This method is called when Hive wants a result for the partial aggregation. It must return an object that encapsulates the current state of the aggregation.

merge() – This method is called when Hive decides to combine one partial aggregation with another. The object it receives is the state that another evaluator returned from terminatePartial().

terminate() – This method is called when the final result of the aggregation is needed.
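
The five methods above can be sketched as a small sum aggregate. This is a minimal illustration, not a deployable UDAF: so that it compiles without Hive on the classpath, it does not extend the real Hive classes. In an actual UDAF the outer class would extend org.apache.hadoop.hive.ql.exec.UDAF and the inner class would implement org.apache.hadoop.hive.ql.exec.UDAFEvaluator. The names SumUdafSketch, SumEvaluator, sumAcrossTwoTasks, and the sum field of Column are assumptions made for this example.

```java
// Sketch only: mimics the UDAF evaluator lifecycle without depending on Hive.
// A real UDAF would extend org.apache.hadoop.hive.ql.exec.UDAF, and the
// evaluator would implement org.apache.hadoop.hive.ql.exec.UDAFEvaluator.
class SumUdafSketch {

    // Hypothetical state object; holds everything aggregated so far.
    static class Column {
        double sum = 0;
    }

    static class SumEvaluator {
        private Column col;

        SumEvaluator() {
            init();
        }

        // Reset the state: a fresh Column means nothing has been aggregated yet.
        public void init() {
            col = new Column();
        }

        // Called once per input value; null inputs are skipped.
        public boolean iterate(Double value) {
            if (value != null) {
                col.sum += value;
            }
            return true; // the input was valid
        }

        // Return an object that encapsulates the partial aggregation.
        public Column terminatePartial() {
            return col;
        }

        // Fold another evaluator's partial state into this one.
        public boolean merge(Column other) {
            if (other != null) {
                col.sum += other.sum;
            }
            return true;
        }

        // Produce the final result.
        public Double terminate() {
            return col.sum;
        }
    }

    // Simulates two tasks that each aggregate part of the data,
    // followed by a third evaluator that merges their partial results.
    static double sumAcrossTwoTasks(Double[] first, Double[] second) {
        SumEvaluator taskA = new SumEvaluator();
        for (Double v : first) {
            taskA.iterate(v);
        }
        SumEvaluator taskB = new SumEvaluator();
        for (Double v : second) {
            taskB.iterate(v);
        }
        SumEvaluator combiner = new SumEvaluator();
        combiner.merge(taskA.terminatePartial());
        combiner.merge(taskB.terminatePartial());
        return combiner.terminate();
    }

    public static void main(String[] args) {
        double total = sumAcrossTwoTasks(
                new Double[] {1.0, 2.0},
                new Double[] {3.0, null});
        System.out.println(total); // prints 6.0
    }
}
```

The sketch calls the methods by hand in the order Hive would be expected to use them: iterate() on each partial evaluator, terminatePartial() to export the state, merge() to combine states, and terminate() for the final value.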


Copyright ©2022 coderraj.com. All Rights Reserved.