Time for action – adding a new User Defined Function (UDF)

Let us see how to write some custom Java code and invoke it from Hive through a new UDF.

  1. Save the following code as City.java:
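    package com.kycorsystems;

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    public class City extends UDF {
        // A run of letters, optional periods or spaces, another run of letters,
        // then a comma or space followed by one non-letter character.
        // The ranges end at Z: a range ending at lowercase z would also match
        // the ASCII punctuation that sits between 'Z' and 'a'.
        private static Pattern pattern = Pattern.compile(
            "[a-zA-Z]+?[\\. ]*[a-zA-Z]+?[\\, ][^a-zA-Z]");

        public Text evaluate(final Text str) {
            Text result;
            String location = str.toString().trim();
            Matcher matcher = pattern.matcher(location);

            if (matcher.find()) {
                // Drop the two trailing delimiter characters from the match.
                result = new Text(
                    location.substring(matcher.start(), matcher.end() - 2));
            } else {
                result = new Text("Unknown");
            }
            return ...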
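To see what the regular expression in this UDF extracts, it can help to run the matching logic outside Hive first. The following is a minimal sketch, not part of the book's listing: the class name `CityPatternDemo`, the method `extractCity`, and the sample location strings are all hypothetical, and plain `String` values stand in for Hadoop's `Text` so that no Hive or Hadoop JARs are needed. It assumes the character ranges are meant to be `A-Z`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-alone harness: mirrors the matching logic of the
// City UDF using plain Strings instead of org.apache.hadoop.io.Text.
public class CityPatternDemo {
    // Same expression as in the UDF, with ranges assumed to be A-Z.
    private static final Pattern PATTERN = Pattern.compile(
        "[a-zA-Z]+?[\\. ]*[a-zA-Z]+?[\\, ][^a-zA-Z]");

    // Returns the leading city name, or "Unknown" if nothing matches.
    public static String extractCity(String raw) {
        String location = raw.trim();
        Matcher matcher = PATTERN.matcher(location);
        if (matcher.find()) {
            // The match ends with a delimiter pair such as ", ";
            // end() - 2 removes those two characters.
            return location.substring(matcher.start(), matcher.end() - 2);
        }
        return "Unknown";
    }

    public static void main(String[] args) {
        // Sample inputs are invented for illustration.
        System.out.println(extractCity("Chicago, IL"));    // Chicago
        System.out.println(extractCity("St. Louis, MO"));  // St. Louis
        System.out.println(extractCity("Iowa City, IA"));  // Iowa City
        System.out.println(extractCity("12345"));          // Unknown
    }
}
```

Keeping the extraction in a static method like this makes the pattern easy to check against sample values before the class is packaged for Hive, where failures are slower to diagnose.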
