This is how case statements can be written in DataFrames:
Hive Query:
select key, case when value is null then 0 else value end as new_value from users limit 10;
In DataFrames:
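import org.apache.spark.sql.functions.when

val sqlQuery = "select key, value from users"
val data = hqlContext.sql(sqlQuery)
// limit(10) keeps a DataFrame (take(10) returns an Array[Row], which has no printSchema()); 0 rather than 0.0 keeps new_value an int
val caseStatement = data.withColumn("new_value", when(data("value").isNull, 0).otherwise(data("value"))).limit(10)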
Here the withColumn function adds one more column, new_value, alongside key and value. So if we call printSchema() on the caseStatement object, it will return:
caseStatement.printSchema()
----key: int
----value: int
----new_value: int
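To try the same pattern without a Hive table, here is a minimal sketch. It assumes Spark 2.x or later with a local SparkSession instead of the hqlContext used above; the object name CaseWhenSketch, the helper withNewValue, and the three sample rows are hypothetical and only stand in for the users table.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, when}

object CaseWhenSketch {
  // Hypothetical helper: adds new_value, equal to value, or 0 where value is null.
  def withNewValue(df: DataFrame): DataFrame =
    df.withColumn("new_value", when(col("value").isNull, 0).otherwise(col("value")))

  def main(args: Array[String]): Unit = {
    // Assumption: a local session for illustration; the post itself queries Hive.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("case-when-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up rows standing in for the users table; Option[Int] gives a nullable int column.
    val users = Seq[(Int, Option[Int])]((1, Some(10)), (2, None), (3, Some(30)))
      .toDF("key", "value")

    val result = withNewValue(users).limit(10)
    result.printSchema() // lists key, value and new_value
    result.show()        // the row with a null value gets new_value = 0

    spark.stop()
  }
}
```

Keeping the when/otherwise expression in its own function makes it easy to apply the same rule to any DataFrame that has a value column.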
Source:
http://sparksofdata.blogspot.com/