com.esotericsoftware.kryo.KryoException: java.lang.NullPointerException
Serialization trace:
underlying (org.apache.spark.util.BoundedPriorityQueue)
	at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:144)
	at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
	at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:793)
	at com.twitter.chill.SomeSerializer.read(SomeSerializer.scala:25)
	at com.twitter.chill.SomeSerializer.read(SomeSerializer.scala:19)
	at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:793)
	at org.apache.spark.serializer.KryoSerializerInstance.deserialize(KryoSerializer.scala:312)
	at org.apache.spark.scheduler.DirectTaskResult.value(TaskResult.scala:87)
	at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply$mcV$sp(TaskResultGetter.scala:66)
	at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:57)
	at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:57)
	at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1793)
	at org.apache.spark.scheduler.TaskResultGetter$$anon$2.run(TaskResultGetter.scala:56)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
	at org.apache.spark.sql.catalyst.expressions.codegen.LazilyGeneratedOrdering.compare(GenerateOrdering.scala:157)
	at org.apache.spark.sql.catalyst.expressions.codegen.LazilyGeneratedOrdering.compare(GenerateOrdering.scala:148)
	at scala.math.Ordering$$anon$4.compare(Ordering.scala:111)
	at java.util.PriorityQueue.siftUpUsingComparator(PriorityQueue.java:669)
	at java.util.PriorityQueue.siftUp(PriorityQueue.java:645)
	at java.util.PriorityQueue.offer(PriorityQueue.java:344)
	at java.util.PriorityQueue.add(PriorityQueue.java:321)
	at com.twitter.chill.java.PriorityQueueSerializer.read(PriorityQueueSerializer.java:78)
	at com.twitter.chill.java.PriorityQueueSerializer.read(PriorityQueueSerializer.java:31)
	at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:711)
	at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
	... more
16/05/24 09:42:53 ERROR SparkSQLDriver: Failed in [select
	dt.d_year
	, item.i_brand_id brand_id
	, item.i_brand brand
	, sum(ss_ext_sales_price) sum_agg
from date_dim dt
	, store_sales
	, item
where dt.d_date_sk = store_sales.ss_sold_date_sk
	and store_sales.ss_item_sk = item.i_item_sk
	and item.i_manufact_id = 436
	and dt.d_moy = 12
group by dt.d_year
	, item.i_brand
	, item.i_brand_id
order by dt.d_year
	, sum_agg desc
	, brand_id
limit 100]
A null pointer exception occurred for no obvious reason. Searching online turned up other people hitting the same situation:
With Kryo serialization enabled, the query fails when ORDER BY and LIMIT are combined. After removing either the ORDER BY or the LIMIT clause, the query runs fine.
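The failing pattern is easy to isolate from the command line. The sketch below assumes a Spark 2.0.0 installation on the PATH and the TPC-DS `date_dim` table already registered; the query itself is just a minimal stand-in with the same ORDER BY + LIMIT shape:

```shell
# Explicitly enable Kryo to match the failing configuration.
spark-sql --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
  -e "select d_year from date_dim group by d_year order by d_year limit 100"
```

With Kryo, this combination of ORDER BY and LIMIT throws the KryoException/NullPointerException shown above; dropping either clause lets the query complete.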
On checking, it turns out that Spark 2.0.0 has a bug in its Kryo serialization path. The workaround is to edit $SPARK_HOME/conf/spark-defaults.conf.
The default is:
# spark.serializer org.apache.spark.serializer.KryoSerializer
Change it to:
spark.serializer org.apache.spark.serializer.JavaSerializer
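If you would rather not change the cluster-wide default, the same setting can be applied per application. This is a sketch using standard Spark configuration flags (`your-app.jar` and `com.example.YourApp` are placeholders):

```shell
# Override the serializer for a single batch job at submit time.
spark-submit \
  --conf spark.serializer=org.apache.spark.serializer.JavaSerializer \
  --class com.example.YourApp your-app.jar

# Or for an interactive session:
spark-shell --conf spark.serializer=org.apache.spark.serializer.JavaSerializer
```

A `--conf` flag takes precedence over spark-defaults.conf, so this limits the workaround to the jobs that actually hit the bug.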
Spark 2.0.0 spark-sql returns an NPE error