Spark 2.0.0 spark-sql returns NPE error

Source: Internet
Author: User
Tags: serialization

com.esotericsoftware.kryo.KryoException: java.lang.NullPointerException
Serialization trace:
underlying (org.apache.spark.util.BoundedPriorityQueue)
    at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:144)
    at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
    at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:793)
    at com.twitter.chill.SomeSerializer.read(SomeSerializer.scala:25)
    at com.twitter.chill.SomeSerializer.read(SomeSerializer.scala:19)
    at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:793)
    at org.apache.spark.serializer.KryoSerializerInstance.deserialize(KryoSerializer.scala:312)
    at org.apache.spark.scheduler.DirectTaskResult.value(TaskResult.scala:87)
    at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply$mcV$sp(TaskResultGetter.scala:66)
    at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:57)
    at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:57)
    at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1793)
    at org.apache.spark.scheduler.TaskResultGetter$$anon$2.run(TaskResultGetter.scala:56)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
    at org.apache.spark.sql.catalyst.expressions.codegen.LazilyGeneratedOrdering.compare(GenerateOrdering.scala:157)
    at org.apache.spark.sql.catalyst.expressions.codegen.LazilyGeneratedOrdering.compare(GenerateOrdering.scala:148)
    at scala.math.Ordering$$anon$4.compare(Ordering.scala:111)
    at java.util.PriorityQueue.siftUpUsingComparator(PriorityQueue.java:669)
    at java.util.PriorityQueue.siftUp(PriorityQueue.java:645)
    at java.util.PriorityQueue.offer(PriorityQueue.java:344)
    at java.util.PriorityQueue.add(PriorityQueue.java:321)
    at com.twitter.chill.java.PriorityQueueSerializer.read(PriorityQueueSerializer.java:78)
    at com.twitter.chill.java.PriorityQueueSerializer.read(PriorityQueueSerializer.java:31)
    at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:711)
    at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
    ... more
16/05/24 09:42:53 ERROR SparkSQLDriver: Failed in [select
  dt.d_year
  , item.i_brand_id brand_id
  , item.i_brand brand
  , sum(ss_ext_sales_price) sum_agg
from date_dim dt
  , store_sales
  , item
where dt.d_date_sk = store_sales.ss_sold_date_sk
  and store_sales.ss_item_sk = item.i_item_sk
  and item.i_manufact_id = 436
  and dt.d_moy = 12
group by dt.d_year
  , item.i_brand
  , item.i_brand_id
order by dt.d_year
  , sum_agg desc
  , brand_id
limit 100]

At first the NullPointerException seemed inexplicable. Searching online turned up other people hitting the same situation:

When Kryo serialization is used, the query fails when ORDER BY and LIMIT are combined. After removing either the ORDER BY or the LIMIT clause, the query runs successfully.
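The stack trace explains why the failure surfaces only with ORDER BY plus LIMIT: that combination is executed with a BoundedPriorityQueue, and Kryo's PriorityQueueSerializer.read rebuilds the queue by re-adding each element, which invokes the comparator. A comparator that did not survive serialization therefore blows up at read time, not write time. A minimal Python analogy of that failure mode (OrderedQueue, its key function, and the deliberately lossy `__getstate__` are all illustrative inventions, not Spark code):

```python
import heapq
import pickle

class OrderedQueue:
    """Toy priority queue that, like java.util.PriorityQueue, needs its
    comparator (here: a key function) in order to insert elements."""

    def __init__(self, key):
        self.key = key
        self._heap = []

    def add(self, item):
        # Every insert goes through the comparator, as PriorityQueue.offer does.
        heapq.heappush(self._heap, (self.key(item), item))

    def __getstate__(self):
        # Simulate a serializer that fails to capture the comparator:
        # the elements survive the round trip, the ordering does not.
        return {"key": None, "_heap": list(self._heap)}

    def __setstate__(self, state):
        self.key = state["key"]
        self._heap = []
        for _, item in state["_heap"]:
            # Re-inserting calls the now-missing comparator -- the analogue
            # of the NPE inside LazilyGeneratedOrdering.compare above.
            self.add(item)

q = OrderedQueue(key=len)
q.add("spark")
q.add("sql")
blob = pickle.dumps(q)          # serializing succeeds
try:
    pickle.loads(blob)          # ...but deserializing fails
except TypeError as e:
    print("failed during deserialization:", e)
```

The point of the sketch is that the bug is invisible until task results are deserialized on the driver, which is exactly where the Spark trace places it (KryoSerializerInstance.deserialize under TaskResultGetter).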

Digging further, this turns out to be a bug in Spark 2.0.0's Kryo serialization path (the BoundedPriorityQueue named in the serialization trace does not survive Kryo round-tripping). The workaround is to switch the serializer in spark_home/conf/spark-defaults.conf.

The default is:

# spark.serializer                 org.apache.spark.serializer.KryoSerializer

Change to:

spark.serializer                 org.apache.spark.serializer.JavaSerializer
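If editing spark-defaults.conf cluster-wide is undesirable, the same setting can also be applied per job via spark-submit's `--conf` option (the application class and jar names below are hypothetical placeholders):

```shell
# Override the serializer for a single job instead of cluster-wide.
spark-submit \
  --conf spark.serializer=org.apache.spark.serializer.JavaSerializer \
  --class com.example.MyApp \
  my-app.jar
```

Per-job `--conf` values take precedence over spark-defaults.conf, so this limits the (slower) JavaSerializer to the queries that actually hit the bug.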

