Spark tip caused By:java.lang.classcastexception:scala.collection.mutable.wrappedarray$ofref cannot be cast to [ Lscala.collection.immutable.Map; cause
A UDF that handles the equality of two columns is written, and the data structures are the same, but the structure is more complex, as follows:
|-- list: array (nullable = true)
| |-- element: map (containsNull = true)
| | |-- key: string
| | |-- value: array (valueContainsNull = true)
| | | |-- element: struct (containsNull = true)
| | | | |-- Date: integer (nullable = true)
| | | | |-- Name: string (nullable = true)
|-- list2: array (nullable = true)
| |-- element: map (containsNull = true)
| | |-- key: string
| | |-- value: array (valueContainsNull = true)
| | | |-- element: struct (containsNull = true)
| | | | |-- Date: integer (nullable = true)
| | | | |-- Name: string (nullable = true)
This means that the array embedded in the Map,map is also embedded in an array, can only be compared in turn, the following UDF is written:
case class AppList(Date: Int, versionCode: Int, Name:String)
def isMapEqual(map1: Map[String, Array[AppList]], map2:Map[String, Array[AppList]]): Boolean = {
try{
if (map1.size != map2.size){
return false
} else{
for ( x <- map1.keys){
if (map1(x) != map2(x)){
return false
}
}
return true
}
} catch {
case e: Exception => false
}
}
def isListEqual(list1: Array[Map[String, Array[AppList]]], list2:Seq[Map[String, Seq[AppList]]]): Boolean = {
try {
if (list1.length != list2.length){
return false
} else if (list1.length == 0 || list2.length == 0){
return false
} else {
return isMapEqual(list1(0), list2(0))
}
} catch {
case e: Exception => false
}
}
val isColumnEqual = udf((list1: Array[Map[String, Array[AppList]]], list2:Array[Map[String, Array[AppList]]]) => {
isListEqual(list1, list2)
})
And then I put it in the Spark-shell. Executes the following statement:
val dat = df.withColumn("equal", isColumnEqual($"list1", $"list2"))dat.show()
At this point, the following error occurred:
Caused by: org.apache.spark.SparkException: Failed to execute user defined function($anonfun$1: (array<map<string,array<struct<Date:int,Name:string>>>>, array<map<string,array<struct<Date:int,Name:string>>>>) => boolean)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:377)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:231)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:225)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:827)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:827)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:99)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassCastException: scala.collection.mutable.WrappedArray$ofRef cannot be cast to [Lscala.collection.immutable.Map;
at $anonfun$1.apply(<console>:42)
... 16 more
Solutions
The so-called solution, nature is going to Google ...
See here, said to change the array to seq just fine, embarrassed, tried a bit, sure enough.
Reason
Here says:
So it looks like the arraytype in Dataframe "Iddf" is really a wrappedarray and not a array-so the function call to "fi Ltermapkeyswithset "failed as it expected an Array but got a wrappedarray/seq instead (which doesn ' t implicitly convert t o Array in Scala 2.8 and above).
It means that this array is not a native array in Scala, but instead encapsulates the array (it must be pointed out that I have never written Scala, panic
Reference
- Https://stackoverflow.com/questions/40199507/scala-collection-mutable-wrappedarrayofref-cannot-be-cast-to-integer
- Https://stackoverflow.com/questions/40764957/spark-java-lang-classcastexception-scala-collection-mutable-wrappedarrayofref
Spark hint caused by:java.lang.classcastexception:scala.collection.mutable.wrappedarray$ofref cannot be cast to [ Lscala.collection.immutable.Map;