
Exception thrown when I load data using carbondata-1.0.0

Posted by hexiaoqiao on Feb 14, 2017; 5:49am
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Exception-throws-when-I-load-data-using-carbondata-1-0-0-tp7553.html

Hi dev,

The latest release, apache-carbondata-1.0.0-incubating-rc2, built against
Spark-1.6.2, throws `java.lang.ClassNotFoundException:
org.apache.carbondata.spark.rdd.CarbonBlockDistinctValuesCombineRDD` when I
load data following the Quick Start Guide.
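
For reference, the statements I run in spark-shell are essentially the ones
from the Quick Start Guide. A minimal sketch of them (the store path, CSV
path, and table schema below are placeholders from my environment, not exact
copies):

    // Quick Start steps, roughly as I ran them; paths and schema are placeholders
    import org.apache.spark.sql.CarbonContext
    val cc = new CarbonContext(sc, "hdfs:///user/carbon/store")
    cc.sql("CREATE TABLE IF NOT EXISTS test_table(id STRING, name STRING, city STRING, age INT) STORED BY 'carbondata'")
    // the LOAD DATA statement is the step that fails
    cc.sql("LOAD DATA INPATH 'hdfs:///user/carbon/sample.csv' INTO TABLE test_table")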

Env:
a. CarbonData-1.0.0-incubating-rc2
b. Spark-1.6.2
c. Hadoop-2.7.1
d. CarbonData on a Spark-on-YARN cluster, running in yarn-client mode (a quick classpath check is sketched below)
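
The shell is started in yarn-client mode with the carbondata assembly jar
added via --jars, as I understand the Quick Start intends. A quick
hypothetical check (not from the guide) of whether the jar is at least
visible to the driver, run inside the same spark-shell session:

    // Hypothetical driver-side check: if this call succeeds on the driver
    // while tasks still fail, the carbondata jar is loaded by the driver but
    // is presumably missing from the YARN executor classpath.
    Class.forName("org.apache.carbondata.spark.rdd.CarbonBlockDistinctValuesCombineRDD")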

Any suggestions? Thank you.

The full exception stack trace is below:

--------
ERROR 14-02 12:21:02,005 - main generate global dictionary failed
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, nodemanger): java.lang.ClassNotFoundException: org.apache.carbondata.spark.rdd.CarbonBlockDistinctValuesCombineRDD
     at org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:84)
     at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
     at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
     at java.lang.Class.forName0(Native Method)
     at java.lang.Class.forName(Class.java:274)
     at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:68)
     at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1612)
     at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
     at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:76)
     at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:115)
     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:64)
     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
     at org.apache.spark.scheduler.Task.run(Task.scala:89)
     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
     at java.lang.Thread.run(Thread.java:745)

Driver stacktrace:
     at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1431)
     at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1419)
     at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1418)
     at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
     at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
     at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1418)
     at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
     at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
     at scala.Option.foreach(Option.scala:236)
     at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:799)
     at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1640)
     at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599)
     at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588)
     at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
     at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:620)
     at org.apache.spark.SparkContext.runJob(SparkContext.scala:1832)
     at org.apache.spark.SparkContext.runJob(SparkContext.scala:1845)
     at org.apache.spark.SparkContext.runJob(SparkContext.scala:1858)
     at org.apache.spark.SparkContext.runJob(SparkContext.scala:1929)
     at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:927)
     at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
     at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
     at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
     at org.apache.spark.rdd.RDD.collect(RDD.scala:926)
     at org.apache.carbondata.spark.util.GlobalDictionaryUtil$.generateGlobalDictionary(GlobalDictionaryUtil.scala:742)
     at org.apache.spark.sql.execution.command.LoadTable.run(carbonTableSchema.scala:577)
     at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:58)
     at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:56)
     at org.apache.spark.sql.execution.ExecutedCommand.doExecute(commands.scala:70)
     at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:132)
     at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:130)
     at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
     at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:130)
     at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:55)
     at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:55)
     at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:145)
     at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:130)
     at org.apache.spark.sql.CarbonContext.sql(CarbonContext.scala:139)
     at $line22.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:33)
     at $line22.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:38)
     at $line22.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:40)
     at $line22.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:42)
     at $line22.$read$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:44)
     at $line22.$read$$iwC$$iwC$$iwC$$iwC.<init>(<console>:46)
     at $line22.$read$$iwC$$iwC$$iwC.<init>(<console>:48)
     at $line22.$read$$iwC$$iwC.<init>(<console>:50)
     at $line22.$read$$iwC.<init>(<console>:52)
     at $line22.$read.<init>(<console>:54)
     at $line22.$read$.<init>(<console>:58)
     at $line22.$read$.<clinit>(<console>)
     at $line22.$eval$.<init>(<console>:7)
     at $line22.$eval$.<clinit>(<console>)
     at $line22.$eval.$print(<console>)
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
     at java.lang.reflect.Method.invoke(Method.java:606)
     at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
     at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1346)
     at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
     at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
     at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
     at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
     at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
     at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
     at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657)
     at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665)
     at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:670)
     at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:997)
     at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
     at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
     at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
     at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
     at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
     at org.apache.spark.repl.Main$.main(Main.scala:31)
     at org.apache.spark.repl.Main.main(Main.scala)
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
     at java.lang.reflect.Method.invoke(Method.java:606)
     at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:735)
     at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
     at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
     at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
     at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.carbondata.spark.rdd.CarbonBlockDistinctValuesCombineRDD
     at org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:84)
     at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
     at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
     at java.lang.Class.forName0(Native Method)
     at java.lang.Class.forName(Class.java:274)
     at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:68)
     at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1612)
     at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
     at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:76)
     at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:115)
     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:64)
     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
     at org.apache.spark.scheduler.Task.run(Task.scala:89)
     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
     at java.lang.Thread.run(Thread.java:745)