Apache CarbonData Dev Mailing List archive › Apache CarbonData JIRA issues

[jira] [Commented] (CARBONDATA-635) ClassCastException in Spark 2.1 Cluster mode in insert query when name of column is changed and When the orders of columns are changed in the tables

Classic

List

Threaded

1 message

Akash R Nilugal (Jira)

[jira] [Commented] (CARBONDATA-635) ClassCastException in Spark 2.1 Cluster mode in insert query when name of column is changed and When the orders of columns are changed in the tables

[ https://issues.apache.org/jira/browse/CARBONDATA-635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15824333#comment-15824333 ]

Ravindra Pesala commented on CARBONDATA-635:
--------------------------------------------

Please verify this issue once on latest master.

> ClassCastException in Spark 2.1 Cluster mode in insert query when name of column is changed and When the orders of columns are changed in the tables
> ----------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: CARBONDATA-635
> URL: https://issues.apache.org/jira/browse/CARBONDATA-635
> Project: CarbonData
> Issue Type: Bug
> Components: data-load
> Affects Versions: 1.0.0-incubating
> Environment: Spark 2.1 Cluster mode
> Reporter: Harsh Sharma
> Priority: Minor
> Attachments: 2000_UniqData.csv, driverlog
>
>
> ::::::::: SCENARIO 1 :::::::
> CREATE TABLE uniqdata (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES ("TABLE_BLOCKSIZE"= "256 MB");
> CREATE TABLE student (CUST_ID2 int,CUST_ADDR String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES ("TABLE_BLOCKSIZE"= "256 MB");
> LOAD DATA inpath 'hdfs://hadoop-master:54311/data/2000_UniqData.csv' INTO table uniqdata options('DELIMITER'=',', 'FILEHEADER'='CUST_ID, CUST_NAME, ACTIVE_EMUI_VERSION, DOB, DOJ, BIGINT_COLUMN1, BIGINT_COLUMN2, DECIMAL_COLUMN1, DECIMAL_COLUMN2, Double_COLUMN1, Double_COLUMN2, INTEGER_COLUMN1');
> insert into student select * from uniqdata;
> ::::::::: SCENARIO 2 :::::::
> CREATE TABLE uniqdata (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES ("TABLE_BLOCKSIZE"= "256 MB");
> CREATE TABLE student (ACTIVE_EMUI_VERSION string, DOB timestamp, CUST_ID int,CUST_NAME String, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES ("TABLE_BLOCKSIZE"= "256 MB");
> LOAD DATA inpath 'hdfs://hadoop-master:54311/data/2000_UniqData.csv' INTO table uniqdata options('DELIMITER'=',', 'FILEHEADER'='CUST_ID, CUST_NAME, ACTIVE_EMUI_VERSION, DOB, DOJ, BIGINT_COLUMN1, BIGINT_COLUMN2, DECIMAL_COLUMN1, DECIMAL_COLUMN2, Double_COLUMN1, Double_COLUMN2, INTEGER_COLUMN1');
> Above two scenarios have the same result and exception as below,
> 0: jdbc:hive2://hadoop-master:10000> insert into student select * from uniqdata;
> Error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 26.0 failed 4 times, most recent failure: Lost task 0.3 in stage 26.0 (TID 38, 192.168.2.176, executor 0): java.lang.ClassCastException: org.apache.spark.unsafe.types.UTF8String cannot be cast to java.lang.Integer
> at scala.runtime.BoxesRunTime.unboxToInt(BoxesRunTime.java:101)
> at org.apache.spark.sql.CarbonDictionaryDecoder$$anonfun$doExecute$1$$anonfun$7$$anon$1$$anonfun$next$1.apply$mcVI$sp(CarbonDictionaryDecoder.scala:186)
> at org.apache.spark.sql.CarbonDictionaryDecoder$$anonfun$doExecute$1$$anonfun$7$$anon$1$$anonfun$next$1.apply(CarbonDictionaryDecoder.scala:183)
> at org.apache.spark.sql.CarbonDictionaryDecoder$$anonfun$doExecute$1$$anonfun$7$$anon$1$$anonfun$next$1.apply(CarbonDictionaryDecoder.scala:183)
> at scala.collection.mutable.ArraySeq.foreach(ArraySeq.scala:74)
> at org.apache.spark.sql.CarbonDictionaryDecoder$$anonfun$doExecute$1$$anonfun$7$$anon$1.next(CarbonDictionaryDecoder.scala:183)
> at org.apache.spark.sql.CarbonDictionaryDecoder$$anonfun$doExecute$1$$anonfun$7$$anon$1.next(CarbonDictionaryDecoder.scala:174)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
> at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
> at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
> at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:377)
> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> at org.apache.carbondata.spark.rdd.CarbonBlockDistinctValuesCombineRDD.compute(CarbonGlobalDictionaryRDD.scala:293)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
> at org.apache.spark.scheduler.Task.run(Task.scala:99)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Driver stacktrace: (state=,code=0)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)