[jira] [Commented] (CARBONDATA-635) ClassCastException in Spark 2.1 Cluster mode in insert query when name of column is changed and When the orders of columns are changed in the tables

classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view
|

[jira] [Commented] (CARBONDATA-635) ClassCastException in Spark 2.1 Cluster mode in insert query when name of column is changed and When the orders of columns are changed in the tables

Akash R Nilugal (Jira)

    [ https://issues.apache.org/jira/browse/CARBONDATA-635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15824333#comment-15824333 ]

Ravindra Pesala commented on CARBONDATA-635:
--------------------------------------------

Please verify this issue once on latest master.

> ClassCastException in Spark 2.1 Cluster mode in insert query when name of column is changed and When the orders of columns are changed in the tables
> ----------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CARBONDATA-635
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-635
>             Project: CarbonData
>          Issue Type: Bug
>          Components: data-load
>    Affects Versions: 1.0.0-incubating
>         Environment: Spark 2.1 Cluster mode
>            Reporter: Harsh Sharma
>            Priority: Minor
>         Attachments: 2000_UniqData.csv, driverlog
>
>
> :::::::::  SCENARIO 1 :::::::
> CREATE TABLE uniqdata (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES ("TABLE_BLOCKSIZE"= "256 MB");
> CREATE TABLE student (CUST_ID2 int,CUST_ADDR String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES ("TABLE_BLOCKSIZE"= "256 MB");
> LOAD DATA inpath 'hdfs://hadoop-master:54311/data/2000_UniqData.csv' INTO table uniqdata options('DELIMITER'=',', 'FILEHEADER'='CUST_ID, CUST_NAME, ACTIVE_EMUI_VERSION, DOB, DOJ, BIGINT_COLUMN1, BIGINT_COLUMN2, DECIMAL_COLUMN1, DECIMAL_COLUMN2, Double_COLUMN1, Double_COLUMN2, INTEGER_COLUMN1');
> insert into student select * from uniqdata;
> :::::::::  SCENARIO 2 :::::::
> CREATE TABLE uniqdata (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES ("TABLE_BLOCKSIZE"= "256 MB");
> CREATE TABLE student (ACTIVE_EMUI_VERSION string, DOB timestamp, CUST_ID int,CUST_NAME String, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES ("TABLE_BLOCKSIZE"= "256 MB");
> LOAD DATA inpath 'hdfs://hadoop-master:54311/data/2000_UniqData.csv' INTO table uniqdata options('DELIMITER'=',', 'FILEHEADER'='CUST_ID, CUST_NAME, ACTIVE_EMUI_VERSION, DOB, DOJ, BIGINT_COLUMN1, BIGINT_COLUMN2, DECIMAL_COLUMN1, DECIMAL_COLUMN2, Double_COLUMN1, Double_COLUMN2, INTEGER_COLUMN1');
> Above two scenarios have the same result and exception as below,
> 0: jdbc:hive2://hadoop-master:10000> insert into student select * from uniqdata;
> Error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 26.0 failed 4 times, most recent failure: Lost task 0.3 in stage 26.0 (TID 38, 192.168.2.176, executor 0): java.lang.ClassCastException: org.apache.spark.unsafe.types.UTF8String cannot be cast to java.lang.Integer
> at scala.runtime.BoxesRunTime.unboxToInt(BoxesRunTime.java:101)
> at org.apache.spark.sql.CarbonDictionaryDecoder$$anonfun$doExecute$1$$anonfun$7$$anon$1$$anonfun$next$1.apply$mcVI$sp(CarbonDictionaryDecoder.scala:186)
> at org.apache.spark.sql.CarbonDictionaryDecoder$$anonfun$doExecute$1$$anonfun$7$$anon$1$$anonfun$next$1.apply(CarbonDictionaryDecoder.scala:183)
> at org.apache.spark.sql.CarbonDictionaryDecoder$$anonfun$doExecute$1$$anonfun$7$$anon$1$$anonfun$next$1.apply(CarbonDictionaryDecoder.scala:183)
> at scala.collection.mutable.ArraySeq.foreach(ArraySeq.scala:74)
> at org.apache.spark.sql.CarbonDictionaryDecoder$$anonfun$doExecute$1$$anonfun$7$$anon$1.next(CarbonDictionaryDecoder.scala:183)
> at org.apache.spark.sql.CarbonDictionaryDecoder$$anonfun$doExecute$1$$anonfun$7$$anon$1.next(CarbonDictionaryDecoder.scala:174)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
> at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
> at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
> at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:377)
> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> at org.apache.carbondata.spark.rdd.CarbonBlockDistinctValuesCombineRDD.compute(CarbonGlobalDictionaryRDD.scala:293)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
> at org.apache.spark.scheduler.Task.run(Task.scala:99)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Driver stacktrace: (state=,code=0)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)