Pallavi Singh created CARBONDATA-631:
----------------------------------------

             Summary: Select Query Failure for table created in 0.2 with data loaded in 1.0
                 Key: CARBONDATA-631
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-631
             Project: CarbonData
          Issue Type: Bug
         Environment: Spark 1.6
            Reporter: Pallavi Singh
             Fix For: 0.1.0-incubating


Created the table with the 0.2 jar:

CREATE TABLE uniqdata (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES ("TABLE_BLOCKSIZE"= "256 MB");

Then loaded data:

LOAD DATA INPATH 'hdfs://localhost:54310/csv/2000_UniqData.csv' into table uniqdata OPTIONS('DELIMITER'=',' , 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');

Switched to the 1.0 jar and loaded again:

LOAD DATA INPATH 'hdfs://localhost:54310/csv/2000_UniqData.csv' into table uniqdata OPTIONS('DELIMITER'=',' , 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');

After the successful load, I ran:

select count(*) from uniqdata;

I get the following error:

INFO 12-01 18:31:04,057 - Running query 'select count(*) from uniqdata' with 81129cf3-fcd4-429d-9adf-d37d35cdf051
INFO 12-01 18:31:04,058 - pool-27-thread-46 Query [SELECT COUNT(*) FROM UNIQDATA]
INFO 12-01 18:31:04,060 - Parsing command: select count(*) from uniqdata
INFO 12-01 18:31:04,060 - Parse Completed
INFO 12-01 18:31:04,061 - Parsing command: select count(*) from uniqdata
INFO 12-01 18:31:04,061 - Parse Completed
INFO 12-01 18:31:04,061 - 27: get_table : db=12jan17 tbl=uniqdata
INFO 12-01 18:31:04,061 - ugi=pallavi ip=unknown-ip-addr cmd=get_table : db=12jan17 tbl=uniqdata
INFO 12-01 18:31:04,061 - 27: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
INFO 12-01 18:31:04,063 - ObjectStore, initialize called
INFO 12-01 18:31:04,068 - Reading in results for query "org.datanucleus.store.rdbms.query.SQLQuery@0" since the connection used is closing
INFO 12-01 18:31:04,069 - Using direct SQL, underlying DB is DERBY
INFO 12-01 18:31:04,069 - Initialized ObjectStore
INFO 12-01 18:31:04,101 - pool-27-thread-46 Starting to optimize plan
ERROR 12-01 18:31:04,168 - pool-27-thread-46 Cannot convert12-01-2017 16:02:28 to Time/Long type valueUnparseable date: "12-01-2017 16:02:28"
ERROR 12-01 18:31:04,185 - pool-27-thread-46 Cannot convert12-01-2017 16:02:08 to Time/Long type valueUnparseable date: "12-01-2017 16:02:08"
ERROR 12-01 18:31:04,185 - pool-27-thread-46 Cannot convert12-01-2017 16:02:08 to Time/Long type valueUnparseable date: "12-01-2017 16:02:08"
ERROR 12-01 18:31:04,204 - pool-27-thread-46 Cannot convert12-01-2017 16:02:08 to Time/Long type valueUnparseable date: "12-01-2017 16:02:08"
ERROR 12-01 18:31:04,210 - Error executing query, currentState RUNNING,
org.apache.spark.sql.catalyst.errors.package$TreeNodeException: execute, tree:
CarbonDictionaryDecoder [CarbonDecoderRelation(Map(dob#280 -> dob#280, double_column1#287 -> double_column1#287, decimal_column1#285 -> decimal_column1#285, cust_id#282L -> cust_id#282L, integer_column1#289L -> integer_column1#289L, decimal_column2#286 -> decimal_column2#286, cust_name#278 -> cust_name#278, double_column2#288 -> double_column2#288, active_emui_version#279 -> active_emui_version#279, bigint_column1#283L -> bigint_column1#283L, bigint_column2#284L -> bigint_column2#284L, doj#281 -> doj#281),CarbonDatasourceRelation(`12jan17`.`uniqdata`,None))], ExcludeProfile(ArrayBuffer()), CarbonAliasDecoderRelation()
+- TungstenAggregate(key=[], functions=[(count(1),mode=Final,isDistinct=false)], output=[_c0#750L])
   +- TungstenExchange SinglePartition, None
      +- TungstenAggregate(key=[], functions=[(count(1),mode=Partial,isDistinct=false)], output=[count#754L])
         +- CarbonScan CarbonRelation 12jan17, uniqdata, CarbonMetaData(ArrayBuffer(cust_name, active_emui_version, dob, doj),ArrayBuffer(cust_id, bigint_column1, bigint_column2, decimal_column1, decimal_column2, double_column1, double_column2, integer_column1),org.apache.carbondata.core.carbon.metadata.schema.table.CarbonTable@2302bcb1,DictionaryMap(Map(cust_name -> true, active_emui_version -> true, dob -> false, doj -> false))), org.apache.carbondata.spark.merger.TableMeta@2d38370a, None, true

    at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:49)
    at org.apache.spark.sql.CarbonDictionaryDecoder.doExecute(CarbonDictionaryDecoder.scala:153)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:132)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:130)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:130)
    at org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:166)
    at org.apache.spark.sql.execution.SparkPlan.executeCollectPublic(SparkPlan.scala:174)
    at org.apache.spark.sql.DataFrame$$anonfun$org$apache$spark$sql$DataFrame$$execute$1$1.apply(DataFrame.scala:1499)
    at org.apache.spark.sql.DataFrame$$anonfun$org$apache$spark$sql$DataFrame$$execute$1$1.apply(DataFrame.scala:1499)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:56)
    at org.apache.spark.sql.DataFrame.withNewExecutionId(DataFrame.scala:2086)
    at org.apache.spark.sql.DataFrame.org$apache$spark$sql$DataFrame$$execute$1(DataFrame.scala:1498)
    at org.apache.spark.sql.DataFrame$$anonfun$org$apache$spark$sql$DataFrame$$collect$1.apply(DataFrame.scala:1503)
    at org.apache.spark.sql.DataFrame$$anonfun$org$apache$spark$sql$DataFrame$$collect$1.apply(DataFrame.scala:1503)
    at org.apache.spark.sql.DataFrame.withCallback(DataFrame.scala:2099)
    at org.apache.spark.sql.DataFrame.org$apache$spark$sql$DataFrame$$collect(DataFrame.scala:1503)
    at org.apache.spark.sql.DataFrame.collect(DataFrame.scala:1480)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:226)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:154)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:151)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:164)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: execute, tree:
TungstenAggregate(key=[], functions=[(count(1),mode=Final,isDistinct=false)], output=[_c0#750L])
+- TungstenExchange SinglePartition, None
   +- TungstenAggregate(key=[], functions=[(count(1),mode=Partial,isDistinct=false)], output=[count#754L])
      +- CarbonScan CarbonRelation 12jan17, uniqdata, CarbonMetaData(ArrayBuffer(cust_name, active_emui_version, dob, doj),ArrayBuffer(cust_id, bigint_column1, bigint_column2, decimal_column1, decimal_column2, double_column1, double_column2, integer_column1),org.apache.carbondata.core.carbon.metadata.schema.table.CarbonTable@2302bcb1,DictionaryMap(Map(cust_name -> true, active_emui_version -> true, dob -> false, doj -> false))), org.apache.carbondata.spark.merger.TableMeta@2d38370a, None, true

    at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:49)
    at org.apache.spark.sql.execution.aggregate.TungstenAggregate.doExecute(TungstenAggregate.scala:80)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:132)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:130)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:130)
    at org.apache.spark.sql.CarbonDictionaryDecoder$$anonfun$doExecute$1.apply(CarbonDictionaryDecoder.scala:214)
    at org.apache.spark.sql.CarbonDictionaryDecoder$$anonfun$doExecute$1.apply(CarbonDictionaryDecoder.scala:153)
    at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:48)
    ... 29 more
Caused by: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: execute, tree:
TungstenExchange SinglePartition, None
+- TungstenAggregate(key=[], functions=[(count(1),mode=Partial,isDistinct=false)], output=[count#754L])
   +- CarbonScan CarbonRelation 12jan17, uniqdata, CarbonMetaData(ArrayBuffer(cust_name, active_emui_version, dob, doj),ArrayBuffer(cust_id, bigint_column1, bigint_column2, decimal_column1, decimal_column2, double_column1, double_column2, integer_column1),org.apache.carbondata.core.carbon.metadata.schema.table.CarbonTable@2302bcb1,DictionaryMap(Map(cust_name -> true, active_emui_version -> true, dob -> false, doj -> false))), org.apache.carbondata.spark.merger.TableMeta@2d38370a, None, true

    at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:49)
    at org.apache.spark.sql.execution.Exchange.doExecute(Exchange.scala:247)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:132)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:130)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:130)
    at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1.apply(TungstenAggregate.scala:86)
    at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1.apply(TungstenAggregate.scala:80)
    at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:48)
    ... 37 more
Caused by: java.io.IOException: Problem in loading segment block.
    at org.apache.carbondata.core.carbon.datastore.SegmentTaskIndexStore.get(SegmentTaskIndexStore.java:97)
    at org.apache.carbondata.core.carbon.datastore.SegmentTaskIndexStore.get(SegmentTaskIndexStore.java:52)
    at org.apache.carbondata.hadoop.CacheAccessClient.get(CacheAccessClient.java:67)
    at org.apache.carbondata.hadoop.CarbonInputFormat.getSegmentAbstractIndexs(CarbonInputFormat.java:486)
    at org.apache.carbondata.hadoop.CarbonInputFormat.getDataBlocksOfSegment(CarbonInputFormat.java:338)
    at org.apache.carbondata.hadoop.CarbonInputFormat.getSplits(CarbonInputFormat.java:295)
    at org.apache.carbondata.hadoop.CarbonInputFormat.getSplits(CarbonInputFormat.java:244)
    at org.apache.carbondata.spark.rdd.CarbonScanRDD.getPartitions(CarbonScanRDD.scala:82)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
    at org.apache.spark.ShuffleDependency.<init>(Dependency.scala:91)
    at org.apache.spark.sql.execution.Exchange.prepareShuffleDependency(Exchange.scala:220)
    at org.apache.spark.sql.execution.Exchange$$anonfun$doExecute$1.apply(Exchange.scala:254)
    at org.apache.spark.sql.execution.Exchange$$anonfun$doExecute$1.apply(Exchange.scala:248)
    at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:48)
    ... 45 more
Caused by: java.lang.RuntimeException: Missing Carbon index file for partition[0] Segment[0], taskId[0]
    at org.apache.carbondata.core.carbon.path.CarbonTablePath.getCarbonIndexFilePath(CarbonTablePath.java:270)
    at org.apache.carbondata.core.util.CarbonUtil.calculateDriverBTreeSize(CarbonUtil.java:917)
    at org.apache.carbondata.core.carbon.datastore.SegmentTaskIndexStore.calculateRequiredSize(SegmentTaskIndexStore.java:280)
    at org.apache.carbondata.core.carbon.datastore.SegmentTaskIndexStore.loadAndGetTaskIdToSegmentsMap(SegmentTaskIndexStore.java:225)
    at org.apache.carbondata.core.carbon.datastore.SegmentTaskIndexStore.get(SegmentTaskIndexStore.java:91)
    ... 76 more
ERROR 12-01 18:31:04,213 - Error running hive query:
org.apache.hive.service.cli.HiveSQLException: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: execute, tree:
CarbonDictionaryDecoder [CarbonDecoderRelation(Map(dob#280 -> dob#280, double_column1#287 -> double_column1#287, decimal_column1#285 -> decimal_column1#285, cust_id#282L -> cust_id#282L, integer_column1#289L -> integer_column1#289L, decimal_column2#286 -> decimal_column2#286, cust_name#278 -> cust_name#278, double_column2#288 -> double_column2#288, active_emui_version#279 -> active_emui_version#279, bigint_column1#283L -> bigint_column1#283L, bigint_column2#284L -> bigint_column2#284L, doj#281 -> doj#281),CarbonDatasourceRelation(`12jan17`.`uniqdata`,None))], ExcludeProfile(ArrayBuffer()), CarbonAliasDecoderRelation()
+- TungstenAggregate(key=[], functions=[(count(1),mode=Final,isDistinct=false)], output=[_c0#750L])
   +- TungstenExchange SinglePartition, None
      +- TungstenAggregate(key=[], functions=[(count(1),mode=Partial,isDistinct=false)], output=[count#754L])
         +- CarbonScan CarbonRelation 12jan17, uniqdata, CarbonMetaData(ArrayBuffer(cust_name, active_emui_version, dob, doj),ArrayBuffer(cust_id, bigint_column1, bigint_column2, decimal_column1, decimal_column2, double_column1, double_column2, integer_column1),org.apache.carbondata.core.carbon.metadata.schema.table.CarbonTable@2302bcb1,DictionaryMap(Map(cust_name -> true, active_emui_version -> true, dob -> false, doj -> false))), org.apache.carbondata.spark.merger.TableMeta@2d38370a, None, true

    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:246)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:154)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:151)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:164)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

After that, we ran:

select * from uniqdata;

and got the following error:

INFO 12-01 18:33:58,353 - Running query 'select * from uniqdata' with fcb69ebc-292d-4759-bb89-404837c1674b
INFO 12-01 18:33:58,353 - pool-27-thread-47 Query [SELECT * FROM UNIQDATA]
INFO 12-01 18:33:58,355 - Parsing command: select * from uniqdata
INFO 12-01 18:33:58,355 - Parse Completed
INFO 12-01 18:33:58,355 - Parsing command: select * from uniqdata
INFO 12-01 18:33:58,356 - Parse Completed
INFO 12-01 18:33:58,356 - 28: get_table : db=12jan17 tbl=uniqdata
INFO 12-01 18:33:58,356 - ugi=pallavi ip=unknown-ip-addr cmd=get_table : db=12jan17 tbl=uniqdata
INFO 12-01 18:33:58,356 - 28: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
INFO 12-01 18:33:58,358 - ObjectStore, initialize called
INFO 12-01 18:33:58,363 - Reading in results for query "org.datanucleus.store.rdbms.query.SQLQuery@0" since the connection used is closing
INFO 12-01 18:33:58,363 - Using direct SQL, underlying DB is DERBY
INFO 12-01 18:33:58,363 - Initialized ObjectStore
INFO 12-01 18:33:58,396 - pool-27-thread-47 Starting to optimize plan
ERROR 12-01 18:33:58,459 - pool-27-thread-47 Cannot convert12-01-2017 16:02:28 to Time/Long type valueUnparseable date: "12-01-2017 16:02:28"
ERROR 12-01 18:33:58,468 - pool-27-thread-47 Cannot convert12-01-2017 16:02:08 to Time/Long type valueUnparseable date: "12-01-2017 16:02:08"
ERROR 12-01 18:33:58,468 - pool-27-thread-47 Cannot convert12-01-2017 16:02:08 to Time/Long type valueUnparseable date: "12-01-2017 16:02:08"
ERROR 12-01 18:33:58,477 - pool-27-thread-47 Cannot convert12-01-2017 16:02:08 to Time/Long type valueUnparseable date: "12-01-2017 16:02:08"
ERROR 12-01 18:33:58,478 - Error executing query, currentState RUNNING,
java.io.IOException: Problem in loading segment block.
    at org.apache.carbondata.core.carbon.datastore.SegmentTaskIndexStore.get(SegmentTaskIndexStore.java:97)
    at org.apache.carbondata.core.carbon.datastore.SegmentTaskIndexStore.get(SegmentTaskIndexStore.java:52)
    at org.apache.carbondata.hadoop.CacheAccessClient.get(CacheAccessClient.java:67)
    at org.apache.carbondata.hadoop.CarbonInputFormat.getSegmentAbstractIndexs(CarbonInputFormat.java:486)
    at org.apache.carbondata.hadoop.CarbonInputFormat.getDataBlocksOfSegment(CarbonInputFormat.java:338)
    at org.apache.carbondata.hadoop.CarbonInputFormat.getSplits(CarbonInputFormat.java:295)
    at org.apache.carbondata.hadoop.CarbonInputFormat.getSplits(CarbonInputFormat.java:244)
    at org.apache.carbondata.spark.rdd.CarbonScanRDD.getPartitions(CarbonScanRDD.scala:82)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1929)
    at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:927)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
    at org.apache.spark.rdd.RDD.collect(RDD.scala:926)
    at org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:166)
    at org.apache.spark.sql.execution.SparkPlan.executeCollectPublic(SparkPlan.scala:174)
    at org.apache.spark.sql.DataFrame$$anonfun$org$apache$spark$sql$DataFrame$$execute$1$1.apply(DataFrame.scala:1499)
    at org.apache.spark.sql.DataFrame$$anonfun$org$apache$spark$sql$DataFrame$$execute$1$1.apply(DataFrame.scala:1499)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:56)
    at org.apache.spark.sql.DataFrame.withNewExecutionId(DataFrame.scala:2086)
    at org.apache.spark.sql.DataFrame.org$apache$spark$sql$DataFrame$$execute$1(DataFrame.scala:1498)
    at org.apache.spark.sql.DataFrame$$anonfun$org$apache$spark$sql$DataFrame$$collect$1.apply(DataFrame.scala:1503)
    at org.apache.spark.sql.DataFrame$$anonfun$org$apache$spark$sql$DataFrame$$collect$1.apply(DataFrame.scala:1503)
    at org.apache.spark.sql.DataFrame.withCallback(DataFrame.scala:2099)
    at org.apache.spark.sql.DataFrame.org$apache$spark$sql$DataFrame$$collect(DataFrame.scala:1503)
    at org.apache.spark.sql.DataFrame.collect(DataFrame.scala:1480)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:226)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:154)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:151)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:164)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Missing Carbon index file for partition[0] Segment[0], taskId[0]
    at org.apache.carbondata.core.carbon.path.CarbonTablePath.getCarbonIndexFilePath(CarbonTablePath.java:270)
    at org.apache.carbondata.core.util.CarbonUtil.calculateDriverBTreeSize(CarbonUtil.java:917)
    at org.apache.carbondata.core.carbon.datastore.SegmentTaskIndexStore.calculateRequiredSize(SegmentTaskIndexStore.java:280)
    at org.apache.carbondata.core.carbon.datastore.SegmentTaskIndexStore.loadAndGetTaskIdToSegmentsMap(SegmentTaskIndexStore.java:225)
    at org.apache.carbondata.core.carbon.datastore.SegmentTaskIndexStore.get(SegmentTaskIndexStore.java:91)
    ... 56 more
ERROR 12-01 18:33:58,479 - Error running hive query:
org.apache.hive.service.cli.HiveSQLException: java.io.IOException: Problem in loading segment block.
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:246)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:154)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:151)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:164)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)