Posted by BabuLal on Mar 22, 2018; 4:09pm
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Getting-Problem-in-loading-segment-blocks-error-after-doing-multi-update-operations-tp40249p42886.html
Hi all,
I am able to reproduce the same exception in my cluster (the trace is listed
below).
------
scala> carbon.sql("select count(*) from public.c_compact4").show
2018-03-22 20:40:33,105 | WARN | main | main spark.sql.sources.options.keys
expected, but read nothing |
org.apache.carbondata.common.logging.impl.StandardLogService.logWarnMessage(StandardLogService.java:168)
---------------- Store location ----------------
linux-49:/opt/babu # hadoop fs -ls
/user/hive/warehouse/carbon.store/public/c_compact4/Fact/Part0/Segment_0/*.deletedelta
-rw-rw-r--+ 3 hdfs hive 177216 2018-03-22 18:20
/user/hive/warehouse/carbon.store/public/c_compact4/Fact/Part0/Segment_0/part-0-0_batchno0-0-1521723019528.deletedelta
-rw-r--r-- 3 hdfs hive 0 2018-03-22 19:35
/user/hive/warehouse/carbon.store/public/c_compact4/Fact/Part0/Segment_0/part-0-0_batchno0-0-1521723886214.deletedelta
-rw-rw-r--+ 3 hdfs hive 87989 2018-03-22 18:20
/user/hive/warehouse/carbon.store/public/c_compact4/Fact/Part0/Segment_0/part-0-1_batchno0-0-1521723019528.deletedelta
-rw-r--r-- 3 hdfs hive 0 2018-03-22 19:35
/user/hive/warehouse/carbon.store/public/c_compact4/Fact/Part0/Segment_0/part-0-1_batchno0-0-1521723886214.deletedelta
-rw-rw-r--+ 3 hdfs hive 87989 2018-03-22 18:20
/user/hive/warehouse/carbon.store/public/c_compact4/Fact/Part0/Segment_0/part-0-2_batchno0-0-1521723019528.deletedelta
-rw-r--r-- 3 hdfs hive 0 2018-03-22 19:35
/user/hive/warehouse/carbon.store/public/c_compact4/Fact/Part0/Segment_0/part-0-2_batchno0-0-1521723886214.deletedelta
-----------------------------------------------------------
Method / steps used to reproduce:
Writing the content of the delete delta failed, but the deletedelta file itself
was created successfully. The failure happened during Horizontal Compaction (I
set a space quota in HDFS via setSpaceQuota so that the file could still be
created but the write to it would fail).
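For reference, a minimal sketch of the sequence I used, in the same Carbon
spark-shell session as the query above. The table schema matches c_compact4
from the trace below; the load path, quota value and WHERE predicate are only
illustrative placeholders, not the exact values from my run.

// 1. Create and load the table, then run UPDATE/DELETE operations so that
//    delete delta files exist and Horizontal Compaction is triggered.
carbon.sql("CREATE TABLE public.c_compact4 (id STRING, qqnum STRING, nick STRING, age STRING, gender STRING, auth STRING, qunnum STRING, mvcc STRING) STORED BY 'carbondata'")
carbon.sql("LOAD DATA INPATH 'hdfs:///tmp/c_compact4_input.csv' INTO TABLE public.c_compact4")

// 2. Outside the shell, restrict the space available to the table directory so
//    that the next delete delta file can be created but its content cannot be
//    written (quota value is illustrative):
//    hdfs dfsadmin -setSpaceQuota 16384 /user/hive/warehouse/carbon.store/public/c_compact4

// 3. Run another DELETE/UPDATE; Horizontal Compaction fails and leaves the
//    0-byte .deletedelta files behind (the 19:35 entries in the listing above).
carbon.sql("DELETE FROM public.c_compact4 WHERE id = '1'")

// 4. Any later scan of the table now fails with 'Problem in loading segment blocks':
carbon.sql("select count(*) from public.c_compact4").show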
org.apache.spark.sql.catalyst.errors.package$TreeNodeException: execute,
tree:
Exchange SinglePartition
+- *HashAggregate(keys=[], functions=[partial_count(1)],
output=[count#1443L])
+- *BatchedScan CarbonDatasourceHadoopRelation [ Database name :public,
Table name :c_compact4, Schema
:Some(StructType(StructField(id,StringType,true),
StructField(qqnum,StringType,true), StructField(nick,StringType,true),
StructField(age,StringType,true), StructField(gender,StringType,true),
StructField(auth,StringType,true), StructField(qunnum,StringType,true),
StructField(mvcc,StringType,true))) ] public.c_compact4[]
at
org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:56)
at
org.apache.spark.sql.execution.exchange.ShuffleExchange.doExecute(ShuffleExchange.scala:112)
at
org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
at
org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
at
org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:135)
at
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at
org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:132)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:113)
at
org.apache.spark.sql.execution.InputAdapter.inputRDDs(WholeStageCodegenExec.scala:235)
at
org.apache.spark.sql.execution.aggregate.HashAggregateExec.inputRDDs(HashAggregateExec.scala:141)
at
org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:372)
at
org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
at
org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
at
org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:135)
at
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at
org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:132)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:113)
at
org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:225)
at
org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:308)
at
org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:113)
at
org.apache.spark.sql.Dataset$$anonfun$org$apache$spark$sql$Dataset$$execute$1$1.apply(Dataset.scala:2386)
at
org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
at org.apache.spark.sql.Dataset.withNewExecutionId(Dataset.scala:2788)
at
org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$execute$1(Dataset.scala:2385)
at
org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collect(Dataset.scala:2392)
at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:2128)
at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:2127)
at org.apache.spark.sql.Dataset.withTypedCallback(Dataset.scala:2818)
at org.apache.spark.sql.Dataset.head(Dataset.scala:2127)
at org.apache.spark.sql.Dataset.take(Dataset.scala:2342)
at org.apache.spark.sql.Dataset.showString(Dataset.scala:248)
at org.apache.spark.sql.Dataset.show(Dataset.scala:638)
at org.apache.spark.sql.Dataset.show(Dataset.scala:597)
at org.apache.spark.sql.Dataset.show(Dataset.scala:606)
... 48 elided
Caused by: java.io.IOException: Problem in loading segment blocks.
at
org.apache.carbondata.core.indexstore.BlockletDataMapIndexStore.getAll(BlockletDataMapIndexStore.java:153)
at
org.apache.carbondata.core.indexstore.blockletindex.BlockletDataMapFactory.getDataMaps(BlockletDataMapFactory.java:76)
at
org.apache.carbondata.core.datamap.TableDataMap.prune(TableDataMap.java:72)
at
org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getDataBlocksOfSegment(CarbonTableInputFormat.java:739)
at
org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getSplits(CarbonTableInputFormat.java:666)
at
org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getSplits(CarbonTableInputFormat.java:426)
at
org.apache.carbondata.spark.rdd.CarbonScanRDD.getPartitions(CarbonScanRDD.scala:107)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
at
org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
at
org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
at org.apache.spark.ShuffleDependency.<init>(Dependency.scala:91)
at
org.apache.spark.sql.execution.exchange.ShuffleExchange$.prepareShuffleDependency(ShuffleExchange.scala:273)
at
org.apache.spark.sql.execution.exchange.ShuffleExchange.prepareShuffleDependency(ShuffleExchange.scala:84)
at
org.apache.spark.sql.execution.exchange.ShuffleExchange$$anonfun$doExecute$1.apply(ShuffleExchange.scala:121)
at
org.apache.spark.sql.execution.exchange.ShuffleExchange$$anonfun$doExecute$1.apply(ShuffleExchange.scala:112)
at
org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:52)
... 81 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
at
org.apache.carbondata.core.datastore.filesystem.AbstractDFSCarbonFile.getLocations(AbstractDFSCarbonFile.java:509)
at
org.apache.carbondata.core.indexstore.BlockletDataMapIndexStore.getAll(BlockletDataMapIndexStore.java:142)
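The ArrayIndexOutOfBoundsException: 0 is consistent with
AbstractDFSCarbonFile.getLocations asking HDFS for the block locations of one
of the 0-byte delete delta files: a zero-length file has no blocks, so the
returned array is empty and index 0 is out of bounds. A minimal sketch of that
behaviour against the plain Hadoop FileSystem API (the path is the 0-byte file
from the listing above; this only illustrates the suspected failure mode, it is
not the CarbonData code itself):

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

val fs = FileSystem.get(new Configuration())
val zeroByteDelta = new Path(
  "/user/hive/warehouse/carbon.store/public/c_compact4/Fact/Part0/Segment_0/" +
  "part-0-0_batchno0-0-1521723886214.deletedelta")

val status = fs.getFileStatus(zeroByteDelta)
// A 0-byte file has no blocks, so this array comes back empty ...
val locations = fs.getFileBlockLocations(status, 0, status.getLen)
// ... and reading element 0 (as getLocations appears to do) throws
// java.lang.ArrayIndexOutOfBoundsException: 0
val hosts = locations(0).getHosts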
*The following points need to be handled to fix this issue:*
1. When Horizontal Compaction fails, the 0-byte delete delta file should be
deleted; currently it is not. This is the cleanup part of a Horizontal
Compaction failure.
2. A delete delta of 0 bytes should not be considered while reading (we can
discuss this solution further; a sketch follows this list). Currently the
tablestatus file records the deletedelta timestamp even for the 0-byte file.
3. If a delete is in progress, the file has been created (the NameNode has an
entry for it) but the data is still being written (not yet flushed), and a
select query triggered at the same time will fail, so this scenario also needs
to be handled.
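As a starting point for point 2, here is a hedged sketch of the kind of guard I
have in mind; the helper name is made up, and where exactly it would be wired
in (the delta file listing used by the reader / BlockletDataMapIndexStore) is
still open for discussion.

import org.apache.hadoop.fs.{FileStatus, FileSystem, Path}

// Hypothetical helper: keep only delete delta files that actually contain
// data. A 0-byte .deletedelta left behind by a failed Horizontal Compaction
// would be filtered out here instead of reaching getLocations() / the reader.
def usableDeleteDeltas(fs: FileSystem, segmentDir: Path): Array[FileStatus] =
  fs.listStatus(segmentDir)
    .filter(_.getPath.getName.endsWith(".deletedelta"))
    .filter(_.getLen > 0)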
@dev: Please let me know if you need any other details.
Thanks
Babu