Chetan Bhat created CARBONDATA-1032:
--------------------------------------- Summary: NumberFormatException and NegativeArraySizeException for select with in clause filter limit for unsafe true configuration Key: CARBONDATA-1032 URL: https://issues.apache.org/jira/browse/CARBONDATA-1032 Project: CarbonData Issue Type: Bug Components: data-query Affects Versions: 0.1.1-incubating Environment: 3 node cluster SUSE 11 SP4 Reporter: Chetan Bhat Carbon .properties are configured as below: carbon.allowed.compaction.days = 2 carbon.enable.auto.load.merge = false carbon.compaction.level.threshold = 3,2 carbon.timestamp.format = yyyy-MM-dd carbon.badRecords.location = /tmp/carbon carbon.numberof.preserve.segments = 2 carbon.sort.file.buffer.size = 20 max.query.execution.time = 60 carbon.number.of.cores.while.loading = 8 carbon.storelocation =hdfs://hacluster/opt/CarbonStore enable.data.loading.statistics = true enable.unsafe.sort = true offheap.sort.chunk.size.inmb = 128 sort.inmemory.size.inmb = 30720 carbon.enable.vector.reader=true enable.unsafe.in.query.processing=true enable.query.statistics=true carbon.blockletgroup.size.in.mb=128 high.cardinality.identify.enable=TRUE high.cardinality.threshold=10000 high.cardinality.value=1000 high.cardinality.row.count.percentage=40 carbon.data.file.version=2 carbon.major.compaction.size=2 carbon.enable.auto.load.merge=FALSE carbon.numberof.preserve.segments=1 carbon.allowed.compaction.days=1 User creates table, loads data and executes the select with in clause filter limit. Actual Result : NumberFormatException and NegativeArraySizeException for select with in clause filter limit for unsafe true configuration. 0: jdbc:hive2://172.168.100.199:23040> select * from flow_carbon_test4 where opp_bk in ('1491999999158','1491999999116','1491999999022','1491999999031') and dt>='20140101' and dt <= '20160101' order by bal asc limit 1000; Error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 2109.0 failed 4 times, most recent failure: Lost task 1.3 in stage 2109.0 (TID 75120, linux-49, executor 2): java.lang.NegativeArraySizeException at org.apache.carbondata.core.datastore.chunk.store.impl.unsafe.UnsafeBigDecimalMeasureChunkStore.getBigDecimal(UnsafeBigDecimalMeasureChunkStore.java:132) at org.apache.carbondata.core.datastore.compression.decimal.CompressByteArray.getBigDecimalValue(CompressByteArray.java:94) at org.apache.carbondata.core.datastore.dataholder.CarbonReadDataHolder.getReadableBigDecimalValueByIndex(CarbonReadDataHolder.java:38) at org.apache.carbondata.core.scan.result.vector.MeasureDataVectorProcessor$DecimalMeasureVectorFiller.fillMeasureVectorForFilter(MeasureDataVectorProcessor.java:253) at org.apache.carbondata.core.scan.result.impl.FilterQueryScannedResult.fillColumnarMeasureBatch(FilterQueryScannedResult.java:119) at org.apache.carbondata.core.scan.collector.impl.DictionaryBasedVectorResultCollector.scanAndFillResult(DictionaryBasedVectorResultCollector.java:145) at org.apache.carbondata.core.scan.collector.impl.DictionaryBasedVectorResultCollector.collectVectorBatch(DictionaryBasedVectorResultCollector.java:137) at org.apache.carbondata.core.scan.processor.impl.DataBlockIteratorImpl.processNextBatch(DataBlockIteratorImpl.java:65) at org.apache.carbondata.core.scan.result.iterator.VectorDetailQueryResultIterator.processNextBatch(VectorDetailQueryResultIterator.java:46) at org.apache.carbondata.spark.vectorreader.VectorizedCarbonRecordReader.nextBatch(VectorizedCarbonRecordReader.java:251) at org.apache.carbondata.spark.vectorreader.VectorizedCarbonRecordReader.nextKeyValue(VectorizedCarbonRecordReader.java:141) at org.apache.carbondata.spark.rdd.CarbonScanRDD$$anon$1.hasNext(CarbonScanRDD.scala:221) at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.scan_nextBatch$(Unknown Source) at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source) at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43) at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:377) at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408) at scala.collection.convert.Wrappers$IteratorWrapper.hasNext(Wrappers.scala:30) at org.spark_project.guava.collect.Ordering.leastOf(Ordering.java:628) at org.apache.spark.util.collection.Utils$.takeOrdered(Utils.scala:37) at org.apache.spark.sql.execution.TakeOrderedAndProjectExec$$anonfun$5.apply(limit.scala:148) at org.apache.spark.sql.execution.TakeOrderedAndProjectExec$$anonfun$5.apply(limit.scala:147) at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:796) at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:796) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53) at org.apache.spark.scheduler.Task.run(Task.scala:99) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Driver stacktrace: (state=,code=0) 0: jdbc:hive2://172.168.100.199:23040> select * from flow_carbon_test4 where cus_ac like '622262135067246539%' and (txn_dte>='20150101' and txn_dte<='20160101') and txn_bk IN ('00000000000', '00000000001','00000000002') OR own_bk IN ('00000000424','00000001383','00000001942','00000001262') limit 1000; Error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 131.0 failed 4 times, most recent failure: Lost task 0.3 in stage 131.0 (TID 240, linux-51, executor 1): java.lang.NumberFormatException: Zero length BigInteger at java.math.BigInteger.<init>(BigInteger.java:293) at org.apache.carbondata.core.util.DataTypeUtil.byteToBigDecimal(DataTypeUtil.java:189) at org.apache.carbondata.core.datastore.chunk.store.impl.unsafe.UnsafeBigDecimalMeasureChunkStore.getBigDecimal(UnsafeBigDecimalMeasureChunkStore.java:136) at org.apache.carbondata.core.datastore.compression.decimal.CompressByteArray.getBigDecimalValue(CompressByteArray.java:94) at org.apache.carbondata.core.datastore.dataholder.CarbonReadDataHolder.getReadableBigDecimalValueByIndex(CarbonReadDataHolder.java:38) at org.apache.carbondata.core.scan.collector.impl.AbstractScannedResultCollector.getMeasureData(AbstractScannedResultCollector.java:104) at org.apache.carbondata.core.scan.collector.impl.AbstractScannedResultCollector.fillMeasureData(AbstractScannedResultCollector.java:78) at org.apache.carbondata.core.scan.collector.impl.DictionaryBasedResultCollector.fillMeasureData(DictionaryBasedResultCollector.java:158) at org.apache.carbondata.core.scan.collector.impl.DictionaryBasedResultCollector.collectData(DictionaryBasedResultCollector.java:115) at org.apache.carbondata.core.scan.processor.impl.DataBlockIteratorImpl.next(DataBlockIteratorImpl.java:51) at org.apache.carbondata.core.scan.processor.impl.DataBlockIteratorImpl.next(DataBlockIteratorImpl.java:32) at org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.getBatchResult(DetailQueryResultIterator.java:50) at org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.next(DetailQueryResultIterator.java:41) at org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.next(DetailQueryResultIterator.java:31) at org.apache.carbondata.core.scan.result.iterator.ChunkRowIterator.<init>(ChunkRowIterator.java:41) at org.apache.carbondata.hadoop.CarbonRecordReader.initialize(CarbonRecordReader.java:78) at org.apache.carbondata.spark.rdd.CarbonScanRDD.compute(CarbonScanRDD.scala:204) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at org.apache.spark.sql.CarbonDecoderRDD.compute(CarbonDictionaryDecoder.scala:538) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) at org.apache.spark.scheduler.Task.run(Task.scala:99) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Driver stacktrace: (state=,code=0) Expected Result : select with in clause filter limit for unsafe true configuration should execute successfully displaying correct result set without exception. -- This message was sent by Atlassian JIRA (v6.3.15#6346) |
Free forum by Nabble | Edit this page |