Hi all,

When I run the query "select count(*) from action_carbondata where starttimestr = 20180301;" with CarbonData, an error occurs. This is the error info:

###################
0: jdbc:hive2://localhost:10000> select count(*) from action_carbondata where starttimestr = 20180301;
Error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 12 in stage 7.0 failed 4 times, most recent failure: Lost task 12.3 in stage 7.0 (TID 173, sz-pg-entanalytics-research-001.tendcloud.com, executor 1): org.apache.spark.util.TaskCompletionListenerException: org.apache.carbondata.core.scan.executor.exception.QueryExecutionException:

Previous exception in task: java.util.concurrent.ExecutionException: java.util.concurrent.ExecutionException: java.io.IOException: org.apache.thrift.protocol.TProtocolException: Required field 'data_chunk_list' was not present! Struct: DataChunk3(data_chunk_list:null)
    org.apache.carbondata.core.scan.processor.AbstractDataBlockIterator.updateScanner(AbstractDataBlockIterator.java:136)
    org.apache.carbondata.core.scan.processor.impl.DataBlockIteratorImpl.processNextBatch(DataBlockIteratorImpl.java:64)
    org.apache.carbondata.core.scan.result.iterator.VectorDetailQueryResultIterator.processNextBatch(VectorDetailQueryResultIterator.java:46)
    org.apache.carbondata.spark.vectorreader.VectorizedCarbonRecordReader.nextBatch(VectorizedCarbonRecordReader.java:283)
    org.apache.carbondata.spark.vectorreader.VectorizedCarbonRecordReader.nextKeyValue(VectorizedCarbonRecordReader.java:171)
    org.apache.carbondata.spark.rdd.CarbonScanRDD$$anon$1.hasNext(CarbonScanRDD.scala:391)
    org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.scan_nextBatch$(Unknown Source)
    org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.agg_doAggregateWithoutKey$(Unknown Source)
    org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:125)
    org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
    org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
    org.apache.spark.scheduler.Task.run(Task.scala:108)
    org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
    java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    java.lang.Thread.run(Thread.java:745)
    at org.apache.spark.TaskContextImpl.invokeListeners(TaskContextImpl.scala:138)
    at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:116)
    at org.apache.spark.scheduler.Task.run(Task.scala:118)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

Driver stacktrace: (state=,code=0)
###################

Create table statement:

CREATE TABLE action_carbondata(
  cur_appversioncode integer,
  cur_appversionname integer,
  cur_browserid integer,
  cur_carrierid integer,
  cur_channelid integer,
  cur_cityid integer,
  cur_countryid integer,
  cur_ip string,
  cur_networkid integer,
  cur_osid integer,
  cur_provinceid integer,
  deviceproductoffset long,
  duration integer,
  eventcount integer,
  eventlabelid integer,
  eventtypeid integer,
  organizationid integer,
  platformid integer,
  productid integer,
  relatedaccountproductoffset long,
  sessionduration integer,
  sessionid string,
  sessionstarttime long,
  sessionstatus integer,
  sourceid integer,
  starttime long,
  starttimestr string )
partitioned by (eventid int)
STORED BY 'carbondata'
TBLPROPERTIES ('partition_type'='Hash','NUM_PARTITIONS'='39',
'SORT_COLUMNS'='productid,sourceid,starttimestr,platformid,organizationid,eventtypeid,eventlabelid,cur_channelid,cur_provinceid,cur_countryid,cur_cityid,cur_osid,cur_appversioncode,cur_appversionname,cur_carrierid,cur_networkid,cur_browserid,sessionstatus,cur_ip');

The values of the "starttimestr" field:
20180303
20180304

Any advice is appreciated!

The CarbonData version is: apache-carbondata-1.3.1-bin-spark2.2.1-hadoop2.7.2.jar
The Spark version is: spark-2.2.1-bin-hadoop2.7
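One detail in the query above: starttimestr is declared as string, but it is compared with the unquoted numeric literal 20180301, so Spark has to add an implicit cast before it can compare the two. Nothing in this thread confirms that the cast has anything to do with the error. The variant below, with the literal quoted, is only a sketch for ruling that possibility out:

```sql
-- Same query as in the post, but comparing the string column
-- with a string literal, so no implicit type cast is needed.
select count(*) from action_carbondata where starttimestr = '20180301';
```

If both forms fail with the same TProtocolException, the literal type can probably be set aside as a factor.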
Hi,

From the log message, it seems the data files cannot be found. Can you provide more details:
1. How did you create the CarbonSession, and how did you load the data?
2. Have you deployed a cluster, or only a single machine?

Regards
Liang

-- Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/
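To look at the data-file question from the SQL side, CarbonData has a SHOW SEGMENTS command that lists the loads (segments) of a table together with their status. A minimal sketch, assuming the table is in the current database and the statement is run from the same beeline session:

```sql
-- List the segments (loads) recorded for the table, with their status.
SHOW SEGMENTS FOR TABLE action_carbondata;
```

A segment marked as successful only shows that the load was recorded. It does not by itself verify that the data files on disk are complete and readable.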
I think the problem may be metadata-related. What is your Thrift version? Have you updated the CarbonData version since the data was loaded?
In reply to this post by Liang Chen

Hi Liang Chen,

I start the Thrift server, then use beeline to execute this SQL. I load the data with "insert into XXX select * from a_parquet_table". I have deployed a YARN cluster.

Because I could not find the cause of the problem, I used "insert overwrite" to load the data again, and then the problem disappeared.

------------------ Original Message ------------------
From: "Liang Chen" <[hidden email]>
Sent: Monday, April 16, 2018, 3:51 PM
To: "dev" <[hidden email]>
Subject: Re: query on string type return error
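The reload described above can be sketched as follows. The exact statement that was run is not shown in the thread, so this is an assumption: it reuses the placeholder source name a_parquet_table from the post and assumes its columns line up with action_carbondata, including the partition column eventid.

```sql
-- Reload the table. INSERT OVERWRITE replaces the data that is
-- currently in action_carbondata, so the old load is discarded.
-- a_parquet_table is the placeholder source name used in the post.
INSERT OVERWRITE TABLE action_carbondata SELECT * FROM a_parquet_table;

-- Re-run the query that failed before the reload.
select count(*) from action_carbondata where starttimestr = 20180301;
```

This works around the error without explaining it: the thread does not establish why the first load could not be read.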
In reply to this post by xuchuanyin

I use apache-carbondata-1.3.1-bin-spark2.2.1-hadoop2.7.2.jar, which I downloaded from the website. I did not build it myself, so I don't know the Thrift version. I have not updated the CarbonData version.

------------------ Original Message ------------------
From: "xuchuanyin" <[hidden email]>
Sent: Monday, April 16, 2018, 7:04 PM
To: "carbondata" <[hidden email]>
Subject: Re: query on string type return error
In reply to this post by 喜之郎
I encountered the same error in a later CarbonData version. Have you found the root cause and a solution for this error?