Hi dev,

I have a Parquet table and a Carbon table. The table has 1 billion rows.

Parquet table:
============
CREATE TABLE mc_idx3(
  COL_1 integer, COL_2 integer, COL_3 string, COL_4 integer, COL_5 string,
  COL_6 string, COL_7 string, COL_8 string, COL_9 integer, COL_10 long,
  COL_11 string, COL_12 string, COL_13 string, COL_14 string, COL_15 integer,
  COL_16 string, COL_17 Timestamp
) STORED AS PARQUET;
==============

Carbon table:
===============
CREATE TABLE mc_idxok_cd1(
  COL_1 integer, COL_2 integer, COL_3 string, COL_4 integer, COL_5 string,
  COL_6 string, COL_7 string, COL_8 string, COL_9 integer, COL_10 long,
  COL_11 string, COL_12 string, COL_13 string, COL_14 string, COL_15 integer,
  COL_16 string, COL_17 Timestamp
) STORED BY 'carbondata'
TBLPROPERTIES ('SORT_COLUMNS'='COL_17,COL_1');
=============

When I run INSERT INTO TABLE mc_idxok_cd1 SELECT * FROM mc_idx3, it always fails.

ERROR LOG:
org.apache.carbondata.processing.loading.exception.CarbonDataLoadingException: There is an unexpected error: org.apache.carbondata.core.datastore.exception.CarbonDataWriterException: Problem while copying file from local store to carbon store
    at org.apache.carbondata.processing.loading.steps.DataWriterProcessorStepImpl.execute(DataWriterProcessorStepImpl.java:123)
    at org.apache.carbondata.processing.loading.DataLoadExecutor.execute(DataLoadExecutor.java:51)
    at org.apache.carbondata.spark.rdd.NewDataFrameLoaderRDD$$anon$2.<init>(NewCarbonDataLoadRDD.scala:390)
    at org.apache.carbondata.spark.rdd.NewDataFrameLoaderRDD.internalCompute(NewCarbonDataLoadRDD.scala:353)
    at org.apache.carbondata.spark.rdd.CarbonRDD.compute(CarbonRDD.scala:60)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:108)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.carbondata.processing.loading.exception.CarbonDataLoadingException: org.apache.carbondata.core.datastore.exception.CarbonDataWriterException: Problem while copying file from local store to carbon store
    at org.apache.carbondata.processing.loading.steps.DataWriterProcessorStepImpl.processingComplete(DataWriterProcessorStepImpl.java:162)
    at org.apache.carbondata.processing.loading.steps.DataWriterProcessorStepImpl.finish(DataWriterProcessorStepImpl.java:148)
    at org.apache.carbondata.processing.loading.steps.DataWriterProcessorStepImpl.execute(DataWriterProcessorStepImpl.java:112)
-------------------

Can anybody give me some advice? Any advice is appreciated!
component version:
CarbonData version: 1.3.1, Spark: 2.2.1
In reply to this post by 喜之郎
Hi,
The exception says there is a problem while copying from the local store to the carbon store (HDFS). That means the write has already finished in the local temp folder; after writing, the files are copied to HDFS, and the load is failing during that copy. With only this exception trace it is difficult to know the root cause of the failure; the failure can also come from HDFS itself. So you can check two things:

1. Check whether enough space is available in HDFS.
2. When this exception occurs, check what exception appears in the HDFS logs. Maybe that will give you some idea.

There is a property called *carbon.load.directWriteHdfs.enabled*. By default it is false; if you set it to true, the files are written directly to the carbon store (HDFS) instead of being written locally first and then copied. You can set this property and check whether the load succeeds, as in the sketch below.

Regards,
Akash R Nilugal

--
Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/
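A minimal sketch of the two suggestions above, for illustration only: it assumes a Spark/Carbon session is already available as `spark`, uses the HDFS root path as a stand-in for the actual carbon store location, and sets the property through CarbonProperties. Apart from the property key and the table names, none of these identifiers come from the original thread.

=============
// Scala sketch: check HDFS free space, enable direct write to HDFS, retry the load.
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.carbondata.core.util.CarbonProperties

// 1. Check remaining HDFS capacity; the copy step fails if the target file system is full.
val fs = FileSystem.get(spark.sparkContext.hadoopConfiguration)
val status = fs.getStatus(new Path("/"))   // "/" is a placeholder; point it at the carbon store path
println(s"HDFS remaining: ${status.getRemaining / (1024L * 1024 * 1024)} GB")

// 2. Write carbon files directly to HDFS instead of writing locally and then copying.
CarbonProperties.getInstance()
  .addProperty("carbon.load.directWriteHdfs.enabled", "true")

// 3. Retry the load.
spark.sql("INSERT INTO TABLE mc_idxok_cd1 SELECT * FROM mc_idx3")
=============

If the local-to-HDFS copy is the failing step, the direct-write setting removes that step entirely; if HDFS itself is short on space, the insert will still fail and the HDFS logs should show why.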