
Re: Problem while copying file from local store to carbon store

Posted by Liang Chen on Jan 10, 2017; 7:22am
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/load-data-error-from-csv-file-at-hdfs-error-in-standalone-spark-cluster-tp5783p5868.html

Hi liyinwei

Very good! You are one of the fastest learners of Apache CarbonData I have met!
Could you raise a discussion on the mailing list about improving the log info
you mentioned?
Looking forward to seeing your code contribution :)

Regards
Liang

2017-01-10 14:44 GMT+08:00 251469031 <[hidden email]>:

> Thanks, Chen Liang.
>
>
> I've solved the problem, here is my record:
>
>
> first,
>
>
> I found that the Spark job failed when loading data, with the error
> "CarbonDataWriterException: Problem while copying file from local store to
> carbon store". Locating the source at
> ./processing/src/main/java/org/apache/carbondata/processing/store/writer/AbstractFactDataWriter,
> it shows:
>
>
> private void copyCarbonDataFileToCarbonStorePath(String localFileName)
>     throws CarbonDataWriterException {
>   long copyStartTime = System.currentTimeMillis();
>   LOGGER.info("Copying " + localFileName + " --> "
>       + dataWriterVo.getCarbonDataDirectoryPath());
>   try {
>     CarbonFile localCarbonFile = FileFactory.getCarbonFile(localFileName,
>         FileFactory.getFileType(localFileName));
>     String carbonFilePath = dataWriterVo.getCarbonDataDirectoryPath()
>         + localFileName.substring(localFileName.lastIndexOf(File.separator));
>     copyLocalFileToCarbonStore(carbonFilePath, localFileName,
>         CarbonCommonConstants.BYTEBUFFER_SIZE,
>         getMaxOfBlockAndFileSize(fileSizeInBytes, localCarbonFile.getSize()));
>   } catch (IOException e) {
>     throw new CarbonDataWriterException(
>         "Problem while copying file from local store to carbon store");
>   }
>   LOGGER.info("Total copy time (ms) to copy file " + localFileName + " is "
>       + (System.currentTimeMillis() - copyStartTime));
> }
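As an aside, the substring call in the method above is what derives the target file name: it appends everything from the last File.separator of the local file name onto the store directory. A tiny standalone illustration of that logic (the directory and file names here are made up for the example):

```java
import java.io.File;

public class TargetPathDemo {
    // Mirrors the path construction in copyCarbonDataFileToCarbonStorePath:
    // append the local file's base name (with its leading separator)
    // to the carbon store directory.
    static String targetPath(String storeDir, String localFileName) {
        return storeDir
            + localFileName.substring(localFileName.lastIndexOf(File.separator));
    }

    public static void main(String[] args) {
        String local = "/tmp/carbon/part-0-0-1484015398000.carbondata";
        // On a Unix-like system File.separator is "/", so this prints
        // /store/Fact/Part0/Segment_0/part-0-0-1484015398000.carbondata
        System.out.println(targetPath("/store/Fact/Part0/Segment_0", local));
    }
}
```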
>
>
>
> the main reason is that the method copyLocalFileToCarbonStore causes an
> IOException, but the catch block doesn't tell me the real reason for
> the error (at this moment, I really like technical logs more
> than business logs). So I added some logging:
> ...
> } catch (IOException e) {
>   LOGGER.info("-------------------logs print by liyinwei start---------------------");
>   LOGGER.error(e, "");
>   LOGGER.info("-------------------logs print by liyinwei end---------------------");
>   throw new CarbonDataWriterException(
>       "Problem while copying file from local store to carbon store");
> }
>
>
> then I rebuilt the source code, and it logged as follows:
>
>
> INFO  10-01 10:29:59,546 - [test_table: Graph - MDKeyGentest_table][partitionID:0] -------------------logs print by liyinwei start---------------------
> ERROR 10-01 10:29:59,547 - [test_table: Graph - MDKeyGentest_table][partitionID:0]
> java.io.FileNotFoundException: /home/hadoop/carbondata/bin/carbonshellstore/default/test_table/Fact/Part0/Segment_0/part-0-0-1484015398000.carbondata (No such file or directory)
>         at java.io.FileOutputStream.open0(Native Method)
>         ...
> INFO  10-01 10:29:59,547 - [test_table: Graph - MDKeyGentest_table][partitionID:0] -------------------logs print by liyinwei end---------------------
> ERROR 10-01 10:29:59,547 - [test_table: Graph - MDKeyGentest_table][partitionID:0] Problem while copying file from local store to carbon store
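Incidentally, the temporary log lines would be unnecessary if the rethrow chained the original IOException as its cause; the FileNotFoundException above would then surface in the normal stack trace. A minimal sketch of that pattern (DataWriterException here is a hypothetical stand-in; whether the real CarbonDataWriterException accepts a cause argument depends on the CarbonData version):

```java
import java.io.FileNotFoundException;
import java.io.IOException;

public class CopyErrorDemo {

    // Hypothetical stand-in for CarbonDataWriterException; assumes a
    // (message, cause) constructor like most exception types provide.
    static class DataWriterException extends RuntimeException {
        DataWriterException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    static void copyFile(String localFileName) {
        try {
            // Simulate the failing copy: the target path does not exist.
            throw new FileNotFoundException(
                localFileName + " (No such file or directory)");
        } catch (IOException e) {
            // Chaining e preserves the real reason in the stack trace,
            // instead of discarding it as the original catch block did.
            throw new DataWriterException(
                "Problem while copying file from local store to carbon store", e);
        }
    }

    public static void main(String[] args) {
        try {
            copyFile("/tmp/part-0-0.carbondata");
        } catch (DataWriterException e) {
            // The root cause is still visible to whoever catches it.
            System.out.println("cause: " + e.getCause().getMessage());
        }
    }
}
```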
>
>
>
> second,
>
>
> as you can see, the root cause of the error is a FileNotFoundException,
> which means the target file was not found under the store path. With the
> help of Liang Chen & Brave heart, I found that the default carbondata
> storePath is as below if we start the spark-shell by using
> carbon-spark-shell:
> scala> print(cc.storePath)
> /home/hadoop/carbondata/bin/carbonshellstore
>
>
>
> so I added a parameter when starting carbon-spark-shell:
> ./bin/carbon-spark-shell --conf spark.carbon.storepath=hdfs://master:9000/home/hadoop/carbondata/bin/carbonshellstore
>
>
> and then print the storePath:
> scala> print(cc.storePath)
> hdfs://master:9000/home/hadoop/carbondata/bin/carbonshellstore
>
>
>
>
>
> finally,
>
>
> I run the command
>
>
> cc.sql(s"load data inpath 'hdfs://master:9000/home/hadoop/sample.csv' into table test_table")
>
>
> again, and it succeeded, as verified by:
>
>
> cc.sql("select * from test_table").show
>
>
>
>
>
>
> ------------------ Original ------------------
> From:  "Liang Chen";<[hidden email]>;
> Date:  Tue, Jan 10, 2017 12:11 PM
> To:  "dev"<[hidden email]>;
>
> Subject:  Re: Problem while copying file from local store to carbon store
>
>
>
> Hi
>
> Please use spark-shell to create carboncontext, you can refer to these
> articles :
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=67635497
>
> Regards
> Liang
>
>
>
> --
> View this message in context: http://apache-carbondata-mailing-list-archive.1130556.n5.nabble.com/load-data-error-from-csv-file-at-hdfs-error-in-standalone-spark-cluster-tp5783p5844.html
> Sent from the Apache CarbonData Mailing List archive at Nabble.com.




--
Regards
Liang