Hi,
when I run the following script:

scala> val dataFilePath = new File("/carbondata/pt/sample.csv").getCanonicalPath
scala> cc.sql(s"load data inpath '$dataFilePath' into table test_table")

it turns out:

org.apache.carbondata.processing.etl.DataLoadingException: The input file does not exist: hdfs://master:9000hdfs://master/opt/data/carbondata/pt/sample.csv
    at org.apache.spark.util.FileUtils$$anonfun$getPaths$1.apply$mcVI$sp(FileUtils.scala:66)
    at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)

What confuses me is why the string "hdfs://master:9000" appears before "hdfs://master/opt/data/carbondata/pt/sample.csv". I can't find any configuration that contains "hdfs://master:9000". Could anyone help me? |
Administrator
|
Hi
This is because you are using cluster mode, but the input file is a local file.
1. If you use cluster mode, please load HDFS files.
2. If you just want to load local files, please use local mode. |
|
Well, in the source code of carbondata, the file type is determined as:
if (property.startsWith(CarbonUtil.HDFS_PREFIX)) { storeDefaultFileType = FileType.HDFS; }

and CarbonUtil.HDFS_PREFIX = "hdfs://", but when I run the following script, the dataFilePath is still local:

scala> val dataFilePath = new File("hdfs://master:9000/carbondata/sample.csv").getCanonicalPath
dataFilePath: String = /home/hadoop/carbondata/hdfs:/master:9000/carbondata/sample.csv |
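The mangled path above is standard java.io.File behavior, not anything CarbonData-specific. A minimal sketch that reproduces it, runnable in any Scala REPL:

```scala
import java.io.File

// java.io.File knows nothing about HDFS URIs: it treats the whole string as a
// local path, collapses the "//" into a single "/", and resolves the result
// against the current working directory when canonicalizing.
val mangled = new File("hdfs://master:9000/carbondata/sample.csv").getCanonicalPath

// The scheme survives only as a mangled path component, e.g.
// /home/hadoop/carbondata/hdfs:/master:9000/carbondata/sample.csv
assert(mangled.contains("hdfs:/master:9000"))
assert(!mangled.startsWith("hdfs://"))
```

So the `startsWith(CarbonUtil.HDFS_PREFIX)` check can never match a path that was round-tripped through `new File(...).getCanonicalPath`.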
Please find the following item in the carbon.properties file and give it a proper path (hdfs://master:9000/):

carbon.ddl.base.hdfs.url

During loading, this url and the data file path will be combined. BTW, better to provide the version number.
Best Regards
David Cai |
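That combining step also explains the doubled prefix in the original error message: if the configured base url is prepended to a path that already carries an hdfs:// scheme, the two are simply concatenated. A hypothetical sketch of the resolution logic (illustrative names only, not CarbonData's actual code):

```scala
// Assumed base url, as configured via carbon.ddl.base.hdfs.url.
val baseHdfsUrl = "hdfs://master:9000"

// Sketch: prepend the base url only to paths that are not already full HDFS URIs.
def resolve(path: String): String =
  if (path.startsWith("hdfs://")) path   // already absolute on HDFS: use as-is
  else baseHdfsUrl + path                // bare path: combine with the base url

assert(resolve("/carbondata/pt/sample.csv") ==
  "hdfs://master:9000/carbondata/pt/sample.csv")

// Naive concatenation without that guard would yield the doubled prefix
// seen in the exception: "hdfs://master:9000" + "hdfs://master/opt/..."
assert(baseHdfsUrl + "hdfs://master/opt/data/carbondata/pt/sample.csv" ==
  "hdfs://master:9000hdfs://master/opt/data/carbondata/pt/sample.csv")
```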
Hi 251469031,
Thanks for showing interest in carbon. For your question, please refer to the explanation below.

scala> val dataFilePath = new File("hdfs://master:9000/carbondata/sample.csv").getCanonicalPath
dataFilePath: String = /home/hadoop/carbondata/hdfs:/master:9000/carbondata/sample.csv

If you use new File, it will always return a pointer to a path on the local file system. So in case you are not appending the hdfs url to the file/folder path in the LOAD DATA DDL command, you can configure *carbon.ddl.base.hdfs.url* in the carbon.properties file as suggested by QiangCai.

*carbon.ddl.base.hdfs.url=hdfs://<IP>:<port>*

example: *carbon.ddl.base.hdfs.url=hdfs://9.82.101.42:54310*

Regards
Manish Gupta |
Oh I see, I've solved it, thx very much to Manish & QiangCai~~
here is my dml script:

cc.sql(s"load data inpath 'hdfs://master:9000/carbondata/pt/sample.csv' into table test_table") |
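To summarize the fix: keep the HDFS URI as a plain string instead of round-tripping it through java.io.File, which would mangle the scheme. A minimal sketch of building the LOAD DATA statement (cc is the CarbonContext from the spark shell session above):

```scala
// Keep the full HDFS URI as-is; do NOT pass it through new File(...).getCanonicalPath.
val dataFilePath = "hdfs://master:9000/carbondata/pt/sample.csv"

// Build the DDL with Scala string interpolation, exactly as in the thread.
val loadSql = s"load data inpath '$dataFilePath' into table test_table"
assert(loadSql ==
  "load data inpath 'hdfs://master:9000/carbondata/pt/sample.csv' into table test_table")

// cc.sql(loadSql)  // run inside the CarbonData spark shell
```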