Hi Dev team,
I ran spark-shell in local Spark standalone mode and got the error

    java.io.IOException: No input paths specified in job

when trying to save the DataFrame (df) to a CarbonData table. Am I missing any setting for the path?

==========================================================================================================================

scala> df.write.format("carbondata").option("tableName", "MyCarbon1").option("compress", "true").option("useKettle", "false").mode(SaveMode.Overwrite).save()

INFO 13-12 13:58:12,899 - main Query [
CREATE TABLE IF NOT EXISTS DEFAULT.MYCARBON1
(VIN STRING, DATA_DATE STRING, WORK_MODEL DOUBLE)
STORED BY 'ORG.APACHE.CARBONDATA.FORMAT'
]
INFO 13-12 13:58:13,060 - Removed broadcast_0_piece0 on localhost:56692 in memory (size: 19.5 KB, free: 143.2 MB)
INFO 13-12 13:58:13,081 - Parsing command:
CREATE TABLE IF NOT EXISTS default.MyCarbon1
(vin STRING, data_date STRING, work_model DOUBLE)
STORED BY 'org.apache.carbondata.format'
INFO 13-12 13:58:14,008 - Parse Completed
AUDIT 13-12 13:58:14,326 - [lumac.local][lucao][Thread-1]Creating Table with Database name [default] and Table name [mycarbon1]
INFO 13-12 13:58:14,335 - 0: get_tables: db=default pat=.*
INFO 13-12 13:58:14,335 - ugi=lucao ip=unknown-ip-addr cmd=get_tables: db=default pat=.*
INFO 13-12 13:58:14,342 - main Table block size not specified for default_mycarbon1. Therefore considering the default value 1024 MB
INFO 13-12 13:58:14,434 - Table mycarbon1 for Database default created successfully.
INFO 13-12 13:58:14,434 - main Table mycarbon1 for Database default created successfully.
INFO 13-12 13:58:14,440 - main Query [CREATE TABLE DEFAULT.MYCARBON1 USING CARBONDATA OPTIONS (TABLENAME "DEFAULT.MYCARBON1", TABLEPATH "HDFS://LOCALHOST:9000/USER/LUCAO/DEFAULT/MYCARBON1") ]
INFO 13-12 13:58:14,452 - 0: get_table : db=default tbl=mycarbon1
INFO 13-12 13:58:14,452 - ugi=lucao ip=unknown-ip-addr cmd=get_table : db=default tbl=mycarbon1
WARN 13-12 13:58:14,463 - Couldn't find corresponding Hive SerDe for data source provider carbondata. Persisting data source relation `default`.`mycarbon1` into Hive metastore in Spark SQL specific format, which is NOT compatible with Hive.
INFO 13-12 13:58:14,588 - 0: create_table: Table(tableName:mycarbon1, dbName:default, owner:lucao, createTime:1481608694, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:col, type:array<string>, comment:from deserializer)], location:null, inputFormat:org.apache.hadoop.mapred.SequenceFileInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.MetadataTypedColumnsetSerDe, parameters:{tablePath=hdfs://localhost:9000/user/lucao/default/mycarbon1, serialization.format=1, tableName=default.mycarbon1}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{})), partitionKeys:[], parameters:{EXTERNAL=TRUE, spark.sql.sources.provider=carbondata}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE, privileges:PrincipalPrivilegeSet(userPrivileges:{}, groupPrivileges:null, rolePrivileges:null))
INFO 13-12 13:58:14,588 - ugi=lucao ip=unknown-ip-addr cmd=create_table: Table(tableName:mycarbon1, dbName:default, owner:lucao, createTime:1481608694, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:col, type:array<string>, comment:from deserializer)], location:null, inputFormat:org.apache.hadoop.mapred.SequenceFileInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.MetadataTypedColumnsetSerDe, parameters:{tablePath=hdfs://localhost:9000/user/lucao/default/mycarbon1, serialization.format=1, tableName=default.mycarbon1}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{})), partitionKeys:[], parameters:{EXTERNAL=TRUE, spark.sql.sources.provider=carbondata}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE, privileges:PrincipalPrivilegeSet(userPrivileges:{}, groupPrivileges:null, rolePrivileges:null))
INFO 13-12 13:58:14,598 - Creating directory if it doesn't exist: hdfs://localhost:9000/user/hive/warehouse/mycarbon1
AUDIT 13-12 13:58:14,717 - [lumac.local][lucao][Thread-1]Table created with Database name [default] and Table name [mycarbon1]
INFO 13-12 13:58:14,767 - mapred.output.compress is deprecated. Instead, use mapreduce.output.fileoutputformat.compress
INFO 13-12 13:58:14,767 - mapred.output.compression.codec is deprecated. Instead, use mapreduce.output.fileoutputformat.compress.codec
INFO 13-12 13:58:14,767 - mapred.output.compression.type is deprecated. Instead, use mapreduce.output.fileoutputformat.compress.type
INFO 13-12 13:58:14,781 - mapred.tip.id is deprecated. Instead, use mapreduce.task.id
INFO 13-12 13:58:14,781 - mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
INFO 13-12 13:58:14,782 - mapred.task.is.map is deprecated. Instead, use mapreduce.task.ismap
INFO 13-12 13:58:14,782 - mapred.task.partition is deprecated. Instead, use mapreduce.task.partition
INFO 13-12 13:58:14,782 - mapred.job.id is deprecated. Instead, use mapreduce.job.id
java.io.IOException: No input paths specified in job
  at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:201)
  at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:313)
  at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:199)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
  at scala.Option.getOrElse(Option.scala:120)
  at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
  at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
  at scala.Option.getOrElse(Option.scala:120)
  at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
  at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
  at scala.Option.getOrElse(Option.scala:120)
  at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
  at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
  at scala.Option.getOrElse(Option.scala:120)
  at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
  at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
  at scala.Option.getOrElse(Option.scala:120)
  at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
  at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
  at scala.Option.getOrElse(Option.scala:120)
  at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
  at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
  at scala.Option.getOrElse(Option.scala:120)
  at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
  at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
  at scala.Option.getOrElse(Option.scala:120)
  at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
  at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
  at scala.Option.getOrElse(Option.scala:120)
  at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
  at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
  at scala.Option.getOrElse(Option.scala:120)
  at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
  at org.apache.spark.SparkContext.runJob(SparkContext.scala:1922)
  at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1.apply$mcV$sp(PairRDDFunctions.scala:1213)
  at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1.apply(PairRDDFunctions.scala:1156)
  at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1.apply(PairRDDFunctions.scala:1156)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
  at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
  at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopDataset(PairRDDFunctions.scala:1156)
  at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopFile$4.apply$mcV$sp(PairRDDFunctions.scala:1060)
  at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopFile$4.apply(PairRDDFunctions.scala:1026)
  at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopFile$4.apply(PairRDDFunctions.scala:1026)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
  at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
  at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:1026)
  at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopFile$3.apply$mcV$sp(PairRDDFunctions.scala:1007)
  at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopFile$3.apply(PairRDDFunctions.scala:1007)
  at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopFile$3.apply(PairRDDFunctions.scala:1007)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
  at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
  at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:1006)
  at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopFile$2.apply$mcV$sp(PairRDDFunctions.scala:964)
  at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopFile$2.apply(PairRDDFunctions.scala:962)
  at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopFile$2.apply(PairRDDFunctions.scala:962)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
  at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
  at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:962)
  at org.apache.spark.rdd.RDD$$anonfun$saveAsTextFile$2.apply$mcV$sp(RDD.scala:1461)
  at org.apache.spark.rdd.RDD$$anonfun$saveAsTextFile$2.apply(RDD.scala:1449)
  at org.apache.spark.rdd.RDD$$anonfun$saveAsTextFile$2.apply(RDD.scala:1449)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
  at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
  at org.apache.spark.rdd.RDD.saveAsTextFile(RDD.scala:1449)
  at com.databricks.spark.csv.package$CsvSchemaRDD.saveAsCsvFile(package.scala:170)
  at com.databricks.spark.csv.newapi.DefaultSource.createRelation(DefaultSource.scala:177)
  at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:222)
  at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:148)
  at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:139)
  at org.apache.carbondata.spark.CarbonDataFrameWriter.writeToTempCSVFile(CarbonDataFrameWriter.scala:116)
  at org.apache.carbondata.spark.CarbonDataFrameWriter.loadTempCSV(CarbonDataFrameWriter.scala:72)
  at org.apache.carbondata.spark.CarbonDataFrameWriter.writeToCarbonFile(CarbonDataFrameWriter.scala:52)
  at org.apache.carbondata.spark.CarbonDataFrameWriter.saveAsCarbonFile(CarbonDataFrameWriter.scala:39)
  at org.apache.spark.sql.CarbonSource.createRelation(CarbonDatasourceRelation.scala:112)
  at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:222)
  at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:148)
  at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:46)
  at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:51)
  at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:53)
  at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:55)
  at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:57)
  at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:59)
  at $iwC$$iwC$$iwC$$iwC.<init>(<console>:61)
  at $iwC$$iwC$$iwC.<init>(<console>:63)
  at $iwC$$iwC.<init>(<console>:65)
  at $iwC.<init>(<console>:67)
  at <init>(<console>:69)
  at .<init>(<console>:73)
  at .<clinit>(<console>)
  at .<init>(<console>:7)
  at .<clinit>(<console>)
  at $print(<console>)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
  at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1346)
  at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
  at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
  at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
  at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
  at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
  at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
  at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657)
  at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665)
  at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:670)
  at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:997)
  at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
  at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
  at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
  at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
  at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
  at org.apache.spark.repl.Main$.main(Main.scala:31)
  at org.apache.spark.repl.Main.main(Main.scala)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
  at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
  at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
  at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
  at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Thanks,
Lionel
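[For reference, the call above as a self-contained spark-shell session might look roughly like the sketch below. Only the final save() call and the (vin, data_date, work_model) schema come from the log; the CarbonContext setup, store path, and sample rows are assumptions added for illustration.]

// Hypothetical reconstruction -- only the write call and the column names are from the mail above.
import org.apache.spark.sql.{CarbonContext, SaveMode}

// Store path is assumed; the log resolves the table path under hdfs://localhost:9000/user/lucao/default/mycarbon1.
val cc = new CarbonContext(sc, "hdfs://localhost:9000/user/lucao")
import cc.implicits._

// Small in-memory DataFrame matching the schema shown in the CREATE TABLE log.
val df = sc.parallelize(Seq(
  ("vin_001", "2016-12-01", 1.0),
  ("vin_002", "2016-12-02", 2.0)
)).toDF("vin", "data_date", "work_model")

// The call from the mail, unchanged.
df.write.format("carbondata").option("tableName", "MyCarbon1").option("compress", "true").option("useKettle", "false").mode(SaveMode.Overwrite).save()

[The stack trace shows this save going through CarbonDataFrameWriter.writeToTempCSVFile and spark-csv's saveAsTextFile, and the exception is raised while computing the partitions of a HadoopRDD underneath the DataFrame.]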
Hi
As discussed, please use the 0.2.0 version and use the load method to load the data.
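[The reply does not spell out the exact calls, so below is a minimal sketch of the load method as it presumably looks with the CarbonContext SQL API of the 0.x releases. The store path, CSV location, and load OPTIONS are assumptions; only the table name and schema come from the thread.]

import org.apache.spark.sql.CarbonContext

// Store path is assumed.
val cc = new CarbonContext(sc, "hdfs://localhost:9000/user/lucao")

// Create the table through Carbon's SQL dialect instead of the DataFrame writer.
cc.sql("""
  CREATE TABLE IF NOT EXISTS default.mycarbon1 (
    vin STRING,
    data_date STRING,
    work_model DOUBLE
  )
  STORED BY 'carbondata'
""")

// Load a CSV that already sits on HDFS; the path and options are placeholders.
cc.sql("""
  LOAD DATA INPATH 'hdfs://localhost:9000/user/lucao/input/mycarbon1.csv'
  INTO TABLE default.mycarbon1
  OPTIONS('DELIMITER'=',', 'FILEHEADER'='vin,data_date,work_model')
""")

cc.sql("SELECT * FROM default.mycarbon1").show()

[This goes through Carbon's data-load flow directly rather than the DataFrame writer's temporary-CSV path seen in the stack trace above.]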
--
Regards
Liang