Storing Data Frame as CarbonData Table
Posted by Michael Shtelma on Mar 31, 2018; 10:15am
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Storing-Data-Frame-as-CarbonData-Table-tp43874.html
Hi Team,
I am new to CarbonData and wanted to test it using a couple of my test queries.
In my test I used CarbonData 1.3.1 and Spark 2.2.1.
I tried saving my data frame as a CarbonData table using the
following command:
myDF.write.format("carbondata").mode("overwrite").saveAsTable("MyTable")
As a result, I got the following exception:
java.lang.IllegalArgumentException: requirement failed: 'path' should
not be specified, the path to store carbon file is the 'storePath'
specified when creating CarbonContext
at scala.Predef$.require(Predef.scala:224)
at org.apache.spark.sql.CarbonSource.createRelation(CarbonSource.scala:90)
at org.apache.spark.sql.execution.datasources.DataSource.writeAndRead(DataSource.scala:449)
at org.apache.spark.sql.execution.command.CreateDataSourceTableAsSelectCommand.saveDataIntoTable(createDataSourceTables.scala:217)
at org.apache.spark.sql.execution.command.CreateDataSourceTableAsSelectCommand.run(createDataSourceTables.scala:177)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116)
at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92)
at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92)
at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:609)
at org.apache.spark.sql.DataFrameWriter.createTable(DataFrameWriter.scala:419)
at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:398)
at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:354)
... 54 elided
I am now wondering whether there is a way to save a Spark data frame as
a Hive table backed by the CarbonData format.
Am I doing something wrong?
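For reference, the CarbonData examples write a data frame through save() with a "tableName" option instead of saveAsTable(). Below is a minimal sketch of that pattern, not a confirmed fix for the exception above. The store path, application name, and sample rows are placeholders, and it assumes the CarbonData 1.3.x and Spark 2.2.x jars are on the classpath.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}
import org.apache.spark.sql.CarbonSession._

object SaveDataFrameAsCarbonTable {
  def main(args: Array[String]): Unit = {
    // Placeholder store location (assumption); use a real local or HDFS path.
    val storePath = "/tmp/carbon.store"

    // getOrCreateCarbonSession is added to the builder by the CarbonSession implicits.
    val carbon = SparkSession.builder()
      .master("local[*]")
      .appName("SaveDataFrameAsCarbonTable")
      .getOrCreateCarbonSession(storePath)

    import carbon.implicits._

    // Stand-in for myDF; the real data frame is not shown in this post.
    val myDF = Seq((1, "a"), (2, "b")).toDF("id", "name")

    // Pass the table name as an option and call save(), so that no 'path'
    // is handed to the CarbonData source by saveAsTable().
    myDF.write
      .format("carbondata")
      .option("tableName", "MyTable")
      .mode(SaveMode.Overwrite)
      .save()

    // Read the table back through the same session to check the write.
    carbon.sql("SELECT * FROM MyTable").show()

    carbon.stop()
  }
}
```

The only difference from the command above is that the table name is passed as an option and save() is called, so Spark's CreateDataSourceTableAsSelectCommand, which appears in the stack trace, is not involved.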
Best,
Michael