
Storing Data Frame as CarbonData Table

Posted by Michael Shtelma on Mar 31, 2018; 10:15am
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Storing-Data-Frame-as-CarbonData-Table-tp43874.html

Hi Team,

I am new to CarbonData and wanted to test it with a couple of my test queries.
For the tests I used CarbonData 1.3.1 and Spark 2.2.1.
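
For reference, here is a minimal sketch of the session setup I am assuming, following the CarbonData 1.3 quick start (the master and store path are illustrative):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.CarbonSession._

// getOrCreateCarbonSession takes the store path under which
// carbon files are written.
val spark = SparkSession
  .builder()
  .master("local")
  .appName("CarbonTest")
  .getOrCreateCarbonSession("/tmp/carbon.store")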

I tried saving my data frame as a CarbonData table using the
following command:

myDF.write.format("carbondata").mode("overwrite").saveAsTable("MyTable")

As a result, I got the following exception:

java.lang.IllegalArgumentException: requirement failed: 'path' should not be specified, the path to store carbon file is the 'storePath' specified when creating CarbonContext
  at scala.Predef$.require(Predef.scala:224)
  at org.apache.spark.sql.CarbonSource.createRelation(CarbonSource.scala:90)
  at org.apache.spark.sql.execution.datasources.DataSource.writeAndRead(DataSource.scala:449)
  at org.apache.spark.sql.execution.command.CreateDataSourceTableAsSelectCommand.saveDataIntoTable(createDataSourceTables.scala:217)
  at org.apache.spark.sql.execution.command.CreateDataSourceTableAsSelectCommand.run(createDataSourceTables.scala:177)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
  at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135)
  at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116)
  at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92)
  at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92)
  at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:609)
  at org.apache.spark.sql.DataFrameWriter.createTable(DataFrameWriter.scala:419)
  at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:398)
  at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:354)
  ... 54 elided
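
From the message, the target path apparently has to come from the store path given when the Carbon session is created, not from the writer. For what it is worth, the CarbonData data frame examples write through the carbondata source with a tableName option and save() rather than saveAsTable(); a minimal sketch, assuming the session setup above:

import org.apache.spark.sql.SaveMode

// Write the data frame as a carbon table named "MyTable";
// the files are placed under the session's store path.
myDF.write
  .format("carbondata")
  .option("tableName", "MyTable")
  .mode(SaveMode.Overwrite)
  .save()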

I am now wondering: is there a way to save an arbitrary Spark data frame as a
Hive table backed by the CarbonData format?
Am I doing something wrong?

Best,
Michael