http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Storing-Data-Frame-as-CarbonData-Table-tp43874p44268.html
It has worked in this way.
Now I am using default properties, actually no properties at all.
I have tried saving one table to carbon, and it took ages compared to parquet.
It looks like only one or two cores are actively used in this case.
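
A minimal sketch of one way to get more cores involved, assuming the defaults are the bottleneck: carbon.number.of.cores.while.loading defaults to 2 according to the CarbonData configuration docs, which would match seeing only one or two busy cores. The value 8 and the partition count below are only illustrative placeholders, not recommendations:

import org.apache.carbondata.core.util.CarbonProperties
import org.apache.spark.sql.SaveMode

// Raise the number of cores CarbonData uses while loading data.
// The default of 2 would explain only one or two cores being busy;
// 8 here is just an example value.
CarbonProperties.getInstance()
  .addProperty("carbon.number.of.cores.while.loading", "8")

// Repartition so the write is split into more parallel tasks.
// myDF and the partition count are placeholders.
myDF.repartition(8)
  .write
  .format("carbondata")
  .option("tableName", "MyTable")
  .mode(SaveMode.Overwrite)
  .save()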
> Hi Michael
>
> Yes, it is very easy to save any spark data to carbondata.
> Just need to do small change based on your script, as below :
> import org.apache.spark.sql.SaveMode  // needed for SaveMode.Overwrite
>
> myDF.write
>   .format("carbondata")
>   .option("tableName", "MyTable")
>   .mode(SaveMode.Overwrite)
>   .save()
>
> For more detail, you can refer to examples:
>
> https://github.com/apache/carbondata/blob/master/examples/spark2/src/main/scala/org/apache/carbondata/examples/CarbonDataFrameExample.scala
>
> HTH.
>
> Regards
> Liang
>
>
> 2018-03-31 18:15 GMT+08:00 Michael Shtelma <[hidden email]>:
>
>> Hi Team,
>>
>> I am new to CarbonData and wanted to test it using a couple of my test
>> queries.
>> In my test I have used CarbonData 1.3.1 and Spark 2.2.1.
>>
>> I have tried saving my data frame as a CarbonData table using the
>> following command:
>>
>> myDF.write.format("carbondata").mode("overwrite").saveAsTable("MyTable")
>>
>> As a result I have got the following exception:
>>
>> java.lang.IllegalArgumentException: requirement failed: 'path' should
>> not be specified, the path to store carbon file is the 'storePath'
>> specified when creating CarbonContext
>>
>> at scala.Predef$.require(Predef.scala:224)
>> at org.apache.spark.sql.CarbonSource.createRelation(CarbonSource.scala:90)
>> at org.apache.spark.sql.execution.datasources.DataSource.writeAndRead(DataSource.scala:449)
>> at org.apache.spark.sql.execution.command.CreateDataSourceTableAsSelectCommand.saveDataIntoTable(createDataSourceTables.scala:217)
>> at org.apache.spark.sql.execution.command.CreateDataSourceTableAsSelectCommand.run(createDataSourceTables.scala:177)
>> at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
>> at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
>> at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
>> at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
>> at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
>> at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138)
>> at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>> at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135)
>> at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116)
>> at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92)
>> at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92)
>> at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:609)
>> at org.apache.spark.sql.DataFrameWriter.createTable(DataFrameWriter.scala:419)
>> at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:398)
>> at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:354)
>> ... 54 elided
>>
>> I am now wondering whether there is a way to save any Spark data frame
>> as a Hive table backed by the CarbonData format.
>> Am I doing something wrong?
>>
>> Best,
>> Michael
>>
>
>
>
> --
> Regards
> Liang