Apache CarbonData Dev Mailing List archive

Problem with creating a table in Spark 2.

Posted by Marek Wiewiorka on
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Problem-with-creating-a-table-in-Spark-2-tp9969.html

Hi All - I'm following an example from the quick start guide and trying to
create a CarbonData table in spark-shell in the following way:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.CarbonSession._
val carbon = SparkSession.builder().config(sc.getConf).getOrCreateCarbonSession("/user/hive/warehouse/carbon","/mnt/hadoop/data/ssd001/tmp/mwiewior")

and then I just copy and paste the statement from the documentation:

scala> carbon.sql("CREATE TABLE IF NOT EXISTS test_table(id string, name string, city string, age Int) STORED BY 'carbondata'")
AUDIT 03-04 13:20:25,534 - [c01][hive][Thread-1]Creating Table with Database name [default] and Table name [test_table]
java.io.FileNotFoundException: /user/hive/warehouse/carbon/default/test_table/Metadata/schema (No such file or directory)
  at java.io.FileOutputStream.open0(Native Method)
  at java.io.FileOutputStream.open(FileOutputStream.java:270)
  at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
  at java.io.FileOutputStream.<init>(FileOutputStream.java:133)
  at org.apache.carbondata.core.datastore.impl.FileFactory.getDataOutputStream(FileFactory.java:207)
  at org.apache.carbondata.core.writer.ThriftWriter.open(ThriftWriter.java:76)
  at org.apache.spark.sql.hive.CarbonMetastore.createTableFromThrift(CarbonMetastore.scala:330)
  at org.apache.spark.sql.execution.command.CreateTable.run(carbonTableSchema.scala:162)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:135)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
  at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:132)
  at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:113)
  at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:87)
  at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:87)
  at org.apache.spark.sql.Dataset.<init>(Dataset.scala:185)
  at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64)
  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:592)
  ... 50 elided

Could you please help me to troubleshoot this problem?

Thanks!
Marek