Problem with creating a table in Spark 2.
Posted by Marek Wiewiorka on
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Problem-with-creating-a-table-in-Spark-2-tp9969.html
Hi All - I'm following an example from the quick start guide and trying to
create a CarbonData table in spark-shell in the following way:
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.CarbonSession._
val carbon = SparkSession.builder().config(sc.getConf).getOrCreateCarbonSession("/user/hive/warehouse/carbon","/mnt/hadoop/data/ssd001/tmp/mwiewior")
and then I just copy and paste the statement from the documentation:
scala> carbon.sql("CREATE TABLE IF NOT EXISTS test_table(id string, name string, city string, age Int) STORED BY 'carbondata'")
AUDIT 03-04 13:20:25,534 - [c01][hive][Thread-1]Creating Table with Database name [default] and Table name [test_table]
java.io.FileNotFoundException: /user/hive/warehouse/carbon/default/test_table/Metadata/schema (No such file or directory)
  at java.io.FileOutputStream.open0(Native Method)
  at java.io.FileOutputStream.open(FileOutputStream.java:270)
  at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
  at java.io.FileOutputStream.<init>(FileOutputStream.java:133)
  at org.apache.carbondata.core.datastore.impl.FileFactory.getDataOutputStream(FileFactory.java:207)
  at org.apache.carbondata.core.writer.ThriftWriter.open(ThriftWriter.java:76)
  at org.apache.spark.sql.hive.CarbonMetastore.createTableFromThrift(CarbonMetastore.scala:330)
  at org.apache.spark.sql.execution.command.CreateTable.run(carbonTableSchema.scala:162)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:135)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
  at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:132)
  at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:113)
  at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:87)
  at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:87)
  at org.apache.spark.sql.Dataset.<init>(Dataset.scala:185)
  at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64)
  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:592)
  ... 50 elided
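One thing visible in the trace: the schema file is opened through java.io.FileOutputStream, which only writes to the local filesystem of the machine running spark-shell. So the store path appears to be resolved as a local path here. A minimal sketch for checking whether the relevant directories exist locally, which can be pasted into spark-shell (the helper name existsLocally is hypothetical and not part of CarbonData; the paths are the ones from the session and the exception above):

```scala
import java.nio.file.{Files, Paths}

// Hypothetical helper: true if `path` is an existing directory on the
// local filesystem of the JVM that runs this code (the spark-shell driver).
def existsLocally(path: String): Boolean =
  Files.isDirectory(Paths.get(path))

// Store path passed to getOrCreateCarbonSession above.
val storePath = "/user/hive/warehouse/carbon"
// Parent directory of the schema file named in the FileNotFoundException.
val metadataDir = Paths.get(storePath, "default", "test_table", "Metadata").toString

println(s"store path exists locally:   ${existsLocally(storePath)}")
println(s"Metadata dir exists locally: ${existsLocally(metadataDir)}")
```

If both checks print false, the exception would be consistent with the directory tree simply not existing on the local disk. This only inspects the local filesystem and says nothing about HDFS.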
Could you please help me to troubleshoot this problem?
Thanks!
Marek