Hello,
I have created a carbondata table from Spark 2.2.1 on Azure (Hive 1.2.1) via CarbonSession. The Spark code looks like this : val carbon = SparkSession.builder().config("spark.sql.warehouse.dir", warehouse).config("spark.sql.crossJoin.enabled", "true").enableHiveSupport().getOrCreateCarbonSession(storeLocation) val df = carbon.sparkContext.parallelize(1 to 1000000).map(x => .... <my object creation>) import carbon.implicits._ df.write.format("carbondata").option("tableName", "carbon_table_20180816_1m_50f_p8").mode(SaveMode.Overwrite).save() Note : I can query this carbondata table without problems in Spark, but I also need to query it in Hive... However I cannot query it from Hive. If I look at the table definition in Hive I see this : hive> show create table carbon_table_20180816_1m_50f_p8; OK CREATE EXTERNAL TABLE `carbon_table_20180816_1m_50f_p8`( `col` array<string> COMMENT 'from deserializer') ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' WITH SERDEPROPERTIES ( 'carbonSchemaPartsNo'='6', 'dbName'='default', 'isExternal'='false', 'isTransactional'='true', 'isVisible'='true', 'path'='hdfs://hostname.internal.cloudapp.net/tmp/store/default/carbon_table_20180816_1m_50f_p8', 'tableName'='carbon_table_20180816_1m_50f_p8', 'tablePath'='hdfs://hostname.internal.cloudapp.net/tmp/store/default/carbon_table_20180816_1m_50f_p8') STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.SequenceFileInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat' LOCATION 'wasb://[hidden email]/hive/warehouse/carbon_table_20180816_1m_50f_p8-__PLACEHOLDER__' TBLPROPERTIES ( 'spark.sql.sources.provider'='org.apache.spark.sql.CarbonSource', 'spark.sql.sources.schema.numParts'='1', 'spark.sql.sources.schema.part.0'='{\"type\":\"struct\",\"fields\":[{\"name\":\"c1\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c2\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c3\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c4\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c5\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c6\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c7\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c8\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c9\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c10\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c11\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c12\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c13\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c14\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c15\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c16\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c17\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c18\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c19\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c20\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c21\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c22\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c23\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c24\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c25\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c26\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c27\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c28\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c29\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c30\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c31\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c32\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c33\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c34\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c35\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c36\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c37\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c38\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c39\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c40\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c41\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c42\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c43\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c44\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c45\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c46\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c47\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c48\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c49\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c50\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}}]}', 'transient_lastDdlTime'='1534416196') It is clear that it is now using the Hive Carbondata SerDe. I applied the following modifications based on https://github.com/cenyuhai/incubator-carbondata/blob/CARBONDATA-727/integration/hive/hive-guide.md#alter-schema-in-hive hive> alter table carbon_table_20180816_1m_50f_p8 set FILEFORMAT > INPUTFORMAT "org.apache.carbondata.hive.MapredCarbonInputFormat" > OUTPUTFORMAT "org.apache.carbondata.hive.MapredCarbonOutputFormat" > SERDE "org.apache.carbondata.hive.CarbonHiveSerDe"; OK hive> alter table carbon_table_20180816_1m_50f_p8 set LOCATION 'hdfs://hostname.ax.internal.cloudapp.net/tmp/store/default/carbon_table_20180816_1m_50f_p8'; OK hive> alter table carbon_table_20180816_1m_50f_p8 change col c1 string; OK Following these alter operations, I obtain the following table structure : hive> show create table carbon_table_20180816_1m_50f_p8; OK CREATE EXTERNAL TABLE `carbon_table_20180816_1m_50f_p8`( `c1` string COMMENT '') ROW FORMAT SERDE 'org.apache.carbondata.hive.CarbonHiveSerDe' WITH SERDEPROPERTIES ( 'carbonSchemaPartsNo'='6', 'dbName'='default', 'isExternal'='false', 'isTransactional'='true', 'isVisible'='true', 'path'='hdfs://hostname.internal.cloudapp.net/tmp/store/default/carbon_table_20180816_1m_50f_p8', 'tableName'='carbon_table_20180816_1m_50f_p8', 'tablePath'='hdfs://hostname.internal.cloudapp.net/tmp/store/default/carbon_table_20180816_1m_50f_p8') STORED AS INPUTFORMAT 'org.apache.carbondata.hive.MapredCarbonInputFormat' OUTPUTFORMAT 'org.apache.carbondata.hive.MapredCarbonOutputFormat' LOCATION 'hdfs://hostname.internal.cloudapp.net/tmp/store/default/carbon_table_20180816_1m_50f_p8' TBLPROPERTIES ( 'last_modified_by'='sshuser', 'last_modified_time'='1534423308', 'numFiles'='7', 'spark.sql.sources.provider'='org.apache.spark.sql.CarbonSource', 'spark.sql.sources.schema.numParts'='1', 'spark.sql.sources.schema.part.0'='{\"type\":\"struct\",\"fields\":[{\"name\":\"c1\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c2\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c3\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c4\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c5\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c6\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c7\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c8\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c9\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c10\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c11\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c12\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c13\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c14\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c15\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c16\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c17\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c18\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c19\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c20\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c21\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c22\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c23\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c24\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c25\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c26\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c27\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c28\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c29\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c30\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c31\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c32\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c33\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c34\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c35\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c36\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c37\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c38\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c39\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c40\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c41\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c42\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c43\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c44\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c45\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c46\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c47\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c48\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c49\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c50\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}}]}', 'totalSize'='11446940', 'transient_lastDdlTime'='1534423308') Before adding the extra columns (c2 to c50) I tried a basic select query, and I am getting the following error : hive> select c1 from carbon_table_20180816_1m_50f_p8 limit 1; OK Failed with exception java.io.IOException:java.io.IOException: org.apache.carbondata.core.exception.InvalidConfigurationException: Database name is not set. The dataname seems to be well defined in 'dbName'='default' ... Is there a better, easier way to query Carbondata tables (created from Spark SQL) directly from Hive?? Your help would be appreciated. Thanks Yann -- Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/ |
Hello,
I would really appreciate your help on this error InvalidConfigurationException: Database name is not set. As anyone tried to read Carbondata from Hive on Azure? I don't know whether this problem comes from Carbondata itself, the SerDe properties have the dbName defined, however the error message seems to indicate the SerDe cannot find the value. Am I missing something obvious? I also tried the example on https://github.com/apache/carbondata/blob/master/integration/hive/hive-guide.md without success. Could you please help me figure out thatis going wrong with the Hive integration? Yann -- Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/ |
Ok, solved it.
I had to specify mapreduce.input.carboninputformat.databaseName as well as mapreduce.input.carboninputformat.tableName in serde properties... did not find this in the documentation by in the code itself (org/apache/carbondata/hadoop/api/CarbonInputFormat.java) Thanks anyway. Yann -- Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/ |
Free forum by Nabble | Edit this page |