Posted by
yannv on
Aug 16, 2018; 1:04pm
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Query-Carbondata-table-from-Hive-InvalidConfigurationException-Database-name-is-not-set-tp59719.html
Hello,
I have created a carbondata table from Spark 2.2.1 on Azure (Hive 1.2.1) via
CarbonSession.
The Spark code looks like this :
val carbon = SparkSession.builder().config("spark.sql.warehouse.dir",
warehouse).config("spark.sql.crossJoin.enabled",
"true").enableHiveSupport().getOrCreateCarbonSession(storeLocation)
val df = carbon.sparkContext.parallelize(1 to 1000000).map(x => .... <my
object creation>)
import carbon.implicits._
df.write.format("carbondata").option("tableName",
"carbon_table_20180816_1m_50f_p8").mode(SaveMode.Overwrite).save()
Note : I can query this carbondata table without problems in Spark, but I
also need to query it in Hive...
However I cannot query it from Hive.
If I look at the table definition in Hive I see this :
hive> show create table carbon_table_20180816_1m_50f_p8;
OK
CREATE EXTERNAL TABLE `carbon_table_20180816_1m_50f_p8`(
`col` array<string> COMMENT 'from deserializer')
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES (
'carbonSchemaPartsNo'='6',
'dbName'='default',
'isExternal'='false',
'isTransactional'='true',
'isVisible'='true',
'path'='hdfs://hostname.internal.cloudapp.net/tmp/store/default/carbon_table_20180816_1m_50f_p8',
'tableName'='carbon_table_20180816_1m_50f_p8',
'tablePath'='hdfs://hostname.internal.cloudapp.net/tmp/store/default/carbon_table_20180816_1m_50f_p8')
STORED AS INPUTFORMAT
'org.apache.hadoop.mapred.SequenceFileInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat'
LOCATION
'wasb://
[hidden email]/hive/warehouse/carbon_table_20180816_1m_50f_p8-__PLACEHOLDER__'
TBLPROPERTIES (
'spark.sql.sources.provider'='org.apache.spark.sql.CarbonSource',
'spark.sql.sources.schema.numParts'='1',
'spark.sql.sources.schema.part.0'='{\"type\":\"struct\",\"fields\":[{\"name\":\"c1\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c2\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c3\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c4\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c5\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c6\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c7\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c8\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c9\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c10\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c11\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c12\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c13\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c14\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c15\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c16\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c17\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c18\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c19\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c20\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c21\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c22\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c23\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c24\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c25\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c26\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c27\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c28\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c29\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c30\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c31\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c32\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c33\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c34\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c35\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c36\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c37\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c38\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c39\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c40\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c41\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c42\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c43\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c44\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c45\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c46\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c47\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c48\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c49\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c50\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}}]}',
'transient_lastDdlTime'='1534416196')
It is clear that it is now using the Hive Carbondata SerDe.
I applied the following modifications based on
https://github.com/cenyuhai/incubator-carbondata/blob/CARBONDATA-727/integration/hive/hive-guide.md#alter-schema-in-hivehive> alter table carbon_table_20180816_1m_50f_p8 set FILEFORMAT
> INPUTFORMAT "org.apache.carbondata.hive.MapredCarbonInputFormat"
> OUTPUTFORMAT "org.apache.carbondata.hive.MapredCarbonOutputFormat"
> SERDE "org.apache.carbondata.hive.CarbonHiveSerDe";
OK
hive> alter table carbon_table_20180816_1m_50f_p8 set LOCATION
'hdfs://hostname.ax.internal.cloudapp.net/tmp/store/default/carbon_table_20180816_1m_50f_p8';
OK
hive> alter table carbon_table_20180816_1m_50f_p8 change col c1 string;
OK
Following these alter operations, I obtain the following table structure :
hive> show create table carbon_table_20180816_1m_50f_p8;
OK
CREATE EXTERNAL TABLE `carbon_table_20180816_1m_50f_p8`(
`c1` string COMMENT '')
ROW FORMAT SERDE
'org.apache.carbondata.hive.CarbonHiveSerDe'
WITH SERDEPROPERTIES (
'carbonSchemaPartsNo'='6',
'dbName'='default',
'isExternal'='false',
'isTransactional'='true',
'isVisible'='true',
'path'='hdfs://hostname.internal.cloudapp.net/tmp/store/default/carbon_table_20180816_1m_50f_p8',
'tableName'='carbon_table_20180816_1m_50f_p8',
'tablePath'='hdfs://hostname.internal.cloudapp.net/tmp/store/default/carbon_table_20180816_1m_50f_p8')
STORED AS INPUTFORMAT
'org.apache.carbondata.hive.MapredCarbonInputFormat'
OUTPUTFORMAT
'org.apache.carbondata.hive.MapredCarbonOutputFormat'
LOCATION
'hdfs://hostname.internal.cloudapp.net/tmp/store/default/carbon_table_20180816_1m_50f_p8'
TBLPROPERTIES (
'last_modified_by'='sshuser',
'last_modified_time'='1534423308',
'numFiles'='7',
'spark.sql.sources.provider'='org.apache.spark.sql.CarbonSource',
'spark.sql.sources.schema.numParts'='1',
'spark.sql.sources.schema.part.0'='{\"type\":\"struct\",\"fields\":[{\"name\":\"c1\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c2\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c3\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c4\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c5\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c6\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c7\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c8\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c9\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c10\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c11\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c12\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c13\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c14\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c15\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c16\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c17\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c18\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c19\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c20\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c21\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c22\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c23\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c24\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c25\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c26\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c27\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c28\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c29\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c30\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c31\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c32\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c33\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c34\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c35\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c36\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c37\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c38\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c39\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c40\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c41\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c42\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c43\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c44\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c45\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c46\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c47\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c48\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c49\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c50\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}}]}',
'totalSize'='11446940',
'transient_lastDdlTime'='1534423308')
Before adding the extra columns (c2 to c50) I tried a basic select query,
and I am getting the following error :
hive> select c1 from carbon_table_20180816_1m_50f_p8 limit 1;
OK
Failed with exception java.io.IOException:java.io.IOException:
org.apache.carbondata.core.exception.InvalidConfigurationException: Database
name is not set.
The dataname seems to be well defined in 'dbName'='default' ...
Is there a better, easier way to query Carbondata tables (created from Spark
SQL) directly from Hive??
Your help would be appreciated.
Thanks
Yann
--
Sent from:
http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/