[jira] [Commented] (CARBONDATA-657) We are not able to create table with shared dictionary columns in spark 2.1

Akash R Nilugal (Jira)

    [ https://issues.apache.org/jira/browse/CARBONDATA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15831747#comment-15831747 ]

sounak chakraborty commented on CARBONDATA-657:
-----------------------------------------------

Please re-verify. With the latest code the output is correct.
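
For the shared dictionary columns scenario itself, the failing CREATE TABLE from the issue description can be re-run in trimmed form to confirm that the parser now accepts the columnproperties.*.shared_column keys (the table name shared_col_check is illustrative; the property syntax is copied from the original repro):

CREATE TABLE shared_col_check (CUST_ID int, DECIMAL_COLUMN2 decimal(36,10))
STORED BY 'org.apache.carbondata.format'
TBLPROPERTIES (
'DICTIONARY_INCLUDE'='CUST_ID,DECIMAL_COLUMN2',
'columnproperties.CUST_ID.shared_column'='shared.CUST_ID',
'columnproperties.decimal_column2.shared_column'='shared.decimal_column2');

The session below additionally verifies load and segment handling on the latest build: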

1 row selected (0.204 seconds)
0: jdbc:hive2://localhost:10000> drop table uniqdata_date;
+---------+--+
| Result  |
+---------+--+
+---------+--+
No rows selected (0.215 seconds)
0: jdbc:hive2://localhost:10000> CREATE TABLE uniqdata_date (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB date, DOJ date, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES ("TABLE_BLOCKSIZE"= "256 MB");
+---------+--+
| Result  |
+---------+--+
+---------+--+
No rows selected (0.115 seconds)
0: jdbc:hive2://localhost:10000>   LOAD DATA INPATH 'hdfs://localhost:54310/sc/one_row.csv' into table uniqdata_date OPTIONS('FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
+---------+--+
| Result  |
+---------+--+
+---------+--+
No rows selected (0.913 seconds)
0: jdbc:hive2://localhost:10000> select count(CUST_NAME) from uniqdata_date;
+------+--+
| _c0  |
+------+--+
| 1    |
+------+--+
1 row selected (0.237 seconds)
0: jdbc:hive2://localhost:10000> show segments for table uniqdata_date;
+--------------------+----------+--------------------------+--------------------------+--+
| SegmentSequenceId  |  Status  |     Load Start Time      |      Load End Time       |
+--------------------+----------+--------------------------+--------------------------+--+
| 0                  | Success  | 2017-01-20 19:00:48.798  | 2017-01-20 19:00:49.473  |
+--------------------+----------+--------------------------+--------------------------+--+
1 row selected (0.018 seconds)
0: jdbc:hive2://localhost:10000>   LOAD DATA INPATH 'hdfs://localhost:54310/sc/one_row.csv' into table uniqdata_date OPTIONS('FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
+---------+--+
| Result  |
+---------+--+
+---------+--+
No rows selected (0.363 seconds)
0: jdbc:hive2://localhost:10000> select count(CUST_NAME) from uniqdata_date;
+------+--+
| _c0  |
+------+--+
| 2    |
+------+--+
1 row selected (0.219 seconds)
0: jdbc:hive2://localhost:10000> show segments for table uniqdata_date;
+--------------------+----------+--------------------------+--------------------------+--+
| SegmentSequenceId  |  Status  |     Load Start Time      |      Load End Time       |
+--------------------+----------+--------------------------+--------------------------+--+
| 1                  | Success  | 2017-01-20 19:01:01.961  | 2017-01-20 19:01:02.167  |
| 0                  | Success  | 2017-01-20 19:00:48.798  | 2017-01-20 19:00:49.473  |
+--------------------+----------+--------------------------+--------------------------+--+
2 rows selected (0.017 seconds)
0: jdbc:hive2://localhost:10000>   LOAD DATA INPATH 'hdfs://localhost:54310/sc/one_row.csv' into table uniqdata_date OPTIONS('FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
+---------+--+
| Result  |
+---------+--+
+---------+--+
No rows selected (0.396 seconds)
0: jdbc:hive2://localhost:10000> select count(CUST_NAME) from uniqdata_date;
+------+--+
| _c0  |
+------+--+
| 3    |
+------+--+
1 row selected (0.22 seconds)
0: jdbc:hive2://localhost:10000> show segments for table uniqdata_date;
+--------------------+----------+--------------------------+--------------------------+--+
| SegmentSequenceId  |  Status  |     Load Start Time      |      Load End Time       |
+--------------------+----------+--------------------------+--------------------------+--+
| 2                  | Success  | 2017-01-20 19:01:20.479  | 2017-01-20 19:01:20.734  |
| 1                  | Success  | 2017-01-20 19:01:01.961  | 2017-01-20 19:01:02.167  |
| 0                  | Success  | 2017-01-20 19:00:48.798  | 2017-01-20 19:00:49.473  |
+--------------------+----------+--------------------------+--------------------------+--+
3 rows selected (0.014 seconds)
0: jdbc:hive2://localhost:10000>   LOAD DATA INPATH 'hdfs://localhost:54310/sc/one_row.csv' into table uniqdata_date OPTIONS('FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
+---------+--+
| Result  |
+---------+--+
+---------+--+
No rows selected (0.357 seconds)
0: jdbc:hive2://localhost:10000> select count(CUST_NAME) from uniqdata_date;
+------+--+
| _c0  |
+------+--+
| 4    |
+------+--+
1 row selected (0.217 seconds)
0: jdbc:hive2://localhost:10000> show segments for table uniqdata_date;
+--------------------+----------+--------------------------+--------------------------+--+
| SegmentSequenceId  |  Status  |     Load Start Time      |      Load End Time       |
+--------------------+----------+--------------------------+--------------------------+--+
| 3                  | Success  | 2017-01-20 19:01:30.826  | 2017-01-20 19:01:31.04   |
| 2                  | Success  | 2017-01-20 19:01:20.479  | 2017-01-20 19:01:20.734  |
| 1                  | Success  | 2017-01-20 19:01:01.961  | 2017-01-20 19:01:02.167  |
| 0                  | Success  | 2017-01-20 19:00:48.798  | 2017-01-20 19:00:49.473  |
+--------------------+----------+--------------------------+--------------------------+--+
4 rows selected (0.022 seconds)
0: jdbc:hive2://localhost:10000>   LOAD DATA INPATH 'hdfs://localhost:54310/sc/one_row.csv' into table uniqdata_date OPTIONS('FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
+---------+--+
| Result  |
+---------+--+
+---------+--+
No rows selected (0.552 seconds)
0: jdbc:hive2://localhost:10000> select count(CUST_NAME) from uniqdata_date;
+------+--+
| _c0  |
+------+--+
| 5    |
+------+--+
1 row selected (0.233 seconds)
0: jdbc:hive2://localhost:10000> show segments for table uniqdata_date;
+--------------------+----------+--------------------------+--------------------------+--+
| SegmentSequenceId  |  Status  |     Load Start Time      |      Load End Time       |
+--------------------+----------+--------------------------+--------------------------+--+
| 4                  | Success  | 2017-01-20 19:01:48.578  | 2017-01-20 19:01:48.899  |
| 3                  | Success  | 2017-01-20 19:01:30.826  | 2017-01-20 19:01:31.04   |
| 2                  | Success  | 2017-01-20 19:01:20.479  | 2017-01-20 19:01:20.734  |
| 1                  | Success  | 2017-01-20 19:01:01.961  | 2017-01-20 19:01:02.167  |
| 0                  | Success  | 2017-01-20 19:00:48.798  | 2017-01-20 19:00:49.473  |
+--------------------+----------+--------------------------+--------------------------+--+
5 rows selected (0.021 seconds)
0: jdbc:hive2://localhost:10000>   LOAD DATA INPATH 'hdfs://localhost:54310/sc/one_row.csv' into table uniqdata_date OPTIONS('FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
+---------+--+
| Result  |
+---------+--+
+---------+--+
No rows selected (0.895 seconds)
0: jdbc:hive2://localhost:10000> select count(CUST_NAME) from uniqdata_date;
+------+--+
| _c0  |
+------+--+
| 6    |
+------+--+
1 row selected (0.203 seconds)
0: jdbc:hive2://localhost:10000> show segments for table uniqdata_date;
+--------------------+------------+--------------------------+--------------------------+--+
| SegmentSequenceId  |   Status   |     Load Start Time      |      Load End Time       |
+--------------------+------------+--------------------------+--------------------------+--+
| 5                  | Success    | 2017-01-20 19:01:59.212  | 2017-01-20 19:01:59.493  |
| 4                  | Success    | 2017-01-20 19:01:48.578  | 2017-01-20 19:01:48.899  |
| 3                  | Compacted  | 2017-01-20 19:01:30.826  | 2017-01-20 19:01:31.04   |
| 2                  | Compacted  | 2017-01-20 19:01:20.479  | 2017-01-20 19:01:20.734  |
| 1                  | Compacted  | 2017-01-20 19:01:01.961  | 2017-01-20 19:01:02.167  |
| 0.1                | Success    | 2017-01-20 19:01:59.545  | 2017-01-20 19:01:59.852  |
| 0                  | Compacted  | 2017-01-20 19:00:48.798  | 2017-01-20 19:00:49.473  |
+--------------------+------------+--------------------------+--------------------------+--+
7 rows selected (0.018 seconds)
0: jdbc:hive2://localhost:10000> select count(CUST_NAME) from uniqdata_date;
+------+--+
| _c0  |
+------+--+
| 6    |
+------+--+
1 row selected (0.219 seconds)
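
In the final SHOW SEGMENTS output above, segments 0 through 3 are marked Compacted and a merged segment 0.1 has appeared, while the row count stays at 6: minor compaction was triggered automatically during the later loads. A minimal sketch of how this behaviour is controlled (property names as per the CarbonData configuration docs; the values shown are assumptions about this test environment, and compaction can also be invoked explicitly):

-- carbon.properties (assumed settings for this environment):
--   carbon.enable.auto.load.merge=true
--   carbon.compaction.level.threshold=4,3

-- Explicit alternative, run from beeline:
ALTER TABLE uniqdata_date COMPACT 'MINOR';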


> We are not able to create table with shared dictionary columns in spark 2.1
> ---------------------------------------------------------------------------
>
>                 Key: CARBONDATA-657
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-657
>             Project: CarbonData
>          Issue Type: Bug
>          Components: sql
>    Affects Versions: 1.0.0-incubating
>         Environment: Spark-2.1
>            Reporter: Payal
>            Priority: Minor
>
> Creating a table with shared dictionary columns is not working with Spark 2.1, but it works fine with Spark 1.6.
> 0: jdbc:hive2://hadoop-master:10000> CREATE TABLE uniq_shared_dictionary (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES('DICTIONARY_INCLUDE'='CUST_ID,Double_COLUMN2,DECIMAL_COLUMN2','columnproperties.CUST_ID.shared_column'='shared.CUST_ID','columnproperties.decimal_column2.shared_column'='shared.decimal_column2');
> Error: org.apache.carbondata.spark.exception.MalformedCarbonCommandException: Invalid table properties columnproperties.cust_id.shared_column (state=,code=0)
> LOGS
> ERROR 18-01 13:31:18,147 - Error executing query, currentState RUNNING,
> org.apache.carbondata.spark.exception.MalformedCarbonCommandException: Invalid table properties columnproperties.cust_id.shared_column
> at org.apache.carbondata.spark.util.CommonUtil$$anonfun$validateTblProperties$1.apply(CommonUtil.scala:141)
> at org.apache.carbondata.spark.util.CommonUtil$$anonfun$validateTblProperties$1.apply(CommonUtil.scala:137)
> at scala.collection.Iterator$class.foreach(Iterator.scala:893)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
> at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
> at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
> at org.apache.carbondata.spark.util.CommonUtil$.validateTblProperties(CommonUtil.scala:137)
> at org.apache.spark.sql.parser.CarbonSqlAstBuilder.visitCreateTable(CarbonSparkSqlParser.scala:135)
> at org.apache.spark.sql.parser.CarbonSqlAstBuilder.visitCreateTable(CarbonSparkSqlParser.scala:60)
> at org.apache.spark.sql.catalyst.parser.SqlBaseParser$CreateTableContext.accept(SqlBaseParser.java:503)
> at org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visit(AbstractParseTreeVisitor.java:42)
> at org.apache.spark.sql.catalyst.parser.AstBuilder$$anonfun$visitSingleStatement$1.apply(AstBuilder.scala:66)
> at org.apache.spark.sql.catalyst.parser.AstBuilder$$anonfun$visitSingleStatement$1.apply(AstBuilder.scala:66)
> at org.apache.spark.sql.catalyst.parser.ParserUtils$.withOrigin(ParserUtils.scala:93)
> at org.apache.spark.sql.catalyst.parser.AstBuilder.visitSingleStatement(AstBuilder.scala:65)
> at org.apache.spark.sql.catalyst.parser.AbstractSqlParser$$anonfun$parsePlan$1.apply(ParseDriver.scala:54)
> at org.apache.spark.sql.catalyst.parser.AbstractSqlParser$$anonfun$parsePlan$1.apply(ParseDriver.scala:53)
> at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parse(ParseDriver.scala:82)
> at org.apache.spark.sql.parser.CarbonSparkSqlParser.parse(CarbonSparkSqlParser.scala:45)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)