[GitHub] [carbondata] akkio-97 opened a new pull request #3869: Exception added for index creation on long string columns

classic Classic list List threaded Threaded
12 messages Options
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] akkio-97 opened a new pull request #3869: Exception added for index creation on long string columns

GitBox

akkio-97 opened a new pull request #3869:
URL: https://github.com/apache/carbondata/pull/3869


    ### Why is this PR needed?
    Index creation for long string columns are not yet supported.
   
    ### What changes were proposed in this PR?
   Exceptions are thrown if user tries to create the same.
       
    ### Does this PR introduce any user interface change?
    - No
   
    ### Is any new testcase added?
    - Yes
   
       
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3869: Exception added for index creation on long string columns

GitBox

CarbonDataQA1 commented on pull request #3869:
URL: https://github.com/apache/carbondata/pull/3869#issuecomment-665263897






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3869: Exception added for index creation on long string columns

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3869:
URL: https://github.com/apache/carbondata/pull/3869#issuecomment-665577522






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] akashrn5 commented on a change in pull request #3869: Exception added for index creation on long string columns

GitBox
In reply to this post by GitBox

akashrn5 commented on a change in pull request #3869:
URL: https://github.com/apache/carbondata/pull/3869#discussion_r463434191



##########
File path: index/secondary-index/src/test/scala/org/apache/carbondata/spark/testsuite/secondaryindex/TestCreateIndexTable.scala
##########
@@ -370,6 +370,38 @@ class TestCreateIndexTable extends QueryTest with BeforeAndAfterAll {
     }
   }
 
+  test("index creation on long string columns") {
+    sql("drop table if exists si_table")

Review comment:
       just create a small table which has varchar column, no need to do any dataload here, as intension is just to throw an exception for long string columns.

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/secondaryindex/command/SICreationCommand.scala
##########
@@ -253,6 +253,17 @@ private[sql] case class CarbonCreateSecondaryIndexCommand(
         throw new ErrorMessage(
           s"Table [$tableName] under database [$databaseName] is already an index table")
       }
+      // creation of index on long string columns are not supported
+      for (indexColName <- indexModel.columnNames) {
+        for (dim <- dims) {
+          if (indexColName.equals(dim.getColName) && dim.getDataType.getName.equals("VARCHAR")) {
+            throw new ErrorMessage(
+              s"one or more index columns specified cannot be of Long String property" +
+              s" column in table $databaseName.$tableName")
+          }
+        }
+      }
+

Review comment:
       instead of using a standard for loop, can change to this,
   
   > if (dims.filter(dimension => indexModel.columnNames
   >         .contains(dimension.getColName))
   >         .map(_.getDataType)
   >         .exists(dataType => dataType.equals(DataTypes.VARCHAR)) ) {
   >         throw new ErrorMessage(
   >           s"one or more index columns specified contains long string column" +
   >           s" in table $databaseName.$tableName. SI cannot be created on long string columns")
   >       }
   >
   
    in a more functional way

##########
File path: index/secondary-index/src/test/scala/org/apache/carbondata/spark/testsuite/secondaryindex/TestCreateIndexTable.scala
##########
@@ -370,6 +370,38 @@ class TestCreateIndexTable extends QueryTest with BeforeAndAfterAll {
     }
   }
 
+  test("index creation on long string columns") {
+    sql("drop table if exists si_table")
+    sql(
+      s"""
+         | CREATE TABLE si_table(
+         | id INT,
+         | name STRING,
+         | city STRING,
+         | salary FLOAT,
+         | tax DECIMAL(8,2),
+         | percent double,
+         | birthday DATE,
+         | register TIMESTAMP,
+         | updated TIMESTAMP,
+         | longstr STRING,
+         | file struct<school:array<string>, age:int>
+         | )
+         | STORED AS carbondata
+         | TBLPROPERTIES(
+         | 'LONG_STRING_COLUMNS'='longstr, name')
+         | """.stripMargin)
+    sql(
+      s"LOAD DATA LOCAL INPATH '$resourcesPath/streamSample_with_long_string.csv' INTO TABLE si_table OPTIONS" +
+        "('HEADER'='true','COMPLEX_DELIMITER_LEVEL_1'='$', 'COMPLEX_DELIMITER_LEVEL_2'=':')")
+
+    sql("drop index if exists temp_ind on si_table")
+    val thrown = intercept[Exception] {
+      sql("create index temp_ind on table si_table (longstr) AS 'carbondata'")
+    }
+    assert(thrown.getMessage === "one or more index columns specified cannot be of Long String property column in table default.si_table")

Review comment:
       check `thrown.getMessage.contains("")` instad of `==`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] akashrn5 commented on pull request #3869: Exception added for index creation on long string columns

GitBox
In reply to this post by GitBox

akashrn5 commented on pull request #3869:
URL: https://github.com/apache/carbondata/pull/3869#issuecomment-666965943


   @akkio-97 please create a jira and update PR description


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] akkio-97 commented on a change in pull request #3869: [CARBONDATA-3939]Exception added for index creation on long string columns

GitBox
In reply to this post by GitBox

akkio-97 commented on a change in pull request #3869:
URL: https://github.com/apache/carbondata/pull/3869#discussion_r464318621



##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/secondaryindex/command/SICreationCommand.scala
##########
@@ -253,6 +253,17 @@ private[sql] case class CarbonCreateSecondaryIndexCommand(
         throw new ErrorMessage(
           s"Table [$tableName] under database [$databaseName] is already an index table")
       }
+      // creation of index on long string columns are not supported
+      for (indexColName <- indexModel.columnNames) {
+        for (dim <- dims) {
+          if (indexColName.equals(dim.getColName) && dim.getDataType.getName.equals("VARCHAR")) {
+            throw new ErrorMessage(
+              s"one or more index columns specified cannot be of Long String property" +
+              s" column in table $databaseName.$tableName")
+          }
+        }
+      }
+

Review comment:
       done




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] akkio-97 commented on a change in pull request #3869: [CARBONDATA-3939]Exception added for index creation on long string columns

GitBox
In reply to this post by GitBox

akkio-97 commented on a change in pull request #3869:
URL: https://github.com/apache/carbondata/pull/3869#discussion_r464318726



##########
File path: index/secondary-index/src/test/scala/org/apache/carbondata/spark/testsuite/secondaryindex/TestCreateIndexTable.scala
##########
@@ -370,6 +370,38 @@ class TestCreateIndexTable extends QueryTest with BeforeAndAfterAll {
     }
   }
 
+  test("index creation on long string columns") {
+    sql("drop table if exists si_table")

Review comment:
       done




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] akkio-97 commented on a change in pull request #3869: [CARBONDATA-3939]Exception added for index creation on long string columns

GitBox
In reply to this post by GitBox

akkio-97 commented on a change in pull request #3869:
URL: https://github.com/apache/carbondata/pull/3869#discussion_r464318988



##########
File path: index/secondary-index/src/test/scala/org/apache/carbondata/spark/testsuite/secondaryindex/TestCreateIndexTable.scala
##########
@@ -370,6 +370,38 @@ class TestCreateIndexTable extends QueryTest with BeforeAndAfterAll {
     }
   }
 
+  test("index creation on long string columns") {
+    sql("drop table if exists si_table")
+    sql(
+      s"""
+         | CREATE TABLE si_table(
+         | id INT,
+         | name STRING,
+         | city STRING,
+         | salary FLOAT,
+         | tax DECIMAL(8,2),
+         | percent double,
+         | birthday DATE,
+         | register TIMESTAMP,
+         | updated TIMESTAMP,
+         | longstr STRING,
+         | file struct<school:array<string>, age:int>
+         | )
+         | STORED AS carbondata
+         | TBLPROPERTIES(
+         | 'LONG_STRING_COLUMNS'='longstr, name')
+         | """.stripMargin)
+    sql(
+      s"LOAD DATA LOCAL INPATH '$resourcesPath/streamSample_with_long_string.csv' INTO TABLE si_table OPTIONS" +
+        "('HEADER'='true','COMPLEX_DELIMITER_LEVEL_1'='$', 'COMPLEX_DELIMITER_LEVEL_2'=':')")
+
+    sql("drop index if exists temp_ind on si_table")
+    val thrown = intercept[Exception] {
+      sql("create index temp_ind on table si_table (longstr) AS 'carbondata'")
+    }
+    assert(thrown.getMessage === "one or more index columns specified cannot be of Long String property column in table default.si_table")

Review comment:
       done




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3869: [CARBONDATA-3939]Exception added for index creation on long string columns

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3869:
URL: https://github.com/apache/carbondata/pull/3869#issuecomment-667996493


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3588/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3869: [CARBONDATA-3939]Exception added for index creation on long string columns

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3869:
URL: https://github.com/apache/carbondata/pull/3869#issuecomment-667998009


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1849/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] akashrn5 commented on pull request #3869: [CARBONDATA-3939]Exception added for index creation on long string columns

GitBox
In reply to this post by GitBox

akashrn5 commented on pull request #3869:
URL: https://github.com/apache/carbondata/pull/3869#issuecomment-668474607


   LGTM


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] asfgit closed pull request #3869: [CARBONDATA-3939]Exception added for index creation on long string columns

GitBox
In reply to this post by GitBox

asfgit closed pull request #3869:
URL: https://github.com/apache/carbondata/pull/3869


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]