akkio-97 opened a new pull request #3869: URL: https://github.com/apache/carbondata/pull/3869 ### Why is this PR needed? Index creation for long string columns are not yet supported. ### What changes were proposed in this PR? Exceptions are thrown if user tries to create the same. ### Does this PR introduce any user interface change? - No ### Is any new testcase added? - Yes ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
CarbonDataQA1 commented on pull request #3869: URL: https://github.com/apache/carbondata/pull/3869#issuecomment-665263897 ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA1 commented on pull request #3869: URL: https://github.com/apache/carbondata/pull/3869#issuecomment-665577522 ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
akashrn5 commented on a change in pull request #3869: URL: https://github.com/apache/carbondata/pull/3869#discussion_r463434191 ########## File path: index/secondary-index/src/test/scala/org/apache/carbondata/spark/testsuite/secondaryindex/TestCreateIndexTable.scala ########## @@ -370,6 +370,38 @@ class TestCreateIndexTable extends QueryTest with BeforeAndAfterAll { } } + test("index creation on long string columns") { + sql("drop table if exists si_table") Review comment: just create a small table which has varchar column, no need to do any dataload here, as intension is just to throw an exception for long string columns. ########## File path: integration/spark/src/main/scala/org/apache/spark/sql/secondaryindex/command/SICreationCommand.scala ########## @@ -253,6 +253,17 @@ private[sql] case class CarbonCreateSecondaryIndexCommand( throw new ErrorMessage( s"Table [$tableName] under database [$databaseName] is already an index table") } + // creation of index on long string columns are not supported + for (indexColName <- indexModel.columnNames) { + for (dim <- dims) { + if (indexColName.equals(dim.getColName) && dim.getDataType.getName.equals("VARCHAR")) { + throw new ErrorMessage( + s"one or more index columns specified cannot be of Long String property" + + s" column in table $databaseName.$tableName") + } + } + } + Review comment: instead of using a standard for loop, can change to this, > if (dims.filter(dimension => indexModel.columnNames > .contains(dimension.getColName)) > .map(_.getDataType) > .exists(dataType => dataType.equals(DataTypes.VARCHAR)) ) { > throw new ErrorMessage( > s"one or more index columns specified contains long string column" + > s" in table $databaseName.$tableName. SI cannot be created on long string columns") > } > in a more functional way ########## File path: index/secondary-index/src/test/scala/org/apache/carbondata/spark/testsuite/secondaryindex/TestCreateIndexTable.scala ########## @@ -370,6 +370,38 @@ class TestCreateIndexTable extends QueryTest with BeforeAndAfterAll { } } + test("index creation on long string columns") { + sql("drop table if exists si_table") + sql( + s""" + | CREATE TABLE si_table( + | id INT, + | name STRING, + | city STRING, + | salary FLOAT, + | tax DECIMAL(8,2), + | percent double, + | birthday DATE, + | register TIMESTAMP, + | updated TIMESTAMP, + | longstr STRING, + | file struct<school:array<string>, age:int> + | ) + | STORED AS carbondata + | TBLPROPERTIES( + | 'LONG_STRING_COLUMNS'='longstr, name') + | """.stripMargin) + sql( + s"LOAD DATA LOCAL INPATH '$resourcesPath/streamSample_with_long_string.csv' INTO TABLE si_table OPTIONS" + + "('HEADER'='true','COMPLEX_DELIMITER_LEVEL_1'='$', 'COMPLEX_DELIMITER_LEVEL_2'=':')") + + sql("drop index if exists temp_ind on si_table") + val thrown = intercept[Exception] { + sql("create index temp_ind on table si_table (longstr) AS 'carbondata'") + } + assert(thrown.getMessage === "one or more index columns specified cannot be of Long String property column in table default.si_table") Review comment: check `thrown.getMessage.contains("")` instad of `==` ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
akashrn5 commented on pull request #3869: URL: https://github.com/apache/carbondata/pull/3869#issuecomment-666965943 @akkio-97 please create a jira and update PR description ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
akkio-97 commented on a change in pull request #3869: URL: https://github.com/apache/carbondata/pull/3869#discussion_r464318621 ########## File path: integration/spark/src/main/scala/org/apache/spark/sql/secondaryindex/command/SICreationCommand.scala ########## @@ -253,6 +253,17 @@ private[sql] case class CarbonCreateSecondaryIndexCommand( throw new ErrorMessage( s"Table [$tableName] under database [$databaseName] is already an index table") } + // creation of index on long string columns are not supported + for (indexColName <- indexModel.columnNames) { + for (dim <- dims) { + if (indexColName.equals(dim.getColName) && dim.getDataType.getName.equals("VARCHAR")) { + throw new ErrorMessage( + s"one or more index columns specified cannot be of Long String property" + + s" column in table $databaseName.$tableName") + } + } + } + Review comment: done ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
akkio-97 commented on a change in pull request #3869: URL: https://github.com/apache/carbondata/pull/3869#discussion_r464318726 ########## File path: index/secondary-index/src/test/scala/org/apache/carbondata/spark/testsuite/secondaryindex/TestCreateIndexTable.scala ########## @@ -370,6 +370,38 @@ class TestCreateIndexTable extends QueryTest with BeforeAndAfterAll { } } + test("index creation on long string columns") { + sql("drop table if exists si_table") Review comment: done ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
akkio-97 commented on a change in pull request #3869: URL: https://github.com/apache/carbondata/pull/3869#discussion_r464318988 ########## File path: index/secondary-index/src/test/scala/org/apache/carbondata/spark/testsuite/secondaryindex/TestCreateIndexTable.scala ########## @@ -370,6 +370,38 @@ class TestCreateIndexTable extends QueryTest with BeforeAndAfterAll { } } + test("index creation on long string columns") { + sql("drop table if exists si_table") + sql( + s""" + | CREATE TABLE si_table( + | id INT, + | name STRING, + | city STRING, + | salary FLOAT, + | tax DECIMAL(8,2), + | percent double, + | birthday DATE, + | register TIMESTAMP, + | updated TIMESTAMP, + | longstr STRING, + | file struct<school:array<string>, age:int> + | ) + | STORED AS carbondata + | TBLPROPERTIES( + | 'LONG_STRING_COLUMNS'='longstr, name') + | """.stripMargin) + sql( + s"LOAD DATA LOCAL INPATH '$resourcesPath/streamSample_with_long_string.csv' INTO TABLE si_table OPTIONS" + + "('HEADER'='true','COMPLEX_DELIMITER_LEVEL_1'='$', 'COMPLEX_DELIMITER_LEVEL_2'=':')") + + sql("drop index if exists temp_ind on si_table") + val thrown = intercept[Exception] { + sql("create index temp_ind on table si_table (longstr) AS 'carbondata'") + } + assert(thrown.getMessage === "one or more index columns specified cannot be of Long String property column in table default.si_table") Review comment: done ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA1 commented on pull request #3869: URL: https://github.com/apache/carbondata/pull/3869#issuecomment-667996493 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3588/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA1 commented on pull request #3869: URL: https://github.com/apache/carbondata/pull/3869#issuecomment-667998009 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1849/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
akashrn5 commented on pull request #3869: URL: https://github.com/apache/carbondata/pull/3869#issuecomment-668474607 LGTM ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
asfgit closed pull request #3869: URL: https://github.com/apache/carbondata/pull/3869 ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
Free forum by Nabble | Edit this page |