ShreelekhyaG opened a new pull request #4112:
URL: https://github.com/apache/carbondata/pull/4112

### Why is this PR needed?
A query that uses SI returns incorrect results after adding a partition based on an empty location to a partitioned table.

### What changes were proposed in this PR?
While creating the block id, get the segment number from the file name for an external partition. This block id is added to the SI and used for pruning. To identify an external partition during the compaction process, check whether the file path starts with the table path (filepath.startsWith(tablepath)) instead of checking against the load metadata path.

### Does this PR introduce any user interface change?
- No

### Is any new testcase added?
- Yes
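For readers unfamiliar with the change, the snippet below is a minimal sketch of the two ideas in the description; it is not the actual CarbonData code. The helper names, the table path, and the data-file-name pattern (segment number followed by a trailing timestamp before the `.carbondata` suffix) are assumptions made only for illustration.

```scala
// Hypothetical sketch, NOT the CarbonData implementation.
// Idea 1: a partition is "external" when its files live outside the table path.
// Idea 2: for such files, derive the segment number from the data file name
//         instead of the load metadata path, and use it when building the block id for SI.
object ExternalPartitionBlockIdSketch {

  // External-partition check described in the PR: compare the file path prefix with the table path.
  def isExternalPartition(filePath: String, tablePath: String): Boolean =
    !filePath.startsWith(tablePath)

  // Assumed file-name layout for illustration only: "...-<segmentNo>-<timestamp>.carbondata".
  private val segmentNoPattern = """-(\d+)-\d+\.carbondata$""".r

  // Pull the segment number out of the file name.
  def segmentNoFromFileName(fileName: String): Option[String] =
    segmentNoPattern.findFirstMatchIn(fileName).map(_.group(1))

  def main(args: Array[String]): Unit = {
    val tablePath = "/warehouse/db/partition_table"                              // assumed path
    val externalFile = "/tmp/def/part-0-0_batchno0-0-2-1616599000000.carbondata" // assumed name
    println(isExternalPartition(externalFile, tablePath))        // true
    println(segmentNoFromFileName(externalFile.split('/').last)) // Some(2)
  }
}
```

The path-prefix check is what lets compaction tell rows loaded into an external location apart from rows stored under the table path, without consulting the load metadata path.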
CarbonDataQA2 commented on pull request #4112:
URL: https://github.com/apache/carbondata/pull/4112#issuecomment-804811963

Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/3332/
CarbonDataQA2 commented on pull request #4112:
URL: https://github.com/apache/carbondata/pull/4112#issuecomment-804813300

Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5084/
Indhumathi27 commented on a change in pull request #4112:
URL: https://github.com/apache/carbondata/pull/4112#discussion_r599523297

##########
File path: index/secondary-index/src/test/scala/org/apache/carbondata/spark/testsuite/secondaryindex/TestSIWithPartition.scala
##########

@@ -414,6 +414,7 @@ class TestSIWithPartition extends QueryTest with BeforeAndAfterAll {
     checkAnswer(extSegmentQuery, Seq(Row(2, "red", "def"), Row(5, "red", "abc")))
     assert(extSegmentQuery.queryExecution.executedPlan.isInstanceOf[BroadCastSIFilterPushJoin])
     sql("drop table if exists partition_table")
+    FileFactory.deleteAllCarbonFilesOfDir(FileFactory.getCarbonFile(sdkWritePath))

Review comment:
    can you add insert into existing external partition also
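The kind of "insert into the existing external partition" check being asked for might look roughly like the sketch below. This is a hedged illustration only, reusing `sql`, `checkAnswer`, `Row`, and the table name from the diff above; the inserted values and expected rows are assumptions, not the test that was actually merged.

```scala
// Hypothetical follow-up check: insert one more row into the already-added external
// partition (assumed here to be email='def'), then verify the SI-backed filter query
// returns both the old and the new row.
sql("insert into partition_table select 6,'red','def'")
val afterInsert = sql("select * from partition_table where name = 'red'")
checkAnswer(afterInsert,
  Seq(Row(2, "red", "def"), Row(5, "red", "abc"), Row(6, "red", "def")))
assert(afterInsert.queryExecution.executedPlan.isInstanceOf[BroadCastSIFilterPushJoin])
```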
Indhumathi27 commented on a change in pull request #4112:
URL: https://github.com/apache/carbondata/pull/4112#discussion_r599530878

##########
File path: index/secondary-index/src/test/scala/org/apache/carbondata/spark/testsuite/secondaryindex/TestSIWithPartition.scala
##########

@@ -460,6 +461,60 @@ class TestSIWithPartition extends QueryTest with BeforeAndAfterAll {
       Row(2, "red", "def2", 22), Row(5, "red", "abc", 22)))
     assert(extSegmentQuery.queryExecution.executedPlan.isInstanceOf[BroadCastSIFilterPushJoin])
     sql("drop table if exists partition_table")
+    FileFactory.deleteAllCarbonFilesOfDir(FileFactory.getCarbonFile(sdkWritePath1))
+    FileFactory.deleteAllCarbonFilesOfDir(FileFactory.getCarbonFile(sdkWritePath2))
+  }
+
+  test("test si with add partition based on empty location on partition table") {
+    sql("drop table if exists partitionTable")
+    sql(
+      """create table partition_table (id int,name String) partitioned by(email string)
+        stored as carbondata""".stripMargin)
+    sql("CREATE INDEX partitionTable_si on table partition_table (name) as 'carbondata'")
+    sql("insert into partition_table select 1,'blue','abc'")
+    val location = target + "/" + "def"
+    FileFactory.deleteAllCarbonFilesOfDir(FileFactory.getCarbonFile(location))
+    sql(s"""alter table partition_table add partition (email='def') location '$location'""")
+    sql("insert into partition_table select 2,'red','def'")
+    var extSegmentQuery = sql("select * from partition_table where name = 'red'")
+    checkAnswer(extSegmentQuery, Seq(Row(2, "red", "def")))
+    sql("insert into partition_table select 4,'grey','bcd'")
+    sql("insert into partition_table select 5,'red','abc'")
+    sql("alter table partition_table compact 'minor'")
+    extSegmentQuery = sql("select * from partition_table where name = 'red'")
+    checkAnswer(extSegmentQuery, Seq(Row(2, "red", "def"), Row(5, "red", "abc")))
+    assert(extSegmentQuery.queryExecution.executedPlan.isInstanceOf[BroadCastSIFilterPushJoin])
+    sql("drop table if exists partition_table")
+    FileFactory.deleteAllCarbonFilesOfDir(FileFactory.getCarbonFile(location))
+  }
+
+  test("test si with add multiple partitions based on empty location on partition table") {
+    sql("drop table if exists partition_table")
+    sql("create table partition_table (id int,name String) " +
+        "partitioned by(email string, age int) stored as carbondata")
+    sql("insert into partition_table select 1,'blue','abc', 20")
+    sql("CREATE INDEX partitionTable_si on table partition_table (name) as 'carbondata'")
+    val location1 = target + "/" + "def"
+    val location2 = target + "/" + "def2"
+    FileFactory.deleteAllCarbonFilesOfDir(FileFactory.getCarbonFile(location1))
+    FileFactory.deleteAllCarbonFilesOfDir(FileFactory.getCarbonFile(location2))
+    sql(

Review comment:
    please move these changes to existing testcase and add drop external partition scenario also
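Similarly, the requested drop-external-partition scenario might look roughly like the sketch below. Again, this is an assumption-laden illustration reusing names from the diff above, not the test that was merged; the expected rows and the plan assertion are assumptions.

```scala
// Hypothetical extension of the first test above: drop the external partition and verify
// the SI-backed query no longer returns rows from the dropped location.
sql("alter table partition_table drop partition (email='def')")
val afterDrop = sql("select * from partition_table where name = 'red'")
// Only the row written to the regular partition (email='abc') should remain.
checkAnswer(afterDrop, Seq(Row(5, "red", "abc")))
assert(afterDrop.queryExecution.executedPlan.isInstanceOf[BroadCastSIFilterPushJoin])
```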
ShreelekhyaG commented on a change in pull request #4112:
URL: https://github.com/apache/carbondata/pull/4112#discussion_r599576453

##########
File path: index/secondary-index/src/test/scala/org/apache/carbondata/spark/testsuite/secondaryindex/TestSIWithPartition.scala
##########

Review comment (on moving the new tests into the existing testcase and covering drop of an external partition):
    Done

##########
File path: index/secondary-index/src/test/scala/org/apache/carbondata/spark/testsuite/secondaryindex/TestSIWithPartition.scala
##########

Review comment (on adding an insert into the existing external partition):
    Done
CarbonDataQA2 commented on pull request #4112:
URL: https://github.com/apache/carbondata/pull/4112#issuecomment-804984214

Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5089/
CarbonDataQA2 commented on pull request #4112:
URL: https://github.com/apache/carbondata/pull/4112#issuecomment-804987718

Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/3337/
Indhumathi27 commented on pull request #4112:
URL: https://github.com/apache/carbondata/pull/4112#issuecomment-804996033

LGTM
asfgit closed pull request #4112:
URL: https://github.com/apache/carbondata/pull/4112