[GitHub] [carbondata] akkio-97 opened a new pull request #4050: [WIP] select count marked for delete segments


[GitHub] [carbondata] akkio-97 commented on a change in pull request #4050: [CARBONDATA-4080] Wrong results for select count on invalid segments

GitBox

akkio-97 commented on a change in pull request #4050:
URL: https://github.com/apache/carbondata/pull/4050#discussion_r540038288



##########
File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/segmentreading/TestSegmentReading.scala
##########
@@ -420,4 +421,69 @@ class TestSegmentReading extends QueryTest with BeforeAndAfterAll {
 
     sql("set spark.sql.adaptive.enabled=false")
   }
+
+  test("Read marked for delete segments") {
+    sql("drop table if exists carbonTable")
+    sql(
+      "create table carbonTable(a string, b int, c string) stored as carbondata ")
+    sql("insert into carbonTable values ('k',1,'k'), ('k',1,'b')")
+    sql("insert into carbonTable values ('a',2,'a')")
+    sql("insert into carbonTable values ('b',2,'b'),('b',2,'b')")
+    sql("insert into carbonTable values ('c',2,'c')")
+
+    sql("delete from table carbonTable where segment.id in (0,3)")
+    sql("set carbon.input.segments.default.carbonTable = 0,2,3")
+
+    checkAnswer(sql("select count(*) from carbonTable"), Seq(Row(2)))
+  }
+
+  test("Read marked for delete segments after SI creation") {
+    sql("drop table if exists carbonTable")
+    sql(
+      "create table carbonTable(a string, b int, c string) stored as carbondata ")
+    sql("insert into carbonTable values ('k',1,'k'), ('k',1,'b')")
+    sql("insert into carbonTable values ('a',2,'a')")
+    sql("insert into carbonTable values ('b',2,'b'),('b',2,'b')")
+    sql("insert into carbonTable values ('c',2,'c')")
+
+    sql("drop index if exists indextable1 on carbonTable")
+    sql("create index indextable1 on table carbonTable (c) AS 'carbondata'")
+
+    sql("delete from table carbonTable where segment.id in (0,3)")
+    sql("set carbon.input.segments.default.carbonTable = 0,2,3")
+
+    checkAnswer(sql("select count(*) from carbonTable"), Seq(Row(2)))
+  }
+
+  test("Read compacted segments") {
+    sql("drop table if exists carbonTable")
+    sql(
+      "create table carbonTable(a string, b int, c string) stored as carbondata")
+    sql("insert into carbonTable values ('k',5,'k'), ('k',5,'b')")
+    sql("insert into carbonTable values ('a',1,'a')")
+    sql("insert into carbonTable values ('b',2,'b'),('b',2,'b')")
+    sql("insert into carbonTable values ('c',3,'c')")
+    sql("alter table carbonTable compact 'major'")
+
+    sql("set carbon.input.segments.default.carbonTable = 0,1,2,3,0.1")
+    checkAnswer(sql("select count(*) from carbonTable"), Seq(Row(6)))
+  }
+
+  test("Read compacted segments after SI creation") {
+    sql("drop table if exists carbonTable")
+    sql(
+      "create table carbonTable(a string, b int, c string) stored as carbondata")
+    sql("insert into carbonTable values ('k',5,'k'), ('k',5,'b')")
+    sql("insert into carbonTable values ('a',1,'a')")
+    sql("insert into carbonTable values ('b',2,'b'),('b',2,'b')")
+    sql("insert into carbonTable values ('c',3,'c')")
+
+    sql("drop index if exists indextable1 on carbonTable")
+    sql("create index indextable1 on table carbonTable (c) AS 'carbondata'")
+
+    sql("alter table carbonTable compact 'major'")
+    sql("set carbon.input.segments.default.carbonTable = 0,1,2,3,0.1")
+    checkAnswer(sql("select count(*) from carbonTable"), Seq(Row(6)))

Review comment:
       done
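
A note on the expected counts in the tests above: the sketch below is a plain-Scala model of the arithmetic only, not CarbonData's implementation. `Segment`, `Status`, and `visibleCount` are hypothetical names introduced for illustration. It assumes, consistent with the expected values in the diff, that rows in a requested segment are counted only while that segment is still valid, and that the major compaction merges segments 0 to 3 into 0.1.

```scala
// Hypothetical model of segment filtering; not CarbonData's actual code.
object SegmentCountSketch {
  sealed trait Status
  case object Success extends Status
  case object MarkedForDelete extends Status
  case object Compacted extends Status

  final case class Segment(id: String, rows: Long, status: Status)

  // Count rows only in requested segments that are still valid (Success).
  def visibleCount(segments: Seq[Segment], requested: Set[String]): Long =
    segments
      .filter(s => requested.contains(s.id) && s.status == Success)
      .map(_.rows)
      .sum

  // "Read marked for delete segments": four loads, then 0 and 3 are deleted.
  val afterDelete: Seq[Segment] = Seq(
    Segment("0", 2, MarkedForDelete),
    Segment("1", 1, Success),
    Segment("2", 2, Success),
    Segment("3", 1, MarkedForDelete))

  // "Read compacted segments": assumed result of the major compaction.
  val afterCompaction: Seq[Segment] = Seq(
    Segment("0", 2, Compacted),
    Segment("1", 1, Compacted),
    Segment("2", 2, Compacted),
    Segment("3", 1, Compacted),
    Segment("0.1", 6, Success))

  def main(args: Array[String]): Unit = {
    // Mirrors: set carbon.input.segments.default.carbonTable = 0,2,3
    println(visibleCount(afterDelete, Set("0", "2", "3"))) // prints 2
    // Mirrors: set carbon.input.segments.default.carbonTable = 0,1,2,3,0.1
    println(visibleCount(afterCompaction, Set("0", "1", "2", "3", "0.1"))) // prints 6
  }
}
```

Under this model only segment 2 contributes in the delete case (2 rows), and only segment 0.1 contributes in the compaction case (2 + 1 + 2 + 1 = 6 rows), which matches `Seq(Row(2))` and `Seq(Row(6))` in the tests.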




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]



[GitHub] [carbondata] akkio-97 commented on pull request #4050: [CARBONDATA-4080] Wrong results for select count on invalid segments

GitBox
In reply to this post by GitBox

akkio-97 commented on pull request #4050:
URL: https://github.com/apache/carbondata/pull/4050#issuecomment-742420618


   > @akkio-97, please add a test case for a partition table as well
   
   done



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4050: [CARBONDATA-4080] Wrong results for select count on invalid segments

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4050:
URL: https://github.com/apache/carbondata/pull/4050#issuecomment-742470938


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5140/
   



[GitHub] [carbondata] akkio-97 commented on pull request #4050: [CARBONDATA-4080] Wrong results for select count on invalid segments

GitBox
In reply to this post by GitBox

akkio-97 commented on pull request #4050:
URL: https://github.com/apache/carbondata/pull/4050#issuecomment-742476298


   > > @akashrn5 Please check whether this fix will be ok for SI and MV.
   >
   > For SI, it is better to have a test case. For MV, if set segments is done during the query, then the query will not hit the MV, but please add a test case that sets segments and creates an MV. Please check the MVCreatetest class and add it there.
   
   done



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4050: [CARBONDATA-4080] Wrong results for select count on invalid segments

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4050:
URL: https://github.com/apache/carbondata/pull/4050#issuecomment-742523658


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5142/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4050: [CARBONDATA-4080] Wrong results for select count on invalid segments

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4050:
URL: https://github.com/apache/carbondata/pull/4050#issuecomment-742529889


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3380/
   



[GitHub] [carbondata] kunal642 commented on pull request #4050: [CARBONDATA-4080] Wrong results for select count on invalid segments

GitBox
In reply to this post by GitBox

kunal642 commented on pull request #4050:
URL: https://github.com/apache/carbondata/pull/4050#issuecomment-743011865


   LGTM



[GitHub] [carbondata] kunal642 commented on pull request #4050: [CARBONDATA-4080] Wrong results for select count on invalid segments

GitBox
In reply to this post by GitBox

kunal642 commented on pull request #4050:
URL: https://github.com/apache/carbondata/pull/4050#issuecomment-747391650


   retest this please



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4050: [CARBONDATA-4080] Wrong results for select count on invalid segments

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4050:
URL: https://github.com/apache/carbondata/pull/4050#issuecomment-747436515


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5197/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4050: [CARBONDATA-4080] Wrong results for select count on invalid segments

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4050:
URL: https://github.com/apache/carbondata/pull/4050#issuecomment-747439442


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3438/
   



[GitHub] [carbondata] asfgit closed pull request #4050: [CARBONDATA-4080] Wrong results for select count on invalid segments

GitBox
In reply to this post by GitBox

asfgit closed pull request #4050:
URL: https://github.com/apache/carbondata/pull/4050


   

