[GitHub] [carbondata] akkio-97 opened a new pull request #4050: [WIP] select count marked for delete segments

classic Classic list List threaded Threaded
31 messages Options
12
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] akkio-97 opened a new pull request #4050: [WIP] select count marked for delete segments

GitBox

akkio-97 opened a new pull request #4050:
URL: https://github.com/apache/carbondata/pull/4050


    ### Why is this PR needed?
   
   
    ### What changes were proposed in this PR?
   
       
    ### Does this PR introduce any user interface change?
    - No
    - Yes. (please explain the change and update document)
   
    ### Is any new testcase added?
    - No
    - Yes
   
       
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4050: [WIP] select count marked for delete segments

GitBox

CarbonDataQA2 commented on pull request #4050:
URL: https://github.com/apache/carbondata/pull/4050#issuecomment-741709860


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5124/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4050: [WIP] select count marked for delete segments

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4050:
URL: https://github.com/apache/carbondata/pull/4050#issuecomment-741712233


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3362/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4050: [WIP] select count marked for delete segments

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4050:
URL: https://github.com/apache/carbondata/pull/4050#issuecomment-741807884


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3367/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4050: [WIP] select count marked for delete segments

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4050:
URL: https://github.com/apache/carbondata/pull/4050#issuecomment-741809268


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5129/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] akkio-97 commented on pull request #4050: [WIP] select count marked for delete segments

GitBox
In reply to this post by GitBox

akkio-97 commented on pull request #4050:
URL: https://github.com/apache/carbondata/pull/4050#issuecomment-741815756


   retest this please


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4050: [WIP] select count marked for delete segments

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4050:
URL: https://github.com/apache/carbondata/pull/4050#issuecomment-741818543


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5130/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4050: [WIP] select count marked for delete segments

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4050:
URL: https://github.com/apache/carbondata/pull/4050#issuecomment-741820892


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3368/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4050: [WIP] select count marked for delete segments

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4050:
URL: https://github.com/apache/carbondata/pull/4050#issuecomment-742011060


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5132/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] akkio-97 commented on pull request #4050: [WIP] select count marked for delete segments

GitBox
In reply to this post by GitBox

akkio-97 commented on pull request #4050:
URL: https://github.com/apache/carbondata/pull/4050#issuecomment-742016131


   retest this please


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4050: [WIP] select count marked for delete segments

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4050:
URL: https://github.com/apache/carbondata/pull/4050#issuecomment-742081476


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5133/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4050: [WIP] select count marked for delete segments

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4050:
URL: https://github.com/apache/carbondata/pull/4050#issuecomment-742082179


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3371/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] kunal642 commented on pull request #4050: [WIP] select count marked for delete segments

GitBox
In reply to this post by GitBox

kunal642 commented on pull request #4050:
URL: https://github.com/apache/carbondata/pull/4050#issuecomment-742258819


   @akashrn5 Please check whether this fix will be ok for SI and MV.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] kunal642 commented on a change in pull request #4050: [WIP] select count marked for delete segments

GitBox
In reply to this post by GitBox

kunal642 commented on a change in pull request #4050:
URL: https://github.com/apache/carbondata/pull/4050#discussion_r539864844



##########
File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/segmentreading/TestSegmentReading.scala
##########
@@ -420,4 +421,40 @@ class TestSegmentReading extends QueryTest with BeforeAndAfterAll {
 
     sql("set spark.sql.adaptive.enabled=false")
   }
+
+  test("Read marked for delete segments") {
+    sql("drop table if exists carbonTable")
+    sql(
+      "create table carbonTable(a string, b int, c string) stored as carbondata tblproperties" +
+      "('carbon.clean.file.force.allowed' = true)")
+    sql("insert into carbonTable values ('k',1,'k'), ('k',1,'b')")
+    sql("insert into carbonTable values ('a',2,'a')")
+    sql("insert into carbonTable values ('b',2,'b'),('b',2,'b')")
+    sql("insert into carbonTable values ('c',2,'c')")
+
+    sql("delete from table carbonTable where segment.id in (0,3)")
+    sql("set carbon.input.segments.default.carbonTable = 0,2,3")
+    sql("show segments for table carbonTable").show()

Review comment:
       please remove this

##########
File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/segmentreading/TestSegmentReading.scala
##########
@@ -420,4 +421,40 @@ class TestSegmentReading extends QueryTest with BeforeAndAfterAll {
 
     sql("set spark.sql.adaptive.enabled=false")
   }
+
+  test("Read marked for delete segments") {
+    sql("drop table if exists carbonTable")
+    sql(
+      "create table carbonTable(a string, b int, c string) stored as carbondata tblproperties" +
+      "('carbon.clean.file.force.allowed' = true)")
+    sql("insert into carbonTable values ('k',1,'k'), ('k',1,'b')")
+    sql("insert into carbonTable values ('a',2,'a')")
+    sql("insert into carbonTable values ('b',2,'b'),('b',2,'b')")
+    sql("insert into carbonTable values ('c',2,'c')")
+
+    sql("delete from table carbonTable where segment.id in (0,3)")
+    sql("set carbon.input.segments.default.carbonTable = 0,2,3")
+    sql("show segments for table carbonTable").show()
+
+    checkAnswer(sql("select count(*) from carbonTable"), Seq(Row(2)))
+
+  }
+
+  test("Test compacted segments") {
+    sql("drop table if exists carbonTable")
+    sql(
+      "create table carbonTable(a string, b int, c string) stored as carbondata tblproperties" +
+      "('carbon.clean.file.force.allowed' = true)")
+    sql("insert into carbonTable values ('k',1,'k'), ('k',1,'b')")
+    sql("insert into carbonTable values ('a',2,'a')")
+    sql("insert into carbonTable values ('b',2,'b'),('b',2,'b')")
+    sql("insert into carbonTable values ('c',2,'c')")
+    sql("alter table carbonTable compact 'major'")
+
+    sql("set carbon.input.segments.default.carbonTable = 0,1,2,3,0.1")
+    sql("show segments for table carbonTable").show()

Review comment:
       please remove this




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] akkio-97 commented on a change in pull request #4050: [WIP] select count marked for delete segments

GitBox
In reply to this post by GitBox

akkio-97 commented on a change in pull request #4050:
URL: https://github.com/apache/carbondata/pull/4050#discussion_r539955200



##########
File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/segmentreading/TestSegmentReading.scala
##########
@@ -420,4 +421,40 @@ class TestSegmentReading extends QueryTest with BeforeAndAfterAll {
 
     sql("set spark.sql.adaptive.enabled=false")
   }
+
+  test("Read marked for delete segments") {
+    sql("drop table if exists carbonTable")
+    sql(
+      "create table carbonTable(a string, b int, c string) stored as carbondata tblproperties" +
+      "('carbon.clean.file.force.allowed' = true)")
+    sql("insert into carbonTable values ('k',1,'k'), ('k',1,'b')")
+    sql("insert into carbonTable values ('a',2,'a')")
+    sql("insert into carbonTable values ('b',2,'b'),('b',2,'b')")
+    sql("insert into carbonTable values ('c',2,'c')")
+
+    sql("delete from table carbonTable where segment.id in (0,3)")
+    sql("set carbon.input.segments.default.carbonTable = 0,2,3")
+    sql("show segments for table carbonTable").show()

Review comment:
       done

##########
File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/segmentreading/TestSegmentReading.scala
##########
@@ -420,4 +421,40 @@ class TestSegmentReading extends QueryTest with BeforeAndAfterAll {
 
     sql("set spark.sql.adaptive.enabled=false")
   }
+
+  test("Read marked for delete segments") {
+    sql("drop table if exists carbonTable")
+    sql(
+      "create table carbonTable(a string, b int, c string) stored as carbondata tblproperties" +
+      "('carbon.clean.file.force.allowed' = true)")
+    sql("insert into carbonTable values ('k',1,'k'), ('k',1,'b')")
+    sql("insert into carbonTable values ('a',2,'a')")
+    sql("insert into carbonTable values ('b',2,'b'),('b',2,'b')")
+    sql("insert into carbonTable values ('c',2,'c')")
+
+    sql("delete from table carbonTable where segment.id in (0,3)")
+    sql("set carbon.input.segments.default.carbonTable = 0,2,3")
+    sql("show segments for table carbonTable").show()
+
+    checkAnswer(sql("select count(*) from carbonTable"), Seq(Row(2)))
+
+  }
+
+  test("Test compacted segments") {
+    sql("drop table if exists carbonTable")
+    sql(
+      "create table carbonTable(a string, b int, c string) stored as carbondata tblproperties" +
+      "('carbon.clean.file.force.allowed' = true)")
+    sql("insert into carbonTable values ('k',1,'k'), ('k',1,'b')")
+    sql("insert into carbonTable values ('a',2,'a')")
+    sql("insert into carbonTable values ('b',2,'b'),('b',2,'b')")
+    sql("insert into carbonTable values ('c',2,'c')")
+    sql("alter table carbonTable compact 'major'")
+
+    sql("set carbon.input.segments.default.carbonTable = 0,1,2,3,0.1")
+    sql("show segments for table carbonTable").show()

Review comment:
       done




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] vikramahuja1001 commented on pull request #4050: [CARBONDATA-4080] Wrong results for select count on invalid segments

GitBox
In reply to this post by GitBox

vikramahuja1001 commented on pull request #4050:
URL: https://github.com/apache/carbondata/pull/4050#issuecomment-742371418


   @akkio-97 , please add a test case for partition table as well


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #4050: [CARBONDATA-4080] Wrong results for select count on invalid segments

GitBox
In reply to this post by GitBox

Indhumathi27 commented on a change in pull request #4050:
URL: https://github.com/apache/carbondata/pull/4050#discussion_r540006038



##########
File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/segmentreading/TestSegmentReading.scala
##########
@@ -420,4 +421,69 @@ class TestSegmentReading extends QueryTest with BeforeAndAfterAll {
 
     sql("set spark.sql.adaptive.enabled=false")
   }
+
+  test("Read marked for delete segments") {
+    sql("drop table if exists carbonTable")
+    sql(
+      "create table carbonTable(a string, b int, c string) stored as carbondata ")
+    sql("insert into carbonTable values ('k',1,'k'), ('k',1,'b')")
+    sql("insert into carbonTable values ('a',2,'a')")
+    sql("insert into carbonTable values ('b',2,'b'),('b',2,'b')")
+    sql("insert into carbonTable values ('c',2,'c')")
+
+    sql("delete from table carbonTable where segment.id in (0,3)")
+    sql("set carbon.input.segments.default.carbonTable = 0,2,3")
+
+    checkAnswer(sql("select count(*) from carbonTable"), Seq(Row(2)))
+  }
+
+  test("Read marked for delete segments after SI creation") {
+    sql("drop table if exists carbonTable")
+    sql(
+      "create table carbonTable(a string, b int, c string) stored as carbondata ")
+    sql("insert into carbonTable values ('k',1,'k'), ('k',1,'b')")
+    sql("insert into carbonTable values ('a',2,'a')")
+    sql("insert into carbonTable values ('b',2,'b'),('b',2,'b')")
+    sql("insert into carbonTable values ('c',2,'c')")
+
+    sql("drop index if exists indextable1 on carbonTable")
+    sql("create index indextable1 on table carbonTable (c) AS 'carbondata'")
+
+    sql("delete from table carbonTable where segment.id in (0,3)")
+    sql("set carbon.input.segments.default.carbonTable = 0,2,3")
+
+    checkAnswer(sql("select count(*) from carbonTable"), Seq(Row(2)))
+  }
+
+  test("Read compacted segments") {
+    sql("drop table if exists carbonTable")
+    sql(
+      "create table carbonTable(a string, b int, c string) stored as carbondata")
+    sql("insert into carbonTable values ('k',5,'k'), ('k',5,'b')")
+    sql("insert into carbonTable values ('a',1,'a')")
+    sql("insert into carbonTable values ('b',2,'b'),('b',2,'b')")
+    sql("insert into carbonTable values ('c',3,'c')")
+    sql("alter table carbonTable compact 'major'")
+
+    sql("set carbon.input.segments.default.carbonTable = 0,1,2,3,0.1")
+    checkAnswer(sql("select count(*) from carbonTable"), Seq(Row(6)))
+  }
+
+  test("Read compacted segments after SI creation") {
+    sql("drop table if exists carbonTable")
+    sql(
+      "create table carbonTable(a string, b int, c string) stored as carbondata")
+    sql("insert into carbonTable values ('k',5,'k'), ('k',5,'b')")
+    sql("insert into carbonTable values ('a',1,'a')")
+    sql("insert into carbonTable values ('b',2,'b'),('b',2,'b')")
+    sql("insert into carbonTable values ('c',3,'c')")
+
+    sql("drop index if exists indextable1 on carbonTable")
+    sql("create index indextable1 on table carbonTable (c) AS 'carbondata'")
+
+    sql("alter table carbonTable compact 'major'")
+    sql("set carbon.input.segments.default.carbonTable = 0,1,2,3,0.1")
+    checkAnswer(sql("select count(*) from carbonTable"), Seq(Row(6)))

Review comment:
       add a query check which has filter on SI column




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] akashrn5 commented on pull request #4050: [CARBONDATA-4080] Wrong results for select count on invalid segments

GitBox
In reply to this post by GitBox

akashrn5 commented on pull request #4050:
URL: https://github.com/apache/carbondata/pull/4050#issuecomment-742404913


   > @akashrn5 Please check whether this fix will be ok for SI and MV.
   
   for SI , better to have a test case and for MV during query if set segments is done, then it will not hit query, but please add a test case which set segments and create MV. Please check in MVCreatetest class and add


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4050: [CARBONDATA-4080] Wrong results for select count on invalid segments

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4050:
URL: https://github.com/apache/carbondata/pull/4050#issuecomment-742406446


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5136/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4050: [CARBONDATA-4080] Wrong results for select count on invalid segments

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4050:
URL: https://github.com/apache/carbondata/pull/4050#issuecomment-742407711


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3374/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


12