[GitHub] [carbondata] IceMimosa opened a new pull request #3848: [CARBONDATA-3891] Fix loading data will update all segments updateDeltaEndTimestamp

classic Classic list List threaded Threaded
27 messages Options
12
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] IceMimosa opened a new pull request #3848: [CARBONDATA-3891] Fix loading data will update all segments updateDeltaEndTimestamp

GitBox

IceMimosa opened a new pull request #3848:
URL: https://github.com/apache/carbondata/pull/3848


    ### Why is this PR needed?
    Loading Data to the partitioned table will update all segments updateDeltaEndTimestamp,that will cause the driver to clear all segments cache when doing the query.
   
    ### What changes were proposed in this PR?
   
       
    ### Does this PR introduce any user interface change?
    - No
   
    ### Is any new testcase added?
    - Yes TODO
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] IceMimosa commented on pull request #3848: [CARBONDATA-3891] Fix loading data will update all segments updateDeltaEndTimestamp

GitBox

IceMimosa commented on pull request #3848:
URL: https://github.com/apache/carbondata/pull/3848#issuecomment-659312101


   reset please


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3848: [CARBONDATA-3891] Fix loading data will update all segments updateDeltaEndTimestamp

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3848:
URL: https://github.com/apache/carbondata/pull/3848#issuecomment-659376770


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1662/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3848: [CARBONDATA-3891] Fix loading data will update all segments updateDeltaEndTimestamp

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3848:
URL: https://github.com/apache/carbondata/pull/3848#issuecomment-659380516


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3404/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] Indhumathi27 commented on pull request #3848: [CARBONDATA-3891] Fix loading data will update all segments updateDeltaEndTimestamp

GitBox
In reply to this post by GitBox

Indhumathi27 commented on pull request #3848:
URL: https://github.com/apache/carbondata/pull/3848#issuecomment-660840922


   retest this please


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3848: [CARBONDATA-3891] Fix loading data will update all segments updateDeltaEndTimestamp

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3848:
URL: https://github.com/apache/carbondata/pull/3848#issuecomment-660918396


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1692/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3848: [CARBONDATA-3891] Fix loading data will update all segments updateDeltaEndTimestamp

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3848:
URL: https://github.com/apache/carbondata/pull/3848#issuecomment-660922590


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3434/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3848: [CARBONDATA-3891] Fix loading data will update all segments updateDeltaEndTimestamp

GitBox
In reply to this post by GitBox

Indhumathi27 commented on a change in pull request #3848:
URL: https://github.com/apache/carbondata/pull/3848#discussion_r457401996



##########
File path: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonOutputCommitter.java
##########
@@ -234,13 +234,7 @@ private void commitJobFinal(JobContext context, CarbonLoadModel loadModel,
     String segmentsToBeDeleted =
         context.getConfiguration().get(CarbonTableOutputFormat.SEGMENTS_TO_BE_DELETED, "");
     List<Segment> segmentDeleteList = Segment.toSegmentList(segmentsToBeDeleted.split(","), null);
-    Set<Segment> segmentSet = new HashSet<>();
-    if (updateTime != null || uniqueId != null) {
-      segmentSet = new HashSet<>(
-          new SegmentStatusManager(carbonTable.getAbsoluteTableIdentifier(),
-              context.getConfiguration()).getValidAndInvalidSegments(carbonTable.isMV())
-                  .getValidSegments());
-    }
+    Set<Segment> segmentSet = Collections.singleton(loadModel.getSegment());

Review comment:
       segmentSet can be moved inside `  if (updateTime != null || uniqueId != null) {` check. If updateTime and uniqueId is null, this value will be unused




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3848: [CARBONDATA-3891] Fix loading data will update all segments updateDeltaEndTimestamp

GitBox
In reply to this post by GitBox

Indhumathi27 commented on a change in pull request #3848:
URL: https://github.com/apache/carbondata/pull/3848#discussion_r457401996



##########
File path: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonOutputCommitter.java
##########
@@ -234,13 +234,7 @@ private void commitJobFinal(JobContext context, CarbonLoadModel loadModel,
     String segmentsToBeDeleted =
         context.getConfiguration().get(CarbonTableOutputFormat.SEGMENTS_TO_BE_DELETED, "");
     List<Segment> segmentDeleteList = Segment.toSegmentList(segmentsToBeDeleted.split(","), null);
-    Set<Segment> segmentSet = new HashSet<>();
-    if (updateTime != null || uniqueId != null) {
-      segmentSet = new HashSet<>(
-          new SegmentStatusManager(carbonTable.getAbsoluteTableIdentifier(),
-              context.getConfiguration()).getValidAndInvalidSegments(carbonTable.isMV())
-                  .getValidSegments());
-    }
+    Set<Segment> segmentSet = Collections.singleton(loadModel.getSegment());

Review comment:
       segmentSet can be moved inside `  if (updateTime != null || uniqueId != null) {` check. If updateTime and uniqueId is null, this value will be left unused.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] IceMimosa commented on a change in pull request #3848: [CARBONDATA-3891] Fix loading data will update all segments updateDeltaEndTimestamp

GitBox
In reply to this post by GitBox

IceMimosa commented on a change in pull request #3848:
URL: https://github.com/apache/carbondata/pull/3848#discussion_r457424849



##########
File path: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonOutputCommitter.java
##########
@@ -234,13 +234,7 @@ private void commitJobFinal(JobContext context, CarbonLoadModel loadModel,
     String segmentsToBeDeleted =
         context.getConfiguration().get(CarbonTableOutputFormat.SEGMENTS_TO_BE_DELETED, "");
     List<Segment> segmentDeleteList = Segment.toSegmentList(segmentsToBeDeleted.split(","), null);
-    Set<Segment> segmentSet = new HashSet<>();
-    if (updateTime != null || uniqueId != null) {
-      segmentSet = new HashSet<>(
-          new SegmentStatusManager(carbonTable.getAbsoluteTableIdentifier(),
-              context.getConfiguration()).getValidAndInvalidSegments(carbonTable.isMV())
-                  .getValidSegments());
-    }
+    Set<Segment> segmentSet = Collections.singleton(loadModel.getSegment());

Review comment:
       @Indhumathi27  Thanks for reviewing, I will improve this.
   In fact, I have not fully grasped the scope of influence, so I'm working on this and improving the unit tests.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3848: [CARBONDATA-3891] Fix loading data will update all segments updateDeltaEndTimestamp

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3848:
URL: https://github.com/apache/carbondata/pull/3848#issuecomment-661238447


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3446/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3848: [CARBONDATA-3891] Fix loading data will update all segments updateDeltaEndTimestamp

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3848:
URL: https://github.com/apache/carbondata/pull/3848#issuecomment-661245090


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1704/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3848: [CARBONDATA-3891] Fix loading data will update all segments updateDeltaEndTimestamp

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3848:
URL: https://github.com/apache/carbondata/pull/3848#issuecomment-664820504






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] Indhumathi27 commented on pull request #3848: [CARBONDATA-3891] Fix loading data will update all segments updateDeltaEndTimestamp

GitBox
In reply to this post by GitBox

Indhumathi27 commented on pull request #3848:
URL: https://github.com/apache/carbondata/pull/3848#issuecomment-664772987






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] IceMimosa commented on pull request #3848: [CARBONDATA-3891] Fix loading data will update all segments updateDeltaEndTimestamp

GitBox
In reply to this post by GitBox

IceMimosa commented on pull request #3848:
URL: https://github.com/apache/carbondata/pull/3848#issuecomment-667908308


   Add unit test, cc @Indhumathi27


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3848: [CARBONDATA-3891] Fix loading data will update all segments updateDeltaEndTimestamp

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3848:
URL: https://github.com/apache/carbondata/pull/3848#issuecomment-667974119


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3586/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3848: [CARBONDATA-3891] Fix loading data will update all segments updateDeltaEndTimestamp

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3848:
URL: https://github.com/apache/carbondata/pull/3848#issuecomment-667977155


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1847/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3848: [CARBONDATA-3891] Fix loading data will update all segments updateDeltaEndTimestamp

GitBox
In reply to this post by GitBox

ajantha-bhat commented on a change in pull request #3848:
URL: https://github.com/apache/carbondata/pull/3848#discussion_r489988970



##########
File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/segment/ShowSegmentTestCase.scala
##########
@@ -224,6 +224,37 @@ class ShowSegmentTestCase extends QueryTest with BeforeAndAfterAll {
     sql("drop table if exists a")
   }
 
+  test("test loading data into partitioned table with segment's updateDeltaEndTimestamp not change") {

Review comment:
       Thanks for working on it. Just a small suggestion. This showSegment testcase is meant for only segment display related test case.
   
   can you move this test case to existing `InsertIntoCarbonTableTestCase` file ?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] IceMimosa commented on a change in pull request #3848: [CARBONDATA-3891] Fix loading data will update all segments updateDeltaEndTimestamp

GitBox
In reply to this post by GitBox

IceMimosa commented on a change in pull request #3848:
URL: https://github.com/apache/carbondata/pull/3848#discussion_r490043882



##########
File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/segment/ShowSegmentTestCase.scala
##########
@@ -224,6 +224,37 @@ class ShowSegmentTestCase extends QueryTest with BeforeAndAfterAll {
     sql("drop table if exists a")
   }
 
+  test("test loading data into partitioned table with segment's updateDeltaEndTimestamp not change") {

Review comment:
       Thanks for reviewing, done.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ajantha-bhat commented on pull request #3848: [CARBONDATA-3891] Fix loading data will update all segments updateDeltaEndTimestamp

GitBox
In reply to this post by GitBox

ajantha-bhat commented on pull request #3848:
URL: https://github.com/apache/carbondata/pull/3848#issuecomment-694122754


   LGTM.
   I will merge today once the build passes. Thank you for the contribution.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


12