[GitHub] [carbondata] vikramahuja1001 opened a new pull request #4109: [WIP] Fix various concurrent issues with clean files


[GitHub] [carbondata] vikramahuja1001 opened a new pull request #4109: [WIP] Fix various concurrent issues with clean files

GitBox

vikramahuja1001 opened a new pull request #4109:
URL: https://github.com/apache/carbondata/pull/4109


    ### Why is this PR needed?
   
   
    ### What changes were proposed in this PR?
   
       
    ### Does this PR introduce any user interface change?
    - No
    - Yes. (please explain the change and update document)
   
    ### Is any new testcase added?
    - No
    - Yes
   
       
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4109: [WIP] Fix various concurrent issues with clean files

GitBox

CarbonDataQA2 commented on pull request #4109:
URL: https://github.com/apache/carbondata/pull/4109#issuecomment-800965166


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3808/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4109: [WIP] Fix various concurrent issues with clean files

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4109:
URL: https://github.com/apache/carbondata/pull/4109#issuecomment-800966836


   Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5574/
   



[GitHub] [carbondata] kunal642 commented on a change in pull request #4109: [WIP] Fix various concurrent issues with clean files

GitBox
In reply to this post by GitBox

kunal642 commented on a change in pull request #4109:
URL: https://github.com/apache/carbondata/pull/4109#discussion_r596681621



##########
File path: core/src/main/java/org/apache/carbondata/core/util/DeleteLoadFolders.java
##########
@@ -118,14 +118,16 @@ private static void physicalFactAndMeasureMetadataDeletion(CarbonTable carbonTab
       if (canDeleteThisLoad(oneLoad, isForceDelete, cleanStaleInProgress)) {
         try {
           if (oneLoad.getSegmentFile() != null) {
-            String tablePath = carbonTable.getAbsoluteTableIdentifier().getTablePath();
-            Segment segment = new Segment(oneLoad.getLoadName(), oneLoad.getSegmentFile());
-            // No need to delete physical data for external segments.
-            if (oneLoad.getPath() == null || oneLoad.getPath().equalsIgnoreCase("NA")) {
-              SegmentFileStore.deleteSegment(tablePath, segment, specs, updateStatusManager);
+            if (canSegmentLockBeAcquired(oneLoad, carbonTable.getAbsoluteTableIdentifier())) {

Review comment:
       Move this method call into canDeleteThisLoad(), and refactor the other call sites as well.
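
The suggestion above can be sketched roughly as follows. This is only an illustration under stated assumptions: `LockAwareDeleteCheck`, `SegmentLock`, `Load` and the `markedForDelete` flag are hypothetical stand-ins, not CarbonData classes, and the real `canDeleteThisLoad` takes different parameters. The point is that callers see a single predicate covering both the status test and the lock probe.

```java
import java.util.function.Function;

/** Hypothetical stand-ins for illustration; these are not CarbonData classes. */
final class LockAwareDeleteCheck {

  /** Minimal lock abstraction assumed for this sketch. */
  interface SegmentLock {
    boolean tryLock();

    void unlock();
  }

  /** Simplified view of one load entry (assumed fields). */
  static final class Load {
    final String name;
    final boolean markedForDelete;

    Load(String name, boolean markedForDelete) {
      this.name = name;
      this.markedForDelete = markedForDelete;
    }
  }

  /**
   * Single eligibility check: the status test runs first, then the lock probe.
   * The lock is released right away because it is only used as a probe here.
   */
  static boolean canDeleteThisLoad(Load load, boolean forceDelete,
      Function<String, SegmentLock> lockFactory) {
    if (!load.markedForDelete && !forceDelete) {
      return false;
    }
    SegmentLock lock = lockFactory.apply(load.name);
    if (!lock.tryLock()) {
      // Another operation still holds the segment lock, so skip this load.
      return false;
    }
    lock.unlock();
    return true;
  }
}
```

Note that a probe of this kind only shows the lock was free at the moment of the check; whether that is sufficient depends on what the caller does afterwards.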





[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4109: [WIP] Fix various concurrent issues with clean files

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4109:
URL: https://github.com/apache/carbondata/pull/4109#issuecomment-801880420


   Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5584/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4109: [WIP] Fix various concurrent issues with clean files

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4109:
URL: https://github.com/apache/carbondata/pull/4109#issuecomment-801886882


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3818/
   



[GitHub] [carbondata] vikramahuja1001 commented on a change in pull request #4109: [CARBONDATA-4154] Fix various concurrent issues with clean files

GitBox
In reply to this post by GitBox

vikramahuja1001 commented on a change in pull request #4109:
URL: https://github.com/apache/carbondata/pull/4109#discussion_r596825541



##########
File path: core/src/main/java/org/apache/carbondata/core/util/DeleteLoadFolders.java
##########
@@ -118,14 +118,16 @@ private static void physicalFactAndMeasureMetadataDeletion(CarbonTable carbonTab
       if (canDeleteThisLoad(oneLoad, isForceDelete, cleanStaleInProgress)) {
         try {
           if (oneLoad.getSegmentFile() != null) {
-            String tablePath = carbonTable.getAbsoluteTableIdentifier().getTablePath();
-            Segment segment = new Segment(oneLoad.getLoadName(), oneLoad.getSegmentFile());
-            // No need to delete physical data for external segments.
-            if (oneLoad.getPath() == null || oneLoad.getPath().equalsIgnoreCase("NA")) {
-              SegmentFileStore.deleteSegment(tablePath, segment, specs, updateStatusManager);
+            if (canSegmentLockBeAcquired(oneLoad, carbonTable.getAbsoluteTableIdentifier())) {

Review comment:
       done





[GitHub] [carbondata] nihal0107 commented on pull request #4109: [CARBONDATA-4154] Fix various concurrent issues with clean files

GitBox
In reply to this post by GitBox

nihal0107 commented on pull request #4109:
URL: https://github.com/apache/carbondata/pull/4109#issuecomment-801956708


   retest this please.



[GitHub] [carbondata] kunal642 commented on a change in pull request #4109: [CARBONDATA-4154] Fix various concurrent issues with clean files

GitBox
In reply to this post by GitBox

kunal642 commented on a change in pull request #4109:
URL: https://github.com/apache/carbondata/pull/4109#discussion_r596950277



##########
File path: core/src/main/java/org/apache/carbondata/core/util/DeleteLoadFolders.java
##########
@@ -115,7 +121,8 @@ private static void physicalFactAndMeasureMetadataDeletion(CarbonTable carbonTab
     SegmentUpdateStatusManager updateStatusManager =
         new SegmentUpdateStatusManager(carbonTable, currLoadDetails);
     for (final LoadMetadataDetails oneLoad : loadDetails) {
-      if (canDeleteThisLoad(oneLoad, isForceDelete, cleanStaleInProgress)) {
+      if (loadsToDelete.contains(oneLoad.getLoadName()) && canDeleteThisLoad(oneLoad,

Review comment:
       canDeleteThisLoad would be redundant here, since loadsToDelete already contains exactly the loads that should be deleted.
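
A minimal sketch of the point made in this comment, assuming that the set returned by the earlier step is authoritative: the physical-deletion loop then only needs a membership test. `LoadsToDeleteFilter` and `selectForPhysicalDeletion` are hypothetical names for illustration and work on plain load names rather than on `LoadMetadataDetails`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical helper for illustration; not part of CarbonData. */
final class LoadsToDeleteFilter {

  /**
   * Returns the load names selected for physical deletion, in their original order.
   * Membership in loadsToDelete is the only condition, so no second eligibility
   * check is repeated here.
   */
  static List<String> selectForPhysicalDeletion(List<String> allLoadNames,
      Set<String> loadsToDelete) {
    List<String> selected = new ArrayList<>();
    for (String name : allLoadNames) {
      if (loadsToDelete.contains(name)) {
        selected.add(name);
      }
    }
    return selected;
  }
}
```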





[GitHub] [carbondata] kunal642 commented on a change in pull request #4109: [CARBONDATA-4154] Fix various concurrent issues with clean files

GitBox
In reply to this post by GitBox

kunal642 commented on a change in pull request #4109:
URL: https://github.com/apache/carbondata/pull/4109#discussion_r596951104



##########
File path: core/src/main/java/org/apache/carbondata/core/util/DeleteLoadFolders.java
##########
@@ -230,45 +241,46 @@ private static LoadMetadataDetails getCurrentLoadStatusOfSegment(String segmentI
     return null;
   }
 
-  public static boolean deleteLoadFoldersFromFileSystem(
+  public static Set<String> deleteLoadFoldersFromFileSystem(
       AbsoluteTableIdentifier absoluteTableIdentifier, boolean isForceDelete, LoadMetadataDetails[]
       details, String metadataPath, boolean cleanStaleInProgress) {
-    boolean isDeleted = false;
+    Set<String> loadsToDelete = new HashSet<>();
     if (details != null && details.length != 0) {
       for (LoadMetadataDetails oneLoad : details) {
-        if (checkIfLoadCanBeDeleted(oneLoad, isForceDelete, cleanStaleInProgress)) {
-          ICarbonLock segmentLock = CarbonLockFactory.getCarbonLockObj(absoluteTableIdentifier,
-              CarbonTablePath.addSegmentPrefix(oneLoad.getLoadName()) + LockUsage.LOCK);
-          try {
-            if (oneLoad.getSegmentStatus() == SegmentStatus.INSERT_OVERWRITE_IN_PROGRESS
-                || oneLoad.getSegmentStatus() == SegmentStatus.INSERT_IN_PROGRESS) {
-              if (segmentLock.lockWithRetries(1, 5)) {
-                LOGGER.info("Info: Acquired segment lock on segment:" + oneLoad.getLoadName());
-                LoadMetadataDetails currentDetails =
-                    getCurrentLoadStatusOfSegment(oneLoad.getLoadName(), metadataPath);
-                if (currentDetails != null && checkIfLoadCanBeDeleted(currentDetails,
-                    isForceDelete, cleanStaleInProgress)) {
-                  oneLoad.setVisibility("false");
-                  isDeleted = true;
-                  LOGGER.info("Info: Deleted the load " + oneLoad.getLoadName());
-                }
-              } else {
-                LOGGER.info("Info: Load in progress for segment" + oneLoad.getLoadName());
-                return isDeleted;
-              }
-            } else {
+        if (checkIfLoadCanBeDeleted(oneLoad, isForceDelete, cleanStaleInProgress,
+            absoluteTableIdentifier)) {
+          if (oneLoad.getSegmentStatus() == SegmentStatus.INSERT_OVERWRITE_IN_PROGRESS
+              || oneLoad.getSegmentStatus() == SegmentStatus.INSERT_IN_PROGRESS) {
+            LoadMetadataDetails currentDetails =
+                getCurrentLoadStatusOfSegment(oneLoad.getLoadName(), metadataPath);
+            if (currentDetails != null && checkIfLoadCanBeDeleted(currentDetails,
+                isForceDelete, cleanStaleInProgress, absoluteTableIdentifier)) {
               oneLoad.setVisibility("false");
-              isDeleted = true;
+              loadsToDelete.add(oneLoad.getLoadName());
               LOGGER.info("Info: Deleted the load " + oneLoad.getLoadName());
             }
-          } finally {
-            segmentLock.unlock();
-            LOGGER.info("Info: Segment lock on segment:" + oneLoad.getLoadName() + " is released");
+          } else {
+            oneLoad.setVisibility("false");
+            loadsToDelete.add(oneLoad.getLoadName());
+            LOGGER.info("Info: Deleted the load " + oneLoad.getLoadName());
           }
         }
       }
     }
-    return isDeleted;
+    return loadsToDelete;
   }
 
+  private static boolean canSegmentLockBeAcquired(LoadMetadataDetails oneLoad,
+      AbsoluteTableIdentifier absoluteTableIdentifier) {
+    ICarbonLock segmentLock = CarbonLockFactory.getCarbonLockObj(absoluteTableIdentifier,
+        CarbonTablePath.addSegmentPrefix(oneLoad.getLoadName()) + LockUsage.LOCK);
+    if (segmentLock.lockWithRetries()) {
+      LOGGER.info("INFO: Segment Lock on segment: " + oneLoad.getLoadName() + "can be acquired.");

Review comment:
       Remove "INFO:" from the log message.
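
For illustration only, a lock probe along these lines could look like the sketch below. It uses `java.util.concurrent.locks.Lock` and `java.util.logging.Logger` as stand-ins for `ICarbonLock` and the project's logger, so the class name `SegmentLockProbe` and both method signatures are assumptions. The message drops the level prefix (the logger already records the level), puts a space before "can", and the lock is released in a `finally` block.

```java
import java.util.concurrent.locks.Lock;
import java.util.logging.Logger;

/** Hypothetical sketch; java.util.concurrent.locks.Lock stands in for ICarbonLock. */
final class SegmentLockProbe {
  private static final Logger LOGGER = Logger.getLogger(SegmentLockProbe.class.getName());

  /** Builds the log text without a level prefix and with a space before "can". */
  static String acquiredMessage(String loadName) {
    return "Segment lock on segment: " + loadName + " can be acquired.";
  }

  /** Probes the lock without blocking and always releases it again if it was taken. */
  static boolean canSegmentLockBeAcquired(Lock segmentLock, String loadName) {
    if (!segmentLock.tryLock()) {
      LOGGER.info("Segment lock on segment: " + loadName + " cannot be acquired.");
      return false;
    }
    try {
      LOGGER.info(acquiredMessage(loadName));
      return true;
    } finally {
      segmentLock.unlock();
    }
  }
}
```

Called with a fresh `java.util.concurrent.locks.ReentrantLock`, the probe returns true and leaves the lock unheld afterwards.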





[GitHub] [carbondata] vikramahuja1001 commented on a change in pull request #4109: [CARBONDATA-4154] Fix various concurrent issues with clean files

GitBox
In reply to this post by GitBox

vikramahuja1001 commented on a change in pull request #4109:
URL: https://github.com/apache/carbondata/pull/4109#discussion_r596964122



##########
File path: core/src/main/java/org/apache/carbondata/core/util/DeleteLoadFolders.java
##########
@@ -230,45 +241,46 @@ private static LoadMetadataDetails getCurrentLoadStatusOfSegment(String segmentI
     return null;
   }
 
-  public static boolean deleteLoadFoldersFromFileSystem(
+  public static Set<String> deleteLoadFoldersFromFileSystem(
       AbsoluteTableIdentifier absoluteTableIdentifier, boolean isForceDelete, LoadMetadataDetails[]
       details, String metadataPath, boolean cleanStaleInProgress) {
-    boolean isDeleted = false;
+    Set<String> loadsToDelete = new HashSet<>();
     if (details != null && details.length != 0) {
       for (LoadMetadataDetails oneLoad : details) {
-        if (checkIfLoadCanBeDeleted(oneLoad, isForceDelete, cleanStaleInProgress)) {
-          ICarbonLock segmentLock = CarbonLockFactory.getCarbonLockObj(absoluteTableIdentifier,
-              CarbonTablePath.addSegmentPrefix(oneLoad.getLoadName()) + LockUsage.LOCK);
-          try {
-            if (oneLoad.getSegmentStatus() == SegmentStatus.INSERT_OVERWRITE_IN_PROGRESS
-                || oneLoad.getSegmentStatus() == SegmentStatus.INSERT_IN_PROGRESS) {
-              if (segmentLock.lockWithRetries(1, 5)) {
-                LOGGER.info("Info: Acquired segment lock on segment:" + oneLoad.getLoadName());
-                LoadMetadataDetails currentDetails =
-                    getCurrentLoadStatusOfSegment(oneLoad.getLoadName(), metadataPath);
-                if (currentDetails != null && checkIfLoadCanBeDeleted(currentDetails,
-                    isForceDelete, cleanStaleInProgress)) {
-                  oneLoad.setVisibility("false");
-                  isDeleted = true;
-                  LOGGER.info("Info: Deleted the load " + oneLoad.getLoadName());
-                }
-              } else {
-                LOGGER.info("Info: Load in progress for segment" + oneLoad.getLoadName());
-                return isDeleted;
-              }
-            } else {
+        if (checkIfLoadCanBeDeleted(oneLoad, isForceDelete, cleanStaleInProgress,
+            absoluteTableIdentifier)) {
+          if (oneLoad.getSegmentStatus() == SegmentStatus.INSERT_OVERWRITE_IN_PROGRESS
+              || oneLoad.getSegmentStatus() == SegmentStatus.INSERT_IN_PROGRESS) {
+            LoadMetadataDetails currentDetails =
+                getCurrentLoadStatusOfSegment(oneLoad.getLoadName(), metadataPath);
+            if (currentDetails != null && checkIfLoadCanBeDeleted(currentDetails,
+                isForceDelete, cleanStaleInProgress, absoluteTableIdentifier)) {
               oneLoad.setVisibility("false");
-              isDeleted = true;
+              loadsToDelete.add(oneLoad.getLoadName());
               LOGGER.info("Info: Deleted the load " + oneLoad.getLoadName());
             }
-          } finally {
-            segmentLock.unlock();
-            LOGGER.info("Info: Segment lock on segment:" + oneLoad.getLoadName() + " is released");
+          } else {
+            oneLoad.setVisibility("false");
+            loadsToDelete.add(oneLoad.getLoadName());
+            LOGGER.info("Info: Deleted the load " + oneLoad.getLoadName());
           }
         }
       }
     }
-    return isDeleted;
+    return loadsToDelete;
   }
 
+  private static boolean canSegmentLockBeAcquired(LoadMetadataDetails oneLoad,
+      AbsoluteTableIdentifier absoluteTableIdentifier) {
+    ICarbonLock segmentLock = CarbonLockFactory.getCarbonLockObj(absoluteTableIdentifier,
+        CarbonTablePath.addSegmentPrefix(oneLoad.getLoadName()) + LockUsage.LOCK);
+    if (segmentLock.lockWithRetries()) {
+      LOGGER.info("INFO: Segment Lock on segment: " + oneLoad.getLoadName() + "can be acquired.");

Review comment:
       done





[GitHub] [carbondata] vikramahuja1001 commented on a change in pull request #4109: [CARBONDATA-4154] Fix various concurrent issues with clean files

GitBox
In reply to this post by GitBox

vikramahuja1001 commented on a change in pull request #4109:
URL: https://github.com/apache/carbondata/pull/4109#discussion_r596964731



##########
File path: core/src/main/java/org/apache/carbondata/core/util/DeleteLoadFolders.java
##########
@@ -115,7 +121,8 @@ private static void physicalFactAndMeasureMetadataDeletion(CarbonTable carbonTab
     SegmentUpdateStatusManager updateStatusManager =
         new SegmentUpdateStatusManager(carbonTable, currLoadDetails);
     for (final LoadMetadataDetails oneLoad : loadDetails) {
-      if (canDeleteThisLoad(oneLoad, isForceDelete, cleanStaleInProgress)) {
+      if (loadsToDelete.contains(oneLoad.getLoadName()) && canDeleteThisLoad(oneLoad,

Review comment:
       removed





[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4109: [CARBONDATA-4154] Fix various concurrent issues with clean files

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4109:
URL: https://github.com/apache/carbondata/pull/4109#issuecomment-802093767


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5588/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4109: [CARBONDATA-4154] Fix various concurrent issues with clean files

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4109:
URL: https://github.com/apache/carbondata/pull/4109#issuecomment-802100561


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3822/
   



[GitHub] [carbondata] kunal642 commented on pull request #4109: [CARBONDATA-4154] Fix various concurrent issues with clean files

GitBox
In reply to this post by GitBox

kunal642 commented on pull request #4109:
URL: https://github.com/apache/carbondata/pull/4109#issuecomment-802144670


   LGTM



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4109: [CARBONDATA-4154] Fix various concurrent issues with clean files

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4109:
URL: https://github.com/apache/carbondata/pull/4109#issuecomment-802153204


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5592/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4109: [CARBONDATA-4154] Fix various concurrent issues with clean files

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4109:
URL: https://github.com/apache/carbondata/pull/4109#issuecomment-802159613


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3826/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4109: [CARBONDATA-4154] Fix various concurrent issues with clean files

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4109:
URL: https://github.com/apache/carbondata/pull/4109#issuecomment-803623924


   Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/5057/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4109: [CARBONDATA-4154] Fix various concurrent issues with clean files

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4109:
URL: https://github.com/apache/carbondata/pull/4109#issuecomment-803625694


   Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/3305/
   



[GitHub] [carbondata] brijoobopanna commented on pull request #4109: [CARBONDATA-4154] Fix various concurrent issues with clean files

GitBox
In reply to this post by GitBox

brijoobopanna commented on pull request #4109:
URL: https://github.com/apache/carbondata/pull/4109#issuecomment-803804878


   Retest this please
   

