[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #4059: [CARBONDATA-4092] Fix concurrent issues in delete segment API's and MV flow

Posted by GitBox on Dec 21, 2020; 6:52am
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/GitHub-carbondata-vikramahuja1001-opened-a-new-pull-request-4059-WIP-Fix-concurrent-issues-in-deletes-tp104954p105001.html


ajantha-bhat commented on a change in pull request #4059:
URL: https://github.com/apache/carbondata/pull/4059#discussion_r546537345



##########
File path: core/src/main/java/org/apache/carbondata/core/statusmanager/SegmentStatusManager.java
##########
@@ -445,38 +445,28 @@ private static Integer compareDateValues(Long loadValue, Long userValue) {
           LOG.error("Load metadata file is not present.");
           return loadIds;
         }
-        // read existing metadata details in load metadata.
-        listOfLoadFolderDetailsArray = readLoadMetadata(tableFolderPath);
-        if (listOfLoadFolderDetailsArray.length != 0) {
-          updateDeletionStatus(identifier, loadIds, listOfLoadFolderDetailsArray, invalidLoadIds);
-          if (invalidLoadIds.isEmpty()) {
-            // All or None , if anything fails then don't write
-            if (carbonTableStatusLock.lockWithRetries()) {
-              LOG.info("Table status lock has been successfully acquired");
-              // To handle concurrency scenarios, always take latest metadata before writing
-              // into status file.
-              LoadMetadataDetails[] latestLoadMetadataDetails = readLoadMetadata(tableFolderPath);
-              updateLatestTableStatusDetails(listOfLoadFolderDetailsArray,
-                  latestLoadMetadataDetails);
+        if (carbonTableStatusLock.lockWithRetries()) {
+          LOG.info("Table status lock has been successfully acquired.");
+          listOfLoadFolderDetailsArray = readLoadMetadata(tableFolderPath);
+          if (listOfLoadFolderDetailsArray.length != 0) {
+            updateDeletionStatus(identifier, loadIds, listOfLoadFolderDetailsArray, invalidLoadIds);

Review comment:
       If the table status has 200K segments, finding and marking the matching segments to delete will take a lot of time. Holding the table status lock for that long will impact concurrently running queries. That is why this logic was previously not done inside the lock.
   
   If a user reports an issue because of this PR change, we may have to optimize this logic.
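
   If optimization is ever needed, one possible direction is to keep the work done under the lock close to linear in the number of segments. The sketch below is only an illustration under assumptions: `SegmentDeleteSketch`, `Segment`, `addSegment`, `statusOf` and `deleteSegments` are hypothetical names, an in-memory list stands in for the table status file, and a `ReentrantLock` stands in for the table status lock. It is not the CarbonData implementation. It indexes the segments once by ID and then marks each requested ID with a hash lookup, keeping the "all or none" behaviour.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch; none of these names are CarbonData APIs.
class SegmentDeleteSketch {
  static final String SUCCESS = "SUCCESS";
  static final String MARKED_FOR_DELETE = "MARKED_FOR_DELETE";

  // Minimal stand-in for one entry of the table status file.
  static final class Segment {
    final String id;
    String status;

    Segment(String id, String status) {
      this.id = id;
      this.status = status;
    }
  }

  // Stand-ins for the table status lock and the table status contents.
  private final ReentrantLock tableStatusLock = new ReentrantLock();
  private final List<Segment> tableStatus = new ArrayList<>();

  void addSegment(String id) {
    tableStatusLock.lock();
    try {
      tableStatus.add(new Segment(id, SUCCESS));
    } finally {
      tableStatusLock.unlock();
    }
  }

  // Returns the status of a segment, or null if the ID is unknown.
  String statusOf(String id) {
    tableStatusLock.lock();
    try {
      for (Segment s : tableStatus) {
        if (s.id.equals(id)) {
          return s.status;
        }
      }
      return null;
    } finally {
      tableStatusLock.unlock();
    }
  }

  // Returns the IDs that do not exist. All or none: if the returned list
  // is non-empty, no segment status has been changed.
  List<String> deleteSegments(List<String> loadIds) {
    tableStatusLock.lock();
    try {
      // Index the latest status once: O(n) for n segments.
      Map<String, Segment> byId = new HashMap<>();
      for (Segment s : tableStatus) {
        byId.put(s.id, s);
      }
      // Validate every requested ID with a hash lookup: O(k) for k IDs.
      List<String> invalidLoadIds = new ArrayList<>();
      for (String id : loadIds) {
        if (!byId.containsKey(id)) {
          invalidLoadIds.add(id);
        }
      }
      if (!invalidLoadIds.isEmpty()) {
        return invalidLoadIds;
      }
      // Only mark once all IDs are known to be valid.
      for (String id : loadIds) {
        byId.get(id).status = MARKED_FOR_DELETE;
      }
      return invalidLoadIds;
    } finally {
      tableStatusLock.unlock();
    }
  }
}
```

   With this shape, the time spent holding the lock grows with n + k rather than with a per-ID scan of all segments. Whether that is enough for 200K segments would still need to be measured.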
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]