[GitHub] [carbondata] akashrn5 commented on a change in pull request #3988: [CARBONDATA-4037] Improve the table status and segment file writing

Posted by GitBox on
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/GitHub-carbondata-ShreelekhyaG-opened-a-new-pull-request-3988-WIP-Clean-index-files-when-clean-filesd-tp102150p107650.html


akashrn5 commented on a change in pull request #3988:
URL: https://github.com/apache/carbondata/pull/3988#discussion_r614596639



##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/secondaryindex/util/SecondaryIndexUtil.scala
##########
@@ -353,16 +342,38 @@ object SecondaryIndexUtil {
     }
   }
 
+  /**
+   * This method returns the list of index/merge index files for a segment in carbonTable.
+   */
+  @throws[IOException]
+  private def getIndexFilesListForSegment(segment: Segment, tablePath: String): util.Set[String] = {
+    var indexFiles : util.Set[String] = new util.HashSet[String]
+    val segmentFileStore = new SegmentFileStore(tablePath,
+      segment.getSegmentFileName)
+    val segmentPath = CarbonTablePath.getSegmentPath(tablePath, segment.getSegmentNo)
+    if (segmentFileStore.getSegmentFile == null)  {
+      indexFiles = new SegmentIndexFileStore()
+        .getMergeOrIndexFilesFromSegment(segmentPath).keySet
+    }
+    else {
+      indexFiles = segmentFileStore.getIndexAndMergeFiles.keySet
+    }
+    indexFiles
+  }
+
   /**
    * This method deletes the carbondata files present in the partition during small
    * data file merge after loading a segment to the SI table. They should be deleted after
    * the data file merge operation; otherwise, concurrency can cause file-not-found issues.
    */
-  private def deleteOldCarbonDataFiles(partition: CarbonSparkPartition): Unit = {
+  private def deleteOldCarbonDataFiles(partition: CarbonSparkPartition,
+      validSegmentsToUse: List[Segment]): Unit = {
     val splitList = partition.split.value.getAllSplits
     splitList.asScala.foreach { split =>
-      val carbonFile = FileFactory.getCarbonFile(split.getFilePath)
-      carbonFile.delete()
+      if (validSegmentsToUse.contains(split.getSegment)) {
+        val carbonFile = FileFactory.getCarbonFile(split.getFilePath)

Review comment:
```suggestion
        val mergedCarbonDataFile = FileFactory.getCarbonFile(split.getFilePath)
```
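
For context, a minimal sketch of how the guarded delete might read with the rename applied. `Segment`, `Split`, and `MergedFileCleanup` here are hypothetical stand-ins rather than the CarbonData classes, and `java.nio.file.Files` replaces `FileFactory` so the snippet runs on its own; it illustrates the naming and the segment check, not the actual implementation.

```scala
import java.nio.file.{Files, Path}

// Hypothetical stand-ins for CarbonData's Segment and input split types.
final case class Segment(segmentNo: String)
final case class Split(filePath: Path, segment: Segment)

object MergedFileCleanup {

  /**
   * Deletes only the files whose split belongs to one of the valid segments
   * and returns the paths that were selected for deletion.
   */
  def deleteOldCarbonDataFiles(
      splits: Seq[Split],
      validSegmentsToUse: Seq[Segment]): Seq[Path] = {
    // A Set makes the membership check independent of the list length.
    val validSegments = validSegmentsToUse.toSet
    val filesToDelete = splits
      .filter(split => validSegments.contains(split.segment))
      .map(_.filePath)
    filesToDelete.foreach { mergedCarbonDataFile =>
      // deleteIfExists tolerates a file that was already removed concurrently.
      Files.deleteIfExists(mergedCarbonDataFile)
    }
    filesToDelete
  }
}
```

With the descriptive name, it is clearer at the call site that the file being removed is a data file already handled by the merge, not an arbitrary carbon file.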



