[GitHub] [carbondata] akashrn5 opened a new pull request #3676: [WIP]Clean up the data file and index files after SI rebuild


[GitHub] [carbondata] akashrn5 commented on a change in pull request #3676: [CARBONDATA-3754]Clean up the data file and index files after SI rebuild

GitBox
akashrn5 commented on a change in pull request #3676: [CARBONDATA-3754]Clean up the data file and index files after SI rebuild
URL: https://github.com/apache/carbondata/pull/3676#discussion_r405341211
 
 

 ##########
 File path: integration/spark/src/main/scala/org/apache/spark/sql/secondaryindex/rdd/CarbonSIRebuildRDD.scala
 ##########
 @@ -321,6 +324,26 @@ class CarbonSIRebuildRDD[K, V](
           LOGGER.info("Closing compaction processor instance to clean up loading resources")
           processor.close()
         }
+
+        // delete all the old data files which are used for merging
+        splits.asScala.foreach { split =>
+          val carbonFile = FileFactory.getCarbonFile(split.getFilePath)
+          carbonFile.delete()
+        }
+
+        // delete the indexfile/merge index carbonFile of old data files
+        val segmentPath = FileFactory.getCarbonFile(indexTable.getSegmentPath(segmentId))
+        val indexFiles = segmentPath.listFiles(new CarbonFileFilter {
 
 Review comment:
   We will have the list of data files in the task, but not the index files. I will try to fix this in another PR.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3676: [CARBONDATA-3754]Clean up the data file and index files after SI rebuild

GitBox
CarbonDataQA1 commented on issue #3676: [CARBONDATA-3754]Clean up the data file and index files after SI rebuild
URL: https://github.com/apache/carbondata/pull/3676#issuecomment-610878800
 
 
   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2679/
   


[GitHub] [carbondata] CarbonDataQA1 commented on issue #3676: [CARBONDATA-3754]Clean up the data file and index files after SI rebuild

GitBox
CarbonDataQA1 commented on issue #3676: [CARBONDATA-3754]Clean up the data file and index files after SI rebuild
URL: https://github.com/apache/carbondata/pull/3676#issuecomment-610886067
 
 
   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/968/
   
