QiangCai commented on a change in pull request #3355: [HOTFIX] Improve select query after Update/Delete operation.
URL:
https://github.com/apache/carbondata/pull/3355#discussion_r313206037
##########
File path: core/src/main/java/org/apache/carbondata/core/statusmanager/SegmentUpdateStatusManager.java
##########
@@ -358,40 +333,40 @@ public boolean isBlockValid(String segName, String blockName) {
return deleteFileList;
}
- private List<String> getFilePaths(CarbonFile blockDir, final String blockNameFromTuple,
+ private List<String> getFilePaths(String blockDir, final String blockNameFromTuple,
final String extension, List<String> deleteFileList, final long deltaStartTimestamp,
final long deltaEndTimeStamp) throws IOException {
- if (null != blockDir.getParentFile()) {
- CarbonFile[] files = blockDir.getParentFile().listFiles(new CarbonFileFilter() {
-
- @Override
- public boolean accept(CarbonFile pathName) {
+ List<String> deltaList = segmentDeleteDeltaListMap.get(blockDir);
+ if (deltaList == null) {
+ CarbonFile[] files = FileFactory.getCarbonFile(blockDir).listFiles(new CarbonFileFilter() {
Review comment:
Each query will still list files once for every IUD segment.
In the cloud scenario, a single call to the listFiles method takes about 1.5 seconds for an IUD segment.
If all 24 segments are IUD segments, the query will still take more than 36 seconds in total (24 × 1.5 s for listing alone); before this change it took more than 100 seconds.
We may still need to improve this further.
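One possible direction, shown only as a rough sketch: keep the listing result across queries instead of per query, and drop it when an Update/Delete commits on that segment. The class `SegmentListingCache`, its methods, and the `lister` function are hypothetical names for illustration and are not part of CarbonData; `lister` stands in for whatever performs the real `listFiles` call.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/**
 * Hypothetical sketch (not CarbonData API): caches the delete delta file
 * listing per segment path so that repeated queries do not list again.
 */
final class SegmentListingCache {
  // segment path -> file names found by the last listing of that path
  private final Map<String, List<String>> cache = new ConcurrentHashMap<>();
  // Assumption: wraps the expensive remote listing (about 1.5 s per call in the cloud case)
  private final Function<String, List<String>> lister;

  SegmentListingCache(Function<String, List<String>> lister) {
    this.lister = lister;
  }

  /** Returns the cached listing, calling the lister only on the first request for a path. */
  List<String> list(String segmentPath) {
    return cache.computeIfAbsent(segmentPath, lister);
  }

  /** Must be called after an Update/Delete commits on the segment, or the cache goes stale. */
  void invalidate(String segmentPath) {
    cache.remove(segmentPath);
  }
}
```

With this shape, the 24 listing calls would be paid once rather than on every query, at the cost of having to invalidate correctly after each IUD operation. Whether that trade-off is acceptable here depends on how the update status is refreshed, which this sketch does not address.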