Karan980 opened a new pull request #4099: URL: https://github.com/apache/carbondata/pull/4099 ### Why is this PR needed? Indexes cached in Executor cache are not dropped when drop table is called for external table with SDK segments. Because, external tables with sdk segments will not have metadata like table status file. So in drop table command we send zero segments to indexServer clearIndexes job, which clears nothing from executor side. ### What changes were proposed in this PR? Prepared the validSegments from indexFiles present at external table location and send it to IndexServer clearIndexes job through IndexInputFormat. ### Does this PR introduce any user interface change? - No ### Is any new testcase added? - No ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
CarbonDataQA2 commented on pull request #4099: URL: https://github.com/apache/carbondata/pull/4099#issuecomment-788903537 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/5106/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA2 commented on pull request #4099: URL: https://github.com/apache/carbondata/pull/4099#issuecomment-788908019 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/3347/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
kunal642 commented on a change in pull request #4099: URL: https://github.com/apache/carbondata/pull/4099#discussion_r589176617 ########## File path: core/src/main/java/org/apache/carbondata/core/index/IndexUtil.java ########## @@ -112,15 +116,31 @@ public static IndexJob getIndexJob(Configuration configuration) throws IOExcepti */ private static void executeClearIndexJob(IndexJob indexJob, CarbonTable carbonTable, String indexToClear) throws IOException { - SegmentStatusManager.ValidAndInvalidSegmentsInfo validAndInvalidSegmentsInfo = - getValidAndInvalidSegments(carbonTable, FileFactory.getConfiguration()); - List<String> invalidSegment = new ArrayList<>(); - for (Segment segment : validAndInvalidSegmentsInfo.getInvalidSegments()) { - invalidSegment.add(segment.getSegmentNo()); + IndexInputFormat indexInputFormat; + if (!carbonTable.isTransactionalTable()) { Review comment: Please reuse the LatestFilesReadCommittedScope class to take a snapshot of the non-transactional table. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
kunal642 commented on a change in pull request #4099: URL: https://github.com/apache/carbondata/pull/4099#discussion_r589176617 ########## File path: core/src/main/java/org/apache/carbondata/core/index/IndexUtil.java ########## @@ -112,15 +116,31 @@ public static IndexJob getIndexJob(Configuration configuration) throws IOExcepti */ private static void executeClearIndexJob(IndexJob indexJob, CarbonTable carbonTable, String indexToClear) throws IOException { - SegmentStatusManager.ValidAndInvalidSegmentsInfo validAndInvalidSegmentsInfo = - getValidAndInvalidSegments(carbonTable, FileFactory.getConfiguration()); - List<String> invalidSegment = new ArrayList<>(); - for (Segment segment : validAndInvalidSegmentsInfo.getInvalidSegments()) { - invalidSegment.add(segment.getSegmentNo()); + IndexInputFormat indexInputFormat; + if (!carbonTable.isTransactionalTable()) { Review comment: Please reuse the LatestFilesReadCommittedScope class to take a snapshot of the non-transactional table. No need to create a separate method ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
Karan980 commented on a change in pull request #4099: URL: https://github.com/apache/carbondata/pull/4099#discussion_r589323912 ########## File path: core/src/main/java/org/apache/carbondata/core/index/IndexUtil.java ########## @@ -112,15 +116,31 @@ public static IndexJob getIndexJob(Configuration configuration) throws IOExcepti */ private static void executeClearIndexJob(IndexJob indexJob, CarbonTable carbonTable, String indexToClear) throws IOException { - SegmentStatusManager.ValidAndInvalidSegmentsInfo validAndInvalidSegmentsInfo = - getValidAndInvalidSegments(carbonTable, FileFactory.getConfiguration()); - List<String> invalidSegment = new ArrayList<>(); - for (Segment segment : validAndInvalidSegmentsInfo.getInvalidSegments()) { - invalidSegment.add(segment.getSegmentNo()); + IndexInputFormat indexInputFormat; + if (!carbonTable.isTransactionalTable()) { Review comment: Done ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA2 commented on pull request #4099: URL: https://github.com/apache/carbondata/pull/4099#issuecomment-792733994 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/5113/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA2 commented on pull request #4099: URL: https://github.com/apache/carbondata/pull/4099#issuecomment-792735786 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/3354/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
vikramahuja1001 commented on pull request #4099: URL: https://github.com/apache/carbondata/pull/4099#issuecomment-793477512 please correct the splits path of getSplits in case of External Tables in DistributedPruneRDD as well ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
Karan980 commented on pull request #4099: URL: https://github.com/apache/carbondata/pull/4099#issuecomment-793527054 > please correct the splits path of getSplits in case of External Tables in DistributedPruneRDD as well Done ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
vikramahuja1001 commented on a change in pull request #4099: URL: https://github.com/apache/carbondata/pull/4099#discussion_r590253814 ########## File path: core/src/main/java/org/apache/carbondata/core/index/IndexUtil.java ########## @@ -112,15 +115,32 @@ public static IndexJob getIndexJob(Configuration configuration) throws IOExcepti */ private static void executeClearIndexJob(IndexJob indexJob, CarbonTable carbonTable, String indexToClear) throws IOException { - SegmentStatusManager.ValidAndInvalidSegmentsInfo validAndInvalidSegmentsInfo = - getValidAndInvalidSegments(carbonTable, FileFactory.getConfiguration()); - List<String> invalidSegment = new ArrayList<>(); - for (Segment segment : validAndInvalidSegmentsInfo.getInvalidSegments()) { - invalidSegment.add(segment.getSegmentNo()); + IndexInputFormat indexInputFormat; + if (!carbonTable.isTransactionalTable()) { + ReadCommittedScope readCommittedScope = + new LatestFilesReadCommittedScope(carbonTable.getTablePath(), + FileFactory.getConfiguration()); + LoadMetadataDetails[] loadMetadataDetails = readCommittedScope.getSegmentList(); + List<Segment> listOfValidSegments = new ArrayList<>(); + for (LoadMetadataDetails segment : loadMetadataDetails) { + Segment seg = new Segment(segment.getLoadName(), segment.getSegmentFile()); + seg.setLoadMetadataDetails(segment); + listOfValidSegments.add(seg); + } + indexInputFormat = + new IndexInputFormat(carbonTable, listOfValidSegments, new ArrayList<>(0), true, + indexToClear); + } else { + SegmentStatusManager.ValidAndInvalidSegmentsInfo validAndInvalidSegmentsInfo = + getValidAndInvalidSegments(carbonTable, FileFactory.getConfiguration()); + List<String> invalidSegment = new ArrayList<>(); + for (Segment segment : validAndInvalidSegmentsInfo.getInvalidSegments()) { Review comment: Use Java stream here instead of a for loop ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
vikramahuja1001 commented on a change in pull request #4099: URL: https://github.com/apache/carbondata/pull/4099#discussion_r590255909 ########## File path: core/src/main/java/org/apache/carbondata/core/index/IndexUtil.java ########## @@ -112,15 +115,32 @@ public static IndexJob getIndexJob(Configuration configuration) throws IOExcepti */ private static void executeClearIndexJob(IndexJob indexJob, CarbonTable carbonTable, String indexToClear) throws IOException { - SegmentStatusManager.ValidAndInvalidSegmentsInfo validAndInvalidSegmentsInfo = - getValidAndInvalidSegments(carbonTable, FileFactory.getConfiguration()); - List<String> invalidSegment = new ArrayList<>(); - for (Segment segment : validAndInvalidSegmentsInfo.getInvalidSegments()) { - invalidSegment.add(segment.getSegmentNo()); + IndexInputFormat indexInputFormat; + if (!carbonTable.isTransactionalTable()) { + ReadCommittedScope readCommittedScope = + new LatestFilesReadCommittedScope(carbonTable.getTablePath(), + FileFactory.getConfiguration()); + LoadMetadataDetails[] loadMetadataDetails = readCommittedScope.getSegmentList(); + List<Segment> listOfValidSegments = new ArrayList<>(); Review comment: give arrayList limit equals to loadMetadataDetails size, because it will never increase that size ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
vikramahuja1001 commented on a change in pull request #4099: URL: https://github.com/apache/carbondata/pull/4099#discussion_r590256333 ########## File path: core/src/main/java/org/apache/carbondata/core/index/IndexUtil.java ########## @@ -112,15 +115,32 @@ public static IndexJob getIndexJob(Configuration configuration) throws IOExcepti */ private static void executeClearIndexJob(IndexJob indexJob, CarbonTable carbonTable, String indexToClear) throws IOException { - SegmentStatusManager.ValidAndInvalidSegmentsInfo validAndInvalidSegmentsInfo = - getValidAndInvalidSegments(carbonTable, FileFactory.getConfiguration()); - List<String> invalidSegment = new ArrayList<>(); - for (Segment segment : validAndInvalidSegmentsInfo.getInvalidSegments()) { - invalidSegment.add(segment.getSegmentNo()); + IndexInputFormat indexInputFormat; + if (!carbonTable.isTransactionalTable()) { + ReadCommittedScope readCommittedScope = + new LatestFilesReadCommittedScope(carbonTable.getTablePath(), + FileFactory.getConfiguration()); + LoadMetadataDetails[] loadMetadataDetails = readCommittedScope.getSegmentList(); + List<Segment> listOfValidSegments = new ArrayList<>(); + for (LoadMetadataDetails segment : loadMetadataDetails) { Review comment: try using java stream api here instead of a for loop wherever possible ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
Karan980 commented on a change in pull request #4099: URL: https://github.com/apache/carbondata/pull/4099#discussion_r590318352 ########## File path: core/src/main/java/org/apache/carbondata/core/index/IndexUtil.java ########## @@ -112,15 +115,32 @@ public static IndexJob getIndexJob(Configuration configuration) throws IOExcepti */ private static void executeClearIndexJob(IndexJob indexJob, CarbonTable carbonTable, String indexToClear) throws IOException { - SegmentStatusManager.ValidAndInvalidSegmentsInfo validAndInvalidSegmentsInfo = - getValidAndInvalidSegments(carbonTable, FileFactory.getConfiguration()); - List<String> invalidSegment = new ArrayList<>(); - for (Segment segment : validAndInvalidSegmentsInfo.getInvalidSegments()) { - invalidSegment.add(segment.getSegmentNo()); + IndexInputFormat indexInputFormat; + if (!carbonTable.isTransactionalTable()) { + ReadCommittedScope readCommittedScope = + new LatestFilesReadCommittedScope(carbonTable.getTablePath(), + FileFactory.getConfiguration()); + LoadMetadataDetails[] loadMetadataDetails = readCommittedScope.getSegmentList(); + List<Segment> listOfValidSegments = new ArrayList<>(); + for (LoadMetadataDetails segment : loadMetadataDetails) { + Segment seg = new Segment(segment.getLoadName(), segment.getSegmentFile()); + seg.setLoadMetadataDetails(segment); + listOfValidSegments.add(seg); + } + indexInputFormat = + new IndexInputFormat(carbonTable, listOfValidSegments, new ArrayList<>(0), true, + indexToClear); + } else { + SegmentStatusManager.ValidAndInvalidSegmentsInfo validAndInvalidSegmentsInfo = + getValidAndInvalidSegments(carbonTable, FileFactory.getConfiguration()); + List<String> invalidSegment = new ArrayList<>(); + for (Segment segment : validAndInvalidSegmentsInfo.getInvalidSegments()) { Review comment: Done ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
Karan980 commented on a change in pull request #4099: URL: https://github.com/apache/carbondata/pull/4099#discussion_r590318643 ########## File path: core/src/main/java/org/apache/carbondata/core/index/IndexUtil.java ########## @@ -112,15 +115,32 @@ public static IndexJob getIndexJob(Configuration configuration) throws IOExcepti */ private static void executeClearIndexJob(IndexJob indexJob, CarbonTable carbonTable, String indexToClear) throws IOException { - SegmentStatusManager.ValidAndInvalidSegmentsInfo validAndInvalidSegmentsInfo = - getValidAndInvalidSegments(carbonTable, FileFactory.getConfiguration()); - List<String> invalidSegment = new ArrayList<>(); - for (Segment segment : validAndInvalidSegmentsInfo.getInvalidSegments()) { - invalidSegment.add(segment.getSegmentNo()); + IndexInputFormat indexInputFormat; + if (!carbonTable.isTransactionalTable()) { + ReadCommittedScope readCommittedScope = + new LatestFilesReadCommittedScope(carbonTable.getTablePath(), + FileFactory.getConfiguration()); + LoadMetadataDetails[] loadMetadataDetails = readCommittedScope.getSegmentList(); + List<Segment> listOfValidSegments = new ArrayList<>(); Review comment: Done ########## File path: core/src/main/java/org/apache/carbondata/core/index/IndexUtil.java ########## @@ -112,15 +115,32 @@ public static IndexJob getIndexJob(Configuration configuration) throws IOExcepti */ private static void executeClearIndexJob(IndexJob indexJob, CarbonTable carbonTable, String indexToClear) throws IOException { - SegmentStatusManager.ValidAndInvalidSegmentsInfo validAndInvalidSegmentsInfo = - getValidAndInvalidSegments(carbonTable, FileFactory.getConfiguration()); - List<String> invalidSegment = new ArrayList<>(); - for (Segment segment : validAndInvalidSegmentsInfo.getInvalidSegments()) { - invalidSegment.add(segment.getSegmentNo()); + IndexInputFormat indexInputFormat; + if (!carbonTable.isTransactionalTable()) { + ReadCommittedScope readCommittedScope = + new LatestFilesReadCommittedScope(carbonTable.getTablePath(), + FileFactory.getConfiguration()); + LoadMetadataDetails[] loadMetadataDetails = readCommittedScope.getSegmentList(); + List<Segment> listOfValidSegments = new ArrayList<>(); + for (LoadMetadataDetails segment : loadMetadataDetails) { Review comment: Done ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
kunal642 commented on pull request #4099: URL: https://github.com/apache/carbondata/pull/4099#issuecomment-793873234 LGTM ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA2 commented on pull request #4099: URL: https://github.com/apache/carbondata/pull/4099#issuecomment-794000275 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/3363/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA2 commented on pull request #4099: URL: https://github.com/apache/carbondata/pull/4099#issuecomment-794004052 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/5123/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
asfgit closed pull request #4099: URL: https://github.com/apache/carbondata/pull/4099 ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
Free forum by Nabble | Edit this page |