kevinjmh opened a new pull request #3804: URL: https://github.com/apache/carbondata/pull/3804 ### Why is this PR needed? Currently carbon uses priority queue to sort holders of sorted rows. It first polls a holder from the heap, and adds it back if holder is not empty. This will cause two times heap maintainance. We can reduce half of that. ### What changes were proposed in this PR? What will be done when poll item from priority queue is: 1. remove first item. 2. move the last item to the position of first item, siftDown the new first item. In the case in carbon, we can peek(without removing from heap) the first item and get a row, and siftDown the holder to a proper position if the holder is not empty. ### Does this PR introduce any user interface change? - No - Yes. (please explain the change and update document) ### Is any new testcase added? - No - Yes ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
CarbonDataQA1 commented on pull request #3804: URL: https://github.com/apache/carbondata/pull/3804#issuecomment-648767888 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3204/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA1 commented on pull request #3804: URL: https://github.com/apache/carbondata/pull/3804#issuecomment-648768762 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1478/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA1 commented on pull request #3804: URL: https://github.com/apache/carbondata/pull/3804#issuecomment-648946730 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3214/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA1 commented on pull request #3804: URL: https://github.com/apache/carbondata/pull/3804#issuecomment-648949038 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1486/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA1 commented on pull request #3804: URL: https://github.com/apache/carbondata/pull/3804#issuecomment-650724384 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3234/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA1 commented on pull request #3804: URL: https://github.com/apache/carbondata/pull/3804#issuecomment-650724764 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1506/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
kevinjmh commented on pull request #3804: URL: https://github.com/apache/carbondata/pull/3804#issuecomment-651002898 retest this please ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
QiangCai commented on a change in pull request #3804: URL: https://github.com/apache/carbondata/pull/3804#discussion_r446839075 ########## File path: processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/merger/UnsafeInMemoryIntermediateDataMerger.java ########## @@ -141,25 +140,26 @@ private UnsafeCarbonRowForMerge getSortedRecordFromMemory() { // be based on comparator we are passing the heap // when will call poll it will always delete root of the tree and then // it does trickel down operation complexity is log(n) - UnsafeInmemoryMergeHolder poll = this.recordHolderHeap.poll(); + UnsafeInmemoryMergeHolder poll = this.recordHolderHeap.peek(); // get the row from chunk row = poll.getRow(); // check if there no entry present if (!poll.hasNext()) { + this.recordHolderHeap.poll(); Review comment: to release resource, invoke this.recordHolderHeap.close() ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
QiangCai commented on pull request #3804: URL: https://github.com/apache/carbondata/pull/3804#issuecomment-651010686 how about to change RowResultMergerProcessor? ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
kevinjmh commented on a change in pull request #3804: URL: https://github.com/apache/carbondata/pull/3804#discussion_r446854580 ########## File path: processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/merger/UnsafeInMemoryIntermediateDataMerger.java ########## @@ -141,25 +140,26 @@ private UnsafeCarbonRowForMerge getSortedRecordFromMemory() { // be based on comparator we are passing the heap // when will call poll it will always delete root of the tree and then // it does trickel down operation complexity is log(n) - UnsafeInmemoryMergeHolder poll = this.recordHolderHeap.poll(); + UnsafeInmemoryMergeHolder poll = this.recordHolderHeap.peek(); // get the row from chunk row = poll.getRow(); // check if there no entry present if (!poll.hasNext()) { + this.recordHolderHeap.poll(); Review comment: ok ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
QiangCai edited a comment on pull request #3804: URL: https://github.com/apache/carbondata/pull/3804#issuecomment-651010686 can you change RowResultMergerProcessor? ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
QiangCai commented on a change in pull request #3804: URL: https://github.com/apache/carbondata/pull/3804#discussion_r446910728 ########## File path: processing/src/main/java/org/apache/carbondata/processing/merger/RowResultMergerProcessor.java ########## @@ -126,7 +125,7 @@ public boolean execute(List<RawResultIterator> unsortedResultIteratorList, RawResultIterator iterator = null; while (index > 1) { // iterator the top record - iterator = this.recordHolderHeap.poll(); + iterator = this.recordHolderHeap.peek(); Object[] convertedRow = iterator.next(); if (null == convertedRow) { index--; Review comment: insert code: this.recordHolderHeap.poll(); ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA1 commented on pull request #3804: URL: https://github.com/apache/carbondata/pull/3804#issuecomment-651078587 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1518/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
kevinjmh commented on a change in pull request #3804: URL: https://github.com/apache/carbondata/pull/3804#discussion_r446924645 ########## File path: processing/src/main/java/org/apache/carbondata/processing/merger/RowResultMergerProcessor.java ########## @@ -126,7 +125,7 @@ public boolean execute(List<RawResultIterator> unsortedResultIteratorList, RawResultIterator iterator = null; while (index > 1) { // iterator the top record - iterator = this.recordHolderHeap.poll(); + iterator = this.recordHolderHeap.peek(); Object[] convertedRow = iterator.next(); if (null == convertedRow) { index--; Review comment: done ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA1 commented on pull request #3804: URL: https://github.com/apache/carbondata/pull/3804#issuecomment-651144733 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3251/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA1 commented on pull request #3804: URL: https://github.com/apache/carbondata/pull/3804#issuecomment-651160152 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3257/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
CarbonDataQA1 commented on pull request #3804: URL: https://github.com/apache/carbondata/pull/3804#issuecomment-651162149 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1521/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
QiangCai commented on pull request #3804: URL: https://github.com/apache/carbondata/pull/3804#issuecomment-651450852 LGTM ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
In reply to this post by GitBox
asfgit closed pull request #3804: URL: https://github.com/apache/carbondata/pull/3804 ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] |
Free forum by Nabble | Edit this page |