GitHub user kumarvishal09 opened a pull request:
https://github.com/apache/carbondata/pull/2611

[WIP] Fixed data loading performance issue

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:

- [ ] Any interfaces changed?
- [ ] Any backward compatibility impacted?
- [ ] Document update required?
- [ ] Testing done
      Please provide details on
      - Whether new unit test cases have been added or why no new tests are required?
      - How it is tested? Please attach test report.
      - Is it a performance related change? Please attach the performance test report.
      - Any additional information to help reviewers in testing this change.
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kumarvishal09/incubator-carbondata dataloadPerFix

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2611.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2611

----
commit 5a2ebf3d056794387f2622818c9cf7be7ec4ec61
Author: kumarvishal09 <kumarvishal1802@...>
Date:   2018-08-06T13:30:27Z

    Fixed data loading performance issue

----

---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2611

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7801/

---
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2611

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6525/

---
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2611

SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6180/

---
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2611

SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6181/

---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2611

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7802/

---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2611

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6526/

---
Github user xuchuanyin commented on the issue:
https://github.com/apache/carbondata/pull/2611

@kumarvishal09 can you explain this modification?

In the previous implementation, we split a record into 'dict-sort', 'nodict-sort' and 'noSortDims & measures'. The 'noSortDims & measures' part is packed into bytes to avoid serialization-deserialization of those fields while reading/writing records to sort temp. With the previous implementation, we saw about an 8% improvement in data loading.

---
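The packing idea described in the comment above can be sketched roughly as follows. This is only an illustration under stated assumptions, not CarbonData's code: the class `PackedRowSketch` and its `pack`/`unpack` methods are hypothetical names, and the sketch assumes every no-sort dimension is a dictionary surrogate (`int`) and every measure is a `long`. The point is that the packed `byte[]` can be written to and read back from a sort temp file as an opaque blob, so the individual fields only need to be decoded once, after sorting.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical sketch: names and layout are assumptions, not CarbonData's API.
final class PackedRowSketch {

  // Pack the fields that do not take part in sorting into one byte array.
  // Assumed layout: all int dims first, then all long measures.
  static byte[] pack(int[] dictNoSortDims, long[] measures) {
    ByteBuffer buf = ByteBuffer.allocate(
        Integer.BYTES * dictNoSortDims.length + Long.BYTES * measures.length);
    for (int dim : dictNoSortDims) {
      buf.putInt(dim);
    }
    for (long measure : measures) {
      buf.putLong(measure);
    }
    return buf.array();
  }

  // Decode the packed bytes back into caller-provided arrays.
  // The array lengths tell us how many values of each kind to read.
  static void unpack(byte[] packed, int[] dictNoSortDimsOut, long[] measuresOut) {
    ByteBuffer buf = ByteBuffer.wrap(packed);
    for (int i = 0; i < dictNoSortDimsOut.length; i++) {
      dictNoSortDimsOut[i] = buf.getInt();
    }
    for (int i = 0; i < measuresOut.length; i++) {
      measuresOut[i] = buf.getLong();
    }
  }

  public static void main(String[] args) {
    int[] dims = {3, 7};
    long[] measures = {42L, -1L};

    // The packed form is what would travel through the sort temp file.
    byte[] packed = pack(dims, measures);

    int[] dimsOut = new int[dims.length];
    long[] measuresOut = new long[measures.length];
    unpack(packed, dimsOut, measuresOut);

    System.out.println("round trip ok: "
        + (Arrays.equals(dims, dimsOut) && Arrays.equals(measures, measuresOut)));
  }
}
```

The trade-off raised in this thread is between keeping such fields packed (less per-record decoding during intermediate merges) and keeping them as plain objects, as the diff under review does with `Object[] measures`.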
Github user kumarvishal09 commented on the issue:
https://github.com/apache/carbondata/pull/2611

@xuchuanyin Please check the description.

---
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2611#discussion_r208145664

--- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/row/IntermediateSortTempRow.java ---
@@ -31,6 +25,7 @@
   private int[] dictSortDims;
   private byte[][] noDictSortDims;
   private byte[] noSortDimsAndMeasures;
+  private Object[] measures;
--- End diff --

Add comment why it is needed

---
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2611#discussion_r208146114

--- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/sort/SortStepRowHandler.java ---
@@ -160,35 +160,45 @@ public SortStepRowHandler(SortParameters sortParameters) {
    * @return 3-parted row
    */
   public Object[] convertIntermediateSortTempRowTo3Parted(IntermediateSortTempRow sortTempRow) {
-    int[] dictDims = new int[this.dictSortDimCnt + this.dictNoSortDimCnt];
-    byte[][] noDictArray = new byte[this.noDictSortDimCnt + this.noDictNoSortDimCnt
-        + this.varcharDimCnt + this.complexDimCnt][];
-
-    int[] dictNoSortDims = new int[this.dictNoSortDimCnt];
-    byte[][] noDictNoSortAndVarcharComplexDims
-        = new byte[this.noDictNoSortDimCnt + this.varcharDimCnt + this.complexDimCnt][];
-    Object[] measures = new Object[this.measureCnt];
+    Object[] out = new Object[3];
+    NonDictionaryUtil
+        .prepareOutObj(out, sortTempRow.getDictSortDims(), sortTempRow.getNoDictSortDims(),
+            sortTempRow.getMeasures());
+    return out;
+  }
-    sortTempRow.unpackNoSortFromBytes(dictNoSortDims, noDictNoSortAndVarcharComplexDims, measures,
-        this.dataTypes, this.varcharDimCnt, this.complexDimCnt);
+  /**
+   * Read intermediate sort temp row from InputStream.
--- End diff --

Update the comment

---
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2611

SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6186/

---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2611

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7809/

---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2611

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6533/

---
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2611

LGTM

---