GitHub user xuchuanyin opened a pull request:
https://github.com/apache/carbondata/pull/1606 WIP:[CARBONDATA-1839] Fix bugs in compressing sort temp files Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily: - [X] Any interfaces changed? `YES, ONLY CHANGE INTERNAL INTERFACES` - [X] Any backward compatibility impacted? `NO` - [X] Document update required? `YES` - [X] Testing done Please provide details on - Whether new unit test cases have been added or why no new tests are required? `ADDED TESTS` - How it is tested? Please attach test report. `TESTED IN LOCAL CLUSTER` - Is it a performance related change? Please attach the performance test report. `YES` - Any additional information to help reviewers in testing this change. `There are some duplicate code in write temp sort files found during this bug fixing and I plan to optimize it in successive PR not in this one.` - [X] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. `NOT RELATED` RESOLVE === 1. fix bugs in compressing sort temp files 2. optimize parameters 3. add tests You can merge this pull request into a Git repository by running: $ git pull https://github.com/xuchuanyin/carbondata bug_compress_sort_temp Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1606.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1606 ---- commit 83e109b6d332dafcc10e243e8bc5a21ac3617ac6 Author: xuchuanyin <[hidden email]> Date: 2017-12-01T11:32:00Z fix bugs in compressing sort temp files 1. fix bugs in compressing sort temp files 2. optimize parameters 3. add tests ---- --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1606 Build Failed with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/401/ --- |
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1606 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2061/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1606 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1674/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1606 Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/429/ --- |
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1606 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2085/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1606 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1700/ --- |
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1606 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2095/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1606 Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/445/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1606 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1714/ --- |
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1606#discussion_r154999120 --- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/sort/impl/ParallelReadMergeSorterImpl.java --- @@ -80,11 +81,7 @@ public void initialize(SortParameters sortParameters) { File.separator, CarbonCommonConstants.SORT_TEMP_FILE_LOCATION); finalMerger = new SingleThreadFinalSortFilesMerger(dataFolderLocations, sortParameters.getTableName(), - sortParameters.getDimColCount(), - sortParameters.getComplexDimColCount(), sortParameters.getMeasureColCount(), - sortParameters.getNoDictionaryCount(), sortParameters.getMeasureDataType(), - sortParameters.getNoDictionaryDimnesionColumn(), - sortParameters.getNoDictionarySortColumn()); + new TableFieldStat(sortParameters)); --- End diff -- I think the indentation is not correct --- |
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1606#discussion_r154999525 --- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/holder/UnsafeSortTempFileChunkHolder.java --- @@ -352,6 +330,49 @@ private void fillDataForPrefetch() { } } + private Object[][] getBatchedRowFromStream(int expected) throws CarbonSortKeyAndGroupByException { --- End diff -- please add comment for this function --- |
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1606#discussion_r154999809 --- Diff: processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/SortTempFileChunkHolder.java --- @@ -381,6 +351,51 @@ private void fillDataForPrefetch() { return holder; } + private Object[][] getBatchedRowFromStream(int expected) throws CarbonSortKeyAndGroupByException { --- End diff -- please add comment for this function --- |
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1606#discussion_r155000018 --- Diff: processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/TableFieldStat.java --- @@ -0,0 +1,139 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.carbondata.processing.sort.sortdata; + +import java.util.Objects; + +import org.apache.carbondata.core.metadata.datatype.DataType; + +public class TableFieldStat { --- End diff -- Add comment --- |
In reply to this post by qiuchenjian-2
Github user xuchuanyin commented on the issue:
https://github.com/apache/carbondata/pull/1606 @jackylk Thanks for your review, since another PR #1594 has been merged, I'll optimize this PR to reduce duplicate codes in writing sort temp files. --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1606 Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/495/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1606 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1757/ --- |
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1606 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2141/ --- |
In reply to this post by qiuchenjian-2
Github user jackylk commented on the issue:
https://github.com/apache/carbondata/pull/1606 Is the rework completed for this PR? --- |
In reply to this post by qiuchenjian-2
Github user xuchuanyin commented on the issue:
https://github.com/apache/carbondata/pull/1606 No, it is a little complex to reduce the duplicate code. I'll make a full test as possible as I can. --- |
Free forum by Nabble | Edit this page |