GitHub user ajantha-bhat opened a pull request:
https://github.com/apache/carbondata/pull/2949

[WIP] support parallel block pruning for non-default datamaps

This PR is dependent on #2936.

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:

- [ ] Any interfaces changed?
- [ ] Any backward compatibility impacted?
- [ ] Document update required?
- [ ] Testing done
  Please provide details on
  - Whether new unit test cases have been added or why no new tests are required?
  - How it is tested? Please attach test report.
  - Is it a performance related change? Please attach the performance test report.
  - Any additional information to help reviewers in testing this change.
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ajantha-bhat/carbondata working_backup

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2949.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2949

----

commit 6237d69fcc0ddc1a08c74579762b721108a251fe
Author: ajantha-bhat <ajanthabhat@...>
Date:   2018-11-20T16:45:06Z

    parllelize block pruning

commit e8e912daf3ada357352e006ec9ce435d4c4b1625
Author: ajantha-bhat <ajanthabhat@...>
Date:   2018-11-22T11:01:53Z

    reveiw comment fix

commit d0bf82f276618f6fa09cbce65f714394b5fa4e0c
Author: ajantha-bhat <ajanthabhat@...>
Date:   2018-11-23T13:22:07Z

    support parallel pruning for non-default datamaps

----
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2949 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1526/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2949 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1527/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2949 Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9785/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2949 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1737/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2949 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1530/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2949 Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9788/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2949 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1740/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2949 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1536/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2949 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1747/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2949 Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9795/ --- |
Github user xuchuanyin commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2949#discussion_r236571984

--- Diff: core/src/main/java/org/apache/carbondata/core/datamap/dev/DataMap.java ---

    @@ -70,4 +70,6 @@ void init(DataMapModel dataMapModel)
        */
       void finish();

    +  // can return , number of records information that are stored in datamap.

--- End diff --

"can return"? What does this mean?

---
Github user ajantha-bhat commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2949#discussion_r236746764

--- Diff: core/src/main/java/org/apache/carbondata/core/datamap/dev/DataMap.java ---

    @@ -70,4 +70,6 @@ void init(DataMapModel dataMapModel)
        */
       void finish();

    +  // can return , number of records information that are stored in datamap.

--- End diff --

OK, changed to just "returns".

---
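For context, the method whose comment is being discussed can be reduced to a one-method sketch. The interface name below is illustrative (the real `DataMap` interface has several more methods); only the javadoc wording agreed on in this exchange is shown:

```java
// Reduced sketch of the interface method under review; "DataMapSketch" is a
// hypothetical name, not CarbonData's actual DataMap interface.
interface DataMapSketch {
  /** Returns the number of records whose information is stored in this datamap. */
  int getNumberOfEntries();
}

public class DataMapCommentDemo {
  public static void main(String[] args) {
    // Trivial implementation: a datamap covering three index entries.
    DataMapSketch dm = () -> 3;
    System.out.println(dm.getNumberOfEntries()); // 3
  }
}
```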
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2949 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1560/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2949 Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9818/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2949 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1771/ --- |
Github user xuchuanyin commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2949#discussion_r236907320

--- Diff: core/src/main/java/org/apache/carbondata/core/datamap/TableDataMap.java ---

    @@ -205,26 +195,53 @@ public BlockletDetailsFetcher getBlockletDetailsFetcher() {
           final FilterResolverIntf filterExp, final List<PartitionSpec> partitions,
           List<ExtendedBlocklet> blocklets, final Map<Segment, List<DataMap>> dataMaps,
           int totalFiles) {
    +    /*
    +     *********************************************************************************
    +     * Below is the example of how this part of code works.
    +     * consider a scenario of having 5 segments, 10 datamaps in each segment,

--- End diff --

Also, what does the 'record' mean below?

---
Github user xuchuanyin commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2949#discussion_r236907065

--- Diff: core/src/main/java/org/apache/carbondata/core/datamap/TableDataMap.java ---

    @@ -205,26 +195,53 @@ public BlockletDetailsFetcher getBlockletDetailsFetcher() {
           final FilterResolverIntf filterExp, final List<PartitionSpec> partitions,
           List<ExtendedBlocklet> blocklets, final Map<Segment, List<DataMap>> dataMaps,
           int totalFiles) {
    +    /*
    +     *********************************************************************************
    +     * Below is the example of how this part of code works.
    +     * consider a scenario of having 5 segments, 10 datamaps in each segment,

--- End diff --

What do you mean by saying '10 datamaps in each segment'? Do you mean 10 index files, merged index files, blocklets, or something else?

---
Github user ajantha-bhat commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2949#discussion_r240900313 --- Diff: core/src/main/java/org/apache/carbondata/core/datamap/TableDataMap.java --- @@ -205,26 +195,53 @@ public BlockletDetailsFetcher getBlockletDetailsFetcher() { final FilterResolverIntf filterExp, final List<PartitionSpec> partitions, List<ExtendedBlocklet> blocklets, final Map<Segment, List<DataMap>> dataMaps, int totalFiles) { + /* + ********************************************************************************* + * Below is the example of how this part of code works. + * consider a scenario of having 5 segments, 10 datamaps in each segment, --- End diff -- BlockDatamap and blockletDatamap can store multiple files information. Each file is one row in that datamap. But non-default datamaps are not like that, so default datamaps distribution in multithread happens based on number of entries in datamaps, for non-default datamps distribution is based on number of datamaps (one datamap is considered as one record for non-default datamaps) ALso 10 datamap in a segment means, one merge index file has info of 10 index files --- |
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2949#discussion_r241279625

--- Diff: datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMap.java ---

    @@ -436,4 +436,9 @@ public String toString() {
       public void finish() {
       }
    +
    +  @Override public int getNumberOfEntries() {

--- End diff --

Move this method to the available abstract class.

---
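The suggestion is to hoist `getNumberOfEntries()` into a shared base class so each concrete datamap does not repeat it. A minimal sketch of that pattern, using hypothetical class names rather than CarbonData's actual hierarchy, and assuming a default of one entry per datamap (consistent with the earlier reply that one non-default datamap counts as one record):

```java
// Base class provides the default, so datamaps with no finer-grained
// answer (e.g. a bloom-filter datamap) inherit it for free.
abstract class AbstractCoarseGrainDataMap {
  /** Returns the number of records stored in this datamap; one by default. */
  public int getNumberOfEntries() {
    return 1;
  }
}

// A datamap that really tracks per-row entries overrides the default.
class BlockletStyleDataMap extends AbstractCoarseGrainDataMap {
  private final int rowCount;
  BlockletStyleDataMap(int rowCount) { this.rowCount = rowCount; }
  @Override public int getNumberOfEntries() { return rowCount; }
}

class BloomStyleDataMap extends AbstractCoarseGrainDataMap {
  // Inherits getNumberOfEntries() == 1 without its own override.
}

public class DataMapHierarchyDemo {
  public static void main(String[] args) {
    System.out.println(new BlockletStyleDataMap(42).getNumberOfEntries()); // 42
    System.out.println(new BloomStyleDataMap().getNumberOfEntries());      // 1
  }
}
```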