GitHub user manishgupta88 opened a pull request:
https://github.com/apache/carbondata/pull/1624 [WIP][CARBONDATA-1867] Add support for task/segment level pruning Added support for task/segment level pruning. Added code to compute task level min/max which can be helpful for task/segment level pruning Note: Dependent on PR #1619 ..Need to be rebased once PR 1619 is merged. - [ ] Any interfaces changed? - [ ] Any backward compatibility impacted? - [ ] Document update required? - [ ] Testing done Please provide details on - Whether new unit test cases have been added or why no new tests are required? - How it is tested? Please attach test report. - Is it a performance related change? Please attach the performance test report. - Any additional information to help reviewers in testing this change. - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. You can merge this pull request into a Git repository by running: $ git pull https://github.com/manishgupta88/carbondata add_task_level_min_max Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1624.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1624 ---- commit 5ada233b0e96baf9864407d6c9364495315ca6bc Author: manishgupta88 <[hidden email]> Date: 2017-12-06T11:41:51Z Added support for task/segment level pruning. Added code to compute task level min/max which can be helpful for task/segment level pruning ---- --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1624 Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/499/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1624 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1762/ --- |
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1624#discussion_r155232361 --- Diff: core/src/main/java/org/apache/carbondata/core/datamap/TableDataMap.java --- @@ -165,4 +165,22 @@ public DataMapFactory getDataMapFactory() { @Override public void onEvent(Event event, OperationContext opContext) { dataMapFactory.fireEvent(event); } + + /** + * Method to check whether the current segment need to be scanned or not for the queried data + * + * @param segmentId + * @param filterExp + * @return + * @throws IOException + */ + public boolean isScanRequired(String segmentId, FilterResolverIntf filterExp) throws IOException { --- End diff -- It should not be `isScanRequired`, it should be pruneSegments that should take segmentslist and filterexp, and returns the pruned list of segments --- |
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1624#discussion_r155232944 --- Diff: core/src/main/java/org/apache/carbondata/core/indexstore/UnsafeMemoryDMStore.java --- @@ -101,6 +103,25 @@ public void addIndexRowToUnsafe(DataMapRow indexRow) throws MemoryException { pointers[rowCount++] = pointer; } + /** + * Add the task min/max row to unsafe. + * + * @param taskMinMaxRow + * @return + */ + public void addTaskMinMaxRowToUnsafe(DataMapRow taskMinMaxRow, List<Integer> indexesToAccess) --- End diff -- why it is required a new method? why don't use `addToUnsafe` --- |
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1624#discussion_r155234625 --- Diff: core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMap.java --- @@ -269,6 +302,47 @@ private DataMapRow addMinMax(int[] minMaxLen, CarbonRowSchema carbonRowSchema, return minRow; } + /** + * This method will compute min/max values at task level + * + * @param taskMinMaxRow + * @param minMaxLen + * @param carbonRowSchema + * @param minMaxValue + * @param ordinal + * @param isMinValueComparison + */ + private void addTaskMinMaxValues(DataMapRow taskMinMaxRow, int[] minMaxLen, + CarbonRowSchema carbonRowSchema, byte[][] minMaxValue, int ordinal, + boolean isMinValueComparison) { + DataMapRow row = taskMinMaxRow.getRow(ordinal); + byte[][] updatedMinMaxValues = minMaxValue; + if (null == row) { + CarbonRowSchema[] minSchemas = + ((CarbonRowSchema.StructCarbonRowSchema) carbonRowSchema).getChildSchemas(); + row = new DataMapRowImpl(minSchemas); + } else { + byte[][] existingMinMaxValues = getMinMaxValue(taskMinMaxRow, ordinal); + // Compare and update min max values + SerializableComparator comparator = --- End diff -- I think you can use `UnsafeComparer.compare` --- |
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1624#discussion_r155236681 --- Diff: core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMap.java --- @@ -178,6 +191,26 @@ private void loadToUnsafe(DataFileFooter fileFooter, SegmentProperties segmentPr throw new RuntimeException(e); } } + // write the task level min/max row to unsafe memory store + if (null != taskMinMaxRow) { + addRowToUnsafeMemoryStore(taskMinMaxRow); + } + } + + private void addRowToUnsafeMemoryStore(DataMapRow row) { --- End diff -- I think better set some junk or default values exept for the minmax in case of last row. so that it will be easy to read and write --- |
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1624 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2145/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1624 Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/523/ --- |
In reply to this post by qiuchenjian-2
Github user manishgupta88 commented on the issue:
https://github.com/apache/carbondata/pull/1624 @ravipesala ..handled review comments..kindly review and merge --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1624 Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/524/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1624 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1780/ --- |
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1624 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2155/ --- |
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1624 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2156/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1624 Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/549/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1624 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1807/ --- |
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1624 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2179/ --- |
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1624 LGTM --- |
In reply to this post by qiuchenjian-2
|
Free forum by Nabble | Edit this page |