GitHub user manishgupta88 opened a pull request:
https://github.com/apache/carbondata/pull/1875

[CARBONDATA-2092] Fix compaction bug to prevent the compaction flow from going through the restructure compaction flow

**Problem and analysis:**
During data load, the current schema timestamp is written to the carbondata file header. Compaction uses this timestamp to decide whether a block is a restructured block or conforms to the latest schema.
Because the blocklet information is now stored in the index file, the carbondata file header is not read while loading it into memory, so the schema timestamp is never set on the blocklet information. As a result, during compaction the current schema timestamp does not match the timestamp stored in the block, and the flow goes through the restructure compaction flow instead of the normal compaction flow.

**Impact:**
Compaction performance degrades, because the restructure compaction flow sorts the data again.

**Solution:**
Modified the code so that compaction goes through the restructure compaction flow only when a restructure (add or drop column) operation has been performed.

 - [ ] Any interfaces changed?
       No
 - [ ] Any backward compatibility impacted?
       No
 - [ ] Document update required?
       No
 - [ ] Testing done
       Manual testing
 - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
       NA

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/manishgupta88/carbondata compaction_bug_fix

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1875.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1875

----
commit 18064e5b17649376169211b151375801c4dfca34
Author: manishgupta88 <tomanishgupta18@...>
Date:   2018-01-23T15:42:39Z

    Modified code to fix compaction bug to prevent the compaction flow from going through the restructure compaction flow until and unless and restructure add or drop column operation has not been performed

----
--- |
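To make the analysis in the pull request description concrete, here is a minimal sketch of how a missing schema timestamp could push every block into the restructure flow. All class and method names (`CompactionFlowSketch`, `BlockMeta`, `anyRestructuredBlock`, `chooseFlow`) are hypothetical and are not CarbonData's API. The comparison rule, that a block counts as restructured when its timestamp is older than the table's current schema timestamp, is an assumption based on the description.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; every name here is hypothetical, not CarbonData's API. */
public class CompactionFlowSketch {

    /** Stand-in for the per-block metadata held in memory after loading the index file. */
    public static final class BlockMeta {
        final long schemaUpdatedTimeStamp;

        public BlockMeta(long schemaUpdatedTimeStamp) {
            this.schemaUpdatedTimeStamp = schemaUpdatedTimeStamp;
        }
    }

    /** Assumed rule: a block is restructured if it was written before the current schema. */
    public static boolean anyRestructuredBlock(List<BlockMeta> blocks, long tableSchemaTimeStamp) {
        for (BlockMeta block : blocks) {
            if (block.schemaUpdatedTimeStamp < tableSchemaTimeStamp) {
                return true;
            }
        }
        return false;
    }

    /** Picks the compaction flow from the block timestamps. */
    public static String chooseFlow(List<BlockMeta> blocks, long tableSchemaTimeStamp) {
        return anyRestructuredBlock(blocks, tableSchemaTimeStamp) ? "RESTRUCTURE" : "NORMAL";
    }

    public static void main(String[] args) {
        long tableSchemaTs = 1000L; // arbitrary value for illustration

        // Timestamp carried over correctly: the block matches the current schema.
        System.out.println(chooseFlow(Arrays.asList(new BlockMeta(tableSchemaTs)), tableSchemaTs));

        // Timestamp never set (left at the long default of 0): the block looks restructured.
        System.out.println(chooseFlow(Arrays.asList(new BlockMeta(0L)), tableSchemaTs));
    }
}
```

Under these assumptions the first call prints `NORMAL` and the second prints `RESTRUCTURE`, even though no add or drop column operation took place.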
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1875 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1986/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1875 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3220/ --- |
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1875 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3180/ --- |
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1875#discussion_r164653466

--- Diff: core/src/main/java/org/apache/carbondata/core/util/AbstractDataFileFooterConverter.java ---
@@ -165,6 +165,7 @@ private static BitSet getPresenceMeta(
         dataFileFooter.setBlockInfo(new BlockInfo(tableBlockInfo));
         dataFileFooter.setSegmentInfo(segmentInfo);
         dataFileFooter.setVersionId(tableBlockInfo.getVersion());
    +    dataFileFooter.setSchemaUpdatedTimeStamp(readIndexHeader.getSchema_time_stamp());
--- End diff --

What if it is from an old store? Better to check `readIndexHeader.isSetSchema_time_stamp` before setting it on the footer.
--- |
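The guard suggested in this review comment could look roughly like the sketch below. It does not use the real CarbonData or Thrift-generated classes: `IndexHeaderStub`, `FooterStub` and `copySchemaTimestamp` are hypothetical stand-ins that only mimic the accessor names quoted in the diff, and what the footer should hold for an old store is left open here.

```java
/** Illustrative sketch only; the stub classes are hypothetical stand-ins. */
public class SchemaTimestampGuardSketch {

    /** Mimics an optional field with Thrift-style set/isSet/get accessors. */
    public static final class IndexHeaderStub {
        private long schemaTimeStamp;
        private boolean schemaTimeStampSet;

        public void setSchema_time_stamp(long value) {
            this.schemaTimeStamp = value;
            this.schemaTimeStampSet = true;
        }

        public boolean isSetSchema_time_stamp() {
            return schemaTimeStampSet;
        }

        public long getSchema_time_stamp() {
            return schemaTimeStamp;
        }
    }

    /** Mimics the footer object that receives the timestamp. */
    public static final class FooterStub {
        private long schemaUpdatedTimeStamp;

        public void setSchemaUpdatedTimeStamp(long value) {
            this.schemaUpdatedTimeStamp = value;
        }

        public long getSchemaUpdatedTimeStamp() {
            return schemaUpdatedTimeStamp;
        }
    }

    /** Copies the timestamp only when the header carries one; returns whether it did. */
    public static boolean copySchemaTimestamp(IndexHeaderStub header, FooterStub footer) {
        if (header.isSetSchema_time_stamp()) {
            footer.setSchemaUpdatedTimeStamp(header.getSchema_time_stamp());
            return true;
        }
        // Old store: the field is absent, so the footer keeps its existing value.
        return false;
    }

    public static void main(String[] args) {
        IndexHeaderStub newStoreHeader = new IndexHeaderStub();
        newStoreHeader.setSchema_time_stamp(1234L);
        FooterStub footer = new FooterStub();
        System.out.println(copySchemaTimestamp(newStoreHeader, footer));

        IndexHeaderStub oldStoreHeader = new IndexHeaderStub();
        System.out.println(copySchemaTimestamp(oldStoreHeader, new FooterStub()));
    }
}
```

Returning a boolean lets the caller decide how to treat blocks from an old store instead of silently relying on a default value.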
Github user manishgupta88 commented on the issue:
https://github.com/apache/carbondata/pull/1875 @ravipesala Handled the review comments and fixed the failing test case. Kindly review and merge. --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1875 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2049/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1875 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3289/ --- |
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1875 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3234/ --- |
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1875 retest this please --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1875 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3343/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1875 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2107/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1875 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3365/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1875 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2129/ --- |
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1875 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3278/ --- |
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1875 LGTM --- |