[GitHub] carbondata pull request #1875: [CARBONDATA-2092] Fix compaction bug to preve...

qiuchenjian-2
GitHub user manishgupta88 opened a pull request:

    https://github.com/apache/carbondata/pull/1875

    [CARBONDATA-2092] Fix compaction bug to prevent the compaction flow from going through the restructure compaction flow

    **Problem and analysis:**
    During data load, the current schema timestamp is written to the carbondata file header. During compaction, this timestamp is used to decide whether a block is a restructured block or conforms to the latest schema.
    Because the blocklet information is now stored in the index file, the carbondata file header is not read when that information is loaded into memory, so the schema timestamp is never set on the blocklet information. As a result, during compaction the current schema timestamp does not match the timestamp stored in the block, and the flow goes through the restructure compaction flow instead of the normal compaction flow.
   
    **Impact:**
    Compaction performance degrades because the restructure compaction flow sorts the data again.
   
    **Solution:**
    Modified the code so that compaction does not go through the restructure compaction flow unless a restructure operation (add or drop column) has been performed.
   
     - [ ] Any interfaces changed?
       No
     - [ ] Any backward compatibility impacted?
       No
     - [ ] Document update required?
       No
     - [ ] Testing done
       Manual testing
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
       NA
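
For illustration only, the mismatch described under "Problem and analysis" can be sketched as below. This is a minimal sketch, not the CarbonData implementation: `BlockMeta`, `isRestructuredBlock`, the direction of the comparison, and the timestamp value are all assumptions made for the example. It only shows why a schema timestamp that was never populated (left at the Java default of 0) would make an up-to-date block look restructured.

```java
public class CompactionFlowSketch {

    /** Hypothetical stand-in for the per-block metadata held in memory. */
    static final class BlockMeta {
        final long schemaUpdatedTimeStamp;

        BlockMeta(long schemaUpdatedTimeStamp) {
            this.schemaUpdatedTimeStamp = schemaUpdatedTimeStamp;
        }
    }

    /**
     * Assumed rule: a block counts as restructured when it was written
     * under a schema older than the table's current schema.
     */
    static boolean isRestructuredBlock(BlockMeta block, long tableSchemaTimeStamp) {
        return block.schemaUpdatedTimeStamp < tableSchemaTimeStamp;
    }

    public static void main(String[] args) {
        long tableSchemaTimeStamp = 1516722159000L; // illustrative value only

        // Timestamp carried over correctly: normal compaction flow.
        BlockMeta populated = new BlockMeta(tableSchemaTimeStamp);
        System.out.println(isRestructuredBlock(populated, tableSchemaTimeStamp)); // false

        // Timestamp never set (default 0): wrongly treated as restructured.
        BlockMeta unpopulated = new BlockMeta(0L);
        System.out.println(isRestructuredBlock(unpopulated, tableSchemaTimeStamp)); // true
    }
}
```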


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/manishgupta88/carbondata compaction_bug_fix

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1875.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1875
   
----
commit 18064e5b17649376169211b151375801c4dfca34
Author: manishgupta88 <tomanishgupta18@...>
Date:   2018-01-23T15:42:39Z

    Modified code to fix compaction bug to prevent the compaction flow from going through the restructure compaction flow until and unless and
    restructure add or drop column operation has not been performed

----


---

[GitHub] carbondata issue #1875: [CARBONDATA-2092] Fix compaction bug to prevent the ...

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1875
 
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1986/



---

[GitHub] carbondata issue #1875: [CARBONDATA-2092] Fix compaction bug to prevent the ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1875
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3220/



---

[GitHub] carbondata issue #1875: [CARBONDATA-2092] Fix compaction bug to prevent the ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1875
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3180/



---

[GitHub] carbondata pull request #1875: [CARBONDATA-2092] Fix compaction bug to preve...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1875#discussion_r164653466
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/util/AbstractDataFileFooterConverter.java ---
    @@ -165,6 +165,7 @@ private static BitSet getPresenceMeta(
             dataFileFooter.setBlockInfo(new BlockInfo(tableBlockInfo));
             dataFileFooter.setSegmentInfo(segmentInfo);
             dataFileFooter.setVersionId(tableBlockInfo.getVersion());
    +        dataFileFooter.setSchemaUpdatedTimeStamp(readIndexHeader.getSchema_time_stamp());
    --- End diff --
   
    What if the data is from an old store? It would be better to check `readIndexHeader.isSetSchema_time_stamp` before setting the value on the footer.
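
A minimal sketch of the guard suggested here, assuming simplified stand-ins for the real classes: `IndexHeader` and `Footer` below are hypothetical and only mimic the accessor names that appear in the diff (`getSchema_time_stamp`, `isSetSchema_time_stamp`, `setSchemaUpdatedTimeStamp`). What the footer should hold for an old store is not stated in this thread, so the sketch simply leaves the footer's existing value untouched.

```java
public class SchemaTimeStampGuardSketch {

    /** Hypothetical stand-in for the index header with an optional field. */
    static final class IndexHeader {
        private long schemaTimeStamp;
        private boolean schemaTimeStampSet;

        void setSchema_time_stamp(long value) {
            this.schemaTimeStamp = value;
            this.schemaTimeStampSet = true;
        }

        boolean isSetSchema_time_stamp() {
            return schemaTimeStampSet;
        }

        long getSchema_time_stamp() {
            return schemaTimeStamp;
        }
    }

    /** Hypothetical stand-in for the data file footer. */
    static final class Footer {
        private long schemaUpdatedTimeStamp;

        void setSchemaUpdatedTimeStamp(long value) {
            this.schemaUpdatedTimeStamp = value;
        }

        long getSchemaUpdatedTimeStamp() {
            return schemaUpdatedTimeStamp;
        }
    }

    /** Copies the schema timestamp only when the header actually carries one. */
    static void copySchemaTimeStamp(IndexHeader header, Footer footer) {
        if (header.isSetSchema_time_stamp()) {
            footer.setSchemaUpdatedTimeStamp(header.getSchema_time_stamp());
        }
    }

    public static void main(String[] args) {
        IndexHeader oldStoreHeader = new IndexHeader(); // field never written
        Footer footer = new Footer();
        copySchemaTimeStamp(oldStoreHeader, footer);
        System.out.println(footer.getSchemaUpdatedTimeStamp()); // 0, left untouched

        IndexHeader newStoreHeader = new IndexHeader();
        newStoreHeader.setSchema_time_stamp(42L);
        copySchemaTimeStamp(newStoreHeader, footer);
        System.out.println(footer.getSchemaUpdatedTimeStamp()); // 42
    }
}
```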


---

[GitHub] carbondata issue #1875: [CARBONDATA-2092] Fix compaction bug to prevent the ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user manishgupta88 commented on the issue:

    https://github.com/apache/carbondata/pull/1875
 
    @ravipesala Handled the review comments and fixed the failing test case. Kindly review and merge.


---

[GitHub] carbondata issue #1875: [CARBONDATA-2092] Fix compaction bug to prevent the ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1875
 
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2049/



---

[GitHub] carbondata issue #1875: [CARBONDATA-2092] Fix compaction bug to prevent the ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1875
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3289/



---

[GitHub] carbondata issue #1875: [CARBONDATA-2092] Fix compaction bug to prevent the ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1875
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3234/



---

[GitHub] carbondata issue #1875: [CARBONDATA-2092] Fix compaction bug to prevent the ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1875
 
    retest this please


---

[GitHub] carbondata issue #1875: [CARBONDATA-2092] Fix compaction bug to prevent the ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1875
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3343/



---

[GitHub] carbondata issue #1875: [CARBONDATA-2092] Fix compaction bug to prevent the ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1875
 
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2107/



---

[GitHub] carbondata issue #1875: [CARBONDATA-2092] Fix compaction bug to prevent the ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1875
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3365/



---

[GitHub] carbondata issue #1875: [CARBONDATA-2092] Fix compaction bug to prevent the ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1875
 
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2129/



---

[GitHub] carbondata issue #1875: [CARBONDATA-2092] Fix compaction bug to prevent the ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1875
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3278/



---

[GitHub] carbondata issue #1875: [CARBONDATA-2092] Fix compaction bug to prevent the ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1875
 
    LGTM


---

[GitHub] carbondata pull request #1875: [CARBONDATA-2092] Fix compaction bug to preve...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user asfgit closed the pull request at:

    https://github.com/apache/carbondata/pull/1875


---