[GitHub] carbondata pull request #1013: [WIP]IUD Performance Changes


[GitHub] carbondata pull request #1013: [WIP]IUD Performance Changes

qiuchenjian-2
GitHub user sounakr opened a pull request:

    https://github.com/apache/carbondata/pull/1013

    [WIP]IUD Performance Changes

    IUD Performance Changes
   
    1. Get invalid blocks only when an update has been performed on the table.
   
    2. As UpdateVO is maintained per segment, there is no need to fetch it for each block.
    ---
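
The two points in the description can be illustrated with a minimal sketch. It is not the code from this pull request: `Block`, `loadInvalidPaths`, `validBlocks` and the `updateStatus` map are hypothetical stand-ins, and a plain set of invalid block paths takes the place of CarbonData's `UpdateVO`. It only shows the general idea of skipping the invalid-block lookup for tables that were never updated, and of doing the lookup once per segment rather than once per block.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class IudPruningSketch {

  /** A data block identified by its segment and path (hypothetical type). */
  static final class Block {
    final String segmentId;
    final String path;

    Block(String segmentId, String path) {
      this.segmentId = segmentId;
      this.path = path;
    }
  }

  /** Counts per-segment lookups so the saving is observable. */
  static int lookups = 0;

  /** Stand-in for the per-segment update lookup (hypothetical, not CarbonData's API). */
  static Set<String> loadInvalidPaths(String segmentId, Map<String, Set<String>> updateStatus) {
    lookups++;
    Set<String> invalid = updateStatus.get(segmentId);
    return invalid == null ? Collections.<String>emptySet() : invalid;
  }

  static List<Block> validBlocks(List<Block> blocks, boolean isIUDTable,
      Map<String, Set<String>> updateStatus) {
    // Point 1: a table that was never updated has no invalid blocks to look up.
    if (!isIUDTable) {
      return new ArrayList<>(blocks);
    }
    // Point 2: update details are per segment, so fetch them once per segment.
    Map<String, Set<String>> perSegment = new HashMap<>();
    List<Block> result = new ArrayList<>();
    for (Block block : blocks) {
      Set<String> invalid = perSegment.get(block.segmentId);
      if (invalid == null) {
        invalid = loadInvalidPaths(block.segmentId, updateStatus);
        perSegment.put(block.segmentId, invalid);
      }
      if (!invalid.contains(block.path)) {
        result.add(block);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, Set<String>> updateStatus = new HashMap<>();
    updateStatus.put("0", new HashSet<>(Arrays.asList("part-0-0")));
    List<Block> blocks = Arrays.asList(
        new Block("0", "part-0-0"), new Block("0", "part-0-1"), new Block("1", "part-1-0"));
    List<Block> valid = validBlocks(blocks, true, updateStatus);
    System.out.println(valid.size() + " valid blocks, " + lookups + " segment lookups");
  }
}
```

With many blocks per segment, the number of lookups in this sketch grows with the number of segments rather than the number of blocks, which is the kind of saving the description is aiming at.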


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sounakr/incubator-carbondata IUD_Performance

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1013.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1013
   
----
commit a67ed7b89f321cb14e29d828fe2bd6f2554dc38a
Author: sounakr <[hidden email]>
Date:   2017-06-08T14:58:50Z

    IUD Performance Changes

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---

[GitHub] carbondata issue #1013: [WIP]IUD Performance Changes

qiuchenjian-2
Github user asfgit commented on the issue:

    https://github.com/apache/carbondata/pull/1013
 
    Can one of the admins verify this patch?



[GitHub] carbondata issue #1013: [WIP]IUD Performance Changes

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1013
 
    Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2321/




[GitHub] carbondata issue #1013: [WIP]IUD Performance Changes

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user asfgit commented on the issue:

    https://github.com/apache/carbondata/pull/1013
 
   
    Refer to this link for build results (access rights to CI server needed):
    https://builds.apache.org/job/carbondata-pr-spark-1.6/197/




[GitHub] carbondata issue #1013: [WIP]IUD Performance Changes

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1013
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2346/




[GitHub] carbondata issue #1013: [WIP]IUD Performance Changes

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user asfgit commented on the issue:

    https://github.com/apache/carbondata/pull/1013
 
   
    Refer to this link for build results (access rights to CI server needed):
    https://builds.apache.org/job/carbondata-pr-spark-1.6/225/




[GitHub] carbondata issue #1013: [CARBONDATA-1154] IUD Performance Changes

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user asfgit commented on the issue:

    https://github.com/apache/carbondata/pull/1013
 
   
    Refer to this link for build results (access rights to CI server needed):
    https://builds.apache.org/job/carbondata-pr-spark-1.6/286/

    Failed Tests: 2

    carbondata-pr-spark-1.6/org.apache.carbondata:carbondata-spark-common-test: 2
      - org.apache.carbondata.spark.testsuite.dataload.TestLoadDataFrame.test load dataframe with single pass enabled
      - org.apache.carbondata.spark.testsuite.dataload.TestLoadDataFrame.test load dataframe with single pass disabled




[GitHub] carbondata issue #1013: [CARBONDATA-1154] IUD Performance Changes

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user asfgit commented on the issue:

    https://github.com/apache/carbondata/pull/1013
 
   
    Refer to this link for build results (access rights to CI server needed):
    https://builds.apache.org/job/carbondata-pr-spark-1.6/289/

    Failed Tests: 1

    carbondata-pr-spark-1.6/org.apache.carbondata:carbondata-spark-common-test: 1
      - org.apache.carbondata.spark.testsuite.allqueries.InsertIntoCarbonTableTestCase.insert into carbon table from carbon table union query




[GitHub] carbondata issue #1013: [CARBONDATA-1154] IUD Performance Changes

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1013
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2407/




[GitHub] carbondata issue #1013: [CARBONDATA-1154] IUD Performance Changes

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1013
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2410/




[GitHub] carbondata pull request #1013: [CARBONDATA-1154] IUD Performance Changes

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1013#discussion_r121485733
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/SegmentTaskIndexStore.java ---
    @@ -184,22 +184,29 @@ private SegmentTaskIndexWrapper loadAndGetTaskIdToSegmentsMap(
         SegmentUpdateStatusManager updateStatusManager =
             new SegmentUpdateStatusManager(absoluteTableIdentifier);
         String segmentId = null;
    +    UpdateVO updateVO = null;
         TaskBucketHolder taskBucketHolder = null;
         try {
           while (iteratorOverSegmentBlocksInfos.hasNext()) {
    +        // Initialize the UpdateVO to Null for each segment.
    +        updateVO = null;
    --- End diff --
   
    Better to move the declaration `UpdateVO updateVO = null;` here, inside the loop.
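
A minimal sketch of what this suggestion amounts to, under stated assumptions: a `String` stands in for `UpdateVO`, and `countSegmentsWithUpdates` and the `"updated"` prefix rule are hypothetical, not code from the pull request. Declaring the variable inside the loop body gives each segment a fresh `null`, so the separate reset line and its comment become unnecessary.

```java
import java.util.Arrays;
import java.util.List;

class LoopScopeSketch {

  /** Counts segments for which update details were found (hypothetical helper). */
  static int countSegmentsWithUpdates(List<String> segmentIds) {
    int count = 0;
    for (String segmentId : segmentIds) {
      // Declared inside the loop: every iteration starts from null, so a value
      // from a previous segment cannot leak into the next one.
      String updateInfo = null;
      if (segmentId.startsWith("updated")) {
        // Stand-in for fetching the per-segment update details.
        updateInfo = "info-" + segmentId;
      }
      if (updateInfo != null) {
        count++;
      }
    }
    return count;
  }

  public static void main(String[] args) {
    List<String> segments = Arrays.asList("updated-0", "plain-1", "updated-2");
    System.out.println(countSegmentsWithUpdates(segments));
  }
}
```

Narrowing the scope this way also lets the compiler reject any accidental use of the variable after the loop.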



[GitHub] carbondata pull request #1013: [CARBONDATA-1154] IUD Performance Changes

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1013#discussion_r121486113
 
    --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/CarbonInputFormat.java ---
    @@ -340,23 +340,37 @@ private static AbsoluteTableIdentifier getAbsoluteTableIdentifier(Configuration
     
         List<InputSplit> result = new LinkedList<InputSplit>();
         FilterExpressionProcessor filterExpressionProcessor = new FilterExpressionProcessor();
    +    UpdateVO invalidBlockVOForSegmentId = null;
    +    Boolean  IUDTable = false;
    --- End diff --
   
    Use `isIUDTable`



[GitHub] carbondata pull request #1013: [CARBONDATA-1154] IUD Performance Changes

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user sounakr commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1013#discussion_r121572267
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/SegmentTaskIndexStore.java ---
    @@ -184,22 +184,29 @@ private SegmentTaskIndexWrapper loadAndGetTaskIdToSegmentsMap(
         SegmentUpdateStatusManager updateStatusManager =
             new SegmentUpdateStatusManager(absoluteTableIdentifier);
         String segmentId = null;
    +    UpdateVO updateVO = null;
         TaskBucketHolder taskBucketHolder = null;
         try {
           while (iteratorOverSegmentBlocksInfos.hasNext()) {
    +        // Initialize the UpdateVO to Null for each segment.
    +        updateVO = null;
    --- End diff --
   
    Done.



[GitHub] carbondata pull request #1013: [CARBONDATA-1154] IUD Performance Changes

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user sounakr commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1013#discussion_r121572395
 
    --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/CarbonInputFormat.java ---
    @@ -340,23 +340,37 @@ private static AbsoluteTableIdentifier getAbsoluteTableIdentifier(Configuration
     
         List<InputSplit> result = new LinkedList<InputSplit>();
         FilterExpressionProcessor filterExpressionProcessor = new FilterExpressionProcessor();
    +    UpdateVO invalidBlockVOForSegmentId = null;
    +    Boolean  IUDTable = false;
    --- End diff --
   
    Done



[GitHub] carbondata issue #1013: [CARBONDATA-1154] IUD Performance Changes

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user asfgit commented on the issue:

    https://github.com/apache/carbondata/pull/1013
 
   
    Refer to this link for build results (access rights to CI server needed):
    https://builds.apache.org/job/carbondata-pr-spark-1.6/299/




[GitHub] carbondata issue #1013: [CARBONDATA-1154] IUD Performance Changes

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1013
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2420/




[GitHub] carbondata issue #1013: [CARBONDATA-1154] IUD Performance Changes

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1013
 
    LGTM



[GitHub] carbondata pull request #1013: [CARBONDATA-1154] IUD Performance Changes

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user asfgit closed the pull request at:

    https://github.com/apache/carbondata/pull/1013



[GitHub] carbondata issue #1013: [CARBONDATA-1154] IUD Performance Changes

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user sehriff commented on the issue:

    https://github.com/apache/carbondata/pull/1013
 
    Which version is this PR based on? I applied it to 1.1.0 and some classes cannot be found, such as
    org.apache.carbondata.core.metadata.schema.PartitionInfo;
    org.apache.carbondata.core.scan.partition.PartitionUtil;
    ...
    There is also a type incompatibility:
    carbondata-apache-carbondata-1.1.0/core/src/main/java/org/apache/carbondata/core/datastore/SegmentTaskIndexStore.java:[225,93] incompatible types: java.util.Map<org.apache.carbondata.core.datastore.SegmentTaskIndexStore1.TaskBucketHolder,org.apache.carbondata.core.datastore.block.AbstractIndex> cannot be converted to java.util.Map<org.apache.carbondata.core.datastore.SegmentTaskIndexStore.TaskBucketHolder,org.apache.carbondata.core.datastore.block.AbstractIndex>

