[GitHub] carbondata pull request #1013: [WIP]IUD Performance Changes

[GitHub] carbondata pull request #1013: [WIP]IUD Performance Changes

GitHub user sounakr opened a pull request:

https://github.com/apache/carbondata/pull/1013

[WIP]IUD Performance Changes

IUD Performance Changes

1. Get invalid blocks only when an update has been performed on the table.

2. Because UpdateVO is maintained per segment, there is no need to fetch it for each block.
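
The two changes above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in this pull request: `IudLookupSketch`, `SegmentUpdateInfo` (a stand-in for `UpdateVO`), `Block`, `loadUpdateInfo`, and the timestamp-based validity rule are all hypothetical names and behavior chosen for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these types are CarbonData's real classes.
class IudLookupSketch {

  /** Per-segment update details (stand-in for the UpdateVO mentioned above). */
  static final class SegmentUpdateInfo {
    final String segmentId;
    final long cutoff;

    SegmentUpdateInfo(String segmentId, long cutoff) {
      this.segmentId = segmentId;
      this.cutoff = cutoff;
    }
  }

  /** A block that belongs to one segment. */
  static final class Block {
    final String segmentId;
    final String name;
    final long timestamp;

    Block(String segmentId, String name, long timestamp) {
      this.segmentId = segmentId;
      this.name = name;
      this.timestamp = timestamp;
    }
  }

  /** Counts how often the (assumed expensive) per-segment lookup runs. */
  static int lookupCalls = 0;

  /** Assumed lookup: segments without an entry have never been updated. */
  static SegmentUpdateInfo loadUpdateInfo(String segmentId, Map<String, Long> cutoffs) {
    lookupCalls++;
    return new SegmentUpdateInfo(segmentId, cutoffs.getOrDefault(segmentId, Long.MIN_VALUE));
  }

  static List<Block> validBlocks(List<Block> blocks, boolean tableHasUpdates,
      Map<String, Long> cutoffs) {
    // Change 1: skip the invalid-block check entirely when the table was never updated.
    if (!tableHasUpdates) {
      return new ArrayList<>(blocks);
    }
    // Change 2: fetch the update info once per segment, not once per block.
    Map<String, SegmentUpdateInfo> perSegment = new HashMap<>();
    List<Block> result = new ArrayList<>();
    for (Block block : blocks) {
      SegmentUpdateInfo info =
          perSegment.computeIfAbsent(block.segmentId, id -> loadUpdateInfo(id, cutoffs));
      // Illustrative rule only: blocks older than the segment cutoff count as invalid.
      if (block.timestamp >= info.cutoff) {
        result.add(block);
      }
    }
    return result;
  }

  static List<Block> sampleBlocks() {
    return Arrays.asList(
        new Block("0", "b0", 5L), new Block("0", "b1", 15L),
        new Block("1", "b2", 7L), new Block("1", "b3", 9L));
  }

  static Map<String, Long> sampleCutoffs() {
    Map<String, Long> cutoffs = new HashMap<>();
    cutoffs.put("0", 10L); // only segment "0" has been updated
    return cutoffs;
  }

  public static void main(String[] args) {
    List<Block> valid = validBlocks(sampleBlocks(), true, sampleCutoffs());
    System.out.println(valid.size() + " valid blocks, " + lookupCalls + " segment lookups");
  }
}
```

With four blocks spread over two segments, the per-segment map keeps the lookup count at the number of segments, and a table without updates never triggers a lookup at all.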
---

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sounakr/incubator-carbondata IUD_Performance

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1013.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1013

----
commit a67ed7b89f321cb14e29d828fe2bd6f2554dc38a
Author: sounakr <[hidden email]>
Date: 2017-06-08T14:58:50Z

IUD Performance Changes

----

---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---

[GitHub] carbondata issue #1013: [WIP]IUD Performance Changes

Github user asfgit commented on the issue:

https://github.com/apache/carbondata/pull/1013

Can one of the admins verify this patch?

[GitHub] carbondata issue #1013: [WIP]IUD Performance Changes

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1013

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2321/

[GitHub] carbondata issue #1013: [WIP]IUD Performance Changes

Github user asfgit commented on the issue:

https://github.com/apache/carbondata/pull/1013

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/carbondata-pr-spark-1.6/197/

[GitHub] carbondata issue #1013: [WIP]IUD Performance Changes

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1013

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2346/

[GitHub] carbondata issue #1013: [WIP]IUD Performance Changes

Github user asfgit commented on the issue:

https://github.com/apache/carbondata/pull/1013

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/carbondata-pr-spark-1.6/225/

[GitHub] carbondata issue #1013: [CARBONDATA-1154] IUD Performance Changes

Github user asfgit commented on the issue:

https://github.com/apache/carbondata/pull/1013

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/carbondata-pr-spark-1.6/286/

Failed Tests: 2

carbondata-pr-spark-1.6/org.apache.carbondata:carbondata-spark-common-test: 2

- org.apache.carbondata.spark.testsuite.dataload.TestLoadDataFrame.test load dataframe with single pass enabled
- org.apache.carbondata.spark.testsuite.dataload.TestLoadDataFrame.test load dataframe with single pass disabled

[GitHub] carbondata issue #1013: [CARBONDATA-1154] IUD Performance Changes

Github user asfgit commented on the issue:

https://github.com/apache/carbondata/pull/1013

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/carbondata-pr-spark-1.6/289/

Failed Tests: 1

carbondata-pr-spark-1.6/org.apache.carbondata:carbondata-spark-common-test: 1

- org.apache.carbondata.spark.testsuite.allqueries.InsertIntoCarbonTableTestCase.insert into carbon table from carbon table union query

[GitHub] carbondata issue #1013: [CARBONDATA-1154] IUD Performance Changes

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1013

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2407/

[GitHub] carbondata issue #1013: [CARBONDATA-1154] IUD Performance Changes

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1013

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2410/

[GitHub] carbondata pull request #1013: [CARBONDATA-1154] IUD Performance Changes

Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1013#discussion_r121485733

--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/SegmentTaskIndexStore.java ---
@@ -184,22 +184,29 @@ private SegmentTaskIndexWrapper loadAndGetTaskIdToSegmentsMap(
SegmentUpdateStatusManager updateStatusManager =
new SegmentUpdateStatusManager(absoluteTableIdentifier);
String segmentId = null;
+ UpdateVO updateVO = null;
TaskBucketHolder taskBucketHolder = null;
try {
while (iteratorOverSegmentBlocksInfos.hasNext()) {
+ // Initialize the UpdateVO to Null for each segment.
+ updateVO = null;
--- End diff --

Better to move the declaration `UpdateVO updateVO = null;` here, inside the loop.
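
As a rough illustration of this suggestion, the sketch below declares the per-segment variable inside the loop so that every segment starts from `null` and no value can carry over from the previous segment. The class `LoopScopeSketch`, the placeholder `UpdateVO` type, and the map-of-blocks input are assumptions made for the example; the real method iterates over segment block infos as shown in the diff.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; UpdateVO here is a placeholder, not CarbonData's class.
class LoopScopeSketch {

  static final class UpdateVO {
    final String segmentId;

    UpdateVO(String segmentId) {
      this.segmentId = segmentId;
    }
  }

  /** Returns how many UpdateVO lookups were needed for the given segments. */
  static int lookupsFor(Map<String, List<String>> blocksBySegment) {
    int lookups = 0;
    for (Map.Entry<String, List<String>> segment : blocksBySegment.entrySet()) {
      // Declared inside the loop: scoped to one segment and reset automatically.
      UpdateVO updateVO = null;
      for (String block : segment.getValue()) {
        if (block.isEmpty()) {
          continue; // nothing to check for an unnamed block
        }
        if (updateVO == null) {
          // Created lazily, at most once per segment.
          updateVO = new UpdateVO(segment.getKey());
          lookups++;
        }
      }
    }
    return lookups;
  }

  static Map<String, List<String>> sampleSegments() {
    Map<String, List<String>> segments = new LinkedHashMap<>();
    segments.put("0", Arrays.asList("b0", "b1", "b2"));
    segments.put("1", Arrays.asList("b3", "b4"));
    segments.put("2", Collections.<String>emptyList());
    return segments;
  }

  public static void main(String[] args) {
    System.out.println(lookupsFor(sampleSegments()) + " lookups");
  }
}
```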

[GitHub] carbondata pull request #1013: [CARBONDATA-1154] IUD Performance Changes

Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1013#discussion_r121486113

--- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/CarbonInputFormat.java ---
@@ -340,23 +340,37 @@ private static AbsoluteTableIdentifier getAbsoluteTableIdentifier(Configuration

List<InputSplit> result = new LinkedList<InputSplit>();
FilterExpressionProcessor filterExpressionProcessor = new FilterExpressionProcessor();
+ UpdateVO invalidBlockVOForSegmentId = null;
+ Boolean IUDTable = false;
--- End diff --

Use `isIUDTable`

[GitHub] carbondata pull request #1013: [CARBONDATA-1154] IUD Performance Changes

Github user sounakr commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1013#discussion_r121572267

--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/SegmentTaskIndexStore.java ---
@@ -184,22 +184,29 @@ private SegmentTaskIndexWrapper loadAndGetTaskIdToSegmentsMap(
SegmentUpdateStatusManager updateStatusManager =
new SegmentUpdateStatusManager(absoluteTableIdentifier);
String segmentId = null;
+ UpdateVO updateVO = null;
TaskBucketHolder taskBucketHolder = null;
try {
while (iteratorOverSegmentBlocksInfos.hasNext()) {
+ // Initialize the UpdateVO to Null for each segment.
+ updateVO = null;
--- End diff --

Done.

[GitHub] carbondata pull request #1013: [CARBONDATA-1154] IUD Performance Changes

Github user sounakr commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1013#discussion_r121572395

--- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/CarbonInputFormat.java ---
@@ -340,23 +340,37 @@ private static AbsoluteTableIdentifier getAbsoluteTableIdentifier(Configuration

List<InputSplit> result = new LinkedList<InputSplit>();
FilterExpressionProcessor filterExpressionProcessor = new FilterExpressionProcessor();
+ UpdateVO invalidBlockVOForSegmentId = null;
+ Boolean IUDTable = false;
--- End diff --

Done

[GitHub] carbondata issue #1013: [CARBONDATA-1154] IUD Performance Changes

Github user asfgit commented on the issue:

https://github.com/apache/carbondata/pull/1013

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/carbondata-pr-spark-1.6/299/

[GitHub] carbondata issue #1013: [CARBONDATA-1154] IUD Performance Changes

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1013

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2420/

[GitHub] carbondata issue #1013: [CARBONDATA-1154] IUD Performance Changes

Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1013

LGTM

[GitHub] carbondata pull request #1013: [CARBONDATA-1154] IUD Performance Changes

Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/1013

[GitHub] carbondata issue #1013: [CARBONDATA-1154] IUD Performance Changes

Github user sehriff commented on the issue:

https://github.com/apache/carbondata/pull/1013

Which version is this PR based on? I applied it to 1.1.0 and some classes cannot be found, such as
org.apache.carbondata.core.metadata.schema.PartitionInfo;
org.apache.carbondata.core.scan.partition.PartitionUtil;
...
and there is an incompatible-types error:
carbondata-apache-carbondata-1.1.0/core/src/main/java/org/apache/carbondata/core/datastore/SegmentTaskIndexStore.java:[225,93] incompatible types: java.util.Map<org.apache.carbondata.core.datastore.SegmentTaskIndexStore1.TaskBucketHolder,org.apache.carbondata.core.datastore.block.AbstractIndex> cannot be converted to java.util.Map<org.apache.carbondata.core.datastore.SegmentTaskIndexStore.TaskBucketHolder,org.apache.carbondata.core.datastore.block.AbstractIndex>
