[GitHub] carbondata pull request #2985: [HOTFIX] Fixed Query performance issue

GitHub user kumarvishal09 opened a pull request:

https://github.com/apache/carbondata/pull/2985

[HOTFIX] Fixed Query performance issue

### Problem
When some pages return 0 rows after filtering, BlockletScannedResult still uncompresses all the pages. When the compression ratio is high and a blocklet contains many pages, page uncompression takes more time and hurts query performance.

### Solution
Added a check: if the number of records in a page after filtering is 0, that page is not uncompressed.
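
As an illustration only (this is not the code from the pull request), the idea can be sketched as a loop over per-page filtered row counts that skips any page whose count is 0. The class `PageSkipSketch` and the method `pagesToDecompress` are hypothetical names; only the array name `pageFilteredRowCount` is taken from the diff discussed in this thread.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the "skip empty pages" check; not CarbonData code. */
public class PageSkipSketch {

    /**
     * Returns the indexes of the pages that would still be uncompressed,
     * given the number of rows that survived filtering in each page.
     */
    public static List<Integer> pagesToDecompress(int[] pageFilteredRowCount) {
        List<Integer> decompressed = new ArrayList<>();
        for (int page = 0; page < pageFilteredRowCount.length; page++) {
            if (pageFilteredRowCount[page] == 0) {
                // No row in this page passed the filter, so uncompressing it
                // would be wasted work.
                continue;
            }
            decompressed.add(page);
        }
        return decompressed;
    }

    public static void main(String[] args) {
        int[] counts = {0, 3, 0, 5};
        System.out.println(pagesToDecompress(counts)); // prints [1, 3]
    }
}
```

In this sketch only two of the four pages are uncompressed, which is where the saving would come from when many pages are filtered out completely.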

- [ ] Any interfaces changed?

- [ ] Any backward compatibility impacted?

- [ ] Document update required?

- [ ] Testing done
Please provide details on
- Whether new unit test cases have been added or why no new tests are required?
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance test report.
- Any additional information to help reviewers in testing this change.

- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kumarvishal09/incubator-carbondata fixQueryPerf

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2985.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2985

----
commit 77474b31cfa984fd371870176eceb678f8cb1bb4
Author: kumarvishal09 <kumarvishal1802@...>
Date: 2018-12-13T06:10:18Z

Fixed Query performance issue

----

---

[GitHub] carbondata issue #2985: [HOTFIX] Fixed Query performance issue

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2985

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1729/

---

[GitHub] carbondata issue #2985: [HOTFIX] Fixed Query performance issue

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2985

Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9989/

---

[GitHub] carbondata issue #2985: [HOTFIX] Fixed Query performance issue

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2985

Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1941/

---

[GitHub] carbondata pull request #2985: [HOTFIX] Fixed Query performance issue

Github user qiuchenjian commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2985#discussion_r241285610

--- Diff: core/src/main/java/org/apache/carbondata/core/scan/result/BlockletScannedResult.java ---
@@ -663,6 +663,12 @@ public boolean hasNext() {
return true;
} else if (pageCounter < pageFilteredRowCount.length) {
pageCounter++;
+ if(pageCounter >= pageFilteredRowCount.length) {
--- End diff --

fillDataChunks calls freeDataChunkMemory(), but your change skips this operation. I think the "filteredRowCount == 0" case is not a problem, but the last page still needs to call this method to free its memory.

---

[GitHub] carbondata issue #2985: [HOTFIX] Fixed Query performance issue

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2985

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1733/

---

[GitHub] carbondata issue #2985: [HOTFIX] Fixed Query performance issue

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2985

Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1945/

---

[GitHub] carbondata issue #2985: [HOTFIX] Fixed Query performance issue

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2985

Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9993/

---

[GitHub] carbondata issue #2985: [HOTFIX] Fixed Query performance issue

Github user kumarvishal09 commented on the issue:

https://github.com/apache/carbondata/pull/2985

retest this please

---

[GitHub] carbondata pull request #2985: [HOTFIX] Fixed Query performance issue

Github user kumarvishal09 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2985#discussion_r241978586

--- Diff: core/src/main/java/org/apache/carbondata/core/scan/result/BlockletScannedResult.java ---
@@ -663,6 +663,12 @@ public boolean hasNext() {
return true;
} else if (pageCounter < pageFilteredRowCount.length) {
pageCounter++;
+ if(pageCounter >= pageFilteredRowCount.length) {
--- End diff --

The last page will be cleared from org.apache.carbondata.core.scan.processor.DataBlockIterator#scannedResult.
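
An illustrative sketch of the pattern discussed in this exchange (not part of the pull request, and not CarbonData code): a result object frees the previously loaded page whenever it advances to the next non-empty page, and the owner of the result frees whatever page is still loaded once iteration ends. All class and method names below (`LastPageReleaseSketch`, `ScannedResult`, `nextPage`, `releaseLoadedPage`, `scanAll`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: the owner frees the last loaded page after the scan. */
public class LastPageReleaseSketch {

    /** Stand-in for a scanned result that keeps one page loaded at a time. */
    static class ScannedResult {
        private final int[] pageFilteredRowCount;
        private int pageCounter = -1;
        private int loadedPage = -1; // -1 means no page is currently loaded
        final List<Integer> released = new ArrayList<>();

        ScannedResult(int[] pageFilteredRowCount) {
            this.pageFilteredRowCount = pageFilteredRowCount;
        }

        /** Moves to the next page that has rows; returns false when exhausted. */
        boolean nextPage() {
            while (++pageCounter < pageFilteredRowCount.length) {
                if (pageFilteredRowCount[pageCounter] == 0) {
                    continue; // empty page: never loaded, so nothing to free
                }
                releaseLoadedPage(); // free the previous page before loading
                loadedPage = pageCounter;
                return true;
            }
            return false;
        }

        /** Frees the currently loaded page, if any, and records its index. */
        void releaseLoadedPage() {
            if (loadedPage >= 0) {
                released.add(loadedPage);
                loadedPage = -1;
            }
        }
    }

    /** Stand-in for the owning iterator: drains the result, then frees it. */
    public static List<Integer> scanAll(int[] pageFilteredRowCount) {
        ScannedResult result = new ScannedResult(pageFilteredRowCount);
        while (result.nextPage()) {
            // rows of the loaded page would be consumed here
        }
        result.releaseLoadedPage(); // the last loaded page is freed by the owner
        return result.released;
    }

    public static void main(String[] args) {
        System.out.println(scanAll(new int[]{4, 0, 2, 0})); // prints [0, 2]
    }
}
```

Here the trailing empty page is skipped without loading, and page 2, the last one actually loaded, is still released because the owner performs the final cleanup.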

---

[GitHub] carbondata issue #2985: [HOTFIX] Fixed Query performance issue

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2985

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1781/

---

[GitHub] carbondata issue #2985: [HOTFIX] Fixed Query performance issue

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2985

Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10041/

---

[GitHub] carbondata issue #2985: [HOTFIX] Fixed Query performance issue

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2985

Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1994/

---

[GitHub] carbondata issue #2985: [HOTFIX] Fixed Query performance issue

Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2985

LGTM

---

[GitHub] carbondata pull request #2985: [HOTFIX] Fixed Query performance issue

Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/2985

---