Apache CarbonData Dev Mailing List archive › Apache CarbonData JIRA issues

[GitHub] carbondata pull request #2322: [CARBONDATA-2472] Fixed:Refactor NonTransacti...

Classic

List

17 messages Options

Options

[GitHub] carbondata pull request #2322: [CARBONDATA-2472] Fixed:Refactor NonTransacti...

GitHub user ajantha-bhat opened a pull request:

https://github.com/apache/carbondata/pull/2322

[CARBONDATA-2472] Fixed:Refactor NonTransactional table code for Index file IO performance

[CARBONDATA-2472] Fixed:Refactor NonTransactional table code for Index file IO performance

problem: For NonTransactional table, for each query, 2 times IO happens (one time for actual blocklet load and one time for validation of schema) This results in more IO time.

Solution: At the time of blocklet load itself do the validation. So, IO will be only one time.

- [ ] Any interfaces changed? NA

- [ ] Any backward compatibility impacted? NA

- [ ] Document update required? NA

- [ ] Testing done
Already UT for this scenarios
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. NA

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ajantha-bhat/carbondata unmanaged_table

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2322.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2322

----
commit 591966f2e4a4ea50f11fc2b591366a155b204be1
Author: ajantha-bhat <ajanthabhat@...>
Date: 2018-05-18T13:37:46Z

[CARBONDATA-2472] Fixed:Refactor NonTransactional table code for Index file IO performance

----

---

[GitHub] carbondata issue #2322: [CARBONDATA-2472] Fixed:Refactor NonTransactional ta...

Github user ajantha-bhat commented on the issue:

https://github.com/apache/carbondata/pull/2322

@gvramana : please review.

---

[GitHub] carbondata issue #2322: [CARBONDATA-2472] Fixed:Refactor NonTransactional ta...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2322

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4816/

---

[GitHub] carbondata issue #2322: [CARBONDATA-2472] Fixed:Refactor NonTransactional ta...

In reply to this post by qiuchenjian-2

Github user ajantha-bhat commented on the issue:

https://github.com/apache/carbondata/pull/2322

retest this please

---

[GitHub] carbondata issue #2322: [CARBONDATA-2472] Fixed:Refactor NonTransactional ta...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2322

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5972/

---

[GitHub] carbondata issue #2322: [CARBONDATA-2472] Fixed:Refactor NonTransactional ta...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2322

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4817/

---

[GitHub] carbondata issue #2322: [CARBONDATA-2472] Fixed:Refactor NonTransactional ta...

In reply to this post by qiuchenjian-2

Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2322

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4998/

---

[GitHub] carbondata issue #2322: [CARBONDATA-2472] Fixed:Refactor NonTransactional ta...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2322

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5976/

---

[GitHub] carbondata issue #2322: [CARBONDATA-2472] Fixed:Refactor NonTransactional ta...

In reply to this post by qiuchenjian-2

Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2322

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4999/

---

[GitHub] carbondata pull request #2322: [CARBONDATA-2472] Fixed:Refactor NonTransacti...

In reply to this post by qiuchenjian-2

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2322#discussion_r189477789

--- Diff: core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java ---
@@ -191,7 +143,7 @@ private boolean isSameColumnSchemaList(List<ColumnSchema> indexFileColumnList,
return detailedBlocklets;
}
Set<TableBlockIndexUniqueIdentifier> identifiers =
- getTableBlockIndexUniqueIdentifiers(segment);
+ getTableBlockIndexUniqueIdentifiers(segment, this.getCarbonTable().getTableUniqueName());
--- End diff --

I think it is better to move getTableBlockIndexUniqueIdentifiers function to `CarbonTable`

---

[GitHub] carbondata pull request #2322: [CARBONDATA-2472] Fixed:Refactor NonTransacti...

In reply to this post by qiuchenjian-2

Github user manishgupta88 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2322#discussion_r189495300

--- Diff: core/src/main/java/org/apache/carbondata/core/util/BlockletDataMapUtil.java ---
@@ -67,7 +76,22 @@
List<DataFileFooter> indexInfo = fileFooterConverter.getIndexInfo(
identifier.getIndexFilePath() + CarbonCommonConstants.FILE_SEPARATOR + identifier
.getIndexFileName(), indexFileStore.getFileData(identifier.getIndexFileName()));
+ CarbonTable carbonTable =
+ CarbonMetadata.getInstance().getCarbonTable(identifier.getTableUniqueName());
--- End diff --

Avoid getting carbonTable using CarbonMetadata.getInstance as it can create problem in concurrent scenarios

---

[GitHub] carbondata pull request #2322: [CARBONDATA-2472] Fixed:Refactor NonTransacti...

In reply to this post by qiuchenjian-2

Github user ajantha-bhat commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2322#discussion_r189502471

--- Diff: core/src/main/java/org/apache/carbondata/core/util/BlockletDataMapUtil.java ---
@@ -67,7 +76,22 @@
List<DataFileFooter> indexInfo = fileFooterConverter.getIndexInfo(
identifier.getIndexFilePath() + CarbonCommonConstants.FILE_SEPARATOR + identifier
.getIndexFileName(), indexFileStore.getFileData(identifier.getIndexFileName()));
+ CarbonTable carbonTable =
+ CarbonMetadata.getInstance().getCarbonTable(identifier.getTableUniqueName());
--- End diff --

@manishgupta88 : If your concern is that schema can be modified. I can make this carbon table access only for NonTransactional tables as it is required only for NonTransactional table and schema modify (IUD) is not supported for NonTransactional table

---

[GitHub] carbondata pull request #2322: [CARBONDATA-2472] Fixed:Refactor NonTransacti...

In reply to this post by qiuchenjian-2

Github user ajantha-bhat commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2322#discussion_r189509573

--- Diff: core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java ---
@@ -191,7 +143,7 @@ private boolean isSameColumnSchemaList(List<ColumnSchema> indexFileColumnList,
return detailedBlocklets;
}
Set<TableBlockIndexUniqueIdentifier> identifiers =
- getTableBlockIndexUniqueIdentifiers(segment);
+ getTableBlockIndexUniqueIdentifiers(segment, this.getCarbonTable().getTableUniqueName());
--- End diff --

Can do that. But..

getTableBlockIndexUniqueIdentifiers is depednent on segmentMap of BlockletDataMapFactory. DeCoupling segmentMap from BlockletDataMapFactory needs more effrot. can handle in separate PR.

---

[GitHub] carbondata issue #2322: [CARBONDATA-2472] Fixed:Refactor NonTransactional ta...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2322

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6010/

---

[GitHub] carbondata issue #2322: [CARBONDATA-2472] Fixed:Refactor NonTransactional ta...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2322

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4852/

---

[GitHub] carbondata issue #2322: [CARBONDATA-2472] Fixed:Refactor NonTransactional ta...

In reply to this post by qiuchenjian-2

Github user ajantha-bhat commented on the issue:

https://github.com/apache/carbondata/pull/2322

Hanldled in #2435

---

[GitHub] carbondata pull request #2322: [CARBONDATA-2472] Fixed:Refactor NonTransacti...

In reply to this post by qiuchenjian-2

Github user ajantha-bhat closed the pull request at:

https://github.com/apache/carbondata/pull/2322

---