GitHub user ajantha-bhat opened a pull request:
https://github.com/apache/carbondata/pull/2322 [CARBONDATA-2472] Fixed:Refactor NonTransactional table code for Index file IO performance [CARBONDATA-2472] Fixed:Refactor NonTransactional table code for Index file IO performance problem: For NonTransactional table, for each query, 2 times IO happens (one time for actual blocklet load and one time for validation of schema) This results in more IO time. Solution: At the time of blocklet load itself do the validation. So, IO will be only one time. - [ ] Any interfaces changed? NA - [ ] Any backward compatibility impacted? NA - [ ] Document update required? NA - [ ] Testing done Already UT for this scenarios - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. NA You can merge this pull request into a Git repository by running: $ git pull https://github.com/ajantha-bhat/carbondata unmanaged_table Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/2322.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2322 ---- commit 591966f2e4a4ea50f11fc2b591366a155b204be1 Author: ajantha-bhat <ajanthabhat@...> Date: 2018-05-18T13:37:46Z [CARBONDATA-2472] Fixed:Refactor NonTransactional table code for Index file IO performance ---- --- |
Github user ajantha-bhat commented on the issue:
https://github.com/apache/carbondata/pull/2322 @gvramana : please review. --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2322 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4816/ --- |
In reply to this post by qiuchenjian-2
Github user ajantha-bhat commented on the issue:
https://github.com/apache/carbondata/pull/2322 retest this please --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2322 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5972/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2322 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4817/ --- |
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2322 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4998/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2322 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5976/ --- |
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2322 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4999/ --- |
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2322#discussion_r189477789 --- Diff: core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java --- @@ -191,7 +143,7 @@ private boolean isSameColumnSchemaList(List<ColumnSchema> indexFileColumnList, return detailedBlocklets; } Set<TableBlockIndexUniqueIdentifier> identifiers = - getTableBlockIndexUniqueIdentifiers(segment); + getTableBlockIndexUniqueIdentifiers(segment, this.getCarbonTable().getTableUniqueName()); --- End diff -- I think it is better to move getTableBlockIndexUniqueIdentifiers function to `CarbonTable` --- |
In reply to this post by qiuchenjian-2
Github user manishgupta88 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2322#discussion_r189495300 --- Diff: core/src/main/java/org/apache/carbondata/core/util/BlockletDataMapUtil.java --- @@ -67,7 +76,22 @@ List<DataFileFooter> indexInfo = fileFooterConverter.getIndexInfo( identifier.getIndexFilePath() + CarbonCommonConstants.FILE_SEPARATOR + identifier .getIndexFileName(), indexFileStore.getFileData(identifier.getIndexFileName())); + CarbonTable carbonTable = + CarbonMetadata.getInstance().getCarbonTable(identifier.getTableUniqueName()); --- End diff -- Avoid getting carbonTable using CarbonMetadata.getInstance as it can create problem in concurrent scenarios --- |
In reply to this post by qiuchenjian-2
Github user ajantha-bhat commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2322#discussion_r189502471 --- Diff: core/src/main/java/org/apache/carbondata/core/util/BlockletDataMapUtil.java --- @@ -67,7 +76,22 @@ List<DataFileFooter> indexInfo = fileFooterConverter.getIndexInfo( identifier.getIndexFilePath() + CarbonCommonConstants.FILE_SEPARATOR + identifier .getIndexFileName(), indexFileStore.getFileData(identifier.getIndexFileName())); + CarbonTable carbonTable = + CarbonMetadata.getInstance().getCarbonTable(identifier.getTableUniqueName()); --- End diff -- @manishgupta88 : If your concern is that schema can be modified. I can make this carbon table access only for NonTransactional tables as it is required only for NonTransactional table and schema modify (IUD) is not supported for NonTransactional table --- |
In reply to this post by qiuchenjian-2
Github user ajantha-bhat commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2322#discussion_r189509573 --- Diff: core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java --- @@ -191,7 +143,7 @@ private boolean isSameColumnSchemaList(List<ColumnSchema> indexFileColumnList, return detailedBlocklets; } Set<TableBlockIndexUniqueIdentifier> identifiers = - getTableBlockIndexUniqueIdentifiers(segment); + getTableBlockIndexUniqueIdentifiers(segment, this.getCarbonTable().getTableUniqueName()); --- End diff -- Can do that. But.. getTableBlockIndexUniqueIdentifiers is depednent on segmentMap of BlockletDataMapFactory. DeCoupling segmentMap from BlockletDataMapFactory needs more effrot. can handle in separate PR. --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2322 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6010/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2322 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4852/ --- |
In reply to this post by qiuchenjian-2
Github user ajantha-bhat commented on the issue:
https://github.com/apache/carbondata/pull/2322 Hanldled in #2435 --- |
In reply to this post by qiuchenjian-2
Github user ajantha-bhat closed the pull request at:
https://github.com/apache/carbondata/pull/2322 --- |
Free forum by Nabble | Edit this page |