GitHub user akashrn5 opened a pull request:
https://github.com/apache/carbondata/pull/2269

[WIP][LUCENE] close the lucene index reader after every task and clean the resource and other functional issues

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:

 - [ ] Any interfaces changed?
 - [ ] Any backward compatibility impacted?
 - [ ] Document update required?
 - [ ] Testing done
   Please provide details on
   - Whether new unit test cases have been added or why no new tests are required?
   - How it is tested? Please attach test report.
   - Is it a performance related change? Please attach the performance test report.
   - Any additional information to help reviewers in testing this change.
 - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/akashrn5/incubator-carbondata dfx

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2269.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2269

----
commit 792ed11c176b47e95b63f9c6c81e39f852eec61d
Author: akashrn5 <akashnilugal@...>
Date: 2018-05-04T04:59:12Z

    close the lucene index reader after every task and clean the resource and other functional issues
----
---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2269 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4475/ ---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2269 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5635/ ---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2269 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5697/ ---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2269 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4537/ ---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2269 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5710/ ---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2269 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4548/ ---
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2269 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4775/ ---
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2269 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4789/ ---
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2269#discussion_r186685351 --- Diff: core/src/main/java/org/apache/carbondata/core/datamap/TableDataMap.java --- @@ -133,6 +133,10 @@ return distributables; } + public List<DataMap> getTableDataMaps(DataMapDistributable distributable) throws IOException { --- End diff -- please add comment ---
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2269#discussion_r186685606 --- Diff: datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMap.java --- @@ -302,9 +302,20 @@ public boolean isScanRequired(FilterResolverIntf filterExp) { /** * Clear complete index table and release memory. */ - @Override - public void clear() { - + @Override public void clear() { --- End diff -- move @Override to previous line ---
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2269#discussion_r186685739 --- Diff: datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMap.java --- @@ -302,9 +302,20 @@ public boolean isScanRequired(FilterResolverIntf filterExp) { /** * Clear complete index table and release memory. */ - @Override - public void clear() { - + @Override public void clear() { + if (null != indexReader) { + try { + int referenceCount = indexReader.getRefCount(); + if (referenceCount > 0) { + indexReader.decRef(); --- End diff -- Is it a thread safe operation? ---
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2269#discussion_r186685941 --- Diff: datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMap.java --- @@ -302,9 +302,20 @@ public boolean isScanRequired(FilterResolverIntf filterExp) { /** * Clear complete index table and release memory. */ - @Override - public void clear() { - + @Override public void clear() { + if (null != indexReader) { + try { + int referenceCount = indexReader.getRefCount(); + if (referenceCount > 0) { + indexReader.decRef(); --- End diff -- What if it is equal to 0? Who will close it then? ---
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2269#discussion_r186686363 --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/AbstractDataMapJob.java --- @@ -36,7 +36,7 @@ } @Override public List<ExtendedBlocklet> execute(DistributableDataMapFormat dataMapFormat, - FilterResolverIntf resolverIntf) { + FilterResolverIntf resolverIntf, CarbonTable carbonTable) { --- End diff -- Why is carbonTable added? I could not find where it is used. ---
Github user akashrn5 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2269#discussion_r186694178 --- Diff: datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMap.java --- @@ -302,9 +302,20 @@ public boolean isScanRequired(FilterResolverIntf filterExp) { /** * Clear complete index table and release memory. */ - @Override - public void clear() { - + @Override public void clear() { + if (null != indexReader) { + try { + int referenceCount = indexReader.getRefCount(); + if (referenceCount > 0) { + indexReader.decRef(); --- End diff -- Closing the index reader internally calls decRef(), which checks the reference count and closes the reader once all references have dropped to zero. If you call close() directly, it will call decRef(), which can throw a "this IndexReader is already closed" exception if the reference count is already zero. Lucene therefore recommends first getting the reference count and only then closing the reader. It is also thread safe. (Reference: https://lucene.apache.org/core/6_0_1/core/org/apache/lucene/index/IndexReader.html) ---
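The refcount contract described above can be illustrated with a small self-contained model. `RefCountedReader` below is a hypothetical stand-in for Lucene's `IndexReader`, not Lucene code: `decRef()` closes the reader when the count reaches zero and throws if called once the reader is already closed, which is why the PR's `clear()` checks `getRefCount()` before decrementing.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for Lucene's IndexReader refcount contract
// (illustration only; Lucene's real implementation differs internally).
class RefCountedReader {
    private final AtomicInteger refCount = new AtomicInteger(1);
    private volatile boolean closed = false;

    int getRefCount() { return refCount.get(); }

    boolean isClosed() { return closed; }

    // Decrement atomically; the call that brings the count to zero
    // performs the close. Calling decRef() after the count has reached
    // zero throws, mirroring "this IndexReader is already closed".
    void decRef() {
        int rc = refCount.decrementAndGet();
        if (rc == 0) {
            closed = true; // last reference: release underlying resources
        } else if (rc < 0) {
            refCount.incrementAndGet();
            throw new IllegalStateException("this reader is already closed");
        }
    }
}

public class ClearExample {
    public static void main(String[] args) {
        RefCountedReader indexReader = new RefCountedReader();
        // The clear() pattern from the PR: only decRef() while a
        // reference is still held, so the exception is never hit.
        if (indexReader.getRefCount() > 0) {
            indexReader.decRef();
        }
        System.out.println("closed = " + indexReader.isClosed()); // prints "closed = true"
    }
}
```

Because the decrement is atomic, exactly one caller observes the transition to zero and performs the close, which matches the thread-safety claim in the comment above.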
Github user akashrn5 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2269#discussion_r186694695 --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/AbstractDataMapJob.java --- @@ -36,7 +36,7 @@ } @Override public List<ExtendedBlocklet> execute(DistributableDataMapFormat dataMapFormat, - FilterResolverIntf resolverIntf) { + FilterResolverIntf resolverIntf, CarbonTable carbonTable) { --- End diff -- It is used in DataMapPruneRDD, where carbonTable is used in the taskCompletionListener inside internalCompute. ---
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2269#discussion_r186723292 --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/DistributableDataMapFormat.java --- @@ -100,14 +103,18 @@ private static FilterResolverIntf getFilterExp(Configuration configuration) thro return new RecordReader<Void, ExtendedBlocklet>() { private Iterator<ExtendedBlocklet> blockletIterator; private ExtendedBlocklet currBlocklet; + private List<DataMap> dataMaps; @Override public void initialize(InputSplit inputSplit, TaskAttemptContext taskAttemptContext) throws IOException, InterruptedException { - DataMapDistributableWrapper distributable = (DataMapDistributableWrapper) inputSplit; - TableDataMap dataMap = DataMapStoreManager.getInstance() + distributable = (DataMapDistributableWrapper) inputSplit; + TableDataMap tableDataMap = DataMapStoreManager.getInstance() .getDataMap(table, distributable.getDistributable().getDataMapSchema()); - List<ExtendedBlocklet> blocklets = dataMap.prune(distributable.getDistributable(), - dataMapExprWrapper.getFilterResolverIntf(distributable.getUniqueId()), partitions); + dataMaps = tableDataMap.getTableDataMaps(distributable.getDistributable()); + List<ExtendedBlocklet> blocklets = tableDataMap + .prune(dataMaps, --- End diff -- I am not sure why we need to pass dataMaps to the prune function. And `tableDataMap.getTableDataMaps` seems strange. ---
Github user akashrn5 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2269#discussion_r186724944 --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/DistributableDataMapFormat.java --- @@ -100,14 +103,18 @@ private static FilterResolverIntf getFilterExp(Configuration configuration) thro return new RecordReader<Void, ExtendedBlocklet>() { private Iterator<ExtendedBlocklet> blockletIterator; private ExtendedBlocklet currBlocklet; + private List<DataMap> dataMaps; @Override public void initialize(InputSplit inputSplit, TaskAttemptContext taskAttemptContext) throws IOException, InterruptedException { - DataMapDistributableWrapper distributable = (DataMapDistributableWrapper) inputSplit; - TableDataMap dataMap = DataMapStoreManager.getInstance() + distributable = (DataMapDistributableWrapper) inputSplit; + TableDataMap tableDataMap = DataMapStoreManager.getInstance() .getDataMap(table, distributable.getDistributable().getDataMapSchema()); - List<ExtendedBlocklet> blocklets = dataMap.prune(distributable.getDistributable(), - dataMapExprWrapper.getFilterResolverIntf(distributable.getUniqueId()), partitions); + dataMaps = tableDataMap.getTableDataMaps(distributable.getDistributable()); + List<ExtendedBlocklet> blocklets = tableDataMap + .prune(dataMaps, --- End diff -- It is just refactoring so that we can close the reader for the datamaps in the distributable. Initially the getDataMaps call was inside the prune method in TableDataMap.java, but we need all the datamaps of that distributable in order to close their readers, so this is just a small refactoring to achieve that. ---
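The refactoring described above, fetching the DataMap instances once so the same list can be both pruned and later closed, can be sketched as follows. The interfaces are simplified stand-ins for CarbonData's types, not the project's actual signatures:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified stand-in for CarbonData's DataMap (illustration only).
interface DataMap {
    List<String> prune(String filter); // returns matching blocklet ids
    void clear();                      // releases the underlying reader
}

class TableDataMapSketch {
    // After the refactor the caller supplies the DataMaps, so the same
    // list remains available afterwards to call clear() on each one
    // (e.g. from a task completion listener).
    List<String> prune(List<DataMap> dataMaps, String filter) {
        List<String> blocklets = new ArrayList<>();
        for (DataMap dataMap : dataMaps) {
            blocklets.addAll(dataMap.prune(filter));
        }
        return blocklets;
    }
}

public class PruneSketch {
    public static void main(String[] args) {
        DataMap dm = new DataMap() {
            public List<String> prune(String filter) { return Arrays.asList("blocklet-1"); }
            public void clear() { System.out.println("reader closed"); }
        };
        List<DataMap> dataMaps = Arrays.asList(dm);
        List<String> blocklets = new TableDataMapSketch().prune(dataMaps, "text match");
        System.out.println(blocklets);
        for (DataMap d : dataMaps) {
            d.clear(); // on task completion: release each reader
        }
    }
}
```

Before the change, prune() looked the DataMaps up internally, leaving the caller with no handle to release; passing the list in is what lets the record reader close the Lucene readers when the task finishes.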
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2269#discussion_r186743361 --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/AbstractDataMapJob.java --- @@ -36,7 +36,7 @@ } @Override public List<ExtendedBlocklet> execute(DistributableDataMapFormat dataMapFormat, - FilterResolverIntf resolverIntf) { + FilterResolverIntf resolverIntf, CarbonTable carbonTable) { --- End diff -- No need to pass carbonTable here, it is already present in `DistributableDataMapFormat`, you can get from there ---
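The suggestion above, reading the table from the format instead of widening the execute() signature, could look roughly like this. The class names are simplified stand-ins for CarbonData's types, and the getter is an assumed addition rather than confirmed project API:

```java
// Placeholder for CarbonData's CarbonTable (illustration only).
class CarbonTable {
    private final String tableName;
    CarbonTable(String tableName) { this.tableName = tableName; }
    String getTableName() { return tableName; }
}

class DistributableDataMapFormatSketch {
    private final CarbonTable table;
    DistributableDataMapFormatSketch(CarbonTable table) { this.table = table; }

    // Assumed getter: the format already holds the table, so callers can
    // read it instead of receiving it as an extra execute() parameter.
    CarbonTable getCarbonTable() { return table; }
}

public class ExecuteSketch {
    // execute() keeps a narrower signature; the table is recovered from
    // the format it was constructed with.
    static String execute(DistributableDataMapFormatSketch format) {
        CarbonTable carbonTable = format.getCarbonTable();
        return "pruning " + carbonTable.getTableName();
    }

    public static void main(String[] args) {
        DistributableDataMapFormatSketch format =
            new DistributableDataMapFormatSketch(new CarbonTable("sales"));
        System.out.println(execute(format)); // prints "pruning sales"
    }
}
```

Keeping the parameter list narrow avoids threading the same object through every caller when one participant already owns it.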
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2269#discussion_r186744138 --- Diff: datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMapFactory.java --- @@ -109,4 +109,8 @@ public DataMapLevel getDataMapLevel() { return false; } } + + @Override public void clear() { --- End diff -- Why is an empty method overridden? ---