GitHub user akashrn5 opened a pull request:
https://github.com/apache/carbondata/pull/2269

[WIP][LUCENE] close the lucene index reader after every task and clean the resource and other functional issues

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:

 - [ ] Any interfaces changed?
 - [ ] Any backward compatibility impacted?
 - [ ] Document update required?
 - [ ] Testing done
   Please provide details on
   - Whether new unit test cases have been added or why no new tests are required?
   - How it is tested? Please attach test report.
   - Is it a performance related change? Please attach the performance test report.
   - Any additional information to help reviewers in testing this change.
 - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/akashrn5/incubator-carbondata dfx

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2269.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2269

----
commit 792ed11c176b47e95b63f9c6c81e39f852eec61d
Author: akashrn5 <akashnilugal@...>
Date: 2018-05-04T04:59:12Z

    close the lucene index reader after every task and clean the resource and other functional issues
----
---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2269 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4475/ ---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2269 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5635/ ---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2269 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5697/ ---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2269 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4537/ ---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2269 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5710/ ---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2269 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4548/ ---
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2269 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4775/ ---
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2269 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4789/ ---
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2269#discussion_r186685351 --- Diff: core/src/main/java/org/apache/carbondata/core/datamap/TableDataMap.java --- @@ -133,6 +133,10 @@ return distributables; } + public List<DataMap> getTableDataMaps(DataMapDistributable distributable) throws IOException { --- End diff -- please add comment ---
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2269#discussion_r186685606 --- Diff: datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMap.java --- @@ -302,9 +302,20 @@ public boolean isScanRequired(FilterResolverIntf filterExp) { /** * Clear complete index table and release memory. */ - @Override - public void clear() { - + @Override public void clear() { --- End diff -- move @Override to previous line ---
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2269#discussion_r186685739 --- Diff: datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMap.java --- @@ -302,9 +302,20 @@ public boolean isScanRequired(FilterResolverIntf filterExp) { /** * Clear complete index table and release memory. */ - @Override - public void clear() { - + @Override public void clear() { + if (null != indexReader) { + try { + int referenceCount = indexReader.getRefCount(); + if (referenceCount > 0) { + indexReader.decRef(); --- End diff -- Is it a thread safe operation? ---
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2269#discussion_r186685941 --- Diff: datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMap.java --- @@ -302,9 +302,20 @@ public boolean isScanRequired(FilterResolverIntf filterExp) { /** * Clear complete index table and release memory. */ - @Override - public void clear() { - + @Override public void clear() { + if (null != indexReader) { + try { + int referenceCount = indexReader.getRefCount(); + if (referenceCount > 0) { + indexReader.decRef(); --- End diff -- What if it is equal to 0? Who will close it then? ---
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2269#discussion_r186686363 --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/AbstractDataMapJob.java --- @@ -36,7 +36,7 @@ } @Override public List<ExtendedBlocklet> execute(DistributableDataMapFormat dataMapFormat, - FilterResolverIntf resolverIntf) { + FilterResolverIntf resolverIntf, CarbonTable carbonTable) { --- End diff -- Why is carbonTable added? I could not find where it is used. ---
Github user akashrn5 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2269#discussion_r186694178 --- Diff: datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMap.java --- @@ -302,9 +302,20 @@ public boolean isScanRequired(FilterResolverIntf filterExp) { /** * Clear complete index table and release memory. */ - @Override - public void clear() { - + @Override public void clear() { + if (null != indexReader) { + try { + int referenceCount = indexReader.getRefCount(); + if (referenceCount > 0) { + indexReader.decRef(); --- End diff -- Closing the index reader internally calls decRef(), which checks the reference count and closes the reader once all references have dropped to zero. If you call close() directly, it will call decRef(), which can throw a "this IndexReader is already closed" exception if the reference count is already zero. Lucene therefore recommends first getting the reference count and only then closing the reader. It is also thread safe. (Reference: https://lucene.apache.org/core/6_0_1/core/org/apache/lucene/index/IndexReader.html) ---
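The refcount contract described above can be illustrated with a small self-contained model. `RefCountedReader` below is a hypothetical stand-in for Lucene's `IndexReader`, not Lucene code: `decRef()` closes the reader when the count reaches zero and throws if called once the reader is already closed, which is why the PR's `clear()` checks `getRefCount()` before decrementing.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for Lucene's IndexReader refcount contract
// (illustration only; Lucene's real implementation differs internally).
class RefCountedReader {
    private final AtomicInteger refCount = new AtomicInteger(1);
    private volatile boolean closed = false;

    int getRefCount() { return refCount.get(); }

    boolean isClosed() { return closed; }

    // Decrement atomically; the call that brings the count to zero
    // performs the close. Calling decRef() after the count has reached
    // zero throws, mirroring "this IndexReader is already closed".
    void decRef() {
        int rc = refCount.decrementAndGet();
        if (rc == 0) {
            closed = true; // last reference: release underlying resources
        } else if (rc < 0) {
            refCount.incrementAndGet();
            throw new IllegalStateException("this reader is already closed");
        }
    }
}

public class ClearExample {
    public static void main(String[] args) {
        RefCountedReader indexReader = new RefCountedReader();
        // The clear() pattern from the PR: only decRef() while a
        // reference is still held, so the exception is never hit.
        if (indexReader.getRefCount() > 0) {
            indexReader.decRef();
        }
        System.out.println("closed = " + indexReader.isClosed()); // prints "closed = true"
    }
}
```

Because the decrement is atomic, exactly one caller observes the transition to zero and performs the close, which matches the thread-safety claim in the comment above.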
Github user akashrn5 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2269#discussion_r186694695 --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/AbstractDataMapJob.java --- @@ -36,7 +36,7 @@ } @Override public List<ExtendedBlocklet> execute(DistributableDataMapFormat dataMapFormat, - FilterResolverIntf resolverIntf) { + FilterResolverIntf resolverIntf, CarbonTable carbonTable) { --- End diff -- It is used in DataMapPruneRDD, where carbonTable is used in the taskCompletionListener inside internalCompute. ---
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2269#discussion_r186723292 --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/DistributableDataMapFormat.java --- @@ -100,14 +103,18 @@ private static FilterResolverIntf getFilterExp(Configuration configuration) thro return new RecordReader<Void, ExtendedBlocklet>() { private Iterator<ExtendedBlocklet> blockletIterator; private ExtendedBlocklet currBlocklet; + private List<DataMap> dataMaps; @Override public void initialize(InputSplit inputSplit, TaskAttemptContext taskAttemptContext) throws IOException, InterruptedException { - DataMapDistributableWrapper distributable = (DataMapDistributableWrapper) inputSplit; - TableDataMap dataMap = DataMapStoreManager.getInstance() + distributable = (DataMapDistributableWrapper) inputSplit; + TableDataMap tableDataMap = DataMapStoreManager.getInstance() .getDataMap(table, distributable.getDistributable().getDataMapSchema()); - List<ExtendedBlocklet> blocklets = dataMap.prune(distributable.getDistributable(), - dataMapExprWrapper.getFilterResolverIntf(distributable.getUniqueId()), partitions); + dataMaps = tableDataMap.getTableDataMaps(distributable.getDistributable()); + List<ExtendedBlocklet> blocklets = tableDataMap + .prune(dataMaps, --- End diff -- I am not sure why we need to pass dataMaps to the prune function. And `tableDataMap.getTableDataMaps` seems strange. ---
Github user akashrn5 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2269#discussion_r186724944 --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/DistributableDataMapFormat.java --- @@ -100,14 +103,18 @@ private static FilterResolverIntf getFilterExp(Configuration configuration) thro return new RecordReader<Void, ExtendedBlocklet>() { private Iterator<ExtendedBlocklet> blockletIterator; private ExtendedBlocklet currBlocklet; + private List<DataMap> dataMaps; @Override public void initialize(InputSplit inputSplit, TaskAttemptContext taskAttemptContext) throws IOException, InterruptedException { - DataMapDistributableWrapper distributable = (DataMapDistributableWrapper) inputSplit; - TableDataMap dataMap = DataMapStoreManager.getInstance() + distributable = (DataMapDistributableWrapper) inputSplit; + TableDataMap tableDataMap = DataMapStoreManager.getInstance() .getDataMap(table, distributable.getDistributable().getDataMapSchema()); - List<ExtendedBlocklet> blocklets = dataMap.prune(distributable.getDistributable(), - dataMapExprWrapper.getFilterResolverIntf(distributable.getUniqueId()), partitions); + dataMaps = tableDataMap.getTableDataMaps(distributable.getDistributable()); + List<ExtendedBlocklet> blocklets = tableDataMap + .prune(dataMaps, --- End diff -- It is just refactoring so that we can close the reader for the datamaps in the distributable. Initially the getDataMaps call was inside the prune method in TableDataMap.java, but we need all the datamaps of that distributable in order to close their readers, so this is just a small refactoring to achieve that. ---
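The refactoring described above, fetching the DataMap instances once so the same list can be both pruned and later closed, can be sketched as follows. The interfaces are simplified stand-ins for CarbonData's types, not the project's actual signatures:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified stand-in for CarbonData's DataMap (illustration only).
interface DataMap {
    List<String> prune(String filter); // returns matching blocklet ids
    void clear();                      // releases the underlying reader
}

class TableDataMapSketch {
    // After the refactor the caller supplies the DataMaps, so the same
    // list remains available afterwards to call clear() on each one
    // (e.g. from a task completion listener).
    List<String> prune(List<DataMap> dataMaps, String filter) {
        List<String> blocklets = new ArrayList<>();
        for (DataMap dataMap : dataMaps) {
            blocklets.addAll(dataMap.prune(filter));
        }
        return blocklets;
    }
}

public class PruneSketch {
    public static void main(String[] args) {
        DataMap dm = new DataMap() {
            public List<String> prune(String filter) { return Arrays.asList("blocklet-1"); }
            public void clear() { System.out.println("reader closed"); }
        };
        List<DataMap> dataMaps = Arrays.asList(dm);
        List<String> blocklets = new TableDataMapSketch().prune(dataMaps, "text match");
        System.out.println(blocklets);
        for (DataMap d : dataMaps) {
            d.clear(); // on task completion: release each reader
        }
    }
}
```

Before the change, prune() looked the DataMaps up internally, leaving the caller with no handle to release; passing the list in is what lets the record reader close the Lucene readers when the task finishes.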
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2269#discussion_r186743361 --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/AbstractDataMapJob.java --- @@ -36,7 +36,7 @@ } @Override public List<ExtendedBlocklet> execute(DistributableDataMapFormat dataMapFormat, - FilterResolverIntf resolverIntf) { + FilterResolverIntf resolverIntf, CarbonTable carbonTable) { --- End diff -- No need to pass carbonTable here, it is already present in `DistributableDataMapFormat`, you can get from there ---
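The suggestion above, reading the table from the format instead of widening the execute() signature, could look roughly like this. The class names are simplified stand-ins for CarbonData's types, and the getter is an assumed addition rather than confirmed project API:

```java
// Placeholder for CarbonData's CarbonTable (illustration only).
class CarbonTable {
    private final String tableName;
    CarbonTable(String tableName) { this.tableName = tableName; }
    String getTableName() { return tableName; }
}

class DistributableDataMapFormatSketch {
    private final CarbonTable table;
    DistributableDataMapFormatSketch(CarbonTable table) { this.table = table; }

    // Assumed getter: the format already holds the table, so callers can
    // read it instead of receiving it as an extra execute() parameter.
    CarbonTable getCarbonTable() { return table; }
}

public class ExecuteSketch {
    // execute() keeps a narrower signature; the table is recovered from
    // the format it was constructed with.
    static String execute(DistributableDataMapFormatSketch format) {
        CarbonTable carbonTable = format.getCarbonTable();
        return "pruning " + carbonTable.getTableName();
    }

    public static void main(String[] args) {
        DistributableDataMapFormatSketch format =
            new DistributableDataMapFormatSketch(new CarbonTable("sales"));
        System.out.println(execute(format)); // prints "pruning sales"
    }
}
```

Keeping the parameter list narrow avoids threading the same object through every caller when one participant already owns it.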
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2269#discussion_r186744138 --- Diff: datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMapFactory.java --- @@ -109,4 +109,8 @@ public DataMapLevel getDataMapLevel() { return false; } } + + @Override public void clear() { --- End diff -- Why is an empty method overridden? ---