[GitHub] carbondata pull request #2944: [CARBONDATA-3122]CarbonReader memory leak


[GitHub] carbondata pull request #2944: [CARBONDATA-3122]CarbonReader memory leak

qiuchenjian-2
GitHub user BJangir opened a pull request:

    https://github.com/apache/carbondata/pull/2944

    [CARBONDATA-3122]CarbonReader memory leak

    **Issue  Detail**
     CarbonReader keeps a List of initialized RecordReaders, one per split, and each split holds its page data for as long as its RecordReader is referenced from that List. The readers become eligible for GC only once the user returns from the calling method (they are not cleaned up even in `close()`). Until then, the last page of each split stays in memory, which is not correct. For example, with 1K carbon files, the last page of each file (~32K * 100 in size, for 100 String columns in memory) stays in memory until the last split, so in total ~3GB of memory is occupied (1K * 32K * 100).
    Check the heap dump of 3 splits taken after `reader.close()` is called. It can be seen that `currentReader` and all the readers in the list are still holding memory.
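
The estimate above can be reproduced as a quick calculation. This is only a sketch: it assumes "32K" means 32,000 rows per page and roughly one byte per retained value, neither of which is stated in the PR, and the class and method names are hypothetical.

```java
// Hypothetical helper, not part of CarbonData: reproduces the 1K * 32K * 100 estimate.
class LeakEstimateSketch {

  // Number of values kept alive when the last page of every file is retained.
  static long retainedValues(long files, long rowsPerPage, long columns) {
    return files * rowsPerPage * columns;
  }

  public static void main(String[] args) {
    long values = retainedValues(1_000L, 32_000L, 100L);
    // Assumption: about one byte per value, so values is also a rough byte count.
    double gib = values / (1024.0 * 1024.0 * 1024.0);
    System.out.printf("retained values: %d (~%.2f GiB at 1 byte each)%n", values, gib);
  }
}
```

Under these assumptions the product is 3,200,000,000 values, which is in line with the ~3GB figure quoted above.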
    ![image](https://user-images.githubusercontent.com/12861989/48916831-e09bf100-eea9-11e8-9b58-7a4ed572d72e.png)
   
    ![image](https://user-images.githubusercontent.com/12861989/48917034-d29aa000-eeaa-11e8-8683-666f6f6e57c9.png)
   
   
    **Solution**
    1. Once a reader is finished, set its `currentReader` entry in the RecordReader List to `null`.  
    OR
    2. Set the `future` object to `null` in `org.apache.carbondata.core.scan.processor.DataBlockIterator#close()`.
     Solution 2 is adopted so that flows other than CarbonReader also benefit.
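
A minimal sketch of the idea behind Solution 2, assuming the iterator holds its prefetched page through a `Future`. The classes below are hypothetical stand-ins, not the actual `DataBlockIterator` code. They only illustrate that a completed `Future` keeps a strong reference to its result, so clearing the field in `close()` releases the page even while the iterator itself is still referenced from a list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Future;

// Hypothetical stand-in classes; none of these names are CarbonData APIs.
class FutureReleaseSketch {

  static class PageIterator {
    // A completed Future keeps a strong reference to its result (the page).
    private Future<byte[]> future;

    PageIterator(int pageSizeBytes) {
      this.future = CompletableFuture.completedFuture(new byte[pageSizeBytes]);
    }

    boolean holdsPage() {
      return future != null;
    }

    void close() {
      // Drop the reference so the page becomes unreachable even if this
      // iterator is still referenced from a list of readers.
      future = null;
    }
  }

  public static void main(String[] args) {
    // Simulate a reader list with one iterator per split.
    List<PageIterator> readers = new ArrayList<>();
    for (int i = 0; i < 3; i++) {
      readers.add(new PageIterator(32 * 1024));
    }
    for (PageIterator reader : readers) {
      reader.close();
    }
    long stillHolding = readers.stream().filter(PageIterator::holdsPage).count();
    System.out.println("iterators still holding a page: " + stillHolding);
  }
}
```

The list still references every iterator after `close()`, but none of them references a page any more, which is the effect the heap dumps in this PR are checking for.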
   
    **After Fix**
   
    ![image](https://user-images.githubusercontent.com/12861989/48917009-bd257600-eeaa-11e8-85f6-9e69bdda1908.png)
   
    Be sure to do all of the following checklist to help us incorporate
    your contribution quickly and easily:
   
     - [ ] Any interfaces changed?
           NA
     - [ ] Any backward compatibility impacted?
           NA
     - [ ] Document update required?
           NA
     - [ ] Testing done
           Manual Test
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
           NA

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/BJangir/incubator-carbondata reader_mem_leak

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2944.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2944
   
----
commit 198c042251f1269a75de51d36d42e5bcd23fe651
Author: BJangir <babulaljangir111@...>
Date:   2018-11-22T17:04:32Z

    [CARBONDATA-3122]CarbonReader memory leak

----


---

[GitHub] carbondata issue #2944: [CARBONDATA-3122]CarbonReader memory leak

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2944
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1516/



---

[GitHub] carbondata issue #2944: [CARBONDATA-3122]CarbonReader memory leak

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2944
 
    Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1726/



---

[GitHub] carbondata issue #2944: [CARBONDATA-3122]CarbonReader memory leak

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2944
 
    Build Failed  with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9774/



---

[GitHub] carbondata pull request #2944: [CARBONDATA-3122]CarbonReader memory leak

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user xubo245 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2944#discussion_r235835645
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/scan/processor/DataBlockIterator.java ---
    @@ -262,6 +262,7 @@ public void close() {
             if (blockletScannedResult != null) {
               blockletScannedResult.freeMemory();
             }
    +        future=null;
    --- End diff --
   
    please add white space before/after =


---

[GitHub] carbondata pull request #2944: [CARBONDATA-3122]CarbonReader memory leak

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user BJangir commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2944#discussion_r235866771
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/scan/processor/DataBlockIterator.java ---
    @@ -262,6 +262,7 @@ public void close() {
             if (blockletScannedResult != null) {
               blockletScannedResult.freeMemory();
             }
    +        future=null;
    --- End diff --
   
    Done.


---

[GitHub] carbondata issue #2944: [CARBONDATA-3122]CarbonReader memory leak

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2944
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1521/



---

[GitHub] carbondata issue #2944: [CARBONDATA-3122]CarbonReader memory leak

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2944
 
    Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1731/



---

[GitHub] carbondata issue #2944: [CARBONDATA-3122]CarbonReader memory leak

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2944
 
    Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9779/



---

[GitHub] carbondata issue #2944: [CARBONDATA-3122]CarbonReader memory leak

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2944
 
    LGTM


---

[GitHub] carbondata pull request #2944: [CARBONDATA-3122]CarbonReader memory leak

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user asfgit closed the pull request at:

    https://github.com/apache/carbondata/pull/2944


---