
[GitHub] [carbondata] jack86596 opened a new pull request #4057: [CARBONDATA-4088] Drop metacache didn't clear some cache information …

Posted by GitBox on Dec 17, 2020; 11:15am
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/GitHub-carbondata-jack86596-opened-a-new-pull-request-4057-CARBONDATA-4088-Drop-metacache-didn-t-cle-tp104880.html


jack86596 opened a new pull request #4057:
URL: https://github.com/apache/carbondata/pull/4057


   …which leads to memory leak
   
    ### Why is this PR needed?
   When two Spark applications are running and one of them drops a table, some cached information about that table remains in the other application and cannot be removed by any means, including the "Drop metacache" command. This is a memory leak. The leaked entries accumulate over time and eventually cause a driver OOM. The leak points are: 1) tableModifiedTimeStore in CarbonFileMetastore; 2) segmentLockMap in BlockletDataMapIndexStore; 3) absoluteTableIdentifierByteMap in SegmentPropertiesAndSchemaHolder; 4) tableInfoMap in CarbonMetadata.
   
    ### What changes were proposed in this PR?
   Use an expiring map to cache the table information in CarbonMetadata and the modified time in CarbonFileMetastore, so that stale entries are cleared automatically once the expiration time has passed. Operations in BlockletDataMapIndexStore do not need to be locked, so all logic related to segmentLockMap is removed.
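
   For readers unfamiliar with the idea, the following is a minimal sketch of an expiring map, not the code in this PR. The class name `ExpiringMapSketch`, the nested `ExpiringMap` type, its methods, and the injectable clock are all hypothetical and chosen only for illustration. It also assumes a fixed time-to-live counted from the last write; the PR description does not say whether reads refresh the expiration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

public class ExpiringMapSketch {

  /** Hypothetical expiring map; NOT the class used by CarbonData. */
  static final class ExpiringMap<K, V> {

    /** A cached value together with the time at which it becomes stale. */
    private static final class Entry<V> {
      final V value;
      final long expiresAtMillis;

      Entry(V value, long expiresAtMillis) {
        this.value = value;
        this.expiresAtMillis = expiresAtMillis;
      }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injectable so the sketch can be tested

    ExpiringMap(long ttlSeconds, LongSupplier clock) {
      this.ttlMillis = ttlSeconds * 1000L;
      this.clock = clock;
    }

    /** Stores a value that expires ttlSeconds after this write (assumption). */
    void put(K key, V value) {
      entries.put(key, new Entry<>(value, clock.getAsLong() + ttlMillis));
    }

    /** Returns the value, or null if it is absent or has expired. */
    V get(K key) {
      Entry<V> e = entries.get(key);
      if (e == null) {
        return null;
      }
      if (clock.getAsLong() >= e.expiresAtMillis) {
        entries.remove(key, e); // drop the stale entry so it cannot leak
        return null;
      }
      return e.value;
    }

    /** Removes every expired entry, e.g. from a periodic cleanup task. */
    void evictExpired() {
      long now = clock.getAsLong();
      entries.values().removeIf(e -> now >= e.expiresAtMillis);
    }

    int size() {
      return entries.size();
    }
  }

  public static void main(String[] args) {
    long[] now = {0L}; // fake clock in milliseconds
    ExpiringMap<String, String> cache = new ExpiringMap<>(60, () -> now[0]);

    cache.put("db.t1", "tableInfo");
    System.out.println(cache.get("db.t1")); // still within the TTL

    now[0] = 61_000L; // advance past the 60 second TTL
    System.out.println(cache.get("db.t1")); // expired, so null
    System.out.println(cache.size());       // stale entry was removed
  }
}
```

   With a structure like this, an entry for a table dropped by another application simply ages out, so no explicit invalidation message between applications is required.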
       
    ### Does this PR introduce any user interface change?
    - New configuration carbon.metacache.expiration.seconds is added.
   
    ### Is any new testcase added?
    - No


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]