GitHub user mohammadshahidkhan opened a pull request:
https://github.com/apache/incubator-carbondata/pull/454 [CARBONDATA-484] Implement LRU cache for B-Tree

An LRU cache for the B-Tree is proposed to avoid running out of memory when a large number of tables exist but not all of them are frequently used.

**Problem:** CarbonData maintains two levels of B-Tree cache, one at the driver and another at the executor. CarbonData currently has a mechanism to invalidate the segment and block caches for invalid table segments, but there is no eviction policy for unused cached objects. So once the available memory is fully utilized, the system cannot process any new requests.

**Solution:** The caches maintained at the driver and at the executor may hold objects that are not currently in use, so the system should have the following mechanisms:
1. Set a maximum memory limit up to which objects may be held in memory.
2. When the configured memory limit is reached, identify cached objects that are not currently in use so that the required memory can be freed without impacting running operations.
3. Evict only until the required amount of memory has been freed.
4. Invalidate the driver and executor caches for compacted segments.

## The following are part of this PR:
1. Implemented the block B-Tree LRU cache on the executor side.
2. Implemented the segment B-Tree LRU cache on the driver side.
3. Implemented a mechanism to clear compacted segments from the segment cache.
4. Implemented a mechanism to clear blocks from the executor cache.
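The eviction rules in points 1–3 above can be sketched as a size-bounded LRU cache whose entries carry an access count, so that objects still in use are never evicted. This is a minimal illustrative sketch with hypothetical names (`LruCacheSketch`, `Entry`), not CarbonData's actual `CarbonLRUCache`:

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only: names and structure are hypothetical,
// not CarbonData's CarbonLRUCache implementation.
class LruCacheSketch {
  static class Entry {
    final long size;       // memory this entry occupies
    int accessCount;       // > 0 means a query is still reading this entry
    Entry(long size) { this.size = size; }
  }

  private final long maxSize;
  private long currentSize;
  // access-order LinkedHashMap iterates least-recently-used entries first
  private final LinkedHashMap<String, Entry> map = new LinkedHashMap<>(16, 0.75f, true);

  LruCacheSketch(long maxSize) { this.maxSize = maxSize; }

  /** Returns false when the entry cannot fit even after evicting unused entries. */
  synchronized boolean put(String key, Entry e) {
    if (e.size > maxSize) return false;
    evictUntil(maxSize - e.size);                      // free only what is needed
    if (currentSize + e.size > maxSize) return false;  // everything left is in use
    map.put(key, e);
    currentSize += e.size;
    return true;
  }

  /** Evicts LRU entries that are not in use, stopping once the target is met. */
  private void evictUntil(long target) {
    Iterator<Map.Entry<String, Entry>> it = map.entrySet().iterator();
    while (currentSize > target && it.hasNext()) {
      Entry candidate = it.next().getValue();
      if (candidate.accessCount == 0) {  // never evict an entry still in use
        currentSize -= candidate.size;
        it.remove();
      }
    }
  }

  synchronized long usedSize() { return currentSize; }
}
```

A real implementation would also have to update existing keys and decrement access counts when a query finishes with an entry.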
You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mohammadshahidkhan/incubator-carbondata LRU_Cache_ForB-Tree

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-carbondata/pull/454.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #454

----
commit 2455544b6cd710d4fef6494e41739529b1c6b3df
Author: mohammadshahidkhan <[hidden email]>
Date: 2016-10-05T04:45:09Z
[CARBONDATA-484] Implement LRU cache for B-Tree
----

--- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA. ---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/incubator-carbondata/pull/454 Build Failed with Spark 1.5.2. Please check CI: http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/293/
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/incubator-carbondata/pull/454 Build Success with Spark 1.5.2. Please check CI: http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/294/
Github user CarbonDataQA commented on the issue:
https://github.com/apache/incubator-carbondata/pull/454 Build Success with Spark 1.5.2. Please check CI: http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/295/
Github user kumarvishal09 commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/454#discussion_r93723776 --- Diff: core/src/main/java/org/apache/carbondata/core/carbon/datastore/AbstractBlockIndexStoreCache.java --- @@ -0,0 +1,110 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ + +package org.apache.carbondata.core.carbon.datastore; + +import java.util.Arrays; +import java.util.List; +import java.util.Map; +import java.util.concurrent.ConcurrentHashMap; + +import org.apache.carbondata.core.cache.Cache; +import org.apache.carbondata.core.cache.CarbonLRUCache; +import org.apache.carbondata.core.carbon.datastore.block.AbstractIndex; +import org.apache.carbondata.core.carbon.datastore.block.BlockInfo; +import org.apache.carbondata.core.carbon.datastore.block.TableBlockInfo; +import org.apache.carbondata.core.carbon.datastore.block.TableBlockUniqueIdentifier; +import org.apache.carbondata.core.carbon.metadata.blocklet.DataFileFooter; +import org.apache.carbondata.core.util.CarbonUtil; +import org.apache.carbondata.core.util.CarbonUtilException; + +/** + * This class validate and load the B-Tree in the executor lru cache + * @param <K> cache key + * @param <V> Block Meta data details + */ +public abstract class AbstractBlockIndexStoreCache<K, V> + implements Cache<TableBlockUniqueIdentifier, AbstractIndex> { + /** + * carbon store path + */ + protected String carbonStorePath; + /** + * CarbonLRU cache + */ + protected CarbonLRUCache lruCache; + + /** + * table segment id vs blockInfo list + */ + protected Map<String, List<BlockInfo>> segmentIdToBlockListMap; + + + /** + * map of block info to lock object map, while loading the btree this will be filled + * and removed after loading the tree for that particular block info, this will be useful + * while loading the tree concurrently so only block level lock will be applied another + * block can be loaded concurrently + */ + protected Map<BlockInfo, Object> blockInfoLock; + + public AbstractBlockIndexStoreCache(String carbonStorePath, CarbonLRUCache lruCache) { + this.carbonStorePath = carbonStorePath; + this.lruCache = lruCache; + blockInfoLock = new ConcurrentHashMap<BlockInfo, Object>(); + segmentIdToBlockListMap = new ConcurrentHashMap<>(); + } + + /** + * This method will get the value 
for the given key. If value does not exist + for the given key, it will check and load the value. + * + * @param tableBlock + * @param tableBlockUniqueIdentifier + * @param lruCacheKey + */ + protected void checkAndLoadTableBlocks(AbstractIndex tableBlock, + TableBlockUniqueIdentifier tableBlockUniqueIdentifier, String lruCacheKey) + throws CarbonUtilException { + // calculate the required size is + TableBlockInfo blockInfo = tableBlockUniqueIdentifier.getTableBlockInfo(); + long requiredMetaSize = CarbonUtil + .calculateMetaSize(blockInfo.getFilePath(), blockInfo.getBlockOffset(), + blockInfo.getBlockLength()); + if (requiredMetaSize > 0) { + tableBlock.setMemorySize(requiredMetaSize); + tableBlock.incrementAccessCount(); + boolean isTableBlockAddedToLruCache = lruCache.put(lruCacheKey, tableBlock, requiredMetaSize); + // if column is successfully added to lru cache then only load the + // table blocks data + if (isTableBlockAddedToLruCache) { + // load table blocks data + // getting the data file meta data of the block + DataFileFooter footer = CarbonUtil + .readMetadatFile(blockInfo); + footer.setBlockInfo(new BlockInfo(blockInfo)); + // building the block + tableBlock.buildIndex(Arrays.asList(footer)); + } else { + throw new CarbonUtilException( --- End diff -- Why is it throwing CarbonUtilException? It should throw IndexBuilderException.
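The reserve-then-load flow in `checkAndLoadTableBlocks` above (claim cache space for the computed metadata size first, and do the expensive footer read only if the reservation succeeds) can be sketched in isolation. `ReserveThenLoad` and all names below are hypothetical, not the PR's API:

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

// Sketch of the reserve-then-load pattern: space is claimed before the
// expensive read, so a full cache fails fast instead of doing I/O for
// data it cannot keep. All names are hypothetical.
class ReserveThenLoad {
  private final AtomicLong used = new AtomicLong();
  private final long max;

  ReserveThenLoad(long max) { this.max = max; }

  /** Runs the expensive read only if `size` bytes can be reserved first. */
  String load(long size, Supplier<String> expensiveRead) {
    while (true) {
      long cur = used.get();
      if (cur + size > max) {
        throw new IllegalStateException("cannot reserve " + size + " bytes");
      }
      if (used.compareAndSet(cur, cur + size)) break;  // reservation succeeded
    }
    try {
      return expensiveRead.get();   // expensive I/O happens only after reservation
    } catch (RuntimeException e) {
      used.addAndGet(-size);        // roll back the reservation on failure
      throw e;
    }
  }

  long used() { return used.get(); }
}
```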
Github user kumarvishal09 commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/454#discussion_r93727347 --- Diff: core/src/main/java/org/apache/carbondata/core/cache/CacheProvider.java --- @@ -121,15 +141,24 @@ private void createDictionaryCacheForGivenType(CacheType cacheType, String carbo * @param cacheType */ private void createLRULevelCacheInstance(CacheType cacheType) { - CarbonLRUCache carbonLRUCache = null; - // if cache type is dictionary cache, then same lru cache instance has to be shared - // between forward and reverse cache - if (cacheType.equals(CacheType.REVERSE_DICTIONARY) || cacheType - .equals(CacheType.FORWARD_DICTIONARY)) { - carbonLRUCache = new CarbonLRUCache(CarbonCommonConstants.CARBON_MAX_LEVEL_CACHE_SIZE, - CarbonCommonConstants.CARBON_MAX_LEVEL_CACHE_SIZE_DEFAULT); - cacheTypeToLRUCacheMap.put(CacheType.REVERSE_DICTIONARY, carbonLRUCache); - cacheTypeToLRUCacheMap.put(CacheType.FORWARD_DICTIONARY, carbonLRUCache); + boolean isDriver = Boolean.parseBoolean(CarbonProperties.getInstance() --- End diff -- Please correct me if I am wrong: if the driver and executor run in the same JVM, as in a local-mode setup, then this property will always be true?
Github user kumarvishal09 commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/454#discussion_r93727495 --- Diff: core/src/main/java/org/apache/carbondata/core/carbon/datastore/BlockIndexStore.java --- @@ -16,113 +16,142 @@ * specific language governing permissions and limitations * under the License. */ + package org.apache.carbondata.core.carbon.datastore; import java.util.ArrayList; import java.util.Arrays; -import java.util.Iterator; import java.util.List; -import java.util.Map; import java.util.concurrent.Callable; -import java.util.concurrent.ConcurrentHashMap; -import java.util.concurrent.ExecutionException; +import java.util.concurrent.CopyOnWriteArrayList; import java.util.concurrent.ExecutorService; import java.util.concurrent.Executors; import java.util.concurrent.Future; import java.util.concurrent.TimeUnit; +import org.apache.carbondata.common.logging.LogService; +import org.apache.carbondata.common.logging.LogServiceFactory; +import org.apache.carbondata.core.cache.CarbonLRUCache; import org.apache.carbondata.core.carbon.AbsoluteTableIdentifier; +import org.apache.carbondata.core.carbon.CarbonTableIdentifier; import org.apache.carbondata.core.carbon.datastore.block.AbstractIndex; import org.apache.carbondata.core.carbon.datastore.block.BlockIndex; import org.apache.carbondata.core.carbon.datastore.block.BlockInfo; import org.apache.carbondata.core.carbon.datastore.block.TableBlockInfo; +import org.apache.carbondata.core.carbon.datastore.block.TableBlockUniqueIdentifier; import org.apache.carbondata.core.carbon.datastore.exception.IndexBuilderException; -import org.apache.carbondata.core.carbon.metadata.blocklet.DataFileFooter; import org.apache.carbondata.core.constants.CarbonCommonConstants; import org.apache.carbondata.core.util.CarbonProperties; -import org.apache.carbondata.core.util.CarbonUtil; import org.apache.carbondata.core.util.CarbonUtilException; /** - * Singleton Class to handle loading, unloading,clearing,storing of the table - * blocks + * This class is used to 
load the B-Tree in Executor LRU Cache */ -public class BlockIndexStore { - - /** - * singleton instance - */ - private static final BlockIndexStore CARBONTABLEBLOCKSINSTANCE = new BlockIndexStore(); - - /** - * map to hold the table and its list of blocks - */ - private Map<AbsoluteTableIdentifier, Map<BlockInfo, AbstractIndex>> tableBlocksMap; - - /** - * map to maintain segment id to block info map, this map will be used to - * while removing the block from memory when segment is compacted or deleted - */ - private Map<AbsoluteTableIdentifier, Map<String, List<BlockInfo>>> segmentIdToBlockListMap; - - /** - * map of block info to lock object map, while loading the btree this will be filled - * and removed after loading the tree for that particular block info, this will be useful - * while loading the tree concurrently so only block level lock will be applied another - * block can be loaded concurrently - */ - private Map<BlockInfo, Object> blockInfoLock; +public class BlockIndexStore<K, V> extends AbstractBlockIndexStoreCache<K, V> { /** - * table and its lock object to this will be useful in case of concurrent - * query scenario when more than one query comes for same table and in that - * case it will ensure that only one query will able to load the blocks + * LOGGER instance */ - private Map<AbsoluteTableIdentifier, Object> tableLockMap; + private static final LogService LOGGER = + LogServiceFactory.getLogService(BlockIndexStore.class.getName()); + public BlockIndexStore(String carbonStorePath, CarbonLRUCache lruCache) { + super(carbonStorePath, lruCache); + } /** - * block info to future task mapping - * useful when blocklet distribution is enabled and - * same block is loaded by multiple thread + * The method loads the block meta in B-tree lru cache and returns the block meta. 
+ * + * @param tableBlockUniqueIdentifier Uniquely identifies the block + * @return returns the blocks B-Tree meta + * @throws CarbonUtilException */ - private Map<BlockInfo, Future<AbstractIndex>> mapOfBlockInfoToFuture; + @Override public AbstractIndex get(TableBlockUniqueIdentifier tableBlockUniqueIdentifier) + throws CarbonUtilException { --- End diff -- Why is it throwing CarbonUtilException?
Github user kumarvishal09 commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/454#discussion_r93727651 --- Diff: core/src/main/java/org/apache/carbondata/core/carbon/datastore/BlockIndexStore.java --- @@ -16,113 +16,142 @@ * specific language governing permissions and limitations * under the License. */ + package org.apache.carbondata.core.carbon.datastore; import java.util.ArrayList; import java.util.Arrays; -import java.util.Iterator; import java.util.List; -import java.util.Map; import java.util.concurrent.Callable; -import java.util.concurrent.ConcurrentHashMap; -import java.util.concurrent.ExecutionException; +import java.util.concurrent.CopyOnWriteArrayList; import java.util.concurrent.ExecutorService; import java.util.concurrent.Executors; import java.util.concurrent.Future; import java.util.concurrent.TimeUnit; +import org.apache.carbondata.common.logging.LogService; +import org.apache.carbondata.common.logging.LogServiceFactory; +import org.apache.carbondata.core.cache.CarbonLRUCache; import org.apache.carbondata.core.carbon.AbsoluteTableIdentifier; +import org.apache.carbondata.core.carbon.CarbonTableIdentifier; import org.apache.carbondata.core.carbon.datastore.block.AbstractIndex; import org.apache.carbondata.core.carbon.datastore.block.BlockIndex; import org.apache.carbondata.core.carbon.datastore.block.BlockInfo; import org.apache.carbondata.core.carbon.datastore.block.TableBlockInfo; +import org.apache.carbondata.core.carbon.datastore.block.TableBlockUniqueIdentifier; import org.apache.carbondata.core.carbon.datastore.exception.IndexBuilderException; -import org.apache.carbondata.core.carbon.metadata.blocklet.DataFileFooter; import org.apache.carbondata.core.constants.CarbonCommonConstants; import org.apache.carbondata.core.util.CarbonProperties; -import org.apache.carbondata.core.util.CarbonUtil; import org.apache.carbondata.core.util.CarbonUtilException; /** - * Singleton Class to handle loading, unloading,clearing,storing of the table - * blocks + * This class is used to 
load the B-Tree in Executor LRU Cache */ -public class BlockIndexStore { - - /** - * singleton instance - */ - private static final BlockIndexStore CARBONTABLEBLOCKSINSTANCE = new BlockIndexStore(); - - /** - * map to hold the table and its list of blocks - */ - private Map<AbsoluteTableIdentifier, Map<BlockInfo, AbstractIndex>> tableBlocksMap; - - /** - * map to maintain segment id to block info map, this map will be used to - * while removing the block from memory when segment is compacted or deleted - */ - private Map<AbsoluteTableIdentifier, Map<String, List<BlockInfo>>> segmentIdToBlockListMap; - - /** - * map of block info to lock object map, while loading the btree this will be filled - * and removed after loading the tree for that particular block info, this will be useful - * while loading the tree concurrently so only block level lock will be applied another - * block can be loaded concurrently - */ - private Map<BlockInfo, Object> blockInfoLock; +public class BlockIndexStore<K, V> extends AbstractBlockIndexStoreCache<K, V> { /** - * table and its lock object to this will be useful in case of concurrent - * query scenario when more than one query comes for same table and in that - * case it will ensure that only one query will able to load the blocks + * LOGGER instance */ - private Map<AbsoluteTableIdentifier, Object> tableLockMap; + private static final LogService LOGGER = + LogServiceFactory.getLogService(BlockIndexStore.class.getName()); + public BlockIndexStore(String carbonStorePath, CarbonLRUCache lruCache) { + super(carbonStorePath, lruCache); + } /** - * block info to future task mapping - * useful when blocklet distribution is enabled and - * same block is loaded by multiple thread + * The method loads the block meta in B-tree lru cache and returns the block meta. 
+ * + * @param tableBlockUniqueIdentifier Uniquely identifies the block + * @return returns the blocks B-Tree meta + * @throws CarbonUtilException */ - private Map<BlockInfo, Future<AbstractIndex>> mapOfBlockInfoToFuture; + @Override public AbstractIndex get(TableBlockUniqueIdentifier tableBlockUniqueIdentifier) + throws CarbonUtilException { + TableBlockInfo tableBlockInfo = tableBlockUniqueIdentifier.getTableBlockInfo(); + BlockInfo blockInfo = new BlockInfo(tableBlockInfo); + String lruCacheKey = + getLruCacheKey(tableBlockUniqueIdentifier.getAbsoluteTableIdentifier(), blockInfo); + AbstractIndex tableBlock = (AbstractIndex) lruCache.get(lruCacheKey); - private BlockIndexStore() { - tableBlocksMap = new ConcurrentHashMap<AbsoluteTableIdentifier, Map<BlockInfo, AbstractIndex>>( - CarbonCommonConstants.DEFAULT_COLLECTION_SIZE); - tableLockMap = new ConcurrentHashMap<AbsoluteTableIdentifier, Object>( - CarbonCommonConstants.DEFAULT_COLLECTION_SIZE); - blockInfoLock = new ConcurrentHashMap<BlockInfo, Object>(); - segmentIdToBlockListMap = new ConcurrentHashMap<>(); - mapOfBlockInfoToFuture = new ConcurrentHashMap<>(); + // if block is not loaded + if (null == tableBlock) { + // check any lock object is present in + // block info lock map + Object blockInfoLockObject = blockInfoLock.get(blockInfo); + // if lock object is not present then acquire + // the lock in block info lock and add a lock object in the map for + // particular block info, added double checking mechanism to add the lock + // object so in case of concurrent query we for same block info only one lock + // object will be added + if (null == blockInfoLockObject) { + synchronized (blockInfoLock) { + // again checking the block info lock, to check whether lock object is present + // or not if now also not present then add a lock object + blockInfoLockObject = blockInfoLock.get(blockInfo); + if (null == blockInfoLockObject) { + blockInfoLockObject = new Object(); + blockInfoLock.put(blockInfo, 
blockInfoLockObject); + } + } + } + //acquire the lock for particular block info + synchronized (blockInfoLockObject) { + // check again whether block is present or not to avoid the + // same block is loaded +//more than once in case of concurrent query + tableBlock = (AbstractIndex) lruCache.get( + getLruCacheKey(tableBlockUniqueIdentifier.getAbsoluteTableIdentifier(), blockInfo)); + // if still block is not present then load the block + if (null == tableBlock) { + tableBlock = loadBlock(tableBlockUniqueIdentifier); + fillSegmentIdToBlockListMap(tableBlockUniqueIdentifier.getAbsoluteTableIdentifier(), + blockInfo); + } + } + } else { + tableBlock.incrementAccessCount(); + } + return tableBlock; } /** - * Return the instance of this class - * - * @return singleton instance + * @param absoluteTableIdentifier + * @param blockInfo */ - public static BlockIndexStore getInstance() { - return CARBONTABLEBLOCKSINSTANCE; + private void fillSegmentIdToBlockListMap(AbsoluteTableIdentifier absoluteTableIdentifier, + BlockInfo blockInfo) { + TableSegmentUniqueIdentifier segmentIdentifier = + new TableSegmentUniqueIdentifier(absoluteTableIdentifier, + blockInfo.getTableBlockInfo().getSegmentId()); + List<BlockInfo> blockInfos = + segmentIdToBlockListMap.get(segmentIdentifier.getUniqueTableSegmentIdentifier()); + if (null == blockInfos) { + synchronized (segmentIdentifier) { + blockInfos = + segmentIdToBlockListMap.get(segmentIdentifier.getUniqueTableSegmentIdentifier()); + if (null == blockInfos) { + blockInfos = new CopyOnWriteArrayList<>(); --- End diff -- Why do we need CopyOnWriteArrayList? Since this list contains all the block info for one segment and we already take a lock on the segment, I think CopyOnWriteArrayList is not required. Please comment.
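The per-block double-checked locking discussed in this hunk — one lock object per `BlockInfo`, so that only loads of the same block serialize while different blocks load concurrently — can be written more compactly with `ConcurrentHashMap.computeIfAbsent`. This is an illustrative sketch with hypothetical names, not the PR's `BlockIndexStore` code:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Sketch of per-key double-checked locking: computeIfAbsent replaces the
// manual "check, synchronize on the map, check again" dance in the diff.
// All names here are hypothetical.
class PerKeyLoader<K, V> {
  private final Map<K, V> cache = new ConcurrentHashMap<>();
  private final Map<K, Object> locks = new ConcurrentHashMap<>();

  V get(K key, Function<K, V> loader) {
    V v = cache.get(key);
    if (v != null) return v;                 // fast path: already loaded
    Object lock = locks.computeIfAbsent(key, k -> new Object());
    synchronized (lock) {
      // re-check inside the lock: another thread may have loaded it meanwhile
      v = cache.get(key);
      if (v == null) {
        v = loader.apply(key);               // expensive load, once per key
        cache.put(key, v);
      }
    }
    locks.remove(key);                       // lock no longer needed once loaded
    return v;
  }
}
```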
Github user kumarvishal09 commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/454#discussion_r93727727 --- Diff: core/src/main/java/org/apache/carbondata/core/carbon/datastore/BlockIndexStore.java --- @@ -244,124 +222,96 @@ public static BlockIndexStore getInstance() { * * @param loadedBlockArray array of blocks which will be filled * @param blocksList blocks loaded in thread - * @throws IndexBuilderException in case of any failure + * @throws CarbonUtilException in case of any failure */ - private void fillLoadedBlocks(AbstractIndex[] loadedBlockArray, List<BlockInfo> blockInfos) - throws IndexBuilderException { + private void fillLoadedBlocks(AbstractIndex[] loadedBlockArray, + List<Future<AbstractIndex>> blocksList) throws CarbonUtilException { + int blockCounter = 0; + boolean exceptionOccurred = false; + Throwable exceptionRef = null; for (int i = 0; i < loadedBlockArray.length; i++) { - if (null == loadedBlockArray[i]) { - try { - loadedBlockArray[i] = mapOfBlockInfoToFuture.get(blockInfos.get(i)).get(); - } catch (InterruptedException | ExecutionException e) { - throw new IndexBuilderException(e); - } + try { + loadedBlockArray[i] = blocksList.get(blockCounter++).get(); + } catch (Throwable e) { + exceptionOccurred = true; + exceptionRef = e; } - + } + if (exceptionOccurred) { --- End diff -- I think this if check is not required; we can move this part of the code into the catch block.
Github user kumarvishal09 commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/454#discussion_r93727747 --- Diff: core/src/main/java/org/apache/carbondata/core/carbon/datastore/BlockIndexStore.java --- (same fillLoadedBlocks hunk as quoted above, at the `} catch (Throwable e) {` line) --- End diff -- Why catch Throwable rather than a specific exception?
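A sketch of the reviewer's suggestion: drain every `Future`, catch only the checked exceptions `Future.get` actually declares rather than `Throwable`, and rethrow after the loop so all futures are consumed. Names are hypothetical; this is not the PR's `fillLoadedBlocks`:

```java
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

// Hypothetical sketch: catch ExecutionException (and let InterruptedException
// propagate) instead of Throwable, remembering the first failure so the
// remaining futures are still drained before rethrowing.
class FutureDrain {
  static int[] drain(List<Future<Integer>> futures)
      throws ExecutionException, InterruptedException {
    int[] out = new int[futures.size()];
    ExecutionException failure = null;
    for (int i = 0; i < futures.size(); i++) {
      try {
        out[i] = futures.get(i).get();
      } catch (ExecutionException e) {
        if (failure == null) failure = e;  // remember, but keep draining
      }
    }
    if (failure != null) throw failure;    // rethrow after all futures settle
    return out;
  }
}
```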
Github user mohammadshahidkhan commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/454#discussion_r94018107 --- Diff: core/src/main/java/org/apache/carbondata/core/carbon/datastore/AbstractBlockIndexStoreCache.java --- (same hunk as quoted above, at the `throw new CarbonUtilException(` line) --- End diff -- It's bound to the Cache interface methods: V get(K key) throws CarbonUtilException; and List<V> getAll(List<K> keys) throws CarbonUtilException;
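The constraint the author describes — an implementing method may not throw a broader checked exception than its interface declares — can be demonstrated in a few lines. `CacheException` below stands in for `CarbonUtilException`, and the interface is a simplified stand-in for CarbonData's `Cache`:

```java
// CacheException stands in for CarbonUtilException; the interface is a
// simplified stand-in for org.apache.carbondata.core.cache.Cache.
class CacheException extends Exception {
  CacheException(String message) { super(message); }
}

interface Cache<K, V> {
  V get(K key) throws CacheException;
}

class BlockCache implements Cache<String, byte[]> {
  // Declaring an unrelated checked exception here would not compile: an
  // override may only narrow, never widen, the interface's throws clause.
  @Override
  public byte[] get(String key) throws CacheException {
    throw new CacheException("unable to load block into LRU cache: " + key);
  }
}
```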
Github user mohammadshahidkhan commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/454#discussion_r94018295 --- Diff: core/src/main/java/org/apache/carbondata/core/cache/CacheProvider.java --- (same hunk as quoted above, at the `boolean isDriver = Boolean.parseBoolean(` line) --- End diff -- It's bound to the Cache interface methods: V get(K key) throws CarbonUtilException; and List getAll(List keys) throws CarbonUtilException;
In reply to this post by qiuchenjian-2
Github user mohammadshahidkhan commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/454#discussion_r94018321 --- Diff: core/src/main/java/org/apache/carbondata/core/carbon/datastore/BlockIndexStore.java --- @@ -16,113 +16,142 @@ * specific language governing permissions and limitations * under the License. */ + package org.apache.carbondata.core.carbon.datastore; import java.util.ArrayList; import java.util.Arrays; -import java.util.Iterator; import java.util.List; -import java.util.Map; import java.util.concurrent.Callable; -import java.util.concurrent.ConcurrentHashMap; -import java.util.concurrent.ExecutionException; +import java.util.concurrent.CopyOnWriteArrayList; import java.util.concurrent.ExecutorService; import java.util.concurrent.Executors; import java.util.concurrent.Future; import java.util.concurrent.TimeUnit; +import org.apache.carbondata.common.logging.LogService; +import org.apache.carbondata.common.logging.LogServiceFactory; +import org.apache.carbondata.core.cache.CarbonLRUCache; import org.apache.carbondata.core.carbon.AbsoluteTableIdentifier; +import org.apache.carbondata.core.carbon.CarbonTableIdentifier; import org.apache.carbondata.core.carbon.datastore.block.AbstractIndex; import org.apache.carbondata.core.carbon.datastore.block.BlockIndex; import org.apache.carbondata.core.carbon.datastore.block.BlockInfo; import org.apache.carbondata.core.carbon.datastore.block.TableBlockInfo; +import org.apache.carbondata.core.carbon.datastore.block.TableBlockUniqueIdentifier; import org.apache.carbondata.core.carbon.datastore.exception.IndexBuilderException; -import org.apache.carbondata.core.carbon.metadata.blocklet.DataFileFooter; import org.apache.carbondata.core.constants.CarbonCommonConstants; import org.apache.carbondata.core.util.CarbonProperties; -import org.apache.carbondata.core.util.CarbonUtil; import org.apache.carbondata.core.util.CarbonUtilException; /** - * Singleton Class to handle loading, unloading,clearing,storing of the table - * blocks + * This class is used to 
load the B-Tree in Executor LRU Cache */ -public class BlockIndexStore { - - /** - * singleton instance - */ - private static final BlockIndexStore CARBONTABLEBLOCKSINSTANCE = new BlockIndexStore(); - - /** - * map to hold the table and its list of blocks - */ - private Map<AbsoluteTableIdentifier, Map<BlockInfo, AbstractIndex>> tableBlocksMap; - - /** - * map to maintain segment id to block info map, this map will be used to - * while removing the block from memory when segment is compacted or deleted - */ - private Map<AbsoluteTableIdentifier, Map<String, List<BlockInfo>>> segmentIdToBlockListMap; - - /** - * map of block info to lock object map, while loading the btree this will be filled - * and removed after loading the tree for that particular block info, this will be useful - * while loading the tree concurrently so only block level lock will be applied another - * block can be loaded concurrently - */ - private Map<BlockInfo, Object> blockInfoLock; +public class BlockIndexStore<K, V> extends AbstractBlockIndexStoreCache<K, V> { /** - * table and its lock object to this will be useful in case of concurrent - * query scenario when more than one query comes for same table and in that - * case it will ensure that only one query will able to load the blocks + * LOGGER instance */ - private Map<AbsoluteTableIdentifier, Object> tableLockMap; + private static final LogService LOGGER = + LogServiceFactory.getLogService(BlockIndexStore.class.getName()); + public BlockIndexStore(String carbonStorePath, CarbonLRUCache lruCache) { + super(carbonStorePath, lruCache); + } /** - * block info to future task mapping - * useful when blocklet distribution is enabled and - * same block is loaded by multiple thread + * The method loads the block meta in B-tree lru cache and returns the block meta. 
+ * + * @param tableBlockUniqueIdentifier Uniquely identifies the block + * @return returns the blocks B-Tree meta + * @throws CarbonUtilException */ - private Map<BlockInfo, Future<AbstractIndex>> mapOfBlockInfoToFuture; + @Override public AbstractIndex get(TableBlockUniqueIdentifier tableBlockUniqueIdentifier) + throws CarbonUtilException { --- End diff -- It's bound to the Cache interface methods: V get(K key) throws CarbonUtilException; List<V> getAll(List<K> keys) throws CarbonUtilException; --- |
In reply to this post by qiuchenjian-2
Github user mohammadshahidkhan commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/454#discussion_r94018535 --- Diff: core/src/main/java/org/apache/carbondata/core/carbon/datastore/BlockIndexStore.java --- @@ -244,124 +222,96 @@ public static BlockIndexStore getInstance() { * * @param loadedBlockArray array of blocks which will be filled * @param blocksList blocks loaded in thread - * @throws IndexBuilderException in case of any failure + * @throws CarbonUtilException in case of any failure */ - private void fillLoadedBlocks(AbstractIndex[] loadedBlockArray, List<BlockInfo> blockInfos) - throws IndexBuilderException { + private void fillLoadedBlocks(AbstractIndex[] loadedBlockArray, + List<Future<AbstractIndex>> blocksList) throws CarbonUtilException { + int blockCounter = 0; + boolean exceptionOccurred = false; + Throwable exceptionRef = null; for (int i = 0; i < loadedBlockArray.length; i++) { - if (null == loadedBlockArray[i]) { - try { - loadedBlockArray[i] = mapOfBlockInfoToFuture.get(blockInfos.get(i)).get(); - } catch (InterruptedException | ExecutionException e) { - throw new IndexBuilderException(e); - } + try { + loadedBlockArray[i] = blocksList.get(blockCounter++).get(); + } catch (Throwable e) { + exceptionOccurred = true; + exceptionRef = e; } - + } + if (exceptionOccurred) { --- End diff -- If we move it to the catch block then we will not be able to decrement the access count of all the loaded objects. --- |
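The deferred-throw pattern the reply defends can be sketched as follows. All names here are hypothetical stand-ins (`Block` for AbstractIndex, plain `Exception` for CarbonUtilException): the loop records the first failure but keeps draining every future, so the access count of every block that *did* load can be decremented before the method fails; throwing from inside the catch would return early and leave those counts incremented forever.

```java
import java.util.List;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

public class FillBlocksSketch {
    // Hypothetical stand-in for AbstractIndex: tracks only an access count,
    // which is incremented to 1 when the block is loaded into the cache.
    public static class Block {
        public int accessCount = 1;
        void decrementAccessCount() { accessCount--; }
    }

    /**
     * Drains every future before deciding to fail. A throw from inside the
     * catch would skip the remaining futures, so blocks they loaded would
     * keep their incremented access count and never become evictable.
     */
    public static Block[] fillLoadedBlocks(List<Future<Block>> futures) throws Exception {
        Block[] loaded = new Block[futures.size()];
        Throwable firstError = null;
        for (int i = 0; i < futures.size(); i++) {
            try {
                loaded[i] = futures.get(i).get();
            } catch (InterruptedException | ExecutionException | CancellationException e) {
                if (firstError == null) firstError = e;  // remember, keep draining
            }
        }
        if (firstError != null) {
            // undo the access-count increment for every block that did load
            for (Block block : loaded) {
                if (block != null) block.decrementAccessCount();
            }
            throw new Exception("block load failed", firstError);
        }
        return loaded;
    }
}
```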
In reply to this post by qiuchenjian-2
Github user mohammadshahidkhan commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/454#discussion_r94019181 --- Diff: core/src/main/java/org/apache/carbondata/core/carbon/datastore/BlockIndexStore.java --- @@ -16,113 +16,142 @@ * specific language governing permissions and limitations * under the License. */ + package org.apache.carbondata.core.carbon.datastore; import java.util.ArrayList; import java.util.Arrays; -import java.util.Iterator; import java.util.List; -import java.util.Map; import java.util.concurrent.Callable; -import java.util.concurrent.ConcurrentHashMap; -import java.util.concurrent.ExecutionException; +import java.util.concurrent.CopyOnWriteArrayList; import java.util.concurrent.ExecutorService; import java.util.concurrent.Executors; import java.util.concurrent.Future; import java.util.concurrent.TimeUnit; +import org.apache.carbondata.common.logging.LogService; +import org.apache.carbondata.common.logging.LogServiceFactory; +import org.apache.carbondata.core.cache.CarbonLRUCache; import org.apache.carbondata.core.carbon.AbsoluteTableIdentifier; +import org.apache.carbondata.core.carbon.CarbonTableIdentifier; import org.apache.carbondata.core.carbon.datastore.block.AbstractIndex; import org.apache.carbondata.core.carbon.datastore.block.BlockIndex; import org.apache.carbondata.core.carbon.datastore.block.BlockInfo; import org.apache.carbondata.core.carbon.datastore.block.TableBlockInfo; +import org.apache.carbondata.core.carbon.datastore.block.TableBlockUniqueIdentifier; import org.apache.carbondata.core.carbon.datastore.exception.IndexBuilderException; -import org.apache.carbondata.core.carbon.metadata.blocklet.DataFileFooter; import org.apache.carbondata.core.constants.CarbonCommonConstants; import org.apache.carbondata.core.util.CarbonProperties; -import org.apache.carbondata.core.util.CarbonUtil; import org.apache.carbondata.core.util.CarbonUtilException; /** - * Singleton Class to handle loading, unloading,clearing,storing of the table - * blocks + * This class is used to 
load the B-Tree in Executor LRU Cache */ -public class BlockIndexStore { - - /** - * singleton instance - */ - private static final BlockIndexStore CARBONTABLEBLOCKSINSTANCE = new BlockIndexStore(); - - /** - * map to hold the table and its list of blocks - */ - private Map<AbsoluteTableIdentifier, Map<BlockInfo, AbstractIndex>> tableBlocksMap; - - /** - * map to maintain segment id to block info map, this map will be used to - * while removing the block from memory when segment is compacted or deleted - */ - private Map<AbsoluteTableIdentifier, Map<String, List<BlockInfo>>> segmentIdToBlockListMap; - - /** - * map of block info to lock object map, while loading the btree this will be filled - * and removed after loading the tree for that particular block info, this will be useful - * while loading the tree concurrently so only block level lock will be applied another - * block can be loaded concurrently - */ - private Map<BlockInfo, Object> blockInfoLock; +public class BlockIndexStore<K, V> extends AbstractBlockIndexStoreCache<K, V> { /** - * table and its lock object to this will be useful in case of concurrent - * query scenario when more than one query comes for same table and in that - * case it will ensure that only one query will able to load the blocks + * LOGGER instance */ - private Map<AbsoluteTableIdentifier, Object> tableLockMap; + private static final LogService LOGGER = + LogServiceFactory.getLogService(BlockIndexStore.class.getName()); + public BlockIndexStore(String carbonStorePath, CarbonLRUCache lruCache) { + super(carbonStorePath, lruCache); + } /** - * block info to future task mapping - * useful when blocklet distribution is enabled and - * same block is loaded by multiple thread + * The method loads the block meta in B-tree lru cache and returns the block meta. 
+ * + * @param tableBlockUniqueIdentifier Uniquely identifies the block + * @return returns the blocks B-Tree meta + * @throws CarbonUtilException */ - private Map<BlockInfo, Future<AbstractIndex>> mapOfBlockInfoToFuture; + @Override public AbstractIndex get(TableBlockUniqueIdentifier tableBlockUniqueIdentifier) + throws CarbonUtilException { + TableBlockInfo tableBlockInfo = tableBlockUniqueIdentifier.getTableBlockInfo(); + BlockInfo blockInfo = new BlockInfo(tableBlockInfo); + String lruCacheKey = + getLruCacheKey(tableBlockUniqueIdentifier.getAbsoluteTableIdentifier(), blockInfo); + AbstractIndex tableBlock = (AbstractIndex) lruCache.get(lruCacheKey); - private BlockIndexStore() { - tableBlocksMap = new ConcurrentHashMap<AbsoluteTableIdentifier, Map<BlockInfo, AbstractIndex>>( - CarbonCommonConstants.DEFAULT_COLLECTION_SIZE); - tableLockMap = new ConcurrentHashMap<AbsoluteTableIdentifier, Object>( - CarbonCommonConstants.DEFAULT_COLLECTION_SIZE); - blockInfoLock = new ConcurrentHashMap<BlockInfo, Object>(); - segmentIdToBlockListMap = new ConcurrentHashMap<>(); - mapOfBlockInfoToFuture = new ConcurrentHashMap<>(); + // if block is not loaded + if (null == tableBlock) { + // check any lock object is present in + // block info lock map + Object blockInfoLockObject = blockInfoLock.get(blockInfo); + // if lock object is not present then acquire + // the lock in block info lock and add a lock object in the map for + // particular block info, added double checking mechanism to add the lock + // object so in case of concurrent query we for same block info only one lock + // object will be added + if (null == blockInfoLockObject) { + synchronized (blockInfoLock) { + // again checking the block info lock, to check whether lock object is present + // or not if now also not present then add a lock object + blockInfoLockObject = blockInfoLock.get(blockInfo); + if (null == blockInfoLockObject) { + blockInfoLockObject = new Object(); + blockInfoLock.put(blockInfo, 
blockInfoLockObject); + } + } + } //acquire the lock for particular block info + synchronized (blockInfoLockObject) { + // check again whether block is present or not to avoid the + // same block is loaded + //more than once in case of concurrent query + tableBlock = (AbstractIndex) lruCache.get( + getLruCacheKey(tableBlockUniqueIdentifier.getAbsoluteTableIdentifier(), blockInfo)); + // if still block is not present then load the block + if (null == tableBlock) { + tableBlock = loadBlock(tableBlockUniqueIdentifier); + fillSegmentIdToBlockListMap(tableBlockUniqueIdentifier.getAbsoluteTableIdentifier(), + blockInfo); + } + } + } else { + tableBlock.incrementAccessCount(); + } + return tableBlock; } /** - * Return the instance of this class - * - * @return singleton instance + * @param absoluteTableIdentifier + * @param blockInfo */ - public static BlockIndexStore getInstance() { - return CARBONTABLEBLOCKSINSTANCE; + private void fillSegmentIdToBlockListMap(AbsoluteTableIdentifier absoluteTableIdentifier, + BlockInfo blockInfo) { + TableSegmentUniqueIdentifier segmentIdentifier = + new TableSegmentUniqueIdentifier(absoluteTableIdentifier, + blockInfo.getTableBlockInfo().getSegmentId()); + List<BlockInfo> blockInfos = + segmentIdToBlockListMap.get(segmentIdentifier.getUniqueTableSegmentIdentifier()); + if (null == blockInfos) { + synchronized (segmentIdentifier) { + blockInfos = + segmentIdToBlockListMap.get(segmentIdentifier.getUniqueTableSegmentIdentifier()); + if (null == blockInfos) { + blockInfos = new CopyOnWriteArrayList<>(); --- End diff -- Because two threads can write to the same object at the same time. Please see the else part: else { blockInfos.add(blockInfo); } --- |
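The reason for CopyOnWriteArrayList in that `else` branch can be shown in isolation: two threads appending to the same list concurrently must not lose writes, and CopyOnWriteArrayList guarantees that without any external locking. This is a standalone sketch, not CarbonData code; `CowAddDemo` and its method name are illustration-only.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class CowAddDemo {
    // Two threads each append 'n' elements to a shared list; returns the
    // final size. With a plain ArrayList this could lose writes or throw,
    // because ArrayList.add is not thread-safe.
    static int concurrentAdds(int n) throws InterruptedException {
        List<Integer> blockInfos = new CopyOnWriteArrayList<>();
        Runnable writer = () -> {
            for (int i = 0; i < n; i++) {
                blockInfos.add(i);  // each add atomically copies the backing array
            }
        };
        Thread t1 = new Thread(writer);
        Thread t2 = new Thread(writer);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        return blockInfos.size();  // 2 * n: no write is lost
    }
}
```

The trade-off is that each `add` copies the whole backing array, so this fits the segment-to-blocks map well only because reads dominate and each list stays small.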
In reply to this post by qiuchenjian-2
Github user mohammadshahidkhan commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/454#discussion_r94019232 --- Diff: core/src/main/java/org/apache/carbondata/core/carbon/datastore/BlockIndexStore.java --- @@ -244,124 +222,96 @@ public static BlockIndexStore getInstance() { * * @param loadedBlockArray array of blocks which will be filled * @param blocksList blocks loaded in thread - * @throws IndexBuilderException in case of any failure + * @throws CarbonUtilException in case of any failure */ - private void fillLoadedBlocks(AbstractIndex[] loadedBlockArray, List<BlockInfo> blockInfos) - throws IndexBuilderException { + private void fillLoadedBlocks(AbstractIndex[] loadedBlockArray, + List<Future<AbstractIndex>> blocksList) throws CarbonUtilException { + int blockCounter = 0; + boolean exceptionOccurred = false; + Throwable exceptionRef = null; for (int i = 0; i < loadedBlockArray.length; i++) { - if (null == loadedBlockArray[i]) { - try { - loadedBlockArray[i] = mapOfBlockInfoToFuture.get(blockInfos.get(i)).get(); - } catch (InterruptedException | ExecutionException e) { - throw new IndexBuilderException(e); - } + try { + loadedBlockArray[i] = blocksList.get(blockCounter++).get(); + } catch (Throwable e) { --- End diff -- You are correct, instead of Throwable we can add a specialized catch clause. --- |
In reply to this post by qiuchenjian-2
Github user manishgupta88 commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/454#discussion_r94295193 --- Diff: core/src/main/java/org/apache/carbondata/core/carbon/datastore/BlockIndexStore.java --- @@ -244,124 +222,96 @@ public static BlockIndexStore getInstance() { * * @param loadedBlockArray array of blocks which will be filled * @param blocksList blocks loaded in thread - * @throws IndexBuilderException in case of any failure + * @throws CarbonUtilException in case of any failure */ - private void fillLoadedBlocks(AbstractIndex[] loadedBlockArray, List<BlockInfo> blockInfos) - throws IndexBuilderException { + private void fillLoadedBlocks(AbstractIndex[] loadedBlockArray, + List<Future<AbstractIndex>> blocksList) throws CarbonUtilException { + int blockCounter = 0; + boolean exceptionOccurred = false; + Throwable exceptionRef = null; for (int i = 0; i < loadedBlockArray.length; i++) { - if (null == loadedBlockArray[i]) { - try { - loadedBlockArray[i] = mapOfBlockInfoToFuture.get(blockInfos.get(i)).get(); - } catch (InterruptedException | ExecutionException e) { - throw new IndexBuilderException(e); - } + try { + loadedBlockArray[i] = blocksList.get(blockCounter++).get(); + } catch (Throwable e) { --- End diff -- @kumarvishal09 @mohammadshahidkhan ...Throwable has been added here because the get method of the FutureTask class can even throw unchecked exceptions (like CancellationException) apart from the checked exceptions (ExecutionException and InterruptedException) listed by the get() API. In case of any failure we need to decrement the access count, else the access count will never be decremented for successfully loaded blocks and they will always remain in memory. --- |
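The unchecked CancellationException mentioned above is easy to reproduce: it escapes `Future.get()` even though it does not appear in the method's `throws` clause. A small self-contained demonstration (the class and method names are illustration-only):

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;

public class CancelDemo {
    // Returns true if get() on a cancelled task throws the *unchecked*
    // CancellationException; only InterruptedException and
    // ExecutionException appear in get()'s throws clause.
    static boolean getThrowsCancellation() throws InterruptedException {
        FutureTask<Integer> task = new FutureTask<>(() -> 42);
        task.cancel(false);  // cancel before the task ever runs
        try {
            task.get();      // throws CancellationException at runtime
            return false;
        } catch (CancellationException e) {
            return true;     // the unchecked exception the review warns about
        } catch (ExecutionException e) {
            return false;
        }
    }
}
```

This is why a `catch (InterruptedException | ExecutionException e)` alone does not cover every failure mode of `get()`: a specialized clause still needs to include CancellationException (or another cleanup path) so the access-count rollback always runs.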
In reply to this post by qiuchenjian-2
Github user manishgupta88 commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/454#discussion_r94302315 --- Diff: core/src/main/java/org/apache/carbondata/core/cache/CacheProvider.java --- @@ -121,15 +141,24 @@ private void createDictionaryCacheForGivenType(CacheType cacheType, String carbo * @param cacheType */ private void createLRULevelCacheInstance(CacheType cacheType) { - CarbonLRUCache carbonLRUCache = null; - // if cache type is dictionary cache, then same lru cache instance has to be shared - // between forward and reverse cache - if (cacheType.equals(CacheType.REVERSE_DICTIONARY) || cacheType - .equals(CacheType.FORWARD_DICTIONARY)) { - carbonLRUCache = new CarbonLRUCache(CarbonCommonConstants.CARBON_MAX_LEVEL_CACHE_SIZE, - CarbonCommonConstants.CARBON_MAX_LEVEL_CACHE_SIZE_DEFAULT); - cacheTypeToLRUCacheMap.put(CacheType.REVERSE_DICTIONARY, carbonLRUCache); - cacheTypeToLRUCacheMap.put(CacheType.FORWARD_DICTIONARY, carbonLRUCache); + boolean isDriver = Boolean.parseBoolean(CarbonProperties.getInstance() + .getProperty(CarbonCommonConstants.IS_DRIVER_INSTANCE, "false")); + if (isDriver) { + carbonLRUCache = new CarbonLRUCache(CarbonCommonConstants.CARBON_MAX_DRIVER_LRU_CACHE_SIZE, + CarbonCommonConstants.CARBON_MAX_LRU_CACHE_SIZE_DEFAULT); + } else { + // if executor cache size is not configured then driver cache conf will be used + String executorCacheSize = CarbonProperties.getInstance() + .getProperty(CarbonCommonConstants.CARBON_MAX_EXECUTOR_LRU_CACHE_SIZE); + if (null != executorCacheSize) { + carbonLRUCache = new CarbonLRUCache(CarbonCommonConstants.CARBON_MAX_DRIVER_LRU_CACHE_SIZE, + CarbonCommonConstants.CARBON_MAX_LRU_CACHE_SIZE_DEFAULT); --- End diff -- Correct the code... in the executor case the cache size should use CARBON_MAX_EXECUTOR_LRU_CACHE_SIZE, not CARBON_MAX_DRIVER_LRU_CACHE_SIZE. --- |
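The fix requested above amounts to picking the right constant per branch. The sketch below uses hypothetical property-key strings (the real names live in CarbonCommonConstants); the bug flagged in the review is using the driver key inside the configured-executor branch.

```java
import java.util.Properties;

public class LruCacheKeyChooser {
    // Hypothetical stand-ins for the CarbonCommonConstants key names.
    static final String DRIVER_LRU_KEY = "carbon.max.driver.lru.cache.size";
    static final String EXECUTOR_LRU_KEY = "carbon.max.executor.lru.cache.size";

    /**
     * The driver always uses the driver key. An executor uses its own key
     * when it is configured and falls back to the driver key otherwise --
     * mirroring the intent stated in the diff's comment ("if executor cache
     * size is not configured then driver cache conf will be used").
     */
    static String chooseCacheSizeKey(boolean isDriver, Properties props) {
        if (isDriver) {
            return DRIVER_LRU_KEY;
        }
        String executorCacheSize = props.getProperty(EXECUTOR_LRU_KEY);
        // the reviewed code returned DRIVER_LRU_KEY here even when the
        // executor size *was* configured; this branch is the correction
        return (executorCacheSize != null) ? EXECUTOR_LRU_KEY : DRIVER_LRU_KEY;
    }
}
```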
In reply to this post by qiuchenjian-2
Github user manishgupta88 commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/454#discussion_r94302586 --- Diff: core/src/main/java/org/apache/carbondata/core/cache/CacheProvider.java --- @@ -148,7 +177,11 @@ private boolean dictionaryCacheAlreadyExists(CacheType cacheType) { * Below method will be used to clear the cache */ public void dropAllCache() { - cacheTypeToLRUCacheMap.clear(); - cacheTypeToCacheMap.clear(); + if(null != carbonLRUCache) { + carbonLRUCache.clear(); + } + if(null != cacheTypeToCacheMap) { --- End diff -- The null check is not required for cacheTypeToCacheMap as it gets initialized when the class is first referenced. --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/incubator-carbondata/pull/454 Build Success with Spark 1.5.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/416/ --- |