Github user CarbonDataQA commented on the issue:
https://github.com/apache/incubator-carbondata/pull/492

Build Success with Spark 1.5.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/472/
Github user CarbonDataQA commented on the issue:
https://github.com/apache/incubator-carbondata/pull/492

Build Success with Spark 1.5.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/474/
Github user CarbonDataQA commented on the issue:
https://github.com/apache/incubator-carbondata/pull/492

Build Success with Spark 1.5.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/475/
Github user CarbonDataQA commented on the issue:
https://github.com/apache/incubator-carbondata/pull/492

Build Success with Spark 1.5.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/476/
Github user manishgupta88 commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/492#discussion_r94937003

--- Diff: core/src/main/java/org/apache/carbondata/core/carbon/datastore/SegmentTaskIndexStore.java ---
@@ -185,15 +210,25 @@ private SegmentTaskIndexWrapper loadAndGetTaskIdToSegmentsMap(
       // acquire lock to lod the segment
       synchronized (segmentLoderLockObject) {
         segmentTaskIndexWrapper = (SegmentTaskIndexWrapper) lruCache.get(lruCacheKey);
-        if (null == segmentTaskIndexWrapper) {
-          // creating a map of take if to table segment
-          taskIdToSegmentIndexMap = new HashMap<TaskBucketHolder, AbstractIndex>();
-          segmentTaskIndexWrapper = new SegmentTaskIndexWrapper(taskIdToSegmentIndexMap);
+        if (null == segmentTaskIndexWrapper || tableSegmentUniqueIdentifier
+            .isSegmentUpdated()) {
+          // if the segment is updated then get the existing block task id map details
+          // so that the same can be updated after loading the btree.
+          if (tableSegmentUniqueIdentifier.isSegmentUpdated()
+              && null != segmentTaskIndexWrapper) {
--- End diff --

tableSegmentUniqueIdentifier.isSegmentUpdated() is checked twice: once in the outer if block above and again in this inner if block. Please remove one of the checks.
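A minimal editorial sketch of the simplification being asked for, reusing only the variables visible in the diff above (illustration only, not code from the PR): once execution is inside the outer block, a non-null wrapper already implies isSegmentUpdated() returned true, so the inner condition only needs the null check.

    // Editorial sketch; assumes the same variables as in the diff above.
    if (null == segmentTaskIndexWrapper || tableSegmentUniqueIdentifier.isSegmentUpdated()) {
      if (null != segmentTaskIndexWrapper) {
        // updated segment: reuse the existing block/task-id map so it can be
        // refreshed after the btree is reloaded
      } else {
        // first load of this segment: create a fresh map and wrapper
        taskIdToSegmentIndexMap = new HashMap<TaskBucketHolder, AbstractIndex>();
        segmentTaskIndexWrapper = new SegmentTaskIndexWrapper(taskIdToSegmentIndexMap);
      }
    }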
Github user ManoharVanam commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/492#discussion_r94937272

--- Diff: core/src/main/java/org/apache/carbondata/common/iudprocessor/cache/BlockletLevelDeleteDeltaDataCache.java ---
@@ -0,0 +1,29 @@
+package org.apache.carbondata.common.iudprocessor.cache;
+
+import org.roaringbitmap.RoaringBitmap;
+
+/**
+ * Created by S71955 on 06-10-2016.
+ */
+public class BlockletLevelDeleteDeltaDataCache {
--- End diff --

ok
Github user ManoharVanam commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/492#discussion_r94937313

--- Diff: core/src/main/java/org/apache/carbondata/core/update/CarbonUpdateUtil.java ---
@@ -0,0 +1,777 @@
+
--- End diff --

ok
Github user ManoharVanam commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/492#discussion_r94937329

--- Diff: core/src/main/java/org/apache/carbondata/core/updatestatus/SegmentStatusManager.java ---
@@ -92,75 +112,87 @@ public static long getTableStatusLastModifiedTime(
    * @return
    * @throws IOException
    */
-  public static SegmentStatus getSegmentStatus(AbsoluteTableIdentifier identifier)
-      throws IOException {
+  public ValidAndInvalidSegmentsInfo getValidAndInvalidSegments() throws IOException {
     // @TODO: move reading LoadStatus file to separate class
-    List<String> validSegments = new ArrayList<String>(10);
-    List<String> validUpdatedSegments = new ArrayList<String>(10);
-    List<String> invalidSegments = new ArrayList<String>(10);
-    CarbonTablePath carbonTablePath = CarbonStorePath.getCarbonTablePath(identifier.getStorePath(),
-        identifier.getCarbonTableIdentifier());
+    List<String> listOfValidSegments = new ArrayList<String>(10);
+    List<String> listOfValidUpdatedSegments = new ArrayList<String>(10);
+    List<String> listOfInvalidSegments = new ArrayList<String>(10);
+    CarbonTablePath carbonTablePath = CarbonStorePath
+        .getCarbonTablePath(absoluteTableIdentifier.getStorePath(),
+            absoluteTableIdentifier.getCarbonTableIdentifier());
     String dataPath = carbonTablePath.getTableStatusFilePath();
     DataInputStream dataInputStream = null;
     Gson gsonObjectToRead = new Gson();
     AtomicFileOperations fileOperation =
-        new AtomicFileOperationsImpl(dataPath, FileFactory.getFileType(dataPath));
+        new AtomicFileOperationsImpl(dataPath, FileFactory.getFileType(dataPath));
     LoadMetadataDetails[] loadFolderDetailsArray;
     try {
       if (FileFactory.isFileExist(dataPath, FileFactory.getFileType(dataPath))) {
+        dataInputStream = fileOperation.openForRead();
+
         BufferedReader buffReader =
-            new BufferedReader(
-                new InputStreamReader(dataInputStream, CarbonCommonConstants.DEFAULT_CHARSET));
+            new BufferedReader(new InputStreamReader(dataInputStream, "UTF-8"));
         loadFolderDetailsArray = gsonObjectToRead.fromJson(buffReader, LoadMetadataDetails[].class);
         //just directly iterate Array
         List<LoadMetadataDetails> loadFolderDetails = Arrays.asList(loadFolderDetailsArray);
         for (LoadMetadataDetails loadMetadataDetails : loadFolderDetails) {
-          String loadStatus = loadMetadataDetails.getLoadStatus();
-          if (CarbonCommonConstants.STORE_LOADSTATUS_SUCCESS.equalsIgnoreCase(loadStatus)
-              || CarbonCommonConstants.MARKED_FOR_UPDATE.equalsIgnoreCase(loadStatus)
-              || CarbonCommonConstants.STORE_LOADSTATUS_PARTIAL_SUCCESS.equalsIgnoreCase(
-                  loadStatus)) {
+          if (CarbonCommonConstants.STORE_LOADSTATUS_SUCCESS
+              .equalsIgnoreCase(loadMetadataDetails.getLoadStatus())
+              || CarbonCommonConstants.MARKED_FOR_UPDATE
+              .equalsIgnoreCase(loadMetadataDetails.getLoadStatus())
+              || CarbonCommonConstants.STORE_LOADSTATUS_PARTIAL_SUCCESS
+              .equalsIgnoreCase(loadMetadataDetails.getLoadStatus())) {
             // check for merged loads.
             if (null != loadMetadataDetails.getMergedLoadName()) {
-              if (!validSegments.contains(loadMetadataDetails.getMergedLoadName())) {
-                validSegments.add(loadMetadataDetails.getMergedLoadName());
+              if (!listOfValidSegments.contains(loadMetadataDetails.getMergedLoadName())) {
+                listOfValidSegments.add(loadMetadataDetails.getMergedLoadName());
               }
               // if merged load is updated then put it in updated list
-              if (CarbonCommonConstants.MARKED_FOR_UPDATE.equalsIgnoreCase(loadStatus)) {
-                validUpdatedSegments.add(loadMetadataDetails.getMergedLoadName());
+              if (CarbonCommonConstants.MARKED_FOR_UPDATE
+                  .equalsIgnoreCase(loadMetadataDetails.getLoadStatus())) {
+                listOfValidUpdatedSegments.add(loadMetadataDetails.getMergedLoadName());
              }
              continue;
            }
-            if (CarbonCommonConstants.MARKED_FOR_UPDATE.equalsIgnoreCase(loadStatus)) {
-              validUpdatedSegments.add(loadMetadataDetails.getLoadName());
+
+            if (CarbonCommonConstants.MARKED_FOR_UPDATE
+                .equalsIgnoreCase(loadMetadataDetails.getLoadStatus())) {
+
+              listOfValidUpdatedSegments.add(loadMetadataDetails.getLoadName());
            }
-            validSegments.add(loadMetadataDetails.getLoadName());
-          } else if (CarbonCommonConstants.STORE_LOADSTATUS_FAILURE.equalsIgnoreCase(loadStatus)
-              || CarbonCommonConstants.SEGMENT_COMPACTED.equalsIgnoreCase(loadStatus)
-              || CarbonCommonConstants.MARKED_FOR_DELETE.equalsIgnoreCase(loadStatus)) {
-            invalidSegments.add(loadMetadataDetails.getLoadName());
+            listOfValidSegments.add(loadMetadataDetails.getLoadName());
+          } else if ((CarbonCommonConstants.STORE_LOADSTATUS_FAILURE
+              .equalsIgnoreCase(loadMetadataDetails.getLoadStatus())
+              || CarbonCommonConstants.COMPACTED
+              .equalsIgnoreCase(loadMetadataDetails.getLoadStatus())
+              || CarbonCommonConstants.MARKED_FOR_DELETE
+              .equalsIgnoreCase(loadMetadataDetails.getLoadStatus()))) {
+            listOfInvalidSegments.add(loadMetadataDetails.getLoadName());
           }
--- End diff --

ok
Github user ManoharVanam commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/492#discussion_r94937338

--- Diff: core/src/main/java/org/apache/carbondata/core/updatestatus/SegmentUpdateStatusManager.java ---
@@ -0,0 +1,971 @@
+package org.apache.carbondata.core.updatestatus;
--- End diff --

ok
Github user manishgupta88 commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/492#discussion_r94937615

--- Diff: core/src/main/java/org/apache/carbondata/core/carbon/datastore/TableSegmentUniqueIdentifier.java ---
@@ -101,10 +102,17 @@ public String getSegmentId() {
    */
   public String getUniqueTableSegmentIdentifier() {
     CarbonTableIdentifier carbonTableIdentifier =
-        absoluteTableIdentifier.getCarbonTableIdentifier();
-    return carbonTableIdentifier.getDatabaseName()
-        + CarbonCommonConstants.FILE_SEPARATOR + carbonTableIdentifier
-        .getTableId() + CarbonCommonConstants.FILE_SEPARATOR + segmentId;
+        absoluteTableIdentifier.getCarbonTableIdentifier();
+    return carbonTableIdentifier.getDatabaseName() + CarbonCommonConstants.FILE_SEPARATOR
+        + carbonTableIdentifier.getTableName() + CarbonCommonConstants.UNDERSCORE
+        + carbonTableIdentifier.getTableId() + CarbonCommonConstants.FILE_SEPARATOR + segmentId;
--- End diff --

As each tableId will be unique, I think we can shorten the lruCacheKey and modify the code as below:

    return carbonTableIdentifier.getTableId() + CarbonCommonConstants.FILE_SEPARATOR + segmentId;
Github user CarbonDataQA commented on the issue:
https://github.com/apache/incubator-carbondata/pull/492

Build Success with Spark 1.5.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/477/
Github user mohammadshahidkhan commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/492#discussion_r94941379

--- Diff: core/src/main/java/org/apache/carbondata/core/carbon/datastore/block/TableBlockInfo.java ---
@@ -64,6 +67,12 @@
    */
   private BlockletInfos blockletInfos = new BlockletInfos();

+  /**
+   * map of block location and storage id
+   */
+  private Map<String, String> blockStorageIdMap =
--- End diff --

What is the use of blockStorageIdMap? If it is not in use, please remove it.
Github user manishgupta88 commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/492#discussion_r94942075

--- Diff: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java ---
@@ -199,6 +199,16 @@
    * FACT_FILE_EXT
    */
   public static final String FACT_FILE_EXT = ".carbondata";
+
+  /**
+   * DELETE_DELTA_FILE_EXT
+   */
+  public static final String DELETE_DELTA_FILE_EXT = ".deletedelta";
+
+  /**
+   * UPDATE_DELTA_FILE_EXT
+   */
+  public static final String UPDATE_DELTA_FILE_EXT = FACT_FILE_EXT;
--- End diff --

Two constants are not required here. Use FACT_FILE_EXT in place of UPDATE_DELTA_FILE_EXT wherever it is needed.
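A hedged sketch of what the constants block would look like after that change (editorial illustration, not the merged code); the call-site variable name below is hypothetical:

    /**
     * FACT_FILE_EXT
     */
    public static final String FACT_FILE_EXT = ".carbondata";

    /**
     * DELETE_DELTA_FILE_EXT
     */
    public static final String DELETE_DELTA_FILE_EXT = ".deletedelta";

    // Call sites reuse the fact file extension directly, e.g. (hypothetical name):
    // String updateDeltaFileName = blockName + CarbonCommonConstants.FACT_FILE_EXT;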
Github user nareshpr commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/492#discussion_r94942475

--- Diff: core/src/main/java/org/apache/carbondata/scan/scanner/impl/FilterScanner.java ---
@@ -114,16 +117,22 @@
   private void fillScannedResult(BlocksChunkHolder blocksChunkHolder)
       throws FilterUnsupportedException {
     scannedResult.reset();
-    QueryStatistic totalBlockletStatistic = queryStatisticsModel.getStatisticsTypeAndObjMap()
-        .get(QueryStatisticsConstants.TOTAL_BLOCKLET_NUM);
-    totalBlockletStatistic.addCountStatistic(QueryStatisticsConstants.TOTAL_BLOCKLET_NUM,
-        totalBlockletStatistic.getCount() + 1);
-    queryStatisticsModel.getRecorder().recordStatistics(totalBlockletStatistic);
+    scannedResult.setBlockletId(
+        blockExecutionInfo.getBlockId() + CarbonCommonConstants.FILE_SEPARATOR + blocksChunkHolder
+            .getDataBlock().nodeNumber());
     // apply min max
     if (isMinMaxEnabled) {
-      BitSet bitSet = this.filterExecuter
-          .isScanRequired(blocksChunkHolder.getDataBlock().getColumnsMaxValue(),
-              blocksChunkHolder.getDataBlock().getColumnsMinValue());
+      BitSet bitSet = null;
+      // check for implicit include filter instance
+      if (filterExecuter instanceof ImplicitColumnFilterExecutor) {
--- End diff --

This code will be removed.
Github user mohammadshahidkhan commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/492#discussion_r94943353

--- Diff: core/src/main/java/org/apache/carbondata/core/carbon/datastore/TableSegmentUniqueIdentifier.java ---
@@ -101,10 +102,17 @@ public String getSegmentId() {
    */
   public String getUniqueTableSegmentIdentifier() {
     CarbonTableIdentifier carbonTableIdentifier =
-        absoluteTableIdentifier.getCarbonTableIdentifier();
-    return carbonTableIdentifier.getDatabaseName()
-        + CarbonCommonConstants.FILE_SEPARATOR + carbonTableIdentifier
-        .getTableId() + CarbonCommonConstants.FILE_SEPARATOR + segmentId;
+        absoluteTableIdentifier.getCarbonTableIdentifier();
+    return carbonTableIdentifier.getDatabaseName() + CarbonCommonConstants.FILE_SEPARATOR
+        + carbonTableIdentifier.getTableName() + CarbonCommonConstants.UNDERSCORE
+        + carbonTableIdentifier.getTableId() + CarbonCommonConstants.FILE_SEPARATOR + segmentId;
--- End diff --

@manishgupta88 Adding the database name and table name along with the table id will make it easier to debug.
Github user mohammadshahidkhan commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/492#discussion_r94943478

--- Diff: core/src/main/java/org/apache/carbondata/core/carbon/datastore/SegmentTaskIndexStore.java ---
@@ -82,23 +85,24 @@ public SegmentTaskIndexStore(String carbonStorePath, CarbonLRUCache lruCache) {
   @Override
   public SegmentTaskIndexWrapper get(TableSegmentUniqueIdentifier tableSegmentUniqueIdentifier)
       throws IOException {
-    SegmentTaskIndexWrapper segmentTaskIndexWrapper =
-        loadAndGetTaskIdToSegmentsMap(tableSegmentUniqueIdentifier.getSegmentToTableBlocksInfos(),
-            tableSegmentUniqueIdentifier.getAbsoluteTableIdentifier(),
-            tableSegmentUniqueIdentifier);
+    SegmentTaskIndexWrapper segmentTaskIndexWrapper = null;
+    try {
+      segmentTaskIndexWrapper =
+          loadAndGetTaskIdToSegmentsMap(tableSegmentUniqueIdentifier.getSegmentToTableBlocksInfos(),
+              tableSegmentUniqueIdentifier.getAbsoluteTableIdentifier(),
+              tableSegmentUniqueIdentifier);
+    } catch (IndexBuilderException e) {
+      throw new IOException(e.getMessage(), e);
+    } catch (Throwable e) {
+      throw new IOException("Problem in loading segment block.", e);
+    }
+    if (null != segmentTaskIndexWrapper) {
       segmentTaskIndexWrapper.incrementAccessCount();
--- End diff --

Please remove segmentTaskIndexWrapper.incrementAccessCount(); it is not needed here.
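A hedged sketch of get() with that call removed, using only the names already visible in the diff and assuming the access count is handled elsewhere, e.g. by the caller (editorial illustration, not the merged code):

    @Override
    public SegmentTaskIndexWrapper get(TableSegmentUniqueIdentifier tableSegmentUniqueIdentifier)
        throws IOException {
      // per the review comment above, the access count is not incremented here
      try {
        // load (or fetch from the LRU cache) the task-id to segment index map for this segment
        return loadAndGetTaskIdToSegmentsMap(
            tableSegmentUniqueIdentifier.getSegmentToTableBlocksInfos(),
            tableSegmentUniqueIdentifier.getAbsoluteTableIdentifier(),
            tableSegmentUniqueIdentifier);
      } catch (IndexBuilderException e) {
        throw new IOException(e.getMessage(), e);
      } catch (Throwable e) {
        throw new IOException("Problem in loading segment block.", e);
      }
    }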
Github user CarbonDataQA commented on the issue:
https://github.com/apache/incubator-carbondata/pull/492

Build Success with Spark 1.5.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/479/
Github user gvramana commented on the issue:
https://github.com/apache/incubator-carbondata/pull/492

LGTM
Github user asfgit closed the pull request at:
https://github.com/apache/incubator-carbondata/pull/492
Github user zzcclp commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/492#discussion_r94985585

--- Diff: integration/spark/src/main/scala/org/apache/spark/sql/optimizer/CarbonOptimizer.scala ---
@@ -72,23 +74,71 @@ object CarbonOptimizer {
 class ResolveCarbonFunctions(relations: Seq[CarbonDecoderRelation])
   extends Rule[LogicalPlan] with PredicateHelper {
   val LOGGER = LogServiceFactory.getLogService(this.getClass.getName)
-  def apply(plan: LogicalPlan): LogicalPlan = {
-    if (relations.nonEmpty && !isOptimized(plan)) {
+  def apply(logicalPlan: LogicalPlan): LogicalPlan = {
+    if (relations.nonEmpty && !isOptimized(logicalPlan)) {
+      val plan = processPlan(logicalPlan)
+      val udfTransformedPlan = pushDownUDFToJoinLeftRelation(plan)
       LOGGER.info("Starting to optimize plan")
       val recorder = CarbonTimeStatisticsFactory.createExecutorRecorder("")
       val queryStatistic = new QueryStatistic()
-      val result = transformCarbonPlan(plan, relations)
+      val result = transformCarbonPlan(udfTransformedPlan, relations)
       queryStatistic.addStatistics("Time taken for Carbon Optimizer to optimize: ",
         System.currentTimeMillis)
       recorder.recordStatistics(queryStatistic)
       recorder.logStatistics()
       result
     } else {
       LOGGER.info("Skip CarbonOptimizer")
-      plan
+      logicalPlan
     }
   }

+  private def processPlan(plan: LogicalPlan): LogicalPlan = {
+    plan transform {
+      case ProjectForUpdate(table, cols, Seq(updatePlan)) =>
+        var isTransformed = false
+        val newPlan = updatePlan transform {
+          case Project(pList, child) if (!isTransformed) =>
+            val (dest: Seq[NamedExpression], source: Seq[NamedExpression]) = pList
+              .splitAt(pList.size - cols.size)
+            val diff = cols.diff(dest.map(_.name))
+            if (diff.size > 0) {
+              sys.error(s"Unknown column(s) ${diff.mkString(",")} in table ${table.tableName}")
+            }
+            isTransformed = true
+            Project(dest.filter(a => !cols.contains(a.name)) ++ source, child)
+        }
+        ProjectForUpdateCommand(newPlan, table.tableIdentifier)
--- End diff --

@ravikiran23 @jackylk UnresolvedRelation.tableIdentifier is of type Seq[String] in Spark 1.5, while in Spark 1.6 it is a TableIdentifier, so this does not compile against Spark 1.6.