kunal642 opened a new pull request #3601: Fixed performance issue for drop table
URL: https://github.com/apache/carbondata/pull/3601 ### Why is this PR needed? ### What changes were proposed in this PR? ### Does this PR introduce any user interface change? - No - Yes. (please explain the change and update document) ### Is any new testcase added? - No - Yes ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
CarbonDataQA1 commented on issue #3601: [WIP] Fixed performance issue for drop table
URL: https://github.com/apache/carbondata/pull/3601#issuecomment-581471527 Build Failed with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/136/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3601: [WIP] Fixed performance issue for drop table
URL: https://github.com/apache/carbondata/pull/3601#issuecomment-581472138 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1840/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3601: [CARBONDATA-3677] Fixed performance issue for drop table
URL: https://github.com/apache/carbondata/pull/3601#issuecomment-581757193 Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/142/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3601: [CARBONDATA-3677] Fixed performance issue for drop table
URL: https://github.com/apache/carbondata/pull/3601#issuecomment-581771136 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1846/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
jackylk commented on a change in pull request #3601: [CARBONDATA-3677] Fixed performance issue for drop table
URL: https://github.com/apache/carbondata/pull/3601#discussion_r374774775 ########## File path: core/src/main/java/org/apache/carbondata/core/datamap/DataMapChooser.java ########## @@ -121,6 +121,10 @@ public DataMapExprWrapper chooseCGDataMap(FilterResolverIntf resolverIntf) { return chooseDataMap(DataMapLevel.CG, resolverIntf); } + public boolean hasCGDataMap() { Review comment: It is strange we add this func in a class called `Chooser`, can you put in a more suitable class? ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
jackylk commented on a change in pull request #3601: [CARBONDATA-3677] Fixed performance issue for drop table
URL: https://github.com/apache/carbondata/pull/3601#discussion_r374775357 ########## File path: core/src/main/java/org/apache/carbondata/core/datamap/DataMapChooser.java ########## @@ -121,6 +121,10 @@ public DataMapExprWrapper chooseCGDataMap(FilterResolverIntf resolverIntf) { return chooseDataMap(DataMapLevel.CG, resolverIntf); } + public boolean hasCGDataMap() { Review comment: It is strange we add this func in a class called `Chooser`, can you put it in CarbonTable? ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
jackylk commented on a change in pull request #3601: [CARBONDATA-3677] Fixed performance issue for drop table
URL: https://github.com/apache/carbondata/pull/3601#discussion_r374777441 ########## File path: integration/spark2/src/main/scala/org/apache/carbondata/indexserver/DistributedPruneRDD.scala ########## @@ -142,7 +142,6 @@ private[indexserver] class DistributedPruneRDD(@transient private val ss: SparkS while (reader.nextKeyValue()) { blocklets.add(reader.getCurrentValue) } - reader.close() Review comment: not close reader? ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
jackylk commented on a change in pull request #3601: [CARBONDATA-3677] Fixed performance issue for drop table
URL: https://github.com/apache/carbondata/pull/3601#discussion_r374781431 ########## File path: core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java ########## @@ -513,7 +513,17 @@ public void refreshSegmentCacheIfRequired(CarbonTable carbonTable, * @param identifier Table identifier */ public void clearDataMaps(AbsoluteTableIdentifier identifier) { - clearDataMaps(identifier, true); + CarbonTable carbonTable = getCarbonTable(identifier); + boolean launchJob = false; + try { + // launchJob will be true if either the table has a CGDatamap or index server is enabled for + // the specified table. + launchJob = new DataMapChooser(carbonTable).hasCGDataMap() || CarbonProperties.getInstance() + .isDistributedPruningEnabled(identifier.getDatabaseName(), identifier.getTableName()); + } catch (IOException e) { + LOGGER.warn("Unable to launch job to clear datamaps.", e); + } + clearDataMaps(identifier, launchJob); Review comment: `launchJob` parameter name is not easy to understand, can you change it to `cleareDataMapCache(identifier, clearInAllWorkers)` to indicate that it will clear the datamap cache in all workers ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
jackylk commented on a change in pull request #3601: [CARBONDATA-3677] Fixed performance issue for drop table
URL: https://github.com/apache/carbondata/pull/3601#discussion_r374781431 ########## File path: core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java ########## @@ -513,7 +513,17 @@ public void refreshSegmentCacheIfRequired(CarbonTable carbonTable, * @param identifier Table identifier */ public void clearDataMaps(AbsoluteTableIdentifier identifier) { - clearDataMaps(identifier, true); + CarbonTable carbonTable = getCarbonTable(identifier); + boolean launchJob = false; + try { + // launchJob will be true if either the table has a CGDatamap or index server is enabled for + // the specified table. + launchJob = new DataMapChooser(carbonTable).hasCGDataMap() || CarbonProperties.getInstance() + .isDistributedPruningEnabled(identifier.getDatabaseName(), identifier.getTableName()); + } catch (IOException e) { + LOGGER.warn("Unable to launch job to clear datamaps.", e); + } + clearDataMaps(identifier, launchJob); Review comment: `launchJob` parameter name is not easy to understand, can you change it to `clearDataMapCache(identifier, clearInAllWorkers)` to indicate that it will clear the datamap cache in all workers ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
jackylk commented on a change in pull request #3601: [CARBONDATA-3677] Fixed performance issue for drop table
URL: https://github.com/apache/carbondata/pull/3601#discussion_r374783957 ########## File path: core/src/main/java/org/apache/carbondata/core/datamap/DistributableDataMapFormat.java ########## @@ -196,6 +196,11 @@ public float getProgress() { @Override public void close() { + // Clear the datamaps from executor + if (isFallbackJob) { + DataMapStoreManager.getInstance() + .clearDataMaps(table.getAbsoluteTableIdentifier(), false); Review comment: ```suggestion .clearDataMapCache(table.getAbsoluteTableIdentifier(), false); ``` ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
kunal642 commented on a change in pull request #3601: [CARBONDATA-3677] Fixed performance issue for drop table
URL: https://github.com/apache/carbondata/pull/3601#discussion_r375648297 ########## File path: core/src/main/java/org/apache/carbondata/core/datamap/DataMapChooser.java ########## @@ -121,6 +121,10 @@ public DataMapExprWrapper chooseCGDataMap(FilterResolverIntf resolverIntf) { return chooseDataMap(DataMapLevel.CG, resolverIntf); } + public boolean hasCGDataMap() { Review comment: Moved to DataMapStoreManager because this method is used only by DataMapStoreManager. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
kunal642 commented on a change in pull request #3601: [CARBONDATA-3677] Fixed performance issue for drop table
URL: https://github.com/apache/carbondata/pull/3601#discussion_r375648396 ########## File path: integration/spark2/src/main/scala/org/apache/carbondata/indexserver/DistributedPruneRDD.scala ########## @@ -142,7 +142,6 @@ private[indexserver] class DistributedPruneRDD(@transient private val ss: SparkS while (reader.nextKeyValue()) { blocklets.add(reader.getCurrentValue) } - reader.close() Review comment: nextKeyValue is already closing when next() returns false ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
kunal642 commented on a change in pull request #3601: [CARBONDATA-3677] Fixed performance issue for drop table
URL: https://github.com/apache/carbondata/pull/3601#discussion_r375648653 ########## File path: core/src/main/java/org/apache/carbondata/core/datamap/DistributableDataMapFormat.java ########## @@ -196,6 +196,11 @@ public float getProgress() { @Override public void close() { + // Clear the datamaps from executor + if (isFallbackJob) { + DataMapStoreManager.getInstance() + .clearDataMaps(table.getAbsoluteTableIdentifier(), false); Review comment: done ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
kunal642 commented on a change in pull request #3601: [CARBONDATA-3677] Fixed performance issue for drop table
URL: https://github.com/apache/carbondata/pull/3601#discussion_r375648707 ########## File path: core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java ########## @@ -513,7 +513,17 @@ public void refreshSegmentCacheIfRequired(CarbonTable carbonTable, * @param identifier Table identifier */ public void clearDataMaps(AbsoluteTableIdentifier identifier) { - clearDataMaps(identifier, true); + CarbonTable carbonTable = getCarbonTable(identifier); + boolean launchJob = false; + try { + // launchJob will be true if either the table has a CGDatamap or index server is enabled for + // the specified table. + launchJob = new DataMapChooser(carbonTable).hasCGDataMap() || CarbonProperties.getInstance() + .isDistributedPruningEnabled(identifier.getDatabaseName(), identifier.getTableName()); + } catch (IOException e) { + LOGGER.warn("Unable to launch job to clear datamaps.", e); + } + clearDataMaps(identifier, launchJob); Review comment: done ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3601: [CARBONDATA-3677] Fixed performance issue for drop table
URL: https://github.com/apache/carbondata/pull/3601#issuecomment-582748130 Build Failed with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/164/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3601: [CARBONDATA-3677] Fixed performance issue for drop table
URL: https://github.com/apache/carbondata/pull/3601#issuecomment-582748510 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1867/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
Indhumathi27 commented on a change in pull request #3601: [CARBONDATA-3677] Fixed performance issue for drop table
URL: https://github.com/apache/carbondata/pull/3601#discussion_r375702400 ########## File path: core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java ########## @@ -35,6 +35,7 @@ import org.apache.carbondata.core.constants.CarbonCommonConstants; import org.apache.carbondata.core.constants.CarbonLoadOptionConstants; import org.apache.carbondata.core.constants.SortScopeOptions; +import org.apache.carbondata.core.datamap.DataMapLevel; Review comment: please remove if not used ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3601: [CARBONDATA-3677] Fixed performance issue for drop table
URL: https://github.com/apache/carbondata/pull/3601#issuecomment-583829617 Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/192/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3601: [CARBONDATA-3677] Fixed performance issue for drop table
URL: https://github.com/apache/carbondata/pull/3601#issuecomment-583834562 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1894/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
Free forum by Nabble | Edit this page |