[GitHub] [carbondata] QiangCai opened a new pull request #3777: [WIP][Perf] Disable MV feature by default


[GitHub] [carbondata] QiangCai opened a new pull request #3777: [WIP][Perf] Disable MV feature by default

GitBox

QiangCai opened a new pull request #3777:
URL: https://github.com/apache/carbondata/pull/3777


    ### Why is this PR needed?
   
   
    ### What changes were proposed in this PR?
   
       
    ### Does this PR introduce any user interface change?
    - No
    - Yes. (please explain the change and update document)
   
    ### Is any new testcase added?
    - No
    - Yes
   
       
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3777: [WIP][Perf] Disable MV feature by default

GitBox

CarbonDataQA1 commented on pull request #3777:
URL: https://github.com/apache/carbondata/pull/3777#issuecomment-635102422


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3080/
   



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3777: [WIP][Perf] Disable MV feature by default

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3777:
URL: https://github.com/apache/carbondata/pull/3777#issuecomment-635102683


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1359/
   



[GitHub] [carbondata] jackylk commented on pull request #3777: [WIP][Perf] Support disable mv

GitBox
In reply to this post by GitBox

jackylk commented on pull request #3777:
URL: https://github.com/apache/carbondata/pull/3777#issuecomment-636468129


   retest this please



[GitHub] [carbondata] jackylk commented on a change in pull request #3777: [CARBONDATA-3837] Fallback to the original plan when mv rewrite throw exception

GitBox
In reply to this post by GitBox

jackylk commented on a change in pull request #3777:
URL: https://github.com/apache/carbondata/pull/3777#discussion_r432946953



##########
File path: integration/spark/src/main/spark2.4/org/apache/spark/sql/CarbonToSparkAdapter.scala
##########
@@ -193,9 +193,17 @@ class CarbonOptimizer(
     catalog: SessionCatalog,
     optimizer: Optimizer) extends Optimizer(catalog) {
 
-  private lazy val mvRules = Seq(Batch("Materialized View Optimizers", Once,
+  private lazy val _mvRules = Seq(Batch("Materialized View Optimizers", Once,
     Seq(new MVRewriteRule(session)): _*))
 
+  private def mvRules: Seq[Batch] = {
+    // enable mv by default
+    session.conf.get("spark.carbon.mv.enable", "true").toBoolean match {

Review comment:
       Please define a constant for the default value.
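
A minimal sketch of what pulling the default into a constant could look like. The names `MVConfigSketch`, `MvEnableKey` and `MvEnableDefault` are hypothetical, and a plain `Map` stands in for the Spark session conf. Only the key `spark.carbon.mv.enable` and the default `"true"` come from the diff above.

```scala
// Hypothetical sketch: names are illustrative, not CarbonData's actual constants.
object MVConfigSketch {
  // Key and default taken from the diff; kept in one place so call sites cannot drift apart.
  val MvEnableKey: String = "spark.carbon.mv.enable"
  val MvEnableDefault: String = "true"

  // `conf` is a stand-in for the session's runtime configuration.
  def isMvEnabled(conf: Map[String, String]): Boolean =
    conf.getOrElse(MvEnableKey, MvEnableDefault).toBoolean
}
```

With this shape, the spark2.3 and spark2.4 adapters could both refer to the same pair of constants instead of repeating the literals.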





[GitHub] [carbondata] jackylk commented on a change in pull request #3777: [CARBONDATA-3837] Fallback to the original plan when mv rewrite throw exception

GitBox
In reply to this post by GitBox

jackylk commented on a change in pull request #3777:
URL: https://github.com/apache/carbondata/pull/3777#discussion_r432947210



##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/optimizer/MVRewriteRule.scala
##########
@@ -39,15 +39,31 @@ import org.apache.carbondata.view.MVFunctions.DUMMY_FUNCTION
  */
 class MVRewriteRule(session: SparkSession) extends Rule[LogicalPlan] {
 
-  private val logger = MVRewriteRule.LOGGER
-
   private val catalogFactory = new MVCatalogFactory[MVSchemaWrapper] {
     override def newCatalog(): MVCatalog[MVSchemaWrapper] = {
       new MVCatalogInSpark(session)
     }
   }
 
   override def apply(logicalPlan: LogicalPlan): LogicalPlan = {
+    // only query need to check MVRewriteRule
+    logicalPlan match {
+      case _: Command => return logicalPlan
+      case _: LocalRelation => return logicalPlan
+      case _ =>
+    }
+    try {
+      tryRewritePlan(logicalPlan)
+    } catch {
+      case e =>
+        // catch all exceptions to avoid to impact query.
+        // if MVRewriteRule throw exception, here will fallback to original plan.

Review comment:
       ```suggestion
           // if an exception is thrown while rewriting the query, fall back to the original query plan.
   ```
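
The fallback in this hunk can be sketched in isolation as follows. `Plan` is a hypothetical stand-in for Spark's `LogicalPlan`, and `rewriteOrFallback` is an illustrative name. Note one deliberate difference: the diff uses a bare `case e =>`, which catches every `Throwable`, whereas this sketch uses `scala.util.control.NonFatal` so that errors such as `OutOfMemoryError` still propagate. That is an assumption about what is desirable, not what the PR does.

```scala
import scala.util.control.NonFatal

// Hypothetical sketch: `Plan` is a stand-in for Spark's LogicalPlan.
object FallbackSketch {
  final case class Plan(description: String)

  // Applies `rewrite`; if it throws a non-fatal exception, returns the input plan unchanged.
  def rewriteOrFallback(plan: Plan)(rewrite: Plan => Plan): Plan =
    try {
      rewrite(plan)
    } catch {
      case NonFatal(e) =>
        // A real rule would use its logger; stderr keeps the sketch dependency-free.
        System.err.println(s"rewrite failed, using original plan: ${e.getMessage}")
        plan
    }
}
```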





[GitHub] [carbondata] jackylk commented on a change in pull request #3777: [CARBONDATA-3837] Fallback to the original plan when mv rewrite throw exception

GitBox
In reply to this post by GitBox

jackylk commented on a change in pull request #3777:
URL: https://github.com/apache/carbondata/pull/3777#discussion_r432947240



##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/optimizer/MVRewriteRule.scala
##########
@@ -39,15 +39,31 @@ import org.apache.carbondata.view.MVFunctions.DUMMY_FUNCTION
  */
 class MVRewriteRule(session: SparkSession) extends Rule[LogicalPlan] {
 
-  private val logger = MVRewriteRule.LOGGER
-
   private val catalogFactory = new MVCatalogFactory[MVSchemaWrapper] {
     override def newCatalog(): MVCatalog[MVSchemaWrapper] = {
       new MVCatalogInSpark(session)
     }
   }
 
   override def apply(logicalPlan: LogicalPlan): LogicalPlan = {
+    // only query need to check MVRewriteRule
+    logicalPlan match {
+      case _: Command => return logicalPlan
+      case _: LocalRelation => return logicalPlan
+      case _ =>
+    }
+    try {
+      tryRewritePlan(logicalPlan)
+    } catch {
+      case e =>
+        // catch all exceptions to avoid to impact query.

Review comment:
       This line can be deleted
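
For reference, the guard at the top of `apply` in the hunk above (skip commands and local relations, rewrite everything else) could also be written as a single `match` expression without early `return`. This is only a sketch: `Plan`, `Command`, `LocalRelation` and `Query` below are hypothetical stand-ins for Spark's plan classes, not the real types.

```scala
// Hypothetical sketch: these types stand in for Spark's LogicalPlan hierarchy.
object GuardSketch {
  sealed trait Plan
  final case class Command(name: String) extends Plan
  final case class LocalRelation(rows: Int) extends Plan
  final case class Query(sql: String) extends Plan

  // Commands and local relations pass through untouched; only other plans reach `rewrite`.
  def applyRule(plan: Plan)(rewrite: Plan => Plan): Plan = plan match {
    case _: Command | _: LocalRelation => plan
    case other => rewrite(other)
  }
}
```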





[GitHub] [carbondata] QiangCai commented on a change in pull request #3777: [CARBONDATA-3837] Fallback to the original plan when mv rewrite throw exception

GitBox
In reply to this post by GitBox

QiangCai commented on a change in pull request #3777:
URL: https://github.com/apache/carbondata/pull/3777#discussion_r432947870



##########
File path: integration/spark/src/main/spark2.4/org/apache/spark/sql/CarbonToSparkAdapter.scala
##########
@@ -193,9 +193,17 @@ class CarbonOptimizer(
     catalog: SessionCatalog,
     optimizer: Optimizer) extends Optimizer(catalog) {
 
-  private lazy val mvRules = Seq(Batch("Materialized View Optimizers", Once,
+  private lazy val _mvRules = Seq(Batch("Materialized View Optimizers", Once,
     Seq(new MVRewriteRule(session)): _*))
 
+  private def mvRules: Seq[Batch] = {
+    // enable mv by default
+    session.conf.get("spark.carbon.mv.enable", "true").toBoolean match {

Review comment:
       This is a runtime configuration, so the user can change it with the SET command.
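
A sketch of why the switch from `lazy val mvRules` to `def mvRules` matters for a runtime setting: a `def` re-reads the configuration on every call, while the expensive batch is still built only once. `RuntimeConf` and `OptimizerSketch` are hypothetical stand-ins for the session conf and `CarbonOptimizer`. The branches of the `match` are not shown in the hunk, so the sketch assumes they return the cached rules when enabled and an empty sequence otherwise.

```scala
import scala.collection.mutable

// Hypothetical sketch: stand-ins for the Spark session conf and CarbonOptimizer.
object RuntimeToggleSketch {
  final class RuntimeConf {
    private val values = mutable.Map.empty[String, String]
    def set(key: String, value: String): Unit = values.update(key, value)
    def get(key: String, default: String): String = values.getOrElse(key, default)
  }

  final class OptimizerSketch(conf: RuntimeConf) {
    // Built once on first use, like `_mvRules` in the diff.
    private lazy val cachedRules: Seq[String] = Seq("Materialized View Optimizers")

    // Re-evaluated on every call, so a later change to the conf takes effect.
    def mvRules: Seq[String] =
      if (conf.get("spark.carbon.mv.enable", "true").toBoolean) cachedRules else Seq.empty
  }
}
```

In a real session the equivalent of `conf.set(...)` would presumably be a statement such as `SET spark.carbon.mv.enable=false`, using the key as it appears in this revision of the diff.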





[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3777: [CARBONDATA-3837] Fallback to the original plan when mv rewrite throw exception

GitBox
In reply to this post by GitBox

ajantha-bhat commented on a change in pull request #3777:
URL: https://github.com/apache/carbondata/pull/3777#discussion_r432948384



##########
File path: integration/spark/src/main/spark2.3/org/apache/spark/sql/CarbonToSparkAdapter.scala
##########
@@ -154,9 +154,17 @@ class CarbonOptimizer(
     catalog: SessionCatalog,
     optimizer: Optimizer) extends Optimizer(catalog) {
 
-  private lazy val mvRules = Seq(Batch("Materialized View Optimizers", Once,
+  private lazy val _mvRules = Seq(Batch("Materialized View Optimizers", Once,
     Seq(new MVRewriteRule(session)): _*))
 
+  private def mvRules: Seq[Batch] = {
+    // enable mv by default
+    session.conf.get("spark.carbon.mv.enable", "true").toBoolean match {

Review comment:
       How will users know that this property exists? Does it need to be added to the documentation?
   And why not make it a carbon property?





[GitHub] [carbondata] jackylk commented on a change in pull request #3777: [CARBONDATA-3837] Fallback to the original plan when mv rewrite throw exception

GitBox
In reply to this post by GitBox

jackylk commented on a change in pull request #3777:
URL: https://github.com/apache/carbondata/pull/3777#discussion_r432948452



##########
File path: integration/spark/src/main/spark2.4/org/apache/spark/sql/CarbonToSparkAdapter.scala
##########
@@ -193,9 +193,17 @@ class CarbonOptimizer(
     catalog: SessionCatalog,
     optimizer: Optimizer) extends Optimizer(catalog) {
 
-  private lazy val mvRules = Seq(Batch("Materialized View Optimizers", Once,
+  private lazy val _mvRules = Seq(Batch("Materialized View Optimizers", Once,
     Seq(new MVRewriteRule(session)): _*))
 
+  private def mvRules: Seq[Batch] = {
+    // enable mv by default
+    session.conf.get("spark.carbon.mv.enable", "true").toBoolean match {

Review comment:
       The conf name should start with `carbon`; please also add it to the docs.





[GitHub] [carbondata] jackylk commented on a change in pull request #3777: [CARBONDATA-3837] Fallback to the original plan when mv rewrite throw exception

GitBox
In reply to this post by GitBox

jackylk commented on a change in pull request #3777:
URL: https://github.com/apache/carbondata/pull/3777#discussion_r432948661



##########
File path: integration/spark/src/main/spark2.3/org/apache/spark/sql/CarbonToSparkAdapter.scala
##########
@@ -154,9 +154,17 @@ class CarbonOptimizer(
     catalog: SessionCatalog,
     optimizer: Optimizer) extends Optimizer(catalog) {
 
-  private lazy val mvRules = Seq(Batch("Materialized View Optimizers", Once,
+  private lazy val _mvRules = Seq(Batch("Materialized View Optimizers", Once,
     Seq(new MVRewriteRule(session)): _*))
 
+  private def mvRules: Seq[Batch] = {
+    // enable mv by default
+    session.conf.get("spark.carbon.mv.enable", "true").toBoolean match {

Review comment:
       Please add this to https://github.com/apache/carbondata/blob/master/docs/configuration-parameters.md
   The conf name should start with `carbon`; please change it to `carbon.mv.enabled`.





[GitHub] [carbondata] QiangCai commented on a change in pull request #3777: [CARBONDATA-3837] Fallback to the original plan when mv rewrite throw exception

GitBox
In reply to this post by GitBox

QiangCai commented on a change in pull request #3777:
URL: https://github.com/apache/carbondata/pull/3777#discussion_r432953564



##########
File path: integration/spark/src/main/spark2.3/org/apache/spark/sql/CarbonToSparkAdapter.scala
##########
@@ -154,9 +154,17 @@ class CarbonOptimizer(
     catalog: SessionCatalog,
     optimizer: Optimizer) extends Optimizer(catalog) {
 
-  private lazy val mvRules = Seq(Batch("Materialized View Optimizers", Once,
+  private lazy val _mvRules = Seq(Batch("Materialized View Optimizers", Once,
     Seq(new MVRewriteRule(session)): _*))
 
+  private def mvRules: Seq[Batch] = {
+    // enable mv by default
+    session.conf.get("spark.carbon.mv.enable", "true").toBoolean match {

Review comment:
       fixed





[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3777: [CARBONDATA-3837] Fallback to the original plan when mv rewrite throw exception

GitBox
In reply to this post by GitBox

ajantha-bhat commented on a change in pull request #3777:
URL: https://github.com/apache/carbondata/pull/3777#discussion_r432959146



##########
File path: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
##########
@@ -1562,6 +1562,20 @@ private CarbonCommonConstants() {
 
   public static final String CARBON_LUCENE_INDEX_STOP_WORDS_DEFAULT = "false";
 
+  //////////////////////////////////////////////////////////////////////////////////////////
+  // MV parameter start here
+  //////////////////////////////////////////////////////////////////////////////////////////
+
+  /**
+   * Property to enable MV rewrite
+   */
+  @CarbonProperty(dynamicConfigurable = true)

Review comment:
       You wanted to make this dynamically configurable, but did you forget to add it to
   SessionParams#validateKeyValue?
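
To illustrate the kind of check being asked for, here is a rough sketch of validating a dynamically settable boolean property. It is not the real `SessionParams` API: `DynamicConfSketch`, its `validateKeyValue` signature and the `Either` return type are all hypothetical, and the key `carbon.mv.enabled` is the name proposed earlier in this review rather than a confirmed constant.

```scala
// Hypothetical sketch: this is not the real SessionParams API, only the shape of such a check.
object DynamicConfSketch {
  // Key name as proposed in the review; treated here as an assumption.
  val MvEnabledKey: String = "carbon.mv.enabled"

  // Keys that may be changed at runtime and must carry a boolean value.
  private val booleanKeys: Set[String] = Set(MvEnabledKey)

  // Returns Right(()) when the pair is acceptable, Left(reason) otherwise.
  def validateKeyValue(key: String, value: String): Either[String, Unit] =
    if (!booleanKeys.contains(key)) Left(s"unsupported dynamic key: $key")
    else if (value.equalsIgnoreCase("true") || value.equalsIgnoreCase("false")) Right(())
    else Left(s"invalid boolean value for $key: $value")
}
```

Without an entry like this, a property marked as dynamically configurable could presumably still be rejected when a user tries to set it at runtime.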





[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3777: [CARBONDATA-3837] Fallback to the original plan when mv rewrite throw exception

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3777:
URL: https://github.com/apache/carbondata/pull/3777#issuecomment-636507759


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3112/
   



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3777: [CARBONDATA-3837] Fallback to the original plan when mv rewrite throw exception

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3777:
URL: https://github.com/apache/carbondata/pull/3777#issuecomment-636508600


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1388/
   



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3777: [CARBONDATA-3837] Fallback to the original plan when mv rewrite throw exception

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3777:
URL: https://github.com/apache/carbondata/pull/3777#issuecomment-636568643


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3113/
   



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3777: [CARBONDATA-3837] Fallback to the original plan when mv rewrite throw exception

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3777:
URL: https://github.com/apache/carbondata/pull/3777#issuecomment-636568738


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1389/
   



[GitHub] [carbondata] jackylk commented on pull request #3777: [CARBONDATA-3837] Fallback to the original plan when mv rewrite throw exception

GitBox
In reply to this post by GitBox

jackylk commented on pull request #3777:
URL: https://github.com/apache/carbondata/pull/3777#issuecomment-636579166


   LGTM



[GitHub] [carbondata] asfgit closed pull request #3777: [CARBONDATA-3837] Fallback to the original plan when mv rewrite throw exception

GitBox
In reply to this post by GitBox

asfgit closed pull request #3777:
URL: https://github.com/apache/carbondata/pull/3777


   

