jackylk opened a new pull request #3609: [WIP] Support MV extension and MV syntax
URL: https://github.com/apache/carbondata/pull/3609

### Why is this PR needed?

### What changes were proposed in this PR?
This PR adds a Materialized View (MV) extension for Spark.

The following SQL commands are supported:
1. CREATE MATERIALIZED VIEW
2. DROP MATERIALIZED VIEW
3. SHOW MATERIALIZED VIEW
4. REBUILD MATERIALIZED VIEW
5. ALTER MATERIALIZED VIEW COMPACT

The following optimizer rules are added:
1. Rewrite the SQL statement by matching it against existing MVs and selecting the lowest-cost MV

### Does this PR introduce any user interface change?
- Yes. (New SQL syntax is added.)

### Is any new testcase added?
- No. (The existing DataMap test cases are modified to use the MV syntax.)

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

With regards,
Apache Git Services
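The optimizer rule in the description picks the lowest-cost MV among those that match a query. The PR text does not say how cost is computed, so the following is only a minimal sketch of the selection step under that assumption: `MvCandidate`, `estimatedCost`, and `LowestCostMv` are hypothetical names, not CarbonData classes.

```scala
// Hypothetical model for illustration; not a CarbonData class.
// `estimatedCost` stands in for whatever cost metric the optimizer uses.
final case class MvCandidate(name: String, estimatedCost: Long)

object LowestCostMv {
  // Returns the matching candidate with the smallest estimated cost,
  // or None when no MV matches the query.
  def select(matching: Seq[MvCandidate]): Option[MvCandidate] =
    if (matching.isEmpty) None else Some(matching.minBy(_.estimatedCost))
}
```

When no MV matches, `select` returns `None`, which would correspond to leaving the original query plan unchanged.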
CarbonDataQA1 commented on issue #3609: [WIP] Support MV extension and MV syntax
URL: https://github.com/apache/carbondata/pull/3609#issuecomment-584156592
Build Failed with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/215/
CarbonDataQA1 commented on issue #3609: [WIP] Support MV extension and MV syntax
URL: https://github.com/apache/carbondata/pull/3609#issuecomment-584160108
Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1917/
CarbonDataQA1 commented on issue #3609: [WIP] Support MV extension and MV syntax
URL: https://github.com/apache/carbondata/pull/3609#issuecomment-584178925
Build Failed with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/218/
CarbonDataQA1 commented on issue #3609: [WIP] Support MV extension and MV syntax
URL: https://github.com/apache/carbondata/pull/3609#issuecomment-584202324
Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/220/
CarbonDataQA1 commented on issue #3609: [WIP] Support MV extension and MV syntax
URL: https://github.com/apache/carbondata/pull/3609#issuecomment-584229059
Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1922/
jackylk commented on issue #3609: [WIP] Support MV extension and MV syntax
URL: https://github.com/apache/carbondata/pull/3609#issuecomment-584442975
retest this please
CarbonDataQA1 commented on issue #3609: [WIP] Support MV extension and MV syntax
URL: https://github.com/apache/carbondata/pull/3609#issuecomment-584447788
Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/223/
CarbonDataQA1 commented on issue #3609: [CARBONDATA-3689] Support independent MV extension and MV syntax
URL: https://github.com/apache/carbondata/pull/3609#issuecomment-584459531
Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1925/
jackylk commented on issue #3609: [CARBONDATA-3689] Support independent MV extension and MV syntax
URL: https://github.com/apache/carbondata/pull/3609#issuecomment-584611183
retest this please
CarbonDataQA1 commented on issue #3609: [CARBONDATA-3689] Support independent MV extension and MV syntax
URL: https://github.com/apache/carbondata/pull/3609#issuecomment-584617697
Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/238/
CarbonDataQA1 commented on issue #3609: [CARBONDATA-3689] Support independent MV extension and MV syntax
URL: https://github.com/apache/carbondata/pull/3609#issuecomment-584639662
Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1940/
Indhumathi27 commented on a change in pull request #3609: [CARBONDATA-3689] Support independent MV extension and MV syntax
URL: https://github.com/apache/carbondata/pull/3609#discussion_r378112277

##########
File path: datamap/mv/core/src/test/scala/org/apache/carbondata/mv/testutil/Tpcds_1_4_QueryBatch.scala
##########
@@ -1,20 +1,3 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements. See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License. You may obtain a copy of the License at
- *

Review comment:
Why is the license header removed?
jackylk commented on a change in pull request #3609: [CARBONDATA-3689] Support independent MV extension and MV syntax
URL: https://github.com/apache/carbondata/pull/3609#discussion_r378144529

##########
File path: datamap/mv/core/src/test/scala/org/apache/carbondata/mv/testutil/Tpcds_1_4_QueryBatch.scala
##########
@@ -1,20 +1,3 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements. See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License. You may obtain a copy of the License at
- *

Review comment:
Fixed.
CarbonDataQA1 commented on issue #3609: [CARBONDATA-3689] Support independent MV extension and MV syntax
URL: https://github.com/apache/carbondata/pull/3609#issuecomment-585133263
Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/256/
CarbonDataQA1 commented on issue #3609: [CARBONDATA-3689] Support independent MV extension and MV syntax
URL: https://github.com/apache/carbondata/pull/3609#issuecomment-585156553
Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1959/
Indhumathi27 commented on a change in pull request #3609: [CARBONDATA-3689] Support independent MV extension and MV syntax
URL: https://github.com/apache/carbondata/pull/3609#discussion_r378676919

##########
File path: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/extension/MVParser.scala
##########
@@ -0,0 +1,226 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.mv.extension
+
+import scala.language.implicitConversions
+import scala.util.matching.Regex
+import scala.util.parsing.combinator.PackratParsers
+import scala.util.parsing.combinator.syntactical.StandardTokenParsers
+
+import org.apache.spark.sql.{DataFrame, SparkSession}
+import org.apache.spark.sql.catalyst.{CarbonParserUtil, SqlLexical, TableIdentifier}
+import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
+import org.apache.spark.sql.execution.command.AlterTableModel
+import org.apache.spark.sql.execution.command.datamap.{CarbonCreateDataMapCommand, CarbonDataMapRebuildCommand, CarbonDataMapShowCommand, CarbonDropDataMapCommand}
+import org.apache.spark.sql.execution.command.management.CarbonAlterTableCompactionCommand
+import org.apache.spark.sql.hive.CarbonMVRules
+import org.apache.spark.sql.util.{CarbonException, SparkSQLUtil}
+
+import org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException
+import org.apache.carbondata.mv.rewrite.MVUdf
+
+class MVParser extends StandardTokenParsers with PackratParsers {
+
+  // Keywords used in this parser
+  protected val SELECT: Regex = carbonKeyWord("SELECT")
+  protected val CREATE: Regex = carbonKeyWord("CREATE")
+  protected val MATERIALIZED: Regex = carbonKeyWord("MATERIALIZED")
+  protected val VIEW: Regex = carbonKeyWord("VIEW")
+  protected val VIEWS: Regex = carbonKeyWord("VIEWS")
+  protected val AS: Regex = carbonKeyWord("AS")
+  protected val DROP: Regex = carbonKeyWord("DROP")
+  protected val SHOW: Regex = carbonKeyWord("SHOW")
+  protected val IF: Regex = carbonKeyWord("IF")
+  protected val EXISTS: Regex = carbonKeyWord("EXISTS")
+  protected val NOT: Regex = carbonKeyWord("NOT")
+  protected val MVPROPERTIES: Regex = carbonKeyWord("MVPROPERTIES")
+  protected val WITH: Regex = carbonKeyWord("WITH")
+  protected val DEFERRED: Regex = carbonKeyWord("DEFERRED")
+  protected val REBUILD: Regex = carbonKeyWord("REBUILD")
+  protected val ON: Regex = carbonKeyWord("ON")
+  protected val TABLE: Regex = carbonKeyWord("TABLE")
+  protected val ALTER: Regex = carbonKeyWord("ALTER")
+  protected val COMPACT: Regex = carbonKeyWord("COMPACT")
+  protected val IN: Regex = carbonKeyWord("IN")
+  protected val SEGMENT: Regex = carbonKeyWord("SEGMENT")
+  protected val ID: Regex = carbonKeyWord("ID")
+  protected val WHERE: Regex = carbonKeyWord("WHERE")
+
+  /**
+   * This will convert key word to regular expression.
+   */
+  private def carbonKeyWord(keys: String): Regex = {
+    ("(?i)" + keys).r
+  }
+
+  implicit def regexToParser(regex: Regex): Parser[String] = {
+    import lexical.Identifier
+    acceptMatch(
+      s"identifier matching regex ${ regex }",
+      { case Identifier(str) if regex.unapplySeq(str).isDefined => str }
+    )
+  }
+
+  // By default, use Reflection to find the reserved words defined in the sub class.
+  // NOTICE, Since the Keyword properties defined by sub class, we couldn't call this
+  // method during the parent class instantiation, because the sub class instance
+  // isn't created yet.
+  protected lazy val reservedWords: Seq[String] =
+    this
+      .getClass
+      .getMethods
+      .filter(_.getReturnType == classOf[Keyword])
+      .map(_.invoke(this).asInstanceOf[Keyword].normalize)
+
+  // Set the keywords as empty by default, will change that later.
+  override val lexical = new SqlLexical
+
+  protected case class Keyword(str: String) {
+    def normalize: String = lexical.normalizeKeyword(str)
+    def parser: Parser[String] = normalize
+  }
+
+  def parse(input: String): LogicalPlan = {
+    synchronized {
+      phrase(start)(new lexical.Scanner(input)) match {
+        case Success(plan, _) =>
+          plan
+        case failureOrError =>
+          CarbonException.analysisException(failureOrError.toString)
+      }
+    }
+  }
+
+  private lazy val start: Parser[LogicalPlan] = mvCommand
+
+  private lazy val mvCommand: Parser[LogicalPlan] =
+    createMV | dropMV | showMV | rebuildMV | compactMV
+
+  /**
+   * CREATE MATERIALIZED VIEW IF NOT EXISTS mv_name
+   * MVPROPERTIES('KEY'='VALUE') AS mv_query_statement
+   */
+  private lazy val createMV: Parser[LogicalPlan] =
+    CREATE ~> MATERIALIZED ~> VIEW ~> opt(IF ~> NOT ~> EXISTS) ~ ident ~

Review comment:
```suggestion
    CREATE ~> MATERIALIZED ~> VIEW ~> opt(IF ~> NOT ~> EXISTS) ~ ident ~ ontable ~(ident <~ ".").? ~ ident-
```
I suggest allowing a `db.table` name to be provided when creating materialized views, which will be helpful in the future.
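The `(ident <~ ".").? ~ ident` fragment in the suggestion reads as an optional database qualifier followed by a table name. A rough sketch of that shape follows, assuming plain unquoted identifiers. It is a regex-based stand-in using only the Scala standard library, not the parser-combinator code from the PR, and `TableName` and `TableNameParser` are hypothetical names.

```scala
// Hypothetical result type: an optional database plus a table name.
final case class TableName(database: Option[String], table: String)

object TableNameParser {
  // Assumed identifier shape: letters, digits and underscores, not starting with a digit.
  private val Ident = "[A-Za-z_][A-Za-z0-9_]*"
  private val Qualified = ("(" + Ident + ")\\.(" + Ident + ")").r
  private val Simple = ("(" + Ident + ")").r

  // Regex patterns in a match must cover the whole input, so "a.b.c" is rejected.
  def parse(text: String): Option[TableName] = text.trim match {
    case Qualified(db, table) => Some(TableName(Some(db), table))
    case Simple(table)        => Some(TableName(None, table))
    case _                    => None
  }
}
```

Keeping the database as an `Option` mirrors the `.?` in the suggestion: callers can fall back to the current database when the qualifier is absent.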
Indhumathi27 commented on a change in pull request #3609: [CARBONDATA-3689] Support independent MV extension and MV syntax
URL: https://github.com/apache/carbondata/pull/3609#discussion_r378677095

##########
File path: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/extension/MVParser.scala
##########
@@ -0,0 +1,226 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.mv.extension
+
+import scala.language.implicitConversions
+import scala.util.matching.Regex
+import scala.util.parsing.combinator.PackratParsers
+import scala.util.parsing.combinator.syntactical.StandardTokenParsers
+
+import org.apache.spark.sql.{DataFrame, SparkSession}
+import org.apache.spark.sql.catalyst.{CarbonParserUtil, SqlLexical, TableIdentifier}
+import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
+import org.apache.spark.sql.execution.command.AlterTableModel
+import org.apache.spark.sql.execution.command.datamap.{CarbonCreateDataMapCommand, CarbonDataMapRebuildCommand, CarbonDataMapShowCommand, CarbonDropDataMapCommand}
+import org.apache.spark.sql.execution.command.management.CarbonAlterTableCompactionCommand
+import org.apache.spark.sql.hive.CarbonMVRules
+import org.apache.spark.sql.util.{CarbonException, SparkSQLUtil}
+
+import org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException
+import org.apache.carbondata.mv.rewrite.MVUdf
+
+class MVParser extends StandardTokenParsers with PackratParsers {
+
+  // Keywords used in this parser
+  protected val SELECT: Regex = carbonKeyWord("SELECT")
+  protected val CREATE: Regex = carbonKeyWord("CREATE")
+  protected val MATERIALIZED: Regex = carbonKeyWord("MATERIALIZED")
+  protected val VIEW: Regex = carbonKeyWord("VIEW")
+  protected val VIEWS: Regex = carbonKeyWord("VIEWS")
+  protected val AS: Regex = carbonKeyWord("AS")
+  protected val DROP: Regex = carbonKeyWord("DROP")
+  protected val SHOW: Regex = carbonKeyWord("SHOW")
+  protected val IF: Regex = carbonKeyWord("IF")
+  protected val EXISTS: Regex = carbonKeyWord("EXISTS")
+  protected val NOT: Regex = carbonKeyWord("NOT")
+  protected val MVPROPERTIES: Regex = carbonKeyWord("MVPROPERTIES")
+  protected val WITH: Regex = carbonKeyWord("WITH")
+  protected val DEFERRED: Regex = carbonKeyWord("DEFERRED")
+  protected val REBUILD: Regex = carbonKeyWord("REBUILD")
+  protected val ON: Regex = carbonKeyWord("ON")
+  protected val TABLE: Regex = carbonKeyWord("TABLE")
+  protected val ALTER: Regex = carbonKeyWord("ALTER")
+  protected val COMPACT: Regex = carbonKeyWord("COMPACT")
+  protected val IN: Regex = carbonKeyWord("IN")
+  protected val SEGMENT: Regex = carbonKeyWord("SEGMENT")
+  protected val ID: Regex = carbonKeyWord("ID")
+  protected val WHERE: Regex = carbonKeyWord("WHERE")
+
+  /**
+   * This will convert key word to regular expression.
+   */
+  private def carbonKeyWord(keys: String): Regex = {
+    ("(?i)" + keys).r
+  }
+
+  implicit def regexToParser(regex: Regex): Parser[String] = {
+    import lexical.Identifier
+    acceptMatch(
+      s"identifier matching regex ${ regex }",
+      { case Identifier(str) if regex.unapplySeq(str).isDefined => str }
+    )
+  }
+
+  // By default, use Reflection to find the reserved words defined in the sub class.
+  // NOTICE, Since the Keyword properties defined by sub class, we couldn't call this
+  // method during the parent class instantiation, because the sub class instance
+  // isn't created yet.
+  protected lazy val reservedWords: Seq[String] =
+    this
+      .getClass
+      .getMethods
+      .filter(_.getReturnType == classOf[Keyword])
+      .map(_.invoke(this).asInstanceOf[Keyword].normalize)
+
+  // Set the keywords as empty by default, will change that later.
+  override val lexical = new SqlLexical
+
+  protected case class Keyword(str: String) {
+    def normalize: String = lexical.normalizeKeyword(str)
+    def parser: Parser[String] = normalize
+  }
+
+  def parse(input: String): LogicalPlan = {
+    synchronized {
+      phrase(start)(new lexical.Scanner(input)) match {
+        case Success(plan, _) =>
+          plan
+        case failureOrError =>
+          CarbonException.analysisException(failureOrError.toString)
+      }
+    }
+  }
+
+  private lazy val start: Parser[LogicalPlan] = mvCommand
+
+  private lazy val mvCommand: Parser[LogicalPlan] =
+    createMV | dropMV | showMV | rebuildMV | compactMV
+
+  /**
+   * CREATE MATERIALIZED VIEW IF NOT EXISTS mv_name
+   * MVPROPERTIES('KEY'='VALUE') AS mv_query_statement
+   */
+  private lazy val createMV: Parser[LogicalPlan] =
+    CREATE ~> MATERIALIZED ~> VIEW ~> opt(IF ~> NOT ~> EXISTS) ~ ident ~
+    opt(WITH ~> DEFERRED ~> REBUILD) ~
+    (MVPROPERTIES ~> "(" ~> repsep(options, ",") <~ ")").? ~
+    (AS ~> restInput).? <~ opt(";") ^^ {
+      case ifNotExists ~ mvName ~ deferredRebuild ~ mvProperties ~ query =>
+        val map = mvProperties.getOrElse(List[(String, String)]()).toMap[String, String]
+        CarbonCreateDataMapCommand(mvName, None, "mv", map, query,
+          ifNotExists.isDefined, deferredRebuild.isDefined)
+    }
+
+  /**
+   * DROP MATERIALIZED VIEW IF EXISTS mv_name
+   */
+  private lazy val dropMV: Parser[LogicalPlan] =
+    DROP ~> MATERIALIZED ~> VIEW ~> opt(IF ~> EXISTS) ~ ident <~ opt(";") ^^ {

Review comment:
```suggestion
    DROP ~> MATERIALIZED ~> VIEW ~> opt(IF ~> EXISTS) ~ ident ~ONTABLE ~ (ident <~ ".").? ~ ident<~ opt(";") ^^ {
```
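In the quoted diff, `carbonKeyWord` builds a case-insensitive regex for each keyword, and `regexToParser` accepts an identifier token only when `unapplySeq` succeeds, which requires the whole token to match. A minimal standalone sketch of that matching, using only the Scala standard library; `KeywordMatch`, `keyword`, and `matches` are illustrative names, not part of the PR.

```scala
import scala.util.matching.Regex

object KeywordMatch {
  // Same construction as carbonKeyWord in the quoted diff:
  // the (?i) flag makes the pattern case-insensitive.
  def keyword(word: String): Regex = ("(?i)" + word).r

  // Regex.unapplySeq is anchored, so it succeeds only when the
  // entire token matches the keyword.
  def matches(regex: Regex, token: String): Boolean =
    regex.unapplySeq(token).isDefined
}
```

Because the match is anchored, a `VIEW` pattern does not accept the token `VIEWS`, which is consistent with the parser declaring `VIEW` and `VIEWS` as separate keywords.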
Indhumathi27 commented on a change in pull request #3609: [CARBONDATA-3689] Support independent MV extension and MV syntax
URL: https://github.com/apache/carbondata/pull/3609#discussion_r378677616

##########
File path: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/extension/MVParser.scala
##########
@@ -0,0 +1,226 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.mv.extension
+
+import scala.language.implicitConversions
+import scala.util.matching.Regex
+import scala.util.parsing.combinator.PackratParsers
+import scala.util.parsing.combinator.syntactical.StandardTokenParsers
+
+import org.apache.spark.sql.{DataFrame, SparkSession}
+import org.apache.spark.sql.catalyst.{CarbonParserUtil, SqlLexical, TableIdentifier}
+import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
+import org.apache.spark.sql.execution.command.AlterTableModel
+import org.apache.spark.sql.execution.command.datamap.{CarbonCreateDataMapCommand, CarbonDataMapRebuildCommand, CarbonDataMapShowCommand, CarbonDropDataMapCommand}
+import org.apache.spark.sql.execution.command.management.CarbonAlterTableCompactionCommand
+import org.apache.spark.sql.hive.CarbonMVRules
+import org.apache.spark.sql.util.{CarbonException, SparkSQLUtil}
+
+import org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException
+import org.apache.carbondata.mv.rewrite.MVUdf
+
+class MVParser extends StandardTokenParsers with PackratParsers {
+
+  // Keywords used in this parser
+  protected val SELECT: Regex = carbonKeyWord("SELECT")
+  protected val CREATE: Regex = carbonKeyWord("CREATE")
+  protected val MATERIALIZED: Regex = carbonKeyWord("MATERIALIZED")
+  protected val VIEW: Regex = carbonKeyWord("VIEW")
+  protected val VIEWS: Regex = carbonKeyWord("VIEWS")
+  protected val AS: Regex = carbonKeyWord("AS")
+  protected val DROP: Regex = carbonKeyWord("DROP")
+  protected val SHOW: Regex = carbonKeyWord("SHOW")
+  protected val IF: Regex = carbonKeyWord("IF")
+  protected val EXISTS: Regex = carbonKeyWord("EXISTS")
+  protected val NOT: Regex = carbonKeyWord("NOT")
+  protected val MVPROPERTIES: Regex = carbonKeyWord("MVPROPERTIES")
+  protected val WITH: Regex = carbonKeyWord("WITH")
+  protected val DEFERRED: Regex = carbonKeyWord("DEFERRED")
+  protected val REBUILD: Regex = carbonKeyWord("REBUILD")
+  protected val ON: Regex = carbonKeyWord("ON")
+  protected val TABLE: Regex = carbonKeyWord("TABLE")
+  protected val ALTER: Regex = carbonKeyWord("ALTER")
+  protected val COMPACT: Regex = carbonKeyWord("COMPACT")
+  protected val IN: Regex = carbonKeyWord("IN")
+  protected val SEGMENT: Regex = carbonKeyWord("SEGMENT")
+  protected val ID: Regex = carbonKeyWord("ID")
+  protected val WHERE: Regex = carbonKeyWord("WHERE")
+
+  /**
+   * This will convert key word to regular expression.
+   */
+  private def carbonKeyWord(keys: String): Regex = {
+    ("(?i)" + keys).r
+  }
+
+  implicit def regexToParser(regex: Regex): Parser[String] = {
+    import lexical.Identifier
+    acceptMatch(
+      s"identifier matching regex ${ regex }",
+      { case Identifier(str) if regex.unapplySeq(str).isDefined => str }
+    )
+  }
+
+  // By default, use Reflection to find the reserved words defined in the sub class.
+  // NOTICE, Since the Keyword properties defined by sub class, we couldn't call this
+  // method during the parent class instantiation, because the sub class instance
+  // isn't created yet.
+  protected lazy val reservedWords: Seq[String] =
+    this
+      .getClass
+      .getMethods
+      .filter(_.getReturnType == classOf[Keyword])
+      .map(_.invoke(this).asInstanceOf[Keyword].normalize)
+
+  // Set the keywords as empty by default, will change that later.
+  override val lexical = new SqlLexical
+
+  protected case class Keyword(str: String) {
+    def normalize: String = lexical.normalizeKeyword(str)
+    def parser: Parser[String] = normalize
+  }
+
+  def parse(input: String): LogicalPlan = {
+    synchronized {
+      phrase(start)(new lexical.Scanner(input)) match {
+        case Success(plan, _) =>
+          plan
+        case failureOrError =>
+          CarbonException.analysisException(failureOrError.toString)
+      }
+    }
+  }
+
+  private lazy val start: Parser[LogicalPlan] = mvCommand
+
+  private lazy val mvCommand: Parser[LogicalPlan] =
+    createMV | dropMV | showMV | rebuildMV | compactMV
+
+  /**
+   * CREATE MATERIALIZED VIEW IF NOT EXISTS mv_name
+   * MVPROPERTIES('KEY'='VALUE') AS mv_query_statement
+   */
+  private lazy val createMV: Parser[LogicalPlan] =
+    CREATE ~> MATERIALIZED ~> VIEW ~> opt(IF ~> NOT ~> EXISTS) ~ ident ~
+    opt(WITH ~> DEFERRED ~> REBUILD) ~
+    (MVPROPERTIES ~> "(" ~> repsep(options, ",") <~ ")").? ~
+    (AS ~> restInput).? <~ opt(";") ^^ {
+      case ifNotExists ~ mvName ~ deferredRebuild ~ mvProperties ~ query =>
+        val map = mvProperties.getOrElse(List[(String, String)]()).toMap[String, String]
+        CarbonCreateDataMapCommand(mvName, None, "mv", map, query,
+          ifNotExists.isDefined, deferredRebuild.isDefined)
+    }
+
+  /**
+   * DROP MATERIALIZED VIEW IF EXISTS mv_name
+   */
+  private lazy val dropMV: Parser[LogicalPlan] =
+    DROP ~> MATERIALIZED ~> VIEW ~> opt(IF ~> EXISTS) ~ ident <~ opt(";") ^^ {
+      case ifExits ~ mvName =>
+        CarbonDropDataMapCommand(mvName, ifExits.isDefined, None)
+    }
+
+  /**
+   * SHOW MATERIALIZED VIEWS
+   */
+  private lazy val showMV: Parser[LogicalPlan] =
+    SHOW ~> MATERIALIZED ~> VIEWS ~> opt(onTable) <~ opt(";") ^^ {
+      case tableIdent =>
+        CarbonDataMapShowCommand(tableIdent)
+    }
+
+  /**
+   * REBUILD MATERIALIZED VIEW mv_name
+   */
+  private lazy val rebuildMV: Parser[LogicalPlan] =
+    REBUILD ~> MATERIALIZED ~> VIEW ~> ident <~ opt(";") ^^ {

Review comment:
```suggestion
    REBUILD ~> MATERIALIZED ~> VIEW ~> ident ~ ONTABLE ~ (ident <~ ".").? ~ ident<~ opt(";") ^^ {
```
jackylk commented on a change in pull request #3609: [CARBONDATA-3689] Support independent MV extension and MV syntax
URL: https://github.com/apache/carbondata/pull/3609#discussion_r378853691

##########
File path: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/extension/MVParser.scala
##########
@@ -0,0 +1,226 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.mv.extension
+
+import scala.language.implicitConversions
+import scala.util.matching.Regex
+import scala.util.parsing.combinator.PackratParsers
+import scala.util.parsing.combinator.syntactical.StandardTokenParsers
+
+import org.apache.spark.sql.{DataFrame, SparkSession}
+import org.apache.spark.sql.catalyst.{CarbonParserUtil, SqlLexical, TableIdentifier}
+import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
+import org.apache.spark.sql.execution.command.AlterTableModel
+import org.apache.spark.sql.execution.command.datamap.{CarbonCreateDataMapCommand, CarbonDataMapRebuildCommand, CarbonDataMapShowCommand, CarbonDropDataMapCommand}
+import org.apache.spark.sql.execution.command.management.CarbonAlterTableCompactionCommand
+import org.apache.spark.sql.hive.CarbonMVRules
+import org.apache.spark.sql.util.{CarbonException, SparkSQLUtil}
+
+import org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException
+import org.apache.carbondata.mv.rewrite.MVUdf
+
+class MVParser extends StandardTokenParsers with PackratParsers {
+
+  // Keywords used in this parser
+  protected val SELECT: Regex = carbonKeyWord("SELECT")
+  protected val CREATE: Regex = carbonKeyWord("CREATE")
+  protected val MATERIALIZED: Regex = carbonKeyWord("MATERIALIZED")
+  protected val VIEW: Regex = carbonKeyWord("VIEW")
+  protected val VIEWS: Regex = carbonKeyWord("VIEWS")
+  protected val AS: Regex = carbonKeyWord("AS")
+  protected val DROP: Regex = carbonKeyWord("DROP")
+  protected val SHOW: Regex = carbonKeyWord("SHOW")
+  protected val IF: Regex = carbonKeyWord("IF")
+  protected val EXISTS: Regex = carbonKeyWord("EXISTS")
+  protected val NOT: Regex = carbonKeyWord("NOT")
+  protected val MVPROPERTIES: Regex = carbonKeyWord("MVPROPERTIES")
+  protected val WITH: Regex = carbonKeyWord("WITH")
+  protected val DEFERRED: Regex = carbonKeyWord("DEFERRED")
+  protected val REBUILD: Regex = carbonKeyWord("REBUILD")
+  protected val ON: Regex = carbonKeyWord("ON")
+  protected val TABLE: Regex = carbonKeyWord("TABLE")
+  protected val ALTER: Regex = carbonKeyWord("ALTER")
+  protected val COMPACT: Regex = carbonKeyWord("COMPACT")
+  protected val IN: Regex = carbonKeyWord("IN")
+  protected val SEGMENT: Regex = carbonKeyWord("SEGMENT")
+  protected val ID: Regex = carbonKeyWord("ID")
+  protected val WHERE: Regex = carbonKeyWord("WHERE")
+
+  /**
+   * This will convert key word to regular expression.
+   */
+  private def carbonKeyWord(keys: String): Regex = {
+    ("(?i)" + keys).r
+  }
+
+  implicit def regexToParser(regex: Regex): Parser[String] = {
+    import lexical.Identifier
+    acceptMatch(
+      s"identifier matching regex ${ regex }",
+      { case Identifier(str) if regex.unapplySeq(str).isDefined => str }
+    )
+  }
+
+  // By default, use Reflection to find the reserved words defined in the sub class.
+  // NOTICE, Since the Keyword properties defined by sub class, we couldn't call this
+  // method during the parent class instantiation, because the sub class instance
+  // isn't created yet.
+  protected lazy val reservedWords: Seq[String] =
+    this
+      .getClass
+      .getMethods
+      .filter(_.getReturnType == classOf[Keyword])
+      .map(_.invoke(this).asInstanceOf[Keyword].normalize)
+
+  // Set the keywords as empty by default, will change that later.
+  override val lexical = new SqlLexical
+
+  protected case class Keyword(str: String) {
+    def normalize: String = lexical.normalizeKeyword(str)
+    def parser: Parser[String] = normalize
+  }
+
+  def parse(input: String): LogicalPlan = {
+    synchronized {
+      phrase(start)(new lexical.Scanner(input)) match {
+        case Success(plan, _) =>
+          plan
+        case failureOrError =>
+          CarbonException.analysisException(failureOrError.toString)
+      }
+    }
+  }
+
+  private lazy val start: Parser[LogicalPlan] = mvCommand
+
+  private lazy val mvCommand: Parser[LogicalPlan] =
+    createMV | dropMV | showMV | rebuildMV | compactMV
+
+  /**
+   * CREATE MATERIALIZED VIEW IF NOT EXISTS mv_name
+   * MVPROPERTIES('KEY'='VALUE') AS mv_query_statement
+   */
+  private lazy val createMV: Parser[LogicalPlan] =
+    CREATE ~> MATERIALIZED ~> VIEW ~> opt(IF ~> NOT ~> EXISTS) ~ ident ~
+    opt(WITH ~> DEFERRED ~> REBUILD) ~
+    (MVPROPERTIES ~> "(" ~> repsep(options, ",") <~ ")").? ~
+    (AS ~> restInput).? <~ opt(";") ^^ {
+      case ifNotExists ~ mvName ~ deferredRebuild ~ mvProperties ~ query =>
+        val map = mvProperties.getOrElse(List[(String, String)]()).toMap[String, String]
+        CarbonCreateDataMapCommand(mvName, None, "mv", map, query,
+          ifNotExists.isDefined, deferredRebuild.isDefined)
+    }
+
+  /**
+   * DROP MATERIALIZED VIEW IF EXISTS mv_name
+   */
+  private lazy val dropMV: Parser[LogicalPlan] =
+    DROP ~> MATERIALIZED ~> VIEW ~> opt(IF ~> EXISTS) ~ ident <~ opt(";") ^^ {

Review comment:
ON TABLE is not required, because an MV can be defined on a join, so I think it is better not to force the user to specify it.
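To illustrate the review comment on dropMV, here is a minimal sketch of a DROP statement that identifies the materialized view by name alone, with no ON TABLE clause, mirroring the grammar quoted in the diff. It is an assumption-laden illustration, not the PR's code: it uses a plain regular expression instead of the parser combinators in MVParser, and the names DropMV, DropMVSketch and parseDrop are hypothetical and do not exist in CarbonData.

```scala
import scala.util.matching.Regex

// Hypothetical result type for this sketch only; the PR itself produces a
// LogicalPlan command rather than a case class like this.
final case class DropMV(name: String, ifExists: Boolean)

object DropMVSketch {
  // Case-insensitive pattern for: DROP MATERIALIZED VIEW [IF EXISTS] mv_name [;]
  // There is deliberately no ON TABLE clause: the view is identified by name alone,
  // which also works when the view is defined over a join of several tables.
  private val DropPattern: Regex =
    """(?i)\s*DROP\s+MATERIALIZED\s+VIEW\s+(IF\s+EXISTS\s+)?([A-Za-z_][A-Za-z0-9_]*)\s*;?\s*""".r

  // A Regex used as a pattern must match the whole input, so trailing clauses
  // such as "ON TABLE t" make the statement fall through to None.
  def parseDrop(sql: String): Option[DropMV] = sql match {
    // An optional group that did not participate in the match is bound to null.
    case DropPattern(ifExists, name) => Some(DropMV(name, ifExists != null))
    case _ => None
  }
}
```

With this sketch, `DropMVSketch.parseDrop("drop materialized view if exists mv1;")` yields `Some(DropMV("mv1", true))`, while a statement with a trailing `ON TABLE t` yields `None`, because the pattern has to cover the entire input.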