[GitHub] carbondata pull request #2478: [CARBONDATA-2540][CARBONDATA-2560][CARBONDATA...

qiuchenjian-2
GitHub user ravipesala opened a pull request:

    https://github.com/apache/carbondata/pull/2478

    [CARBONDATA-2540][CARBONDATA-2560][CARBONDATA-2568][MV] Add validations for unsupported MV queries

    This PR depends on https://github.com/apache/carbondata/pull/2453
    Problem: Validation is missing for unsupported MV queries when creating an MV datamap.
    Solution: Added validation for unsupported MV queries when creating a datamap.
   
    Be sure to do all of the following checklist to help us incorporate
    your contribution quickly and easily:
   
     - [ ] Any interfaces changed?
     
     - [ ] Any backward compatibility impacted?
     
     - [ ] Document update required?
   
     - [ ] Testing done
            Please provide details on
            - Whether new unit test cases have been added or why no new tests are required?
            - How it is tested? Please attach test report.
            - Is it a performance related change? Please attach the performance test report.
            - Any additional information to help reviewers in testing this change.
           
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
   


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ravipesala/incubator-carbondata pr-2540

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2478.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2478
   
----
commit 844cff764e94f18ff8d115ccd636a06a0931bffb
Author: ravipesala <ravi.pesala@...>
Date:   2018-06-14T06:10:07Z

    Fixed order by in mv and aggregation functions inside projection expressions are fixed

commit 642d05b9dab3a001c2338a5dca920d952f0ccd5c
Author: ravipesala <ravi.pesala@...>
Date:   2018-07-09T10:19:57Z

    Added validation for unsupported MV queries

----


---
[GitHub] carbondata issue #2478: [CARBONDATA-2540][CARBONDATA-2560][CARBONDATA-2568][...

Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2478
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6987/



---
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2478
 
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5765/



---
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2478
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5748/



---
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2478
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7192/



---
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2478
 
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5967/



---
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2478
 
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5854/



---
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2478
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5858/



---
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2478
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7245/



---
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2478
 
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6020/



---
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2478
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7289/



---
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2478
 
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6056/



---
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2478
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5917/



---
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2478#discussion_r204432239
 
    --- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala ---
    @@ -118,6 +122,43 @@ object MVHelper {
         DataMapStoreManager.getInstance().saveDataMapSchema(dataMapSchema)
       }
     
    +  private def validateMVQuery(sparkSession: SparkSession,
    +      logicalPlan: LogicalPlan) {
    +    val dataMapProvider = DataMapManager.get().getDataMapProvider(null,
    +      new DataMapSchema("", DataMapClassProvider.MV.getShortName), sparkSession)
    +    dataMapProvider
    +    var catalog = DataMapStoreManager.getInstance().getDataMapCatalog(dataMapProvider,
    +      DataMapClassProvider.MV.getShortName).asInstanceOf[SummaryDatasetCatalog]
    +    if (catalog == null) {
    +      catalog = new SummaryDatasetCatalog(sparkSession)
    +    }
    +    val modularPlan =
    +      catalog.mvSession.sessionState.modularizer.modularize(
    +        catalog.mvSession.sessionState.optimizer.execute(logicalPlan)).next().semiHarmonized
    +
    +    val isValid = modularPlan match {
    +      case g: GroupBy =>
    +        // Make sure all predicates are present in projections.
    +        g.predicateList.forall{p =>
    +          g.outputList.exists{
    +            case a: Alias =>
    +              a.semanticEquals(p) || a.child.semanticEquals(p)
    +            case other => other.semanticEquals(p)
    +          }
    +        }
    +      case _ => true
    +    }
    +    if (!isValid) {
    +      throw new UnsupportedOperationException("Group by columns must be present in project columns")
    +    }
    +    if(catalog.isMVWithSameQueryPresent(logicalPlan)) {
    --- End diff --
   
    add space after `if`
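
The GroupBy check in the diff above can be illustrated in isolation. The following is only a minimal sketch under stated assumptions: `Expr`, `Column`, `Alias`, `Sum`, and `GroupByValidation` are hypothetical stand-ins rather than CarbonData or Spark classes, and plain case-class equality takes the place of Catalyst's `semanticEquals`.

```scala
// Hypothetical, simplified expression types; not the Catalyst or CarbonData classes.
sealed trait Expr
final case class Column(name: String) extends Expr
final case class Alias(child: Expr, name: String) extends Expr
final case class Sum(child: Expr) extends Expr

object GroupByValidation {
  // True when every group-by expression also appears in the output list,
  // either directly or as the child of an alias.
  // Assumption: structural equality stands in for semanticEquals.
  def groupByColumnsProjected(groupBy: Seq[Expr], output: Seq[Expr]): Boolean =
    groupBy.forall { g =>
      output.exists {
        case a: Alias => a == g || a.child == g
        case other => other == g
      }
    }

  // Mirrors the shape of the check in the diff: reject the query when a
  // group-by column is missing from the projection.
  def validate(groupBy: Seq[Expr], output: Seq[Expr]): Unit = {
    if (!groupByColumnsProjected(groupBy, output)) {
      throw new UnsupportedOperationException(
        "Group by columns must be present in project columns")
    }
  }
}
```

With these stand-ins, grouping by `Column("city")` passes when the output holds `Alias(Column("city"), "c")` and fails when the output holds only `Sum(Column("sales"))`.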


---
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2478#discussion_r204434793
 
    --- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala ---
    @@ -118,6 +122,43 @@ object MVHelper {
         DataMapStoreManager.getInstance().saveDataMapSchema(dataMapSchema)
       }
     
    +  private def validateMVQuery(sparkSession: SparkSession,
    +      logicalPlan: LogicalPlan) {
    --- End diff --
   
    add `: Unit`
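
For context on this suggestion, here is a small sketch of the difference an explicit `: Unit =` makes. The object and method names are hypothetical and unrelated to `MVHelper`; the example only contrasts a declared `Unit` result with an inferred one.

```scala
object UnitReturn {
  private var count = 0

  // Explicit `: Unit =`: the signature states that the method is called
  // only for its side effect, and the body's last value is discarded.
  def increment(): Unit = {
    count += 1
  }

  // No declared result type: the compiler infers Int from the last expression.
  def incrementAndGet() = {
    count += 1
    count
  }

  def current: Int = count
}
```

Declaring the result type keeps the signature stable even if the body later ends in a non-`Unit` expression.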


---
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2478#discussion_r204435022
 
    --- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala ---
    @@ -118,6 +122,43 @@ object MVHelper {
         DataMapStoreManager.getInstance().saveDataMapSchema(dataMapSchema)
       }
     
    +  private def validateMVQuery(sparkSession: SparkSession,
    +      logicalPlan: LogicalPlan) {
    +    val dataMapProvider = DataMapManager.get().getDataMapProvider(null,
    +      new DataMapSchema("", DataMapClassProvider.MV.getShortName), sparkSession)
    +    dataMapProvider
    --- End diff --
   
    useless statement


---
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2478#discussion_r204435669
 
    --- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala ---
    @@ -118,6 +122,43 @@ object MVHelper {
         DataMapStoreManager.getInstance().saveDataMapSchema(dataMapSchema)
       }
     
    +  private def validateMVQuery(sparkSession: SparkSession,
    +      logicalPlan: LogicalPlan) {
    +    val dataMapProvider = DataMapManager.get().getDataMapProvider(null,
    +      new DataMapSchema("", DataMapClassProvider.MV.getShortName), sparkSession)
    +    dataMapProvider
    +    var catalog = DataMapStoreManager.getInstance().getDataMapCatalog(dataMapProvider,
    +      DataMapClassProvider.MV.getShortName).asInstanceOf[SummaryDatasetCatalog]
    --- End diff --
   
    Is casting `null` to `SummaryDatasetCatalog` ok?
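
As background for this question: in Scala, `asInstanceOf` applied to a `null` reference yields `null` for reference types rather than throwing, so the null check that follows the cast in the diff still works. A minimal sketch, where `Catalog`, `lookup`, and `catalogOrDefault` are hypothetical names and not part of CarbonData:

```scala
object NullCast {
  // Hypothetical stand-in for a catalog class.
  final class Catalog(val name: String)

  // Stand-in for a Java-style lookup that may return null.
  def lookup(found: Boolean): AnyRef =
    if (found) new Catalog("mv") else null

  // Casting a null reference to a reference type returns null, so the
  // fallback below is reached instead of a ClassCastException.
  def catalogOrDefault(found: Boolean): Catalog = {
    val cast = lookup(found).asInstanceOf[Catalog]
    Option(cast).getOrElse(new Catalog("default"))
  }
}
```

Wrapping the result in `Option(...)` is one way to avoid the mutable `var` plus null check; whether that fits the surrounding code is a judgment call for the author.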


---
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2478#discussion_r204435991
 
    --- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala ---
    @@ -118,6 +122,43 @@ object MVHelper {
         DataMapStoreManager.getInstance().saveDataMapSchema(dataMapSchema)
       }
     
    +  private def validateMVQuery(sparkSession: SparkSession,
    +      logicalPlan: LogicalPlan) {
    +    val dataMapProvider = DataMapManager.get().getDataMapProvider(null,
    +      new DataMapSchema("", DataMapClassProvider.MV.getShortName), sparkSession)
    +    dataMapProvider
    +    var catalog = DataMapStoreManager.getInstance().getDataMapCatalog(dataMapProvider,
    +      DataMapClassProvider.MV.getShortName).asInstanceOf[SummaryDatasetCatalog]
    +    if (catalog == null) {
    +      catalog = new SummaryDatasetCatalog(sparkSession)
    +    }
    +    val modularPlan =
    +      catalog.mvSession.sessionState.modularizer.modularize(
    +        catalog.mvSession.sessionState.optimizer.execute(logicalPlan)).next().semiHarmonized
    +
    +    val isValid = modularPlan match {
    +      case g: GroupBy =>
    +        // Make sure all predicates are present in projections.
    +        g.predicateList.forall{p =>
    +          g.outputList.exists{
    +            case a: Alias =>
    +              a.semanticEquals(p) || a.child.semanticEquals(p)
    +            case other => other.semanticEquals(p)
    +          }
    +        }
    +      case _ => true
    +    }
    +    if (!isValid) {
    +      throw new UnsupportedOperationException("Group by columns must be present in project columns")
    +    }
    +    if(catalog.isMVWithSameQueryPresent(logicalPlan)) {
    +      throw new UnsupportedOperationException("MV with same query present")
    +    }
    +    if (!modularPlan.isSPJGH)  {
    --- End diff --
   
    I think it is better to move this before line 139


---
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2478#discussion_r204436654
 
    --- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala ---
    @@ -118,6 +122,43 @@ object MVHelper {
         DataMapStoreManager.getInstance().saveDataMapSchema(dataMapSchema)
       }
     
    +  private def validateMVQuery(sparkSession: SparkSession,
    +      logicalPlan: LogicalPlan) {
    +    val dataMapProvider = DataMapManager.get().getDataMapProvider(null,
    +      new DataMapSchema("", DataMapClassProvider.MV.getShortName), sparkSession)
    +    dataMapProvider
    +    var catalog = DataMapStoreManager.getInstance().getDataMapCatalog(dataMapProvider,
    +      DataMapClassProvider.MV.getShortName).asInstanceOf[SummaryDatasetCatalog]
    +    if (catalog == null) {
    +      catalog = new SummaryDatasetCatalog(sparkSession)
    +    }
    +    val modularPlan =
    +      catalog.mvSession.sessionState.modularizer.modularize(
    +        catalog.mvSession.sessionState.optimizer.execute(logicalPlan)).next().semiHarmonized
    +
    +    val isValid = modularPlan match {
    +      case g: GroupBy =>
    +        // Make sure all predicates are present in projections.
    +        g.predicateList.forall{p =>
    +          g.outputList.exists{
    +            case a: Alias =>
    +              a.semanticEquals(p) || a.child.semanticEquals(p)
    +            case other => other.semanticEquals(p)
    +          }
    +        }
    +      case _ => true
    +    }
    +    if (!isValid) {
    +      throw new UnsupportedOperationException("Group by columns must be present in project columns")
    +    }
    +    if(catalog.isMVWithSameQueryPresent(logicalPlan)) {
    +      throw new UnsupportedOperationException("MV with same query present")
    +    }
    +    if (!modularPlan.isSPJGH)  {
    +      throw new UnsupportedOperationException("MV is not supported for this query")
    --- End diff --
   
    Can you make the message more specific, like `Only Select-Predicate-Join-Groupby-Having query is supported for MV`?


---
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2478#discussion_r204479220
 
    --- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala ---
    @@ -118,6 +122,43 @@ object MVHelper {
         DataMapStoreManager.getInstance().saveDataMapSchema(dataMapSchema)
       }
     
    +  private def validateMVQuery(sparkSession: SparkSession,
    +      logicalPlan: LogicalPlan) {
    +    val dataMapProvider = DataMapManager.get().getDataMapProvider(null,
    +      new DataMapSchema("", DataMapClassProvider.MV.getShortName), sparkSession)
    +    dataMapProvider
    +    var catalog = DataMapStoreManager.getInstance().getDataMapCatalog(dataMapProvider,
    +      DataMapClassProvider.MV.getShortName).asInstanceOf[SummaryDatasetCatalog]
    +    if (catalog == null) {
    +      catalog = new SummaryDatasetCatalog(sparkSession)
    +    }
    +    val modularPlan =
    +      catalog.mvSession.sessionState.modularizer.modularize(
    +        catalog.mvSession.sessionState.optimizer.execute(logicalPlan)).next().semiHarmonized
    +
    +    val isValid = modularPlan match {
    +      case g: GroupBy =>
    +        // Make sure all predicates are present in projections.
    +        g.predicateList.forall{p =>
    +          g.outputList.exists{
    +            case a: Alias =>
    +              a.semanticEquals(p) || a.child.semanticEquals(p)
    +            case other => other.semanticEquals(p)
    +          }
    +        }
    +      case _ => true
    +    }
    +    if (!isValid) {
    +      throw new UnsupportedOperationException("Group by columns must be present in project columns")
    +    }
    +    if(catalog.isMVWithSameQueryPresent(logicalPlan)) {
    +      throw new UnsupportedOperationException("MV with same query present")
    +    }
    +    if (!modularPlan.isSPJGH)  {
    +      throw new UnsupportedOperationException("MV is not supported for this query")
    --- End diff --
   
    ok


---