[GitHub] carbondata pull request #2083: [WIP]Pre-Aggregate Streaming table handling

classic Classic list List threaded Threaded
76 messages Options
1234
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2083: [CARBONDATA-2269]Support Query On PreAggregate table...

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2083
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/4537/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2083: [CARBONDATA-2269]Support Query On PreAggregate table...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2083
 
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/3309/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2083: [CARBONDATA-2269]Support Query On PreAggregate table...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2083
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/4540/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2083: [CARBONDATA-2269]Support Query On PreAggregate table...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2083
 
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/3312/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2083: [CARBONDATA-2269]Support Query On PreAggregate table...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2083
 
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4036/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2083: [CARBONDATA-2269]Support Query On PreAggregate table...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2083
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4037/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2083: [CARBONDATA-2269]Support Query On PreAggregate table...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2083
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4039/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2083: [CARBONDATA-2269]Support Query On PreAggregate table...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2083
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/4591/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2083: [CARBONDATA-2269]Support Query On PreAggregate table...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2083
 
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/3365/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2083: [CARBONDATA-2269]Support Query On PreAggregate table...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2083
 
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/3367/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2083: [CARBONDATA-2269]Support Query On PreAggregat...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user manishgupta88 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2083#discussion_r176990583
 
    --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java ---
    @@ -143,11 +143,15 @@ protected CarbonTable getOrCreateCarbonTable(Configuration configuration) throws
         SegmentStatusManager segmentStatusManager = new SegmentStatusManager(identifier);
         SegmentStatusManager.ValidAndInvalidSegmentsInfo segments =
             segmentStatusManager.getValidAndInvalidSegments(loadMetadataDetails);
    -
    -    if (getValidateSegmentsToAccess(job.getConfiguration())) {
    +    // if for streaming table only access is streaming table is enabled access then access
    +    // only streaming segments this will be
    --- End diff --
   
    This comment is not clear. Kindly re write it for better understanding


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2083: [CARBONDATA-2269]Support Query On PreAggregat...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user manishgupta88 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2083#discussion_r176992476
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/hive/CarbonPreAggregateRules.scala ---
    @@ -475,6 +603,25 @@ case class CarbonPreAggregateQueryRules(sparkSession: SparkSession) extends Rule
                           CarbonReflectionUtils
                             .getSubqueryAlias(sparkSession, Some(alias2), newChild, None),
                           None)))
    +              if(carbonTable.isStreamingTable) {
    +                setSegmentsForStreaming(carbonTable, aggDataMapSchema)
    +                // get new fact expression
    +                val factExp = updateFactTablePlanForStreaming(agg)
    +                // get new Aggregate node expression
    +                val streamingNodeExp = getExpressionsForStreaming(aggExp)
    +                // clear the expression as in case of streaming it is not required
    +                updatedExpression.clear
    +                // Add Aggregate node to aggregate data from fact and aggregate
    +                Aggregate(
    +                  grExp,
    +                  streamingNodeExp.asInstanceOf[Seq[NamedExpression]],
    +                  // add union node to get the result from both
    +                  Union(
    +                    factExp,
    +                    updateAggPlan))
    +              } else {
    +                updateAggPlan
    +              }
    --- End diff --
   
    The code modification is same at above 3 places and here. Can you extract it to a common method


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2083: [CARBONDATA-2269]Support Query On PreAggregate table...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2083
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/4593/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2083: [CARBONDATA-2269]Support Query On PreAggregat...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user kumarvishal09 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2083#discussion_r177004075
 
    --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java ---
    @@ -143,11 +143,15 @@ protected CarbonTable getOrCreateCarbonTable(Configuration configuration) throws
         SegmentStatusManager segmentStatusManager = new SegmentStatusManager(identifier);
         SegmentStatusManager.ValidAndInvalidSegmentsInfo segments =
             segmentStatusManager.getValidAndInvalidSegments(loadMetadataDetails);
    -
    -    if (getValidateSegmentsToAccess(job.getConfiguration())) {
    +    // if for streaming table only access is streaming table is enabled access then access
    +    // only streaming segments this will be
    --- End diff --
   
    ok


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2083: [CARBONDATA-2269]Support Query On PreAggregat...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user kumarvishal09 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2083#discussion_r177004129
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/hive/CarbonPreAggregateRules.scala ---
    @@ -475,6 +603,25 @@ case class CarbonPreAggregateQueryRules(sparkSession: SparkSession) extends Rule
                           CarbonReflectionUtils
                             .getSubqueryAlias(sparkSession, Some(alias2), newChild, None),
                           None)))
    +              if(carbonTable.isStreamingTable) {
    +                setSegmentsForStreaming(carbonTable, aggDataMapSchema)
    +                // get new fact expression
    +                val factExp = updateFactTablePlanForStreaming(agg)
    +                // get new Aggregate node expression
    +                val streamingNodeExp = getExpressionsForStreaming(aggExp)
    +                // clear the expression as in case of streaming it is not required
    +                updatedExpression.clear
    +                // Add Aggregate node to aggregate data from fact and aggregate
    +                Aggregate(
    +                  grExp,
    +                  streamingNodeExp.asInstanceOf[Seq[NamedExpression]],
    +                  // add union node to get the result from both
    +                  Union(
    +                    factExp,
    +                    updateAggPlan))
    +              } else {
    +                updateAggPlan
    +              }
    --- End diff --
   
    ok


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2083: [CARBONDATA-2269]Support Query On PreAggregate table...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2083
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4089/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2083: [CARBONDATA-2269]Support Query On PreAggregate table...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2083
 
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4090/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2083: [CARBONDATA-2269]Support Query On PreAggregate table...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2083
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4091/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2083: [CARBONDATA-2269]Support Query On PreAggregat...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2083#discussion_r177143560
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java ---
    @@ -1618,6 +1618,15 @@
        */
       public static final String CARBON_INVISIBLE_SEGMENTS_PRESERVE_COUNT_DEFAULT = "200";
     
    +  /**
    +   * Internal configuration for running query on pre aggregate created on streaming table
    +   * this will be used when plan got changed from main table to pre aggregate for streaming
    +   * table.
    +   */
    +  public static final String CARBON_STREAMING_SEGMENT = "carbon.streaming.segment";
    --- End diff --
   
    If this is not set by user, move it to the class use it


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2083: [CARBONDATA-2269]Support Query On PreAggregat...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2083#discussion_r177145297
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/hive/CarbonPreAggregateRules.scala ---
    @@ -296,14 +370,33 @@ case class CarbonPreAggregateQueryRules(sparkSession: SparkSession) extends Rule
                       carbonTable,
                       agg)
                   isPlanUpdated = true
    -              Aggregate(updatedGroupExp,
    +              val updateAggPlan = Aggregate(updatedGroupExp,
                     updatedAggExp,
                     CarbonReflectionUtils
                       .getSubqueryAlias(sparkSession,
                         Some(alias1),
                         CarbonReflectionUtils
                           .getSubqueryAlias(sparkSession, Some(alias2), newChild, None),
                         None))
    +              if(carbonTable.isStreamingTable) {
    --- End diff --
   
    space after `if`


---
1234