[GitHub] carbondata pull request #2665: [CARBONDATA-2897][DataMap] Optimize datamap c...

classic Classic list List threaded Threaded
62 messages Options
1234
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2665: [CARBONDATA-2897][DataMap] Optimize datamap c...

qiuchenjian-2
Github user kevinjmh commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2665#discussion_r218679293
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/datamap/DataMapChooser.java ---
    @@ -268,23 +238,38 @@ private ExpressionTuple selectDataMap(Expression expression, List<TableDataMap>
     
       private void extractColumnExpression(Expression expression,
           List<ColumnExpression> columnExpressions) {
    -    if (expression instanceof ColumnExpression) {
    -      columnExpressions.add((ColumnExpression) expression);
    -    } else if (expression instanceof MatchExpression) {
    -      // this is a special case for lucene
    -      // build a fake ColumnExpression to filter datamaps which contain target column
    -      // a Lucene query string is alike "column:query term"
    -      String[] queryItems = expression.getString().split(":", 2);
    -      if (queryItems.length == 2) {
    -        columnExpressions.add(new ColumnExpression(queryItems[0], null));
    -      }
    -    } else if (expression != null) {
    -      List<Expression> children = expression.getChildren();
    -      if (children != null && children.size() > 0) {
    -        for (Expression exp : children) {
    -          extractColumnExpression(exp, columnExpressions);
    +    switch (expression.getFilterExpressionType()) {
    --- End diff --
   
    Change to check by method `isSupport` in datamap factory


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2665: [CARBONDATA-2897][DataMap] Optimize datamap chooser

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2665
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/346/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2665: [CARBONDATA-2897][DataMap] Optimize datamap chooser

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2665
 
    Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/8593/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2665: [CARBONDATA-2897][DataMap] Optimize datamap chooser

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2665
 
    Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/523/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2665: [CARBONDATA-2897][DataMap] Optimize datamap chooser

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user kevinjmh commented on the issue:

    https://github.com/apache/carbondata/pull/2665
 
    retest this please


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2665: [CARBONDATA-2897][DataMap] Optimize datamap chooser

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2665
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/359/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2665: [CARBONDATA-2897][DataMap] Optimize datamap chooser

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2665
 
    Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/8606/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2665: [CARBONDATA-2897][DataMap] Optimize datamap chooser

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2665
 
    Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/536/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2665: [CARBONDATA-2897][DataMap] Optimize datamap c...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user xuchuanyin commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2665#discussion_r219066781
 
    --- Diff: datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapFactory.java ---
    @@ -123,6 +129,42 @@ public BloomCoarseGrainDataMapFactory(CarbonTable carbonTable, DataMapSchema dat
         }
       }
     
    +  @Override
    +  public boolean isSupport(Expression expression) {
    +    ColumnExpression filterColumn = null;
    +    // First check type of child nodes, and get filter column name
    +    switch (expression.getFilterExpressionType()) {
    +      case IN:
    +        InExpression inExpr = (InExpression) expression;
    +        if (inExpr.getLeft() instanceof ColumnExpression &&
    +            inExpr.getRight() instanceof ListExpression) {
    +          filterColumn = (ColumnExpression) inExpr.getLeft();
    +        } else if (inExpr.getRight() instanceof ColumnExpression &&
    +            inExpr.getLeft() instanceof ListExpression) {
    +          filterColumn = (ColumnExpression) inExpr.getRight();
    +        }
    +        break;
    +      case EQUALS:
    +        EqualToExpression equalToExpr = (EqualToExpression) expression;
    +        if (equalToExpr.getLeft() instanceof ColumnExpression &&
    +            equalToExpr.getRight() instanceof LiteralExpression) {
    +          filterColumn = (ColumnExpression) equalToExpr.getLeft();
    +        } else if (equalToExpr.getRight() instanceof ColumnExpression &&
    +            equalToExpr.getLeft() instanceof LiteralExpression) {
    +          filterColumn = (ColumnExpression) equalToExpr.getRight();
    +        }
    +        break;
    +      default:
    +        break;
    +    }
    +    // Then check if filter column is in index columns
    +    if (null != filterColumn && dataMapMeta.getIndexedColumnNames().contains(
    --- End diff --
   
    will there be a case-sensitive problem here?


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2665: [CARBONDATA-2897][DataMap] Optimize datamap c...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user xuchuanyin commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2665#discussion_r219066622
 
    --- Diff: datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapFactory.java ---
    @@ -123,6 +129,42 @@ public BloomCoarseGrainDataMapFactory(CarbonTable carbonTable, DataMapSchema dat
         }
       }
     
    +  @Override
    +  public boolean isSupport(Expression expression) {
    +    ColumnExpression filterColumn = null;
    +    // First check type of child nodes, and get filter column name
    +    switch (expression.getFilterExpressionType()) {
    +      case IN:
    +        InExpression inExpr = (InExpression) expression;
    +        if (inExpr.getLeft() instanceof ColumnExpression &&
    +            inExpr.getRight() instanceof ListExpression) {
    +          filterColumn = (ColumnExpression) inExpr.getLeft();
    +        } else if (inExpr.getRight() instanceof ColumnExpression &&
    +            inExpr.getLeft() instanceof ListExpression) {
    +          filterColumn = (ColumnExpression) inExpr.getRight();
    +        }
    +        break;
    +      case EQUALS:
    +        EqualToExpression equalToExpr = (EqualToExpression) expression;
    +        if (equalToExpr.getLeft() instanceof ColumnExpression &&
    +            equalToExpr.getRight() instanceof LiteralExpression) {
    +          filterColumn = (ColumnExpression) equalToExpr.getLeft();
    +        } else if (equalToExpr.getRight() instanceof ColumnExpression &&
    +            equalToExpr.getLeft() instanceof LiteralExpression) {
    +          filterColumn = (ColumnExpression) equalToExpr.getRight();
    +        }
    +        break;
    +      default:
    +        break;
    --- End diff --
   
    just return 'false' here for unsupported expression


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2665: [CARBONDATA-2897][DataMap] Optimize datamap c...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user xuchuanyin commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2665#discussion_r219067294
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java ---
    @@ -364,6 +365,13 @@ public DataMapMeta getMeta() {
         return null;
       }
     
    +  @Override
    +  public boolean isSupport(Expression expression) {
    +    // this method is used for choosing cg/fg datamap
    +    // for default datamap always return false
    +    return false;
    --- End diff --
   
    I think  you can by default return true here, even it has no side effect, but it will make sense for code reading: default datamap can handle all the expressions


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2665: [CARBONDATA-2897][DataMap] Optimize datamap chooser

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/2665
 
    since it changes the datamap chooser logic, I'd recommend you to test it with larger amount of data and observe the pruning


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2665: [CARBONDATA-2897][DataMap] Optimize datamap c...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user kevinjmh commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2665#discussion_r219376540
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java ---
    @@ -364,6 +365,13 @@ public DataMapMeta getMeta() {
         return null;
       }
     
    +  @Override
    +  public boolean isSupport(Expression expression) {
    +    // this method is used for choosing cg/fg datamap
    +    // for default datamap always return false
    +    return false;
    --- End diff --
   
    I get your point. I will change it to return true for better code reading. Also will add checking meta info( meta of default datamap is NULLnow) to avoid null point exception


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2665: [CARBONDATA-2897][DataMap] Optimize datamap c...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user kevinjmh commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2665#discussion_r219378217
 
    --- Diff: datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapFactory.java ---
    @@ -123,6 +129,42 @@ public BloomCoarseGrainDataMapFactory(CarbonTable carbonTable, DataMapSchema dat
         }
       }
     
    +  @Override
    +  public boolean isSupport(Expression expression) {
    +    ColumnExpression filterColumn = null;
    +    // First check type of child nodes, and get filter column name
    +    switch (expression.getFilterExpressionType()) {
    +      case IN:
    +        InExpression inExpr = (InExpression) expression;
    +        if (inExpr.getLeft() instanceof ColumnExpression &&
    +            inExpr.getRight() instanceof ListExpression) {
    +          filterColumn = (ColumnExpression) inExpr.getLeft();
    +        } else if (inExpr.getRight() instanceof ColumnExpression &&
    +            inExpr.getLeft() instanceof ListExpression) {
    +          filterColumn = (ColumnExpression) inExpr.getRight();
    +        }
    +        break;
    +      case EQUALS:
    +        EqualToExpression equalToExpr = (EqualToExpression) expression;
    +        if (equalToExpr.getLeft() instanceof ColumnExpression &&
    +            equalToExpr.getRight() instanceof LiteralExpression) {
    +          filterColumn = (ColumnExpression) equalToExpr.getLeft();
    +        } else if (equalToExpr.getRight() instanceof ColumnExpression &&
    +            equalToExpr.getLeft() instanceof LiteralExpression) {
    +          filterColumn = (ColumnExpression) equalToExpr.getRight();
    +        }
    +        break;
    +      default:
    +        break;
    +    }
    +    // Then check if filter column is in index columns
    +    if (null != filterColumn && dataMapMeta.getIndexedColumnNames().contains(
    --- End diff --
   
    for safety, convert to lowercase in both side to check


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2665: [CARBONDATA-2897][DataMap] Optimize datamap c...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user kevinjmh commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2665#discussion_r219378413
 
    --- Diff: datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapFactory.java ---
    @@ -123,6 +129,42 @@ public BloomCoarseGrainDataMapFactory(CarbonTable carbonTable, DataMapSchema dat
         }
       }
     
    +  @Override
    +  public boolean isSupport(Expression expression) {
    +    ColumnExpression filterColumn = null;
    +    // First check type of child nodes, and get filter column name
    +    switch (expression.getFilterExpressionType()) {
    +      case IN:
    +        InExpression inExpr = (InExpression) expression;
    +        if (inExpr.getLeft() instanceof ColumnExpression &&
    +            inExpr.getRight() instanceof ListExpression) {
    +          filterColumn = (ColumnExpression) inExpr.getLeft();
    +        } else if (inExpr.getRight() instanceof ColumnExpression &&
    +            inExpr.getLeft() instanceof ListExpression) {
    +          filterColumn = (ColumnExpression) inExpr.getRight();
    +        }
    +        break;
    +      case EQUALS:
    +        EqualToExpression equalToExpr = (EqualToExpression) expression;
    +        if (equalToExpr.getLeft() instanceof ColumnExpression &&
    +            equalToExpr.getRight() instanceof LiteralExpression) {
    +          filterColumn = (ColumnExpression) equalToExpr.getLeft();
    +        } else if (equalToExpr.getRight() instanceof ColumnExpression &&
    +            equalToExpr.getLeft() instanceof LiteralExpression) {
    +          filterColumn = (ColumnExpression) equalToExpr.getRight();
    +        }
    +        break;
    +      default:
    +        break;
    --- End diff --
   
    ok


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2665: [CARBONDATA-2897][DataMap] Optimize datamap chooser

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2665
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/379/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2665: [CARBONDATA-2897][DataMap] Optimize datamap chooser

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2665
 
    Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/8627/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2665: [CARBONDATA-2897][DataMap] Optimize datamap chooser

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2665
 
    Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/557/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2665: [CARBONDATA-2897][DataMap] Optimize datamap chooser

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/2665
 
    retest this please


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2665: [CARBONDATA-2897][DataMap] Optimize datamap chooser

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2665
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/458/



---
1234