[GitHub] carbondata pull request #2445: [CARBONDATA-2655][BloomDataMap] BloomFilter d...


qiuchenjian-2
GitHub user xuchuanyin opened a pull request:

    https://github.com/apache/carbondata/pull/2445

    [CARBONDATA-2655][BloomDataMap]  BloomFilter datamap support in operator

    Queries with an `IN` expression on a bloom index column can now leverage
    the BloomFilter datamap.
   
    Be sure to do all of the following checklist to help us incorporate
    your contribution quickly and easily:
   
     - [ ] Any interfaces changed?
     
     - [ ] Any backward compatibility impacted?
     
     - [ ] Document update required?
   
     - [ ] Testing done
            Please provide details on
            - Whether new unit test cases have been added or why no new tests are required?
            - How it is tested? Please attach test report.
            - Is it a performance related change? Please attach the performance test report.
            - Any additional information to help reviewers in testing this change.
           
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
   


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xuchuanyin/carbondata 0704_bloom_support_in_operator

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2445.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2445
   
----
commit 7612a8cd2b5168c780b7d1448ff4f723eebb10fd
Author: xuchuanyin <xuchuanyin@...>
Date:   2018-07-04T06:56:40Z

    Fix bugs in querying on bloom column with empty value
   
    Convert null values to corresponding values while querying on bloom
    column

commit ada65edc00a443398e9631644608d70b88e71d10
Author: xuchuanyin <xuchuanyin@...>
Date:   2018-07-04T07:00:29Z

    Add test for querying on longstring bloom index column
   
    Supporting longstring as bloom index column has already been done in
    PR2403, here we only add test for it

commit 35e1bf5dd1b991e43cca82cbf63289ceed4a41ac
Author: xuchuanyin <xuchuanyin@...>
Date:   2018-07-04T09:29:16Z

    BloomFilter datamap support in operator
   
    Now queries with in expression on bloom index column can leverage the
    BloomFilter datamap.

----


---

[GitHub] carbondata issue #2445: [CARBONDATA-2655][BloomDataMap] BloomFilter datamap ...

qiuchenjian-2
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/2445
 
    This PR depends on PR #2416


---

[GitHub] carbondata issue #2445: [CARBONDATA-2655][BloomDataMap] BloomFilter datamap ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2445
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5600/



---

[GitHub] carbondata issue #2445: [CARBONDATA-2655][BloomDataMap] BloomFilter datamap ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2445
 
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5582/



---

[GitHub] carbondata issue #2445: [CARBONDATA-2655][BloomDataMap] BloomFilter datamap ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2445
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6760/



---

[GitHub] carbondata issue #2445: [CARBONDATA-2655][BloomDataMap] BloomFilter datamap ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/2445
 
    Do not trigger build before PR2413 is merged


---

[GitHub] carbondata issue #2445: [CARBONDATA-2655][BloomDataMap] BloomFilter datamap ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2445
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6911/



---

[GitHub] carbondata issue #2445: [CARBONDATA-2655][BloomDataMap] BloomFilter datamap ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2445
 
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5677/



---

[GitHub] carbondata issue #2445: [CARBONDATA-2655][BloomDataMap] BloomFilter datamap ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2445
 
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5696/



---

[GitHub] carbondata issue #2445: [CARBONDATA-2655][BloomDataMap] BloomFilter datamap ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/2445
 
    retest sdv please


---

[GitHub] carbondata issue #2445: [CARBONDATA-2655][BloomDataMap] BloomFilter datamap ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/2445
 
    retest sdv please


---

[GitHub] carbondata issue #2445: [CARBONDATA-2655][BloomDataMap] BloomFilter datamap ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user brijoobopanna commented on the issue:

    https://github.com/apache/carbondata/pull/2445
 
    retest sdv please


---

[GitHub] carbondata issue #2445: [CARBONDATA-2655][BloomDataMap] BloomFilter datamap ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2445
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5754/



---

[GitHub] carbondata issue #2445: [CARBONDATA-2655][BloomDataMap] BloomFilter datamap ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on the issue:

    https://github.com/apache/carbondata/pull/2445
 
    LGTM


---

[GitHub] carbondata pull request #2445: [CARBONDATA-2655][BloomDataMap] BloomFilter d...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2445#discussion_r201891274
 
    --- Diff: datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMap.java ---
    @@ -186,19 +188,39 @@ public void initIndexColumnConverters(CarbonTable carbonTable, List<CarbonColumn
             column = ((ColumnExpression) left).getColumnName();
             if (this.name2Col.containsKey(column)) {
               BloomQueryModel bloomQueryModel =
    -              buildQueryModelFromExpression((ColumnExpression) left, (LiteralExpression) right);
    +              buildQueryModelForEqual((ColumnExpression) left, (LiteralExpression) right);
               queryModels.add(bloomQueryModel);
             }
             return queryModels;
           } else if (left instanceof LiteralExpression && right instanceof ColumnExpression) {
             column = ((ColumnExpression) right).getColumnName();
             if (this.name2Col.containsKey(column)) {
               BloomQueryModel bloomQueryModel =
    -              buildQueryModelFromExpression((ColumnExpression) right, (LiteralExpression) left);
    +              buildQueryModelForEqual((ColumnExpression) right, (LiteralExpression) left);
               queryModels.add(bloomQueryModel);
             }
             return queryModels;
           }
    +    } else if (expression instanceof InExpression) {
    +      Expression left = ((InExpression) expression).getLeft();
    +      Expression right = ((InExpression) expression).getRight();
    +      String column;
    +      if (left instanceof ColumnExpression && right instanceof ListExpression) {
    +        column = ((ColumnExpression) left).getColumnName();
    +        if (this.name2Col.containsKey(column)) {
    +          List<BloomQueryModel> models =
    +              buildQueryModelForIn((ColumnExpression) left, (ListExpression) right);
    +          queryModels.addAll(models);
    +        }
    +        return queryModels;
    +      } else if (left instanceof ListExpression && right instanceof ColumnExpression) {
    +        column = ((ColumnExpression) right).getColumnName();
    +        if (this.name2Col.containsKey(column)) {
    +          List<BloomQueryModel> models =
    +              buildQueryModelForIn((ColumnExpression) right, (ListExpression) left);
    +          queryModels.addAll(models);
    +        }
    +      }
    --- End diff --
   
    What if the operands do not match either of the previous two `if` branches? Can you add an `else` that throws an exception, with a comment explaining that this branch is not expected to be reached?
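
The suggestion above could be sketched roughly as follows. This is a minimal, self-contained illustration and not the code that was merged: `InBranchSketch`, `modelsForIn`, and the nested `ColumnExpression` / `ListExpression` types are hypothetical stand-ins for the CarbonData classes in the diff, and the `column=value` strings stand in for `BloomQueryModel`. The choice of `IllegalStateException` is also an assumption.

```java
import java.util.ArrayList;
import java.util.List;

final class InBranchSketch {
  // Hypothetical stand-ins for the expression classes referenced in the diff.
  interface Expression {}

  static final class ColumnExpression implements Expression {
    final String name;
    ColumnExpression(String name) { this.name = name; }
  }

  static final class ListExpression implements Expression {
    final List<Object> values;
    ListExpression(List<Object> values) { this.values = values; }
  }

  // Returns one "column=value" string per literal (stand-in for BloomQueryModel).
  static List<String> modelsForIn(Expression left, Expression right) {
    ColumnExpression column;
    ListExpression list;
    if (left instanceof ColumnExpression && right instanceof ListExpression) {
      column = (ColumnExpression) left;
      list = (ListExpression) right;
    } else if (left instanceof ListExpression && right instanceof ColumnExpression) {
      column = (ColumnExpression) right;
      list = (ListExpression) left;
    } else {
      // Not expected to be reached for a well-formed IN expression;
      // fail loudly instead of silently returning no query models.
      throw new IllegalStateException(
          "Unexpected operands for IN expression: " + left + ", " + right);
    }
    List<String> models = new ArrayList<>();
    for (Object value : list.values) {
      models.add(column.name + "=" + value);
    }
    return models;
  }
}
```

Resolving the two operand orders first and building the models once also avoids duplicating the loop in both branches.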


---

[GitHub] carbondata pull request #2445: [CARBONDATA-2655][BloomDataMap] BloomFilter d...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user xuchuanyin commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2445#discussion_r201891618
 
    --- Diff: datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMap.java ---
    @@ -186,19 +188,39 @@ public void initIndexColumnConverters(CarbonTable carbonTable, List<CarbonColumn
             column = ((ColumnExpression) left).getColumnName();
             if (this.name2Col.containsKey(column)) {
               BloomQueryModel bloomQueryModel =
    -              buildQueryModelFromExpression((ColumnExpression) left, (LiteralExpression) right);
    +              buildQueryModelForEqual((ColumnExpression) left, (LiteralExpression) right);
               queryModels.add(bloomQueryModel);
             }
             return queryModels;
           } else if (left instanceof LiteralExpression && right instanceof ColumnExpression) {
             column = ((ColumnExpression) right).getColumnName();
             if (this.name2Col.containsKey(column)) {
               BloomQueryModel bloomQueryModel =
    -              buildQueryModelFromExpression((ColumnExpression) right, (LiteralExpression) left);
    +              buildQueryModelForEqual((ColumnExpression) right, (LiteralExpression) left);
               queryModels.add(bloomQueryModel);
             }
             return queryModels;
           }
    +    } else if (expression instanceof InExpression) {
    +      Expression left = ((InExpression) expression).getLeft();
    +      Expression right = ((InExpression) expression).getRight();
    +      String column;
    +      if (left instanceof ColumnExpression && right instanceof ListExpression) {
    +        column = ((ColumnExpression) left).getColumnName();
    +        if (this.name2Col.containsKey(column)) {
    +          List<BloomQueryModel> models =
    +              buildQueryModelForIn((ColumnExpression) left, (ListExpression) right);
    +          queryModels.addAll(models);
    +        }
    +        return queryModels;
    +      } else if (left instanceof ListExpression && right instanceof ColumnExpression) {
    +        column = ((ColumnExpression) right).getColumnName();
    +        if (this.name2Col.containsKey(column)) {
    +          List<BloomQueryModel> models =
    +              buildQueryModelForIn((ColumnExpression) right, (ListExpression) left);
    +          queryModels.addAll(models);
    +        }
    +      }
    --- End diff --
   
    Each index datamap has a fixed set of supported operators. The bloomfilter datamap supports only `equal` and `in`, so there are only two branches here.
   
    I will add a comment here.
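
As background on why `equal` and `in` are the operators a Bloom filter can serve, here is a small sketch under stated assumptions. `ToyBloom`, `keepForEqual`, and `keepForIn` are hypothetical names invented for this illustration; the real datamap uses its own filter implementation and query model, which this thread does not show. A Bloom filter answers only "might contain this exact value", so `col = v` maps to one lookup and `col IN (v1, v2, ...)` maps to one lookup per literal, with the data kept if any lookup may match.

```java
import java.util.BitSet;
import java.util.List;

final class BloomInSketch {
  // A toy Bloom filter with two hash positions per value (illustration only).
  static final class ToyBloom {
    private final BitSet bits;
    private final int size;

    ToyBloom(int size) {
      this.size = size;
      this.bits = new BitSet(size);
    }

    private int[] positions(String value) {
      int h1 = value.hashCode();
      int h2 = Integer.rotateLeft(h1, 15) ^ 0x5bd1e995;
      return new int[] {Math.floorMod(h1, size), Math.floorMod(h2, size)};
    }

    void add(String value) {
      for (int p : positions(value)) {
        bits.set(p);
      }
    }

    // May return false positives, never false negatives.
    boolean mightContain(String value) {
      for (int p : positions(value)) {
        if (!bits.get(p)) {
          return false;
        }
      }
      return true;
    }
  }

  // col = literal: keep the data only if the literal may be present.
  static boolean keepForEqual(ToyBloom bloom, String literal) {
    return bloom.mightContain(literal);
  }

  // col IN (literals): keep the data if any literal may be present.
  static boolean keepForIn(ToyBloom bloom, List<String> literals) {
    for (String literal : literals) {
      if (bloom.mightContain(literal)) {
        return true;
      }
    }
    return false;
  }
}
```

Range or inequality predicates have no equivalent mapping, because the filter cannot enumerate or order the values it has seen.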


---

[GitHub] carbondata issue #2445: [CARBONDATA-2655][BloomDataMap] BloomFilter datamap ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2445
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7042/



---

[GitHub] carbondata issue #2445: [CARBONDATA-2655][BloomDataMap] BloomFilter datamap ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2445
 
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5819/



---

[GitHub] carbondata issue #2445: [CARBONDATA-2655][BloomDataMap] BloomFilter datamap ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2445
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5786/



---

[GitHub] carbondata issue #2445: [CARBONDATA-2655][BloomDataMap] BloomFilter datamap ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on the issue:

    https://github.com/apache/carbondata/pull/2445
 
    LGTM


---