[GitHub] carbondata pull request #2311: [CARBONDATA-2487] Block filters for lucene wi...

classic Classic list List threaded Threaded
24 messages Options
12
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2311: [CARBONDATA-2487] Block filters for lucene wi...

qiuchenjian-2
GitHub user Indhumathi27 opened a pull request:

    https://github.com/apache/carbondata/pull/2311

    [CARBONDATA-2487] Block filters for lucene with more than one text_match udf

   
     - [x] Any interfaces changed?
            NA
     - [x] Any backward compatibility impacted?
            NA
     - [x] Document update required?
             NA
     - [x] Testing done
          UT added
     - [x] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
             NA


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/Indhumathi27/carbondata luceneblockudf

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2311.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2311
   
----
commit 78af3c61f1c4cdaafdfe1e98d1083dbb4fff3989
Author: Indhumathi27 <indhumathim27@...>
Date:   2018-05-16T11:18:30Z

    [CARBONDATA-2487] Block filters for lucene with more than one text_match udf

----


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2311: [CARBONDATA-2487] Block filters for lucene with more...

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2311
 
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4759/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2311: [CARBONDATA-2487] Block filters for lucene with more...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2311
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5913/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2311: [CARBONDATA-2487] Block filters for lucene with more...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2311
 
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4946/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2311: [CARBONDATA-2487] Block filters for lucene with more...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2311
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5919/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2311: [CARBONDATA-2487] Block filters for lucene with more...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2311
 
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4765/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2311: [CARBONDATA-2487] Block filters for lucene with more...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2311
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4951/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2311: [CARBONDATA-2487] Block filters for lucene wi...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2311#discussion_r189221128
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/strategy/CarbonLateDecodeStrategy.scala ---
    @@ -479,6 +479,21 @@ private[sql] class CarbonLateDecodeStrategy extends SparkStrategy {
           case a: Attribute if isComplexAttribute(a) => a
         }.size == 0 )
     
    +    // block filters for lucene with more than one text_match udf
    +    // Todo: handle when lucene and normal query filter is supported
    +    if (predicates.nonEmpty) {
    +      if (predicates.seq.head.isInstanceOf[ScalaUDF]) {
    --- End diff --
   
    Logic seems wrong, supposed to be check over all predicates not just on head


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2311: [CARBONDATA-2487] Block filters for lucene wi...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user Indhumathi27 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2311#discussion_r189224751
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/strategy/CarbonLateDecodeStrategy.scala ---
    @@ -479,6 +479,21 @@ private[sql] class CarbonLateDecodeStrategy extends SparkStrategy {
           case a: Attribute if isComplexAttribute(a) => a
         }.size == 0 )
     
    +    // block filters for lucene with more than one text_match udf
    +    // Todo: handle when lucene and normal query filter is supported
    +    if (predicates.nonEmpty) {
    +      if (predicates.seq.head.isInstanceOf[ScalaUDF]) {
    --- End diff --
   
    Now..filter on text_match and normal query is not supported(TEXT_MATCH('name:n') AND id=5). Hence for lucene ,filter should be explicitily lucene text_match.


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2311: [CARBONDATA-2487] Block filters for lucene wi...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user Indhumathi27 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2311#discussion_r189231955
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/strategy/CarbonLateDecodeStrategy.scala ---
    @@ -479,6 +479,21 @@ private[sql] class CarbonLateDecodeStrategy extends SparkStrategy {
           case a: Attribute if isComplexAttribute(a) => a
         }.size == 0 )
     
    +    // block filters for lucene with more than one text_match udf
    +    // Todo: handle when lucene and normal query filter is supported
    +    if (predicates.nonEmpty) {
    +      if (predicates.seq.head.isInstanceOf[ScalaUDF]) {
    --- End diff --
   
    @ravipesala please review


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2311: [CARBONDATA-2487] Block filters for lucene with more...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2311
 
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4807/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2311: [CARBONDATA-2487] Block filters for lucene with more...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user Indhumathi27 commented on the issue:

    https://github.com/apache/carbondata/pull/2311
 
    Retest this please


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2311: [CARBONDATA-2487] Block filters for lucene with more...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2311
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5965/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2311: [CARBONDATA-2487] Block filters for lucene with more...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2311
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4991/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2311: [CARBONDATA-2487] Block filters for lucene with more...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2311
 
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4812/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2311: [CARBONDATA-2487] Block filters for lucene with more...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2311
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5970/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2311: [CARBONDATA-2487] Block filters for lucene wi...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2311#discussion_r189417575
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/strategy/CarbonLateDecodeStrategy.scala ---
    @@ -479,6 +479,24 @@ private[sql] class CarbonLateDecodeStrategy extends SparkStrategy {
           case a: Attribute if isComplexAttribute(a) => a
         }.size == 0 )
     
    +    // block filters for lucene with more than one text_match udf
    +    // Todo: handle when lucene and normal query filter is supported
    +    var count: Int = 0
    +    if (predicates.nonEmpty) {
    +      predicates.foreach(predicate => {
    +        if (predicate.isInstanceOf[ScalaUDF]) {
    +          predicate match {
    +            case u: ScalaUDF if u.function.isInstanceOf[TextMatchUDF] ||
    --- End diff --
   
    Better to check it in below `for..yield...` loop, instead of adding another loop
    And you do not need to count the number of UDF, throw exception when second UDF is encountered


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2311: [CARBONDATA-2487] Block filters for lucene wi...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2311#discussion_r189417704
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/strategy/CarbonLateDecodeStrategy.scala ---
    @@ -479,6 +479,24 @@ private[sql] class CarbonLateDecodeStrategy extends SparkStrategy {
           case a: Attribute if isComplexAttribute(a) => a
         }.size == 0 )
     
    +    // block filters for lucene with more than one text_match udf
    +    // Todo: handle when lucene and normal query filter is supported
    +    var count: Int = 0
    +    if (predicates.nonEmpty) {
    +      predicates.foreach(predicate => {
    +        if (predicate.isInstanceOf[ScalaUDF]) {
    +          predicate match {
    +            case u: ScalaUDF if u.function.isInstanceOf[TextMatchUDF] ||
    +                                u.function.isInstanceOf[TextMatchMaxDocUDF] => count = count + 1
    +          }
    +        }
    +      })
    +      if (count > 1) {
    +        throw new MalformedCarbonCommandException(
    +          "Only one TEXT_MATCH UDF for filters is allowed")
    --- End diff --
   
    In the exception message, should tell user to include all search pattern matching logic in one UDF instead of writing separate UDF


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2311: [CARBONDATA-2487] Block filters for lucene wi...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user Indhumathi27 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2311#discussion_r189421516
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/strategy/CarbonLateDecodeStrategy.scala ---
    @@ -479,6 +479,24 @@ private[sql] class CarbonLateDecodeStrategy extends SparkStrategy {
           case a: Attribute if isComplexAttribute(a) => a
         }.size == 0 )
     
    +    // block filters for lucene with more than one text_match udf
    +    // Todo: handle when lucene and normal query filter is supported
    +    var count: Int = 0
    +    if (predicates.nonEmpty) {
    +      predicates.foreach(predicate => {
    +        if (predicate.isInstanceOf[ScalaUDF]) {
    +          predicate match {
    +            case u: ScalaUDF if u.function.isInstanceOf[TextMatchUDF] ||
    +                                u.function.isInstanceOf[TextMatchMaxDocUDF] => count = count + 1
    +          }
    +        }
    +      })
    +      if (count > 1) {
    +        throw new MalformedCarbonCommandException(
    +          "Only one TEXT_MATCH UDF for filters is allowed")
    --- End diff --
   
    okay


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2311: [CARBONDATA-2487] Block filters for lucene with more...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2311
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5994/



---
12