[GitHub] carbondata pull request #2180: [CARBONDATA-2356] Added UT & SDV Testcases fo...

classic Classic list List threaded Threaded
80 messages Options
1234
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2180: [CARBONDATA-2356] Added UT Scenarios for LuceneDataM...

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2180
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5245/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2180: [CARBONDATA-2356] Added UT Scenarios for Luce...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user Indhumathi27 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2180#discussion_r183140665
 
    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMapSuite.scala ---
    @@ -197,13 +197,334 @@ class LuceneFineGrainDataMapSuite extends QueryTest with BeforeAndAfterAll {
         sql("DROP TABLE IF EXISTS datamap_test3")
       }
     
    +  test("test lucene fine grain data map for create datamap with Duplicate Columns") {
    +    sql("DROP TABLE IF EXISTS datamap_test_table")
    +    sql(
    +      """
    +        | CREATE TABLE datamap_test_table(id INT, name STRING, city STRING, age INT)
    +        | STORED BY 'carbondata'
    +        | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='LOCAL_SORT')
    +      """.stripMargin)
    +    val exception_duplicate_column: Exception = intercept[Exception] {
    --- End diff --
   
    Changed Exception to a more specific exception.Please check


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2180: [CARBONDATA-2356] Added UT Scenarios for Luce...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user Indhumathi27 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2180#discussion_r183140729
 
    --- Diff: datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMap.java ---
    @@ -182,7 +182,7 @@ private String getQueryString(Expression expression) {
         // execute index search
         TopDocs result;
         try {
    -      result = indexSearcher.search(query, indexReader.maxDoc());
    +      result = indexSearcher.search(query, Integer.MAX_VALUE);
    --- End diff --
   
    Added


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2180: [CARBONDATA-2356] Added UT Scenarios for LuceneDataM...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2180
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5256/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2180: [CARBONDATA-2356] Added UT Scenarios for LuceneDataM...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2180
 
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4075/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2180: [CARBONDATA-2356] Added UT Scenarios for LuceneDataM...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2180
 
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4084/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2180: [CARBONDATA-2356] Added UT Scenarios for LuceneDataM...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2180
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5265/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2180: [CARBONDATA-2356] Added UT Scenarios for LuceneDataM...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2180
 
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4090/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2180: [CARBONDATA-2356] Added UT Scenarios for Luce...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user akashrn5 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2180#discussion_r183205830
 
    --- Diff: datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMap.java ---
    @@ -182,7 +182,7 @@ private String getQueryString(Expression expression) {
         // execute index search
         TopDocs result;
         try {
    -      result = indexSearcher.search(query, indexReader.maxDoc());
    +      result = indexSearcher.search(query, Integer.MAX_VALUE);
    --- End diff --
   
    @Indhumathi27 i think what jacky means is, when user fires sql like "select * from table wjere text_match('name:a',100)", he should fire with number of records in result from lucene search, if he doesnot give in sql, then we will pass integer max , so please handle according to that in udf


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2180: [CARBONDATA-2356] Added UT Scenarios for LuceneDataM...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2180
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5271/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2180: [CARBONDATA-2356] Added UT Scenarios for Luce...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2180#discussion_r183211232
 
    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMapSuite.scala ---
    @@ -197,13 +197,353 @@ class LuceneFineGrainDataMapSuite extends QueryTest with BeforeAndAfterAll {
         sql("DROP TABLE IF EXISTS datamap_test3")
       }
     
    +  test("test lucene fine grain data map for create datamap with Duplicate Columns") {
    +    sql("DROP TABLE IF EXISTS datamap_test_table")
    +    sql(
    +      """
    +        | CREATE TABLE datamap_test_table(id INT, name STRING, city STRING, age INT)
    +        | STORED BY 'carbondata'
    +        | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='LOCAL_SORT')
    +      """.stripMargin)
    +    val exception_duplicate_column: Exception = intercept[MalformedDataMapCommandException] {
    +      sql(
    +        s"""
    +           | CREATE DATAMAP dm ON TABLE datamap_test_table
    +           | USING 'lucene'
    +           | DMProperties('TEXT_COLUMNS'='name')
    +      """.stripMargin)
    +      sql(
    +        s"""
    +           | CREATE DATAMAP dm1 ON TABLE datamap_test_table
    +           | USING 'lucene'
    +           | DMProperties('TEXT_COLUMNS'='name')
    +      """.stripMargin)
    +    }
    +    assert(exception_duplicate_column.getMessage
    +      .contains("Create lucene datamap dm1 failed, datamap already exists on column(s) name"))
    +    sql("drop datamap if exists dm on table datamap_test_table")
    +  }
    +
    +  test("test lucene fine grain data map with wildcard matching ") {
    +    sql("DROP TABLE IF EXISTS datamap_test_table")
    +    sql(
    +      """
    +        | CREATE TABLE datamap_test_table(id INT, name STRING, city STRING, age INT)
    +        | STORED BY 'carbondata'
    +        | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='LOCAL_SORT')
    +      """.stripMargin)
    +    sql(
    +      s"""
    +         | CREATE DATAMAP dm ON TABLE datamap_test_table
    +         | USING 'lucene'
    +         | DMProperties('TEXT_COLUMNS'='Name , cIty')
    --- End diff --
   
    change to `name, city`, non-capital


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2180: [CARBONDATA-2356] Added UT Scenarios for Luce...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2180#discussion_r183211234
 
    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMapSuite.scala ---
    @@ -197,13 +197,353 @@ class LuceneFineGrainDataMapSuite extends QueryTest with BeforeAndAfterAll {
         sql("DROP TABLE IF EXISTS datamap_test3")
       }
     
    +  test("test lucene fine grain data map for create datamap with Duplicate Columns") {
    +    sql("DROP TABLE IF EXISTS datamap_test_table")
    +    sql(
    +      """
    +        | CREATE TABLE datamap_test_table(id INT, name STRING, city STRING, age INT)
    +        | STORED BY 'carbondata'
    +        | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='LOCAL_SORT')
    +      """.stripMargin)
    +    val exception_duplicate_column: Exception = intercept[MalformedDataMapCommandException] {
    +      sql(
    +        s"""
    +           | CREATE DATAMAP dm ON TABLE datamap_test_table
    +           | USING 'lucene'
    +           | DMProperties('TEXT_COLUMNS'='name')
    +      """.stripMargin)
    +      sql(
    +        s"""
    +           | CREATE DATAMAP dm1 ON TABLE datamap_test_table
    +           | USING 'lucene'
    +           | DMProperties('TEXT_COLUMNS'='name')
    +      """.stripMargin)
    +    }
    +    assert(exception_duplicate_column.getMessage
    +      .contains("Create lucene datamap dm1 failed, datamap already exists on column(s) name"))
    +    sql("drop datamap if exists dm on table datamap_test_table")
    +  }
    +
    +  test("test lucene fine grain data map with wildcard matching ") {
    +    sql("DROP TABLE IF EXISTS datamap_test_table")
    +    sql(
    +      """
    +        | CREATE TABLE datamap_test_table(id INT, name STRING, city STRING, age INT)
    +        | STORED BY 'carbondata'
    +        | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='LOCAL_SORT')
    +      """.stripMargin)
    +    sql(
    +      s"""
    +         | CREATE DATAMAP dm ON TABLE datamap_test_table
    +         | USING 'lucene'
    +         | DMProperties('TEXT_COLUMNS'='Name , cIty')
    +      """.stripMargin)
    +    sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE datamap_test_table OPTIONS('header'='false')")
    +    checkAnswer(sql("SELECT * FROM datamap_test_table WHERE TEXT_MATCH('name:n99*')"),
    +      sql("select * from datamap_test_table where name like 'n99%'"))
    +    checkAnswer(sql("SELECT * FROM datamap_test_table WHERE TEXT_MATCH('name:n*9')"),
    +      sql(s"select * from datamap_test_table where name like 'n%9'"))
    +    sql("drop datamap if exists dm on table datamap_test_table")
    +  }
    +
    +  test("test lucene fine grain data map with TEXT_MATCH 'AND' Filter ") {
    +    sql("DROP TABLE IF EXISTS datamap_test_table")
    +    sql(
    +      """
    +        | CREATE TABLE datamap_test_table(id INT, name STRING, city STRING, age INT)
    +        | STORED BY 'carbondata'
    +        | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='LOCAL_SORT')
    +      """.stripMargin)
    +    sql(
    +      s"""
    +         | CREATE DATAMAP dm ON TABLE datamap_test_table
    +         | USING 'lucene'
    +         | DMProperties('TEXT_COLUMNS'='Name , cIty')
    --- End diff --
   
    change to name, city, non-capital


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2180: [CARBONDATA-2356] Added UT Scenarios for Luce...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2180#discussion_r183211275
 
    --- Diff: datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMap.java ---
    @@ -85,6 +85,11 @@
         this.analyzer = analyzer;
       }
     
    +  /**
    +   * default maximum records to return
    +   */
    +  private int max_doc = Integer.MAX_VALUE;
    --- End diff --
   
    I think a local variable is enough, not required to make it class member


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2180: [CARBONDATA-2356] Added UT Scenarios for Luce...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2180#discussion_r183211277
 
    --- Diff: datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMap.java ---
    @@ -85,6 +85,11 @@
         this.analyzer = analyzer;
       }
     
    +  /**
    +   * default maximum records to return
    +   */
    +  private int max_doc = Integer.MAX_VALUE;
    --- End diff --
   
    And change to `maxDoc`


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2180: [CARBONDATA-2356] Added UT Scenarios for Luce...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2180#discussion_r183211288
 
    --- Diff: datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMap.java ---
    @@ -157,6 +162,14 @@ private String getQueryString(Expression expression) {
           return null;
         }
     
    +    //maximum records to return
    +    if (strQuery.contains(",") && strQuery.indexOf(",") != strQuery.length() - 1) {
    --- End diff --
   
    It seems there is no testcase for `maxDoc`, do I miss anything?


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2180: [CARBONDATA-2356] Added UT Scenarios for Luce...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2180#discussion_r183211315
 
    --- Diff: datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMap.java ---
    @@ -85,6 +85,11 @@
         this.analyzer = analyzer;
       }
     
    +  /**
    +   * default maximum records to return
    +   */
    +  private int max_doc = Integer.MAX_VALUE;
    --- End diff --
   
    It seems there is no testcase for this parameter when user creating datamap, do I miss anything?


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2180: [CARBONDATA-2356] Added UT Scenarios for Luce...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2180#discussion_r183211362
 
    --- Diff: datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMap.java ---
    @@ -157,6 +162,14 @@ private String getQueryString(Expression expression) {
           return null;
         }
     
    +    //maximum records to return
    +    if (strQuery.contains(",") && strQuery.indexOf(",") != strQuery.length() - 1) {
    +      max_doc = Integer.parseInt(strQuery.substring(strQuery.lastIndexOf(",") + 1).trim());
    --- End diff --
   
    parseInt may give NumberFormatException, please handle it


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2180: [CARBONDATA-2356] Added UT Scenarios for Luce...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2180#discussion_r183211383
 
    --- Diff: datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMap.java ---
    @@ -157,6 +162,14 @@ private String getQueryString(Expression expression) {
           return null;
         }
     
    +    //maximum records to return
    +    if (strQuery.contains(",") && strQuery.indexOf(",") != strQuery.length() - 1) {
    --- End diff --
   
    What should be the correct format in the strQuery?
    Please add description for it.


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2180: [CARBONDATA-2356] Added UT Scenarios for Luce...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2180#discussion_r183211461
 
    --- Diff: datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMap.java ---
    @@ -157,6 +162,14 @@ private String getQueryString(Expression expression) {
           return null;
         }
     
    +    //maximum records to return
    +    if (strQuery.contains(",") && strQuery.indexOf(",") != strQuery.length() - 1) {
    +      max_doc = Integer.parseInt(strQuery.substring(strQuery.lastIndexOf(",") + 1).trim());
    +      strQuery = strQuery.substring(0, strQuery.indexOf(","));
    +    } else if (strQuery.contains(",") && strQuery.indexOf(",") == strQuery.length() - 1) {
    +      strQuery = strQuery.substring(0, strQuery.indexOf(","));
    --- End diff --
   
    I am not sure why this if branch is needed


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2180: [CARBONDATA-2356] Added UT Scenarios for Luce...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user Indhumathi27 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2180#discussion_r183212178
 
    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMapSuite.scala ---
    @@ -197,13 +197,353 @@ class LuceneFineGrainDataMapSuite extends QueryTest with BeforeAndAfterAll {
         sql("DROP TABLE IF EXISTS datamap_test3")
       }
     
    +  test("test lucene fine grain data map for create datamap with Duplicate Columns") {
    +    sql("DROP TABLE IF EXISTS datamap_test_table")
    +    sql(
    +      """
    +        | CREATE TABLE datamap_test_table(id INT, name STRING, city STRING, age INT)
    +        | STORED BY 'carbondata'
    +        | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='LOCAL_SORT')
    +      """.stripMargin)
    +    val exception_duplicate_column: Exception = intercept[MalformedDataMapCommandException] {
    +      sql(
    +        s"""
    +           | CREATE DATAMAP dm ON TABLE datamap_test_table
    +           | USING 'lucene'
    +           | DMProperties('TEXT_COLUMNS'='name')
    +      """.stripMargin)
    +      sql(
    +        s"""
    +           | CREATE DATAMAP dm1 ON TABLE datamap_test_table
    +           | USING 'lucene'
    +           | DMProperties('TEXT_COLUMNS'='name')
    +      """.stripMargin)
    +    }
    +    assert(exception_duplicate_column.getMessage
    +      .contains("Create lucene datamap dm1 failed, datamap already exists on column(s) name"))
    +    sql("drop datamap if exists dm on table datamap_test_table")
    +  }
    +
    +  test("test lucene fine grain data map with wildcard matching ") {
    +    sql("DROP TABLE IF EXISTS datamap_test_table")
    +    sql(
    +      """
    +        | CREATE TABLE datamap_test_table(id INT, name STRING, city STRING, age INT)
    +        | STORED BY 'carbondata'
    +        | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='LOCAL_SORT')
    +      """.stripMargin)
    +    sql(
    +      s"""
    +         | CREATE DATAMAP dm ON TABLE datamap_test_table
    +         | USING 'lucene'
    +         | DMProperties('TEXT_COLUMNS'='Name , cIty')
    --- End diff --
   
    okay


---
1234