Apache CarbonData Dev Mailing List archive › Apache CarbonData JIRA issues

[GitHub] carbondata pull request #2180: [CARBONDATA-2356] Added UT & SDV Testcases fo...

Classic

List

Threaded

80 messages Options

1234

qiuchenjian-2

[GitHub] carbondata issue #2180: [CARBONDATA-2356] Added UT Scenarios for LuceneDataM...

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2180

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5245/

---

qiuchenjian-2

[GitHub] carbondata pull request #2180: [CARBONDATA-2356] Added UT Scenarios for Luce...

In reply to this post by qiuchenjian-2

Github user Indhumathi27 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2180#discussion_r183140665

--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMapSuite.scala ---
@@ -197,13 +197,334 @@ class LuceneFineGrainDataMapSuite extends QueryTest with BeforeAndAfterAll {
sql("DROP TABLE IF EXISTS datamap_test3")
}

+ test("test lucene fine grain data map for create datamap with Duplicate Columns") {
+ sql("DROP TABLE IF EXISTS datamap_test_table")
+ sql(
+ """
+ | CREATE TABLE datamap_test_table(id INT, name STRING, city STRING, age INT)
+ | STORED BY 'carbondata'
+ | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='LOCAL_SORT')
+ """.stripMargin)
+ val exception_duplicate_column: Exception = intercept[Exception] {
--- End diff --

Changed Exception to a more specific exception.Please check

---

qiuchenjian-2

[GitHub] carbondata pull request #2180: [CARBONDATA-2356] Added UT Scenarios for Luce...

In reply to this post by qiuchenjian-2

Github user Indhumathi27 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2180#discussion_r183140729

--- Diff: datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMap.java ---
@@ -182,7 +182,7 @@ private String getQueryString(Expression expression) {
// execute index search
TopDocs result;
try {
- result = indexSearcher.search(query, indexReader.maxDoc());
+ result = indexSearcher.search(query, Integer.MAX_VALUE);
--- End diff --

Added

---

qiuchenjian-2

[GitHub] carbondata issue #2180: [CARBONDATA-2356] Added UT Scenarios for LuceneDataM...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2180

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5256/

---

qiuchenjian-2

[GitHub] carbondata issue #2180: [CARBONDATA-2356] Added UT Scenarios for LuceneDataM...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2180

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4075/

---

qiuchenjian-2

[GitHub] carbondata issue #2180: [CARBONDATA-2356] Added UT Scenarios for LuceneDataM...

In reply to this post by qiuchenjian-2

qiuchenjian-2

[GitHub] carbondata issue #2180: [CARBONDATA-2356] Added UT Scenarios for LuceneDataM...

In reply to this post by qiuchenjian-2

qiuchenjian-2

[GitHub] carbondata issue #2180: [CARBONDATA-2356] Added UT Scenarios for LuceneDataM...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2180

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4090/

---

qiuchenjian-2

[GitHub] carbondata pull request #2180: [CARBONDATA-2356] Added UT Scenarios for Luce...

In reply to this post by qiuchenjian-2

Github user akashrn5 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2180#discussion_r183205830

--- Diff: datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMap.java ---
@@ -182,7 +182,7 @@ private String getQueryString(Expression expression) {
// execute index search
TopDocs result;
try {
- result = indexSearcher.search(query, indexReader.maxDoc());
+ result = indexSearcher.search(query, Integer.MAX_VALUE);
--- End diff --

@Indhumathi27 i think what jacky means is, when user fires sql like "select * from table wjere text_match('name:a',100)", he should fire with number of records in result from lucene search, if he doesnot give in sql, then we will pass integer max , so please handle according to that in udf

---

qiuchenjian-2

[GitHub] carbondata issue #2180: [CARBONDATA-2356] Added UT Scenarios for LuceneDataM...

In reply to this post by qiuchenjian-2

qiuchenjian-2

[GitHub] carbondata pull request #2180: [CARBONDATA-2356] Added UT Scenarios for Luce...

In reply to this post by qiuchenjian-2

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2180#discussion_r183211232

--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMapSuite.scala ---
@@ -197,13 +197,353 @@ class LuceneFineGrainDataMapSuite extends QueryTest with BeforeAndAfterAll {
sql("DROP TABLE IF EXISTS datamap_test3")
}

+ test("test lucene fine grain data map for create datamap with Duplicate Columns") {
+ sql("DROP TABLE IF EXISTS datamap_test_table")
+ sql(
+ """
+ | CREATE TABLE datamap_test_table(id INT, name STRING, city STRING, age INT)
+ | STORED BY 'carbondata'
+ | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='LOCAL_SORT')
+ """.stripMargin)
+ val exception_duplicate_column: Exception = intercept[MalformedDataMapCommandException] {
+ sql(
+ s"""
+ | CREATE DATAMAP dm ON TABLE datamap_test_table
+ | USING 'lucene'
+ | DMProperties('TEXT_COLUMNS'='name')
+ """.stripMargin)
+ sql(
+ s"""
+ | CREATE DATAMAP dm1 ON TABLE datamap_test_table
+ | USING 'lucene'
+ | DMProperties('TEXT_COLUMNS'='name')
+ """.stripMargin)
+ }
+ assert(exception_duplicate_column.getMessage
+ .contains("Create lucene datamap dm1 failed, datamap already exists on column(s) name"))
+ sql("drop datamap if exists dm on table datamap_test_table")
+ }
+
+ test("test lucene fine grain data map with wildcard matching ") {
+ sql("DROP TABLE IF EXISTS datamap_test_table")
+ sql(
+ """
+ | CREATE TABLE datamap_test_table(id INT, name STRING, city STRING, age INT)
+ | STORED BY 'carbondata'
+ | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='LOCAL_SORT')
+ """.stripMargin)
+ sql(
+ s"""
+ | CREATE DATAMAP dm ON TABLE datamap_test_table
+ | USING 'lucene'
+ | DMProperties('TEXT_COLUMNS'='Name , cIty')
--- End diff --

change to `name, city`, non-capital

---

qiuchenjian-2

[GitHub] carbondata pull request #2180: [CARBONDATA-2356] Added UT Scenarios for Luce...

In reply to this post by qiuchenjian-2

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2180#discussion_r183211234

--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMapSuite.scala ---
@@ -197,13 +197,353 @@ class LuceneFineGrainDataMapSuite extends QueryTest with BeforeAndAfterAll {
sql("DROP TABLE IF EXISTS datamap_test3")
}

+ test("test lucene fine grain data map for create datamap with Duplicate Columns") {
+ sql("DROP TABLE IF EXISTS datamap_test_table")
+ sql(
+ """
+ | CREATE TABLE datamap_test_table(id INT, name STRING, city STRING, age INT)
+ | STORED BY 'carbondata'
+ | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='LOCAL_SORT')
+ """.stripMargin)
+ val exception_duplicate_column: Exception = intercept[MalformedDataMapCommandException] {
+ sql(
+ s"""
+ | CREATE DATAMAP dm ON TABLE datamap_test_table
+ | USING 'lucene'
+ | DMProperties('TEXT_COLUMNS'='name')
+ """.stripMargin)
+ sql(
+ s"""
+ | CREATE DATAMAP dm1 ON TABLE datamap_test_table
+ | USING 'lucene'
+ | DMProperties('TEXT_COLUMNS'='name')
+ """.stripMargin)
+ }
+ assert(exception_duplicate_column.getMessage
+ .contains("Create lucene datamap dm1 failed, datamap already exists on column(s) name"))
+ sql("drop datamap if exists dm on table datamap_test_table")
+ }
+
+ test("test lucene fine grain data map with wildcard matching ") {
+ sql("DROP TABLE IF EXISTS datamap_test_table")
+ sql(
+ """
+ | CREATE TABLE datamap_test_table(id INT, name STRING, city STRING, age INT)
+ | STORED BY 'carbondata'
+ | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='LOCAL_SORT')
+ """.stripMargin)
+ sql(
+ s"""
+ | CREATE DATAMAP dm ON TABLE datamap_test_table
+ | USING 'lucene'
+ | DMProperties('TEXT_COLUMNS'='Name , cIty')
+ """.stripMargin)
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE datamap_test_table OPTIONS('header'='false')")
+ checkAnswer(sql("SELECT * FROM datamap_test_table WHERE TEXT_MATCH('name:n99*')"),
+ sql("select * from datamap_test_table where name like 'n99%'"))
+ checkAnswer(sql("SELECT * FROM datamap_test_table WHERE TEXT_MATCH('name:n*9')"),
+ sql(s"select * from datamap_test_table where name like 'n%9'"))
+ sql("drop datamap if exists dm on table datamap_test_table")
+ }
+
+ test("test lucene fine grain data map with TEXT_MATCH 'AND' Filter ") {
+ sql("DROP TABLE IF EXISTS datamap_test_table")
+ sql(
+ """
+ | CREATE TABLE datamap_test_table(id INT, name STRING, city STRING, age INT)
+ | STORED BY 'carbondata'
+ | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='LOCAL_SORT')
+ """.stripMargin)
+ sql(
+ s"""
+ | CREATE DATAMAP dm ON TABLE datamap_test_table
+ | USING 'lucene'
+ | DMProperties('TEXT_COLUMNS'='Name , cIty')
--- End diff --

change to name, city, non-capital

---

qiuchenjian-2

[GitHub] carbondata pull request #2180: [CARBONDATA-2356] Added UT Scenarios for Luce...

In reply to this post by qiuchenjian-2

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2180#discussion_r183211275

--- Diff: datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMap.java ---
@@ -85,6 +85,11 @@
this.analyzer = analyzer;
}

+ /**
+ * default maximum records to return
+ */
+ private int max_doc = Integer.MAX_VALUE;
--- End diff --

I think a local variable is enough, not required to make it class member

---

qiuchenjian-2

[GitHub] carbondata pull request #2180: [CARBONDATA-2356] Added UT Scenarios for Luce...

In reply to this post by qiuchenjian-2

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2180#discussion_r183211277

--- Diff: datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMap.java ---
@@ -85,6 +85,11 @@
this.analyzer = analyzer;
}

+ /**
+ * default maximum records to return
+ */
+ private int max_doc = Integer.MAX_VALUE;
--- End diff --

And change to `maxDoc`

---

qiuchenjian-2

[GitHub] carbondata pull request #2180: [CARBONDATA-2356] Added UT Scenarios for Luce...

In reply to this post by qiuchenjian-2

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2180#discussion_r183211288

--- Diff: datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMap.java ---
@@ -157,6 +162,14 @@ private String getQueryString(Expression expression) {
return null;
}

+ //maximum records to return
+ if (strQuery.contains(",") && strQuery.indexOf(",") != strQuery.length() - 1) {
--- End diff --

It seems there is no testcase for `maxDoc`, do I miss anything?

---

qiuchenjian-2

[GitHub] carbondata pull request #2180: [CARBONDATA-2356] Added UT Scenarios for Luce...

In reply to this post by qiuchenjian-2

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2180#discussion_r183211315

--- Diff: datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMap.java ---
@@ -85,6 +85,11 @@
this.analyzer = analyzer;
}

+ /**
+ * default maximum records to return
+ */
+ private int max_doc = Integer.MAX_VALUE;
--- End diff --

It seems there is no testcase for this parameter when user creating datamap, do I miss anything?

---

qiuchenjian-2

[GitHub] carbondata pull request #2180: [CARBONDATA-2356] Added UT Scenarios for Luce...

In reply to this post by qiuchenjian-2

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2180#discussion_r183211362

--- Diff: datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMap.java ---
@@ -157,6 +162,14 @@ private String getQueryString(Expression expression) {
return null;
}

+ //maximum records to return
+ if (strQuery.contains(",") && strQuery.indexOf(",") != strQuery.length() - 1) {
+ max_doc = Integer.parseInt(strQuery.substring(strQuery.lastIndexOf(",") + 1).trim());
--- End diff --

parseInt may give NumberFormatException, please handle it

---

qiuchenjian-2

[GitHub] carbondata pull request #2180: [CARBONDATA-2356] Added UT Scenarios for Luce...

In reply to this post by qiuchenjian-2

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2180#discussion_r183211383

--- Diff: datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMap.java ---
@@ -157,6 +162,14 @@ private String getQueryString(Expression expression) {
return null;
}

+ //maximum records to return
+ if (strQuery.contains(",") && strQuery.indexOf(",") != strQuery.length() - 1) {
--- End diff --

What should be the correct format in the strQuery?
Please add description for it.

---

qiuchenjian-2

[GitHub] carbondata pull request #2180: [CARBONDATA-2356] Added UT Scenarios for Luce...

In reply to this post by qiuchenjian-2

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2180#discussion_r183211461

--- Diff: datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMap.java ---
@@ -157,6 +162,14 @@ private String getQueryString(Expression expression) {
return null;
}

+ //maximum records to return
+ if (strQuery.contains(",") && strQuery.indexOf(",") != strQuery.length() - 1) {
+ max_doc = Integer.parseInt(strQuery.substring(strQuery.lastIndexOf(",") + 1).trim());
+ strQuery = strQuery.substring(0, strQuery.indexOf(","));
+ } else if (strQuery.contains(",") && strQuery.indexOf(",") == strQuery.length() - 1) {
+ strQuery = strQuery.substring(0, strQuery.indexOf(","));
--- End diff --

I am not sure why this if branch is needed

---

qiuchenjian-2

[GitHub] carbondata pull request #2180: [CARBONDATA-2356] Added UT Scenarios for Luce...

In reply to this post by qiuchenjian-2

Github user Indhumathi27 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2180#discussion_r183212178

--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMapSuite.scala ---
@@ -197,13 +197,353 @@ class LuceneFineGrainDataMapSuite extends QueryTest with BeforeAndAfterAll {
sql("DROP TABLE IF EXISTS datamap_test3")
}

+ test("test lucene fine grain data map for create datamap with Duplicate Columns") {
+ sql("DROP TABLE IF EXISTS datamap_test_table")
+ sql(
+ """
+ | CREATE TABLE datamap_test_table(id INT, name STRING, city STRING, age INT)
+ | STORED BY 'carbondata'
+ | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='LOCAL_SORT')
+ """.stripMargin)
+ val exception_duplicate_column: Exception = intercept[MalformedDataMapCommandException] {
+ sql(
+ s"""
+ | CREATE DATAMAP dm ON TABLE datamap_test_table
+ | USING 'lucene'
+ | DMProperties('TEXT_COLUMNS'='name')
+ """.stripMargin)
+ sql(
+ s"""
+ | CREATE DATAMAP dm1 ON TABLE datamap_test_table
+ | USING 'lucene'
+ | DMProperties('TEXT_COLUMNS'='name')
+ """.stripMargin)
+ }
+ assert(exception_duplicate_column.getMessage
+ .contains("Create lucene datamap dm1 failed, datamap already exists on column(s) name"))
+ sql("drop datamap if exists dm on table datamap_test_table")
+ }
+
+ test("test lucene fine grain data map with wildcard matching ") {
+ sql("DROP TABLE IF EXISTS datamap_test_table")
+ sql(
+ """
+ | CREATE TABLE datamap_test_table(id INT, name STRING, city STRING, age INT)
+ | STORED BY 'carbondata'
+ | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='LOCAL_SORT')
+ """.stripMargin)
+ sql(
+ s"""
+ | CREATE DATAMAP dm ON TABLE datamap_test_table
+ | USING 'lucene'
+ | DMProperties('TEXT_COLUMNS'='Name , cIty')
--- End diff --

okay

---

1234