[GitHub] [carbondata] QiangCai opened a new pull request #3753: [WIP] Partition column name should be case insensitive

GitBox

[GitHub] [carbondata] QiangCai opened a new pull request #3753: [WIP] Partition column name should be case insensitive

QiangCai opened a new pull request #3753:
URL: https://github.com/apache/carbondata/pull/3753

### Why is this PR needed?
When inserting into a static partition, the partition column name is currently treated as case sensitive.

### What changes were proposed in this PR?
The partition column name should be case insensitive.
Convert all partition column names to lower case.
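
The lower-casing step described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the object name `PartitionSpecNormalizer` and the representation of a partition spec as a `Map[String, String]` are assumptions made for the example. Only the column names (the map keys) are lower-cased; the partition values are left exactly as the user wrote them.

```scala
import java.util.Locale

// Hypothetical helper, not part of CarbonData: illustrates the idea of
// normalizing partition column names so that C1, c1 and C1 all refer to
// the same column.
object PartitionSpecNormalizer {
  // Lower-case the column names (keys) only; values are passed through unchanged.
  // Locale.ROOT avoids locale-dependent surprises (for example the Turkish dotless i).
  def normalize(spec: Map[String, String]): Map[String, String] =
    spec.map { case (column, value) => column.toLowerCase(Locale.ROOT) -> value }
}
```

For example, `PartitionSpecNormalizer.normalize(Map("C1" -> "3", "c2" -> "111", "C3" -> "2019-11-18"))` yields a map keyed by `c1`, `c2` and `c3` with the same values. Note that if a spec contained both `C1` and `c1`, the two entries would collapse into one after normalization; a real implementation would probably want to reject such a spec.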

### Does this PR introduce any user interface change?
- No

### Is any new testcase added?
- Yes

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3753: [WIP] Partition column name should be case insensitive

CarbonDataQA1 commented on pull request #3753:
URL: https://github.com/apache/carbondata/pull/3753#issuecomment-625177043

Build succeeded with Spark 2.4.5. Please check CI: http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1251/

GitBox

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3753: [WIP] Partition column name should be case insensitive

In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3753:
URL: https://github.com/apache/carbondata/pull/3753#issuecomment-625184095

Build succeeded with Spark 2.3.4. Please check CI: http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2969/

GitBox

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3753: [CARBONDATA-3810] Partition column name should be case insensitive

In reply to this post by GitBox

ajantha-bhat commented on a change in pull request #3753:
URL: https://github.com/apache/carbondata/pull/3753#discussion_r422475259

##########
File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/standardpartition/StandardPartitionTableLoadingTestCase.scala
##########
@@ -519,6 +519,40 @@ class StandardPartitionTableLoadingTestCase extends QueryTest with BeforeAndAfte
sql("drop table origin_csv")
}

+ test("test partition column case insensitive: insert into") {
+ sql(
+ """create table cs_insert_p
+ |(id int, Name string)
+ |stored as carbondata
+ |partitioned by (c1 int, c2 int, C3 string)""".stripMargin)
+ sql("alter table cs_insert_p drop if exists partition(C1=1, C2=111, c3='2019-11-18')")

Review comment:
To check case sensitivity, wouldn't it be better for the C3 column to hold alphabetic data instead of a date?

GitBox

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3753: [CARBONDATA-3810] Partition column name should be case insensitive

In reply to this post by GitBox

ajantha-bhat commented on a change in pull request #3753:
URL: https://github.com/apache/carbondata/pull/3753#discussion_r422475311

##########
File path: pom.xml
##########
@@ -548,7 +549,7 @@
<plugin>
<groupId>org.codehaus.mojo</groupId>
<artifactId>flatten-maven-plugin</artifactId>
- 
+ <version>1.2.2</version>

Review comment:
Why is this change needed?

GitBox

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3753: [CARBONDATA-3810] Partition column name should be case insensitive

In reply to this post by GitBox

ajantha-bhat commented on a change in pull request #3753:
URL: https://github.com/apache/carbondata/pull/3753#discussion_r422475736

##########
File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/standardpartition/StandardPartitionTableLoadingTestCase.scala
##########
@@ -519,6 +519,40 @@ class StandardPartitionTableLoadingTestCase extends QueryTest with BeforeAndAfte
sql("drop table origin_csv")
}

+ test("test partition column case insensitive: insert into") {
+ sql(
+ """create table cs_insert_p
+ |(id int, Name string)
+ |stored as carbondata
+ |partitioned by (c1 int, c2 int, C3 string)""".stripMargin)
+ sql("alter table cs_insert_p drop if exists partition(C1=1, C2=111, c3='2019-11-18')")
+ sql("alter table cs_insert_p add if not exists partition(C1=1, c2=111, C3='2019-11-18')")
+ sql(
+ """insert into table cs_insert_p
+ | partition(c1=3, C2=111, c3='2019-11-18')
+ | select 200, 'cc'""".stripMargin)
+ checkAnswer(sql("select count(*) from cs_insert_p"), Seq(Row(1)))
+ sql("alter table cs_insert_p drop if exists partition(C1=3, C2=111, c3='2019-11-18')")
+ checkAnswer(sql("select count(*) from cs_insert_p"), Seq(Row(0)))
+ }
+
+ test("test partition column case insensitive: load data") {
+ sql(
+ """
+ | CREATE TABLE cs_load_p (doj Timestamp,
+ | workgroupcategory int, workgroupcategoryname String, deptno int, deptname String,
+ | projectcode int, projectjoindate Timestamp, projectenddate Timestamp,attendance int,
+ | utilization int,salary int)
+ | PARTITIONED BY (empnO int, empnAme String, designaTion String)
+ | STORED AS carbondata
+ """.stripMargin)
+ sql(s"""LOAD DATA local inpath '$resourcesPath/data.csv' INTO TABLE cs_load_p PARTITION(empNo='99', empName='ravi', Designation='xx')""")
+ sql(s"""LOAD DATA local inpath '$resourcesPath/data.csv' INTO TABLE cs_load_p PARTITION(empno='100', emPname='indra', designation='yy')""")
+ checkAnswer(sql("show partitions cs_load_p"), Seq(

Review comment:
Can we add a filter query with 'indra' and 'INdRA'?
Should the filter query be case-sensitive?

GitBox

[GitHub] [carbondata] QiangCai commented on a change in pull request #3753: [CARBONDATA-3810] Partition column name should be case insensitive

In reply to this post by GitBox

QiangCai commented on a change in pull request #3753:
URL: https://github.com/apache/carbondata/pull/3753#discussion_r422478213

##########
File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/standardpartition/StandardPartitionTableLoadingTestCase.scala
##########
@@ -519,6 +519,40 @@ class StandardPartitionTableLoadingTestCase extends QueryTest with BeforeAndAfte
sql("drop table origin_csv")
}

+ test("test partition column case insensitive: insert into") {
+ sql(
+ """create table cs_insert_p
+ |(id int, Name string)
+ |stored as carbondata
+ |partitioned by (c1 int, c2 int, C3 string)""".stripMargin)
+ sql("alter table cs_insert_p drop if exists partition(C1=1, C2=111, c3='2019-11-18')")

Review comment:
This test only checks the column name, not the partition value.

GitBox

[GitHub] [carbondata] QiangCai commented on a change in pull request #3753: [CARBONDATA-3810] Partition column name should be case insensitive

In reply to this post by GitBox

QiangCai commented on a change in pull request #3753:
URL: https://github.com/apache/carbondata/pull/3753#discussion_r422478285

##########
File path: pom.xml
##########
@@ -548,7 +549,7 @@
<plugin>
<groupId>org.codehaus.mojo</groupId>
<artifactId>flatten-maven-plugin</artifactId>
- 
+ <version>1.2.2</version>

Review comment:
Maven prints many warnings at the beginning of the build if the version is not set.

GitBox

[GitHub] [carbondata] QiangCai commented on a change in pull request #3753: [CARBONDATA-3810] Partition column name should be case insensitive

In reply to this post by GitBox

QiangCai commented on a change in pull request #3753:
URL: https://github.com/apache/carbondata/pull/3753#discussion_r422478411

##########
File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/standardpartition/StandardPartitionTableLoadingTestCase.scala
##########
@@ -519,6 +519,40 @@ class StandardPartitionTableLoadingTestCase extends QueryTest with BeforeAndAfte
sql("drop table origin_csv")
}

+ test("test partition column case insensitive: insert into") {
+ sql(
+ """create table cs_insert_p
+ |(id int, Name string)
+ |stored as carbondata
+ |partitioned by (c1 int, c2 int, C3 string)""".stripMargin)
+ sql("alter table cs_insert_p drop if exists partition(C1=1, C2=111, c3='2019-11-18')")
+ sql("alter table cs_insert_p add if not exists partition(C1=1, c2=111, C3='2019-11-18')")
+ sql(
+ """insert into table cs_insert_p
+ | partition(c1=3, C2=111, c3='2019-11-18')
+ | select 200, 'cc'""".stripMargin)
+ checkAnswer(sql("select count(*) from cs_insert_p"), Seq(Row(1)))
+ sql("alter table cs_insert_p drop if exists partition(C1=3, C2=111, c3='2019-11-18')")
+ checkAnswer(sql("select count(*) from cs_insert_p"), Seq(Row(0)))
+ }
+
+ test("test partition column case insensitive: load data") {
+ sql(
+ """
+ | CREATE TABLE cs_load_p (doj Timestamp,
+ | workgroupcategory int, workgroupcategoryname String, deptno int, deptname String,
+ | projectcode int, projectjoindate Timestamp, projectenddate Timestamp,attendance int,
+ | utilization int,salary int)
+ | PARTITIONED BY (empnO int, empnAme String, designaTion String)
+ | STORED AS carbondata
+ """.stripMargin)
+ sql(s"""LOAD DATA local inpath '$resourcesPath/data.csv' INTO TABLE cs_load_p PARTITION(empNo='99', empName='ravi', Designation='xx')""")
+ sql(s"""LOAD DATA local inpath '$resourcesPath/data.csv' INTO TABLE cs_load_p PARTITION(empno='100', emPname='indra', designation='yy')""")
+ checkAnswer(sql("show partitions cs_load_p"), Seq(

Review comment:
That is outside the scope of this PR. The data content should remain case-sensitive.
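
The distinction drawn here (column names compared without regard to case, values compared exactly) can be illustrated with a small sketch. It is not taken from the PR: `PartitionMatch` and its `matches` method are hypothetical names, and partition specs are again assumed to be plain `Map[String, String]` values.

```scala
import java.util.Locale

// Hypothetical helper, not part of CarbonData.
object PartitionMatch {
  // Returns true when every requested (column, value) pair is present in the
  // stored partition. Column names are compared ignoring case; values must
  // match exactly, including case.
  def matches(stored: Map[String, String], requested: Map[String, String]): Boolean = {
    val storedLower = stored.map { case (c, v) => c.toLowerCase(Locale.ROOT) -> v }
    requested.forall { case (c, v) =>
      storedLower.get(c.toLowerCase(Locale.ROOT)).contains(v)
    }
  }
}
```

Under these assumptions, a stored partition `Map("empname" -> "indra")` matches a request for `Map("emPname" -> "indra")`, but does not match `Map("empname" -> "INdRA")`, because only the column name is normalized.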

GitBox

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3753: [CARBONDATA-3810] Partition column name should be case insensitive

In reply to this post by GitBox

ajantha-bhat commented on a change in pull request #3753:
URL: https://github.com/apache/carbondata/pull/3753#discussion_r422478666

##########
File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/standardpartition/StandardPartitionTableLoadingTestCase.scala
##########
@@ -519,6 +519,40 @@ class StandardPartitionTableLoadingTestCase extends QueryTest with BeforeAndAfte
sql("drop table origin_csv")
}

+ test("test partition column case insensitive: insert into") {
+ sql(
+ """create table cs_insert_p
+ |(id int, Name string)
+ |stored as carbondata
+ |partitioned by (c1 int, c2 int, C3 string)""".stripMargin)
+ sql("alter table cs_insert_p drop if exists partition(C1=1, C2=111, c3='2019-11-18')")
+ sql("alter table cs_insert_p add if not exists partition(C1=1, c2=111, C3='2019-11-18')")
+ sql(
+ """insert into table cs_insert_p
+ | partition(c1=3, C2=111, c3='2019-11-18')
+ | select 200, 'cc'""".stripMargin)
+ checkAnswer(sql("select count(*) from cs_insert_p"), Seq(Row(1)))
+ sql("alter table cs_insert_p drop if exists partition(C1=3, C2=111, c3='2019-11-18')")
+ checkAnswer(sql("select count(*) from cs_insert_p"), Seq(Row(0)))
+ }
+
+ test("test partition column case insensitive: load data") {
+ sql(
+ """
+ | CREATE TABLE cs_load_p (doj Timestamp,
+ | workgroupcategory int, workgroupcategoryname String, deptno int, deptname String,
+ | projectcode int, projectjoindate Timestamp, projectenddate Timestamp,attendance int,
+ | utilization int,salary int)
+ | PARTITIONED BY (empnO int, empnAme String, designaTion String)
+ | STORED AS carbondata
+ """.stripMargin)
+ sql(s"""LOAD DATA local inpath '$resourcesPath/data.csv' INTO TABLE cs_load_p PARTITION(empNo='99', empName='ravi', Designation='xx')""")
+ sql(s"""LOAD DATA local inpath '$resourcesPath/data.csv' INTO TABLE cs_load_p PARTITION(empno='100', emPname='indra', designation='yy')""")
+ checkAnswer(sql("show partitions cs_load_p"), Seq(

Review comment:
Got it.

GitBox

[GitHub] [carbondata] ajantha-bhat commented on pull request #3753: [CARBONDATA-3810] Partition column name should be case insensitive

In reply to this post by GitBox

ajantha-bhat commented on pull request #3753:
URL: https://github.com/apache/carbondata/pull/3753#issuecomment-626142522

LGTM
