GitHub user anubhav100 opened a pull request:
https://github.com/apache/carbondata/pull/1777

[CARBONDATA-1973] User Should not Be able to give the duplicate column name in partitio…

**Jira Link: https://issues.apache.org/jira/browse/CARBONDATA-1973**

**Users should not be able to give a duplicate column name in the partition clause, even when the names differ only in case; Hive does the same.**

1. Create a partitioned CarbonData table:

    carbon.sql("CREATE TABLE uniqdata_char2(name char,id int) partitioned by (NAME char)stored by 'carbondata' ")

    name [uniqdata_char2]
    18/01/03 12:44:44 WARN HiveExternalCatalog: Couldn't find corresponding Hive SerDe for data source provider org.apache.spark.sql.CarbonSource. Persisting data source table `default`.`uniqdata_char2` into Hive metastore in Spark SQL specific format, which is NOT compatible with Hive.
    18/01/03 12:44:44 AUDIT CarbonCreateTableCommand: [anubhav-Vostro-3559][anubhav][Thread-1]Table created with Database name [default] and Table name [uniqdata_char2]
    res30: org.apache.spark.sql.DataFrame = []

As we can see, the table gets created successfully.

2. Try the same thing on Hive:

    carbon.sql("CREATE TABLE uniqdata_char2_hive(name char,id int) partitioned by (NAME char) ")

It throws an exception:

    org.apache.spark.sql.AnalysisException: Found duplicate column(s) in table definition of `uniqdata_char2_hive`: name;
      at org.apache.spark.sql.execution.datasources.AnalyzeCreateTable.org$apache$spark$sql$execution$datasources$AnalyzeCreateTable$$failAnalysis(rules.scala:198)

The behaviour of CarbonData should be similar to Hive.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/anubhav100/incubator-carbondata CARBONDATA-1973

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1777.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1777

----
commit 7cb1ca1f93179a4936f828864edd98e61585402f
Author: anubhav100 <anubhav.tarar@...>
Date:   2018-01-08T11:06:25Z

    User Should not Be able to give the duplicate column name in partition even if its case sensitive

----

---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1777

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1401/

---
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1777#discussion_r161362592

    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/parser/CarbonSparkSqlParser.scala ---
    @@ -200,7 +200,8 @@ class CarbonHelperSqlAstBuilder(conf: SQLConf,
               throw new MalformedCarbonCommandException("Error: Invalid partition definition")
             }
             // partition columns should not be part of the schema
    -        val badPartCols = partitionFields.map(_.partitionColumn).toSet.intersect(colNames.toSet)
    +        val badPartCols = partitionFields.map(_.partitionColumn.toLowerCase).
    --- End diff --

Please move `.` to the next line, like:
```
val badPartCols = partitionFields
  .map(_.partitionColumn.toLowerCase)
  .toSet
  .intersect(colNames.map(_.toLowerCase).toSet)
```

---
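For readers following the review, the case-insensitive check discussed in the diff above can be tried in isolation. The following is only a minimal sketch, not the code in CarbonSparkSqlParser.scala: `PartitionField` is a simplified stand-in case class with a single field, and `findDuplicatePartitionColumns` is a hypothetical helper name introduced here for illustration.

```scala
// Simplified stand-in for the partition field type used in the parser (assumption:
// only the column name matters for this check).
final case class PartitionField(partitionColumn: String)

object DuplicatePartitionColumnCheck {
  // Returns the lower-cased names that appear both as a partition column and as a
  // regular schema column, ignoring case on both sides.
  def findDuplicatePartitionColumns(
      partitionFields: Seq[PartitionField],
      colNames: Seq[String]): Set[String] = {
    partitionFields
      .map(_.partitionColumn.toLowerCase)
      .toSet
      .intersect(colNames.map(_.toLowerCase).toSet)
  }
}
```

With schema columns `name` and `id` and a partition column `NAME`, the helper returns a non-empty set, which is the situation in which the parser would be expected to reject the table definition. Note that both sides are lower-cased; lower-casing only the partition columns would still miss a schema column declared in upper case.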
Github user anubhav100 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1777#discussion_r161367489

    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/parser/CarbonSparkSqlParser.scala ---
    @@ -200,7 +200,8 @@ class CarbonHelperSqlAstBuilder(conf: SQLConf,
               throw new MalformedCarbonCommandException("Error: Invalid partition definition")
             }
             // partition columns should not be part of the schema
    -        val badPartCols = partitionFields.map(_.partitionColumn).toSet.intersect(colNames.toSet)
    +        val badPartCols = partitionFields.map(_.partitionColumn.toLowerCase).
    --- End diff --

@jackylk done and pushed

---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1777

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2740/

---
Github user anubhav100 commented on the issue:
https://github.com/apache/carbondata/pull/1777

retest this please

---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1777

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1509/

---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1777

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2745/

---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1777

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1513/

---
Github user jackylk commented on the issue:
https://github.com/apache/carbondata/pull/1777

@anubhav100 Please refer to the TestInsertUpdateConcurrentTest modification in #1800 and modify all test cases in DataRetentionConcurrencyTestCase to get the future before checking the result.

---
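To illustrate the "get the future before checking the result" pattern requested above: the sketch below is not the code from #1800 or from DataRetentionConcurrencyTestCase. `runOperation` and `runAndWait` are hypothetical names, and the background task is a stand-in for a long-running table operation. The point is only that the test blocks on `Future.get` first and inspects shared state afterwards.

```scala
import java.util.concurrent.{Callable, Executors, TimeUnit}
import java.util.concurrent.atomic.AtomicInteger

object FutureBeforeAssertSketch {
  // Hypothetical stand-in for a slow operation (for example a load or delete)
  // that mutates shared state and reports a status.
  private def runOperation(counter: AtomicInteger): String = {
    Thread.sleep(50)
    counter.incrementAndGet()
    "PASS"
  }

  // Submits the operation to a thread pool, waits for it to finish, and only
  // then reads the shared state.
  def runAndWait(): (String, Int) = {
    val executor = Executors.newFixedThreadPool(2)
    val counter = new AtomicInteger(0)
    try {
      val future = executor.submit(new Callable[String] {
        override def call(): String = runOperation(counter)
      })
      // Block until the background task completes. Reading `counter` before
      // this call would race with the task and could fail intermittently.
      val status = future.get(10, TimeUnit.SECONDS)
      (status, counter.get())
    } finally {
      executor.shutdown()
    }
  }
}
```

Checking the result before calling `get` is a common cause of tests that fail randomly, because the assertion may run while the background task is still in progress.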
Github user anubhav100 commented on the issue:
https://github.com/apache/carbondata/pull/1777

@jackylk Should I do this in the same PR, or should I create a new one to refactor the test cases?

---
Github user anubhav100 commented on the issue:
https://github.com/apache/carbondata/pull/1777

@jackylk I would also suggest doing the same for the InsertUpdateConcurrentTest and TestInsertOverwriteAndCompaction test cases. These two also fail randomly, and it would be better to merge the two classes into one, since both use the same table schema and a lot of the code is duplicated.

---
Github user jackylk commented on the issue:
https://github.com/apache/carbondata/pull/1777

sure, please do it

---
Github user anubhav100 commented on the issue:
https://github.com/apache/carbondata/pull/1777

@jackylk I have raised another PR for the test case fixes; please review. Here is the link: https://github.com/apache/carbondata/pull/1801

---