GitHub user Shubh18s opened a pull request:
    https://github.com/apache/carbondata/pull/2988

    [CARBONDATA-3174] Fix trailing space issue with varchar column for SDK

    What was the issue?
    After an SDK write, Select * failed when a column listed in 'long_string_columns' had a trailing space in its name.

    What has been changed?
    Removed the trailing space from the column name.

    Be sure to do all of the following checklist to help us incorporate
    your contribution quickly and easily:

     - [ ] Any interfaces changed?

     - [ ] Any backward compatibility impacted?

     - [ ] Document update required?

     - [x] Testing done
           Added a test case.

     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/Shubh18s/carbondata master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2988.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2988

----
commit 7a5d26e6802fef2e40f987e49298a25b78183d20
Author: Shubh18s <singh18shubhdeep@...>
Date:   2018-12-14T09:11:06Z

    varchar column trailing space issue fixed

----

---
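The failure described above is consistent with an exact-match comparison between the names given in 'long_string_columns' and the column names stored at write time. The following is a minimal plain-Java sketch of that mismatch, not CarbonData code. The class and method names are hypothetical, and it assumes only that the property value is a comma-separated list of names.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in CarbonData.
class ColumnNameLookup {

  // Parses a comma-separated property value such as "subject,messagebody".
  static Set<String> parseColumnList(String propertyValue) {
    Set<String> names = new HashSet<>();
    for (String part : propertyValue.split(",")) {
      names.add(part.trim().toLowerCase(Locale.ROOT));
    }
    return names;
  }

  // Exact-match lookup: the stored schema name is compared as written.
  static boolean isListedExact(Set<String> listed, String schemaName) {
    return listed.contains(schemaName);
  }

  // Lookup after trimming and lowercasing the stored schema name.
  static boolean isListedNormalized(Set<String> listed, String schemaName) {
    return listed.contains(schemaName.trim().toLowerCase(Locale.ROOT));
  }

  public static void main(String[] args) {
    Set<String> listed = parseColumnList("subject,messagebody");
    // The field was written with a trailing space, as in the PR's test case.
    System.out.println(isListedExact(listed, "messagebody "));      // prints false
    System.out.println(isListedNormalized(listed, "messagebody ")); // prints true
  }
}
```

With the trailing space left in place, the written column is no longer recognized as one of the listed long string columns. Trimming the name before the comparison restores the match.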
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2988

    Can one of the admins verify this patch?

---
Github user brijoobopanna commented on the issue:

    https://github.com/apache/carbondata/pull/2988

    add to whitelist

---
Github user brijoobopanna commented on the issue:

    https://github.com/apache/carbondata/pull/2988

    retest this please

---
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2988

    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1761/

---
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2988

    Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1972/

---
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2988

    Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10020/

---
Github user qiuchenjian commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2988#discussion_r241934868

    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/createTable/TestNonTransactionalCarbonTable.scala ---
    @@ -2490,6 +2490,47 @@ class TestNonTransactionalCarbonTable extends QueryTest with BeforeAndAfterAll {
         FileUtils.deleteDirectory(new File(writerPath))
       }

    +  test("check varchar with trailing space") {
    +    FileUtils.deleteDirectory(new File(writerPath))
    +    val fields: Array[Field] = new Array[Field](8)
    +
    +    fields(0) = new Field("Event_ID", DataTypes.STRING)
    +    fields(1) = new Field("Event_Time", DataTypes.TIMESTAMP)
    +    fields(2) = new Field("subject", DataTypes.VARCHAR)
    +    fields(3) = new Field("From_Email", DataTypes.STRING)
    +    fields(4) = new Field("To_Email", DataTypes.createArrayType(DataTypes.STRING))
    +    fields(5) = new Field("CC_Email", DataTypes.createArrayType(DataTypes.STRING))
    +    fields(6) = new Field("BCC_Email", DataTypes.createArrayType(DataTypes.STRING))
    +    fields(7) = new Field("messagebody ", DataTypes.VARCHAR)
    +
    +    var options = Map("bAd_RECords_action" -> "FORCE").asJava
    +    CarbonProperties.getInstance()
    +      .addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, "EEE, d MMM yyyy HH:mm:ss Z")
    +
    +    val writer: CarbonWriter = CarbonWriter.builder
    +      .outputPath(writerPath)
    +      .withCsvInput(new Schema(fields)).writtenBy("TestNonTransactionalCarbonTable").build()
    +
    +    writer
    +      .write(Array("aaa",
    +        "Fri, 4 May 2001 13:51:00 -0700 (PDT)",
    +        "Re",
    +        "[hidden email]",
    +        "sd#er",
    +        "sd",
    +        "sds",
    +        "ew"))
    +    writer.close()
    +
    +    sql("drop table if exists test")
    +    sql(
    +      s"""CREATE TABLE test using carbon options('long_string_columns'='subject,messagebody')
    +         |LOCATION '$writerPath'"""
    +        .stripMargin)
    +    sql("select * from test").show()
    --- End diff --

    Why doesn't this test case check the correctness of the results, for example with checkAnswer or assert?

---
Github user Shubh18s commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2988#discussion_r242033587

    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/createTable/TestNonTransactionalCarbonTable.scala ---
    +    sql("select * from test").show()
    --- End diff --

    Done

---
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2988

    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1786/

---
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2988

    Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10046/

---
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2988

    Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1998/

---
Github user Shubh18s commented on the issue:

    https://github.com/apache/carbondata/pull/2988

    retest this please

---
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2988

    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1793/

---
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2988

    Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10053/

---
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2988

    Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2005/

---
Github user ajantha-bhat commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2988#discussion_r242157568

    --- Diff: store/sdk/src/main/java/org/apache/carbondata/sdk/file/Field.java ---
    @@ -55,7 +55,7 @@
        * @param type datatype of field, specified in strings.
        */
       public Field(String name, String type) {
    -    this.name = name;
    +    this.name = name.toLowerCase().trim();
    --- End diff --

    CarbonWriterBuilder.updateSchemaFields() already converts the names to lowercase; just add the trim in that method. There is no need to handle it in each constructor here.

---
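The suggestion above amounts to normalizing field names in a single pass when the schema is built, rather than in every Field constructor. A rough sketch of that design in plain Java follows. SketchField and SketchSchemaNormalizer are hypothetical stand-ins, not the SDK's Field or CarbonWriterBuilder classes, and the body of the real updateSchemaFields() is not shown in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for a schema field; not the CarbonData SDK class.
class SketchField {
  final String name;
  final String type;

  SketchField(String name, String type) {
    this.name = name; // stored as given; no per-constructor cleanup
    this.type = type;
  }
}

// Hypothetical stand-in for the single place where a builder fixes up names.
class SketchSchemaNormalizer {

  // Returns new fields whose names are trimmed and lowercased.
  static List<SketchField> normalize(List<SketchField> fields) {
    List<SketchField> result = new ArrayList<>(fields.size());
    for (SketchField field : fields) {
      String cleaned = field.name.trim().toLowerCase(Locale.ROOT);
      result.add(new SketchField(cleaned, field.type));
    }
    return result;
  }

  public static void main(String[] args) {
    List<SketchField> raw = new ArrayList<>();
    raw.add(new SketchField("Event_ID", "string"));
    raw.add(new SketchField("messagebody ", "varchar"));
    for (SketchField field : normalize(raw)) {
      // Brackets make any remaining whitespace visible.
      System.out.println("[" + field.name + "]");
    }
  }
}
```

Keeping the cleanup in one method means a newly added constructor cannot bypass it, and the rule applies to every column type rather than only to varchar columns.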
Github user ajantha-bhat commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2988#discussion_r242161063

    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/createTable/TestNonTransactionalCarbonTable.scala ---
    @@ -2490,6 +2490,54 @@ class TestNonTransactionalCarbonTable extends QueryTest with BeforeAndAfterAll {
         FileUtils.deleteDirectory(new File(writerPath))
       }

    +  test("check varchar with trailing space") {
    --- End diff --

    No need to duplicate test cases. In the existing varchar columns test case, add a trailing space to one of the columns.

---
Github user ajantha-bhat commented on the issue:

    https://github.com/apache/carbondata/pull/2988

    @Shubh18s: Why is this only for varchar columns? How is it handled for other columns? I guess this problem exists for other columns as well.

---
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2988

    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1801/

---