GitHub user manishnalla1994 opened a pull request:

https://github.com/apache/carbondata/pull/2993

[WIP] Map data load failure

Problem: Data load fails for an Insert into Select from the same table when the table contains a Map datatype column.

Solution: The Map type was not handled for this scenario. It is handled now.

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:

- [ ] Any interfaces changed?
- [ ] Any backward compatibility impacted?
- [ ] Document update required?
- [x] Testing done
  Please provide details on
  - Whether new unit test cases have been added or why no new tests are required?
  - How it is tested? Please attach test report.
  - Is it a performance related change? Please attach the performance test report.
  - Any additional information to help reviewers in testing this change.
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/manishnalla1994/carbondata MapDataLoadFailure

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2993.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2993

----
commit 69010850cccfa4e82f1b70ea954454ff1af1a61a
Author: manishnalla1994 <manish.nalla1994@...>
Date: 2018-12-07T09:25:58Z

    Delimiters changed

commit de2603041e2a123af1ae403289d2eed7f7c7c24a
Author: manishnalla1994 <manish.nalla1994@...>
Date: 2018-10-16T09:48:08Z

    MapDDLSupport

commit e35126868e563971c11dcbe82e200adc476f7143
Author: manishnalla1994 <manish.nalla1994@...>
Date: 2018-12-14T11:50:15Z

    Change of Function for all Delimiters

----

---
GitHub user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2993

Build Success with Spark 2.1.0. Please check CI: http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1791/

---
In reply to this post by qiuchenjian-2
GitHub user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2993

Build Success with Spark 2.1.0. Please check CI: http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1792/

---
GitHub user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2993

Build Success with Spark 2.2.1. Please check CI: http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2004/

---
GitHub user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2993

Build Success with Spark 2.3.2. Please check CI: http://136.243.101.176:8080/job/carbondataprbuilder2.3/10052/

---
GitHub user qiuchenjian commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2993#discussion_r242435030

--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/createTable/TestCreateDDLForComplexMapType.scala ---
@@ -442,4 +441,45 @@ class TestCreateDDLForComplexMapType extends QueryTest with BeforeAndAfterAll {
         "sort_columns is unsupported for map datatype column: mapfield"))
   }

+  test("Data Load Fail Issue") {
+    sql("DROP TABLE IF EXISTS carbon")
+    sql(
+      s"""
+         | CREATE TABLE carbon(
+         | mapField map<INT,STRING>
+         | )
+         | STORED BY 'carbondata'
+         | """
+        .stripMargin)
+    sql(
+      s"""
+         | LOAD DATA LOCAL INPATH '$path'
+         | INTO TABLE carbon OPTIONS(
+         | 'header' = 'false')
+       """.stripMargin)
+    sql("INSERT INTO carbon SELECT * FROM carbon")
+    checkAnswer(sql("select * from carbon"), Seq(
+      Row(Map(1 -> "Nalla", 2 -> "Singh", 4 -> "Kumar")),
+      Row(Map(1 -> "Nalla", 2 -> "Singh", 4 -> "Kumar")),
+      Row(Map(10 -> "Nallaa", 20 -> "Sissngh", 100 -> "Gusspta", 40 -> "Kumar")),
+      Row(Map(10 -> "Nallaa", 20 -> "Sissngh", 100 -> "Gusspta", 40 -> "Kumar"))
+    ))
+  }
+
+  test("Struct inside map") {
+    sql("DROP TABLE IF EXISTS carbon")
--- End diff --

Why is there no result check for the "Struct inside map" test case?

---
GitHub user qiuchenjian commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2993#discussion_r242436603

--- Diff: streaming/src/main/scala/org/apache/carbondata/streaming/parser/FieldConverter.scala ---
@@ -66,30 +65,57 @@ object FieldConverter {
       case bs: Array[Byte] => new String(bs,
         Charset.forName(CarbonCommonConstants.DEFAULT_CHARSET))
       case s: scala.collection.Seq[Any] =>
-        val delimiter = if (level == 1) {
-          delimiterLevel1
-        } else {
-          delimiterLevel2
-        }
+        val delimiter = complexDelimiters.get((level))
--- End diff --

Line 68 and line 83 should use a uniform style.

---
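For readers following the diff above, here is a minimal sketch of the idea behind replacing the `if (level == 1)` branch with a list lookup: each nesting level of a complex value picks its delimiter by index. This is not the code from the pull request. The object name `DelimiterSketch`, the method `flatten`, the delimiter values, the 0-based level numbering, and the way a map consumes two levels (one between entries, one between key and value) are all assumptions made for illustration.

```scala
object DelimiterSketch {
  // Hypothetical delimiters, one per nesting level (index 0 = outermost).
  // These are illustrative values, not CarbonData's defaults.
  val complexDelimiters: java.util.List[String] =
    java.util.Arrays.asList("$", ":", "@", "#")

  // Flatten a nested value into one string, joining the elements found at
  // each nesting level with the delimiter stored at that index.
  def flatten(value: Any, level: Int = 0): String = value match {
    case s: Seq[_] =>
      val delimiter = complexDelimiters.get(level)
      s.map(elem => flatten(elem, level + 1)).mkString(delimiter)
    case m: Map[_, _] =>
      // Assumption: a map uses two levels, one to separate entries and the
      // next one to separate a key from its value.
      val entryDelimiter = complexDelimiters.get(level)
      val keyValueDelimiter = complexDelimiters.get(level + 1)
      m.toSeq.map { case (k, v) =>
        flatten(k, level + 2) + keyValueDelimiter + flatten(v, level + 2)
      }.mkString(entryDelimiter)
    case other => other.toString
  }

  def main(args: Array[String]): Unit = {
    println(flatten(Seq(Seq(1, 2), Seq(3))))          // prints 1:2$3
    println(flatten(Map(1 -> "Nalla", 2 -> "Singh"))) // prints 1:Nalla$2:Singh
  }
}
```

With an indexed list, supporting a deeper nesting level only means appending one more delimiter, whereas the removed `if`/`else` could distinguish two levels only.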
GitHub user manishnalla1994 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2993#discussion_r242448914

--- Diff: streaming/src/main/scala/org/apache/carbondata/streaming/parser/FieldConverter.scala ---
@@ -66,30 +65,57 @@ object FieldConverter {
       case bs: Array[Byte] => new String(bs,
         Charset.forName(CarbonCommonConstants.DEFAULT_CHARSET))
       case s: scala.collection.Seq[Any] =>
-        val delimiter = if (level == 1) {
-          delimiterLevel1
-        } else {
-          delimiterLevel2
-        }
+        val delimiter = complexDelimiters.get((level))
--- End diff --

Done

---
GitHub user manishnalla1994 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2993#discussion_r242448934

--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/createTable/TestCreateDDLForComplexMapType.scala ---
@@ -442,4 +441,45 @@ class TestCreateDDLForComplexMapType extends QueryTest with BeforeAndAfterAll {
         "sort_columns is unsupported for map datatype column: mapfield"))
   }

+  test("Data Load Fail Issue") {
+    sql("DROP TABLE IF EXISTS carbon")
+    sql(
+      s"""
+         | CREATE TABLE carbon(
+         | mapField map<INT,STRING>
+         | )
+         | STORED BY 'carbondata'
+         | """
+        .stripMargin)
+    sql(
+      s"""
+         | LOAD DATA LOCAL INPATH '$path'
+         | INTO TABLE carbon OPTIONS(
+         | 'header' = 'false')
+       """.stripMargin)
+    sql("INSERT INTO carbon SELECT * FROM carbon")
+    checkAnswer(sql("select * from carbon"), Seq(
+      Row(Map(1 -> "Nalla", 2 -> "Singh", 4 -> "Kumar")),
+      Row(Map(1 -> "Nalla", 2 -> "Singh", 4 -> "Kumar")),
+      Row(Map(10 -> "Nallaa", 20 -> "Sissngh", 100 -> "Gusspta", 40 -> "Kumar")),
+      Row(Map(10 -> "Nallaa", 20 -> "Sissngh", 100 -> "Gusspta", 40 -> "Kumar"))
+    ))
+  }
+
+  test("Struct inside map") {
+    sql("DROP TABLE IF EXISTS carbon")
--- End diff --

Done

---
GitHub user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2993

Build Success with Spark 2.1.0. Please check CI: http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1821/

---
GitHub user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2993

Build Failed with Spark 2.3.2. Please check CI: http://136.243.101.176:8080/job/carbondataprbuilder2.3/10079/

---
GitHub user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2993

Build Success with Spark 2.1.0. Please check CI: http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1824/

---
GitHub user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2993

Build Success with Spark 2.2.1. Please check CI: http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2032/

---
GitHub user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2993

Build Success with Spark 2.3.2. Please check CI: http://136.243.101.176:8080/job/carbondataprbuilder2.3/10083/

---
GitHub user kumarvishal09 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2993#discussion_r243227160

--- Diff: streaming/src/main/scala/org/apache/carbondata/streaming/parser/RowStreamParserImp.scala ---
@@ -53,19 +54,21 @@ class RowStreamParserImp extends CarbonStreamParser {
       this.configuration.get(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT))
     this.dateFormat = new SimpleDateFormat(
       this.configuration.get(CarbonCommonConstants.CARBON_DATE_FORMAT))
-    this.complexDelimiterLevel1 = this.configuration.get("carbon_complex_delimiter_level_1")
-    this.complexDelimiterLevel2 = this.configuration.get("carbon_complex_delimiter_level_2")
-    this.complexDelimiterLevel3 = this.configuration.get("carbon_complex_delimiter_level_3")
+    this.complexDelimiters.add(this.configuration.get("carbon_complex_delimiter_level_1"))
+    this.complexDelimiters.add(this.configuration.get("carbon_complex_delimiter_level_2"))
+    this.complexDelimiters.add(this.configuration.get("carbon_complex_delimiter_level_3"))
+    this.complexDelimiters.add(ComplexDelimitersEnum.COMPLEX_DELIMITERS_LEVEL_4.value())
     this.serializationNullFormat =
       this.configuration.get(DataLoadProcessorConstants.SERIALIZATION_NULL_FORMAT)
   }

   override def parserRow(value: InternalRow): Array[Object] = {
     this.encoder.fromRow(value).toSeq.map {
       x => {
         FieldConverter.objectToString(
-          x, serializationNullFormat, complexDelimiterLevel1, complexDelimiterLevel2,
+          x, serializationNullFormat, complexDelimiters,
           timeStampFormat, dateFormat)
-      } }.toArray
+      }
--- End diff --

Please format it correctly.

---
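As a rough illustration of the change in this hunk, the three separate delimiter fields become one ordered list, with a fixed fourth-level delimiter appended at the end. The sketch below only mirrors that shape. A plain Scala `Map` stands in for the parser's configuration object, and `DelimiterConfigSketch`, `loadDelimiters`, and `AssumedLevel4Delimiter` are hypothetical names. The value of `ComplexDelimitersEnum.COMPLEX_DELIMITERS_LEVEL_4` does not appear in the thread, so the one used here is made up.

```scala
import java.util.{ArrayList => JArrayList}

object DelimiterConfigSketch {
  // Made-up stand-in for ComplexDelimitersEnum.COMPLEX_DELIMITERS_LEVEL_4.value();
  // the real value is not shown in the thread.
  val AssumedLevel4Delimiter = "#"

  // Collect the per-level delimiters into one list, in level order.
  // Throws NoSuchElementException if one of the three keys is missing.
  def loadDelimiters(conf: Map[String, String]): JArrayList[String] = {
    val delimiters = new JArrayList[String]()
    (1 to 3).foreach { level =>
      delimiters.add(conf(s"carbon_complex_delimiter_level_$level"))
    }
    delimiters.add(AssumedLevel4Delimiter)
    delimiters
  }

  def main(args: Array[String]): Unit = {
    // Illustrative configuration values only.
    val conf = Map(
      "carbon_complex_delimiter_level_1" -> "$",
      "carbon_complex_delimiter_level_2" -> ":",
      "carbon_complex_delimiter_level_3" -> "@")
    println(loadDelimiters(conf)) // prints [$, :, @, #]
  }
}
```

Passing a single list also shortens the `objectToString` call, which is the other half of the change shown in the diff.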
GitHub user kumarvishal09 commented on the issue:

https://github.com/apache/carbondata/pull/2993

@manishnalla1994 Please correct the format and add comments.

---
GitHub user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2993

Build Success with Spark 2.1.0. Please check CI: http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1876/

---
GitHub user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2993

Build Success with Spark 2.2.1. Please check CI: http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2086/

---
GitHub user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2993

Build Success with Spark 2.3.2. Please check CI: http://136.243.101.176:8080/job/carbondataprbuilder2.3/10131/

---
GitHub user kumarvishal09 commented on the issue:

https://github.com/apache/carbondata/pull/2993

LGTM

---