[GitHub] carbondata pull request #2993: [WIP] Map data load failure

[GitHub] carbondata pull request #2993: [WIP] Map data load failure

GitHub user manishnalla1994 opened a pull request:

https://github.com/apache/carbondata/pull/2993

[WIP] Map data load failure

Problem: Data load fails for "INSERT INTO ... SELECT" from the same table when the table contains a Map datatype column.

Solution: The Map type was not handled in this scenario. It is handled now.

Be sure to do all of the following checklist to help us incorporate
your contribution quickly and easily:

- [ ] Any interfaces changed?

- [ ] Any backward compatibility impacted?

- [ ] Document update required?

- [x] Testing done
Please provide details on
- Whether new unit test cases have been added or why no new tests are required?
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance test report.
- Any additional information to help reviewers in testing this change.

- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/manishnalla1994/carbondata MapDataLoadFailure

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2993.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2993

----
commit 69010850cccfa4e82f1b70ea954454ff1af1a61a
Author: manishnalla1994 <manish.nalla1994@...>
Date: 2018-12-07T09:25:58Z

Delimiters changed

commit de2603041e2a123af1ae403289d2eed7f7c7c24a
Author: manishnalla1994 <manish.nalla1994@...>
Date: 2018-10-16T09:48:08Z

MapDDLSupport

commit e35126868e563971c11dcbe82e200adc476f7143
Author: manishnalla1994 <manish.nalla1994@...>
Date: 2018-12-14T11:50:15Z

Change of Function for all Delimiters

----

---

[GitHub] carbondata issue #2993: [WIP] Map data load failure

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2993

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1791/

---

[GitHub] carbondata issue #2993: [WIP] Map data load failure

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2993

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1792/

---

[GitHub] carbondata issue #2993: [WIP] Map data load failure

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2993

Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2004/

---

[GitHub] carbondata issue #2993: [WIP] Map data load failure

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2993

Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10052/

---

[GitHub] carbondata pull request #2993: [CARBONDATA-3179] Map data load failure

In reply to this post by qiuchenjian-2

Github user qiuchenjian commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2993#discussion_r242435030

--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/createTable/TestCreateDDLForComplexMapType.scala ---
@@ -442,4 +441,45 @@ class TestCreateDDLForComplexMapType extends QueryTest with BeforeAndAfterAll {
         "sort_columns is unsupported for map datatype column: mapfield"))
   }

+  test("Data Load Fail Issue") {
+    sql("DROP TABLE IF EXISTS carbon")
+    sql(
+      s"""
+         | CREATE TABLE carbon(
+         | mapField map<INT,STRING>
+         | )
+         | STORED BY 'carbondata'
+         | """
+        .stripMargin)
+    sql(
+      s"""
+         | LOAD DATA LOCAL INPATH '$path'
+         | INTO TABLE carbon OPTIONS(
+         | 'header' = 'false')
+       """.stripMargin)
+    sql("INSERT INTO carbon SELECT * FROM carbon")
+    checkAnswer(sql("select * from carbon"), Seq(
+      Row(Map(1 -> "Nalla", 2 -> "Singh", 4 -> "Kumar")),
+      Row(Map(1 -> "Nalla", 2 -> "Singh", 4 -> "Kumar")),
+      Row(Map(10 -> "Nallaa", 20 -> "Sissngh", 100 -> "Gusspta", 40 -> "Kumar")),
+      Row(Map(10 -> "Nallaa", 20 -> "Sissngh", 100 -> "Gusspta", 40 -> "Kumar"))
+    ))
+  }
+
+  test("Struct inside map") {
+    sql("DROP TABLE IF EXISTS carbon")
--- End diff --

Why is there no result check for the "Struct inside map" test case?

---

[GitHub] carbondata pull request #2993: [CARBONDATA-3179] Map data load failure

In reply to this post by qiuchenjian-2

Github user qiuchenjian commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2993#discussion_r242436603

--- Diff: streaming/src/main/scala/org/apache/carbondata/streaming/parser/FieldConverter.scala ---
@@ -66,30 +65,57 @@ object FieldConverter {
         case bs: Array[Byte] => new String(bs,
           Charset.forName(CarbonCommonConstants.DEFAULT_CHARSET))
         case s: scala.collection.Seq[Any] =>
-          val delimiter = if (level == 1) {
-            delimiterLevel1
-          } else {
-            delimiterLevel2
-          }
+          val delimiter = complexDelimiters.get((level))
--- End diff --

line 68 and line 83 should use uniform style
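
For readers following the change under discussion: the diff replaces the two-way `if (level == 1)` choice with a lookup into a list of delimiters indexed by nesting level. The sketch below illustrates that idea only and is not the actual `FieldConverter.objectToString`. The names `DelimiterSketch` and `flatten`, the zero-based `level`, the way maps are joined (entries at one level, key and value at the next), and the sample delimiters are all assumptions made for illustration.

```scala
import java.util.{List => JList}

// Hypothetical stand-in for the converter discussed above; not CarbonData code.
object DelimiterSketch {

  // Flattens a nested value into one delimited string.
  // `level` is assumed to be a zero-based index into `delimiters`.
  // Null handling and bounds checks are omitted to keep the sketch short.
  def flatten(value: Any, delimiters: JList[String], level: Int = 0): String = value match {
    case m: scala.collection.Map[_, _] =>
      // Assumption: entries are joined at `level`, key and value at `level + 1`.
      val entryDelimiter = delimiters.get(level)
      val keyValueDelimiter = delimiters.get(level + 1)
      m.map { case (k, v) =>
        flatten(k, delimiters, level + 2) + keyValueDelimiter +
          flatten(v, delimiters, level + 2)
      }.mkString(entryDelimiter)
    case s: scala.collection.Seq[_] =>
      // One lookup replaces the per-level if/else chain.
      val delimiter = delimiters.get(level)
      s.map(flatten(_, delimiters, level + 1)).mkString(delimiter)
    case other =>
      other.toString
  }
}
```

With an illustrative delimiter list of `$`, `:` and `@`, calling `DelimiterSketch.flatten(Seq(Seq(1, "a"), Seq(2, "b")), delimiters)` gives `1:a$2:b`. Supporting a deeper nesting level then only requires adding one more entry to the list.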

---

[GitHub] carbondata pull request #2993: [CARBONDATA-3179] Map data load failure

In reply to this post by qiuchenjian-2

Github user manishnalla1994 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2993#discussion_r242448914

--- Diff: streaming/src/main/scala/org/apache/carbondata/streaming/parser/FieldConverter.scala ---
@@ -66,30 +65,57 @@ object FieldConverter {
         case bs: Array[Byte] => new String(bs,
           Charset.forName(CarbonCommonConstants.DEFAULT_CHARSET))
         case s: scala.collection.Seq[Any] =>
-          val delimiter = if (level == 1) {
-            delimiterLevel1
-          } else {
-            delimiterLevel2
-          }
+          val delimiter = complexDelimiters.get((level))
--- End diff --

Done

---

[GitHub] carbondata pull request #2993: [CARBONDATA-3179] Map data load failure

In reply to this post by qiuchenjian-2

Github user manishnalla1994 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2993#discussion_r242448934

--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/createTable/TestCreateDDLForComplexMapType.scala ---
@@ -442,4 +441,45 @@ class TestCreateDDLForComplexMapType extends QueryTest with BeforeAndAfterAll {
         "sort_columns is unsupported for map datatype column: mapfield"))
   }

+  test("Data Load Fail Issue") {
+    sql("DROP TABLE IF EXISTS carbon")
+    sql(
+      s"""
+         | CREATE TABLE carbon(
+         | mapField map<INT,STRING>
+         | )
+         | STORED BY 'carbondata'
+         | """
+        .stripMargin)
+    sql(
+      s"""
+         | LOAD DATA LOCAL INPATH '$path'
+         | INTO TABLE carbon OPTIONS(
+         | 'header' = 'false')
+       """.stripMargin)
+    sql("INSERT INTO carbon SELECT * FROM carbon")
+    checkAnswer(sql("select * from carbon"), Seq(
+      Row(Map(1 -> "Nalla", 2 -> "Singh", 4 -> "Kumar")),
+      Row(Map(1 -> "Nalla", 2 -> "Singh", 4 -> "Kumar")),
+      Row(Map(10 -> "Nallaa", 20 -> "Sissngh", 100 -> "Gusspta", 40 -> "Kumar")),
+      Row(Map(10 -> "Nallaa", 20 -> "Sissngh", 100 -> "Gusspta", 40 -> "Kumar"))
+    ))
+  }
+
+  test("Struct inside map") {
+    sql("DROP TABLE IF EXISTS carbon")
--- End diff --

Done

---

[GitHub] carbondata issue #2993: [CARBONDATA-3179] Map data load failure

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2993

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1821/

---

[GitHub] carbondata issue #2993: [CARBONDATA-3179] Map Data Load Failure and Struct P...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2993

Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10079/

---

[GitHub] carbondata issue #2993: [CARBONDATA-3179] Map Data Load Failure and Struct P...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2993

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1824/

---

[GitHub] carbondata issue #2993: [CARBONDATA-3179] Map Data Load Failure and Struct P...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2993

Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2032/

---

[GitHub] carbondata issue #2993: [CARBONDATA-3179] Map Data Load Failure and Struct P...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2993

Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10083/

---

[GitHub] carbondata pull request #2993: [CARBONDATA-3179] Map Data Load Failure and S...

In reply to this post by qiuchenjian-2

Github user kumarvishal09 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2993#discussion_r243227160

--- Diff: streaming/src/main/scala/org/apache/carbondata/streaming/parser/RowStreamParserImp.scala ---
@@ -53,19 +54,21 @@ class RowStreamParserImp extends CarbonStreamParser {
       this.configuration.get(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT))
     this.dateFormat = new SimpleDateFormat(
       this.configuration.get(CarbonCommonConstants.CARBON_DATE_FORMAT))
-    this.complexDelimiterLevel1 = this.configuration.get("carbon_complex_delimiter_level_1")
-    this.complexDelimiterLevel2 = this.configuration.get("carbon_complex_delimiter_level_2")
-    this.complexDelimiterLevel3 = this.configuration.get("carbon_complex_delimiter_level_3")
+    this.complexDelimiters.add(this.configuration.get("carbon_complex_delimiter_level_1"))
+    this.complexDelimiters.add(this.configuration.get("carbon_complex_delimiter_level_2"))
+    this.complexDelimiters.add(this.configuration.get("carbon_complex_delimiter_level_3"))
+    this.complexDelimiters.add(ComplexDelimitersEnum.COMPLEX_DELIMITERS_LEVEL_4.value())
     this.serializationNullFormat =
       this.configuration.get(DataLoadProcessorConstants.SERIALIZATION_NULL_FORMAT)
   }

   override def parserRow(value: InternalRow): Array[Object] = {
     this.encoder.fromRow(value).toSeq.map { x => {
       FieldConverter.objectToString(
-        x, serializationNullFormat, complexDelimiterLevel1, complexDelimiterLevel2,
+        x, serializationNullFormat, complexDelimiters,
         timeStampFormat, dateFormat)
-    } }.toArray
+    }
--- End diff --

please format it correctly
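
As context for the hunk above: the change gathers the per-level delimiter settings into one ordered list, with the fourth level taken from a constant rather than from configuration. The sketch below shows that collection step in isolation, under stated assumptions. It uses a plain Scala `Map` in place of the real configuration object, and `DelimiterListSketch`, `build`, and the `level4` parameter are hypothetical names. Only the three configuration keys are taken from the diff.

```scala
import java.util.{ArrayList => JArrayList, List => JList}

// Hypothetical helper; the real code reads from the parser's configuration object.
object DelimiterListSketch {

  // Configuration keys as they appear in the diff above, in level order.
  private val LevelKeys = Seq(
    "carbon_complex_delimiter_level_1",
    "carbon_complex_delimiter_level_2",
    "carbon_complex_delimiter_level_3")

  // Builds the ordered delimiter list: index 0 holds level 1, index 1 holds level 2, etc.
  // `level4` stands in for the constant used by the real code; its value is not assumed here.
  def build(conf: Map[String, String], level4: String): JList[String] = {
    val delimiters = new JArrayList[String]()
    LevelKeys.foreach { key =>
      // Fail fast on a missing key instead of storing null in the list.
      delimiters.add(
        conf.getOrElse(key, throw new IllegalArgumentException(s"missing $key")))
    }
    delimiters.add(level4)
    delimiters
  }
}
```

Keeping the delimiters in list order means a caller can pass the whole list to the converter and pick the right entry by nesting level, instead of threading one argument per level through every call.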

---

[GitHub] carbondata issue #2993: [CARBONDATA-3179] Map Data Load Failure and Struct P...

In reply to this post by qiuchenjian-2

Github user kumarvishal09 commented on the issue:

https://github.com/apache/carbondata/pull/2993

@manishnalla1994 Please correct the format and add comments

---

[GitHub] carbondata issue #2993: [CARBONDATA-3179] Map Data Load Failure and Struct P...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2993

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1876/

---

[GitHub] carbondata issue #2993: [CARBONDATA-3179] Map Data Load Failure and Struct P...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2993

Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2086/

---

[GitHub] carbondata issue #2993: [CARBONDATA-3179] Map Data Load Failure and Struct P...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2993

Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10131/

---

[GitHub] carbondata issue #2993: [CARBONDATA-3179] Map Data Load Failure and Struct P...

In reply to this post by qiuchenjian-2

Github user kumarvishal09 commented on the issue:

https://github.com/apache/carbondata/pull/2993

LGTM

---
