[GitHub] carbondata pull request #2993: [WIP] Map data load failure


[GitHub] carbondata pull request #2993: [WIP] Map data load failure

qiuchenjian-2
GitHub user manishnalla1994 opened a pull request:

    https://github.com/apache/carbondata/pull/2993

    [WIP] Map data load failure

    Problem: Data load fails for Insert Into Select from the same table when the table contains a Map datatype column.
   
    Solution: The Map type was not handled for this scenario; it is handled now.
   
    Be sure to do all of the following checklist to help us incorporate
    your contribution quickly and easily:
   
     - [ ] Any interfaces changed?
     
     - [ ] Any backward compatibility impacted?
     
     - [ ] Document update required?
   
     - [x] Testing done
            Please provide details on
            - Whether new unit test cases have been added or why no new tests are required?
            - How it is tested? Please attach test report.
            - Is it a performance related change? Please attach the performance test report.
            - Any additional information to help reviewers in testing this change.
           
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
   


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/manishnalla1994/carbondata MapDataLoadFailure

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2993.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2993
   
----
commit 69010850cccfa4e82f1b70ea954454ff1af1a61a
Author: manishnalla1994 <manish.nalla1994@...>
Date:   2018-12-07T09:25:58Z

    Delimiters changed

commit de2603041e2a123af1ae403289d2eed7f7c7c24a
Author: manishnalla1994 <manish.nalla1994@...>
Date:   2018-10-16T09:48:08Z

    MapDDLSupport

commit e35126868e563971c11dcbe82e200adc476f7143
Author: manishnalla1994 <manish.nalla1994@...>
Date:   2018-12-14T11:50:15Z

    Change of Function for all Delimiters

----


---

[GitHub] carbondata issue #2993: [WIP] Map data load failure

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2993
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1791/



---

[GitHub] carbondata issue #2993: [WIP] Map data load failure

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2993
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1792/



---

[GitHub] carbondata issue #2993: [WIP] Map data load failure

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2993
 
    Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2004/



---

[GitHub] carbondata issue #2993: [WIP] Map data load failure

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2993
 
    Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10052/



---

[GitHub] carbondata pull request #2993: [CARBONDATA-3179] Map data load failure

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user qiuchenjian commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2993#discussion_r242435030
 
    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/createTable/TestCreateDDLForComplexMapType.scala ---
    @@ -442,4 +441,45 @@ class TestCreateDDLForComplexMapType extends QueryTest with BeforeAndAfterAll {
             "sort_columns is unsupported for map datatype column: mapfield"))
       }
     
    +  test("Data Load Fail Issue") {
    +    sql("DROP TABLE IF EXISTS carbon")
    +    sql(
    +      s"""
    +         | CREATE TABLE carbon(
    +         | mapField map<INT,STRING>
    +         | )
    +         | STORED BY 'carbondata'
    +         | """
    +        .stripMargin)
    +    sql(
    +      s"""
    +         | LOAD DATA LOCAL INPATH '$path'
    +         | INTO TABLE carbon OPTIONS(
    +         | 'header' = 'false')
    +       """.stripMargin)
    +    sql("INSERT INTO carbon SELECT * FROM carbon")
    +    checkAnswer(sql("select * from carbon"), Seq(
    +      Row(Map(1 -> "Nalla", 2 -> "Singh", 4 -> "Kumar")),
    +      Row(Map(1 -> "Nalla", 2 -> "Singh", 4 -> "Kumar")),
    +      Row(Map(10 -> "Nallaa", 20 -> "Sissngh", 100 -> "Gusspta", 40 -> "Kumar")),
    +      Row(Map(10 -> "Nallaa", 20 -> "Sissngh", 100 -> "Gusspta", 40 -> "Kumar"))
    +      ))
    +  }
    +
    +  test("Struct inside map") {
    +    sql("DROP TABLE IF EXISTS carbon")
    --- End diff --
   
    Why is there no result check for the "Struct inside map" test case?


---

[GitHub] carbondata pull request #2993: [CARBONDATA-3179] Map data load failure

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user qiuchenjian commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2993#discussion_r242436603
 
    --- Diff: streaming/src/main/scala/org/apache/carbondata/streaming/parser/FieldConverter.scala ---
    @@ -66,30 +65,57 @@ object FieldConverter {
             case bs: Array[Byte] => new String(bs,
               Charset.forName(CarbonCommonConstants.DEFAULT_CHARSET))
             case s: scala.collection.Seq[Any] =>
    -          val delimiter = if (level == 1) {
    -            delimiterLevel1
    -          } else {
    -            delimiterLevel2
    -          }
    +          val delimiter = complexDelimiters.get((level))
    --- End diff --
   
    Lines 68 and 83 should use a uniform style.
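
    For illustration, a minimal sketch of the level-indexed delimiter lookup that this diff moves to, replacing the two-way if/else. It is not the PR's implementation: the object name `DelimiterSketch`, the `flatten` helper, the delimiter values, the zero-based `level`, and the way map entries consume two delimiter levels are all assumptions made for the example.

```scala
import java.util.{ArrayList => JArrayList}

object DelimiterSketch {
  // Hypothetical delimiter values; in the PR they are read from the load configuration.
  val complexDelimiters = new JArrayList[String]()
  complexDelimiters.add("$")
  complexDelimiters.add(":")
  complexDelimiters.add("@")

  // Flattens a nested value into a delimited string.
  // Assumption: `level` is a zero-based index into complexDelimiters.
  def flatten(value: Any, level: Int = 0): String = value match {
    case m: scala.collection.Map[_, _] =>
      // Assumption: entries are separated at `level`, key and value at `level + 1`.
      val entryDelimiter = complexDelimiters.get(level)
      val keyValueDelimiter = complexDelimiters.get(level + 1)
      m.map { case (k, v) =>
        flatten(k, level + 2) + keyValueDelimiter + flatten(v, level + 2)
      }.mkString(entryDelimiter)
    case s: scala.collection.Seq[_] =>
      // One list lookup covers every nesting depth, so no per-level branches are needed.
      s.map(flatten(_, level + 1)).mkString(complexDelimiters.get(level))
    case other => String.valueOf(other)
  }
}
```

    With these assumed delimiters, `DelimiterSketch.flatten(Seq(Seq(1, 2), Seq(3)))` yields `1:2$3`. Nesting deeper than the list has delimiters would throw an `IndexOutOfBoundsException` in this sketch, which is one reason the PR appends a level 4 delimiter.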


---

[GitHub] carbondata pull request #2993: [CARBONDATA-3179] Map data load failure

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user manishnalla1994 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2993#discussion_r242448914
 
    --- Diff: streaming/src/main/scala/org/apache/carbondata/streaming/parser/FieldConverter.scala ---
    @@ -66,30 +65,57 @@ object FieldConverter {
             case bs: Array[Byte] => new String(bs,
               Charset.forName(CarbonCommonConstants.DEFAULT_CHARSET))
             case s: scala.collection.Seq[Any] =>
    -          val delimiter = if (level == 1) {
    -            delimiterLevel1
    -          } else {
    -            delimiterLevel2
    -          }
    +          val delimiter = complexDelimiters.get((level))
    --- End diff --
   
    Done


---

[GitHub] carbondata pull request #2993: [CARBONDATA-3179] Map data load failure

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user manishnalla1994 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2993#discussion_r242448934
 
    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/createTable/TestCreateDDLForComplexMapType.scala ---
    @@ -442,4 +441,45 @@ class TestCreateDDLForComplexMapType extends QueryTest with BeforeAndAfterAll {
             "sort_columns is unsupported for map datatype column: mapfield"))
       }
     
    +  test("Data Load Fail Issue") {
    +    sql("DROP TABLE IF EXISTS carbon")
    +    sql(
    +      s"""
    +         | CREATE TABLE carbon(
    +         | mapField map<INT,STRING>
    +         | )
    +         | STORED BY 'carbondata'
    +         | """
    +        .stripMargin)
    +    sql(
    +      s"""
    +         | LOAD DATA LOCAL INPATH '$path'
    +         | INTO TABLE carbon OPTIONS(
    +         | 'header' = 'false')
    +       """.stripMargin)
    +    sql("INSERT INTO carbon SELECT * FROM carbon")
    +    checkAnswer(sql("select * from carbon"), Seq(
    +      Row(Map(1 -> "Nalla", 2 -> "Singh", 4 -> "Kumar")),
    +      Row(Map(1 -> "Nalla", 2 -> "Singh", 4 -> "Kumar")),
    +      Row(Map(10 -> "Nallaa", 20 -> "Sissngh", 100 -> "Gusspta", 40 -> "Kumar")),
    +      Row(Map(10 -> "Nallaa", 20 -> "Sissngh", 100 -> "Gusspta", 40 -> "Kumar"))
    +      ))
    +  }
    +
    +  test("Struct inside map") {
    +    sql("DROP TABLE IF EXISTS carbon")
    --- End diff --
   
    Done


---

[GitHub] carbondata issue #2993: [CARBONDATA-3179] Map data load failure

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2993
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1821/



---

[GitHub] carbondata issue #2993: [CARBONDATA-3179] Map Data Load Failure and Struct P...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2993
 
    Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10079/



---

[GitHub] carbondata issue #2993: [CARBONDATA-3179] Map Data Load Failure and Struct P...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2993
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1824/



---

[GitHub] carbondata issue #2993: [CARBONDATA-3179] Map Data Load Failure and Struct P...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2993
 
    Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2032/



---

[GitHub] carbondata issue #2993: [CARBONDATA-3179] Map Data Load Failure and Struct P...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2993
 
    Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10083/



---

[GitHub] carbondata pull request #2993: [CARBONDATA-3179] Map Data Load Failure and S...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user kumarvishal09 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2993#discussion_r243227160
 
    --- Diff: streaming/src/main/scala/org/apache/carbondata/streaming/parser/RowStreamParserImp.scala ---
    @@ -53,19 +54,21 @@ class RowStreamParserImp extends CarbonStreamParser {
           this.configuration.get(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT))
         this.dateFormat = new SimpleDateFormat(
           this.configuration.get(CarbonCommonConstants.CARBON_DATE_FORMAT))
    -    this.complexDelimiterLevel1 = this.configuration.get("carbon_complex_delimiter_level_1")
    -    this.complexDelimiterLevel2 = this.configuration.get("carbon_complex_delimiter_level_2")
    -    this.complexDelimiterLevel3 = this.configuration.get("carbon_complex_delimiter_level_3")
    +    this.complexDelimiters.add(this.configuration.get("carbon_complex_delimiter_level_1"))
    +    this.complexDelimiters.add(this.configuration.get("carbon_complex_delimiter_level_2"))
    +    this.complexDelimiters.add(this.configuration.get("carbon_complex_delimiter_level_3"))
    +    this.complexDelimiters.add(ComplexDelimitersEnum.COMPLEX_DELIMITERS_LEVEL_4.value())
         this.serializationNullFormat =
           this.configuration.get(DataLoadProcessorConstants.SERIALIZATION_NULL_FORMAT)
       }
     
       override def parserRow(value: InternalRow): Array[Object] = {
         this.encoder.fromRow(value).toSeq.map { x => {
           FieldConverter.objectToString(
    -        x, serializationNullFormat, complexDelimiterLevel1, complexDelimiterLevel2,
    +        x, serializationNullFormat, complexDelimiters,
             timeStampFormat, dateFormat)
    -    } }.toArray
    +    }
    --- End diff --
   
    Please format it correctly.
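
    As a rough illustration of the formatting being asked for, the closure could be laid out with a single brace pair and the `.toArray` call on the closing line. This is only a sketch: `ParserRowFormatting`, `parseRow`, and the simplified two-argument `objectToString` are hypothetical stand-ins for the real `RowStreamParserImp` and `FieldConverter.objectToString`, which take more parameters.

```scala
object ParserRowFormatting {
  // Hypothetical stand-in for FieldConverter.objectToString, reduced to null handling.
  def objectToString(value: Any, nullFormat: String): String =
    if (value == null) nullFormat else value.toString

  // One brace pair for the closure, with .toArray chained on the closing brace.
  def parseRow(values: Seq[Any], nullFormat: String): Array[Object] = {
    values.map { x =>
      objectToString(x, nullFormat)
    }.toArray[Object]
  }
}
```

    The behaviour matches the nested `{ x => { ... } }` form; only the layout changes.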


---

[GitHub] carbondata issue #2993: [CARBONDATA-3179] Map Data Load Failure and Struct P...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user kumarvishal09 commented on the issue:

    https://github.com/apache/carbondata/pull/2993
 
    @manishnalla1994 Please correct the formatting and add comments.


---

[GitHub] carbondata issue #2993: [CARBONDATA-3179] Map Data Load Failure and Struct P...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2993
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1876/



---

[GitHub] carbondata issue #2993: [CARBONDATA-3179] Map Data Load Failure and Struct P...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2993
 
    Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2086/



---

[GitHub] carbondata issue #2993: [CARBONDATA-3179] Map Data Load Failure and Struct P...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2993
 
    Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10131/



---

[GitHub] carbondata issue #2993: [CARBONDATA-3179] Map Data Load Failure and Struct P...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user kumarvishal09 commented on the issue:

    https://github.com/apache/carbondata/pull/2993
 
    LGTM


---