[GitHub] carbondata pull request #3045: [CARBONDATA-3222]Fix dataload failure after c...


qiuchenjian-2
GitHub user shardul-cr7 opened a pull request:

    https://github.com/apache/carbondata/pull/3045

    [CARBONDATA-3222]Fix dataload failure after creation of preaggregate datamap on main table with long_string_columns

    This PR fixes the dataload failure that occurs after creating a preaggregate datamap on a main table with long_string_columns.
   
    The dataload fails because the child table properties are not updated to match the parent table's long_string_columns.
    This happens only when long_string_columns is not specified in the dmproperties of the preaggregate datamap: the datamap is created successfully, but the subsequent data load fails. This PR avoids the dataload failure in this scenario.
   
    Be sure to do all of the following checklist to help us incorporate
    your contribution quickly and easily:
   
     - [ ] Any interfaces changed?
     
     - [ ] Any backward compatibility impacted?
     
     - [ ] Document update required?
   
     - [x] Testing done
            added a testcase
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
   


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/shardul-cr7/carbondata lsc_preagg

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/3045.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3045
   
----
commit ed588b32ff95a451782c98cae991efd6d148b5c3
Author: shardul-cr7 <shardulsingh22@...>
Date:   2019-01-02T09:17:34Z

    [CARBONDATA-3222]Fix dataload failure after creation of preaggregate datamap on main table with long_string_columns

----


---

[GitHub] carbondata issue #3045: [CARBONDATA-3222]Fix dataload failure after creation...

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/3045
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2110/



---

[GitHub] carbondata issue #3045: [CARBONDATA-3222]Fix dataload failure after creation...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/3045
 
    Build Failed  with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10364/



---

[GitHub] carbondata issue #3045: [CARBONDATA-3222]Fix dataload failure after creation...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/3045
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2112/



---

[GitHub] carbondata issue #3045: [CARBONDATA-3222]Fix dataload failure after creation...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/3045
 
    Build Failed  with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10366/



---

[GitHub] carbondata issue #3045: [CARBONDATA-3222]Fix dataload failure after creation...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/3045
 
    Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2317/



---

[GitHub] carbondata issue #3045: [CARBONDATA-3222]Fix dataload failure after creation...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/3045
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2114/



---

[GitHub] carbondata issue #3045: [CARBONDATA-3222]Fix dataload failure after creation...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/3045
 
    Build Failed  with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10368/



---

[GitHub] carbondata issue #3045: [CARBONDATA-3222]Fix dataload failure after creation...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/3045
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2115/



---

[GitHub] carbondata pull request #3045: [CARBONDATA-3222]Fix dataload failure after c...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user qiuchenjian commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/3045#discussion_r244706639
 
    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/longstring/VarcharDataTypesBasicTestCase.scala ---
    @@ -333,6 +333,36 @@ class VarcharDataTypesBasicTestCase extends QueryTest with BeforeAndAfterEach wi
         sql(s"DROP DATAMAP IF EXISTS $datamapName ON TABLE $longStringTable")
       }
     
    +  test("creating datamap with long string column selected and loading data should be success") {
    +
    +    sql(s"drop table if exists $longStringTable")
    +    val datamapName = "pre_agg_dm"
    +    sql(
    +      s"""
    +         | CREATE TABLE if not exists $longStringTable(
    +         | id INT, name STRING, description STRING, address STRING, note STRING
    +         | ) STORED BY 'carbondata'
    +         | TBLPROPERTIES('LONG_STRING_COLUMNS'='description, note', 'SORT_COLUMNS'='name')
    +         |""".stripMargin)
    +
    +    sql(
    +      s"""
    +         | CREATE DATAMAP $datamapName ON TABLE $longStringTable
    +         | USING 'preaggregate'
    +         | AS SELECT id,description,note,count(*) FROM $longStringTable
    +         | GROUP BY id,description,note
    +         |""".
    +        stripMargin)
    +
    +    sql(
    +      s"""
    +         | LOAD DATA LOCAL INPATH '$inputFile' INTO TABLE $longStringTable
    +         | OPTIONS('header'='false')
    +       """.stripMargin)
    +
    +    sql(s"drop table if exists $longStringTable")
    --- End diff --
   
    Better to add an assert for this test case.


---

[GitHub] carbondata pull request #3045: [CARBONDATA-3222]Fix dataload failure after c...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user kumarvishal09 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/3045#discussion_r244707751
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/preaaggregate/PreAggregateTableHelper.scala ---
    @@ -126,6 +126,12 @@ case class PreAggregateTableHelper(
             newLongStringColumn.mkString(","))
         }
     
    +    //Add long_string_columns properties in child table from the parent.
    +    tableProperties
    --- End diff --
   
    @shardul-cr7 Here you are copying all the long string columns. A preaggregate datamap may not contain all the columns of the main table, so it is better to add the long string property only for the columns that are present in the preaggregate datamap.
    Example: main table long string columns: column1, column2, column3
    datamap1: column1 -> set the long string property for column1
    datamap2: count(column1) -> no need to set it; if any UDF or UDAF is applied to the column, this property does not need to be set
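
The rule suggested above can be sketched roughly as follows. This is only an illustration, not the code in PreAggregateTableHelper: the object name LongStringPropagation, the Projection case class, and the childLongStringColumns helper are all hypothetical, and the sketch assumes each datamap projection is already known to be either a plain column or a column wrapped in an aggregate, UDF, or UDAF.

```scala
// Hypothetical illustration only; none of these names exist in CarbonData.
object LongStringPropagation {

  // One selected expression of the datamap query.
  // isPlainColumn = false means the column is wrapped in an aggregate, UDF, or UDAF.
  final case class Projection(column: String, isPlainColumn: Boolean)

  // Returns the parent long string columns that should also be marked as
  // long string columns on the child (preaggregate) table.
  def childLongStringColumns(
      parentLongStringColumns: Seq[String],
      projections: Seq[Projection]): Seq[String] = {
    // Normalize names so that 'description, note' style property values match.
    val parentSet = parentLongStringColumns.map(_.trim.toLowerCase).toSet
    projections
      .filter(_.isPlainColumn)          // skip count(column1), udf(column1), ...
      .map(_.column.trim.toLowerCase)
      .filter(parentSet.contains)       // keep only columns that are long strings in the parent
      .distinct
  }
}
```

With the reviewer's example, a datamap selecting column1 directly would get column1 as its only long string column, while a datamap selecting only count(column1) would get none.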


---

[GitHub] carbondata issue #3045: [CARBONDATA-3222]Fix dataload failure after creation...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/3045
 
    Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2320/



---

[GitHub] carbondata issue #3045: [CARBONDATA-3222]Fix dataload failure after creation...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/3045
 
    Build Failed  with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10369/



---

[GitHub] carbondata issue #3045: [CARBONDATA-3222]Fix dataload failure after creation...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/3045
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2140/



---

[GitHub] carbondata issue #3045: [CARBONDATA-3222]Fix dataload failure after creation...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/3045
 
    Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2346/



---

[GitHub] carbondata issue #3045: [CARBONDATA-3222]Fix dataload failure after creation...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/3045
 
    Build Failed  with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10394/



---

[GitHub] carbondata issue #3045: [CARBONDATA-3222]Fix dataload failure after creation...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/3045
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2144/



---

[GitHub] carbondata issue #3045: [CARBONDATA-3222]Fix dataload failure after creation...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/3045
 
    Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2352/



---

[GitHub] carbondata issue #3045: [CARBONDATA-3222]Fix dataload failure after creation...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/3045
 
    Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10398/



---

[GitHub] carbondata pull request #3045: [CARBONDATA-3222]Fix dataload failure after c...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user shardul-cr7 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/3045#discussion_r244992035
 
    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/longstring/VarcharDataTypesBasicTestCase.scala ---
    @@ -333,6 +333,36 @@ class VarcharDataTypesBasicTestCase extends QueryTest with BeforeAndAfterEach wi
         sql(s"DROP DATAMAP IF EXISTS $datamapName ON TABLE $longStringTable")
       }
     
    +  test("creating datamap with long string column selected and loading data should be success") {
    +
    +    sql(s"drop table if exists $longStringTable")
    +    val datamapName = "pre_agg_dm"
    +    sql(
    +      s"""
    +         | CREATE TABLE if not exists $longStringTable(
    +         | id INT, name STRING, description STRING, address STRING, note STRING
    +         | ) STORED BY 'carbondata'
    +         | TBLPROPERTIES('LONG_STRING_COLUMNS'='description, note', 'SORT_COLUMNS'='name')
    +         |""".stripMargin)
    +
    +    sql(
    +      s"""
    +         | CREATE DATAMAP $datamapName ON TABLE $longStringTable
    +         | USING 'preaggregate'
    +         | AS SELECT id,description,note,count(*) FROM $longStringTable
    +         | GROUP BY id,description,note
    +         |""".
    +        stripMargin)
    +
    +    sql(
    +      s"""
    +         | LOAD DATA LOCAL INPATH '$inputFile' INTO TABLE $longStringTable
    +         | OPTIONS('header'='false')
    +       """.stripMargin)
    +
    +    sql(s"drop table if exists $longStringTable")
    --- End diff --
   
    Added!


---