GitHub user shardul-cr7 opened a pull request:
https://github.com/apache/carbondata/pull/3045

[CARBONDATA-3222]Fix dataload failure after creation of preaggregate datamap on main table with long_string_columns

This PR fixes the data load failure that occurs after a preaggregate datamap is created on a main table with long_string_columns. The load fails because the child table properties are not updated to match the parent table's long_string_columns. This happens only when long_string_columns is not specified in dmproperties for the preaggregate datamap: the datamap is created successfully, but the subsequent data load fails. This PR avoids the data load failure in this scenario.

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:

- [ ] Any interfaces changed?
- [ ] Any backward compatibility impacted?
- [ ] Document update required?
- [x] Testing done
      added a testcase
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/shardul-cr7/carbondata lsc_preagg

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/3045.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #3045

----
commit ed588b32ff95a451782c98cae991efd6d148b5c3
Author: shardul-cr7 <shardulsingh22@...>
Date: 2019-01-02T09:17:34Z

    [CARBONDATA-3222]Fix dataload failure after creation of preaggregate datamap on main table with long_string_columns

----

---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/3045 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2110/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/3045 Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10364/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/3045 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2112/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/3045 Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10366/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/3045 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2317/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/3045 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2114/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/3045 Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10368/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/3045 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2115/ --- |
Github user qiuchenjian commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/3045#discussion_r244706639

--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/longstring/VarcharDataTypesBasicTestCase.scala ---
@@ -333,6 +333,36 @@ class VarcharDataTypesBasicTestCase extends QueryTest with BeforeAndAfterEach wi
     sql(s"DROP DATAMAP IF EXISTS $datamapName ON TABLE $longStringTable")
   }

+  test("creating datamap with long string column selected and loading data should be success") {
+
+    sql(s"drop table if exists $longStringTable")
+    val datamapName = "pre_agg_dm"
+    sql(
+      s"""
+         | CREATE TABLE if not exists $longStringTable(
+         | id INT, name STRING, description STRING, address STRING, note STRING
+         | ) STORED BY 'carbondata'
+         | TBLPROPERTIES('LONG_STRING_COLUMNS'='description, note', 'SORT_COLUMNS'='name')
+         |""".stripMargin)
+
+    sql(
+      s"""
+         | CREATE DATAMAP $datamapName ON TABLE $longStringTable
+         | USING 'preaggregate'
+         | AS SELECT id,description,note,count(*) FROM $longStringTable
+         | GROUP BY id,description,note
+         |""".
+        stripMargin)
+
+    sql(
+      s"""
+         | LOAD DATA LOCAL INPATH '$inputFile' INTO TABLE $longStringTable
+         | OPTIONS('header'='false')
+       """.stripMargin)
+
+    sql(s"drop table if exists $longStringTable")
--- End diff --

Better to add an assert for this test case.

---
Github user kumarvishal09 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/3045#discussion_r244707751

--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/preaaggregate/PreAggregateTableHelper.scala ---
@@ -126,6 +126,12 @@ case class PreAggregateTableHelper(
         newLongStringColumn.mkString(","))
     }

+    //Add long_string_columns properties in child table from the parent.
+    tableProperties
--- End diff --

@shardul-cr7 Here you are copying all the long string columns. A preaggregate datamap may not contain all the columns of the main table, so it is better to add the long string property only for those columns that are present in the preaggregate datamap.

Example:
main table long string columns: column1, column2, column3
datamap1: column1 -> set the long string property for column1
datamap2: count(column1) -> no need to set it

If any UDF or UDAF is present, there is no need to set this property.

---
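The filtering rule suggested in this review comment can be sketched in a few lines of Scala. This is only an illustration under stated assumptions, not the code that was merged: `DatamapField`, `isPlainColumn` and `childLongStringColumns` are hypothetical names that do not come from CarbonData, and the real `PreAggregateTableHelper` works on its own field and property types. The idea is that the child table inherits a long string column only when the datamap selects that column directly, and not when it is wrapped in an aggregate, UDF or UDAF.

```scala
// Hypothetical model of one projection in a preaggregate datamap's SELECT list.
// `parentColumn` is the main-table column it reads; `isPlainColumn` is false when
// the column is wrapped in an aggregate, UDF or UDAF, for example count(column1).
final case class DatamapField(parentColumn: String, isPlainColumn: Boolean)

object LongStringColumnsSketch {

  /** Returns the parent long string columns that the child table should inherit.
    * Column names are compared case-insensitively after trimming whitespace.
    */
  def childLongStringColumns(
      parentLongStringColumns: Seq[String],
      datamapFields: Seq[DatamapField]): Seq[String] = {
    // Columns that the datamap selects directly, normalised for comparison.
    val plainColumns = datamapFields
      .filter(_.isPlainColumn)
      .map(_.parentColumn.trim.toLowerCase)
      .toSet
    // Keep the parent's ordering and drop everything the datamap does not select as-is.
    parentLongStringColumns
      .map(_.trim)
      .filter(column => plainColumns.contains(column.toLowerCase))
  }
}
```

With the reviewer's example, a datamap that selects `column1` directly would inherit only `column1`, while a datamap that selects `count(column1)` would inherit nothing. The resulting list could then be joined with `mkString(",")` before being written to the child table properties, in the same way the quoted diff handles `newLongStringColumn`.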
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/3045 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2320/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/3045 Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10369/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/3045 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2140/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/3045 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2346/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/3045 Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10394/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/3045 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2144/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/3045 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2352/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/3045 Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10398/ --- |
Github user shardul-cr7 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/3045#discussion_r244992035

--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/longstring/VarcharDataTypesBasicTestCase.scala ---
@@ -333,6 +333,36 @@ class VarcharDataTypesBasicTestCase extends QueryTest with BeforeAndAfterEach wi
     sql(s"DROP DATAMAP IF EXISTS $datamapName ON TABLE $longStringTable")
   }

+  test("creating datamap with long string column selected and loading data should be success") {
+
+    sql(s"drop table if exists $longStringTable")
+    val datamapName = "pre_agg_dm"
+    sql(
+      s"""
+         | CREATE TABLE if not exists $longStringTable(
+         | id INT, name STRING, description STRING, address STRING, note STRING
+         | ) STORED BY 'carbondata'
+         | TBLPROPERTIES('LONG_STRING_COLUMNS'='description, note', 'SORT_COLUMNS'='name')
+         |""".stripMargin)
+
+    sql(
+      s"""
+         | CREATE DATAMAP $datamapName ON TABLE $longStringTable
+         | USING 'preaggregate'
+         | AS SELECT id,description,note,count(*) FROM $longStringTable
+         | GROUP BY id,description,note
+         |""".
+        stripMargin)
+
+    sql(
+      s"""
+         | LOAD DATA LOCAL INPATH '$inputFile' INTO TABLE $longStringTable
+         | OPTIONS('header'='false')
+       """.stripMargin)
+
+    sql(s"drop table if exists $longStringTable")
--- End diff --

Added!

---