GitHub user manishgupta88 opened a pull request:
    https://github.com/apache/carbondata/pull/2294

    [CARBONDATA-2443][SDK] Multi level complex type support for AVRO based SDK

    **Problem:**
    Inferring the complex type schema fails for a boolean array type when the store was created using the SDK writer.

    **Analysis:**
    When we create an external table and infer the schema from a store created using the SDK writer, the operation fails because of a complex type field with a boolean array dataType.
    This is because, during schema creation by the SDK writer, a child with the column name val is added for array type children. During parsing, the logic to append the parent name to the child column name is missing for the boolean type, which causes this problem.

    **Solution:**
    Handle the parsing for the boolean type.

    Be sure to do all of the following checklist to help us incorporate
    your contribution quickly and easily:

     - [ ] Any interfaces changed?
     No
     - [ ] Any backward compatibility impacted?
     No
     - [ ] Document update required?
     No
     - [ ] Testing done
     Manually verified
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
     NA

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/manishgupta88/carbondata sdk_complex_type_boolean_fix

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2294.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2294

----
commit 6369c0d3bd2b773f17b5a3b61c2396f41c7e6933
Author: manishgupta88 <tomanishgupta18@...>
Date:   2018-05-10T11:39:17Z

    Problem:
    Problem inferring the complex type schema with boolean array type from the store created using SDK writer

    Analysis:
    When we create an external table and infer the schema from store created using SDK writer, the operation fails because of complex type field with boolean array dataType.
    This is because during schema creation by SDK writer, for array type children a child with column name val is added. While parsing the logic to append the parent name with child column name is missing for boolean type which is causing this problem.

    Solution:
    Handle the parsing for boolean type

----

---
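The naming issue described in the Analysis can be illustrated with a small, self-contained sketch. It is not the code from this pull request: the class `ChildColumnNamingSketch`, both method names, the string-based type tags, and the dot used as separator are all assumptions made for illustration. Only the placeholder child name `val` and the idea of prefixing it with the parent name come from the description above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; this is NOT CarbonData's TableSchemaBuilder.
class ChildColumnNamingSketch {

  // Placeholder child name mentioned in the PR description.
  static final String ARRAY_CHILD_NAME = "val";

  // Assumption: parent and child names are joined with a dot.
  static String qualify(String parentName, String childName) {
    return parentName + "." + childName;
  }

  // Returns the column names registered for an array column:
  // the parent itself, followed by its parent-qualified child.
  static List<String> columnsForArray(String parentName, String elementType) {
    List<String> names = new ArrayList<>();
    names.add(parentName);
    switch (elementType) {
      case "int":
      case "string":
      case "boolean": // the element type the PR says was not handled
        names.add(qualify(parentName, ARRAY_CHILD_NAME));
        break;
      default:
        throw new IllegalArgumentException("Unsupported element type: " + elementType);
    }
    return names;
  }

  public static void main(String[] args) {
    // Prints the parent name and the qualified child name for a boolean array.
    System.out.println(columnsForArray("flags", "boolean"));
  }
}
```

In this sketch, leaving `"boolean"` out of the `switch` would send boolean arrays down the failure path, which is one plausible way the missing handling described above could surface.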
Github user CarbonDataQA commented on the issue:
    https://github.com/apache/carbondata/pull/2294

    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4649/

---
Github user sounakr commented on a diff in the pull request:
    https://github.com/apache/carbondata/pull/2294#discussion_r187337029

    --- Diff: core/src/main/java/org/apache/carbondata/core/metadata/schema/table/TableSchemaBuilder.java ---
    @@ -114,12 +115,12 @@ public void setSortColumns(List<ColumnSchema> sortColumns) {
         this.sortColumns = sortColumns;
       }

    -  public ColumnSchema addColumn(StructField field, boolean isSortColumn) {
    -    return addColumn(field, null, isSortColumn, false);
    +  public ColumnSchema addColumn(StructField field, AtomicInteger valIndex, boolean isSortColumn) {
    --- End diff --

    Do we need an AtomicInteger? This is single threaded. In a later phase, even if the writer becomes multi-threaded, building the schema and the CarbonTable should remain single threaded.

---
Github user CarbonDataQA commented on the issue:
    https://github.com/apache/carbondata/pull/2294

    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5806/

---
Github user manishgupta88 commented on a diff in the pull request:
    https://github.com/apache/carbondata/pull/2294#discussion_r187343421

    --- Diff: core/src/main/java/org/apache/carbondata/core/metadata/schema/table/TableSchemaBuilder.java ---
    @@ -114,12 +115,12 @@ public void setSortColumns(List<ColumnSchema> sortColumns) {
         this.sortColumns = sortColumns;
       }

    -  public ColumnSchema addColumn(StructField field, boolean isSortColumn) {
    -    return addColumn(field, null, isSortColumn, false);
    +  public ColumnSchema addColumn(StructField field, AtomicInteger valIndex, boolean isSortColumn) {
    --- End diff --

    AtomicInteger is not used here for multi-threading. It is used to increment a value for the complex boolean array type in order to assign the child column name.

---
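The reply above uses `AtomicInteger` as a mutable counter shared across calls, not as a concurrency primitive. A minimal sketch of that pattern follows. The class `ValIndexSketch`, the `Field` stand-in, the `collect` method, and the scheme of appending the counter value to the leaf name are all hypothetical and are not taken from the pull request; only the parameter name `valIndex` appears in the diff.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; names and the numbering scheme are illustrative only.
class ValIndexSketch {

  // Minimal stand-in for a nested field: a name plus optional children.
  static final class Field {
    final String name;
    final List<Field> children;

    Field(String name, Field... children) {
      this.name = name;
      this.children = Arrays.asList(children);
    }
  }

  // Walks the tree depth-first. Each leaf takes the next value from the
  // shared counter, so numbering continues across recursive calls.
  static void collect(Field field, AtomicInteger valIndex, List<String> out) {
    if (field.children.isEmpty()) {
      out.add(field.name + valIndex.getAndIncrement());
      return;
    }
    for (Field child : field.children) {
      collect(child, valIndex, out);
    }
  }

  public static void main(String[] args) {
    Field root = new Field("arr", new Field("val"), new Field("inner", new Field("val")));
    List<String> names = new ArrayList<>();
    collect(root, new AtomicInteger(0), names);
    System.out.println(names);
  }
}
```

A plain `int` parameter would be copied on every call, so increments made deeper in the recursion would be lost when the call returns. A mutable holder such as `AtomicInteger` (or a one-element `int[]`) avoids that, which is consistent with the single-threaded use described in the reply.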
Github user ravipesala commented on the issue:
    https://github.com/apache/carbondata/pull/2294

    LGTM

---
Github user ravipesala commented on the issue:
    https://github.com/apache/carbondata/pull/2294

    SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4858/

---