GitHub user xuchuanyin opened a pull request:
https://github.com/apache/carbondata/pull/2252

WIP: Support string longer than 32000 characters

Add a table property, 'long_string_columns', at table creation to support string columns that contain more than 32000 characters. Internally, CarbonData uses an integer instead of a short to store the length of the byte content.

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:

- [ ] Any interfaces changed?
- [ ] Any backward compatibility impacted?
- [ ] Document update required?
- [ ] Testing done
  Please provide details on
  - Whether new unit test cases have been added or why no new tests are required?
  - How it is tested? Please attach test report.
  - Is it a performance related change? Please attach the performance test report.
  - Any additional information to help reviewers in testing this change.
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xuchuanyin/carbondata 0428_string_longer_than_32000

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2252.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2252

----
commit 26cbea7f1493d204a1abb4275e052481abccd185
Author: xuchuanyin <xuchuanyin@...>
Date: 2018-04-30T15:53:22Z

    Support string longer than 32000 characters

    Add a table property, 'long_string_columns', at table creation to support string columns that contain more than 32000 characters. Internally, CarbonData uses an integer instead of a short to store the length of the byte content.
----

---
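The change from a short to an integer length can be illustrated with a small length-value encoding sketch. This is not CarbonData code: the class and method names below are hypothetical, and it assumes the limit comes from a signed 2-byte length prefix, whose maximum in Java is 32767 bytes (the 32000 figure is the one quoted in the pull request).

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical sketch, not CarbonData's implementation.
final class LengthPrefixSketch {

  /** Length-value encoding with a 2-byte length; rejects values that do not fit. */
  static byte[] encodeWithShortLength(byte[] value) {
    if (value.length > Short.MAX_VALUE) {
      throw new IllegalArgumentException(
          value.length + " bytes do not fit a 2-byte length prefix");
    }
    return ByteBuffer.allocate(Short.BYTES + value.length)
        .putShort((short) value.length)
        .put(value)
        .array();
  }

  /** Same layout, but with a 4-byte length so longer values can be stored. */
  static byte[] encodeWithIntLength(byte[] value) {
    return ByteBuffer.allocate(Integer.BYTES + value.length)
        .putInt(value.length)
        .put(value)
        .array();
  }

  /** Reads back a value written by encodeWithIntLength. */
  static byte[] decodeWithIntLength(byte[] encoded) {
    ByteBuffer buffer = ByteBuffer.wrap(encoded);
    byte[] value = new byte[buffer.getInt()];
    buffer.get(value);
    return value;
  }

  public static void main(String[] args) {
    byte[] longValue = new byte[40_000];
    Arrays.fill(longValue, (byte) 'a');

    // The 4-byte prefix round-trips a value longer than Short.MAX_VALUE bytes.
    byte[] encoded = encodeWithIntLength(longValue);
    System.out.println(Arrays.equals(longValue, decodeWithIntLength(encoded)));

    // The 2-byte prefix cannot represent the same value.
    try {
      encodeWithShortLength(longValue);
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

The cost of the wider prefix is two extra bytes per value, which is presumably why the pull request limits it to the columns named in the property rather than applying it to every string column.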
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2252 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5546/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2252 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4383/ --- |
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2252 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4641/ --- |
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2252 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4642/ --- |
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2252 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4644/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2252 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5548/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2252 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4385/ --- |
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2252 @xuchuanyin Thanks for working on this, but we had better add a new datatype such as varchar(size) or bigstring to support longer strings, rather than relying on a table property. ---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2252 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5564/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2252 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4402/ --- |
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2252 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4660/ --- |
Github user xuchuanyin commented on the issue:
https://github.com/apache/carbondata/pull/2252 @ravipesala I considered adding a datatype such as TEXT, but dropped the idea because the grammar is not general; at the least, it is not compatible with Spark/Hive. It would cause problems when migrating from/to CarbonData. ---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2252 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4405/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2252 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5567/ --- |
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2252 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4664/ --- |
Github user kumarvishal09 commented on the issue:
https://github.com/apache/carbondata/pull/2252 @xuchuanyin It's better to add an encoder type to store long strings. In the current code, the reader does not know which type of data it is reading/storing, and the chunk store object is created based on the encoder (fixed/variable) type. In your PR, you have added a boolean to most of the classes to check whether the column is a long string. That is not required: you can add one encoder type (Text) and, instead of handling everything in the same class (UnsafevariableLengthChunkStore/SafevariableLengthStore), add one more implementation for handling long strings. ---
Github user xuchuanyin commented on the issue:
https://github.com/apache/carbondata/pull/2252 @kumarvishal09 Yeah, it's an option to add an encoder type and make the L-V related classes abstract to eliminate the duplicate code. ---
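A minimal sketch of the design discussed in the two messages above, reading "L-V" as length-value (an assumption): the shared read/write logic lives in an abstract class, each length width gets its own implementation, and the implementation is picked from the encoder type instead of a boolean flag. All names here (EncoderType, AbstractLvChunkStore, ShortLengthChunkStore, IntLengthChunkStore) are hypothetical and do not correspond to CarbonData's actual classes.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical encoder types; TEXT follows the reviewer's suggested name.
enum EncoderType { VARIABLE, TEXT }

// Shared length-value logic; subclasses only decide how a length is stored.
abstract class AbstractLvChunkStore {
  abstract int lengthSize();
  abstract void writeLength(ByteBuffer buffer, int length);
  abstract int readLength(ByteBuffer buffer);

  /** Writes every value as a length prefix followed by its bytes. */
  byte[] write(List<byte[]> values) {
    int total = 0;
    for (byte[] value : values) {
      total += lengthSize() + value.length;
    }
    ByteBuffer buffer = ByteBuffer.allocate(total);
    for (byte[] value : values) {
      writeLength(buffer, value.length);
      buffer.put(value);
    }
    return buffer.array();
  }

  /** Reads back all values from a chunk produced by write. */
  List<byte[]> read(byte[] chunk) {
    ByteBuffer buffer = ByteBuffer.wrap(chunk);
    List<byte[]> values = new ArrayList<>();
    while (buffer.hasRemaining()) {
      byte[] value = new byte[readLength(buffer)];
      buffer.get(value);
      values.add(value);
    }
    return values;
  }

  /** The store is chosen from the encoder type, not from a boolean flag. */
  static AbstractLvChunkStore forEncoder(EncoderType type) {
    switch (type) {
      case TEXT:
        return new IntLengthChunkStore();
      default:
        return new ShortLengthChunkStore();
    }
  }
}

// Regular strings: 2-byte length prefix.
final class ShortLengthChunkStore extends AbstractLvChunkStore {
  @Override int lengthSize() { return Short.BYTES; }
  @Override void writeLength(ByteBuffer buffer, int length) {
    if (length > Short.MAX_VALUE) {
      throw new IllegalArgumentException(length + " bytes need the TEXT encoder");
    }
    buffer.putShort((short) length);
  }
  @Override int readLength(ByteBuffer buffer) { return buffer.getShort(); }
}

// Long strings: 4-byte length prefix.
final class IntLengthChunkStore extends AbstractLvChunkStore {
  @Override int lengthSize() { return Integer.BYTES; }
  @Override void writeLength(ByteBuffer buffer, int length) { buffer.putInt(length); }
  @Override int readLength(ByteBuffer buffer) { return buffer.getInt(); }
}
```

With this shape, a caller would obtain a store via `AbstractLvChunkStore.forEncoder(EncoderType.TEXT)` for a long-string column, and the only code that differs between the two cases is the three small length methods.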
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2252 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6017/ --- |
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2252 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5025/ --- |