[GitHub] carbondata pull request #2252: WIP: Support string longer than 32000 charact...

[GitHub] carbondata pull request #2252: WIP: Support string longer than 32000 charact...

qiuchenjian-2
GitHub user xuchuanyin opened a pull request:

    https://github.com/apache/carbondata/pull/2252

    WIP: Support string longer than 32000 characters

    Add a table property 'long_string_columns', set when creating a table, to support string columns that contain more than 32000 characters.
    Inside carbondata, an integer is used instead of a short to store the byte length of the content.
   
    Be sure to do all of the following checklist to help us incorporate
    your contribution quickly and easily:
   
     - [ ] Any interfaces changed?
     
     - [ ] Any backward compatibility impacted?
     
     - [ ] Document update required?
   
     - [ ] Testing done
            Please provide details on
            - Whether new unit test cases have been added or why no new tests are required?
            - How it is tested? Please attach test report.
            - Is it a performance related change? Please attach the performance test report.
            - Any additional information to help reviewers in testing this change.
           
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
   


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xuchuanyin/carbondata 0428_string_longer_than_32000

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2252.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2252
   
----
commit 26cbea7f1493d204a1abb4275e052481abccd185
Author: xuchuanyin <xuchuanyin@...>
Date:   2018-04-30T15:53:22Z

    Support string longer than 32000 characters
   
    Add a table property 'long_string_columns', set when creating a table, to support string columns that contain more than 32000 characters.
    Inside carbondata, an integer is used instead of a short to store the byte length of the content.

----
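
To illustrate why the width of the length field matters, here is a minimal sketch of length-value (LV) encoding. It is not CarbonData's code: the helper names `encode_lv` and `decode_lv` and the `long_string` flag are hypothetical, and the byte order is an assumption. A signed 2-byte short can hold at most 32767, so a longer value cannot be length-prefixed until the prefix is widened to a 4-byte int.

```python
import struct

SHORT_MAX = 32767  # largest value a signed 2-byte short can hold


def encode_lv(value: str, long_string: bool = False) -> bytes:
    """Encode a string as length-prefixed bytes (hypothetical helper).

    The prefix is a signed 2-byte short by default and a signed 4-byte
    int when long_string is True.
    """
    payload = value.encode("utf-8")
    fmt = ">i" if long_string else ">h"
    try:
        prefix = struct.pack(fmt, len(payload))
    except struct.error as exc:
        # struct refuses lengths that overflow the chosen prefix width
        raise ValueError(
            f"{len(payload)} bytes do not fit in a '{fmt}' length prefix"
        ) from exc
    return prefix + payload


def decode_lv(data: bytes, long_string: bool = False) -> str:
    """Read the length prefix, then decode exactly that many bytes."""
    fmt = ">i" if long_string else ">h"
    size = struct.calcsize(fmt)
    (length,) = struct.unpack(fmt, data[:size])
    return data[size:size + length].decode("utf-8")
```

With the short prefix, a 40000-byte value raises `ValueError`; with `long_string=True` the same value round-trips, at the cost of two extra bytes per stored value.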


---

[GitHub] carbondata issue #2252: WIP: Support string longer than 32000 characters

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2252
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5546/



---

[GitHub] carbondata issue #2252: WIP: Support string longer than 32000 characters

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2252
 
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4383/



---

[GitHub] carbondata issue #2252: WIP: Support string longer than 32000 characters

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2252
 
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4641/



---

[GitHub] carbondata issue #2252: WIP: Support string longer than 32000 characters

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2252
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4642/



---

[GitHub] carbondata issue #2252: WIP:[CARBONDATA-2420] Support string longer than 320...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2252
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4644/



---

[GitHub] carbondata issue #2252: WIP:[CARBONDATA-2420] Support string longer than 320...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2252
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5548/



---

[GitHub] carbondata issue #2252: WIP:[CARBONDATA-2420] Support string longer than 320...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2252
 
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4385/



---

[GitHub] carbondata issue #2252: [CARBONDATA-2420] Support string longer than 32000 c...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2252
 
    @xuchuanyin Thanks for working on it, but it would be better to have a new datatype such as varchar(size) or bigstring to support longer strings, rather than relying on a table property.


---

[GitHub] carbondata issue #2252: [CARBONDATA-2420] Support string longer than 32000 c...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2252
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5564/



---

[GitHub] carbondata issue #2252: [CARBONDATA-2420] Support string longer than 32000 c...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2252
 
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4402/



---

[GitHub] carbondata issue #2252: [CARBONDATA-2420] Support string longer than 32000 c...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2252
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4660/



---

[GitHub] carbondata issue #2252: [CARBONDATA-2420] Support string longer than 32000 c...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/2252
 
    @ravipesala I considered adding a datatype such as TEXT, but dropped the idea because the syntax is not general; at least, it is not compatible with Spark/Hive. That would cause problems when migrating from/to Carbondata.


---

[GitHub] carbondata issue #2252: [CARBONDATA-2420] Support string longer than 32000 c...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2252
 
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4405/



---

[GitHub] carbondata issue #2252: [CARBONDATA-2420] Support string longer than 32000 c...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2252
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5567/



---

[GitHub] carbondata issue #2252: [CARBONDATA-2420] Support string longer than 32000 c...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2252
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4664/



---

[GitHub] carbondata issue #2252: [CARBONDATA-2420] Support string longer than 32000 c...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user kumarvishal09 commented on the issue:

    https://github.com/apache/carbondata/pull/2252
 
    @xuchuanyin It's better to add one encoder type to store long strings. In the current code, the reader does not know which type of data it is reading/storing, and the chunk store object is created based on the encoder (fixed/variable) type. In your PR, you have added a boolean to most of the classes to check whether the column is a long string. That is not required: you can add one encoder type (Text) and, instead of handling everything in the same class (UnsafevariableLengthChunkStore/SafevariableLengthStore), add one more implementation for handling long strings.


---

[GitHub] carbondata issue #2252: [CARBONDATA-2420] Support string longer than 32000 c...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/2252
 
    @kumarvishal09 Yeah, adding an encoder type and making the L-V related classes abstract to eliminate the duplicate code is an option.
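
The design discussed in the two comments above could look roughly like the following sketch. It is not taken from the PR or from CarbonData: the names `Encoding`, `LVChunkStore`, `ShortLVChunkStore`, `IntLVChunkStore` and `create_chunk_store` are all hypothetical. The idea shown is that the shared length-value logic lives in one abstract class, each implementation only picks its length-prefix width, and the store is chosen from the encoder type rather than from a boolean flag.

```python
from abc import ABC, abstractmethod
from enum import Enum
import struct


class Encoding(Enum):
    # Hypothetical encoder types; TEXT stands in for the proposed long-string encoder.
    VARIABLE = "variable"
    TEXT = "text"


class LVChunkStore(ABC):
    """Shared length-value logic; subclasses only choose the prefix format."""

    @property
    @abstractmethod
    def length_format(self) -> str:
        ...

    def __init__(self) -> None:
        self._buffer = bytearray()
        self._offsets = []  # start offset of each row inside the buffer

    def put(self, value: bytes) -> None:
        self._offsets.append(len(self._buffer))
        self._buffer += struct.pack(self.length_format, len(value)) + value

    def get(self, row: int) -> bytes:
        start = self._offsets[row]
        size = struct.calcsize(self.length_format)
        (length,) = struct.unpack_from(self.length_format, self._buffer, start)
        return bytes(self._buffer[start + size:start + size + length])


class ShortLVChunkStore(LVChunkStore):
    length_format = ">h"  # 2-byte length, values up to 32767 bytes


class IntLVChunkStore(LVChunkStore):
    length_format = ">i"  # 4-byte length for long strings


_STORES = {
    Encoding.VARIABLE: ShortLVChunkStore,
    Encoding.TEXT: IntLVChunkStore,
}


def create_chunk_store(encoding: Encoding) -> LVChunkStore:
    """Pick the store implementation from the encoder type alone."""
    return _STORES[encoding]()
```

A caller would write `create_chunk_store(Encoding.TEXT)` for a long-string column and use `put`/`get` exactly as for an ordinary variable-length column, so no per-call long-string check is needed.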


---

[GitHub] carbondata issue #2252: [CARBONDATA-2420] Support string longer than 32000 c...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2252
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6017/



---

[GitHub] carbondata issue #2252: [CARBONDATA-2420] Support string longer than 32000 c...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2252
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5025/



---