[ https://issues.apache.org/jira/browse/CARBONDATA-3527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Zhichao Zhang updated CARBONDATA-3527:
---------------------------------------
Description:
*Problem:*
When a csv file contains a complex type value whose text representation is longer than 32000 characters, loading data from that file with 'GLOBAL_SORT' throws a 'String length cannot exceed 32000 characters' exception.
*Cause:*
When data is loaded from csv files with 'GLOBAL_SORT', the files are read and each row is first stored in a StringArrayRow, in which every value is a string. When 'NewRddIterator.next' then calls 'CarbonScalaUtil.getString', the length of every value is checked, so the 'String length cannot exceed 32000 characters' exception is thrown even for complex type data that takes more than 32000 characters in the csv file.
*Solution:*
In 'FieldConverter.objectToString' (called from 'CarbonScalaUtil.getString'), skip the length check when the data type of the field is a complex type.
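
The proposed change can be illustrated with a minimal, self-contained Java sketch. This is not the CarbonData code: the class name LengthCheckSketch, the simplified two-argument objectToString signature, the isComplexType flag and the exception type are all assumptions made for illustration. Only the 32000-character limit and the error message come from the issue itself.

```java
// Hypothetical stand-in for the conversion step described above;
// it does not reproduce CarbonData's real FieldConverter API.
final class LengthCheckSketch {
    // Limit quoted in the exception message from the issue.
    static final int MAX_CHARS = 32000;

    // Converts a value to its string form. The length limit is enforced
    // only for non-complex fields; complex type values pass through
    // unchecked. The isComplexType flag is an assumed parameter.
    static String objectToString(Object value, boolean isComplexType) {
        if (value == null) {
            return null;
        }
        String s = value.toString();
        if (!isComplexType && s.length() > MAX_CHARS) {
            throw new IllegalArgumentException(
                "String length cannot exceed " + MAX_CHARS + " characters");
        }
        return s;
    }
}
```

With this shape, a 32001-character value is returned unchanged when the field is flagged as complex and is rejected otherwise, which is the behaviour the solution asks for.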
> Throw 'String length cannot exceed 32000 characters' exception when load data with 'GLOBAL_SORT' from csv which include big complex type data
> ---------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: CARBONDATA-3527
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3527
> Project: CarbonData
> Issue Type: Bug
> Components: spark-integration
> Affects Versions: 1.6.0
> Reporter: Zhichao Zhang
> Assignee: Zhichao Zhang
> Priority: Major
> Fix For: 1.6.1
>
>