[ https://issues.apache.org/jira/browse/CARBONDATA-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16436926#comment-16436926 ]
xuchuanyin commented on CARBONDATA-2340:
----------------------------------------
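[~niaoshu] This is a known issue/restriction in CarbonData. The reason is that CarbonData stores the length of each string value in a `short` (a signed 16-bit integer, maximum 32767), so a single string value cannot exceed 32000 bytes.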
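To illustrate why a `short` length field imposes this kind of limit, here is a minimal sketch. It is not CarbonData's actual code: the class `StringLengthLimit` and all of its methods are hypothetical, the 32000 cap is simply taken from the error message in this issue, and measuring the length in UTF-8 bytes is an assumption.

```java
import java.nio.charset.StandardCharsets;

public class StringLengthLimit {
    // Assumed cap, copied from the error message in this issue (not read from CarbonData).
    static final int MAX_STRING_BYTES = 32000;

    // Byte length of the value in UTF-8 (assumed encoding); the limit applies to bytes, not characters.
    static int byteLength(String value) {
        return value.getBytes(StandardCharsets.UTF_8).length;
    }

    // True if the byte length is within the assumed cap and therefore fits in a short.
    static boolean fitsInShortLength(String value) {
        return byteLength(value) <= MAX_STRING_BYTES;
    }

    // Shows what an unchecked narrowing cast does to a length larger than Short.MAX_VALUE.
    static short narrowToShort(int length) {
        return (short) length;
    }

    // Small helper to build a string of n copies of c.
    static String repeat(char c, int n) {
        StringBuilder sb = new StringBuilder(n);
        for (int i = 0; i < n; i++) {
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("Short.MAX_VALUE = " + Short.MAX_VALUE);
        System.out.println("(short) 40000   = " + narrowToShort(40000));
        System.out.println("32000 ASCII chars fit: " + fitsInShortLength(repeat('a', 32000)));
        System.out.println("32001 ASCII chars fit: " + fitsInShortLength(repeat('a', 32001)));
        // A CJK character takes 3 bytes in UTF-8, so far fewer characters reach the cap.
        System.out.println("10667 CJK chars fit:   " + fitsInShortLength(repeat('\u6570', 10667)));
    }
}
```

Note that multi-byte text such as Chinese reaches a byte limit with roughly a third as many characters as ASCII text, so a column can fail a check like this even when its character count looks well under 32000.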
> Loading data exceeding 32000 bytes
> ----------------------------------
>
> Key: CARBONDATA-2340
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2340
> Project: CarbonData
> Issue Type: Bug
> Components: data-load
> Affects Versions: 1.3.0
> Reporter: niaoshu
> Priority: Blocker
> Original Estimate: 12h
> Remaining Estimate: 12h
>
> INFO storage.BlockManagerMasterEndpoint: Registering block manager spark1:12603 with 5.2 GB RAM, BlockManagerId(1, spark1, 12603, None)
> 18/04/11 14:24:23 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on spark1:12603 (size: 34.9 KB, free: 5.2 GB)
> 18/04/11 14:24:34 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, spark1, executor 1): org.apache.carbondata.processing.loading.exception.CarbonDataLoadingException: Dataload failed, String size cannot exceed 32000 bytes
> at org.apache.carbondata.processing.loading.converter.impl.NonDictionaryFieldConverterImpl.convert(NonDictionaryFieldConverterImpl.java:75)
> at org.apache.carbondata.processing.loading.converter.impl.RowConverterImpl.convert(RowConverterImpl.java:162)
> at org.apache.carbondata.processing.loading.steps.DataConverterProcessorStepImpl.processRowBatch(DataConverterProcessorStepImpl.java:104)
> at org.apache.carbondata.processing.loading.steps.DataConverterProcessorStepImpl$1.next(DataConverterProcessorStepImpl.java:91)
> at org.apache.carbondata.processing.loading.steps.DataConverterProcessorStepImpl$1.next(DataConverterProcessorStepImpl.java:77)
> at org.apache.carbondata.processing.loading.sort.impl.ParallelReadMergeSorterImpl$SortIteratorThread.run(ParallelReadMergeSorterImpl.java:214)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:748)
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)