Hello, All!
I am working with CarbonData. When I load data from a CSV file, the program throws the error below:

Error: java.lang.Exception: DataLoad failure: Dataload failed, String size cannot exceed 32000 bytes, please consider long string data type (state=,code=0)

However, I checked the file and could not find any string value that long. The error message also does not say which column or value is at fault, which is confusing. Please help!

PS: I enabled the bad record logger, but the log file does not contain the details either.

Thanks!
Best Regards!
jingych

Confidentiality Notice: The information contained in this e-mail and any accompanying attachment(s) is intended only for the use of the intended recipient and may be confidential and/or privileged of Neusoft Corporation, its subsidiaries and/or its affiliates. If any reader of this communication is not the intended recipient, unauthorized use, forwarding, printing, storing, disclosure or copying is strictly prohibited, and may be unlawful. If you have received this communication in error, please immediately notify the sender by return e-mail, and delete the original message and all copies from your system. Thank you.
Hi, jingych,
I think you are using older code. We recently changed the load path to treat overly long strings as bad records, so the bad record logger or the bad-record CSV file now reports the column name and the record containing a string longer than 32000.

The following PR contains these changes:
https://github.com/apache/carbondata/pull/3865

Thanks and Regards
Nihal Kumar Ojha
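For a build that does not yet include that PR, one way to locate the offending value is to scan the CSV outside CarbonData. The following is a minimal sketch, not part of CarbonData: the function name `find_long_fields` is hypothetical, and it assumes the file is UTF-8, has a header row, and uses a comma delimiter. Python's `csv` module may split fields differently from CarbonData's own parser (for example around quotes or escapes), so treat the result as a hint only.

```python
import csv
import io

LIMIT_BYTES = 32000  # the limit quoted in the error message


def find_long_fields(lines, limit=LIMIT_BYTES, delimiter=","):
    """Yield (record_number, column_name, byte_length) for each oversized field.

    `lines` is any iterable of text lines, e.g. an open file object.
    Record numbers count the header as record 1.
    """
    # The csv module rejects fields over 131072 characters by default;
    # raise that cap so very long fields are reported instead of raising.
    csv.field_size_limit(2**31 - 1)
    reader = csv.reader(lines, delimiter=delimiter)
    header = next(reader, None)
    if header is None:
        return  # empty input
    for record_number, row in enumerate(reader, start=2):
        for index, value in enumerate(row):
            size = len(value.encode("utf-8"))  # the limit is in bytes, not characters
            if size > limit:
                # Fall back to a positional name if the row has extra fields.
                name = header[index] if index < len(header) else f"column_{index}"
                yield record_number, name, size


# In-memory demo; for a real file use:
#   with open(path, newline="", encoding="utf-8") as f: ...
sample = io.StringIO("id,comment\n1,short\n2," + "x" * 32001 + "\n")
for hit in find_long_fields(sample):
    print(hit)
```

If no single value in the file looks that long, it may be worth checking for an unbalanced quote character, since a parser can then merge many lines into one field. That is only a possible cause, not something confirmed in this thread.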
Hi, Nihal!
Thanks for your reply! It's very helpful.

I checked, and the version we use is the newest release: apache-carbondata-2.0.1-bin-spark2.4.5-hadoop2.7.2.jar

So when will a new release that includes this PR be published?

Thanks!
Best regards!
jingych