[jira] [Created] (CARBONDATA-3565) Binary to string issue when loading dataframe data in NewRddIterator


Akash R Nilugal (Jira)

ChenKai created CARBONDATA-3565:
-----------------------------------

Summary: Binary to string issue when loading dataframe data in NewRddIterator
Key: CARBONDATA-3565
URL: https://issues.apache.org/jira/browse/CARBONDATA-3565
Project: CarbonData
Issue Type: Bug
Components: spark-integration
Affects Versions: 1.6.0
Reporter: ChenKai

* issue
When a Spark DataFrame (SQL) loads complex binary data into a Hive table, the data is corrupted when it is read back. I see that in RddIterator the data is converted to a string and then converted back.

* test case
The binary data can be produced with *DataOutputStream#writeDouble* and similar methods.
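
A minimal standalone sketch of why such data breaks, assuming the string conversion decodes the bytes as UTF-8 (the charset actually used in the load path is not stated in this report). The class and method names below are hypothetical and are not part of CarbonData; the sketch only shows that bytes written by *DataOutputStream#writeDouble* do not survive a bytes-to-string-to-bytes round trip.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not CarbonData code.
public class BinaryRoundTrip {

    // Serialize a double to its 8-byte big-endian form, as a test case might.
    static byte[] encodeDouble(double value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeDouble(value);
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Assumption: the loader turns the bytes into a String and back using UTF-8.
    // Byte sequences that are not valid UTF-8 are replaced while decoding,
    // so the original bytes cannot be recovered afterwards.
    static byte[] roundTripThroughString(byte[] raw) {
        String asText = new String(raw, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] original = encodeDouble(Math.PI);
        byte[] restored = roundTripThroughString(original);
        System.out.println("original:  " + Arrays.toString(original));
        System.out.println("restored:  " + Arrays.toString(restored));
        System.out.println("identical: " + Arrays.equals(original, restored));
    }
}
```

Keeping the value as a `byte[]` end to end avoids the problem, because no charset decoding is ever applied to the payload.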

* discussion
I think the *CarbonScalaUtil#getString* operation can be removed now. I dug into the history: the code dates from 2016, when it was used for the kettle *CsvInput* (commit: 0018756d). That code has since been removed, so I think this conversion is redundant.
