[jira] [Created] (CARBONDATA-2148) Use Row parser to replace current default parser:CSVStreamParserImp

Akash R Nilugal (Jira)

Zhichao Zhang created CARBONDATA-2148:
------------------------------------------

Summary: Use Row parser to replace current default parser:CSVStreamParserImp
Key: CARBONDATA-2148
URL: https://issues.apache.org/jira/browse/CARBONDATA-2148
Project: CarbonData
Issue Type: Improvement
Components: data-load, spark-integration
Affects Versions: 1.3.0
Reporter: Zhichao Zhang
Assignee: Zhichao Zhang
Fix For: 1.3.0

Currently the default value of 'carbon.stream.parser' is CSVStreamParserImp, which transforms InternalRow(0) into Array[Object]; InternalRow(0) holds one line of text received from a socket. When the data comes from Kafka, the schema of the InternalRow is different, so we must either assemble the fields of the Kafka data Row into a String and store it as InternalRow(0), or define a new parser that converts the Kafka data Row to Array[Object]. The same work has to be repeated for every table.
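
To illustrate the limitation described above, here is a minimal, self-contained sketch in plain Java. It does not use the real Spark or CarbonData classes: a row is modelled as an Object[], and the names CsvStyleParserSketch, parseCsvRow and packIntoSingleField are hypothetical, introduced only for this example.

```java
import java.util.Arrays;
import java.util.StringJoiner;
import java.util.regex.Pattern;

public class CsvStyleParserSketch {

    // Hypothetical stand-in for a CSV-style parser: it assumes field 0 of the
    // row is one line of delimited text and splits it into separate values.
    public static Object[] parseCsvRow(Object[] row, String delimiter) {
        String line = (String) row[0];
        // Quote the delimiter so it is not treated as a regex; -1 keeps trailing empty fields.
        return line.split(Pattern.quote(delimiter), -1);
    }

    // The workaround from the issue text: join the typed fields of a
    // multi-column row into one String and store it as the only field.
    public static Object[] packIntoSingleField(Object[] row, String delimiter) {
        StringJoiner joiner = new StringJoiner(delimiter);
        for (Object field : row) {
            joiner.add(String.valueOf(field));
        }
        return new Object[] { joiner.toString() };
    }

    public static void main(String[] args) {
        // Socket-like source: the row already holds one text line in field 0.
        Object[] socketRow = { "1,alice,3.5" };
        System.out.println(Arrays.toString(parseCsvRow(socketRow, ",")));

        // Kafka-like source: several typed fields, so the row has to be packed first.
        Object[] kafkaRow = { 1, "alice", 3.5 };
        Object[] packed = packIntoSingleField(kafkaRow, ",");
        System.out.println(Arrays.toString(parseCsvRow(packed, ",")));
    }
}
```

The packing step is the extra per-table work the issue refers to: every source whose rows are not a single text line needs its own version of it.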

*Solution:*
Use a new parser called RowStreamParserImpl as the default parser instead of CSVStreamParserImp. The new parser automatically converts InternalRow to Array[Object] according to the schema. Since source data is generally transformed into a structured Row object anyway, this approach removes the need to define a parser for every table.
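
The idea of a schema-driven conversion can be sketched as follows. This is only an illustration under stated assumptions, not the actual RowStreamParserImpl: rows are plain Object[] values, and the FieldType, Field and parseRow names are hypothetical and do not come from Spark or CarbonData.

```java
import java.util.Arrays;

public class RowParserSketch {

    // Hypothetical field types; a real schema would support many more.
    public enum FieldType { INT, STRING, DOUBLE }

    // Hypothetical schema entry: a column name plus its declared type.
    public static final class Field {
        public final String name;
        public final FieldType type;

        public Field(String name, FieldType type) {
            this.name = name;
            this.type = type;
        }
    }

    // Converts a structured row to an Object[] by walking the schema,
    // so one generic parser can serve every table.
    public static Object[] parseRow(Object[] row, Field[] schema) {
        if (row.length != schema.length) {
            throw new IllegalArgumentException("row has " + row.length
                    + " fields but schema has " + schema.length);
        }
        Object[] out = new Object[schema.length];
        for (int i = 0; i < schema.length; i++) {
            Object value = row[i];
            if (value == null) {
                continue; // out[i] stays null
            }
            switch (schema[i].type) {
                case INT:
                    out[i] = ((Number) value).intValue();
                    break;
                case DOUBLE:
                    out[i] = ((Number) value).doubleValue();
                    break;
                case STRING:
                    out[i] = value.toString();
                    break;
                default:
                    throw new IllegalStateException("unsupported type: " + schema[i].type);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Field[] schema = {
            new Field("id", FieldType.INT),
            new Field("name", FieldType.STRING),
            new Field("score", FieldType.DOUBLE)
        };
        Object[] kafkaRow = { 1L, "alice", 3.5 };
        System.out.println(Arrays.toString(parseRow(kafkaRow, schema)));
    }
}
```

Because the conversion is driven by the schema rather than by the text layout of one source, the same parseRow call works for a socket, Kafka, or any other source once its data has been turned into a structured row.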

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)