[jira] [Resolved] (CARBONDATA-2148) Use Row parser to replace current default parser:CSVStreamParserImp

Akash R Nilugal (Jira)

[ https://issues.apache.org/jira/browse/CARBONDATA-2148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravindra Pesala resolved CARBONDATA-2148.
-----------------------------------------
Resolution: Fixed
Fix Version/s: 1.4.0

> Use Row parser to replace current default parser:CSVStreamParserImp
> -------------------------------------------------------------------
>
> Key: CARBONDATA-2148
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2148
> Project: CarbonData
> Issue Type: Improvement
> Components: data-load, spark-integration
> Affects Versions: 1.3.1
> Reporter: Zhichao Zhang
> Assignee: Zhichao Zhang
> Priority: Minor
> Fix For: 1.4.0, 1.3.1
>
> Time Spent: 10h
> Remaining Estimate: 0h
>
> Currently the default value of 'carbon.stream.parser' is CSVStreamParserImp, which transforms InternalRow(0) to Array[Object]; InternalRow(0) holds the value of one line received from a socket. When data is received from Kafka, the schema of InternalRow changes, so we must either assemble the fields of the Kafka data Row into a String and store it as InternalRow(0), or define a new parser to convert the Kafka data Row to Array[Object]. The same work has to be repeated for every table.
> *Solution:*
> Use a new parser called RowStreamParserImp as the default parser instead of CSVStreamParserImp. This new parser automatically converts InternalRow to Array[Object] according to the schema. In general, source data is transformed into a structured Row object, so with this approach we do not need to define a parser for every table.
>
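
To illustrate the difference described in the issue, here is a minimal sketch in plain Python. It is not CarbonData or Spark code: the tuple-based rows, the `Schema` type, and the functions `csv_style_parse` and `row_style_parse` are all hypothetical stand-ins, chosen only to contrast a parser that reads a single delimited string from field 0 with one that converts every field according to a schema.

```python
from typing import Any, Callable, List, Sequence, Tuple

# Hypothetical stand-ins: a "row" is a plain tuple and a "schema" is a list of
# (field name, converter) pairs. Neither mirrors the CarbonData or Spark API.
Schema = List[Tuple[str, Callable[[Any], Any]]]


def csv_style_parse(row: Sequence[Any], delimiter: str = ",") -> List[Any]:
    """CSV-style approach: only field 0 is read, and it must hold the whole line."""
    return str(row[0]).split(delimiter)


def row_style_parse(row: Sequence[Any], schema: Schema) -> List[Any]:
    """Schema-driven approach: convert every field of the row by position."""
    if len(row) != len(schema):
        raise ValueError(f"expected {len(schema)} fields, got {len(row)}")
    return [
        None if value is None else convert(value)
        for value, (_name, convert) in zip(row, schema)
    ]


# A socket-like source delivers one delimited line in field 0.
socket_row = ("1,alice,3.5",)
# A Kafka-like source, once transformed, delivers a structured row.
structured_row = (1, "alice", 3.5)
schema: Schema = [("id", int), ("name", str), ("score", float)]

csv_result = csv_style_parse(socket_row)
row_result = row_style_parse(structured_row, schema)
# Feeding the structured row to the CSV-style parser keeps only field 0.
csv_on_structured = csv_style_parse(structured_row)
```

In this sketch the CSV-style function works only when the source packs a whole line into field 0, which is why a structured source would need per-table glue code. The schema-driven function needs no such glue, because the schema itself says how many fields to read and how to convert each one.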

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)