[jira] [Updated] (CARBONDATA-465) Spark streaming dataframe support

Akash R Nilugal (Jira)

[ https://issues.apache.org/jira/browse/CARBONDATA-465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Liang Chen updated CARBONDATA-465:
----------------------------------
Description:
CarbonData 1.0.0 supports loading data with the Spark DataFrame API. One limitation is that Kettle is still required, because DataFrameLoaderRDD still depends on it. We provide NewDataFrameLoaderRDD to load data with the new flow.

We also discovered some bugs:

1. CarbonMetastoreCatalog.createTableFromThrift

```
/**
* A schemaFilePath that starts with file:// does not create the meta files successfully,
* yet thriftWriter raises no complaint.
* This causes some weird errors, e.g. "No table found".
*/
val thriftWriter = new ThriftWriter(schemaFilePath, false)
thriftWriter.open()
thriftWriter.write(thriftTableInfo)
thriftWriter.close()
```

2. Some exceptions are raised even when useKettle is set to false.

was:
CarbonData 0.3.0 supports loading data with the Spark DataFrame API. One limitation is that Kettle is still required, because DataFrameLoaderRDD still depends on it. We provide NewDataFrameLoaderRDD to load data with the new flow.

We also discovered some bugs:

1. CarbonMetastoreCatalog.createTableFromThrift

```
/**
* A schemaFilePath that starts with file:// does not create the meta files successfully,
* yet thriftWriter raises no complaint.
* This causes some weird errors, e.g. "No table found".
*/
val thriftWriter = new ThriftWriter(schemaFilePath, false)
thriftWriter.open()
thriftWriter.write(thriftTableInfo)
thriftWriter.close()
```

2. Some exceptions are raised even when useKettle is set to false.

> Spark streaming dataframe support
> ---------------------------------
>
> Key: CARBONDATA-465
> URL: https://issues.apache.org/jira/browse/CARBONDATA-465
> Project: CarbonData
> Issue Type: Improvement
> Components: data-load
> Affects Versions: 1.0.0-incubating
> Reporter: WilliamZhu
> Assignee: WilliamZhu
> Priority: Minor
> Fix For: 1.0.0-incubating
>
> Time Spent: 4h 20m
> Remaining Estimate: 0h
>
> CarbonData 1.0.0 supports loading data with the Spark DataFrame API. One limitation is that Kettle is still required, because DataFrameLoaderRDD still depends on it. We provide NewDataFrameLoaderRDD to load data with the new flow.
> We also discovered some bugs:
> 1. CarbonMetastoreCatalog.createTableFromThrift
> ```
> /**
> * A schemaFilePath that starts with file:// does not create the meta files successfully,
> * yet thriftWriter raises no complaint.
> * This causes some weird errors, e.g. "No table found".
> */
> val thriftWriter = new ThriftWriter(schemaFilePath, false)
> thriftWriter.open()
> thriftWriter.write(thriftTableInfo)
> thriftWriter.close()
> ```
> 2. Some exceptions are raised even when useKettle is set to false.
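
For bug 1, one possible guard is to normalize the schema path before it reaches the writer. The snippet below is only a sketch: `SchemaPathSketch` and `stripFileScheme` are hypothetical names that do not exist in CarbonData, and it assumes the writer expects a plain local path rather than a `file://` URI. Whether this matches the actual fix for the issue is not stated in the ticket.

```scala
import java.net.URI

// Hypothetical helper, not part of CarbonData.
object SchemaPathSketch {

  // Turns "file:///tmp/carbon/schema" into "/tmp/carbon/schema".
  // Paths without the "file:" scheme (plain local paths, hdfs://...) are returned unchanged.
  // Assumption: the input is a syntactically valid URI (no unescaped spaces).
  def stripFileScheme(path: String): String =
    if (path.startsWith("file:")) new URI(path).getPath else path
}
```

A caller could pass `SchemaPathSketch.stripFileScheme(schemaFilePath)` to the writer instead of the raw value, and then check that the schema file exists after `close()` so a silent failure surfaces immediately instead of as a later "No table found" error.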

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)