[jira] [Commented] (CARBONDATA-2773) Load one file for multiple times in one load command cause wrong query result

Akash R Nilugal (Jira)

[ https://issues.apache.org/jira/browse/CARBONDATA-2773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555036#comment-16555036 ]

xuchuanyin commented on CARBONDATA-2773:
----------------------------------------

In Spark, SparkSQL does not support loading multiple comma-separated files in a single load command.

However, SparkSQL does support reading and loading multiple files through the DataFrame API, and it works as expected:

// the same file path, listed three times
val paths = Array(path1, path1, path1)
// show tripled content
sparkSession.read.csv(paths: _*).show()
// write tripled content to parquet
sparkSession.read.csv(paths: _*).write.parquet(parquetDir)
// show tripled content in parquet
sparkSession.read.parquet(parquetDir).show()
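
For anyone who wants to reproduce the DataFrame behaviour described above, here is a minimal self-contained sketch. It assumes a local Spark session and a throwaway three-row CSV file; the object name, sample rows, and temp-file prefix are made up for illustration and are not part of the issue. It only prints the two row counts so they can be compared, rather than asserting a result.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession

// Hypothetical reproduction harness; not taken from the CarbonData code base.
object TripleLoadCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("triple-load-check")
      .getOrCreate()

    // Assumed sample data: three CSV rows, no header.
    val csvFile = Files.createTempFile("sample", ".csv")
    Files.write(csvFile, "1,a\n2,b\n3,c\n".getBytes("UTF-8"))
    val path1 = csvFile.toUri.toString

    // Same path listed three times, as in the comment above.
    val paths = Array(path1, path1, path1)

    // Row count when the file is read once vs. listed three times.
    val single = spark.read.csv(path1).count()
    val tripled = spark.read.csv(paths: _*).count()
    println(s"single=$single tripled=$tripled")

    spark.stop()
  }
}
```

If the DataFrame reader behaves as described in this comment, the second count should be three times the first; the same comparison could then be made against the row count of the CarbonData table after the LOAD DATA command.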

> Load one file for multiple times in one load command cause wrong query result
> -----------------------------------------------------------------------------
>
> Key: CARBONDATA-2773
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2773
> Project: CarbonData
> Issue Type: Bug
> Reporter: xuchuanyin
> Assignee: wangsen
> Priority: Major
>
> CarbonData now supports loading multiple files in one load command. The file paths can be comma-separated.
> But when I try to load the same file multiple times in one load command, the query result is wrong.
> The load command looks like this:
> ```
> LOAD DATA LOCAL INPATH 'file1,file1,file1' INTO TABLE test_table;
> ```
> The expected result is three copies of the file content, but the actual result contains the file content only once.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)