[jira] [Commented] (CARBONDATA-2773) Load one file for multiple times in one load command cause wrong query result


Akash R Nilugal (Jira)

    [ https://issues.apache.org/jira/browse/CARBONDATA-2773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555036#comment-16555036 ]

xuchuanyin commented on CARBONDATA-2773:
----------------------------------------

In Spark, SparkSQL does not support loading multiple files given as a comma-separated path.

However, SparkSQL supports reading/loading multiple files through the DataFrame API, and it works as expected.

// the same path listed three times
val paths = Array(path1, path1, path1)
// show tripled content
sparkSession.read.csv(paths: _*).show()
// write tripled content to parquet
sparkSession.read.csv(paths: _*).write.parquet(parquetDir)
// show tripled content in parquet
sparkSession.read.parquet(parquetDir).show()
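
To make the same check easier to reproduce, here is a minimal self-contained sketch that compares row counts instead of eyeballing `show()` output. The object name `RepeatedPathCheck`, the helper `rowCounts`, and the three-line sample CSV are hypothetical and only for illustration; it assumes a local Spark installation on the classpath.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession

object RepeatedPathCheck {
  // Returns (rows read from the path once, rows read from the path listed three times).
  def rowCounts(spark: SparkSession, path: String): (Long, Long) = {
    val single = spark.read.csv(path).count()
    val paths = Array(path, path, path)
    val repeated = spark.read.csv(paths: _*).count()
    (single, repeated)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("repeated-path-check")
      .getOrCreate()

    // Hypothetical sample data: a throwaway CSV file with three lines and no header.
    val csv = Files.createTempFile("sample", ".csv")
    Files.write(csv, "1,a\n2,b\n3,c\n".getBytes("UTF-8"))

    val (single, repeated) = rowCounts(spark, csv.toUri.toString)
    println(s"rows from one path: $single, rows from three identical paths: $repeated")

    spark.stop()
  }
}
```

If the DataFrame reader behaves as described above, the second count should be three times the first.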

> Load one file for multiple times in one load command cause wrong query result
> -----------------------------------------------------------------------------
>
>                 Key: CARBONDATA-2773
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-2773
>             Project: CarbonData
>          Issue Type: Bug
>            Reporter: xuchuanyin
>            Assignee: wangsen
>            Priority: Major
>
> CarbonData now supports loading multiple files in one load command. The file paths can be comma separated.
> But when I try to load one file multiple times in one load command, the query result is wrong.
> The load command looks like below:
> ```
> LOAD DATA LOCAL INPATH 'file1,file1,file1' INTO TABLE test_table;
> ```
> The expected result is triple the file content, but in the actual result the file content is not tripled.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)