
[Discussion] Add HEADER option to load data sql

Posted by David CaiQiang on Jul 03, 2017; 8:32am
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Discussion-Add-HEADER-option-to-load-data-sql-tp17080.html

1. Background

a)  load data with FILEHEADER option
load data inpath '<path>' into table <carbon_table_name> options('FILEHEADER'='col1,col2,col3')

This form is used for CSV files that do not contain a header row, so the FILEHEADER option is needed to specify the column names.

b)  load data without FILEHEADER option
load data inpath '<path>' into table <carbon_table_name>

This form is used for CSV files that contain a header row, so the column names are taken from the header row of the CSV files.

2. Issue

When we load CSV files that have no file header and whose columns are the same as the table schema, we can combine all columns of the table to form the file header. So I think it is unnecessary to make the user provide the file header in this case.

3. Solution

Add a HEADER option to the load data sql.
The HEADER option can be true or false. The default value is true.
When we load CSV files that have no file header and whose columns are the same as the table schema, add 'header'='false' to the load data sql.
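
To make the three cases concrete, here is a rough sketch in Python of how the column names could be resolved. It is only an illustration and not CarbonData code: the function name resolve_csv_header and its parameters are hypothetical, and letting FILEHEADER take precedence over HEADER is an assumption, since how the two options interact is not decided in this proposal.

```python
from typing import List, Optional, Tuple


def resolve_csv_header(
    table_columns: List[str],
    first_csv_line: str,
    file_header: Optional[str] = None,
    header: bool = True,
    delimiter: str = ",",
) -> Tuple[List[str], bool]:
    """Return (column names, whether the first CSV line is a header row to skip).

    Hypothetical helper for illustration only; not part of CarbonData.
    """
    if file_header is not None:
        # Case 1a: FILEHEADER is given, so the CSV files carry no header row.
        # Assumption: FILEHEADER wins regardless of the HEADER value.
        return [c.strip() for c in file_header.split(delimiter)], False
    if not header:
        # Proposed case: 'header'='false' and no FILEHEADER, so the table
        # schema columns are combined to form the file header.
        return list(table_columns), False
    # Case 1b and the default 'header'='true': the first line is the header.
    return [c.strip() for c in first_csv_line.split(delimiter)], True


schema = ["id", "name", "city"]
print(resolve_csv_header(schema, "1,alice,paris", header=False))
# prints (['id', 'name', 'city'], False)
```

With 'header'='false' the first line is kept as data and the schema order decides the column mapping, which is why this only works when the CSV columns are in the same order as the table schema.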

Please vote:
+1: yes, agree to add the 'header' option
±0: abstain or no opinion
-1: no, veto this action; no need to add the 'header' option

Regards
David Cai