Apache CarbonData Dev Mailing List archive

Re: [Discussion] Add HEADER option to load data sql

Posted by Venkata Gollamudi on
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Discussion-Add-HEADER-option-to-load-data-sql-tp17080p17251.html

I agree that the user need not provide column names if no header is present
in the file and the column order is the same as the schema order.

However, a single option header=true will not cover all the cases: header
present, header not present, override header, etc. I have added an
intermediate approach that covers all the cases and also takes care of the
current default values and backward compatibility.

CSV file without header
1. FILEHEADER="col1,col2,col3", default: IGNORE_FIRST_LINE="FALSE"
uses the given header.
2. FILEHEADER="", default: IGNORE_FIRST_LINE="FALSE"
uses schema order.

CSV file with header
1. FILEHEADER not specified, default: IGNORE_FIRST_LINE="FALSE"
expects the first line of the CSV as the header.
2. FILEHEADER="col1,col2,col3", IGNORE_FIRST_LINE="TRUE"
uses the explicitly given header, ignoring the header from the file.
3. FILEHEADER="", IGNORE_FIRST_LINE="TRUE"
uses schema order, ignoring the header from the file.
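
To make the five cases above concrete, here is a minimal sketch of the decision logic in Python. It is only an illustration of the proposal, not CarbonData code: the function name resolve_header, its parameters, and the returned tuple are all hypothetical. The combination "FILEHEADER not specified, IGNORE_FIRST_LINE=TRUE" is not covered by the list above, so rejecting it here is an assumption of the sketch.

```python
from typing import List, Optional, Tuple


def resolve_header(
    fileheader: Optional[str],
    ignore_first_line: bool,
    schema_columns: List[str],
    first_line: str,
    delimiter: str = ",",
) -> Tuple[List[str], bool]:
    """Return (column names, True if the first CSV line is not data).

    fileheader is None when the FILEHEADER option is not given at all,
    and "" when it is given but empty.
    """
    if fileheader is None:
        if ignore_first_line:
            # Not covered by the proposal; rejecting it is an assumption.
            raise ValueError("IGNORE_FIRST_LINE=TRUE requires FILEHEADER")
        # Option omitted: the first line of the file is taken as the header.
        columns = [c.strip() for c in first_line.split(delimiter)]
        return columns, True
    if fileheader.strip() == "":
        # Empty FILEHEADER: fall back to the table schema order.
        return list(schema_columns), ignore_first_line
    # Explicit FILEHEADER: use it as given.
    columns = [c.strip() for c in fileheader.split(delimiter)]
    return columns, ignore_first_line
```

With a schema of id, name, city, calling resolve_header("", True, schema, "x,y,z") would give the schema columns and mark the first line as skipped, which matches case 3 for a CSV file with header.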

Regards,
Ramana

On Tue, Jul 4, 2017 at 6:51 AM, wangbin <[hidden email]> wrote:

> I propose loading the CSV files by explicitly giving a table schema, while
> using an option to ignore the CSV header if there is one.