In load data, CSV row contains invalid quote char and results are invalid

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

In load data, CSV row contains invalid quote char and results are invalid

Harmeet Singh
Hi Team,

I am trying to load the CSV fie, which contains invalid quote char. But in the results the row is inserted and values are mix with next row without any waring and error. Following are the details:

CSV File:

name, description, salary, age, dob
tammy,$my name$,$900000$,22,19/10/2019
tammy1,$delhi$,$32345%,22,19/10/2019
tammy2,$banglore$,$543$,$44$,19/10/2019
tammy3,$city$,$343$,$22$,12/10/2019
tammy4,$punjab$,$23423$,$55$,19/10/2019

In CSV row 2 contains invalid quote char.

Table :

create table one (name string, description string, salary double, age int, dob timestamp) stored by 'carbondata';

Load data:

load data local inpath 'hdfs://localhost:54310/home/harmeet/dollarquote3.csv' into table one OPTIONS('QUOTECHAR'="$");

Actual Results:

+---------+--------------+-------+-----------+------+--+
|  name   | description  |  dob  |  salary   | age  |
+---------+--------------+-------+-----------+------+--+
| tammy   | my name      | NULL  | 900000.0  | 22   |
| tammy1  | delhi        | NULL  | NULL      | 543  |
| tammy3  | city         | NULL  | 343.0     | 22   |
| tammy4  | punjab       | NULL  | 23423.0   | 55   |
+---------+--------------+-------+-----------+------+--+

In the result the tammy1 value contains salary of tammy2 record.

Expected Result:

May be an error while reading invalid quote char  OR Just ignore the invalid quote char row.
Reply | Threaded
Open this post in threaded view
|

Re: In load data, CSV row contains invalid quote char and results are invalid

bill.zhou
hi Singh

   quotechar in the csv should be in pairs. like
name, description, salary, age, dob
tammy,$my name$,$900000$,22,19/10/2019
tammy1,$delhi$,$32345%$,22,19/10/2019
tammy2,$banglore$,$543$,$44$,19/10/2019
tammy3,$city$,$343$,$22$,12/10/2019
tammy4,$punjab$,$23423$,$55$,19/10/2019

Harmeet Singh wrote
Hi Team,

I am trying to load the CSV fie, which contains invalid quote char. But in the results the row is inserted and values are mix with next row without any waring and error. Following are the details:

CSV File:

name, description, salary, age, dob
tammy,$my name$,$900000$,22,19/10/2019
tammy1,$delhi$,$32345%,22,19/10/2019
tammy2,$banglore$,$543$,$44$,19/10/2019
tammy3,$city$,$343$,$22$,12/10/2019
tammy4,$punjab$,$23423$,$55$,19/10/2019

In CSV row 2 contains invalid quote char.

Table :

create table one (name string, description string, salary double, age int, dob timestamp) stored by 'carbondata';

Load data:

load data local inpath 'hdfs://localhost:54310/home/harmeet/dollarquote3.csv' into table one OPTIONS('QUOTECHAR'="$");

Actual Results:

+---------+--------------+-------+-----------+------+--+
|  name   | description  |  dob  |  salary   | age  |
+---------+--------------+-------+-----------+------+--+
| tammy   | my name      | NULL  | 900000.0  | 22   |
| tammy1  | delhi        | NULL  | NULL      | 543  |
| tammy3  | city         | NULL  | 343.0     | 22   |
| tammy4  | punjab       | NULL  | 23423.0   | 55   |
+---------+--------------+-------+-----------+------+--+

In the result the tammy1 value contains salary of tammy2 record.

Expected Result:

May be an error while reading invalid quote char  OR Just ignore the invalid quote char row.