
[jira] [Updated] (CARBONDATA-665) Comparision Failure occurs when we execute the same query in hive and Carbondata

Akash R Nilugal (Jira)

[ https://issues.apache.org/jira/browse/CARBONDATA-665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

SWATI RAO updated CARBONDATA-665:
---------------------------------
Attachment: Test_Data1.csv
Description:
ORDER BY is not working, so records are not returned in sequence. In addition, the data differs and some values are stored as null.

The data itself is stored incorrectly and differs from Hive.
Spark version: 1.6.2
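
The report mixes two symptoms: rows coming back in a different order, and rows whose values differ or are null. A minimal Python sketch of how the two could be told apart once both result sets have been fetched as lists of tuples is shown here. The function name `compare_result_sets` and the sample rows are hypothetical; they are not taken from Test_Data1.csv or from any CarbonData tooling.

```python
from collections import Counter


def compare_result_sets(hive_rows, carbon_rows):
    """Classify differences between two query results (lists of tuples).

    Hypothetical helper: separates ordering differences from
    content differences and counts null (None) values on each side.
    """
    hive_counts = Counter(hive_rows)
    carbon_counts = Counter(carbon_rows)
    return {
        # True only if both results have the same rows in the same sequence.
        "same_order": hive_rows == carbon_rows,
        # True if both results hold the same rows, ignoring sequence.
        "same_content": hive_counts == carbon_counts,
        # Rows present on one side only (multiset difference).
        "only_in_hive": sorted((hive_counts - carbon_counts).elements(), key=repr),
        "only_in_carbon": sorted((carbon_counts - hive_counts).elements(), key=repr),
        # Number of null cells on each side.
        "hive_nulls": sum(v is None for row in hive_rows for v in row),
        "carbon_nulls": sum(v is None for row in carbon_rows for v in row),
    }


# Made-up rows for illustration only.
hive = [(1, 10, "a"), (2, 20, "b")]
carbon = [(2, 20, "b"), (1, None, "a")]
report = compare_result_sets(hive, carbon)
print(report["same_content"], report["only_in_carbon"])
```

If `same_content` is true but `same_order` is false, only the ordering differs; otherwise the `only_in_*` lists show which rows were stored differently.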

Create 1 query: create table Test_Boundary (c1_int int,c2_Bigint Bigint,c3_Decimal Decimal(38,30),c4_double double,c5_string string,c6_Timestamp Timestamp,c7_Datatype_Desc string) STORED BY 'org.apache.carbondata.format'

Load 1 query: LOAD DATA INPATH 'HDFS_URL/BabuStore/Data/Test_Data1.csv' INTO table Test_Boundary OPTIONS('DELIMITER'=',','QUOTECHAR'='','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='')

Create 2 query: create table Test_Boundary1 (c1_int int,c2_Bigint Bigint,c3_Decimal Decimal(38,30),c4_double double,c5_string string,c6_Timestamp Timestamp,c7_Datatype_Desc string) STORED BY 'org.apache.carbondata.format'

Load 2 query: LOAD DATA INPATH 'HDFS_URL/BabuStore/Data/Test_Data1.csv' INTO table Test_Boundary1 OPTIONS('DELIMITER'=',','QUOTECHAR'='','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='')

Select query:
select c1_int,c2_Bigint,c3_Decimal,c4_double,c5_string,c6_Timestamp,c7_Datatype_Desc from Test_Boundary where c2_bigint=c2_bigint
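
One detail of the select query is worth keeping in mind when comparing row counts: under standard SQL three-valued logic, the predicate `c2_bigint=c2_bigint` is not true for rows where `c2_bigint` is null, so those rows are filtered out. The sketch below illustrates this with Python's built-in `sqlite3` module as a stand-in engine. It does not run against Hive or CarbonData, and the table `t`, the helper name, and the sample values are assumptions made for illustration.

```python
import sqlite3


def rows_passing_self_equality(values):
    """Return the values that survive `WHERE c2_bigint = c2_bigint`.

    Uses an in-memory SQLite table as a stand-in; the table name `t`
    is hypothetical and unrelated to the tables in the report.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (c2_bigint INTEGER)")
        conn.executemany("INSERT INTO t VALUES (?)", [(v,) for v in values])
        cursor = conn.execute(
            "SELECT c2_bigint FROM t "
            "WHERE c2_bigint = c2_bigint "
            "ORDER BY c2_bigint"
        )
        return [row[0] for row in cursor.fetchall()]
    finally:
        conn.close()


# Made-up values: the None row is dropped by the self-equality predicate.
print(rows_passing_self_equality([3, None, 1]))
```

So if one engine stores a `c2_bigint` value as null while the other stores the actual number, this query would return a different number of rows on each side, in addition to any difference in the values themselves.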

was:
Create 1 query : create table Test_Boundary (c1_int int,c2_Bigint Bigint,c3_Decimal Decimal(38,30),c4_double double,c5_string string,c6_Timestamp Timestamp,c7_Datatype_Desc string) STORED BY 'org.apache.carbondata.format'

Load 1 Query : LOAD DATA INPATH 'HDFS_URL/BabuStore/Data/Test_Data1.csv' INTO table Test_Boundary OPTIONS('DELIMITER'=',','QUOTECHAR'='','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='')

Create 2 query : create table Test_Boundary1 (c1_int int,c2_Bigint Bigint,c3_Decimal Decimal(38,30),c4_double double,c5_string string,c6_Timestamp Timestamp,c7_Datatype_Desc string) STORED BY 'org.apache.carbondata.format'

Load 2 query: LOAD DATA INPATH 'HDFS_URL/BabuStore/Data/Test_Data1.csv' INTO table Test_Boundary1 OPTIONS('DELIMITER'=',','QUOTECHAR'='','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='')

Select Query :
select c1_int,c2_Bigint,c3_Decimal,c4_double,c5_string,c6_Timestamp,c7_Datatype_Desc from Test_Boundary where c2_bigint=c2_bigint

> Comparision Failure occurs when we execute the same query in hive and Carbondata
> --------------------------------------------------------------------------------
>
> Key: CARBONDATA-665
> URL: https://issues.apache.org/jira/browse/CARBONDATA-665
> Project: CarbonData
> Issue Type: Bug
> Reporter: SWATI RAO
> Attachments: Test_Data1.csv
>
>
> ORDER BY is not working, so records are not returned in sequence. In addition, the data differs and some values are stored as null.
> The data itself is stored incorrectly and differs from Hive.
> Spark version: 1.6.2
> Create 1 query: create table Test_Boundary (c1_int int,c2_Bigint Bigint,c3_Decimal Decimal(38,30),c4_double double,c5_string string,c6_Timestamp Timestamp,c7_Datatype_Desc string) STORED BY 'org.apache.carbondata.format'
> Load 1 query: LOAD DATA INPATH 'HDFS_URL/BabuStore/Data/Test_Data1.csv' INTO table Test_Boundary OPTIONS('DELIMITER'=',','QUOTECHAR'='','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='')
> Create 2 query: create table Test_Boundary1 (c1_int int,c2_Bigint Bigint,c3_Decimal Decimal(38,30),c4_double double,c5_string string,c6_Timestamp Timestamp,c7_Datatype_Desc string) STORED BY 'org.apache.carbondata.format'
> Load 2 query: LOAD DATA INPATH 'HDFS_URL/BabuStore/Data/Test_Data1.csv' INTO table Test_Boundary1 OPTIONS('DELIMITER'=',','QUOTECHAR'='','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='')
> Select query:
> select c1_int,c2_Bigint,c3_Decimal,c4_double,c5_string,c6_Timestamp,c7_Datatype_Desc from Test_Boundary where c2_bigint=c2_bigint

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)