[jira] [Updated] (CARBONDATA-665) Comparision Failure occurs when we execute the same query in hive and Carbondata

classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view
|

[jira] [Updated] (CARBONDATA-665) Comparision Failure occurs when we execute the same query in hive and Carbondata

Akash R Nilugal (Jira)

     [ https://issues.apache.org/jira/browse/CARBONDATA-665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

SWATI RAO updated CARBONDATA-665:
---------------------------------
     Attachment: Test_Data1.csv
    Description:
Orderby is not working , so records are not coming in sequence as well there is data difference and some values being stored as null

Data itself is stored incorrectly and is different from Hive
Spark version :1.6.2


Create 1 query : create table Test_Boundary (c1_int int,c2_Bigint Bigint,c3_Decimal Decimal(38,30),c4_double double,c5_string string,c6_Timestamp Timestamp,c7_Datatype_Desc string) STORED BY 'org.apache.carbondata.format'

Load 1 Query : LOAD DATA INPATH 'HDFS_URL/BabuStore/Data/Test_Data1.csv' INTO table Test_Boundary OPTIONS('DELIMITER'=',','QUOTECHAR'='','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='')

Create 2 query : create table Test_Boundary1 (c1_int int,c2_Bigint Bigint,c3_Decimal Decimal(38,30),c4_double double,c5_string string,c6_Timestamp Timestamp,c7_Datatype_Desc string) STORED BY 'org.apache.carbondata.format'

Load 2 query:  LOAD DATA INPATH 'HDFS_URL/BabuStore/Data/Test_Data1.csv' INTO table Test_Boundary1 OPTIONS('DELIMITER'=',','QUOTECHAR'='','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='')

Select Query :
select c1_int,c2_Bigint,c3_Decimal,c4_double,c5_string,c6_Timestamp,c7_Datatype_Desc from Test_Boundary where c2_bigint=c2_bigint

  was:
 Create 1 query : create table Test_Boundary (c1_int int,c2_Bigint Bigint,c3_Decimal Decimal(38,30),c4_double double,c5_string string,c6_Timestamp Timestamp,c7_Datatype_Desc string) STORED BY 'org.apache.carbondata.format'

Load 1 Query : LOAD DATA INPATH 'HDFS_URL/BabuStore/Data/Test_Data1.csv' INTO table Test_Boundary OPTIONS('DELIMITER'=',','QUOTECHAR'='','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='')

Create 2 query : create table Test_Boundary1 (c1_int int,c2_Bigint Bigint,c3_Decimal Decimal(38,30),c4_double double,c5_string string,c6_Timestamp Timestamp,c7_Datatype_Desc string) STORED BY 'org.apache.carbondata.format'

Load 2 query:  LOAD DATA INPATH 'HDFS_URL/BabuStore/Data/Test_Data1.csv' INTO table Test_Boundary1 OPTIONS('DELIMITER'=',','QUOTECHAR'='','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='')

Select Query :
select c1_int,c2_Bigint,c3_Decimal,c4_double,c5_string,c6_Timestamp,c7_Datatype_Desc from Test_Boundary where c2_bigint=c2_bigint


> Comparision Failure occurs when we execute the same query in hive and Carbondata
> --------------------------------------------------------------------------------
>
>                 Key: CARBONDATA-665
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-665
>             Project: CarbonData
>          Issue Type: Bug
>            Reporter: SWATI RAO
>         Attachments: Test_Data1.csv
>
>
> Orderby is not working , so records are not coming in sequence as well there is data difference and some values being stored as null
> Data itself is stored incorrectly and is different from Hive
> Spark version :1.6.2
> Create 1 query : create table Test_Boundary (c1_int int,c2_Bigint Bigint,c3_Decimal Decimal(38,30),c4_double double,c5_string string,c6_Timestamp Timestamp,c7_Datatype_Desc string) STORED BY 'org.apache.carbondata.format'
> Load 1 Query : LOAD DATA INPATH 'HDFS_URL/BabuStore/Data/Test_Data1.csv' INTO table Test_Boundary OPTIONS('DELIMITER'=',','QUOTECHAR'='','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='')
> Create 2 query : create table Test_Boundary1 (c1_int int,c2_Bigint Bigint,c3_Decimal Decimal(38,30),c4_double double,c5_string string,c6_Timestamp Timestamp,c7_Datatype_Desc string) STORED BY 'org.apache.carbondata.format'
> Load 2 query:  LOAD DATA INPATH 'HDFS_URL/BabuStore/Data/Test_Data1.csv' INTO table Test_Boundary1 OPTIONS('DELIMITER'=',','QUOTECHAR'='','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='')
> Select Query :
> select c1_int,c2_Bigint,c3_Decimal,c4_double,c5_string,c6_Timestamp,c7_Datatype_Desc from Test_Boundary where c2_bigint=c2_bigint



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)