     [ https://issues.apache.org/jira/browse/CARBONDATA-658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Geetika Gupta updated CARBONDATA-658:
-------------------------------------
    Description: 
I tried to load data into a table that has a BigInt column. First I loaded small bigint values into the table and noted the carbondata file size; then I loaded maximum bigint values and noted the file size again.

For the large bigint values the carbondata file size was 684.25 KB, while for the small bigint values it was 684.26 KB, so I could not tell whether compression was actually being applied.

I tried the same scenario with the Int datatype. For the large int values the carbondata file size was 684.24 KB, and for the small int values it was 684.26 KB.

Below are the queries:

For the BigInt table:
Create table test(a BigInt, b String) stored by 'carbondata';
LOAD DATA INPATH 'hdfs://localhost:54311/testFiles/100000_LargeBigInt.csv' into table test OPTIONS('DELIMITER'=',', 'QUOTECHAR'='"', 'FILEHEADER'='b,a');
LOAD DATA INPATH 'hdfs://localhost:54311/testFiles/100000_SmallBigInt.csv' into table test OPTIONS('DELIMITER'=',', 'QUOTECHAR'='"', 'FILEHEADER'='b,a');

For the Int table:
Create table test(a Int, b String) stored by 'carbondata';
LOAD DATA INPATH 'hdfs://localhost:54311/testFiles/100000_LargeInt.csv' into table test OPTIONS('DELIMITER'=',', 'QUOTECHAR'='"', 'FILEHEADER'='b,a');
LOAD DATA INPATH 'hdfs://localhost:54311/testFiles/100000_SmallInt.csv' into table test OPTIONS('DELIMITER'=',', 'QUOTECHAR'='"', 'FILEHEADER'='b,a');

  was: I tried to load data into a table having bigInt as a column. Firstly I loaded small bigint values to the table and noted down the carbondata file size then I loaded max bigint values to the table and again noted the carbondata file size.

     Attachment: 100000_SmallInt.csv
                 100000_LargeInt.csv
                 100000_SmallBigInt.csv
                 100000_LargeBigInt.csv
    Environment: spark 1.6, 2.0  (was: spark 1.6)
        Summary: Compression is not working for BigInt and Int datatype  (was: Compression is not working for BigInt and Int)

> Compression is not working for BigInt and Int datatype
> -------------------------------------------------------
>
>                 Key: CARBONDATA-658
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-658
>             Project: CarbonData
>          Issue Type: Bug
>          Components: data-load
>    Affects Versions: 1.0.0-incubating
>         Environment: spark 1.6, 2.0
>            Reporter: Geetika Gupta
>         Attachments: 100000_LargeBigInt.csv, 100000_LargeInt.csv, 100000_SmallBigInt.csv, 100000_SmallInt.csv
>
> I tried to load data into a table that has a BigInt column. First I loaded small bigint values into the table and noted the carbondata file size; then I loaded maximum bigint values and noted the file size again.
> For the large bigint values the carbondata file size was 684.25 KB, while for the small bigint values it was 684.26 KB, so I could not tell whether compression was actually being applied.
> I tried the same scenario with the Int datatype. For the large int values the carbondata file size was 684.24 KB, and for the small int values it was 684.26 KB.
> Below are the queries:
>
> For the BigInt table:
> Create table test(a BigInt, b String) stored by 'carbondata';
> LOAD DATA INPATH 'hdfs://localhost:54311/testFiles/100000_LargeBigInt.csv' into table test OPTIONS('DELIMITER'=',', 'QUOTECHAR'='"', 'FILEHEADER'='b,a');
> LOAD DATA INPATH 'hdfs://localhost:54311/testFiles/100000_SmallBigInt.csv' into table test OPTIONS('DELIMITER'=',', 'QUOTECHAR'='"', 'FILEHEADER'='b,a');
>
> For the Int table:
> Create table test(a Int, b String) stored by 'carbondata';
> LOAD DATA INPATH 'hdfs://localhost:54311/testFiles/100000_LargeInt.csv' into table test OPTIONS('DELIMITER'=',', 'QUOTECHAR'='"', 'FILEHEADER'='b,a');
> LOAD DATA INPATH 'hdfs://localhost:54311/testFiles/100000_SmallInt.csv' into table test OPTIONS('DELIMITER'=',', 'QUOTECHAR'='"', 'FILEHEADER'='b,a');
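The attached CSVs are not reproduced here. For illustration only, a minimal sketch of how input files of the same shape could be generated, assuming column b is a short string and column a holds the integer under test (the actual attachment contents may differ):

import csv
import random

ROWS = 100000
INT_MAX = 2147483647              # max signed 32-bit value
BIGINT_MAX = 9223372036854775807  # max signed 64-bit value

def write_csv(path, low, high):
    # Column order matches FILEHEADER='b,a': a short string, then the integer.
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for i in range(ROWS):
            writer.writerow(["row%d" % i, random.randint(low, high)])

write_csv("100000_SmallBigInt.csv", 0, 1000)                       # small values
write_csv("100000_LargeBigInt.csv", BIGINT_MAX - 1000, BIGINT_MAX) # near-max values
write_csv("100000_SmallInt.csv", 0, 1000)
write_csv("100000_LargeInt.csv", INT_MAX - 1000, INT_MAX)

If value-level compression of the numeric column (for example a delta or adaptive encoding) were taking effect, the segment loaded from a small-value file would be expected to come out noticeably smaller than the one loaded from a near-max file, which is exactly the comparison the report makes.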
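The report does not say how the sizes were read. Assuming the store layout used by CarbonData 1.0 (<storeLocation>/<database>/<table>/Fact/Part0/Segment_N, where the store location comes from carbon.storelocation and each Segment_N directory corresponds to one load), the per-load sizes can be compared with hdfs dfs -du -h on the table's Fact/Part0 directory; the .carbondata files inside each segment are the ones whose sizes are quoted above.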
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)