Neha Bhardwaj created CARBONDATA-917:
----------------------------------------

             Summary: count(*) doesn't work
                 Key: CARBONDATA-917
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-917
             Project: CarbonData
          Issue Type: Bug
          Components: data-query
         Environment: scala 2.1, Hive 1.2.1
            Reporter: Neha Bhardwaj
            Priority: Minor
         Attachments: abc.csv

A select query with count(*) fails to return any output.

Steps to reproduce:

1) In the Spark shell:

a) Create table -

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.CarbonSession._
    val carbon = SparkSession.builder().config(sc.getConf).getOrCreateCarbonSession("hdfs://localhost:54310/opt/data")

    scala> carbon.sql(" create table abc(id int, name string) stored by 'carbondata' ").show

b) Load data -

    scala> carbon.sql(""" load data inpath 'hdfs://localhost:54310/Files/abc.csv' into table abc """).show

2) In Hive:

a) Add jars -

    add jar /home/neha/incubator-carbondata/assembly/target/scala-2.11/carbondata_2.11-1.1.0-incubating-SNAPSHOT-shade-hadoop2.7.2.jar;
    add jar /opt/spark-2.1.0-bin-hadoop2.7/jars/spark-catalyst_2.11-2.1.0.jar;
    add jar /home/neha/incubator-carbondata/integration/hive/carbondata-hive-1.1.0-incubating-SNAPSHOT.jar;

b) Create table -

    create table abc(id int, name string);

c) Alter location -

    hive> alter table abc set LOCATION 'hdfs://localhost:54310/opt/data/default/abc';

d) Set properties -

    set hive.mapred.supports.subdirectories=true;
    set mapreduce.input.fileinputformat.input.dir.recursive=true;

e) Alter file format -

    alter table abc set FILEFORMAT
    INPUTFORMAT "org.apache.carbondata.hive.MapredCarbonInputFormat"
    OUTPUTFORMAT "org.apache.carbondata.hive.MapredCarbonOutputFormat"
    SERDE "org.apache.carbondata.hive.CarbonHiveSerDe";

f) Query -

    hive> select count(*) from abc;

Expected output:

The result set should display the number of rows in the table.

Actual result:

    Query ID = hduser_20170412181449_85a7db42-42a1-450c-9931-dc7b3b00b412
    Total jobs = 1
    Launching Job 1 out of 1
    Number of reduce tasks determined at compile time: 1
    In order to change the average load for a reducer (in bytes):
      set hive.exec.reducers.bytes.per.reducer=<number>
    In order to limit the maximum number of reducers:
      set hive.exec.reducers.max=<number>
    In order to set a constant number of reducers:
      set mapreduce.job.reduces=<number>
    Job running in-process (local Hadoop)
    2017-04-12 18:14:53,949 Stage-1 map = 0%, reduce = 0%
    Ended Job = job_local220086106_0001 with errors
    Error during job, obtaining debugging information...
    Job Tracking URL: http://localhost:8080/
    FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
    MapReduce Jobs Launched:
    Stage-Stage-1: HDFS Read: 0 HDFS Write: 0 FAIL
    Total MapReduce CPU Time Spent: 0 msec
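Note: the contents of the attached abc.csv are not reproduced in the report. A minimal file matching the abc(id int, name string) schema would look like the following (hypothetical rows, with a header line matching the column names, which CarbonData's LOAD DATA consumes by default):

    id,name
    1,David
    2,Neha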
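As a sanity check (not part of the original report), the same aggregate can be run from the Spark shell after step 1b; if it returns a count there while the Hive query fails, the failure is isolated to the Hive integration path rather than the loaded data. A plain projection in Hive narrows it further, since count(*) is executed through a MapReduce job while a simple select may not be:

    scala> carbon.sql("select count(*) from abc").show

    hive> select * from abc;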