Hi, all:
CarbonData currently does not work in Hive, which is the most widely used query engine. At my company, if I want to use CarbonData, I need to be able to query CarbonData tables from Hive.

I think we should implement the following features in Hive:
1. DDL: create/drop/alter CarbonData table
2. DML: insert (overwrite)/select

What do you think?
Hi,
Yes, we plan to integrate CarbonData with the Hive engine, but it is not a high-priority item right now, so we will take up this task gradually. Any contributions towards it are welcome.

Regards,
Ravi

On 4 December 2016 at 12:30, Sea <[hidden email]> wrote:
> Hi, all:
> Now carbondata is not working in hive which is the most widely used
> query engine. In my company, if I want to use carbon, I need to query
> carbondata table in hive.
> I think we should implement the following features in hive:
> 1. DDL create/drop/alter carbondata table
> 2. DML insert(overwrite) /select
>
> What do you think?

--
Thanks & Regards,
Ravi
In reply to this post by cenyuhai
Hi
Agreed. Hive is widely used; that is the consensus. The Apache CarbonData community already plans to support Hive integration, and we look forward to seeing your contribution on Hive integration as well :)

Regards
Liang
It looks like we just need to implement CarbonFileStorageFormatDescriptor and CarbonHiveSerde; CarbonInputFormat/CarbonOutputFormat already exist in the master branch.

@Liang, can you create a module for Hive?

For reference, this is how Hive's own descriptor for Parquet looks:

    // In the Hive source tree this class lives in the same package as
    // AbstractStorageFormatDescriptor and IOConstants, so they need no import.
    package org.apache.hadoop.hive.ql.io;

    import java.util.Set;

    import org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat;
    import org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat;
    import org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe;

    import com.google.common.collect.ImmutableSet;

    public class ParquetFileStorageFormatDescriptor extends AbstractStorageFormatDescriptor {
      // Keywords accepted after STORED AS
      @Override
      public Set<String> getNames() {
        return ImmutableSet.of(IOConstants.PARQUETFILE, IOConstants.PARQUET);
      }

      @Override
      public String getInputFormat() {
        return MapredParquetInputFormat.class.getName();
      }

      @Override
      public String getOutputFormat() {
        return MapredParquetOutputFormat.class.getName();
      }

      @Override
      public String getSerde() {
        return ParquetHiveSerDe.class.getName();
      }
    }

------------------ Original ------------------
From: "Liang Chen";<[hidden email]>;
Date: Fri, Dec 9, 2016 11:56 AM
To: "dev"<[hidden email]>;
Subject: Re: About hive integration

> Hi
> Agree. Hive has been widely used, this is a consensus. Apache CarbonData
> community already have the plan to support hive integration, look forward
> to seeing your contribution on hive integration also :)
>
> Regards
> Liang
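
To make the proposal above more concrete, here is a minimal sketch of what a CarbonData descriptor could look like, modeled on the Parquet descriptor quoted in this thread. Everything Carbon-specific in it is an assumption for illustration only: the `CARBONDATA`/`CARBONDATAFILE` keywords, the `org.apache.carbondata.hive` package, and the `MapredCarbonInputFormat`/`MapredCarbonOutputFormat` class names are hypothetical (only `CarbonHiveSerde` is named in the thread, and it does not exist yet). The small `StorageFormatDescriptor` interface is a local stand-in that mirrors the four methods overridden in the Parquet example, so the sketch compiles without Hive on the classpath; a real implementation would extend Hive's `AbstractStorageFormatDescriptor` instead.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Local stand-in for Hive's descriptor contract (assumption: the same four
// methods that the Parquet descriptor overrides). Not the real Hive type.
interface StorageFormatDescriptor {
  Set<String> getNames();
  String getInputFormat();
  String getOutputFormat();
  String getSerde();
}

// Sketch of the descriptor proposed in the thread. All names below are
// hypothetical placeholders, not existing CarbonData or Hive classes.
class CarbonFileStorageFormatDescriptor implements StorageFormatDescriptor {
  // Hypothetical package for a future Hive integration module.
  private static final String PKG = "org.apache.carbondata.hive.";

  // Hypothetical keywords that would follow STORED AS in a CREATE TABLE.
  @Override
  public Set<String> getNames() {
    return Collections.unmodifiableSet(
        new LinkedHashSet<>(Arrays.asList("CARBONDATAFILE", "CARBONDATA")));
  }

  // Hypothetical wrapper class, named after the Mapred* Parquet classes.
  @Override
  public String getInputFormat() {
    return PKG + "MapredCarbonInputFormat";
  }

  // Hypothetical wrapper class, named after the Mapred* Parquet classes.
  @Override
  public String getOutputFormat() {
    return PKG + "MapredCarbonOutputFormat";
  }

  // SerDe name taken from the thread; the class itself is still to be written.
  @Override
  public String getSerde() {
    return PKG + "CarbonHiveSerde";
  }
}
```

The descriptor only hands Hive three class names as strings, so the real work would sit in the SerDe and in the input/output format classes. Whether the existing CarbonInputFormat/CarbonOutputFormat can be plugged in directly or would need wrappers like the `Mapred*` Parquet classes is not settled in this thread.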