Hi all,
For the first phase, I think supporting reading carbon tables in Hive is enough. We still have some things to do:
1. Make the carbon schema compatible with Hive (CARBONDATA-1008) (create table and alter table); a sketch follows below.
2. Filter pushdown (especially partition filter, see FilterPushdownDev).
3. A tool to update the existing tables' schema to be compatible with Hive.
Do you have any ideas?
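For item 1, a minimal sketch of what a Hive-compatible definition of a carbon table could look like, assuming the connector classes from the carbondata-hive module (the table name, columns, and store path here are hypothetical):

    -- Register an existing carbon store location as an external Hive table.
    CREATE EXTERNAL TABLE carbon_demo (id INT, name STRING)
    ROW FORMAT SERDE "org.apache.carbondata.hive.CarbonHiveSerDe"
    STORED AS
      INPUTFORMAT "org.apache.carbondata.hive.MapredCarbonInputFormat"
      OUTPUTFORMAT "org.apache.carbondata.hive.MapredCarbonOutputFormat"
    LOCATION 'hdfs://localhost:54310/opt/carbonStore/default/carbon_demo';

The point of CARBONDATA-1008 is that the schema carbon writes should line up with what Hive expects here, so a definition like this works without manual fix-ups.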
Hi cenyuhai
Thanks for starting this discussion about Hive integration.

1. Make the carbon schema compatible with Hive (CARBONDATA-1008) (create table and alter table)
Liang: As you mentioned, the first phase (1.2.0) supports reading CarbonData files in Hive. So can I understand the flow like this: all steps of preparing CarbonData files are handled in Spark, so "create table and alter table" would be handled in Spark; in Hive there is only read (query). Can you explain a little more about which part the schema compatibility with Hive is needed for?

2. Filter pushdown (especially partition filter, FilterPushdownDev)
Liang: LGTM for this point.

3. A tool to update the existing tables' schema to be compatible with Hive
Liang: Same comment as for question 1. Can you give some examples of "the existing tables' schema"?

For the Hive integration feature in Apache CarbonData 1.2.0, I propose the scope as below:
1. Only support reading/querying CarbonData files in Hive. Writing CarbonData (create carbon table, alter carbon table, load data, etc.) in Hive will be supported in the future (a new mailing list topic can be opened to discuss that plan).
2. Utilize CarbonData's strong features (index, dictionary, ...) to get good query performance; Hive+CarbonData should be faster than Hive+ORC.
3. Provide a solution/tool to migrate all Hive tables & data to carbon tables & data in Spark (see the sketch after this message).

Regards
Liang
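On scope point 3, a minimal sketch of what the migration could look like in Spark SQL, assuming CarbonData's STORED BY 'carbondata' create-table syntax and a hypothetical source Hive table named hive_sales:

    -- Create the target carbon table with a matching schema.
    CREATE TABLE carbon_sales (id INT, country STRING, amount DOUBLE)
    STORED BY 'carbondata';

    -- Copy the data over; CarbonData builds its index and dictionary during the load.
    INSERT INTO TABLE carbon_sales SELECT id, country, amount FROM hive_sales;

A migration tool would mostly automate this per table: read each Hive table's schema from the metastore, map the column types, and run the copy.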
Hi cenyuhai, can you tell why the tool would be required? You already have a PR for making the carbon schema compatible with Hive (CARBONDATA-1008); what would this tool do?

@Liang: Hi. By "existing tables' schema", cenyuhai means that when you are reading a carbondata table from Hive, you need to alter the schema of that carbon table to use MapredCarbonInputFormat and MapredCarbonOutputFormat, which are compatible with Hive, using the following steps:

    alter table CHARTYPES31 set FILEFORMAT
      INPUTFORMAT "org.apache.carbondata.hive.MapredCarbonInputFormat"
      OUTPUTFORMAT "org.apache.carbondata.hive.MapredCarbonOutputFormat"
      SERDE "org.apache.carbondata.hive.CarbonHiveSerDe";

    alter table CHARTYPES3 set LOCATION
      'hdfs://localhost:54310/opt/carbonStore/default/CHARTYPES3';

Thanks and Regards,
Anubhav Tarar
Software Consultant, Knoldus Software LLP
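After those two ALTER statements, the table should be readable from Hive like any other table; a hypothetical smoke test:

    -- Plain Hive query against the re-registered carbon table.
    SELECT COUNT(*) FROM CHARTYPES3;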
Hi Anubhav,
You are right, this tool is unnecessary.