Hi Community,
As we know, CarbonData is an indexed columnar data format for fast analytics on big data platforms. We have already integrated it with query engines such as Spark and Presto. With Presto, we currently support only querying carbondata files; we do not yet support writing carbondata files through the Presto engine.

Currently, Presto is integrated with CarbonData only for reading carbondata files. For this, the store must already exist (for example, written by Carbon in Spark), and the table must be in the Hive metastore. Using the carbondata connector we can then read the carbondata files, but we cannot create a table or load data into a table from Presto. So reading carbon files is a somewhat tedious job, because they first have to be written with another engine.

So here I will be trying to add transactional load support to the Presto integration for Carbon.

I have attached the design document in the Jira; please refer to it. Any suggestions or inputs are most welcome.

https://issues.apache.org/jira/browse/CARBONDATA-3831

Regards,
Akash R.
+1,
It would be great to have write support from Presto.

Thanks,
Kunal Kapoor

On Tue, Jul 14, 2020 at 6:08 PM Akash Nilugal <[hidden email]> wrote:
> Hi Community,
> [...]
+1,
I have some suggestions and questions.

a) You mentioned that, currently, creating a table from Presto and inserting data will give a non-transactional table. So, to create a transactional table, do we still depend on Spark? *I feel we should also support transactional table creation, with all table properties, from Presto (to remove the dependency on Spark).*

b) If Spark has created a global sort table and Presto inserts data, what will happen? Do we ignore those table properties in the Presto write?

c) Partition and complex type support can be handled as an immediate follow-up to this, as these are very common features nowadays.

Thanks,
Ajantha

On Mon, 27 Jul, 2020, 1:07 pm Kunal Kapoor, <[hidden email]> wrote:
> +1,
> It would be great to have write support from presto
> [...]
In reply to this post by akashnilugal@gmail.com
+1
Good to enhance Carbon with Presto engine support.

Regards,
Venu

--
Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/
In reply to this post by Ajantha Bhat
Hi Ajantha,
Thanks for the inputs. Please check the comments below.

> a) You mentioned that, currently, creating a table from Presto and inserting data will give a non-transactional table. So, to create a transactional table, do we still depend on Spark?

====> Currently it depends on Spark, but I'm planning to support create table from Presto and remove the dependency. I will start on that once I finish this requirement.

> b) If Spark has created a global sort table and Presto inserts data, what will happen? Do we ignore those table properties in the Presto write?

====> Currently we ignore the table properties and consider only the system-level properties. Once we support create table, we can consider these.

> c) Partition and complex type support can be handled as an immediate follow-up to this, as these are very common features nowadays.

====> This is already planned, so I will create subtasks in Jira to follow up on these.

Thanks and regards,
Akash R
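
The behaviour described in answer (b) can be pictured with a small sketch. This is only an illustration under stated assumptions, not code from CarbonData or the design document: the function name `resolve_load_properties`, the `honor_table_properties` flag, the `sort_scope` key, and the default value `NO_SORT` are all hypothetical names chosen for the example.

```python
# Hypothetical sketch: none of these names come from the CarbonData code base.

# Assumed system-level setting, used only for illustration.
SYSTEM_DEFAULTS = {"sort_scope": "NO_SORT"}


def resolve_load_properties(table_properties, system_properties,
                            honor_table_properties=False):
    """Return the properties a Presto write would use in this sketch.

    honor_table_properties=False models the behaviour described in answer (b):
    table properties are ignored and only system-level properties apply.
    honor_table_properties=True models a possible later behaviour in which
    table properties override the system-level values.
    """
    resolved = dict(system_properties)
    if honor_table_properties:
        resolved.update(table_properties)
    return resolved


# A table that Spark created with a global sort scope.
spark_table = {"sort_scope": "GLOBAL_SORT"}

current = resolve_load_properties(spark_table, SYSTEM_DEFAULTS)
later = resolve_load_properties(spark_table, SYSTEM_DEFAULTS,
                                honor_table_properties=True)

print(current["sort_scope"])  # NO_SORT
print(later["sort_scope"])    # GLOBAL_SORT
```

In this sketch, a Presto insert into the Spark-created table would not use the global sort setting until table properties are taken into account.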