
Re: [Discussion] Roadmap for Apache CarbonData 2

Posted by kumarvishal09 on Aug 13, 2019; 11:21am
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Discussion-Roadmap-for-Apache-CarbonData-2-tp82223p83528.html

Hi Ravi,

We could add the following requirements for 2.0:

1. Data loading performance improvement (needs analysis first).
2. Unified reading of the carbon data file. Currently data is read in two
parts, dimension and measure, which increases the number of IO operations.
3. Carbon store size optimization (a PR is already raised and needs to be
revisited); we can also explore further optimizations such as hybrid
RLE/bit-packing.
4. Presto enhancements (such as write support, Presto SQL adaptation, and
complex type read support).
5. Spark Data Source V2 integration.
6. Spatial index support.
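
To make item 3 a bit more concrete, here is a rough sketch of what a hybrid RLE/bit-packing encoding could look like. This is only an illustration under my own assumptions, not the CarbonData format and not the change in the existing PR: the names (`encode`, `decode`, `pack_bits`, `unpack_bits`), the `MIN_RUN` threshold, and the in-memory run layout are all hypothetical, and it handles non-negative integers only. Long repeats are stored as (count, value) runs; everything else is packed at the minimum bit width.

```python
from itertools import groupby

MIN_RUN = 8  # hypothetical threshold: shorter repeats are bit-packed instead


def pack_bits(values, width):
    """Pack non-negative ints into bytes, `width` bits each, lowest bits first."""
    acc = 0
    for i, v in enumerate(values):
        acc |= v << (i * width)
    nbytes = (len(values) * width + 7) // 8
    return acc.to_bytes(nbytes, "little")


def unpack_bits(data, width, count):
    """Inverse of pack_bits: read `count` values of `width` bits each."""
    acc = int.from_bytes(data, "little")
    mask = (1 << width) - 1
    return [(acc >> (i * width)) & mask for i in range(count)]


def encode(values):
    """Return (bit_width, runs); each run is ('rle' | 'packed', count, payload)."""
    width = max(1, max(values, default=0).bit_length())
    runs, literal = [], []

    def flush():
        # Emit any pending non-repeating values as one bit-packed run.
        if literal:
            runs.append(("packed", len(literal), pack_bits(literal, width)))
            literal.clear()

    for value, group in groupby(values):
        n = len(list(group))
        if n >= MIN_RUN:
            flush()
            runs.append(("rle", n, value))
        else:
            literal.extend([value] * n)
    flush()
    return width, runs


def decode(width, runs):
    """Rebuild the original list of values from the encoded runs."""
    out = []
    for kind, count, payload in runs:
        if kind == "rle":
            out.extend([payload] * count)
        else:
            out.extend(unpack_bits(payload, width, count))
    return out
```

For example, `decode(*encode([7] * 20 + [1, 2, 3, 0, 5] + [4] * 10))` gives back the original list, with the two long repeats stored as RLE runs and the five values in between packed at 3 bits each.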


-Regards
Kumar Vishal

On Thu, Jul 18, 2019 at 8:20 PM Ravindra Pesala <[hidden email]>
wrote:

> Hi Kevin,
>
> Yes, we can improve it. The implementation is closely related to supporting
> pre-aggregate datamaps on streaming tables, which we implemented some time
> ago. The same will be reimplemented for the MV datamap soon.
> That implementation uses the pre-aggregate datamap for non-streaming
> segments and the main table for streaming segments. We update the query plan
> to do a union of both tables and query only the streaming segments of the
> main table.
> In this case we can work the same way: run a union query over the MV table
> and the main table (only the segments not yet loaded into the datamap).
> We can definitely consider this after we support streaming tables for the
> MV datamap.
>
> Regards,
> Ravindra.
>
> On Wed, 17 Jul 2019 at 07:55, kevinjmh <[hidden email]> wrote:
>
> > Currently, a datamap in carbon applies to all segments.
> > The roadmap refers to commands like add/drop segment, and maybe also
> > something about incremental loading for MV. For these scenarios, it is
> > better to make the datamap usable at segment level instead of disabling
> > the datamap when its data is not ready for some segment. This would also
> > make the datamap fail-safe and enhance carbon's stability.
> > Maybe we can consider this as well.
> >
> >
> >
> >
> > -----
> > Regards
> > Manhua
> >
> >
> >
> > ---Original---
> > From: "Ravindra Pesala"<[hidden email]>
> > Date: Tue, Jul 16, 2019 22:31 PM
> > To: "dev"<[hidden email]>;
> > Subject: [Discussion] Roadmap for Apache CarbonData 2
> >
> >
> > Hi Community,
> >
> > Three years have passed since the launch of the Apache CarbonData
> > project, and CarbonData has become a popular data management solution for
> > various scenarios. As new workloads like AI and new runtime environments
> > like the cloud are emerging quickly, I think we have reached a point where
> > we need to discuss the future of CarbonData.
> >
> > To bring CarbonData to a new level and satisfy those new requirements,
> > Jacky and I drafted a roadmap for CarbonData 2 on the cwiki website.
> > - English Version:
> > https://cwiki.apache.org/confluence/display/CARBONDATA/Apache+CarbonData+2+Roadmap+Proposal
> > - Chinese Version:
> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=120737492
> >
> > Please feel free to discuss the roadmap in this thread. We welcome all
> > feedback that helps make CarbonData better.
> >
> > Thanks and Regards,
> > Ravindra.
>
>
>
> --
> Thanks & Regards,
> Ravi
>