
Re: [Discussion] Carbon Store abstraction

Posted by sraghunandan on Oct 20, 2017; 8:56am
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Discussion-Carbon-Store-abstraction-tp24337p24427.html

I think we need to integrate with Presto and Hive and then refactor. This gives
a clear idea of what we want to achieve. Each processing engine is different
in its own way, and integrating first would give us a clear idea of what's
required in CarbonData.
On Fri, 20 Oct 2017 at 1:01 PM, Liang Chen <[hidden email]> wrote:

> Hi
>
> Thank you for starting this discussion. I agree: to expose a clear interface
> to users, some optimization work is needed.
>
> Can you give more detail about your proposal? For example: which classes
> you propose to move to the carbon store, and which APIs you propose to
> create and expose to users.
> I suggest we discuss and confirm your proposal on the dev list first, then
> start creating sub-tasks in Jira.
>
> Regards
> Liang
>
>
> Jacky Li wrote
> > Hi community,
> >
> > I am proposing to create a carbondata-store module to abstract the carbon
> > store concept. The reasons are:
> >
> > 1. Initially, carbon was designed as a file format. As it evolved to
> > provide more features, it implemented more and more functionality in the
> > Spark integration module. However, as the community tries to integrate
> > more and more compute frameworks with carbon, this functionality is
> > duplicated across the integration layers. Ideally, it can be
> > unified and provided in one place.
> >
> > 2. The interface that carbondata currently exposes to users is SQL, but
> > the interface for developers who want to do compute engine
> > integration is not very clear.
> >
> > 3. Carbon supports many SQL commands, but they are
> > implemented through Spark RDDs only, so they cannot be shared across
> > compute frameworks.
> >
> > For these reasons, and for the long-term future of carbondata, I think it
> > is better to abstract the interface for compute engine integration into
> > a new module called carbondata-store. It can wrap all store-level
> > functionality that sits above the file format in a module independent of
> > any compute engine, so that every integration module can depend on it and
> > duplicate code is removed.
> >
> > This is a continuous, long-term effort. If you agree, I will break this
> > work into subtasks and start by creating JIRA issues.
> >
> > Regards,
> > Jacky Li
>
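
To make the quoted proposal a little more concrete, here is a minimal sketch of the layering it describes: one engine-neutral store interface, with each engine integration depending on it instead of reimplementing store logic. Python is used only for brevity. Every name here (CarbonStore, InMemoryStore, SparkLikeIntegration, PrestoLikeIntegration, load, scan) is hypothetical and is not an existing CarbonData API; the in-memory class merely stands in for the file-format layer.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, object]


class CarbonStore(ABC):
    """Hypothetical engine-neutral store interface; not an actual CarbonData API."""

    @abstractmethod
    def load(self, table: str, rows: Iterable[Row]) -> None:
        """Append rows to a table."""

    @abstractmethod
    def scan(self, table: str, columns: List[str]) -> List[Row]:
        """Return the requested columns of every row in a table."""


class InMemoryStore(CarbonStore):
    """Toy implementation standing in for the file-format layer (assumption)."""

    def __init__(self) -> None:
        self._tables: Dict[str, List[Row]] = {}

    def load(self, table: str, rows: Iterable[Row]) -> None:
        # Copy each row so later changes by the caller do not leak into the store.
        self._tables.setdefault(table, []).extend(dict(r) for r in rows)

    def scan(self, table: str, columns: List[str]) -> List[Row]:
        # Unknown tables yield an empty result instead of raising.
        return [{c: row[c] for c in columns} for row in self._tables.get(table, [])]


class SparkLikeIntegration:
    """Hypothetical engine adapter: exposes rows as dictionaries."""

    def __init__(self, store: CarbonStore) -> None:
        self.store = store

    def select(self, table: str, columns: List[str]) -> List[Row]:
        return self.store.scan(table, columns)


class PrestoLikeIntegration:
    """Hypothetical engine adapter: exposes rows as tuples in column order."""

    def __init__(self, store: CarbonStore) -> None:
        self.store = store

    def read_page(self, table: str, columns: List[str]) -> List[Tuple[object, ...]]:
        return [tuple(row[c] for c in columns) for row in self.store.scan(table, columns)]
```

Both adapters share the same load and scan logic and differ only in how they shape results for their engine, which is the duplication the proposal aims to remove. Whether such an interface is better designed before or after the Presto and Hive integrations is the open question in this thread.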