http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Discussion-Support-pre-aggregate-table-to-improve-OLAP-performance-tp24040p24165.html
for the whole data. With the above approach, as the data in the main
table grows, the loading time will also increase substantially. Another way is
> +1, I agree with Jacky's points.
> As we know, CarbonData is already able to get very good performance for
> filter query scenarios through the MDK index. Supporting pre-aggregate
> tables in 1.3.0 would improve aggregated query scenarios, so users can use
> one CarbonData table to support all query cases (both filter and aggregation).
>
> To Lu Cao: the solution you mentioned, building a cube schema, is too
> complex and has many limitations; for example, the CUBE data can't
> support querying the detail data.
>
> Regards
> Liang
>
>
> Jacky Li wrote
> > Hi Lu Cao,
> >
> > In my previous experience with “cube” engines, whether ROLAP or
> > MOLAP, the cube sits above the SQL layer: not only does the user have to
> > establish the cube schema by transforming metadata from the data
> > warehouse star schema, but the engine also defines its own query
> > language, such as MDX, and many times these languages are not
> > standardized, so different vendors need to provide different BI tools or
> > adaptors for them.
> > So, although some vendors provide easy-to-use cube management tools, this
> > approach has at least two problems: vendor lock-in and the rigidity of
> > the cube model once it is defined. I think these problems also appear in
> > other vendor-specific solutions.
> >
> > Currently, one of the strengths of the Carbon store is that it
> > complies with standard SQL by integrating with SparkSQL, Hive, etc.
> > The intention of providing pre-aggregate table support is to let Carbon
> > improve OLAP query performance while still sticking with standard SQL,
> > which means all users can keep using the same BI/JDBC
> > applications/tools that connect to SparkSQL, Hive, etc.
> >
> > If Carbon were to support “cube”, it would not only need to define a
> > configuration that may be very complex and non-standard, but would also
> > force users to use vendor-specific tools for management and visualization.
> > So, before taking on this complexity, I think it is better to provide
> > pre-aggregate tables as the first step.
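A minimal sketch of this CTAS-based pre-aggregate flow, using SQLite in Python purely as a stand-in for CarbonData (the `sales` table and all names here are hypothetical, and CarbonData's actual CTAS grammar differs):

```python
import sqlite3

# Hypothetical main table; SQLite stands in for CarbonData here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (country TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("US", 10), ("US", 20), ("CN", 5)])

# CTAS builds the pre-aggregate table from the main table in a single
# statement, staying entirely within standard SQL.
conn.execute("""CREATE TABLE sales_agg AS
                SELECT country, SUM(amount) AS total
                FROM sales GROUP BY country""")

# Aggregate queries can then hit the (much smaller) pre-aggregate table.
rows = dict(conn.execute("SELECT country, total FROM sales_agg"))
```

The point being: no new query language or vendor-specific tool is involved; the whole flow is ordinary SQL that any BI/JDBC client can issue.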
> >
> > Although we do not want the full complexity of “cube” on arbitrary data
> > schemas, one special case is time-series data. Because the time
> > dimension hierarchy (year/month/day/hour/minute/second) is naturally
> > understandable and consistent across all scenarios, we can provide
> > native support for pre-aggregate tables on the time dimension. This is
> > effectively a cube on time, and we can do automatic rollup across all
> > time levels.
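The rollup along the time hierarchy can be sketched as follows (again SQLite in Python, not CarbonData; the `events` table is hypothetical). The finest pre-aggregate level (day) is built from the raw data, and coarser levels (month, year) can then be rolled up from the day table instead of rescanning the raw data:

```python
import sqlite3

# Hypothetical time-series table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (ts TEXT, val INTEGER)")
conn.executemany("INSERT INTO events VALUES (?, ?)", [
    ("2017-10-15 09:00:00", 1),
    ("2017-10-15 10:00:00", 2),
    ("2017-11-01 00:00:00", 4),
])

# Day-level pre-aggregate built from the raw events.
conn.execute("""CREATE TABLE agg_day AS
                SELECT strftime('%Y-%m-%d', ts) AS day, SUM(val) AS total
                FROM events GROUP BY day""")

# Month-level rollup computed from the day-level table, not the raw data.
conn.execute("""CREATE TABLE agg_month AS
                SELECT substr(day, 1, 7) AS month, SUM(total) AS total
                FROM agg_day GROUP BY month""")

month_totals = dict(conn.execute("SELECT month, total FROM agg_month"))
```

This is the "automatic rollup" idea: because SUM is reaggregable, each coarser time level can be derived from the level below it.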
> >
> > Finally, please note that by using CTAS syntax we are not restricting
> > Carbon to pre-aggregate tables only; it can also support arbitrary
> > materialized views, if we want, in the future.
> >
> > Hope this makes things clearer.
> >
> > Regards,
> > Jacky
> >
> >
> >
> > Actually, as you can see in the document, I am deliberately avoiding
> > calling this “cube”.
> >
> >
> >> On 2017-10-15 at 9:18 PM, Lu Cao <whucaolu@> wrote:
> >>
> >> Hi Jacky,
> >> If a user wants to create a cube on the main table, does he/she have to
> >> create multiple pre-aggregate tables? It would be a heavy workload to
> >> write so many CTAS commands. If the user only needs to create a few
> >> pre-agg tables, current Carbon can already support this: the user can
> >> create the table first and then use an INSERT INTO ... SELECT statement.
> >> The only difference is that the user needs to query the pre-agg table
> >> instead of the main table.
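The manual two-step alternative described here can be sketched like this (SQLite in Python as a stand-in; `main_tbl` and `agg_tbl` are hypothetical names):

```python
import sqlite3

# Hypothetical main table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE main_tbl (k TEXT, v INTEGER)")
conn.executemany("INSERT INTO main_tbl VALUES (?, ?)",
                 [("a", 1), ("a", 2), ("b", 3)])

# Step 1: create the aggregate table explicitly.
conn.execute("CREATE TABLE agg_tbl (k TEXT, total INTEGER)")
# Step 2: fill it with INSERT INTO ... SELECT.
conn.execute("""INSERT INTO agg_tbl
                SELECT k, SUM(v) FROM main_tbl GROUP BY k""")

# The user must then remember to query agg_tbl rather than main_tbl.
totals = dict(conn.execute("SELECT k, total FROM agg_tbl"))
```

The drawback Lu Cao points out is visible in the sketch: the burden of creating, filling, and routing queries to each aggregate table falls entirely on the user.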
> >>
> >> So maybe we can enable the user to create a cube model (in the schema or
> >> a metafile?) which contains multiple pre-aggregation definitions, and
> >> Carbon can create those pre-agg tables automatically according to the
> >> model. That would be easier to use and maintain.
> >>
> >> Regards,
> >> Lionel
> >>
> >> On Sun, Oct 15, 2017 at 3:56 PM, Jacky Li <jacky.likun@> wrote:
> >>
> >>> Hi Liang,
> >>>
> >>> For alter table, data update/delete, and delete segment, the handling
> >>> is the same, which is why the document says: “User can manually perform
> >>> this operation and rebuild the pre-aggregate table as in the update
> >>> scenario.”
> >>> The user needs to drop the associated aggregate table, perform the
> >>> alter table, data update/delete, or delete segment operation, and then
> >>> create the pre-agg table again with the CTAS command; the pre-aggregate
> >>> table will then be rebuilt.
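The drop-modify-recreate flow Jacky describes can be sketched as follows (SQLite in Python as a stand-in for CarbonData; table names are hypothetical):

```python
import sqlite3

# Hypothetical main table with an existing pre-aggregate table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE main_tbl (k TEXT, v INTEGER)")
conn.executemany("INSERT INTO main_tbl VALUES (?, ?)", [("a", 1), ("b", 2)])
conn.execute("""CREATE TABLE agg_tbl AS
                SELECT k, SUM(v) AS total FROM main_tbl GROUP BY k""")

# 1. Drop the associated aggregate table.
conn.execute("DROP TABLE agg_tbl")
# 2. Perform the update/delete (or alter table / delete segment) on the
#    main table.
conn.execute("UPDATE main_tbl SET v = 10 WHERE k = 'a'")
# 3. Re-run the same CTAS; the pre-aggregate table is rebuilt against the
#    updated data.
conn.execute("""CREATE TABLE agg_tbl AS
                SELECT k, SUM(v) AS total FROM main_tbl GROUP BY k""")

rebuilt = dict(conn.execute("SELECT k, total FROM agg_tbl"))
```

The rebuilt aggregate reflects the update to the main table, which is the invariant the manual procedure is meant to restore.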
> >>>
> >>> Regards,
> >>> Jacky
> >>>
> >>>> On 2017-10-15 at 2:50 PM, Liang Chen <chenliang6136@> wrote:
> >>>>
> >>>> Hi Jacky
> >>>>
> >>>> Thanks for starting this discussion; this is a great feature for
> >>>> CarbonData.
> >>>>
> >>>> One question:
> >>>> For the sub-task “Handle alter table scenarios for aggregation table”,
> >>>> please give more detailed information.
> >>>> From the PDF attachment quoted below, it looks like no handling is
> >>>> needed for the agg table when users alter the main table, so can you
> >>>> provide more detail on which scenarios need to be handled?
> >>>> ------------------------------------------------------------------------
> >>>> Adding a new column will not impact the agg table.
> >>>> Deleting or renaming an existing column may invalidate agg tables; if
> >>>> it does, the operation will be rejected.
> >>>> The user can manually perform the operation and rebuild the
> >>>> pre-aggregate table as in the update scenario.
> >>>>
> >>>> Regards
> >>>> Liang
> >>>>
> >>>>
> >>>> --
> >>>> Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/
> >>>
> >>>
> >>>
> >>>
>
>
>
>
>