[DISCUSSION] Support Incremental load in datamap and other MV datamap enhancement

Posted by akashrn5 on Feb 15, 2019; 1:18pm
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/DISCUSSION-Support-Incremental-load-in-datamap-and-other-MV-datamap-enhancement-tp75160.html

Hi,

Currently CarbonData has datamaps such as preaggregate, lucene, bloom and mv,
and supports both lazy and non-lazy loading of data into datamaps. However,
lazy load is not allowed for the preaggregate, lucene and bloom datamaps; it
is allowed only for the mv datamap. In a lazy load of an mv datamap, every
rebuild (load to the datamap) loads the complete data of the main table and
overwrites the existing segment in the datamap, based on the datamap query.

This is very costly in terms of performance. We also need to support both
lazy and non-lazy load for all datamaps. This can help reduce the actual
data load time to the main table, and the user can trigger the lazy load for
the datamaps on that table whenever they want.

Basically, we need not overwrite the existing data every time we load to a
datamap. Instead, we should add the new data in new segments, as the main
table does. This should give better performance.
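
To make the difference concrete, here is a minimal sketch of the two rebuild
strategies. It is only an illustration, not CarbonData code: the class
DataMapState and the functions full_rebuild and incremental_rebuild are
hypothetical names, and segments are modelled as plain integer ids. The
actual bookkeeping is described in the design document.

```python
from dataclasses import dataclass, field


@dataclass
class DataMapState:
    """Hypothetical bookkeeping for one datamap (not a CarbonData class)."""
    # Main-table segment ids whose data is already present in the datamap.
    loaded_main_segments: set = field(default_factory=set)
    # One entry per datamap segment: the main-table segment ids it was built from.
    datamap_segments: list = field(default_factory=list)


def full_rebuild(state, main_segments):
    """Overwrite strategy: scan every main-table segment on each rebuild."""
    scanned = sorted(main_segments)
    state.datamap_segments = [scanned]          # existing segment is replaced
    state.loaded_main_segments = set(scanned)
    return scanned


def incremental_rebuild(state, main_segments):
    """Incremental strategy: scan only segments not yet loaded to the datamap."""
    new_segments = sorted(set(main_segments) - state.loaded_main_segments)
    if new_segments:
        state.datamap_segments.append(new_segments)  # new datamap segment
        state.loaded_main_segments.update(new_segments)
    return new_segments


if __name__ == "__main__":
    state = DataMapState()
    print(incremental_rebuild(state, [0, 1]))     # first rebuild scans both segments
    print(incremental_rebuild(state, [0, 1, 2]))  # later rebuild scans only segment 2
```

With the overwrite strategy the work per rebuild grows with the size of the
main table; with the incremental one it grows only with the data added since
the previous rebuild.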

Please give your inputs, or get back to me for any clarifications.

A JIRA has been created to track this: https://issues.apache.org/jira/browse/CARBONDATA-3296

The design document is available at https://docs.google.com/document/d/13XgEBUIqaAKdrlQftebr5BNOplL3u9qxuFe-IJUB3cM/edit#heading=h.h311u6t3pve9

Regards,
Akash