
Re: [Feature Proposal] Proposal for offline and DDL local dictionary support

Posted by Jacky Li on Nov 06, 2018; 8:14am
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Feature-Proposal-Proposal-for-offline-and-DDL-local-dictionary-support-tp67620p67871.html

+1
Yes, I think the SDK should also provide local dictionary support.

Regards,
Jacky

> On Nov 5, 2018, at 2:14 PM, manish gupta <[hidden email]> wrote:
>
> Hi Dev
>
> Currently we support the LOCAL DICTIONARY feature during the data load
> operation. The feature is very helpful because it reduces the store
> size, which reduces IO and thereby improves query performance.
> *This proposal is to extend the LOCAL DICTIONARY feature and provide separate
> DDL and offline support for it. This will make the feature
> more flexible to use. The reasons for proposing this are*:
>
> 1. DDL support can enable stores without a local dictionary to add this
> feature for already loaded data. This can help customers
> leverage the LOCAL DICTIONARY feature for data
> that was written in carbondata format without a local dictionary.
> 2. We know that when local dictionary is enabled there is a small
> degradation in data load performance. So there can be applications/customers
> who want to fine-tune the loaded data during off-peak time. This feature can be
> helpful for those kinds of scenarios.
> 3. Offline support is proposed for SDK-like features, where we do not
> have the Spark driver-executor model and there may be only a single thread used
> for loading data. For this scenario we can provide offline support,
> thereby not impacting the existing data load performance.
>
> Please let me know your suggestions for this proposal. If most
> community members feel the idea is good and will make this
> feature more flexible to use, I can come up with a design and discuss it
> further on this platform.
>
> Regards
> Manish Gupta
>