
Re: Propose feature change in CarbonData 2.0

Posted by kumarvishal09 on Dec 06, 2019; 8:00am
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Propose-feature-change-in-CarbonData-2-0-tp87540p87829.html

Please find my comments inline.
Bucketing +1
Carbon custom partition +1
BATCH_SORT +1
old preaggregate and time series datamap implementation +1
STORED BY +1

Global dictionary:
Data loading with a global dictionary is slow, but aggregation, filtering, and
compression are better than with any other encoding, whether storing raw values
or using a local dictionary. So it might be a useful feature.
Vote: 0

Page Level Inverted Index: -1
If the user knows the column on which he/she is going to use an IN filter, it is
very useful.

Lucene datamap: Performance is bad because of some code/design issues, which
can be fixed. -1

And there are some internal refactorings we can do:
1. Unify dimension and measure:
     It may improve IO performance, but the effort is high. 0

3. Spark integration refactoring based on the Spark extension interface +1

4. Store optimization PR2729 +1

-Regards
Kumar Vishal

On Thu, Dec 5, 2019 at 3:28 PM Jacky Li <[hidden email]> wrote:

> Hi,
>
> Thanks for all your input, the voting summary is as below:
>
> 1. Global dictionary
> No -1
>
> 2. Bucket
> Two -1
>
> 3. Carbon custom partition
> No -1
>
> 4. BATCH_SORT
> No -1
>
> 5. Page level inverse index
> One -1
>
> 5. old preaggregate and time series datamap implementation
> No -1
>
> 6. Lucene datamap
> Five -1
>
> 7. STORED BY
> No -1
>
> So, I have created an umbrella JIRA (CARBONDATA-3603) for these items.
> Please feel free to respond if anyone is interested in working on them.
>
> Regards,
> Jacky
>
> --
> Sent from:
> http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/
>
kumar vishal