Introducing V3 format.

Introducing V3 format.

ravipesala
Problems in the current format:
1. IO reads are slow because scanning a column requires multiple seeks in the file to read its column blocklets. The current blocklet size is 120,000 rows, so the reader must read from the file multiple times to scan the data for one column. Alternatively we could increase the blocklet size, but filter queries then suffer because they get a bigger blocklet to filter.
2. Decompression is slow in the current format. We use an inverted index for faster filter queries and compress it with NumberCompressor using bit-wise packing; this is slow, so we should avoid NumberCompressor. One alternative is to keep the blocklet size within 32,000 rows so the inverted index can be written as shorts (a minimal sketch follows below), but then IO reads suffer a lot.
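
For illustration: the 32,000 limit matters because every inverted-index entry then fits in a signed 16-bit short, so a flat short array can replace the bit-packing pass. A minimal sketch with hypothetical names (this is not CarbonData's actual writer):

import java.nio.ByteBuffer;

// Sketch: when a page holds at most 32,000 rows, each rowId in the inverted
// index fits in a signed short, so we can write fixed-width 2-byte entries
// and skip the bit-wise packing (NumberCompressor) step entirely.
public final class ShortInvertedIndexSketch {

  static byte[] encode(int[] rowIds) {
    ByteBuffer buffer = ByteBuffer.allocate(rowIds.length * Short.BYTES);
    for (int rowId : rowIds) {
      if (rowId < 0 || rowId > Short.MAX_VALUE) {
        throw new IllegalArgumentException("rowId out of short range: " + rowId);
      }
      buffer.putShort((short) rowId); // fixed-width write, no packing pass
    }
    return buffer.array();
  }

  public static void main(String[] args) {
    int[] invertedIndex = {5, 0, 3, 1, 4, 2}; // sorted position -> original row
    System.out.println("bytes written: " + encode(invertedIndex).length); // 12
  }
}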

To overcome these two issues we are introducing the new V3 format.
Here each blocklet has multiple pages of 32,000 rows each, and the number of pages per blocklet is configurable. Since each page stays within the short limit, there is no need to compress the inverted index.
We also maintain min/max values for each page to further prune filter queries.
The reader reads a blocklet with all of its pages at once and keeps it in off-heap memory.
During filtering, it first checks a page's min/max range and goes on to decompress the page for further filtering only if the range can match. A sketch of this flow follows below.
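
To make the read and filter flow concrete, here is a minimal sketch of the page-level pruning described above. All class and method names are illustrative, not CarbonData's actual API:

// Illustrative sketch of page-level min/max pruning for a numeric column.
public final class PagePruningSketch {

  static final class Page {
    final byte[] compressed; // page data, still compressed in (off-heap) memory
    final long min;          // per-page minimum, kept in blocklet metadata
    final long max;          // per-page maximum

    Page(byte[] compressed, long min, long max) {
      this.compressed = compressed;
      this.min = min;
      this.max = max;
    }
  }

  // Decompress a page only when the filter range overlaps its min/max.
  static int countMatches(Page[] blockletPages, long low, long high) {
    int matches = 0;
    for (Page page : blockletPages) {
      if (page.max < low || page.min > high) {
        continue; // pruned: no row in this page can satisfy the filter
      }
      for (long value : decompress(page.compressed)) { // pay CPU cost only here
        if (value >= low && value <= high) {
          matches++;
        }
      }
    }
    return matches;
  }

  // Placeholder for the real decompression step.
  static long[] decompress(byte[] data) {
    return new long[0];
  }

  public static void main(String[] args) {
    Page[] pages = {
        new Page(new byte[0], 10, 20),
        new Page(new byte[0], 500, 900) // pruned for filter range [0, 100]
    };
    System.out.println(countMatches(pages, 0, 100));
  }
}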

Please find the attached V3 format thrift file.

--
Thanks & Regards,
Ravi

Re: Introducing V3 format.

ravipesala
Please find the thrift file at the location below.
https://drive.google.com/open?id=0B4TWTVbFSTnqZEdDRHRncVItQ242b1NqSTU2b2g4dkhkVDRj

--
Thanks & Regards,
Ravi

Re: Introducing V3 format.

kumarvishal09
+1
This will relieve the IO bottleneck. Page-level min/max will improve block pruning, and fewer false-positive blocks will improve filter query performance. Separating the decompression of data from the reader layer will improve overall query performance.

-Regards
Kumar Vishal

Re: Introducing V3 format.

Jean-Baptiste Onofré
Agree.

+1

Regards
JB

Re: Introducing V3 format.

Liang Chen
Hi Ravi

Thank you for bringing this discussion to the mailing list. I have one question: how do we ensure backward compatibility after introducing the new format?

Regards
Liang

Re: Introducing V3 format.

ravipesala
Hi Liang,

Backward compatibility is already handled in version 1.0.0: when reading an old store, the V1/V2 format readers are used to read its data. So backward compatibility still works even though we jump to the V3 format.
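
A minimal sketch of what such version dispatch can look like (hypothetical names, not CarbonData's actual reader classes):

// Hypothetical sketch of version-based reader selection; it only
// illustrates the dispatch idea, not the real read paths.
public final class ReaderSelectionSketch {

  enum FormatVersion { V1, V2, V3 }

  interface BlockletReader {
    void read(); // single abstract method, so the lambdas below are valid
  }

  static BlockletReader readerFor(FormatVersion version) {
    switch (version) {
      case V1: return () -> System.out.println("legacy V1 read path");
      case V2: return () -> System.out.println("legacy V2 read path");
      case V3: return () -> System.out.println("page-based V3 read path");
      default: throw new IllegalArgumentException("Unknown version: " + version);
    }
  }

  public static void main(String[] args) {
    readerFor(FormatVersion.V1).read(); // old stores keep working
    readerFor(FormatVersion.V3).read(); // new stores use the V3 path
  }
}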

Regards,
Ravindra.

Re: Introducing V3 format.

Liang Chen
Hi

Thanks for the detailed explanation.
+1 for introducing the new format to improve performance further.

Regards
Liang

Re: Introducing V3 format.

bill.zhou
Hi Ravindra,

As you describe it, V3 will benefit the IO scenario (i.e., queries with filters). What about the CPU scenario (no filter, full scan with aggregation)? Is there any advantage there?

Regards
Bill

Re: Introducing V3 format.

kumarvishal09
Hi Bill,
In the case of a non-filter (full scan) query, the V3 format lets Carbon read more data in a single IO, because the number of pages per blocklet can be increased. This reduces IO time, since fewer IO operations are needed.
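
For example, the pages-per-blocklet trade-off is exposed through configuration. The property name below is an assumption for illustration; please verify it against your CarbonData version's configuration docs:

import org.apache.carbondata.core.util.CarbonProperties;

public final class TuneBlockletGroupSize {
  public static void main(String[] args) {
    // Assumed property name, shown for illustration only; check your
    // CarbonData version's docs. A larger blocklet group packs more
    // 32,000-row pages per blocklet, so a full scan needs fewer IOs.
    CarbonProperties.getInstance()
        .addProperty("carbon.blockletgroup.size.in.mb", "128");
  }
}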

-Regards
Kumar Vishal
