http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Introducing-V3-format-tp7609p8142.html
For the full scan case also, reading the whole blocklet (all pages) at once will reduce the IO time, as there will be fewer IO operations.
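As a rough illustration only (plain NIO below, not the actual CarbonData reader code): one contiguous read pulls in the whole column chunk of a blocklet, covering all of its pages, instead of issuing one seek per small blocklet.

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class BlockletReadSketch {
  // offset/length of the column chunk would come from the file footer metadata
  static ByteBuffer readColumnChunk(Path file, long offset, int length) throws IOException {
    try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
      ByteBuffer buffer = ByteBuffer.allocateDirect(length);   // kept off heap
      // one seek, one contiguous read for all pages (loop only guards short reads)
      while (buffer.hasRemaining()) {
        if (channel.read(buffer, offset + buffer.position()) < 0) break;
      }
      buffer.flip();
      return buffer;
    }
  }
}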
> Hi Ravindra,
>
> As per your description, V3 will benefit the IO scenario (i.e. more
> filters). What about the CPU scenario (no filter, full scan with
> aggregation): is there any advantage there?
>
> Regards
> Bill
>
> ravipesala wrote
> > Problems in the current format:
> > 1. IO read is slower since it needs multiple seeks on the file to read
> > the column blocklets. The current blocklet size is 120,000 rows, so it
> > needs to read from the file multiple times to scan the data of a column.
> > Alternatively we could increase the blocklet size, but then filter
> > queries suffer because a much bigger blocklet has to be filtered.
> > 2. Decompression is slower in the current format. We use an inverted
> > index for faster filter queries and NumberCompressor to compress that
> > inverted index with bit-wise packing. This turns out to be slow, so we
> > should avoid NumberCompressor. One alternative is to keep the blocklet
> > size within 32,000 rows so that the inverted index can be written as
> > shorts, but then IO read suffers a lot.
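A side note for illustration (a sketch, not the actual CarbonData writer code): because a page holds at most 32,000 rows and Short.MAX_VALUE is 32,767, every row id fits in a Java short, so the inverted index can be stored as a plain short[] with no bit-packing step.

// Sketch only: row ids of a page (<= 32,000 entries) written as shorts,
// avoiding the NumberCompressor bit-packing pass entirely.
static short[] toShortInvertedIndex(int[] rowIds) {
  short[] invertedIndex = new short[rowIds.length];
  for (int i = 0; i < rowIds.length; i++) {
    invertedIndex[i] = (short) rowIds[i];   // safe: row id < 32,767
  }
  return invertedIndex;
}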
> >
> > To overcome the above 2 issues we are introducing the new V3 format.
> > Here each blocklet has multiple pages of 32,000 rows each, and the
> > number of pages per blocklet is configurable. Since a page stays within
> > the short limit, there is no need to compress the inverted index here.
> > We also maintain max/min values for each page to further prune filter
> > queries.
> > The blocklet is read with all its pages at once and kept in offheap
> > memory.
> > During filtering we first check the max/min range, and only if it is
> > valid do we decompress the page to filter further.
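To make that pruning flow concrete, here is a minimal sketch (class and method names are illustrative, not the actual CarbonData code): the filter value is checked against each page's max/min first, and only pages that can contain a match are decompressed and scanned.

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PagePruningSketch {

  // pageMin/pageMax stand for the per-page max/min kept in the blocklet index
  static List<int[]> scanBlocklet(int[] pageMin, int[] pageMax,
                                  byte[][] compressedPages, int filterValue) {
    List<int[]> matches = new ArrayList<>();
    for (int p = 0; p < compressedPages.length; p++) {
      // 1. cheap range check: skip the page if the value cannot be in it
      if (filterValue < pageMin[p] || filterValue > pageMax[p]) {
        continue;                        // page pruned, no decompression cost
      }
      // 2. only now pay for decompressing and scanning this page
      int[] page = decompress(compressedPages[p]);
      matches.add(Arrays.stream(page).filter(v -> v == filterValue).toArray());
    }
    return matches;
  }

  // stand-in for the real page decompression (e.g. snappy in practice)
  static int[] decompress(byte[] compressed) {
    int[] values = new int[compressed.length / 4];
    java.nio.ByteBuffer.wrap(compressed).asIntBuffer().get(values);
    return values;
  }
}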
> >
> > Please find the attached V3 format thrift file.
> >
> > --
> > Thanks & Regards,
> > Ravi