Carbondata vs Parquet

Rana Faisal Munir
Hi all,


I have a question about CarbonData.

In Parquet, if we do projection and selection together, the minimum
read unit is a page rather than a row group. However, if we apply only
a filter, the minimum read unit is a row group rather than a page.

I assume CarbonData behaves similarly. My question is how CarbonData
handles reading a single record: does it read a whole page or blocklet
to get that record, or does it read just the one record from disk?


Thank you


Regards

Faisal



Re: Carbondata vs Parquet

ravipesala
Hi,

In CarbonData, the minimum read unit from disk is a column chunk inside
a blocklet (row group).
This is the same for filter and projection. The maximum read unit is a
group of column chunks that are stored contiguously on disk.
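
To make the read-unit rule concrete, here is a minimal toy sketch in
Python. It is not CarbonData's reader code: the ColumnChunk class, the
plan_reads function, and all offsets and sizes are hypothetical and only
model the behaviour described above. Each needed column costs at least
one whole column chunk, and chunks that sit next to each other on disk
are merged into a single larger read.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnChunk:
    """Hypothetical description of one column chunk inside a blocklet."""
    column: str
    offset: int  # byte offset of the chunk in the file (made-up value)
    length: int  # chunk size in bytes (made-up value)


def plan_reads(blocklet, needed_columns):
    """Return (offset, length) disk reads for the needed columns.

    The smallest read is one whole column chunk. Chunks that are
    contiguous on disk are merged into one larger read.
    """
    wanted = sorted(
        (c for c in blocklet if c.column in needed_columns),
        key=lambda c: c.offset,
    )
    reads = []
    for chunk in wanted:
        if reads and reads[-1][0] + reads[-1][1] == chunk.offset:
            # This chunk starts exactly where the previous read ends: extend it.
            start, length = reads[-1]
            reads[-1] = (start, length + chunk.length)
        else:
            reads.append((chunk.offset, chunk.length))
    return reads


# A toy blocklet with four column chunks laid out back to back.
blocklet = [
    ColumnChunk("id", 0, 100),
    ColumnChunk("name", 100, 250),
    ColumnChunk("city", 350, 80),
    ColumnChunk("amount", 430, 120),
]

# "id" and "name" are adjacent, so they become one read; "amount" is separate.
print(plan_reads(blocklet, {"id", "name", "amount"}))

# Fetching a single record's "city" value still reads the whole "city" chunk.
print(plan_reads(blocklet, {"city"}))
```

In this model, reading one record is not cheaper than reading any other
subset of rows from the same blocklet: the cost is set by which column
chunks are touched, not by how many rows are wanted.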

Regards,
Ravindra.
(In reply to Rana Faisal's message of Tue, 2 May 2017, 5:57 PM.)