Carbondata vs Parquet

Rana Faisal Munir
Hi all,


I have a question about CarbonData.


In Parquet, if we do projection and selection together, the minimum
read unit is a page rather than a row group. However, if we only
filter, the minimum read unit is a row group rather than a page.

I assume CarbonData behaves similarly. But my question is how
CarbonData handles reading a single record: does it read a whole page
or blocklet, or does it read just that one record from disk?


Thank you


Regards

Faisal


Re: Carbondata vs Parquet

ravipesala
Hi,

In CarbonData, the minimum read unit from disk is a column chunk inside
a blocklet (the equivalent of a row group).
This is the same for filter and projection. The maximum read unit is a
group of column chunks that are stored contiguously on disk.
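
To make the read-unit idea concrete, here is a rough sketch of how a reader might turn the needed column chunks of one blocklet into disk reads, merging chunks that sit next to each other. This is only an illustration, not CarbonData code: the ColumnChunk type, the plan_reads function, and the offsets are all made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnChunk:
    """Hypothetical descriptor of one column chunk inside a blocklet."""
    column: str
    offset: int  # byte offset of the chunk in the file
    length: int  # size of the chunk in bytes


def plan_reads(chunks, wanted):
    """Return (offset, length) ranges covering the wanted column chunks.

    Each wanted chunk is read whole (the minimum read unit); chunks that
    are contiguous on disk are merged into a single larger read.
    """
    selected = sorted(
        (c for c in chunks if c.column in wanted), key=lambda c: c.offset
    )
    ranges = []
    for c in selected:
        if ranges and ranges[-1][0] + ranges[-1][1] == c.offset:
            # This chunk starts where the previous range ends: extend it.
            start, length = ranges[-1]
            ranges[-1] = (start, length + c.length)
        else:
            ranges.append((c.offset, c.length))
    return ranges


# Made-up layout of one blocklet with four column chunks.
blocklet = [
    ColumnChunk("id", 0, 100),
    ColumnChunk("name", 100, 250),
    ColumnChunk("city", 350, 80),
    ColumnChunk("amount", 430, 120),
]

print(plan_reads(blocklet, {"id", "name", "amount"}))
# [(0, 350), (430, 120)]
```

In this sketch, fetching a single record that needs the id, name and amount columns still reads those three chunks in full: id and name are merged into one read because they are adjacent, and amount is read separately.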

Regards,
Ravindra.
(In reply to the message from Rana Faisal of Tue, 2 May 2017, 5:57 PM.)