http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Discussion-regrading-design-of-data-load-after-kettle-removal-tp1672p1730.html
1. Yes, each step calls its child step to execute and applies its own logic
to the returned iterator, just like Spark SQL. For CarbonOutputFormat it will use
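The child-iterator chaining described above can be sketched in plain Java as below. This is only an illustration of the pulling model (each step lazily wraps its child's iterator); the class names are hypothetical and not CarbonData's actual API.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: each step wraps the iterator returned by its
// child, mirroring how Spark SQL operators pull rows from children.
interface ProcessorStep {
    Iterator<Object[]> execute();
}

// Leaf step: produces the raw input rows.
class InputStep implements ProcessorStep {
    private final List<Object[]> rows;
    InputStep(List<Object[]> rows) { this.rows = rows; }
    public Iterator<Object[]> execute() { return rows.iterator(); }
}

// Intermediate step: applies its logic lazily to the child's iterator.
class UpperCaseStep implements ProcessorStep {
    private final ProcessorStep child;
    UpperCaseStep(ProcessorStep child) { this.child = child; }
    public Iterator<Object[]> execute() {
        Iterator<Object[]> it = child.execute();
        return new Iterator<Object[]>() {
            public boolean hasNext() { return it.hasNext(); }
            public Object[] next() {
                Object[] row = it.next();
                row[0] = row[0].toString().toUpperCase();
                return row;
            }
        };
    }
}

public class StepChainDemo {
    public static void main(String[] args) {
        ProcessorStep chain = new UpperCaseStep(
            new InputStep(Arrays.asList(
                new Object[]{"a", 1}, new Object[]{"b", 2})));
        Iterator<Object[]> out = chain.execute();
        while (out.hasNext()) {
            System.out.println(Arrays.toString(out.next()));
        }
    }
}
```

Because each wrapper only transforms rows as they are pulled, no step materializes the whole data set.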
2. Yes, this interface relies on processing row by row. But we can also
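One way batch encoding could be layered on later, without changing the row-by-row interface, is to adapt the row iterator into a batch iterator. A minimal sketch (the names here are illustrative, not a proposed API):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Sketch: adapt a row-by-row iterator into fixed-size batches, so a
// batch encoder could consume the same underlying step output.
class BatchIterator implements Iterator<List<Object[]>> {
    private final Iterator<Object[]> rows;
    private final int batchSize;

    BatchIterator(Iterator<Object[]> rows, int batchSize) {
        this.rows = rows;
        this.batchSize = batchSize;
    }

    public boolean hasNext() { return rows.hasNext(); }

    public List<Object[]> next() {
        // Pull up to batchSize rows from the underlying row iterator.
        List<Object[]> batch = new ArrayList<>(batchSize);
        while (rows.hasNext() && batch.size() < batchSize) {
            batch.add(rows.next());
        }
        return batch;
    }
}

public class BatchDemo {
    public static void main(String[] args) {
        List<Object[]> rows = Arrays.asList(
            new Object[]{1}, new Object[]{2}, new Object[]{3});
        BatchIterator batches = new BatchIterator(rows.iterator(), 2);
        while (batches.hasNext()) {
            System.out.println(batches.next().size());
        }
    }
}
```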
> Hi Ravindra,
>
> I have following questions:
>
> 1. How does the DataLoadProcessorStep interface work? For each step, will it
> call its child step to execute and apply its logic to the returned iterator of
> the child? And how does it map to the Hadoop OutputFormat interface?
>
> 2. This step interface relies on an iterator to do the encoding row by row;
> will it be convenient to add batch encoder support now, or later?
>
> 3. For the dictionary part, besides the generator I think it is better to also
> consider the interface for reading the dictionary while querying. Are
> you planning to use the same interface? If so, it is not just a Generator.
> If the dictionary interface is well designed, other developers can also add
> new dictionary types. For example:
> - assigning dictionary values based on usage frequency, for better
> compression, similar to Huffman encoding
> - an order-preserving dictionary, which can apply range filters on dictionary
> values directly
>
> Regards,
> Jacky
>
>
>
> --
> View this message in context:
> http://apache-carbondata-mailing-list-archive.1130556.n5.nabble.com/Discussion-regrading-design-of-data-load-after-kettle-removal-tp1672p1726.html
> Sent from the Apache CarbonData mailing list archive at Nabble.com.
>
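Jacky's order-preserving dictionary point can be sketched concretely: if codes are assigned in sorted key order, a range filter on values translates directly into a range filter on the integer codes. The class and variable names below are illustrative only, not a proposed CarbonData API.

```java
import java.util.Map;
import java.util.TreeMap;

// Sketch of an order-preserving dictionary: codes follow the sort
// order of the keys, so value comparisons can be evaluated on the
// integer codes without decoding.
public class OrderPreservingDict {
    public static void main(String[] args) {
        String[] values = {"cherry", "apple", "banana", "date"};

        // TreeMap keeps keys sorted, so assigning codes in entry order
        // yields an order-preserving encoding.
        TreeMap<String, Integer> dict = new TreeMap<>();
        for (String v : values) dict.put(v, 0);
        int code = 0;
        for (Map.Entry<String, Integer> e : dict.entrySet()) {
            e.setValue(code++);
        }

        // The range filter (value < "cherry") becomes (code < bound).
        int bound = dict.get("cherry");
        for (Map.Entry<String, Integer> e : dict.entrySet()) {
            if (e.getValue() < bound) System.out.println(e.getKey());
        }
    }
}
```

A frequency-based dictionary, by contrast, would give the shortest codes to the most common values, trading order preservation for better compression.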