Re: [Discussion] CarbonOutputFormat Implementation

Posted by Jacky Li on Jul 05, 2017; 3:44am
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Discussion-CarbonOutputFormat-Implementation-tp17113p17314.html

+1.

For carbondata files, I think there should be at least two OutputFormats:
1) FileOutputFormat, which does no sorting and only writes the carbondata file. This will be used for the GLOBAL_SORT option.
2) TableOutputFormat, which sorts according to the SORT_SCOPE option and uses Single Pass to load.
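
To make the split concrete, here is a minimal sketch of the two writers. It is not tied to Hadoop or to any existing CarbonData class: `RowWriter`, `FileRowWriter`, `TableRowWriter` and the two-value `SortScope` enum are all hypothetical names, and a plain in-memory list stands in for the carbondata file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch only: none of these types exist in CarbonData or Hadoop.
final class SortScopeSketch {
    // Assumed subset of SORT_SCOPE values, for illustration.
    enum SortScope { NO_SORT, LOCAL_SORT }

    // Stand-in for a record writer; a real one would write a carbondata file.
    interface RowWriter {
        void write(String row);
        List<String> close(); // returns the rows in the order they were written
    }

    // FileOutputFormat analogue: writes rows as they arrive, no sorting.
    static final class FileRowWriter implements RowWriter {
        private final List<String> out = new ArrayList<>();
        public void write(String row) { out.add(row); }
        public List<String> close() { return out; }
    }

    // TableOutputFormat analogue: buffers rows, sorts them if the scope
    // asks for it, then hands them to the non-sorting file writer.
    static final class TableRowWriter implements RowWriter {
        private final SortScope scope;
        private final RowWriter delegate;
        private final List<String> buffer = new ArrayList<>();

        TableRowWriter(SortScope scope, RowWriter delegate) {
            this.scope = scope;
            this.delegate = delegate;
        }

        public void write(String row) { buffer.add(row); }

        public List<String> close() {
            if (scope == SortScope.LOCAL_SORT) {
                Collections.sort(buffer);
            }
            for (String row : buffer) {
                delegate.write(row);
            }
            return delegate.close();
        }
    }

    // Pushes all rows through a writer and returns what was written.
    static List<String> load(RowWriter writer, List<String> rows) {
        for (String row : rows) {
            writer.write(row);
        }
        return writer.close();
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("c", "a", "b");
        System.out.println(load(new FileRowWriter(), rows));
        System.out.println(load(
            new TableRowWriter(SortScope.LOCAL_SORT, new FileRowWriter()), rows));
    }
}
```

In this sketch the table-level writer wraps the file-level one, so sorting stays an optional layer above the file writing. That is only one possible hierarchy.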

I also think the dictionary should be a separate OutputFormat,
so that users can combine the dictionary output format with the carbondata file output format.
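
As a rough illustration of that combination, the sketch below chains a dictionary output with a data file output. All class names (`DictionaryOutput`, `DataFileOutput`, `CombinedOutput`) are hypothetical, and the in-memory map and list are assumptions standing in for the real dictionary and carbondata files.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only: these are not CarbonData classes.
final class DictionaryComboSketch {
    // Dictionary output analogue: assigns integer keys in first-seen order.
    static final class DictionaryOutput {
        private final Map<String, Integer> keys = new LinkedHashMap<>();

        int encode(String value) {
            Integer key = keys.get(value);
            if (key == null) {
                key = keys.size() + 1; // assumed key scheme: 1, 2, 3, ...
                keys.put(value, key);
            }
            return key;
        }

        Map<String, Integer> dictionary() { return keys; }
    }

    // Data file output analogue: stores the already-encoded values.
    static final class DataFileOutput {
        private final List<Integer> encoded = new ArrayList<>();
        void write(int key) { encoded.add(key); }
        List<Integer> rows() { return encoded; }
    }

    // Combination: every value goes through the dictionary first,
    // then the resulting key is written to the data file output.
    static final class CombinedOutput {
        final DictionaryOutput dict = new DictionaryOutput();
        final DataFileOutput data = new DataFileOutput();

        void write(String value) {
            data.write(dict.encode(value));
        }
    }

    public static void main(String[] args) {
        CombinedOutput out = new CombinedOutput();
        for (String country : new String[] {"cn", "us", "cn"}) {
            out.write(country);
        }
        System.out.println(out.dict.dictionary());
        System.out.println(out.data.rows());
    }
}
```

Keeping the two outputs as separate classes means a user who does not need a dictionary can use the data file output alone.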

I suggest we first check the usage scenarios and then decide on the class hierarchy for this feature.

Regards,
Jacky

> On Jul 4, 2017, at 8:37 PM, Venkata Gollamudi <[hidden email]> wrote:
>
> +1
> OutputFormat should be based on single pass and use job configurations
> similar to those of CarbonInputFormat.
> Please share an initial design and code skeleton for review before
> proceeding with the implementation.
>
> On Tue, Jul 4, 2017 at 4:30 PM, Kumar Vishal <[hidden email]>
> wrote:
>
>> +1
>> It's a long pending task.
>> -Regards
>> Kumar Vishal
>>
>> Sent from my iPhone
>>
>>> On 04-Jul-2017, at 16:26, Erlu Chen <[hidden email]> wrote:
>>>
>>> Thanks very much.
>>>
>>> Once you have raised a PR, we can start the review.
>>>
>>>
>>> Regards.
>>> Chenerlu.
>>>
>>>
>>>
>>> --
>>> View this message in context: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Discussion-CarbonOutputFormat-Implementation-tp17113p17239.html
>>> Sent from the Apache CarbonData Dev Mailing List archive mailing list archive at Nabble.com.
>>