[Discussion] CarbonOutputFormat Implementation

Divya Gupta
CarbonData has implemented CarbonInputFormat, which enables applications
using Hive, Presto and other similar tools to read data from CarbonData.

Similarly, there should be an implementation of CarbonOutputFormat. This
will enable Hive, Presto or similar applications that use CarbonData as a
data source to write and load data to CarbonData files.

Regards
Divya Gupta

Re: [Discussion] CarbonOutputFormat Implementation

Erlu Chen
Hi Divya

Thanks for your suggestion.

CarbonData may support it in the near future.

If you would like to contribute this feature, I think it will benefit the community a lot.


Regards.
Chenerlu.

Re: [Discussion] CarbonOutputFormat Implementation

Divya Gupta
Thanks for the quick reply, Chenerlu.

I would surely like to contribute this feature and will start working
towards CARBONDATA-729.

Regards
Divya Gupta
Project Lead


*Knoldus Software LLP <http://www.knoldus.com/>*
India <http://www.knoldus.in/> - US <http://www.knoldus.com/> - Canada
<http://www.knoldus.ca/>
<http://www.knoldus.com/>
Blog <http://blog.knoldus.com/> | Twitter <https://twitter.com/knolspeak> |
FB <https://www.facebook.com/KnoldusSoftware> | LinkedIn
<http://www.linkedin.com/company/knoldus-software-llp->


Re: [Discussion] CarbonOutputFormat Implementation

Erlu Chen
Thanks very much.

Once you have raised a PR, we can start the review.


Regards.
Chenerlu.

Re: [Discussion] CarbonOutputFormat Implementation

kumarvishal09
+1
It's a long-pending task.
-Regards
Kumar Vishal


Re: [Discussion] CarbonOutputFormat Implementation

Venkata Gollamudi
+1
The OutputFormat should be based on single-pass loading and use job
configurations similar to those of CarbonInputFormat.
Please share an initial design and code skeleton for review before
proceeding with the implementation.
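
To make the request for a skeleton concrete, here is a minimal sketch of an output format that is driven by job configuration. It is only an illustration, not the proposed design. `SketchJobConf`, `RowWriter`, `SketchOutputFormat` and both configuration keys are hypothetical names; they mirror the shape of Hadoop's `OutputFormat` (`checkOutputSpecs`, `getRecordWriter`) and `RecordWriter` (`write`, `close`) in plain Java so that the example runs without Hadoop on the classpath.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a Hadoop job configuration (plain key/value pairs).
final class SketchJobConf {
    private final Map<String, String> values = new HashMap<>();

    SketchJobConf set(String key, String value) {
        values.put(key, value);
        return this;
    }

    String getRequired(String key) {
        String value = values.get(key);
        if (value == null) {
            throw new IllegalStateException("Missing job configuration: " + key);
        }
        return value;
    }
}

// Stand-in for the writer role played by Hadoop's RecordWriter.
interface RowWriter extends AutoCloseable {
    void write(String[] row);

    @Override
    void close();
}

// Hypothetical output format that takes all of its settings from the job
// configuration, the same way an input format would.
final class SketchOutputFormat {
    // Key names are illustrative only; they are not CarbonData's real keys.
    static final String TABLE_PATH = "sketch.carbon.table.path";
    static final String COLUMNS = "sketch.carbon.columns";

    // Rows "written" so far, kept in memory so the sketch is observable.
    final List<String> written = new ArrayList<>();

    // Fails early if the job is misconfigured, before any row is written.
    void checkOutputSpecs(SketchJobConf conf) {
        conf.getRequired(TABLE_PATH);
        conf.getRequired(COLUMNS);
    }

    RowWriter getRecordWriter(SketchJobConf conf) {
        String tablePath = conf.getRequired(TABLE_PATH);
        int columnCount = conf.getRequired(COLUMNS).split(",").length;
        return new RowWriter() {
            @Override
            public void write(String[] row) {
                if (row.length != columnCount) {
                    throw new IllegalArgumentException("Expected " + columnCount + " columns");
                }
                // Each row is handled exactly once as it arrives.
                written.add(tablePath + ":" + String.join("|", row));
            }

            @Override
            public void close() {
                // A real writer would flush buffers and finalize the file here.
            }
        };
    }
}
```

A caller would set the two keys on a `SketchJobConf`, call `checkOutputSpecs`, and then write rows through the `RowWriter` inside a try-with-resources block. Keeping every setting in the job configuration is what lets the read side and the write side be configured in the same way.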


Re: [Discussion] CarbonOutputFormat Implementation

Jacky Li
+1.

For CarbonData files, I think there should be at least two OutputFormats:
1) FileOutputFormat, which does no sorting and only writes to the carbondata file. This will be used with the GLOBAL_SORT option.
2) TableOutputFormat, which sorts according to the SORT_SCOPE option and uses Single Pass to load.

I also think the dictionary should be a separate OutputFormat,
so that users can combine the dictionary output format with the carbondata file output format.

I suggest first checking the usage scenarios and then deciding on the class hierarchy for this feature.
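
One possible shape for such a hierarchy is sketched below, purely as a discussion aid. All class names (`RowSink`, `FileSink`, `TableSink`, `DictionarySink`, `CombinedSink`) are hypothetical, the `SortScope` enum is a simplified stand-in for the SORT_SCOPE option, and sorting on the first column is an arbitrary choice. Rows are plain string arrays kept in memory, so nothing here touches real carbondata files.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Common contract shared by every sink in this sketch (hypothetical name).
interface RowSink {
    void write(String[] row);
    void close();
}

// Role of the proposed FileOutputFormat: no sorting, rows are stored as they arrive.
final class FileSink implements RowSink {
    final List<String[]> rows = new ArrayList<>(); // stands in for a carbondata file

    @Override
    public void write(String[] row) { rows.add(row); }

    @Override
    public void close() { /* a real sink would flush and finalize the file */ }
}

// Role of the proposed TableOutputFormat: sorts according to a sort scope,
// then hands the rows to a FileSink.
final class TableSink implements RowSink {
    enum SortScope { NO_SORT, LOCAL_SORT } // simplified stand-in for SORT_SCOPE

    private final SortScope scope;
    private final FileSink target;
    private final List<String[]> buffer = new ArrayList<>();

    TableSink(SortScope scope, FileSink target) {
        this.scope = scope;
        this.target = target;
    }

    @Override
    public void write(String[] row) { buffer.add(row); }

    @Override
    public void close() {
        if (scope == SortScope.LOCAL_SORT) {
            // Sorting on the first column is an arbitrary choice for the sketch.
            buffer.sort(Comparator.comparing((String[] row) -> row[0]));
        }
        for (String[] row : buffer) {
            target.write(row);
        }
        target.close();
    }
}

// Role of the separate dictionary output: assigns a surrogate key to each
// distinct value of one column, in order of first appearance.
final class DictionarySink implements RowSink {
    final Map<String, Integer> dictionary = new LinkedHashMap<>();
    private final int column;

    DictionarySink(int column) { this.column = column; }

    @Override
    public void write(String[] row) {
        dictionary.putIfAbsent(row[column], dictionary.size() + 1);
    }

    @Override
    public void close() { /* nothing to flush in this in-memory sketch */ }
}

// Lets a caller combine sinks, for example dictionary output plus file output.
final class CombinedSink implements RowSink {
    private final RowSink[] sinks;

    CombinedSink(RowSink... sinks) { this.sinks = sinks; }

    @Override
    public void write(String[] row) {
        for (RowSink sink : sinks) {
            sink.write(row);
        }
    }

    @Override
    public void close() {
        for (RowSink sink : sinks) {
            sink.close();
        }
    }
}
```

With this layout a caller could wrap a `TableSink` and a `DictionarySink` in one `CombinedSink`, so each incoming row both feeds the dictionary and reaches the file. Whether composition like this or inheritance fits better depends on the usage scenarios still to be checked.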

Regards,
Jacky


Re: [Discussion] CarbonOutputFormat Implementation

Divya Gupta
Thanks, Jacky and Venkata, for the suggestions. I am working on the design
and will post in this thread if I have any questions. I will share
the design soon.

Regards
Divya Gupta

Re: [Discussion] CarbonOutputFormat Implementation

Liang Chen
Administrator
Hi

+1 for supporting OutputFormat.

Regards
Liang
