CarbonData has implemented CarbonInputFormat, which enables applications
using Hive, Presto, and other similar tools to read data from Carbon. Similarly, there should be an implementation of CarbonOutputFormat. This would enable Hive, Presto, or similar applications that use CarbonData as a data source to write and load data into CarbonData files. Regards Divya Gupta |
Hi Divya
Thanks for your suggestion. CarbonData may support it in the near future. If you want to contribute this feature, I think it will benefit the community a lot. Regards. Chenerlu. |
Thanks for the quick reply Chenerlu.
I would surely like to contribute this feature and will start working towards CARBONDATA-729.
Regards
Divya Gupta
Project Lead
*Knoldus Software LLP <http://www.knoldus.com/>*
India <http://www.knoldus.in/> - US <http://www.knoldus.com/> - Canada <http://www.knoldus.ca/>
Blog <http://blog.knoldus.com/> | Twitter <https://twitter.com/knolspeak> | FB <https://www.facebook.com/KnoldusSoftware> | LinkedIn <http://www.linkedin.com/company/knoldus-software-llp->
On Tue, Jul 4, 2017 at 2:37 PM, Erlu Chen <[hidden email]> wrote:
> Hi Divya
>
> Thanks for your suggestion.
> Carbondata may support it in the near future.
> If you want to contribute this feature, I think it will benefit community a lot.
>
> Regards.
> Chenerlu.
>
> --
> View this message in context: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Discussion-CarbonOutputFormat-Implementation-tp17113p17214.html
> Sent from the Apache CarbonData Dev Mailing List archive mailing list archive at Nabble.com.
|
Thanks very much.
Once you have raised a PR, we can start the review. Regards. Chenerlu. |
+1
It's a long-pending task.
-Regards
Kumar Vishal
> On 04-Jul-2017, at 16:26, Erlu Chen <[hidden email]> wrote:
>
> Thanks very much.
> After you have raised a PR, we can start review.
>
> Regards.
> Chenerlu.
|
+1
OutputFormat should be based on single pass and use job configurations similar to those of CarbonInputFormat. Please share an initial design and code skeleton for review before proceeding with the implementation.
On Tue, Jul 4, 2017 at 4:30 PM, Kumar Vishal <[hidden email]> wrote:
> +1
> It's a long pending task.
> -Regards
> Kumar Vishal
|
+1.
For carbon data files, I think there should be at least two OutputFormats:
1) FileOutputFormat, which will not do sorting and will only write to the carbondata file. This will be used with the GLOBAL_SORT option.
2) TableOutputFormat, which will sort according to the SORT_SCOPE option and use Single Pass to load.
I also think the dictionary should be a separate OutputFormat, so that users can combine the dictionary output format with the carbondata file output format.
I suggest first checking the usage scenarios and then deciding the class hierarchy of this feature.
Regards,
Jacky
> On July 4, 2017, at 8:37 PM, Venkata Gollamudi <[hidden email]> wrote:
>
> +1
> OutputFormat should be based on single pass and with similar job configurations as CarbonInputFormat.
> Please output initial design and code skeleton, for review before proceeding for implementation.
|
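As a rough illustration of the split proposed in the message above, here is a hypothetical sketch in plain Java. It is not CarbonData or Hadoop code: every type name, the two-value SortScope enum, the sort on the first column, and the surrogate-key numbering are assumptions made only for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only: none of these types are CarbonData or Hadoop APIs.
class OutputFormatSketch {

    /** Illustrative subset of sort scopes; the values are assumptions. */
    enum SortScope { NO_SORT, LOCAL_SORT }

    /** Minimal writer contract: accept rows, return them in persisted order on close. */
    interface RowWriter {
        void write(String[] row);
        List<String[]> close();
    }

    /** Common parent so that callers can combine formats. */
    interface SketchOutputFormat {
        RowWriter createWriter();
    }

    /** File-level format: keeps arrival order and performs no sorting. */
    static class FileOutputFormatSketch implements SketchOutputFormat {
        @Override
        public RowWriter createWriter() {
            return new RowWriter() {
                private final List<String[]> rows = new ArrayList<>();
                @Override public void write(String[] row) { rows.add(row); }
                @Override public List<String[]> close() { return rows; }
            };
        }
    }

    /** Table-level format: sorts by the first column if asked, then delegates to the file-level writer. */
    static class TableOutputFormatSketch implements SketchOutputFormat {
        private final SortScope scope;

        TableOutputFormatSketch(SortScope scope) { this.scope = scope; }

        @Override
        public RowWriter createWriter() {
            final RowWriter fileWriter = new FileOutputFormatSketch().createWriter();
            return new RowWriter() {
                private final List<String[]> buffer = new ArrayList<>();
                @Override public void write(String[] row) { buffer.add(row); }
                @Override public List<String[]> close() {
                    if (scope == SortScope.LOCAL_SORT) {
                        buffer.sort(Comparator.comparing((String[] r) -> r[0]));
                    }
                    for (String[] row : buffer) {
                        fileWriter.write(row);
                    }
                    return fileWriter.close();
                }
            };
        }
    }

    /** Separate dictionary output: assigns surrogate keys in first-seen order, starting at 1. */
    static class DictionaryOutputSketch {
        private final Map<String, Integer> dictionary = new LinkedHashMap<>();

        int encode(String value) {
            Integer key = dictionary.get(value);
            if (key == null) {
                key = dictionary.size() + 1;
                dictionary.put(value, key);
            }
            return key;
        }
    }

    public static void main(String[] args) {
        RowWriter writer = new TableOutputFormatSketch(SortScope.LOCAL_SORT).createWriter();
        writer.write(new String[] {"b", "2"});
        writer.write(new String[] {"a", "1"});
        for (String[] row : writer.close()) {
            System.out.println(String.join(",", row));
        }
    }
}
```

Making the table-level writer a thin wrapper over the file-level one is only one possible way to express the hierarchy. A real implementation would presumably build on Hadoop's OutputFormat and read its settings from the job configuration, as suggested earlier in the thread.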
Thanks Jacky and Venkata for the suggestions. I am working on the design
part and will post on this discussion in case of any queries. I will share the design soon.
Regards
Divya Gupta
On Wed, Jul 5, 2017 at 9:14 AM, Jacky Li <[hidden email]> wrote:
> +1.
>
> For carbon data files, I think there should be at least two OutputFormat,
> 1) FileOutputFormat, which will not do sorting and write to carbondata file only. This will be used in GLOBAL_SORT option
> 2) TableOutputFormat, which will do sorting according to SORT_SCOPE option, and use Single Pass to load
>
> And I think dictionary should be another OutputFormat.
> So user can combine to use dictionary output format and carbondata file output format.
>
> I suggest to firstly check the usage scenario and decide the class hierarchy of this feature.
>
> Regards,
> Jacky
|
Hi
+1 for supporting OutputFormat. Regards Liang
|