Discussing to add carbondata-tools module

David CaiQiang
Hi all,

  To improve the CarbonData system's usability and maintainability, I suggest adding a carbondata-tools module.
  I think this module should provide the following command-line tools.

  1. import
  import a data file/folder to any existing table

  2. export
  export the given columns to a file

  3. schema
  show detailed information for the specified table schema file

  4. metadata
  show tablestatus metadata
  show the history of data loading and compaction

  5. footer
  show blocklet metadata list
  show start/end key, min/max value, row count, and total size for the specified blocklet

  6. blocklet
  show blocklet list
  show blocklet data, RLE map, and inverted index for the given columns

  7. index
  show BTree node list
  show node information (start/end key, min/max value)

  8. dictionary
  show key-value list of the specified global/local dictionary file
  show sort index
  show dictionary metadata
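
A minimal sketch of how the eight commands above could hang off a single entry point. Everything here is an assumption for illustration: the program name, the single `path` argument, and the help strings (which paraphrase the list) are hypothetical, and no such CarbonData CLI is implied to exist.

```python
import argparse

# Hypothetical subcommand names taken from the list above; the help strings
# paraphrase the proposal. This is not an existing CarbonData interface.
COMMANDS = {
    "import": "import a data file/folder into an existing table",
    "export": "export the given columns to a file",
    "schema": "show details of a table schema file",
    "metadata": "show tablestatus metadata and load/compaction history",
    "footer": "show the blocklet metadata list and per-blocklet statistics",
    "blocklet": "show the blocklet list and blocklet data for given columns",
    "index": "show the BTree node list and node information",
    "dictionary": "show dictionary key-value list, sort index, and metadata",
}


def build_parser():
    """Build one parser with a subcommand per proposed tool."""
    parser = argparse.ArgumentParser(prog="carbondata-tools")
    subparsers = parser.add_subparsers(dest="command", required=True)
    for name, help_text in COMMANDS.items():
        sub = subparsers.add_parser(name, help=help_text)
        # Assumed for the sketch: every command takes a single path argument.
        sub.add_argument("path", help="file, folder, or table path to operate on")
    return parser


def main(argv=None):
    """Parse the arguments and report which command would run."""
    args = build_parser().parse_args(argv)
    # A real tool would dispatch to a file reader here; this sketch only echoes.
    return f"{args.command}: {args.path}"
```

For example, `main(["schema", "/tmp/t/schema"])` selects the `schema` subcommand and echoes the path it was given.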

Thank you; I look forward to hearing your opinions on this carbondata-tools module.

David Cai

Re: Discussing to add carbondata-tools module

Liang Chen
Administrator
Thanks for starting the discussion; these tools look good.

I am trying to understand why we need these tools. Can you describe some scenarios: which problems would these tools help users solve?

Regards
Liang

Re: Discussing to add carbondata-tools module

Jean-Baptiste Onofré
In reply to this post by David CaiQiang
+1

Regards
JB

On 08/05/2016 08:31 AM, QiangCai wrote:

> [...]

--
Jean-Baptiste Onofré
[hidden email]
http://blog.nanthrax.net
Talend - http://www.talend.com

Re: Discussing to add carbondata-tools module

Jean-Baptiste Onofré
In reply to this post by David CaiQiang
I guess this is also where we could put checkstyle, etc. (what we have in the
dev folder right now), correct?

Regards
JB

Re: Discussing to add carbondata-tools module

Vimal Das Kammath
+1
On Aug 5, 2016 2:45 PM, "Jean-Baptiste Onofré" <[hidden email]> wrote:

> I guess it's where we can put the checkstyle, ... (what we have in the dev
> folder right now), correct ?
>
> Regards
> JB

Re: Discussing to add carbondata-tools module

金铸
In reply to this post by Jean-Baptiste Onofré
+1


On 2016/8/5 17:15, Jean-Baptiste Onofré wrote:

> I guess it's where we can put the checkstyle, ... (what we have in the
> dev folder right now), correct ?
>
> Regards
> JB

--
金铸
Technology Development Department (TDD)
Neusoft Corporation
A2-105A, Neusoft Software Park, No. 2 Xinxiu Street, Hunnan New District, Shenyang
Postcode:110179
Tel: (86 24)8366 2049
Mobile:13897999526

Re: Discussing to add carbondata-tools module

Venkata Gollamudi
+1
This list looks like a mix of user-facing maintenance tools and developer tools.
Mostly they can be separated.

On Fri, Aug 5, 2016, 6:03 PM 金铸 <[hidden email]> wrote:

> +1

Re: Discussing to add carbondata-tools module

sraghunandan
+1
I suggest we separate the tools into usability tools and dev/debugging tools,
and take up the usability tools with the highest priority.

On Mon, 8 Aug 2016 at 10:32 AM, Venkata Gollamudi <[hidden email]>
wrote:

> +1
> These list looks like both user needed tools for maintenance and dev tools.
> Mostly they can be separated.

Re: Discussing to add carbondata-tools module

Jihong Ma
In reply to this post by David CaiQiang
+1

It is a good idea to develop tools that make it easy to interact with Carbon. I would suggest having sub-modules under tools to categorize them further by usage: command-line tools, utilities for printing metadata...

Jihong

Re: Discussing to add carbondata-tools module

Jacky Li
+1


As Jihong and Venkata already pointed out, I think these tools can be separated into tools for end users and tools for developers:
For end users: they care about the schema, file size, number of records, and maybe some stats about the table, like the “desc formatted table” command in SQL.
For developers: the above plus more information to help with debugging and reasoning about performance issues, for example, calculating and printing how good the multidimensional index is in terms of order and sparsity, or how good it is for each column.
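
One possible way to put numbers on "order and sparsity" per column, as a rough sketch only. The two metrics are assumptions chosen for illustration, not something CarbonData defines: the fraction of adjacent values already in order, and the number of runs of equal values. The function names and the plain-list input format are hypothetical.

```python
from itertools import groupby


def sortedness(values):
    """Fraction of adjacent pairs in non-decreasing order (1.0 means fully sorted)."""
    if len(values) < 2:
        return 1.0
    in_order = sum(1 for a, b in zip(values, values[1:]) if a <= b)
    return in_order / (len(values) - 1)


def run_count(values):
    """Number of runs of consecutive equal values; fewer runs suit run-length encoding better."""
    return sum(1 for _ in groupby(values))


def column_report(columns):
    """Return {column_name: (sortedness, run_count)} for a dict of column value lists."""
    return {name: (sortedness(vals), run_count(vals)) for name, vals in columns.items()}
```

Given the values of each column in stored order, `column_report` returns one pair of numbers per column, which a developer tool could print side by side to compare columns.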

Feel free to create a ticket and separate them in phases for implementation.

Regards,
Jacky

> On Aug 8, 2016, at 7:47 PM, Jihong Ma <[hidden email]> wrote:
>
> +1
>
> It is a good idea to develop tools for easy interacting with Carbon. I would like to suggest to have sub-module under tools to further categorize them based on their usage: command line tools, util for printing metadata...
>
> Jihong