[DISCUSSION] Update the function of show segments

Erlu Chen
Hi, dev

Currently, I am thinking about the function of show segments. Executing this
command lists the segments of a carbon table, but it only returns the
segmentId, status, load start time and load end time, all of which come from
tablestatus. I think this may not be enough for users to understand the state
of each segment, so I want to add two fields: the number of carbon data files
under the segment folder, and the number of carbon index files under the
segment folder.
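
The two proposed counts could be gathered roughly as follows. This is only a minimal sketch, not CarbonData's implementation: it assumes the segment folder is reachable as a local directory and that data and index files end in `.carbondata` and `.carbonindex`. The helper name `count_segment_files` is hypothetical.

```python
from pathlib import Path


def count_segment_files(segment_dir):
    """Count data and index files directly under one segment folder.

    Assumption: files are recognised purely by their extension.
    """
    counts = {"data_files": 0, "index_files": 0}
    for entry in Path(segment_dir).iterdir():
        if not entry.is_file():
            continue  # skip sub-directories
        if entry.name.endswith(".carbondata"):
            counts["data_files"] += 1
        elif entry.name.endswith(".carbonindex"):
            counts["index_files"] += 1
    return counts
```

Unlike the existing columns, these values come from listing the folder rather than from tablestatus, so they cost one directory listing per segment.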

Any suggestions about this idea?

Comments are welcome.

Regards.
Chenerlu.



--
Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/

Re: [DISCUSSION] Update the function of show segments

xuchuanyin-2
Nice.

What about the update status of the segment? Maybe someone is interested in the last modified time of a segment.





On 09/16/2017 17:14, Erlu Chen wrote:
> Hi, dev
>
> Currently, I am thinking about the function of show segments. Executing this
> command lists the segments of a carbon table, but it only returns the
> segmentId, status, load start time and load end time, all of which come from
> tablestatus. I think this may not be enough for users to understand the state
> of each segment, so I want to add two fields: the number of carbon data files
> under the segment folder, and the number of carbon index files under the
> segment folder.

Re: [DISCUSSION] Update the function of show segments

sraghunandan
What is the use case? When would a user be interested in knowing the number
of files?
On Sat, 16 Sep 2017 at 3:12 PM, xuchuanyin <[hidden email]> wrote:

> Nice.
>
> What about the update status of the segment? Maybe someone is interested
> in the last modified time of a segment.

Re: [DISCUSSION] Update the function of show segments

Jacky Li
I think this feature is good to have; it may help users decide whether manual compaction is needed.

Instead of the number of carbon index files per segment, I suggest outputting the data size of the segment, which is more helpful.

Regards,
Jacky

> On 20 September 2017, at 11:46 AM, Raghunandan S <[hidden email]> wrote:
>
> What is the use case? When would a user be interested in knowing the number
> of files?

Re: [DISCUSSION] Update the function of show segments

David CaiQiang
I agree with Jacky.

I think enhanced segment metadata will help us to understand the table.  

I suggest the following properties for segment metadata:
1. total data file size
2. total index file size
3. data file count
4. index file count
5. last modified time (last update time)

With this information, we can answer the following questions:
1. Is there a small-file issue? Does the table require compaction, and if so,
which type should be used?
2. Are there too many index files? We can estimate the total in-memory size of
the index and judge whether it is large or small relative to the driver
memory configuration.
3. Does any segment have too many files? This may help locate performance
issues.
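
The five properties and the first question could be sketched as below. Everything here is an illustrative assumption rather than CarbonData code: the names `SegmentSummary`, `summarize_segment` and `has_small_file_issue` are hypothetical, the last modified time is taken from file modification times on a local directory, and the 64 MB average-size threshold is an arbitrary placeholder.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class SegmentSummary:
    data_size: int          # total bytes of data files
    index_size: int         # total bytes of index files
    data_file_count: int
    index_file_count: int
    last_modified: float    # newest file mtime, seconds since the epoch


def summarize_segment(segment_dir):
    """Collect the five proposed properties for one segment folder."""
    data_size = index_size = data_files = index_files = 0
    last_modified = 0.0
    for entry in Path(segment_dir).iterdir():
        if not entry.is_file():
            continue
        stat = entry.stat()
        if entry.name.endswith(".carbondata"):
            data_size += stat.st_size
            data_files += 1
        elif entry.name.endswith(".carbonindex"):
            index_size += stat.st_size
            index_files += 1
        else:
            continue  # ignore unrelated files
        last_modified = max(last_modified, stat.st_mtime)
    return SegmentSummary(data_size, index_size, data_files,
                          index_files, last_modified)


def has_small_file_issue(summary, min_avg_bytes=64 * 1024 * 1024):
    """Flag a segment whose average data file size is below a threshold.

    The default threshold is a placeholder, not a CarbonData setting.
    """
    if summary.data_file_count == 0:
        return False
    return summary.data_size / summary.data_file_count < min_avg_bytes
```

Summing `index_size` over all segments gives a rough input for the driver memory estimate in question 2, although the in-memory size of an index need not equal its size on disk.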



-----
Best Regards
David Cai

Re: [DISCUSSION] Update the function of show segments

ravipesala
Hi,

I agree with Jacky and David.
However, I suggest keeping the current 'show segments' command unchanged, so
that it provides only brief information about segments, and adding a new
extended command such as `extended show segments` to provide the additional
information that power users need.
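
The brief/extended split can be pictured as two column sets over the same segment records. This is a hedged sketch only: the column names and the `show_segments` helper are hypothetical and do not reflect the actual command's output schema.

```python
# Columns the current command returns, per the thread.
BRIEF_COLUMNS = ("segment_id", "status", "load_start_time", "load_end_time")

# Extra columns proposed in this discussion (names are illustrative).
EXTENDED_COLUMNS = BRIEF_COLUMNS + (
    "data_size",
    "index_size",
    "data_file_count",
    "index_file_count",
    "last_modified_time",
)


def show_segments(segments, extended=False):
    """Project each segment record (a dict) onto the chosen column set."""
    columns = EXTENDED_COLUMNS if extended else BRIEF_COLUMNS
    return [tuple(segment.get(col) for col in columns) for segment in segments]
```

Keeping the brief form as the default means existing users and scripts see the same four columns, and only the extended form pays for the additional file listing.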

Regards,
Ravindra.

On 21 September 2017 at 09:03, David CaiQiang <[hidden email]> wrote:

> I agree with Jacky.
>
> I think enhanced segment metadata will help us to understand the table.



--
Thanks & Regards,
Ravi

Re: [DISCUSSION] Update the function of show segments

xuchuanyin-2
If we add a new statement, I suggest following Hive's convention:
desc formatted table_name;
VS
desc table_name;

Show segment...
VS
Show formatted segment...
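
To make the analogy concrete, here is a small sketch of how an optional `formatted` keyword could be recognised. The grammar is hypothetical: it assumes the statement takes a `FOR TABLE [db.]table` clause, and `parse_show_segments` is an illustrative helper, not part of CarbonData's parser.

```python
import re

# Hypothetical grammar: SHOW [FORMATTED] SEGMENTS FOR TABLE [db.]table [;]
_SHOW_SEGMENTS = re.compile(
    r"^\s*show\s+(?P<formatted>formatted\s+)?segments\s+for\s+table\s+"
    r"(?:(?P<db>\w+)\.)?(?P<table>\w+)\s*;?\s*$",
    re.IGNORECASE,
)


def parse_show_segments(statement):
    """Return the database, table and whether FORMATTED was requested."""
    match = _SHOW_SEGMENTS.match(statement)
    if match is None:
        raise ValueError("not a show segments statement: %r" % statement)
    return {
        "database": match.group("db"),
        "table": match.group("table"),
        "formatted": match.group("formatted") is not None,
    }
```

As with Hive's `desc` and `desc formatted`, the plain statement stays valid and the keyword only switches on the richer output.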




On 09/21/2017 14:02, Ravindra Pesala wrote:
> Hi,
>
> I agree with Jacky and David.
> However, I suggest keeping the current 'show segments' command unchanged, so
> that it provides only brief information about segments, and adding a new
> extended command such as `extended show segments` to provide the additional
> information that power users need.

Re: [DISCUSSION] Update the function of show segments

Erlu Chen
In reply to this post by ravipesala
Yeah, I agree with Ravi.

We can keep both "Show segments" and "Show extended segment".

@xuchuanyin, as far as I know, the result of show segment is currently already formatted.

Regards.
Chenerlu.


