[GitHub] [carbondata] marchpure opened a new pull request #3792: [CARBONDATA-3856] Support the LIMIT operator for show segments command

classic Classic list List threaded Threaded
12 messages Options
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] marchpure opened a new pull request #3792: [CARBONDATA-3856] Support the LIMIT operator for show segments command

GitBox

marchpure opened a new pull request #3792:
URL: https://github.com/apache/carbondata/pull/3792


    ### Why is this PR needed?
    When the number segments is large, "SHOW SEGMENTS" time overhead is too high. The LIMIT operator shall be supported in the SHOW SEGMENTS command.
   
    ### What changes were proposed in this PR?
   Add LIMIT operator in the grammar parsing and read segments function.
       
    ### Does this PR introduce any user interface change?
   Yes
   
    ### Is any new testcase added?
   Yes
       
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3792: [CARBONDATA-3856] Support the LIMIT operator for show segments command

GitBox

CarbonDataQA1 commented on pull request #3792:
URL: https://github.com/apache/carbondata/pull/3792#issuecomment-645594860


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1443/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3792: [CARBONDATA-3856] Support the LIMIT operator for show segments command

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3792:
URL: https://github.com/apache/carbondata/pull/3792#issuecomment-645595925


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3168/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3792: [CARBONDATA-3856] Support the LIMIT operator for show segments command

GitBox
In reply to this post by GitBox

Indhumathi27 commented on a change in pull request #3792:
URL: https://github.com/apache/carbondata/pull/3792#discussion_r442128902



##########
File path: docs/segment-management-on-carbondata.md
##########
@@ -54,6 +54,12 @@ concept which helps to maintain consistency of data and easy transaction managem
   SHOW SEGMENTS ON CarbonDatabase.CarbonTable
   ```
 
+  Show lastest 10 visible segments

Review comment:
       After compaction also, it should show latest segments right? For example, there are 6 segments, after major compaction, show segments with limit 2 will display 4.1 and 5th segment only, whereas latest segments are 0.2 and 0.3. I think we should get latest segments based on timestamp.

##########
File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/segment/ShowSegmentTestCase.scala
##########
@@ -191,7 +214,11 @@ class ShowSegmentTestCase extends QueryTest with BeforeAndAfterAll {
     sql("drop table if exists a")
     sql("create table a(a string) stored as carbondata")
     sql("insert into a select 'k'")
+    sql("insert into a select 'j'")
+    sql("insert into a select 'k'")
     val rows = sql("show segments for table a").collect()
+    assert(sql(s"show segments for table a").collect().length == 3)

Review comment:
       ```suggestion
       assert(rows.length == 3)
   ```

##########
File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/segmentreading/TestSegmentReading.scala
##########
@@ -251,13 +251,16 @@ class TestSegmentReading extends QueryTest with BeforeAndAfterAll {
         s"""LOAD DATA local inpath '$resourcesPath/data.csv' INTO TABLE carbon_table_show_seg OPTIONS
             |('DELIMITER'= ',', 'QUOTECHAR'= '\"')""".stripMargin)
       val df = sql("SHOW SEGMENTS for table carbon_table_show_seg as select * from carbon_table_show_seg_segments")
+      sql("SHOW SEGMENTS for table carbon_table_show_seg as select * from carbon_table_show_seg_segments").show()

Review comment:
       Can remove this line




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] marchpure commented on a change in pull request #3792: [CARBONDATA-3856] Support the LIMIT operator for show segments command

GitBox
In reply to this post by GitBox

marchpure commented on a change in pull request #3792:
URL: https://github.com/apache/carbondata/pull/3792#discussion_r442151742



##########
File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/segment/ShowSegmentTestCase.scala
##########
@@ -191,7 +214,11 @@ class ShowSegmentTestCase extends QueryTest with BeforeAndAfterAll {
     sql("drop table if exists a")
     sql("create table a(a string) stored as carbondata")
     sql("insert into a select 'k'")
+    sql("insert into a select 'j'")
+    sql("insert into a select 'k'")
     val rows = sql("show segments for table a").collect()
+    assert(sql(s"show segments for table a").collect().length == 3)

Review comment:
       modified

##########
File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/segmentreading/TestSegmentReading.scala
##########
@@ -251,13 +251,16 @@ class TestSegmentReading extends QueryTest with BeforeAndAfterAll {
         s"""LOAD DATA local inpath '$resourcesPath/data.csv' INTO TABLE carbon_table_show_seg OPTIONS
             |('DELIMITER'= ',', 'QUOTECHAR'= '\"')""".stripMargin)
       val df = sql("SHOW SEGMENTS for table carbon_table_show_seg as select * from carbon_table_show_seg_segments")
+      sql("SHOW SEGMENTS for table carbon_table_show_seg as select * from carbon_table_show_seg_segments").show()

Review comment:
       modified




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] marchpure commented on a change in pull request #3792: [CARBONDATA-3856] Support the LIMIT operator for show segments command

GitBox
In reply to this post by GitBox

marchpure commented on a change in pull request #3792:
URL: https://github.com/apache/carbondata/pull/3792#discussion_r442154411



##########
File path: docs/segment-management-on-carbondata.md
##########
@@ -54,6 +54,12 @@ concept which helps to maintain consistency of data and easy transaction managem
   SHOW SEGMENTS ON CarbonDatabase.CarbonTable
   ```
 
+  Show lastest 10 visible segments

Review comment:
       Yeah. currently, the showed segments are ordered by loadname desc. in the type of:
   +---+---------+-----------------------+
   |ID |Status   |Load Start Time        |
   +---+---------+-----------------------+
   |5  |Compacted|2020-06-18 04:20:09.041|
   |4.1|Success      |2020-06-18 04:20:09.041|
   |4  |Compacted|2020-06-18 04:20:08.69 |
   |3  |Compacted|2020-06-18 04:20:07.622|
   |2.1|Compacted|2020-06-18 04:20:07.622|
   |2  |Compacted|2020-06-18 04:20:07.226|




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3792: [CARBONDATA-3856] Support the LIMIT operator for show segments command

GitBox
In reply to this post by GitBox

Indhumathi27 commented on a change in pull request #3792:
URL: https://github.com/apache/carbondata/pull/3792#discussion_r442174299



##########
File path: docs/segment-management-on-carbondata.md
##########
@@ -54,6 +54,12 @@ concept which helps to maintain consistency of data and easy transaction managem
   SHOW SEGMENTS ON CarbonDatabase.CarbonTable
   ```
 
+  Show lastest 10 visible segments

Review comment:
       yes. i think it is better to display latest segments based on timestamp on limit




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3792: [CARBONDATA-3856] Support the LIMIT operator for show segments command

GitBox
In reply to this post by GitBox

Indhumathi27 commented on a change in pull request #3792:
URL: https://github.com/apache/carbondata/pull/3792#discussion_r442174299



##########
File path: docs/segment-management-on-carbondata.md
##########
@@ -54,6 +54,12 @@ concept which helps to maintain consistency of data and easy transaction managem
   SHOW SEGMENTS ON CarbonDatabase.CarbonTable
   ```
 
+  Show lastest 10 visible segments

Review comment:
       yes. i think it is better to display latest segments based on timestamp on query with limit




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3792: [CARBONDATA-3856] Support the LIMIT operator for show segments command

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3792:
URL: https://github.com/apache/carbondata/pull/3792#issuecomment-646019412


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3169/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3792: [CARBONDATA-3856] Support the LIMIT operator for show segments command

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3792:
URL: https://github.com/apache/carbondata/pull/3792#issuecomment-646020176


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1444/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] Indhumathi27 commented on pull request #3792: [CARBONDATA-3856] Support the LIMIT operator for show segments command

GitBox
In reply to this post by GitBox

Indhumathi27 commented on pull request #3792:
URL: https://github.com/apache/carbondata/pull/3792#issuecomment-646429823


   LGTM


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] asfgit closed pull request #3792: [CARBONDATA-3856] Support the LIMIT operator for show segments command

GitBox
In reply to this post by GitBox

asfgit closed pull request #3792:
URL: https://github.com/apache/carbondata/pull/3792


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]