[GitHub] [carbondata] nihal0107 opened a new pull request #4116: [CARBONDATA-4162] Leverage Secondary Index till segment level with Spark plan rewrite


GitBox

nihal0107 opened a new pull request #4116:
URL: https://github.com/apache/carbondata/pull/4116


    ### Why is this PR needed?
    In the existing architecture, if the parent (main) table and the SI table do not have the same valid segments, the SI table is disabled. From the next query onwards, only the parent table is scanned and pruned until the next load or REINDEX command is triggered (these commands bring the parent and SI table segments back in sync). As a result, queries take longer to return results while the SI is disabled.

    ### What changes were proposed in this PR?
    1. Instead of disabling the SI table when the parent and child table segments are out of sync, pruning is done on the SI table for all of its valid segments (segments with status success, marked for update, or load partial success), and the remaining segments are pruned by the parent table.
    2. Different SI tables may now contain different numbers of segments. In that case, the best-fit SI table is identified based on segment count. If more than one SI table contains the same segment count, the best-fit SI table is identified using the current design.
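
The two points above can be illustrated with a minimal sketch. This is not the code in the PR: the `Segment` class, the status strings, and the helpers `split_segments` and `pick_best_fit` are hypothetical names chosen for illustration, and the sketch assumes that "based on segment count" means preferring the SI table that covers the most valid main-table segments.

```python
from dataclasses import dataclass

# Statuses treated as valid for SI pruning, per the description above.
# The exact strings are an assumption made for this sketch.
VALID_STATUSES = {"SUCCESS", "MARKED_FOR_UPDATE", "LOAD_PARTIAL_SUCCESS"}


@dataclass(frozen=True)
class Segment:
    """Hypothetical stand-in for a segment entry in a table status file."""
    segment_id: str
    status: str


def valid_ids(segments):
    """Return the ids of segments whose status counts as valid."""
    return {s.segment_id for s in segments if s.status in VALID_STATUSES}


def split_segments(main_segments, si_segments):
    """Split the main table's valid segments into two sets:
    those the SI table can prune, and those left to the main table."""
    main_valid = valid_ids(main_segments)
    si_valid = valid_ids(si_segments) & main_valid
    return si_valid, main_valid - si_valid


def pick_best_fit(main_segments, si_tables):
    """Return the names of the SI tables covering the most valid segments.

    Ties are not resolved here: the description defers them to the
    existing best-fit design, which this sketch does not model, so every
    tied candidate is returned in sorted order.
    """
    coverage = {
        name: len(split_segments(main_segments, segments)[0])
        for name, segments in si_tables.items()
    }
    if not coverage:
        return []
    best = max(coverage.values())
    return sorted(name for name, count in coverage.items() if count == best)
```

With a main table holding valid segments 0, 1 and 2, an SI table that has only loaded segment 0 would prune segment 0 and leave segments 1 and 2 to the main table, rather than being disabled outright.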
       
    ### Does this PR introduce any user interface change?
    - No
   
    ### Is any new testcase added?
    - Yes
   
       
   


--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4116: [CARBONDATA-4162] Leverage Secondary Index till segment level with Spark plan rewrite

GitBox

CarbonDataQA2 commented on pull request #4116:
URL: https://github.com/apache/carbondata/pull/4116#issuecomment-813379819


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/3374/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4116: [CARBONDATA-4162] Leverage Secondary Index till segment level with Spark plan rewrite

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4116:
URL: https://github.com/apache/carbondata/pull/4116#issuecomment-813382532


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5125/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4116: [WIP][CARBONDATA-4162] Leverage Secondary Index till segment level with Spark plan rewrite

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4116:
URL: https://github.com/apache/carbondata/pull/4116#issuecomment-813850926


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/3378/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4116: [WIP][CARBONDATA-4162] Leverage Secondary Index till segment level with Spark plan rewrite

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4116:
URL: https://github.com/apache/carbondata/pull/4116#issuecomment-813851440


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5129/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4116: [WIP][CARBONDATA-4162] Leverage Secondary Index till segment level with Spark plan rewrite

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4116:
URL: https://github.com/apache/carbondata/pull/4116#issuecomment-813910798


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/3382/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4116: [WIP][CARBONDATA-4162] Leverage Secondary Index till segment level with Spark plan rewrite

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4116:
URL: https://github.com/apache/carbondata/pull/4116#issuecomment-813911488


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5133/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4116: [WIP][CARBONDATA-4162] Leverage Secondary Index till segment level with Spark plan rewrite

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4116:
URL: https://github.com/apache/carbondata/pull/4116#issuecomment-814159308


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/3384/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4116: [WIP][CARBONDATA-4162] Leverage Secondary Index till segment level with Spark plan rewrite

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4116:
URL: https://github.com/apache/carbondata/pull/4116#issuecomment-814164174


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5135/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4116: [CARBONDATA-4162] Leverage Secondary Index till segment level.

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4116:
URL: https://github.com/apache/carbondata/pull/4116#issuecomment-824710356


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5226/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4116: [CARBONDATA-4162] Leverage Secondary Index till segment level.

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4116:
URL: https://github.com/apache/carbondata/pull/4116#issuecomment-824715708


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/3478/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4116: [CARBONDATA-4162] Leverage Secondary Index till segment level.

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4116:
URL: https://github.com/apache/carbondata/pull/4116#issuecomment-824823324


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5231/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4116: [CARBONDATA-4162] Leverage Secondary Index till segment level.

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4116:
URL: https://github.com/apache/carbondata/pull/4116#issuecomment-824827486


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/3483/
   



[GitHub] [carbondata] nihal0107 commented on pull request #4116: [CARBONDATA-4162] Leverage Secondary Index till segment level.

GitBox
In reply to this post by GitBox

nihal0107 commented on pull request #4116:
URL: https://github.com/apache/carbondata/pull/4116#issuecomment-825389939


   Retest this please.



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4116: [CARBONDATA-4162] Leverage Secondary Index till segment level.

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4116:
URL: https://github.com/apache/carbondata/pull/4116#issuecomment-825429510


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/3491/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4116: [CARBONDATA-4162] Leverage Secondary Index till segment level.

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4116:
URL: https://github.com/apache/carbondata/pull/4116#issuecomment-825433658


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5239/
   



[GitHub] [carbondata] kunal642 commented on pull request #4116: [CARBONDATA-4162] Leverage Secondary Index till segment level.

GitBox
In reply to this post by GitBox

kunal642 commented on pull request #4116:
URL: https://github.com/apache/carbondata/pull/4116#issuecomment-825546110


   @nihal0107 The existing implementation of external segment support with SI uses a similar approach (prune some segments from the SI and the others from the main table). Please try to use the existing flow; that way we can avoid the filter-level changes.
   



[GitHub] [carbondata] nihal0107 commented on pull request #4116: [CARBONDATA-4162] Leverage Secondary Index till segment level.

GitBox
In reply to this post by GitBox

nihal0107 commented on pull request #4116:
URL: https://github.com/apache/carbondata/pull/4116#issuecomment-826695471


   > @nihal0107 The existing implementation of external segment support with SI uses a similar approach (prune some segments from the SI and the others from the main table). Please try to use the existing flow; that way we can avoid the filter-level changes.
   
   OK, I have made the changes: missing SI segments are now handled as external segments when pruning with the main table.



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4116: [CARBONDATA-4162] Leverage Secondary Index till segment level.

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4116:
URL: https://github.com/apache/carbondata/pull/4116#issuecomment-826718982


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5260/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4116: [CARBONDATA-4162] Leverage Secondary Index till segment level.

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4116:
URL: https://github.com/apache/carbondata/pull/4116#issuecomment-826728041


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/3514/
   

