
[GitHub] [carbondata] VenuReddy2103 opened a new pull request #3972: [WIP]Launch same number of task as select query for insert into select and ctas cases when target table is of no_sort

Posted by GitBox on Oct 08, 2020; 10:55am
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/GitHub-carbondata-VenuReddy2103-opened-a-new-pull-request-3972-WIP-Launch-same-number-of-task-as-selt-tp101521.html


VenuReddy2103 opened a new pull request #3972:
URL: https://github.com/apache/carbondata/pull/3972


     ### Why is this PR needed?
    At present, when we run insert into table select from or create table as select from (CTAS), we launch a single task per node. In contrast, for a simple select * from table query, the number of tasks launched equals the number of carbondata files (the default CARBON_TASK_DISTRIBUTION is CARBON_TASK_DISTRIBUTION_BLOCK).
    This slows down the load performance of the insert into select and CTAS cases.
    Refer to [Community discussion regarding task launch](http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Discussion-Query-Regarding-Task-launch-mechanism-for-data-load-operations-tt98711.html)
    ### What changes were proposed in this PR?
    Launch the same number of tasks as the select query for the insert into select and CTAS cases when the target table is no_sort.
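
To make the difference in task counts concrete, here is a toy model in Python. It is only a sketch and is not CarbonData code: the `Block` type, the file names, the host names, and both helper functions are hypothetical, and it assumes six carbondata files spread evenly over two nodes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    path: str  # hypothetical carbondata file name
    node: str  # hypothetical host that holds the file


def node_level_tasks(blocks):
    """Group all files on a host into one task (one task per node)."""
    grouped = {}
    for block in blocks:
        grouped.setdefault(block.node, []).append(block.path)
    return list(grouped.values())


def block_level_tasks(blocks):
    """Create one task per carbondata file, as a plain select does by default."""
    return [[block.path] for block in blocks]


# Assumed layout: 6 files alternating between 2 nodes.
blocks = [Block(f"part-{i}.carbondata", f"node{i % 2}") for i in range(6)]

print(len(node_level_tasks(blocks)))   # 2 tasks, one per node
print(len(block_level_tasks(blocks)))  # 6 tasks, one per file
```

In this model the per-node grouping yields 2 tasks while the per-file grouping yields 6, which is the gap in parallelism this PR aims to close for no_sort target tables.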
       
    ### Does this PR introduce any user interface change?
    - No
   
    ### Is any new testcase added?
    - No
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]