[jira] [Created] (CARBONDATA-4042) Insert into select and CTAS launches fewer tasks(limited to max nodes) even when target table is of no_sort
Posted by Akash R Nilugal (Jira)
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/jira-Created-CARBONDATA-4042-Insert-into-select-and-CTAS-launches-fewer-tasks-limited-to-max-nodes-et-tp102695.html
Venugopal Reddy K created CARBONDATA-4042:
---------------------------------------------
Summary: Insert into select and CTAS launches fewer tasks(limited to max nodes) even when target table is of no_sort
Key: CARBONDATA-4042
URL: https://issues.apache.org/jira/browse/CARBONDATA-4042
Project: CarbonData
Issue Type: Improvement
Components: data-load, spark-integration
Reporter: Venugopal Reddy K
*Issue:*
At present, when we run insert into table select from, or create table as select from, we launch a single task per node. In contrast, a simple select * from table query launches as many tasks as there are carbondata files (CARBON_TASK_DISTRIBUTION defaults to CARBON_TASK_DISTRIBUTION_BLOCK).
This limits parallelism and slows down the load performance of the insert into select and CTAS cases.
Refer to [Community discussion regarding task launch|http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Discussion-Query-Regarding-Task-launch-mechanism-for-data-load-operations-tt98711.html]
*Suggestion:*
Launch the same number of tasks as a select query does for the insert into select and CTAS cases when the target table is no_sort.
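To make the difference in parallelism concrete, here is a toy model in Python. It is only a sketch and not CarbonData code: the function names, the file names and the two-node layout are all hypothetical, and it assumes each carbondata file lives on exactly one node.

```python
from collections import defaultdict


def tasks_per_node(file_locations):
    """Model of the current load behaviour: one task per node.

    file_locations maps a (hypothetical) carbondata file name to the node
    hosting it. Returns {node: [files handled by that node's single task]}.
    """
    tasks = defaultdict(list)
    for name, node in sorted(file_locations.items()):
        tasks[node].append(name)
    return dict(tasks)


def tasks_per_file(file_locations):
    """Model of the suggested behaviour for no_sort targets:
    one task per carbondata file, as in a plain select query."""
    return {name: [name] for name in sorted(file_locations)}


# Hypothetical layout: 6 carbondata files spread over 2 nodes.
layout = {f"part-{i}.carbondata": f"node-{i % 2}" for i in range(6)}

print(len(tasks_per_node(layout)))  # 2 tasks, one per node
print(len(tasks_per_file(layout)))  # 6 tasks, one per file
```

In this model the number of load tasks is capped by the number of nodes today, whereas the suggestion lets it grow with the number of source files.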
--
This message was sent by Atlassian Jira
(v8.3.4#803005)