[GitHub] [carbondata] MarvinLitt opened a new pull request #3514: [FAQ]add faq for how to deal with trailing task


GitBox
MarvinLitt opened a new pull request #3514: [FAQ]add faq for how to deal with trailing task
URL: https://github.com/apache/carbondata/pull/3514
 
 
   Be sure to do all of the following checklist to help us incorporate
   your contribution quickly and easily:
   
    - [ ] Any interfaces changed?
   
    - [ ] Any backward compatibility impacted?
   
    - [ ] Document update required?
   
    - [ ] Testing done
           Please provide details on
           - Whether new unit test cases have been added or why no new tests are required?
           - How it is tested? Please attach test report.
           - Is it a performance related change? Please attach the performance test report.
           - Any additional information to help reviewers in testing this change.
         
    - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services

GitBox
jackylk commented on a change in pull request #3514: [FAQ]add faq for how to deal with trailing task
URL: https://github.com/apache/carbondata/pull/3514#discussion_r361909663
 
 

 ##########
 File path: docs/faq.md
 ##########
 @@ -227,6 +228,29 @@ This property will enable the DEBUG log for the CarbonLRUCache and UnsafeMemoryM
 **Note:** If  `Removed entry from InMemory LRU cache` are frequently observed in logs, you may have to increase the configured LRU size.
 
 To observe the LRU cache from heap dump, check the heap used by CarbonLRUCache class.
+
+## How to deal with the trailing task in query?
+
+During the tuning process, you may find that a few tasks slow down the overall query progress.  If the amount of data processed by each task is the same, people naturally suspect IO, CPU, or network bandwidth. Testing these usually does not give a quick answer, so a faster way to handle the problem is needed. Configuring spark.locality.wait and spark.speculation is one approach: a task that runs too long is retried on other nodes as soon as possible, and the result of whichever attempt ends first is used. This may lose some data locality, but in practice it has been verified to reduce the time spent on the trailing task.
 
 Review comment:
   ```suggestion
   When tuning query performance, users may find that a few tasks slow down the overall query progress.  To improve performance in such cases, users can tune spark.locality.wait and set spark.speculation=true to enable speculation in Spark, which launches additional attempts of slow tasks and uses the result of whichever attempt finishes first. Besides, users can also consider the following configurations to further improve performance in this case.
   ```
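
   As a rough illustration of the two settings named in the suggestion, the sketch below renders them as `--conf` arguments for `spark-submit`. The property names `spark.speculation` and `spark.locality.wait` come from the FAQ text; the value `500ms`, the dict name `STRAGGLER_CONF`, and the helper `to_submit_args` are assumptions made for this example and are not part of the PR.

```python
import shlex

# Illustrative values only (assumed, not taken from the PR); tune per cluster.
STRAGGLER_CONF = {
    # Re-launch slow-running tasks on other executors.
    "spark.speculation": "true",
    # Wait less for a data-local slot before scheduling elsewhere.
    "spark.locality.wait": "500ms",
}


def to_submit_args(conf):
    """Render a conf dict as `--conf key=value` arguments for spark-submit."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    rendered = " ".join(shlex.quote(a) for a in to_submit_args(STRAGGLER_CONF))
    print("spark-submit " + rendered)
```

   A shorter locality wait trades data locality for earlier scheduling, which matches the trade-off described in the FAQ paragraph under review.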

GitBox
CarbonDataQA1 commented on issue #3514: [FAQ]add faq for how to deal with trailing task
URL: https://github.com/apache/carbondata/pull/3514#issuecomment-569627248
 
 
   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1375/
   

GitBox
jackylk commented on issue #3514: [FAQ]add faq for how to deal with trailing task
URL: https://github.com/apache/carbondata/pull/3514#issuecomment-569627377
 
 
   I canceled CI for this PR since it is for document modification
   LGTM, merging this PR

GitBox
asfgit closed pull request #3514: [FAQ]add faq for how to deal with trailing task
URL: https://github.com/apache/carbondata/pull/3514
 
 
   

GitBox
CarbonDataQA1 commented on issue #3514: [FAQ]add faq for how to deal with trailing task
URL: https://github.com/apache/carbondata/pull/3514#issuecomment-569644106
 
 
   Build Failed with Spark 2.2.1, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.2/1365/
   

GitBox
CarbonDataQA1 commented on issue #3514: [FAQ]add faq for how to deal with trailing task
URL: https://github.com/apache/carbondata/pull/3514#issuecomment-569650060
 
 
   Build Success with Spark 2.1.0, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.1/1355/
   
