ajantha-bhat opened a new pull request #3901:
URL: https://github.com/apache/carbondata/pull/3901

### Why is this PR needed?
#3764 added nosort (this was wrong code, but it had no functional impact, as it did not change the new segment load to no_sort).
#3856 changed it to no_sort (this has a functional impact: the target table's new segment is now loaded with no_sort).

### What changes were proposed in this PR?
A CDC update written as a new segment should use the target table's sort_scope.

### Does this PR introduce any user interface change?
- No

### Is any new testcase added?
- No (the flows were verified manually)

----------------------------------------------------------------
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email]
ajantha-bhat commented on pull request #3901:
URL: https://github.com/apache/carbondata/pull/3901#issuecomment-680714175

@QiangCai, @ravipesala, @marchpure, @akashrn5: please check this.
CarbonDataQA1 commented on pull request #3901:
URL: https://github.com/apache/carbondata/pull/3901#issuecomment-680771095

Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3875/
CarbonDataQA1 commented on pull request #3901:
URL: https://github.com/apache/carbondata/pull/3901#issuecomment-680773026

Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2134/
Zhangshunyu commented on pull request #3901:
URL: https://github.com/apache/carbondata/pull/3901#issuecomment-682379495

Setting 'no_sort' for CDC is for better load performance during merge, but I think we should keep it the same as the target table.
ajantha-bhat commented on pull request #3901:
URL: https://github.com/apache/carbondata/pull/3901#issuecomment-682380195

> Setting 'no_sort' for CDC is for better load performance during merge, but I think we should keep it the same as the target table.

@Zhangshunyu: But the target table itself can be created with no_sort. As it stands, some segments can be sorted and some not, so I fixed it.
ajantha-bhat edited a comment on pull request #3901:
URL: https://github.com/apache/carbondata/pull/3901#issuecomment-682380195

> Setting 'no_sort' for CDC is for better load performance during merge, but I think we should keep it the same as the target table.

@Zhangshunyu: But the target table itself can be created with no_sort. As it stands, if the target table is global sort, the old segments are sorted and the new CDC segments are not, so I fixed it.
ajantha-bhat edited a comment on pull request #3901:
URL: https://github.com/apache/carbondata/pull/3901#issuecomment-682380195

> Setting 'no_sort' for CDC is for better load performance during merge, but I think we should keep it the same as the target table.

@Zhangshunyu: To get a faster CDC merge, the target table itself can be created with no_sort. As it stands, if the target table is global sort, the old segments are sorted and the new CDC segments are not, so I fixed it.
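The inconsistency described in the comment above can be illustrated with a small sketch. It is not CarbonData code: the function name `mismatched_segments` and the dict that maps segment ids to sort scopes are hypothetical stand-ins for segment metadata; only the values no_sort and global_sort come from this thread.

```python
# Hypothetical sketch: a plain dict stands in for per-segment metadata.
# Nothing here is a CarbonData API.

def mismatched_segments(table_sort_scope, segment_sort_scopes):
    """Return the ids of segments whose sort scope differs from the table's."""
    expected = table_sort_scope.strip().lower()
    return [
        seg_id
        for seg_id, scope in segment_sort_scopes.items()
        if scope.strip().lower() != expected
    ]


# Segments 0 and 1 were loaded with the table's global_sort;
# segment 2 was written by a CDC merge that forced no_sort.
segments = {"0": "global_sort", "1": "global_sort", "2": "no_sort"}
print(mismatched_segments("global_sort", segments))  # prints ['2']
```

With the change in this PR, the CDC segment would carry the table's sort scope, so a check like this would return an empty list.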
Zhangshunyu commented on pull request #3901:
URL: https://github.com/apache/carbondata/pull/3901#issuecomment-682381033

@ajantha-bhat Yes, I agree with this PR's change to keep it the same as the target table.
akashrn5 commented on pull request #3901:
URL: https://github.com/apache/carbondata/pull/3901#issuecomment-682408178

@ajantha-bhat If the target table is no_sort, and since we insert the new data as a separate segment during merge, could we sort this segment when writing it, which would help queries, instead of blindly following the target table's sort scope?
ajantha-bhat commented on pull request #3901:
URL: https://github.com/apache/carbondata/pull/3901#issuecomment-682419143

> @ajantha-bhat If the target table is no_sort, and since we insert the new data as a separate segment during merge, could we sort this segment when writing it, which would help queries, instead of blindly following the target table's sort scope?

It is not blind. The user has already decided whether the table needs to be sorted, based on their requirements (no_sort for good load speed, global_sort for good query speed), so it is better to have all segments follow the user's decision.
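The rule argued for here (every segment, including the one written by a CDC merge, follows the table's own setting) can be sketched as below. This is a minimal illustration under stated assumptions, not the code changed in this PR: `resolve_cdc_sort_scope` and the dict of table properties are hypothetical, and the fallback used when the property is missing is a caller-supplied assumption rather than CarbonData's actual default.

```python
# Hypothetical sketch: the function name and the dict-based table
# properties are illustrative only, not CarbonData APIs.

def resolve_cdc_sort_scope(target_table_properties, fallback):
    """Pick the sort scope for the new segment written by a CDC merge.

    Instead of hard-coding no_sort, reuse whatever the target table
    declares. `fallback` is used only if the table declares nothing;
    its value is an assumption supplied by the caller.
    """
    scope = target_table_properties.get("sort_scope", fallback)
    return scope.strip().lower()


# A table created for query speed keeps global_sort for CDC segments.
print(resolve_cdc_sort_scope({"sort_scope": "GLOBAL_SORT"}, "no_sort"))

# A table created for load speed keeps no_sort, so the merge stays fast.
print(resolve_cdc_sort_scope({"sort_scope": "no_sort"}, "no_sort"))
```

The design choice is that the merge path makes no sorting decision of its own; a user who wants faster CDC merges expresses that once, on the table.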
asfgit closed pull request #3901:
URL: https://github.com/apache/carbondata/pull/3901