jackylk opened a new pull request #3643: [CARBONDATA-3726] Add CDC and Data Dedup example
URL: https://github.com/apache/carbondata/pull/3643

### Why is this PR needed?
Change Data Capture (CDC) and data deduplication are common data management tasks, so we should add examples for users to reference. We can leverage CarbonData's Merge API to implement these use cases.

### What changes were proposed in this PR?
1. Two examples are added.
2. A performance improvement: previously CarbonData always used a 'full_outer' join to do Merge; in this PR the join type is decided based on the match conditions.
3. The auditor can now be enabled or disabled dynamically, so users can turn it off.

### Does this PR introduce any user interface change?
- No

### Is any new testcase added?
- No

----------------------------------------------------------------
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: [hidden email]

With regards,
Apache Git Services
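As a rough illustration of item 2 in the PR description, the sketch below shows one way a join type could be derived from which kinds of rows the merge clauses act on. It is not taken from the PR's code: the class `MergeJoinChooser`, its parameters, and the selection rule itself are assumptions made for illustration. Only the join type names are the standard Spark ones, and the target table is assumed to be the left side of the join.

```java
// Hypothetical helper, not part of CarbonData. It picks a Spark join type
// for target.join(source, condition, joinType), with the target on the left.
final class MergeJoinChooser {
  private MergeJoinChooser() { }

  /**
   * @param actsOnSourceOnlyRows true if some clause handles rows present only
   *                             in the source (for example, an insert)
   * @param actsOnTargetOnlyRows true if some clause handles rows present only
   *                             in the target (for example, a delete)
   */
  static String chooseJoinType(boolean actsOnSourceOnlyRows,
                               boolean actsOnTargetOnlyRows) {
    if (actsOnSourceOnlyRows && actsOnTargetOnlyRows) {
      return "full_outer";   // unmatched rows from both sides are needed
    }
    if (actsOnSourceOnlyRows) {
      return "right_outer";  // keep unmatched source rows only
    }
    if (actsOnTargetOnlyRows) {
      return "left_outer";   // keep unmatched target rows only
    }
    return "inner";          // only matched rows are touched
  }
}
```

Under this rule, a merge that only updates matched rows would avoid carrying unmatched rows through the join at all, which is where the saving over an unconditional 'full_outer' join would come from.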
CarbonDataQA1 commented on issue #3643: [CARBONDATA-3726] Add CDC and Data Dedup example
URL: https://github.com/apache/carbondata/pull/3643#issuecomment-592469513 Build Failed with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/528/
CarbonDataQA1 commented on issue #3643: [CARBONDATA-3726] Add CDC and Data Dedup example
URL: https://github.com/apache/carbondata/pull/3643#issuecomment-592476576 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2228/
jackylk commented on issue #3643: [CARBONDATA-3726] Add CDC and Data Dedup example
URL: https://github.com/apache/carbondata/pull/3643#issuecomment-592900548 retest this please
CarbonDataQA1 commented on issue #3643: [CARBONDATA-3726] Add CDC and Data Dedup example
URL: https://github.com/apache/carbondata/pull/3643#issuecomment-592913627 Build Failed with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/536/
CarbonDataQA1 commented on issue #3643: [CARBONDATA-3726] Add CDC and Data Dedup example
URL: https://github.com/apache/carbondata/pull/3643#issuecomment-592916515 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2236/
jackylk commented on issue #3643: [CARBONDATA-3726] Add CDC and Data Dedup example
URL: https://github.com/apache/carbondata/pull/3643#issuecomment-592926820 retest this please
CarbonDataQA1 commented on issue #3643: [CARBONDATA-3726] Add CDC and Data Dedup example
URL: https://github.com/apache/carbondata/pull/3643#issuecomment-592928253 Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/541/
CarbonDataQA1 commented on issue #3643: [CARBONDATA-3726] Add CDC and Data Dedup example
URL: https://github.com/apache/carbondata/pull/3643#issuecomment-592935246 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2241/
niuge01 commented on a change in pull request #3643: [CARBONDATA-3726] Add CDC and Data Dedup example
URL: https://github.com/apache/carbondata/pull/3643#discussion_r389453999

##########
File path: processing/src/main/java/org/apache/carbondata/processing/util/Auditor.java
##########
@@ -58,18 +58,26 @@
   }
 }
+  private static boolean enable = true;
+
+  public static void enable(boolean bool) {

Review comment:
Configuring it with a system property would be better.
niuge01 commented on a change in pull request #3643: [CARBONDATA-3726] Add CDC and Data Dedup example
URL: https://github.com/apache/carbondata/pull/3643#discussion_r389454190

##########
File path: processing/src/main/java/org/apache/carbondata/processing/util/Auditor.java
##########
@@ -58,18 +58,26 @@
   }
 }
+  private static boolean enable = true;

Review comment:
Make this field final and give it an upper-case name.
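The review comments on Auditor.java (a system property instead of a setter, and a final upper-case field) could be combined roughly as follows. This is a sketch only, not the code that was merged: the class name `AuditSwitch` and the property key `carbon.auditor.enable` are made up for illustration and are not known CarbonData names.

```java
import java.util.Properties;

// Hypothetical sketch, not CarbonData code. The property key below is an
// assumption chosen for illustration.
final class AuditSwitch {
  static final String PROPERTY = "carbon.auditor.enable";

  // Read once when the class is loaded; auditing stays on unless the
  // JVM was started with -Dcarbon.auditor.enable=false.
  static final boolean ENABLED = isEnabled(System.getProperties());

  private AuditSwitch() { }

  // Kept separate from the constant so the parsing rule can be checked
  // without touching JVM-wide system properties.
  static boolean isEnabled(Properties props) {
    return Boolean.parseBoolean(props.getProperty(PROPERTY, "true"));
  }
}
```

One trade-off worth noting: a final flag is fixed at class-load time, so this form gives up the runtime `enable(boolean)` toggle that the PR adds in exchange for configuration without code changes.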
niuge01 commented on issue #3643: [CARBONDATA-3726] Add CDC and Data Dedup example
URL: https://github.com/apache/carbondata/pull/3643#issuecomment-596361719 LGTM
CarbonDataQA1 commented on issue #3643: [CARBONDATA-3726] Add CDC and Data Dedup example
URL: https://github.com/apache/carbondata/pull/3643#issuecomment-596395332 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2394/
CarbonDataQA1 commented on issue #3643: [CARBONDATA-3726] Add CDC and Data Dedup example
URL: https://github.com/apache/carbondata/pull/3643#issuecomment-596397641 Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/688/
asfgit closed pull request #3643: [CARBONDATA-3726] Add CDC and Data Dedup example
URL: https://github.com/apache/carbondata/pull/3643