GitHub user xuchuanyin opened a pull request:
https://github.com/apache/carbondata/pull/1175

[CARBONDATA-1281] Support multiple temp dirs for writing temp files while loading

# Modifications

This feature mainly focuses on avoiding disk hot-spots during a single massive data load. Changes are made in two parts:

1. randomly choose a YARN local folder to write the sort temp file in the sort process;
2. randomly choose a YARN local folder to write the carbondata file in the write process.

# Usage

To enable this feature, the user should set `carbon.using.multi.temp.dir=true` and `carbon.use.local.dir=true`.

# Performance

In my case, this feature improves the loading performance from 35M/s/node to 70+M/s/node.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xuchuanyin/carbondata feature_multiple_temp_dir_for_load

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1175.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1175

----
commit 35cba3d49ccb536328545fd705020edcc50189af
Author: xuchuanyin <[hidden email]>
Date:   2017-07-05T12:56:33Z

    Merge pull request #3 from apache/master

    sync code

commit c778f22dbda7f0b36a69a10db5d008744cdb99f3
Author: ravikiran23 <[hidden email]>
Date:   2017-06-22T12:48:19Z

    [CARBONDATA-1214] Changing the delete syntax as in the hive for segment deletion

    This closes #1078

commit 46a53962d4913b06d0f4be61c48053106da4a108
Author: jackylk <[hidden email]>
Date:   2017-07-03T13:54:39Z

    modify compare test fix fix style change table

commit a54dda9b7dcd0749e6b187f2579a5b867f421eaf
Author: jatin <[hidden email]>
Date:   2017-07-05T12:04:19Z

    [CARBONDATA-1266][PRESTO] Fixed issue for non existing table

    This closes #1137

commit 32e3d1f7a4ff09309ebf0d4e7315e5dbef2765b4
Author: xuchuanyin <[hidden email]>
Date:   2017-07-05T13:00:45Z

    [CARBONDATA-1267] Add short_int case branch in DeltaIntegalCodec

    This closes #1139

commit a479d1672bfbe1c2b92b44737f93a2e14cbef2a7
Author: Geetika Gupta <[hidden email]>
Date:   2017-07-06T06:14:06Z

    [CARBONDATA-1269][PRESTO] Fixed bug for select operation in non existing database

    This closes #1143

commit b69771370f95926999b7cb0501c38ae7202cebf3
Author: sgururajshetty <[hidden email]>
Date:   2017-07-06T06:23:38Z

    [CARBONDATA-1270] Documentation update for Delete by ID and DATE syntax and example

    This closes #1141

commit 4d2d518a3d5176d70f81d83c5782ef56f7118800
Author: ashok.blend <[hidden email]>
Date:   2017-07-08T10:57:41Z

    [CARBONDATA-1282] Choose BatchedDatasource scan only if schema fits codegen

    This closes #1148

commit 5e74e50547a98e2452b296f839a020d978225cf2
Author: chenliang613 <[hidden email]>
Date:   2017-07-08T15:53:02Z

    [CARBONDATA-1280] Solve HiveExample dependency issues and fix spark 1.6 CI

    This closes #1150

commit 79a777052a8e5ead117f7424a6daf974dc405c26
Author: Liang Chen <[hidden email]>
Date:   2017-07-08T22:32:10Z

    fix doc, remove invalid description

    This closes #1151

commit 16938770f54eaf5ec646df2692418e726d9defd4
Author: kunalkapoor <[hidden email]>
Date:   2017-07-10T06:42:10Z

    [CARBONDATA-1229] acquired meta.lock during table drop

    This closes #1153

commit f5e4bb083f166d62de38267c7503ab3609e1fcca
Author: czg516516 <[hidden email]>
Date:   2017-07-11T03:01:47Z

    [CARBONDATA-1289] remove unused method

    This closes #1157

commit 7e433115a925e6a477d8de157596cc6eb16dfa17
Author: xuchuanyin <[hidden email]>
Date:   2017-07-15T03:14:46Z

    fix confilicts

commit 76b071846c226348b3a934edfda63454ee973254
Author: xuchuanyin <[hidden email]>
Date:   2017-07-15T06:12:35Z

    Support multiple temp dirs for writing files while loading data

----

---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
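As an illustration of the random folder choice described under Modifications in the pull request above, here is a minimal Java sketch. It is not the code from this pull request: the class name `TempDirChooser`, its methods, and the example paths are hypothetical, and it assumes the candidate local folders arrive as a comma-separated string.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; not taken from the CarbonData code base.
final class TempDirChooser {
    private TempDirChooser() {}

    // Splits a comma-separated list of directories (assumed input format),
    // trimming whitespace and dropping empty entries.
    static List<String> parse(String commaSeparated) {
        return Arrays.stream(commaSeparated.split(","))
                .map(String::trim)
                .filter(dir -> !dir.isEmpty())
                .collect(Collectors.toList());
    }

    // Picks one directory uniformly at random, so that successive temp files
    // are spread over all configured disks instead of piling up on one.
    static String choose(List<String> localDirs, Random random) {
        if (localDirs == null || localDirs.isEmpty()) {
            throw new IllegalArgumentException("at least one local dir is required");
        }
        return localDirs.get(random.nextInt(localDirs.size()));
    }

    public static void main(String[] args) {
        // Example paths are made up.
        List<String> dirs = parse("/data1/yarn/local,/data2/yarn/local,/data3/yarn/local");
        Random random = new Random();
        for (int i = 0; i < 5; i++) {
            System.out.println("sort temp file " + i + " -> " + choose(dirs, random));
        }
    }
}
```

A uniform random pick keeps the chooser stateless, which suits many concurrent writers; a round-robin counter would spread files more evenly but needs shared state.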
Github user asfgit commented on the issue:
https://github.com/apache/carbondata/pull/1175
Can one of the admins verify this patch?
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1175
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3088/
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1175
Build Success with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/498/
Github user chenliang613 commented on the issue:
https://github.com/apache/carbondata/pull/1175
@xuchuanyin please squash all commits to one commit.
Github user xuchuanyin closed the pull request at:
https://github.com/apache/carbondata/pull/1175
Github user xuchuanyin commented on the issue:
https://github.com/apache/carbondata/pull/1175
@chenliang613 OK, I'll raise another PR. #1177