[GitHub] carbondata pull request #1175: [CARBONDATA-1281] Support multiple temp dirs ...

qiuchenjian-2
GitHub user xuchuanyin opened a pull request:

    https://github.com/apache/carbondata/pull/1175

    [CARBONDATA-1281] Support multiple temp dirs for writing temp files while loading

    # Modifications
    This feature mainly focuses on avoiding disk hot-spots during a single massive data load. Changes are made in two parts:
   
    1. Randomly choose a YARN local folder for writing sort temp files in the sort process;
   
    2. Randomly choose a YARN local folder for writing carbondata files in the write process.
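
The random selection described in the two steps above could look roughly like the following. This is a minimal sketch, not the code in this PR: the class and method names are hypothetical, and it assumes the candidate folders arrive as a comma-separated string (for example, the value of the `LOCAL_DIRS` environment variable that YARN hands to containers).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical helper for illustration; not part of the CarbonData code base.
class TempDirChooser {

    // Splits a comma-separated directory list, dropping blank entries.
    static List<String> parseLocalDirs(String commaSeparated) {
        List<String> dirs = new ArrayList<>();
        if (commaSeparated == null) {
            return dirs;
        }
        for (String dir : commaSeparated.split(",")) {
            String trimmed = dir.trim();
            if (!trimmed.isEmpty()) {
                dirs.add(trimmed);
            }
        }
        return dirs;
    }

    // Picks one directory uniformly at random so that concurrent writers
    // spread their temp files across disks instead of sharing one.
    static String chooseTempDir(List<String> localDirs, Random random) {
        if (localDirs.isEmpty()) {
            throw new IllegalArgumentException("no local directories configured");
        }
        return localDirs.get(random.nextInt(localDirs.size()));
    }
}
```

Each sort or write task would call `chooseTempDir` once and keep the result for all files it writes; with many tasks per node, the load then tends to even out across the configured disks.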
   
    # Usage
   
    To enable this feature, set both `carbon.using.multi.temp.dir=true` and `carbon.use.local.dir=true`.
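
Because the feature depends on both settings, a guard along these lines may help make the requirement explicit. It is a sketch using plain `java.util.Properties`; the class name is hypothetical, and treating a missing key as `false` is an assumption, not something the PR states.

```java
import java.util.Properties;

// Hypothetical check for illustration; not the CarbonData implementation.
class MultiTempDirConfig {
    static final String USE_LOCAL_DIR = "carbon.use.local.dir";
    static final String USE_MULTI_TEMP_DIR = "carbon.using.multi.temp.dir";

    // Returns true only when both flags are "true".
    // A missing key is treated as "false" (an assumption of this sketch).
    static boolean isMultiTempDirEnabled(Properties props) {
        return Boolean.parseBoolean(props.getProperty(USE_LOCAL_DIR, "false"))
            && Boolean.parseBoolean(props.getProperty(USE_MULTI_TEMP_DIR, "false"));
    }
}
```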
   
    # Performance
    In my case, this feature improves the loading performance from 35M/s/node to 70+M/s/node


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xuchuanyin/carbondata feature_multiple_temp_dir_for_load

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1175.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1175
   
----
commit 35cba3d49ccb536328545fd705020edcc50189af
Author: xuchuanyin <[hidden email]>
Date:   2017-07-05T12:56:33Z

    Merge pull request #3 from apache/master
   
    sync code

commit c778f22dbda7f0b36a69a10db5d008744cdb99f3
Author: ravikiran23 <[hidden email]>
Date:   2017-06-22T12:48:19Z

    [CARBONDATA-1214] Changing the delete syntax as in the hive for segment deletion
   
    This closes #1078

commit 46a53962d4913b06d0f4be61c48053106da4a108
Author: jackylk <[hidden email]>
Date:   2017-07-03T13:54:39Z

    modify compare test
   
    fix
   
    fix style
   
    change table

commit a54dda9b7dcd0749e6b187f2579a5b867f421eaf
Author: jatin <[hidden email]>
Date:   2017-07-05T12:04:19Z

    [CARBONDATA-1266][PRESTO] Fixed issue for non existing table
   
    This closes #1137

commit 32e3d1f7a4ff09309ebf0d4e7315e5dbef2765b4
Author: xuchuanyin <[hidden email]>
Date:   2017-07-05T13:00:45Z

    [CARBONDATA-1267] Add short_int case branch in DeltaIntegalCodec
   
    This closes #1139

commit a479d1672bfbe1c2b92b44737f93a2e14cbef2a7
Author: Geetika Gupta <[hidden email]>
Date:   2017-07-06T06:14:06Z

    [CARBONDATA-1269][PRESTO] Fixed bug for select operation in non existing database
   
    This closes #1143

commit b69771370f95926999b7cb0501c38ae7202cebf3
Author: sgururajshetty <[hidden email]>
Date:   2017-07-06T06:23:38Z

    [CARBONDATA-1270] Documentation update for Delete by ID and DATE syntax and example
   
    This closes #1141

commit 4d2d518a3d5176d70f81d83c5782ef56f7118800
Author: ashok.blend <[hidden email]>
Date:   2017-07-08T10:57:41Z

    [CARBONDATA-1282] Choose BatchedDatasource scan only if schema fits codegen
   
    This closes #1148

commit 5e74e50547a98e2452b296f839a020d978225cf2
Author: chenliang613 <[hidden email]>
Date:   2017-07-08T15:53:02Z

    [CARBONDATA-1280] Solve HiveExample dependency issues and fix spark 1.6 CI
   
    This closes #1150

commit 79a777052a8e5ead117f7424a6daf974dc405c26
Author: Liang Chen <[hidden email]>
Date:   2017-07-08T22:32:10Z

    fix doc, remove invalid description
   
    This closes #1151

commit 16938770f54eaf5ec646df2692418e726d9defd4
Author: kunalkapoor <[hidden email]>
Date:   2017-07-10T06:42:10Z

    [CARBONDATA-1229] acquired meta.lock during table drop
   
    This closes #1153

commit f5e4bb083f166d62de38267c7503ab3609e1fcca
Author: czg516516 <[hidden email]>
Date:   2017-07-11T03:01:47Z

    [CARBONDATA-1289] remove unused method
   
    This closes #1157

commit 7e433115a925e6a477d8de157596cc6eb16dfa17
Author: xuchuanyin <[hidden email]>
Date:   2017-07-15T03:14:46Z

    fix confilicts

commit 76b071846c226348b3a934edfda63454ee973254
Author: xuchuanyin <[hidden email]>
Date:   2017-07-15T06:12:35Z

    Support multiple temp dirs for writing files while loading data

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---

[GitHub] carbondata issue #1175: [CARBONDATA-1281] Support multiple temp dirs for wri...

qiuchenjian-2
Github user asfgit commented on the issue:

    https://github.com/apache/carbondata/pull/1175
 
    Can one of the admins verify this patch?



[GitHub] carbondata issue #1175: [CARBONDATA-1281] Support multiple temp dirs for wri...

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1175
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3088/




[GitHub] carbondata issue #1175: [CARBONDATA-1281] Support multiple temp dirs for wri...

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1175
 
    Build Success with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/498/




[GitHub] carbondata issue #1175: [CARBONDATA-1281] Support multiple temp dirs for wri...

qiuchenjian-2
Github user chenliang613 commented on the issue:

    https://github.com/apache/carbondata/pull/1175
 
    @xuchuanyin please squash all commits into one commit.



[GitHub] carbondata pull request #1175: [CARBONDATA-1281] Support multiple temp dirs ...

qiuchenjian-2
Github user xuchuanyin closed the pull request at:

    https://github.com/apache/carbondata/pull/1175



[GitHub] carbondata issue #1175: [CARBONDATA-1281] Support multiple temp dirs for wri...

qiuchenjian-2
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/1175
 
    @chenliang613 OK, I'll raise another PR: #1177.

