Discussion about using multiple local directories to improve data loading performance


David CaiQiang
Hi All,
  For each data load, we write the sorted temp files into a single, separate local directory. I think this is a bottleneck for data loading. To improve data loading performance, each data load needs to use multiple local directories spread across multiple disks.
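
To make the idea concrete, here is a minimal sketch in Python. It is not the project's actual loading code: the names `TempDirRotator` and `write_sorted_chunks` are hypothetical, and the round-robin policy is only one assumed way to spread the sorted temp files over several directories, each of which would sit on a different disk.

```python
import itertools
import os
import tempfile


class TempDirRotator:
    """Hypothetical helper: hands out temp-file paths across several
    local directories in round-robin order. Not thread-safe as written."""

    def __init__(self, local_dirs):
        if not local_dirs:
            raise ValueError("at least one local directory is required")
        self._dirs = list(local_dirs)
        for d in self._dirs:
            os.makedirs(d, exist_ok=True)  # create missing directories
        self._cycle = itertools.cycle(self._dirs)
        self._counter = itertools.count()

    def next_path(self, prefix="sorttemp"):
        # Pick the next directory in rotation and build a unique file name.
        directory = next(self._cycle)
        return os.path.join(directory, f"{prefix}_{next(self._counter)}.tmp")


def write_sorted_chunks(chunks, rotator):
    """Sort each chunk and write it to its own temp file, rotating directories."""
    paths = []
    for chunk in chunks:
        path = rotator.next_path()
        with open(path, "w", encoding="utf-8") as f:
            for row in sorted(chunk):
                f.write(f"{row}\n")
        paths.append(path)
    return paths


if __name__ == "__main__":
    # Two subdirectories stand in for directories on two separate disks.
    with tempfile.TemporaryDirectory() as base:
        dirs = [os.path.join(base, "disk1"), os.path.join(base, "disk2")]
        rotator = TempDirRotator(dirs)
        for p in write_sorted_chunks([[3, 1, 2], [9, 8], [5, 4]], rotator):
            print(p)
```

With one directory in the list this behaves like the current single-directory approach. Any gain from more directories depends on them really being on separate physical disks.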
Best Regards
David Cai

Re: Discussion about using multiple local directories to improve data loading performance

Jacky Li
Yes, I think it is a good feature to have. Please feel free to create a JIRA issue and a pull request.

Regards,
Jacky

> On October 9, 2016, at 12:04 AM, caiqiang <[hidden email]> wrote:
>
> Hi All,
>  For each data load, we write the sorted temp files into a single, separate local directory. I think this is a bottleneck for data loading. To improve data loading performance, each data load needs to use multiple local directories spread across multiple disks.




Re: Discussion about using multiple local directories to improve data loading performance

Liang Chen
Administrator
In reply to this post by David CaiQiang
+1 for the solution.

Regards
Liang

QiangCai wrote
Hi All,
  For each data load, we write the sorted temp files into a single, separate local directory. I think this is a bottleneck for data loading. To improve data loading performance, each data load needs to use multiple local directories spread across multiple disks.

RE: Discussion about using multiple local directories to improve data loading performance

Jihong Ma
In reply to this post by Jacky Li
Agreed, this would help boost performance.

Jenny

-----Original Message-----
From: Jacky Li [mailto:[hidden email]]
Sent: Saturday, October 08, 2016 9:09 AM
To: [hidden email]
Subject: Re: Discussion about using multiple local directories to improve data loading performance

Yes, I think it is a good feature to have. Please feel free to create a JIRA issue and a pull request.

Regards,
Jacky

> On October 9, 2016, at 12:04 AM, caiqiang <[hidden email]> wrote:
>
> Hi All,
>  For each data load, we write the sorted temp files into a single, separate local directory. I think this is a bottleneck for data loading. To improve data loading performance, each data load needs to use multiple local directories spread across multiple disks.