Discussion about using multiple local directories to improve data loading performance

David CaiQiang
Hi All,
  For each data load, we write the sorted temp files into a single local directory of its own. I think this is a bottleneck for data loading. We need to spread the temp files of each data load across multiple local directories on multiple disks to improve data loading performance.
Best Regards
David Cai
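
To illustrate the proposal, here is a minimal sketch of spreading sorted temp files round-robin over several local directories. It is an assumption-laden illustration, not CarbonData code: `TempDirAllocator`, `write_sorted_chunks`, the `load_<id>` subdirectory, and the `sorttemp_<n>.tmp` file names are all hypothetical, and in a real deployment each directory would sit on a different physical disk.

```python
import itertools
import os
import tempfile
import threading


class TempDirAllocator:
    """Hands out temp-file paths round-robin across several local directories.

    Hypothetical helper: the directory layout and file names are illustrative only.
    """

    def __init__(self, local_dirs):
        if not local_dirs:
            raise ValueError("at least one local directory is required")
        self._cycle = itertools.cycle(list(local_dirs))
        self._lock = threading.Lock()  # writer threads may share one allocator
        self._counter = 0

    def next_temp_path(self, load_id):
        # Pick the next directory in rotation and a unique sequence number.
        with self._lock:
            base = next(self._cycle)
            seq = self._counter
            self._counter += 1
        load_dir = os.path.join(base, f"load_{load_id}")
        os.makedirs(load_dir, exist_ok=True)
        return os.path.join(load_dir, f"sorttemp_{seq}.tmp")


def write_sorted_chunks(chunks, allocator, load_id):
    """Sorts each chunk and writes it to the next directory in the rotation."""
    paths = []
    for chunk in chunks:
        path = allocator.next_temp_path(load_id)
        with open(path, "w") as f:
            for row in sorted(chunk):
                f.write(f"{row}\n")
        paths.append(path)
    return paths


if __name__ == "__main__":
    # Stand-in for directories on separate disks: plain subdirectories of a temp root.
    with tempfile.TemporaryDirectory() as root:
        dirs = [os.path.join(root, f"disk{i}") for i in range(3)]
        allocator = TempDirAllocator(dirs)
        for p in write_sorted_chunks([[3, 1, 2], [9, 8], [5, 4], [7, 6]], allocator, "0"):
            print(p)
```

Round-robin is only one possible placement policy; picking the directory with the most free space, or the least pending I/O, would be alternatives worth measuring.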

Re: Discussion about using multiple local directories to improve data loading performance

Jacky Li
Yes, I think it is a good feature to have. Please feel free to create a JIRA issue and a pull request.

Regards,
Jacky

> On October 9, 2016, at 12:04 AM, caiqiang <[hidden email]> wrote:
>
> Hi All,
>  For each data load, we write the sorted temp files into a single local directory of its own. I think this is a bottleneck for data loading. We need to spread the temp files of each data load across multiple local directories on multiple disks to improve data loading performance.




Re: Discussion about using multiple local directories to improve data loading performance

Liang Chen
Administrator
In reply to this post by David CaiQiang
+1 for the solution.

Regards
Liang

QiangCai wrote:
> Hi All,
>  For each data load, we write the sorted temp files into a single local directory of its own. I think this is a bottleneck for data loading. We need to spread the temp files of each data load across multiple local directories on multiple disks to improve data loading performance.

RE: Discussion about using multiple local directories to improve data loading performance

Jihong Ma
In reply to this post by Jacky Li
Agreed, this will help boost performance.

Jenny

-----Original Message-----
From: Jacky Li [mailto:[hidden email]]
Sent: Saturday, October 08, 2016 9:09 AM
To: [hidden email]
Subject: Re: Discussion about using multiple local directories to improve data loading performance

Yes, I think it is a good feature to have. Please feel free to create a JIRA issue and a pull request.

Regards,
Jacky

> On October 9, 2016, at 12:04 AM, caiqiang <[hidden email]> wrote:
>
> Hi All,
>  For each data load, we write the sorted temp files into a single local directory of its own. I think this is a bottleneck for data loading. We need to spread the temp files of each data load across multiple local directories on multiple disks to improve data loading performance.