Non-sucky way to do SFTP from .net
-
So... We're investigating reports of awful, slow performance of SFTP file transfers. I'll be damned if they're not right. Our code does a sample dataset in a bit over an hour, where FileZilla does the same transfer in 16 minutes. It ain't a compression thing, since the test set is incompressible noise.
Currently, we use SSH.NET, which is an opensores library. Performance appears to suck. A lot.
Any other suggestions? Or are we going to end up farming out to psftp commandline like fucking Unix kids?
-
Are you using parallelization? Fz does that by default IIRC.
-
My understanding is that Fz only parallelizes at the file level. Since we have one big file, that shouldn't affect anything.
-
Trying the newfangled WinSCP library.
-
The WinSCP library works fine from PowerShell, in my experience. I've never tried it in C#, however.
It works pretty well, but it's wonky, IMO. All the library does is send commands to the executable that's bundled with the .Net library, so you need both files and you'll actually see the .exe show up on the process list. I've never had a problem with scripting with it, but I can easily see how you might.
-
The API is sufficiently different that we'll have to do some major redesign of our plugin architecture to avoid having to buffer shit to disk. Sigh.
But it seems to perform hugely better.
-
Only other thing I can think of would be to check the buffer size on the SftpClient. It wouldn't surprise me if SSH.NET defaults to small buffers because the library is written to support all SSH functionality, so it assumes you'll be firing off shell commands. Then again, I don't know how you're generating your data!
-
I've used SharpSSH in the past. I never pushed it very hard, so I can't say if it performs well for large files, but it's a viable alternative.
-
We tried a whole spectrum of buffer sizes. No statistically significant difference. It just doesn't seem to be particularly speedy.
-
It just doesn't seem to be particularly speedy.
The bottleneck might be crypto implemented in C#. You could try P/Invoking into libssh2.
-
Difficulty: We do pure-C# crypto all the time (Bouncycastle).
-
It sounds like you already decided against it, but I'll second the recommendation of the WinSCP library. 4 years ago, I had to write a script that grabbed data from an SFTP server and passed it into another system. I implemented it as a C# console app, so I can confirm that it works fine with C#.
I looked for a native .NET SFTP library, but at the time, most of them sucked and that was the option I settled on. I wrote this process about 4 years ago, but it sounds like not much has changed.
-
Oh, we'll probably do it. I just need to get out my big red marker and refactor the fuck out of the architecture so that WinSCP fits in the Stream shaped box.
-
My understanding is that Fz only parallelizes at the file level. Since we have one big file, that shouldn't affect anything.
Fz can use multiple streams to grab parts of a file. I used it once to get a driver off some Chinese website with miserable performance; opened ten or so simultaneous connections. I'm not sure if that's what you meant or not.
-
Interesting.
At any rate, WinSCP gets close enough to the performance level manglement wants.
-
Interesting.
At any rate, WinSCP gets close enough to the performance level manglement wants.
Yeah, all you really do is open multiple simultaneous connections and have each one request a different range of bytes. The FTP protocol's supported this for ages.
If you've already got a working alternative, though, the point is
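For anyone who wants to see the "each connection requests a different range of bytes" idea spelled out, here's a rough, language-neutral sketch in Python. It only does the bookkeeping (which connection gets which offset and length); `split_ranges` is a made-up helper name, not something from SSH.NET, WinSCP, or FileZilla, and the actual transfer calls are left out.

```python
def split_ranges(size, parts):
    """Split `size` bytes into up to `parts` contiguous (offset, length) ranges.

    Hypothetical helper for illustration: each tuple is the slice of the
    file that one connection would be asked to fetch.
    """
    if size < 0 or parts < 1:
        raise ValueError("size must be >= 0 and parts must be >= 1")
    base, extra = divmod(size, parts)
    ranges = []
    offset = 0
    for index in range(parts):
        # The first `extra` ranges take one additional byte each, so the
        # lengths add up to exactly `size`.
        length = base + (1 if index < extra else 0)
        if length == 0:
            break  # more connections than bytes; nothing left to hand out
        ranges.append((offset, length))
        offset += length
    return ranges
```

The ranges are contiguous and non-overlapping, so whatever each connection brings back can be written straight to its own offset in the destination file.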
-
Last time I had to solve this problem (and you're going to hate this solution), I distributed the WinSCP DLL along with my project and created a helper file full of pinvokes for it.
Turned out to be a lot easier to implement than I imagined.
I might still have the code somewhere if you need it.
EDIT: now that I've read the thread it looks like at least 3 people have independently decided the WinSCP route was the best. Also I mis-remembered: you don't need to use pinvokes because WinSCP provides their own assembly.
-
The FTP protocol's supported this for ages.
Yea, but SFTP is not based on FTP... it's an entirely different protocol.
-
Yea, but SFTP is not based on FTP... it's an entirely different protocol.
The spec apparently allows for reading ranges, though, so it sounds like you could still open multiple connections: http://tools.ietf.org/html/draft-ietf-secsh-filexfer-13#section-8.2
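To make that concrete: an SFTP read request carries a file handle, an offset, and a length, so independent workers can each pull a different slice and the pieces can be stitched back together in order. Below is a minimal sketch of that pattern in Python. It is only a simulation under stated assumptions: `read_range` reads from a local temp file as a stand-in for a ranged read over its own connection, and all the function names are invented for the example rather than taken from any SFTP library.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def read_range(path, offset, length):
    # Stand-in for one ranged read on its own connection (assumption):
    # open an independent handle, seek to the offset, read `length` bytes.
    with open(path, "rb") as handle:
        handle.seek(offset)
        return handle.read(length)


def fetch_in_chunks(path, chunk_size, workers=4):
    size = os.path.getsize(path)
    offsets = range(0, size, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in submission order, so the chunks come
        # back sorted by offset even if the reads finish out of order.
        chunks = pool.map(lambda off: read_range(path, off, chunk_size), offsets)
        return b"".join(chunks)


def demo():
    # Incompressible noise, like the test set described in the thread.
    payload = os.urandom(1_000_003)
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
    try:
        return fetch_in_chunks(tmp.name, chunk_size=64 * 1024) == payload
    finally:
        os.remove(tmp.name)
```

Calling `demo()` checks that the reassembled bytes match the original. Whether this actually speeds up a real transfer depends on the server and the client library, which this sketch says nothing about.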
-
Either way, the lib we were using didn't support it.