Shirt Pocket Discussions

Shirt Pocket Discussions (https://www.shirt-pocket.com/forums/index.php)
-   General (https://www.shirt-pocket.com/forums/forumdisplay.php?f=6)
-   -   Terrible performance with folders with 5000+ files? (https://www.shirt-pocket.com/forums/showthread.php?t=914)

xochi 12-12-2005 08:19 PM

Terrible performance with folders with 5000+ files?
 
I'm cloning a boot drive to an external firewire drive using "Smart Copy". Here is what happens:

1. SD churns along happily, reaching effective copy speeds of nearly 1 gig/second (processing about 3000 files/second). 99% of my files are not updated, so this speed makes sense. Everything is good.

2. After processing about 150,000 files, suddenly SD hits some mailing list mail in my mail folder (verified by doing 'sudo lsof | grep SDCopy'). These files are not up to date and do need to be copied. At this point, SD drops to about 10 files per second. I can watch the file counter slowly count up. These are small files (e.g. 8k each), but since they are mailing list folders, they may have 5000 or 10,000 messages in each folder.

Something about this seems to choke SD.

I'm guessing that some of your code does not deal gracefully with folders of 5000+ items?
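For anyone trying to find which mailboxes are this large, here's a small sketch. The function name is mine and the Mail path in the usage comment is just an example; adjust both to taste:

```shell
# Print directories containing at least $2 files (default 5000),
# one "count<TAB>path" line per offender.
count_large_dirs() {
  find "$1" -type d | while read -r d; do
    n=$(find "$d" -maxdepth 1 -type f | wc -l)
    if [ "$n" -ge "${2:-5000}" ]; then
      printf '%s\t%s\n' "$n" "$d"
    fi
  done
}

# Usage (example path):
#   count_large_dirs "$HOME/Library/Mail" 5000
```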

dnanian 12-13-2005 09:04 AM

Hi, xochi. I believe you asked this question in email, and I responded there as well. Here's what I said:

I don't think we have any particular problem with large folders: we're just stepping through the file set with the standard fts(3) library routines. But it's quite possible that OS X itself has more difficulty with large folders.

We, though, don't deal with it as a "folder full of files", but rather as individual files. Whether they're in a single folder or not won't have an effect on the speed of the code on "our side" of the copy/check operation.

Hope that helps!

xochi 12-13-2005 12:49 PM

If you want to try to reproduce it, do this:

1. Do a backup.
2. Find a Mail folder with 5000+ messages. Rename it in Mail.
3. Do another backup, using the "Smart Copy" option.

My issue here is that in this particular case, when the performance drops off to nearly zero, it makes "Smart Copy" actually slower than the regular dumb copy.

You say that the fact that the files are in a folder is irrelevant. If this is true, then you must be tracking the files by some other id. I'm wondering if perhaps the slowdown is when you have two files that have the same ID (i.e. they are the same file object) but they have different paths (because some joker like me renamed the enclosing folder). Maybe you can have your programmer check this particular case to see if there is some inefficiency?

My impression is that in this case, SuperDuper suddenly goes from being disk-bound to CPU-bound, and performance suffers as a result.
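SuperDuper's actual Smart Copy logic isn't public, but the rename effect I'm describing is easy to reproduce with any path-based comparison. Here's a toy stand-in (smart_copy is my own made-up sketch, not SuperDuper's code):

```shell
# Toy path-based incremental copy: a file is copied only if it is
# missing from dst or strictly newer in src. Prints each path copied.
smart_copy() {
  src=$1; dst=$2
  (cd "$src" && find . -type f) | while read -r f; do
    if [ ! -e "$dst/$f" ] || [ "$src/$f" -nt "$dst/$f" ]; then
      mkdir -p "$dst/${f%/*}"
      cp -p "$src/$f" "$dst/$f"
      printf '%s\n' "$f"
    fi
  done
}
```

Run it once, rename a folder inside src, and run it again: every file under the renamed folder gets re-copied, because the comparison keys on the path, not on any file identity.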

dnanian 12-13-2005 12:58 PM

xochi --

I have a mail folder with *well* over that number of messages (multiple folders with over 30K messages), and don't have this particular issue there... I smart update it daily. I think there's likely something else going on with your particular situation, which is why I suggested it might be something in the OS's handling of your folder.

We are not tracking the files with any ID. We're not looking them up, or doing anything with a database. There's no reason a large number of files in a given folder would be slower than a small number, unless there's a system issue involved...

xochi 12-13-2005 02:15 PM

Repeating my tests on a drive with more free space seems to relieve the problem. So maybe fragmentation was to blame? See http://www.shirt-pocket.com/forums/s...=4667#post4667

xochi 12-13-2005 02:24 PM

Quote:

Originally Posted by dnanian
xochi --

I have a mail folder with *well* over that number of messages (multiple folders with over 30K messages), and don't have this particular issue there... I smart update it daily. I think there's likely something else going on with your particular situation, which is why I suggested it might be something in the OS's handling of your folder.

Dave -- if you have the time, please try renaming several of those folders (containing, say, 50,000 files) and then doing another Smart Update. Do you see a drastic slowdown in performance? If not, then it may very well be some issue on my end.

xochi 12-13-2005 02:27 PM

Aha! Browsing my system logs from yesterday (when I was seeing the bug) I came across a few of these:

Dec 12 15:52:19 MikeG4 DirectoryService[41]: Failed Authentication return is being delayed due to over five recent auth failures for username: mike

That sounds to me like an OS bug that might cause a drastic slowdown, no? I don't know what DirectoryService is, but it sounds like something involved with filesystem operations to me.

Googling this log message reveals some discussion of cases in which an app that uses authentication suddenly fails to authenticate properly.

Anyone else see this in their system.log file?
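If you want to check, the message is easy to grep for. A small sketch (the function name is mine, and the log path varies by system):

```shell
# Count delayed-auth events per username in a log file.
auth_delays() {
  grep 'Failed Authentication return is being delayed' "$1" \
    | sed 's/.*username: //' | sort | uniq -c
}

# Usage (on the machine in question):
#   auth_delays /var/log/system.log
```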

dnanian 12-13-2005 03:39 PM

No drastic slowdown, xochi... other than what would be expected (and that you saw in your own test)...

sjk 12-30-2005 06:49 PM

Quote:

Originally Posted by xochi
Dec 12 15:52:19 MikeG4 DirectoryService[41]: Failed Authentication return is being delayed due to over five recent auth failures for username: mike

Anyone else see this in their system.log file?

Yep, quite often (and only) on systems using Fast User Switching so I figured it's related to that.

