UNIX Socket FAQ

A forum for questions and answers about network programming on Linux and all other Unix-like systems


#1 2008-07-29 02:37 PM

developer_chris
Member
Registered: 2008-06-05
Posts: 21

Re: locking a file with flock

Hi,

I am trying to write a small caching script. The script copies a file to the cache folder (from an NFS mount).

Because there is the possibility of another copy of the script trying to copy the same file, I create a lock file and flock() it.

If I use ab (Apache Bench) to run the script with many concurrent processes, it seems to fail: ab reports the number of bytes returned as less than the size of the file. So it would appear that the locks are being dropped before the copy is complete, and thus an incomplete file is being sent to the user.

Here is the copy function (in PHP):

function CopyFileToCache($src, $cache, $dst) {

	// create a lock file to prevent multiple access
	$wouldblock = false;
	$lock = fopen($cache.$dst.".lock", "w+");
	if ($lock === false)
		return "could not open lock file";
	flock($lock, LOCK_EX | LOCK_NB, $wouldblock);

	if ($wouldblock) { // another process is handling this file; we can wait for it to finish
		flock($lock, LOCK_EX); // 2nd and later processes wait here
		fclose($lock);
		// assume the other process succeeded; do more tests if this is not good enough
		return null;
	}

	// use the shell to copy: faster than PHP and doesn't bloat the memory
	// (escapeshellarg() guards against spaces and shell metacharacters)
	$c = "cp -f " . escapeshellarg($src) . " " . escapeshellarg($cache.$dst) . " 2>&1";
	$r = `$c`;

	// release the lock and close the file
	fclose($lock);

	// unlink the lock file
	unlink($cache.$dst.".lock");

	return $r ? $r : null;
}

Can anyone see why the 2nd flock() would return before the copy is complete and the lock is destroyed?

The only thing I can think of is that the backtick operator returns before the copy itself is complete, but that doesn't seem right.

DC


#2 2008-07-29 04:31 PM

i3839
Oddministrator
From: Amsterdam
Registered: 2003-06-07
Posts: 2,230

Re: locking a file with flock

IIRC, NFS didn't support file locking. Maybe the newer versions do, dunno.

because there is the possibility of another copy of the script trying to copy the same file I create a lock file and flock it

If the concurrent processes just read the file and don't write to it, then you don't
need locking. You only need locking to prevent another process from modifying the
file while you read it.

You could check the mtime, or even a checksum, to know whether you need to copy
the file to cache or not.


#3 2008-07-29 06:44 PM

RobSeace
Administrator
From: Boston, MA
Registered: 2002-06-12
Posts: 3,826
Website

Re: locking a file with flock

IIRC, NFS didn't support file locking. Maybe the newer versions do, dunno.

NFS doesn't support flock()-style locking...  However, it should support fcntl()-style
locking (and lockf())...  I know nothing about PHP, so I don't know if its flock() really
corresponds to the standard libc flock() or if it's implemented on top of fcntl()...

However, what's being done seems utterly strange...  You're creating a "*.lock" file,
then flock()'ing that...  Most normal people would lock the real file you're modifying,
perhaps in addition to creating a "*.lock" file for non-flock()/fcntl()-aware apps and
such...  But, locking the "*.lock" file itself seems utterly strange...  And, typically, you
would write at least your PID into the "*.lock" file, to allow other apps to test for
stale locks...  (But, again, that's something that only really works properly on a
single system, not over NFS to a remote system...)  If you just want to use the
simple existence of the lock file as the key, then you should be able to tell that with
a simple O_EXCL type open() (I can't tell you how to do it with PHP, sorry)...  But,
then you run the risk of crashed apps leaving stale lock files...
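A minimal PHP sketch of that O_EXCL-style open, for reference (the path is illustrative): fopen() mode "x" maps to open(2) with O_CREAT | O_EXCL, so it fails if the file already exists, and exactly one process wins the create.

```php
<?php
// Atomic lock-file creation via fopen() mode "x" (O_CREAT | O_EXCL).
$lockPath = sys_get_temp_dir() . "/cachedemo.lock"; // illustrative path
@unlink($lockPath);                  // clean slate for the demo

$fp = fopen($lockPath, "x");         // we win: the file did not exist
fwrite($fp, (string)getmypid());     // record our PID for staleness checks
fclose($fp);

// Any second attempt now fails, so only one process "owns" the copy:
var_dump(@fopen($lockPath, "x"));    // bool(false)

unlink($lockPath);                   // release by removing the file
```

As noted above, this still leaves the stale-lock problem if the owning process crashes before the unlink().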

But, since you're just spawning off "cp" to do your real work anyway, you might as
well just use the "lockfile" command to create your lockfile for you, then "rm" it after...
It has the ability to remove old stale lockfiles, and supposedly works over NFS fine...


#4 2008-07-30 01:28 AM

developer_chris
Member
Registered: 2008-06-05
Posts: 21

Re: locking a file with flock

i3839
The concurrent processes can't read the file because it is being copied from the NFS, which of course is much slower than local disk, and thus a concurrent process may get an incomplete version of the file.

RobSeace

The idea of the cache system is to reduce calls to the NFS; the remote machine holding the original files is a critical machine, and the load on it must be kept to a minimum.

So when a script tries to send the file to the remote user, it checks whether it is in the cache; if not, it fetches it from the NFS. But several processes may try to access it at the same time, so I must make sure the file is completely copied before newer processes start to copy it.

Because I am using cp to copy the file to the local cache, I can't flock() it, because it doesn't exist until after the copy. So I create a local lock file; the first process to successfully lock it gets ownership of the copy process. Other processes wait until the flock is released, at which time the newly copied file should be available. That's the theory, anyway.

In practice, the first time I hit it with ab and run multiple concurrent processes, the file returned is only half the size of the original. But if I check the size of the cached file immediately afterwards, it is correct. Meaning the copy (cp) worked fine, and the most likely scenario is that the flock is being released before the file is completely copied, OR lighttpd is for some reason truncating the file, OR ab is misreporting the length.

I thought maybe cp was returning before the file had been completely copied, and therefore the process owning the lock was releasing it too early.



AHHHHH!

Ok guys, thanks for being my sounding board. I worked out what the issue is.
The second process was only checking for file existence before deciding whether to go into the copy process. Because the copy had started, the file existed, so it simply started serving it to the client. But it was incomplete, because the previous process was still copying it.

I will have to check for the lock file as well; if that exists, I need to flock() it. So my reply to i3839 was exactly what was happening.

That means multiple more filesystem calls. :(

An "atomic" copy function would be better; perhaps I'll copy to a temp name and then rename, which would make the copy atomic. That will reduce filesystem calls.
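That pattern could be sketched like this (function name and paths are illustrative): copy into a temporary name in the same directory, then rename(), which atomically replaces the destination on POSIX filesystems, so readers only ever see a missing file or a complete one.

```php
<?php
// Sketch of the copy-to-temp-then-rename idea. Readers either see no
// cached file at all, or a complete one -- never a half-written copy.
function CacheFileAtomically(string $src, string $dstPath): bool {
    // Temp file in the SAME directory, so rename() stays on one
    // filesystem and is therefore atomic.
    $tmp = $dstPath . "." . getmypid() . ".tmp";

    if (!copy($src, $tmp)) {        // PHP's own copy; cp would do too
        @unlink($tmp);
        return false;
    }
    return rename($tmp, $dstPath);  // atomic replace on POSIX
}
```

If two processes race, both do the copy work, but the cached file is always whole; an O_EXCL-style exclusive create, as RobSeace suggested, can be layered on top to avoid the duplicated work.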

DC


#5 2008-07-30 04:12 AM

developer_chris
Member
Registered: 2008-06-05
Posts: 21

Re: locking a file with flock

RobSeace;24912 wrote:

what's being done seems utterly strange...  You're creating a "*.lock" file,
then flock()'ing that...  Most normal people would lock the real file you're modifying,
perhaps in addition to creating a "*.lock" file for non-flock()/fcntl()-aware apps and
such...  But, locking the "*.lock" file itself seems utterly strange...

I have always thought I was a bit strange; I never seem to do things the way "normal" people would.

Now I have confirmation! :rolleyes:


#6 2008-07-30 12:34 PM

RobSeace
Administrator
From: Boston, MA
Registered: 2002-06-12
Posts: 3,826
Website

Re: locking a file with flock

I have always thought I was a bit strange I never seem to do things the way "normal" people would

Heh.  Not that that's necessarily always a bad thing, either...  In fact, I'd say it's quite
often a good thing... ;-)  Normality tends to be highly overrated...

I can't flock it because it doesn't exist until after the copy.

Well, since you're using "cp" to do the copy, you're right; you'd need support within
cp itself to handle the locking for you, in that case...  The way *I* would do it, however,
is to do my own copying (cp's job really isn't that difficult to duplicate), therefore I'd
be creating the destination file, and therefore I could lock it as it's created...  (And,
to solve the dilemma of the reader knowing that the file is in the process of being
written to, you'd use a read-lock/shared-lock, which would only succeed once the
write-lock/exclusive-lock of the writer is released...)
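A rough PHP sketch of that approach (function names are illustrative, and error handling is minimal): the writer holds LOCK_EX on the destination file itself for the whole copy, and readers take LOCK_SH, which only succeeds once the writer releases.

```php
<?php
// Writer: create the cache file ourselves instead of spawning cp,
// holding an exclusive lock on it for the entire copy.
function WriteCacheFile(string $src, string $dstPath): bool {
    $out = fopen($dstPath, "c");        // create without truncating
    if ($out === false) return false;
    flock($out, LOCK_EX);               // readers now block on LOCK_SH
    ftruncate($out, 0);                 // safe to truncate once locked

    $in = fopen($src, "r");
    if ($in !== false) {
        stream_copy_to_stream($in, $out);
        fclose($in);
    }
    fflush($out);
    flock($out, LOCK_UN);               // copy complete; wake readers
    fclose($out);
    return true;
}

// Reader: the shared lock only succeeds once the writer has finished.
function ReadCacheFile(string $path): ?string {
    $fp = @fopen($path, "r");
    if ($fp === false) return null;     // not cached yet
    flock($fp, LOCK_SH);
    $data = stream_get_contents($fp);
    flock($fp, LOCK_UN);
    fclose($fp);
    return $data;
}
```

Note this relies on PHP's flock() working on the local cache disk only, which sidesteps the NFS locking question entirely.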


#7 2008-07-30 01:59 PM

i3839
Oddministrator
From: Amsterdam
Registered: 2003-06-07
Posts: 2,230

Re: locking a file with flock

To summarize, you have one process copying files from NFS to local disk as a
local cache, and multiple other processes can read that cached file if it's there?

What if it isn't there, do they read the NFS version or do they start doing the copying?

So what you're basically doing is adding a local disk cache to NFS?

How do you detect that the NFS file changed, how do you keep everything in sync?

cp + mv should be atomic without any need for extra locking. But copying the
same file multiple times is a bit wasteful, so perhaps an exclusive open on
filename.tmp, used as the lock, would prevent it. You might need a way to be
notified when the cp is done, though; then you're back to flock/semaphore/whatever.
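That combination might look something like this in PHP (a sketch; names, the polling interval, and the retry bound are all illustrative): the exclusive create of filename.tmp elects a single copier, and the losers simply wait for the final name to appear, a crude stand-in for a real flock/semaphore notification.

```php
<?php
// One process wins the exclusive create of the .tmp file and does the
// copy; the losers poll for the final name to appear (a crude stand-in
// for a proper flock/semaphore notification).
function FetchOnce(string $src, string $dstPath): bool {
    $tmp = $dstPath . ".tmp";
    $fp = @fopen($tmp, "x");            // O_EXCL-style: only one winner
    if ($fp !== false) {
        fclose($fp);
        copy($src, $tmp);               // fill the placeholder with data
        return rename($tmp, $dstPath);  // atomic publish
    }
    // Lost the race: wait (bounded) for the winner to publish the file.
    for ($i = 0; $i < 100 && !is_file($dstPath); $i++) {
        usleep(100000);                 // 100 ms
    }
    return is_file($dstPath);
}
```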


#8 2008-07-30 02:22 PM

developer_chris
Member
Registered: 2008-06-05
Posts: 21

Re: locking a file with flock

Well, since you're using "cp" to do the copy, you're right; you'd need support within cp itself to handle the locking for you, in that case... The way *I* would do it, however, is to do my own copying (cp's job really isn't that difficult to duplicate), therefore I'd be creating the destination file, and therefore I could lock it as it's created... (And, to solve the dilemma of the reader knowing that the file is in the process of being written to, you'd use a read-lock/shared-lock, which would only succeed once the write-lock/exclusive-lock of the writer is released...)

Normally that's exactly what I would do, but I wanted to minimise memory usage; PHP tends to consume wads of the stuff. Being a scripted language, any read/write loop will be much slower, so cp is used to make it as fast as possible.

This is a solution to an ongoing problem. If it proves successful I may rewrite it in straight C, in which case a simple read/write loop and an exclusive lock would suffice.

But then I have to interface it to FastCGI, and that's a whole 'nother story.

i3839:
The script is a backend to a caching file server. If the file doesn't exist locally (cached), the first script to run is responsible for fetching it and copying it to the cache. Once the file is copied, or if it already existed in the local cache, the script returns, and the webserver continues and sends the local cached file to the user. Because the files are all static, there is no need to test for the last modification date, although if required it would be trivial to add that test and replace the existing cached file with the new one.

I have modded the script and now it works well.

DC



Powered by FluxBB