UNIX Socket FAQ

A forum for questions and answers about network programming on Linux and all other Unix-like systems

#1 2008-07-29 02:37 PM

developer_chris
Member
Registered: 2008-06-05
Posts: 21

Re: locking a file with flock

Hi

I am trying to write a small caching script. The script copies a file from an NFS mount to a local cache folder.

Because another copy of the script may try to copy the same file at the same time, I create a lock file and flock it.

If I use ab (ApacheBench) to run the script with many concurrent requests, it seems to fail: ab reports the number of bytes returned as less than the size of the file. So it would appear that the locks are being dropped before the copy is complete, and an incomplete file is being sent to the user.

Here is the copy function (in PHP):

function CopyFileToCache($src, $cache, $dst) {

	// create a lock file so only one process copies this file at a time
	$wouldblock = false;
	$lock = fopen($cache.$dst.".lock", "w+");
	flock($lock, LOCK_EX | LOCK_NB, $wouldblock);

	if ($wouldblock) { // another process is copying this file; wait for it to finish
		flock($lock, LOCK_EX); // second and later processes wait here
		fclose($lock);
		// assume the other process succeeded; do more tests if this is not good enough
		return null;
	}

	// use the shell to copy: faster than PHP and doesn't bloat memory
	// (arguments are escaped so odd file names cannot break the command)
	$c = "cp -f ".escapeshellarg($src)." ".escapeshellarg($cache.$dst)." 2>&1";
	$r = `$c`;

	// release the lock (closing the handle drops the flock)
	fclose($lock);

	// remove the lock file
	unlink($cache.$dst.".lock");

	// return cp's output if it printed anything (an error), otherwise null
	return $r ? $r : null;
}

Can anyone see why the second flock would return before the copy is complete and the lock is released?

The only thing I can think of is that the backtick operator returns before the copy itself is complete, but that doesn't seem right.
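
One way to rule out the backtick theory is to time a command that is known to be slow. The sketch below is only an illustration: `sleep 1` stands in for the slow cp from NFS, and the variable names are made up for the test.

```php
<?php
// Check whether the backtick operator waits for the command to finish.
// "sleep 1" is a stand-in for the slow cp; it is not part of the real script.
$start = microtime(true);
$out = `sleep 1; echo done`;
$elapsed = microtime(true) - $start;

// If backticks returned early, $elapsed would be close to zero and $out
// would not yet contain the output printed after the sleep.
printf("elapsed: %.2fs, output: %s", $elapsed, $out);
```

The backtick operator behaves like shell_exec(): it collects the command's output until the command exits, so a plain cp (one that is not put in the background with `&`) has finished by the time control returns to PHP.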

DC

#2 2008-07-29 04:31 PM

i3839
Oddministrator
From: Amsterdam
Registered: 2003-06-07
Posts: 2,239

Re: locking a file with flock

#3 2008-07-29 06:44 PM

RobSeace
Administrator
From: Boston, MA
Registered: 2002-06-12
Posts: 3,847

Re: locking a file with flock

#4 2008-07-30 01:28 AM

developer_chris
Member
Registered: 2008-06-05
Posts: 21

Re: locking a file with flock

i3839:

The concurrent processes can't read the file because it is being copied from NFS, which of course is much slower than local disk, so a concurrent process may get an incomplete version of the file.

RobSeace:

The idea of the cache system is to reduce system calls to the NFS server. The remote machine holding the original files is a critical machine, and its load must be kept to a minimum.

So when a script tries to send a file to the remote user, it checks whether the file is in the cache; if not, it fetches it from NFS. But several processes may try to access the file at the same time, so I must make sure it is completely copied before newer processes start to copy it.

Because I am using cp to copy the file to the local cache, I can't flock the file itself: it doesn't exist until after the copy. So I create a local lock file, and the first process to successfully lock it gets ownership of the copy. Other processes wait until the flock is released, at which point the newly copied file should be available. That's the theory, anyway.

In practice, the first time I run ab with multiple concurrent requests, the file returned is only half the size of the original. But if I check the size of the cached file immediately afterwards, it is correct. That means the copy (cp) worked fine, and the most likely explanations are that the flock is returning before the file is completely copied, that lighttpd is for some reason truncating the file, or that ab is misreporting the length.

I thought maybe cp was returning before the file had been completely copied, and therefore the process owning the lock was releasing it too early.



AHHHHH!

OK guys, thanks for being my sounding board. I worked out what the issue is.
The second process was only checking whether the file existed before deciding whether to go into the copy process. Because the copy had already started, the file existed, so the second process simply started serving it to the client. But the file was incomplete because the first process was still copying it.

I will have to check for the lock file as well; if it exists, I need to flock it. So what I described in my response to i3839 was exactly what was happening.
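
A rough sketch of that extra check, assuming the lock file is named after the cached file with a ".lock" suffix and is removed once the copy finishes, as in CopyFileToCache above. The function name cachedFileIsReady is hypothetical, and the sketch does not cover a copy that failed and left a partial file behind.

```php
<?php
// Hypothetical helper: returns true once $path can be served from the cache,
// false if the file is not cached yet and the caller should start the copy.
function cachedFileIsReady($path) {
    clearstatcache(true, $path);

    // Check the cached file first. The copier creates the lock file before
    // it starts cp, so if the file exists the lock was created earlier.
    if (!file_exists($path)) {
        return false;
    }

    // No lock file: assume the copier has finished and removed it.
    $lock = @fopen($path . ".lock", "r");
    if ($lock === false) {
        return true;
    }

    // Lock file present: a shared lock blocks while the copier still holds
    // its exclusive lock, and is granted as soon as that lock is released.
    flock($lock, LOCK_SH);
    fclose($lock);
    return true;
}

// Usage (not executed here):
//   if (!cachedFileIsReady($cache.$dst)) { CopyFileToCache($src, $cache, $dst); }
```

The order of the two checks matters: testing for the lock file before the cached file would leave a window in which a copier starts between the two tests and the partial file is served anyway.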

That means several more file system calls. :(

An "atomic" copy function would be better. Perhaps I'll copy to a temp name and then rename, which would make it atomic. That will reduce system calls.
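
A minimal sketch of the copy-then-rename idea, under a few assumptions: it uses PHP's copy() instead of shelling out to cp so that it is self-contained, the function name atomicCopyToCache and the ".tmp.<pid>" naming are made up for illustration, and the temporary file lives in the same directory (and so the same filesystem) as the final name.

```php
<?php
// Sketch: copy to a temporary name next to the target, then rename into place.
// Readers that test for $dstPath see either no file or the complete file.
function atomicCopyToCache($src, $dstPath) {
    // Hypothetical temp name, unique per process so that concurrent copiers
    // never write into the same temporary file.
    $tmp = $dstPath . ".tmp." . getmypid();

    if (!@copy($src, $tmp)) {
        @unlink($tmp); // remove any partial copy
        return false;
    }

    // Within one filesystem, rename() swaps the new file in as a single step.
    if (!rename($tmp, $dstPath)) {
        @unlink($tmp);
        return false;
    }
    return true;
}

// Usage (not executed here):
//   atomicCopyToCache($src, $cache.$dst);
```

With this approach a plain existence check on the cached file is enough for readers, at the cost that two processes arriving together may each copy the file once.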

DC

#5 2008-07-30 04:12 AM

developer_chris
Member
Registered: 2008-06-05
Posts: 21

Re: locking a file with flock

#6 2008-07-30 12:34 PM

RobSeace
Administrator
From: Boston, MA
Registered: 2002-06-12
Posts: 3,847
Website

Re: locking a file with flock

#7 2008-07-30 01:59 PM

i3839
Oddministrator
From: Amsterdam
Registered: 2003-06-07
Posts: 2,239

Re: locking a file with flock

To summarize, you have one process copying files from NFS to local disk as a
local cache, and multiple other processes can read that cached file if it's there?

What if it isn't there, do they read the NFS version or do they start doing the copying?

So what you're basically doing is adding a local disk cache to NFS?

How do you detect that the NFS file changed, how do you keep everything in sync?

cp to a temporary name + mv should be atomic without any need for extra locking,
as long as the temporary file is on the same filesystem as the final name. But
copying the same file multiple times is a bit wasteful, so perhaps doing an
exclusive open on filename.tmp as locking would prevent it. You might need a way
to notify when the cp is done though, then you're back to flock/semaphore/whatever.
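
The exclusive open could look roughly like the sketch below. It assumes filename.tmp sits on the local disk next to the final file, it copies through PHP streams rather than cp, and the function name copyIfFirst and its return values are hypothetical. It does not solve the notification problem: a process that gets "busy" still has to wait or poll by some other means.

```php
<?php
// Returns "copied" if this process did the copy, "busy" if filename.tmp
// could not be created (normally because another process got there first),
// or "failed" if the copy itself went wrong.
function copyIfFirst($src, $dstPath) {
    $tmp = $dstPath . ".tmp";

    // Mode "x" creates the file and fails if it already exists, so only one
    // process can win this open.
    $out = @fopen($tmp, "x");
    if ($out === false) {
        return "busy";
    }

    $in = @fopen($src, "r");
    $ok = ($in !== false) && (stream_copy_to_stream($in, $out) !== false);
    if ($in !== false) {
        fclose($in);
    }
    fclose($out);

    // Publish the complete file under its final name in one step.
    if ($ok && rename($tmp, $dstPath)) {
        return "copied";
    }
    @unlink($tmp); // clean up so a later attempt can retry
    return "failed";
}
```

One caveat with this scheme: if the winning process dies mid-copy, filename.tmp stays behind and every later caller sees "busy" until something removes it.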

#8 2008-07-30 02:22 PM

developer_chris
Member
Registered: 2008-06-05
Posts: 21

Re: locking a file with flock
