UNIX Socket FAQ

A forum for questions and answers about network programming on Linux and all other Unix-like systems

#1 2010-02-10 12:04 PM

FooBarWidget
Member
Registered: 2010-02-10
Posts: 6

Re: What can cause a spontaneous EPIPE error without either end calling close or crash?

#2 2010-02-10 02:39 PM

RobSeace
Administrator
From: Boston, MA
Registered: 2002-06-12
Posts: 3,847
Website

Re: What can cause a spontaneous EPIPE error without either end calling close or crash?

Hmmmm...  Well, if it were a TCP socket, I'd say it was probably just network flakiness
terminating the connection abnormally or something...  But, with a Unix domain socket,
that can't be it...

You say you're sure you don't close() the socket incorrectly on either end when it
happens, but what about shutdown()?  Do you call that at all in any circumstance on
either end?  Because, that could have the same effect as a close(), but still leave the
FD valid and open, as it seems to be in your case...
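To illustrate the shutdown() point, here is a minimal sketch, assuming a Linux or other POSIX system. The helper name errno_after_shutdown_wr() is made up for this example and is not from the poster's code. It shuts down the write side of one end of an AF_UNIX socketpair, confirms the FD is still open, and reports what write() fails with:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper for illustration only.
 * Returns the errno produced by write() after shutdown(SHUT_WR),
 * 0 if the write unexpectedly succeeded, or -1 if setup failed. */
int errno_after_shutdown_wr(void)
{
    int sv[2];
    int err = 0;

    /* Without this, the failing write() would raise SIGPIPE and kill us. */
    signal(SIGPIPE, SIG_IGN);

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    /* No close() is involved: only the write direction is shut down. */
    if (shutdown(sv[0], SHUT_WR) == -1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* The descriptor itself is still valid and open. */
    assert(fcntl(sv[0], F_GETFD) != -1);

    if (write(sv[0], "x", 1) == -1)
        err = errno;

    close(sv[0]);
    close(sv[1]);
    return err;
}
```

Calling errno_after_shutdown_wr() should give EPIPE, even though neither end has been closed at the time of the write().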

Aside from that, my only real guess is some sort of subtle memory corruption going
on...  Eg: such that the variable holding your socket FD gets overwritten, and so you
try to write() to the wrong FD, or something...  Seems unlikely though that it'd get
overwritten with another valid, open FD, which just happens to be a pipe or socket
that has no reader, such that it'd generate the EPIPE error...

I'd say just add as much debug logging as you can, and see if users can duplicate
it and send you the logs...  Log all connects and disconnects (normal and abnormal),
the FDs in use at all times, etc...  And, when you get an EPIPE, log the FD, and try
to obtain as much info as you can about that open FD before throwing it away...
Do getsockname() and getpeername() on it, look it up in "/proc/self/fd/" (in fact,
maybe dump the whole set of currently open FDs from there), and cross-reference
the inode# for the socket from there with "/proc/net/unix" to pull up more info...  Do
getsockopt(SO_PEERCRED) to obtain PID and UID of your connecting peers, and
poke into their "/proc/<pid>/fd/"s, too (assuming your server has perms to peek in
there, anyway)...  Etc...  Basically, just try to log everything you can, and hopefully
something will stand out if/when someone duplicates the problem in the future...
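The logging suggested above could look roughly like this. It is a Linux-only sketch: SO_PEERCRED and "/proc/self/fd/" are Linux-specific, and the function name log_fd_info() is an assumption made for this example. It prints the symlink target of the FD (for a socket, something like "socket:[inode]", where the inode is what to look for in "/proc/net/unix"), the peer address if there is one, and the peer credentials:

```c
#define _GNU_SOURCE /* for struct ucred on glibc */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: dump what can be learned about fd to `out`.
 * Returns the peer PID from SO_PEERCRED, or -1 if it is unavailable. */
long log_fd_info(int fd, FILE *out)
{
    char path[64], target[256];
    struct sockaddr_un peer;
    socklen_t plen = sizeof(peer);
    struct ucred cred;
    socklen_t clen = sizeof(cred);
    ssize_t n;

    assert(fd >= 0);

    /* /proc/self/fd/N is a symlink; for a socket it reads "socket:[inode]". */
    snprintf(path, sizeof(path), "/proc/self/fd/%d", fd);
    n = readlink(path, target, sizeof(target) - 1);
    if (n >= 0) {
        target[n] = '\0';
        fprintf(out, "fd %d -> %s\n", fd, target);
    } else {
        fprintf(out, "fd %d: readlink: %s\n", fd, strerror(errno));
    }

    /* Peer address: often unnamed for AF_UNIX clients and socketpairs. */
    memset(&peer, 0, sizeof(peer));
    if (getpeername(fd, (struct sockaddr *)&peer, &plen) == -1) {
        fprintf(out, "fd %d: getpeername: %s\n", fd, strerror(errno));
    } else if (plen > offsetof(struct sockaddr_un, sun_path)
               && peer.sun_path[0] != '\0') {
        fprintf(out, "fd %d: peer path=%.*s\n", fd,
                (int)(plen - offsetof(struct sockaddr_un, sun_path)),
                peer.sun_path);
    } else {
        fprintf(out, "fd %d: peer is unnamed or abstract\n", fd);
    }

    /* Credentials of the process on the other end. */
    if (getsockopt(fd, SOL_SOCKET, SO_PEERCRED, &cred, &clen) == -1) {
        fprintf(out, "fd %d: SO_PEERCRED: %s\n", fd, strerror(errno));
        return -1;
    }
    fprintf(out, "fd %d: peer pid=%ld uid=%ld gid=%ld\n", fd,
            (long)cred.pid, (long)cred.uid, (long)cred.gid);
    return (long)cred.pid;
}
```

A call such as log_fd_info(fd, stderr) right after write() fails with EPIPE, before the FD is thrown away, would capture this information while it is still available.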

#3 2010-02-10 02:42 PM

FooBarWidget
Member
Registered: 2010-02-10
Posts: 6

Re: What can cause a spontaneous EPIPE error without either end calling close or crash?

Nope, no shutdown() anywhere.

Memory corruption is not out of the question, but it seems unlikely. I've run the code under Valgrind, and I've never seen any EBADF errors.

#4 2010-02-10 02:51 PM

RobSeace
Administrator
From: Boston, MA
Registered: 2002-06-12
Posts: 3,847
Website

Re: What can cause a spontaneous EPIPE error without either end calling close or crash?

#5 2010-02-10 03:00 PM

FooBarWidget
Member
Registered: 2010-02-10
Posts: 6

Re: What can cause a spontaneous EPIPE error without either end calling close or crash?

Yes, I am talking about AF_LOCAL. All processes are running on localhost.

#6 2010-02-11 06:31 AM

i3839
Oddministrator
From: Amsterdam
Registered: 2003-06-07
Posts: 2,239

Re: What can cause a spontaneous EPIPE error without either end calling close or crash?

It really looks like a subtle bug in your code. It's a lot easier to help if we
see your code.

#7 2010-02-11 10:07 AM

FooBarWidget
Member
Registered: 2010-02-10
Posts: 6

Re: What can cause a spontaneous EPIPE error without either end calling close or crash?

#8 2010-02-11 03:30 PM

RobSeace
Administrator
From: Boston, MA
Registered: 2002-06-12
Posts: 3,847
Website

Re: What can cause a spontaneous EPIPE error without either end calling close or crash?

Ugh...  More C++ code... ;-/

Well, from what I could see, the real guts of the actual low-level socket handling are
buried in "ext/common/MessageChannel.h"...  Is that right, or am I looking at the
wrong thing?

Anyway, one thing I don't like: for {read,write}Scalar() you use a 32-bit size header,
while for plain {read,write}() you use a 16-bit one...  It would only matter if reader and
writer disagreed on which method they should be using to read/write at the same
time, but still, I can't see much reason not to use the same sized header for both...
Also, you don't seem to be handling EINTR as non-fatal in any of your syscall read()'s
or write()'s...  And, why not have your read() call readRaw() like readScalar() does,
instead of rolling its own low-level syscall reading?
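For what it's worth, here is a rough C sketch of the two suggestions above: treating EINTR as non-fatal, and using one header size for every message. All names here (write_all, read_all, send_msg, recv_msg) are invented for the example and are not taken from MessageChannel.h, and the 32-bit network-byte-order header is an assumption about what a unified format might look like:

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <unistd.h>

/* Write exactly len bytes, retrying on EINTR and on short writes.
 * Returns 0 on success, -1 on error with errno set. */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n == -1) {
            if (errno == EINTR)
                continue; /* interrupted by a signal: just retry */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Read exactly len bytes, retrying on EINTR and on short reads.
 * Returns 0 on success, 1 on EOF before len bytes, -1 on error. */
int read_all(int fd, void *buf, size_t len)
{
    char *p = buf;

    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n == -1) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            return 1; /* peer closed the connection */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Send one message: a 32-bit big-endian length, then the payload. */
int send_msg(int fd, const void *data, uint32_t len)
{
    uint32_t hdr = htonl(len);

    if (write_all(fd, &hdr, sizeof(hdr)) == -1)
        return -1;
    return write_all(fd, data, len);
}

/* Receive one message into buf (capacity cap), storing its size in *len.
 * Returns 0 on success, 1 if the peer closed before a full message
 * arrived, -1 on error (EMSGSIZE if the message does not fit). */
int recv_msg(int fd, void *buf, uint32_t cap, uint32_t *len)
{
    uint32_t hdr;
    int rc;

    assert(len != NULL);

    rc = read_all(fd, &hdr, sizeof(hdr));
    if (rc != 0)
        return rc;
    *len = ntohl(hdr);
    if (*len > cap) {
        errno = EMSGSIZE;
        return -1;
    }
    return read_all(fd, buf, *len);
}
```

With every reader and writer going through the same send_msg()/recv_msg() pair, the two sides cannot disagree about the header size, and a signal arriving mid-transfer no longer looks like a fatal I/O error.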

It's really hard to follow everything that's happening through all the layers of C++
classes and stuff, so I'm not sure what the real problem is...  I might try to take
another look and see if I can figure out WTF is going on, though...  I'm a straight C
coder myself though, so it hurts my damn brain to twist through all that wacky C++
abstraction and obfuscation... ;-)

#9 2010-02-11 04:22 PM

FooBarWidget
Member
Registered: 2010-02-10
Posts: 6

Re: What can cause a spontaneous EPIPE error without either end calling close or crash?

#10 2010-02-12 02:33 PM

RobSeace
Administrator
From: Boston, MA
Registered: 2002-06-12
Posts: 3,847
Website

Re: What can cause a spontaneous EPIPE error without either end calling close or crash?

#11 2010-02-12 03:14 PM

FooBarWidget
Member
Registered: 2010-02-10
Posts: 6

Re: What can cause a spontaneous EPIPE error without either end calling close or crash?

#12 2010-02-12 10:41 PM

RobSeace
Administrator
From: Boston, MA
Registered: 2002-06-12
Posts: 3,847
Website

Re: What can cause a spontaneous EPIPE error without either end calling close or crash?
