UNIX Socket FAQ

A forum for questions and answers about network programming on Linux and all other Unix-like systems


#1 2013-01-02 08:42 PM

thinking
Member
Registered: 2005-09-15
Posts: 103

distributed local ipc

hi all,

i implemented a kind of demo library using sysv semaphores and shared memory to get a multi-writer/multi-reader circular buffer
the nice thing about this was that it handles its participants in a very flexible way
so, at any time it's possible to attach to the shm and its semaphores, use it (read/write) and detach.
the first one initializes everything, the last one destroys everything

why sysv?
i'm using the SEM_UNDO feature to know how many processes are using the shm at any given time.
this is needed to safely destroy everything at the end, AND i added a read-counter to every written message, so i know when the read-counter of a message drops to 0 and
the circular buffer has new free bytes which can be overwritten by new messages (i hope this was understandable?)
also, SEM_UNDO helps in case of a process crash that may not be caused by my library itself (kill -9, the program using my library, ...)
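
roughly what i mean, as a very simplified sketch - the NPROC_MAX bound and the attach()/detach() helpers are made up for the example, the creator's initialization and the races are glossed over:

/* simplified sketch: every attached process decrements a counting semaphore
 * with SEM_UNDO, so the kernel gives the count back automatically if the
 * process exits or crashes (even via kill -9).  the last one out sees the
 * value back at NPROC_MAX and destroys the shm + semaphore.
 * (the creator's SETVAL initialization and the detach race are glossed over) */
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/shm.h>

#define NPROC_MAX 1024                  /* made-up upper bound on participants */

int attach(key_t key)
{
    int semid = semget(key, 1, IPC_CREAT | 0666);
    /* the creator would do semctl(semid, 0, SETVAL, NPROC_MAX) here */

    struct sembuf op = { 0, -1, SEM_UNDO };   /* "i am attached" */
    semop(semid, &op, 1);
    return semid;
}

void detach(int semid, int shmid)
{
    struct sembuf op = { 0, +1, SEM_UNDO };   /* take back our attach */
    semop(semid, &op, 1);

    /* value back at NPROC_MAX means nobody is attached any more */
    if (semctl(semid, 0, GETVAL) == NPROC_MAX) {
        shmctl(shmid, IPC_RMID, NULL);
        semctl(semid, 0, IPC_RMID);
    }
}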

i didn't test/stress it much, but it seems to work

about sysv semaphores i read that
1) they are kind of outdated
2) the SEM_UNDO feature may cause problems (see the BUGS section of man semop - but i don't think i have the kind of problem where the value drops below 0)

so i thought about alternatives and was very surprised that there doesn't seem to be any portable one

my questions:
1) can anyone think of an alternative way to have a reliable and flexible local communication framework?
the easiest one i could think of is multicasting on the lo interface; this should also work on windows, but it may be complicated to make it reliable because of udp and such
2) would it be possible to use lock-free/wait-free algorithms for such distributed local behaviour? i don't have experience with this, so i don't know if i should look into it
it's also very surprising to me that there aren't any lock-free libraries ready for use; everything i found seems to be highly theoretical or conceptual
3) any tips for getting this running outside of sysv-supported platforms?
4) what really bugs me: what do other developers who use posix-like semaphores do in case the application/library crashes? i mean, how do you know whether a semaphore that was left locked by a crash is no longer in use and could be destroyed by the next running instance?

thx

Last edited by thinking (2013-01-02 08:45 PM)

Offline

#2 2013-01-03 03:45 PM

RobSeace
Administrator
From: Boston, MA
Registered: 2002-06-12
Posts: 3,826
Website

Re: distributed local ipc

I really don't use semaphores at all for anything, either sysv or Posix...  They always seemed like such an ugly, clunky feature with a horrible API...  But, yes, this limitation of Posix semaphores is apparently well known...  And, from what I understand, it's intentional as you're supposed to be able to pass off a semaphore to another process to post after you've waited for it...  No idea why anyone would want to do such a thing, but whatever...

But, for a use like you're talking about, I'd think one of the simple boolean/mutex-like alternatives mentioned in the above link would suffice...  With file locking, you could even get proper read vs. write locking...  Ie: it would allow multiple read locks, but prevent simultaneous write locks coexisting with read locks, so that readers aren't reading while the data is being changed...  (Though, you'd need to use fcntl() locks instead of lockf() as the link uses...)
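
Something roughly like this is all it would take (just a sketch, untested; the lock file path and the do_lock() helper are made up, and real code needs error checking)...

/* Rough sketch: readers take a shared (read) lock, writers an exclusive
 * (write) lock, on a lock file that lives next to the shared data...
 * The lock file path and do_lock() are made up for the example... */
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

static int lock_fd;   /* e.g. open("/tmp/mybuf.lock", O_RDWR | O_CREAT, 0666) */

static int do_lock(short type)            /* F_RDLCK, F_WRLCK or F_UNLCK */
{
    struct flock fl;
    memset(&fl, 0, sizeof(fl));
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;                       /* lock the whole file */
    fl.l_len = 0;
    return fcntl(lock_fd, F_SETLKW, &fl); /* F_SETLKW blocks until granted */
}

/* readers: do_lock(F_RDLCK); ... read ...;  do_lock(F_UNLCK);
 * writers: do_lock(F_WRLCK); ... write ...; do_lock(F_UNLCK); */

And, since the kernel drops fcntl() locks automatically when the owning process dies, that pretty much answers your question #4 for free...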

can anyone think of an alternative way to have a reliable and flexible local communication framework?

Well, it depends on how you want to define your terms...  For me, I generally find sockets (generally Unix domain for purely local use) sufficient...  It generally requires a client/server approach to things, though, rather than your ad-hoc peers coming and going at will approach...  But, I generally think in client/server terms by default, so it seems the more natural approach to me...

(And, I really don't see a whole lot of real-world advantage to NOT having to run a dedicated server process on the machine in order to facilitate this framework...  Hell, if you wanted, you could just have the first client to come along auto-start the server if it's not already running, and achieve basically the same ad-hoc effect!  You could even have the server auto-terminate after some period of non-use, as well, if for some reason you thought there was an actual advantage to not having it always running...)

Yes, you could also go with multicast to get the ad-hoc effect, if you wanted, though I'm not sure how it would work for a use case like you describe...  (How would any process read the "buffer" contents in such a scenario, since no one source is recording all the contents anywhere?  I suppose it would work if readers only cared about new messages from the point they start listening...  And, I would think that on the loopback interface at least, it should be reliably delivered as well, regardless of being UDP-based...)
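
Just to illustrate the auto-start idea I mentioned above, something like this rough sketch is what I have in mind (the socket path, the "mybus-server" binary name and connect_or_start() are all made up; real code needs proper error handling)...

/* Rough sketch of the "first client auto-starts the server" idea... */
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>
#include <string.h>
#include <errno.h>

#define SOCK_PATH "/tmp/mybus.sock"      /* placeholder */

int connect_or_start(void)
{
    int started = 0;

    for (int tries = 0; tries < 10; tries++) {
        int fd = socket(AF_UNIX, SOCK_STREAM, 0);
        if (fd < 0)
            return -1;

        struct sockaddr_un sa;
        memset(&sa, 0, sizeof(sa));
        sa.sun_family = AF_UNIX;
        strncpy(sa.sun_path, SOCK_PATH, sizeof(sa.sun_path) - 1);

        if (connect(fd, (struct sockaddr *)&sa, sizeof(sa)) == 0)
            return fd;                   /* a server is already running */

        int err = errno;
        close(fd);

        if (!started && (err == ENOENT || err == ECONNREFUSED)) {
            started = 1;
            if (fork() == 0) {           /* first client starts the server */
                execlp("mybus-server", "mybus-server", (char *)NULL);
                _exit(127);
            }
        }
        usleep(100000);                  /* give the server a moment to bind */
    }
    return -1;
}

(If two clients race, both might spawn a server; whichever one loses the bind() on the socket path can just exit quietly...)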

The only time I generally use shared memory is for rarely-changing data that doesn't need any synchronization between sharing processes...  Ie: everyone is just a reader, except the first process to actually create it...  And, if another process decides it needs to change the data for some reason, it just unlinks the old segment and creates a new one; that way, everyone using the old data carries on using it, but it goes away automatically when they all die or close the segment, but all newcomers get the new data...  Not suitable for IPC, really, but just for sharing resources...

Offline

#3 2013-01-04 12:25 AM

Nope
Administrator
From: Germany
Registered: 2004-01-24
Posts: 385
Website

Re: distributed local ipc

I'd also go with a client/server architecture. This special case sounds a lot like a good use for the Observer pattern.

I've done a lot with semaphores (Posix), but mostly in threaded applications. I can remember that I wanted to use system semaphores once to sync several processes, but I can't remember what I wanted to achieve, only that the Posix implementation on Linux at that time didn't actually support them. Usually I can't count on the parts residing on the same machine anyway. So to ensure that it works, whether the parts are on the same server or someplace else in the network, a client/server approach is the only viable way.

Offline

#4 2013-01-04 03:28 AM

i3839
Oddministrator
From: Amsterdam
Registered: 2003-06-07
Posts: 2,230

Re: distributed local ipc

If using shared memory and not caring about portability,
I would use robust futexes on the shared memory segment.

Pthread supports robust mutexes, but AFAIK only for
multi-threaded programs, not for separate processes sharing
only a bit of memory. So one downside of using robust
futexes directly is that you can't also use Pthread robust
mutexes, because the two would clash with your library.

See http://www.kernel.org/doc/Documentation … utexes.txt
Of course, using futexes is a bit tricky: you need to use atomic ops
and be very careful about what you do. But once you do use them, adding
support for robust futexes is peanuts in your case, as you probably
only need one lock.
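
The basic lock itself looks roughly like this (untested sketch of the
usual three-state futex lock; the robust list registration that makes it
crash-safe is left out):

/* Untested sketch of a basic futex lock on a 32-bit word in shared memory.
 * States: 0 = unlocked, 1 = locked, 2 = locked with waiters.
 * No FUTEX_PRIVATE_FLAG, since the word is shared between processes. */
#include <stdint.h>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

static long futex(uint32_t *uaddr, int op, uint32_t val)
{
    return syscall(SYS_futex, uaddr, op, val, NULL, NULL, 0);
}

void lock(uint32_t *f)
{
    uint32_t c = 0;
    /* fast path: 0 -> 1 if nobody holds it */
    if (__atomic_compare_exchange_n(f, &c, 1, 0,
                                    __ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
        return;
    /* contended: mark as "locked, waiters present" (2) and sleep */
    if (c != 2)
        c = __atomic_exchange_n(f, 2, __ATOMIC_ACQUIRE);
    while (c != 0) {
        futex(f, FUTEX_WAIT, 2);
        c = __atomic_exchange_n(f, 2, __ATOMIC_ACQUIRE);
    }
}

void unlock(uint32_t *f)
{
    /* if there were waiters, wake one of them up */
    if (__atomic_exchange_n(f, 0, __ATOMIC_RELEASE) == 2)
        futex(f, FUTEX_WAKE, 1);
}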

Another alternative is to go (partially) lockless. E.g:
http://lttng.org/urcu
http://rusty.ozlabs.org/?p=302

Of course the sensible approach is to avoid using shared memory
and to use a socket based approach where the kernel takes care
of all the tricky bits, but that's no fun.

Offline

#5 2013-01-04 07:11 PM

Nope
Administrator
From: Germany
Registered: 2004-01-24
Posts: 385
Website

Re: distributed local ipc

I used inter-process futexes in the past. My dynamic pre-forking webserver used them to prevent the thundering herd problem of the old Linux socket accept. They used a word in shared memory instead of the normal int. Basically the processes waited in a select() on the file descriptor of the futex. So, for his program they might work.

Offline

#6 2013-01-06 03:13 PM

thinking
Member
Registered: 2005-09-15
Posts: 103

Re: distributed local ipc

thx all

currently i'm trying a server/client approach, but a bit different from what was suggested
server/client often means one server and many clients, which in my case means that the server would be a centralized point of failure
and it may be highly loaded, because every message from one client may need to be distributed to all the other clients,
which would have to be done by a single point for every client

my current idea is that every process hosts a server bound to a "random" port (=0), and the actually bound port number is announced via multicast to a group of peers
so it works locally and on the network, producing a mesh of interconnected peers
if one fails, everything else will continue working

the first tests seem promising, but i'll have to do a few more things before it's finally ready
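
the announce part is roughly this (just a sketch; the multicast group/port and the message format are made up, error checking omitted):

/* rough sketch: bind a tcp listener to port 0, find out which port the
 * kernel picked with getsockname(), and multicast it to the peer group */
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#include <stdio.h>

int main(void)
{
    /* listener on a kernel-chosen port */
    int srv = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in sa;
    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_addr.s_addr = htonl(INADDR_ANY);
    sa.sin_port = 0;                          /* "random" port */
    bind(srv, (struct sockaddr *)&sa, sizeof(sa));
    listen(srv, 16);

    socklen_t len = sizeof(sa);
    getsockname(srv, (struct sockaddr *)&sa, &len);
    unsigned short port = ntohs(sa.sin_port); /* the port actually bound */

    /* announce it to the peer group */
    int mc = socket(AF_INET, SOCK_DGRAM, 0);
    struct sockaddr_in grp;
    memset(&grp, 0, sizeof(grp));
    grp.sin_family = AF_INET;
    grp.sin_addr.s_addr = inet_addr("239.255.0.1"); /* made-up group */
    grp.sin_port = htons(5555);                     /* made-up announce port */

    char msg[32];
    snprintf(msg, sizeof(msg), "PEER %u", port);
    sendto(mc, msg, strlen(msg), 0, (struct sockaddr *)&grp, sizeof(grp));

    /* ... then accept() on srv and connect() to ports announced by others ... */
    return 0;
}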

Last edited by thinking (2013-01-06 03:14 PM)

Offline

#7 2013-01-06 05:22 PM

RobSeace
Administrator
From: Boston, MA
Registered: 2002-06-12
Posts: 3,826
Website

Re: distributed local ipc

my current idea is that every process hosts a server bound to a "random" port (=0), and the actually bound port number is announced via multicast to a group of peers
so it works locally and on the network, producing a mesh of interconnected peers
if one fails, everything else will continue working

You'll have problems scaling that up very large, since I assume every single client will need a separate connection to every single other client (or its "server", if that's a separate process)?  If they're all running locally on a single host, you'll burn through your ephemeral ports fairly quickly...  (Since you're using them both for listening ports and for every outbound connection to another client...)  You'll be growing the total number of sockets in the system by the square of the number of clients...  At about 254 clients, you'll have used up every possible port# in the system, assuming you haven't already exceeded some other limit like total open file descriptors...  (And, that's being generous and assuming you have the run of the full 64K range of ports...  In reality, your ephemeral range will surely be much smaller...  On some old systems, it's severely limited to only a few thousand ports...  But, even on modern systems where you've got about 32K, you'll hit that limit around 180 clients...)

There's a reason why the one server with multiple clients model is so widely used over such pure peer-to-peer mesh-type networks... ;-)

Offline

#8 2013-01-06 06:09 PM

Nope
Administrator
From: Germany
Registered: 2004-01-24
Posts: 385
Website

Re: distributed local ipc

Hmm, but in modern cluster-like systems, you do have the possibility for every client to become master. An example would be elasticsearch. So perhaps you might want to look into that one. I think I saw a paper from kimchi (the developer) about how elasticsearch decides who's master.

Offline

#9 2013-01-06 07:09 PM

thinking
Member
Registered: 2005-09-15
Posts: 103

Re: distributed local ipc

@RobSeace
wow, thx for your thoughts - this really helps me get a better view of the topic

@Nope
good tip
i haven't found the paper yet, but i'll have a look at the source

Offline

#10 2013-01-07 12:50 AM

i3839
Oddministrator
From: Amsterdam
Registered: 2003-06-07
Posts: 2,230

Re: distributed local ipc

If you can count on multicast mostly working, then I would go for a hybrid
approach: use multicast for everything, but have a few peer-to-peer links
per process, which are used to retransmit any packets that didn't make it
the first time (and to handle any clients where multicast doesn't work for
some reason). As a last fallback, you can always multicast a retransmit
request and have some heuristic in place to limit the number of replies.
Another option, in case lots of small packets are being sent, is to always
resend the previous N packets for redundancy. All of this assumes there is
a regular stream of packets coming in, otherwise lost packets aren't
detected soon enough. This should scale very well, up until the individual
clients can't handle the total message stream any more.
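
Loss detection is just a per-sender sequence number in every message.
Roughly (sketch; the header layout and request_retransmit() are made up):

/* Sketch of per-sender loss detection: every multicast message carries a
 * sender id and a sequence number; a gap means packets were lost, so ask
 * that sender to retransmit (over its peer link or via a multicast NACK). */
#include <stdint.h>
#include <stdio.h>

#define MAX_SENDERS 64                    /* toy fixed-size table */

struct msg_hdr {
    uint32_t sender_id;                   /* who sent it */
    uint32_t seq;                         /* per-sender sequence number */
    uint32_t len;                         /* payload length */
};

static uint32_t last_seq[MAX_SENDERS];    /* last sequence seen per sender */

static void request_retransmit(uint32_t sender, uint32_t from, uint32_t to)
{
    /* placeholder: send a NACK over the peer-to-peer link or via multicast */
    printf("ask sender %u to resend %u..%u\n", sender, from, to);
}

void on_message(const struct msg_hdr *h)
{
    if (h->sender_id >= MAX_SENDERS)
        return;

    uint32_t expected = last_seq[h->sender_id] + 1;
    if (h->seq > expected)                /* gap: some packets were lost */
        request_retransmit(h->sender_id, expected, h->seq - 1);
    if (h->seq >= expected)
        last_seq[h->sender_id] = h->seq;  /* ignore duplicates / old packets */
    /* ... hand the payload to the application ... */
}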

Another way of avoiding the port problem is to use one UDP socket per
process for all peers, but this has higher latency and uses more bandwidth.
The advantage of this over choosing one dynamic TCP server which everyone
uses until it goes down is that not all load is put on one server process
and it avoids one extra indirection. The downside is that you can't use TCP.

If you don't care about latency too much and can't use multicast, you can
also make a kind of tree or ring peer-to-peer structure (either UDP or TCP).
But then you have the problem of maintaining that structure in a robust way.

Offline

#11 2013-01-07 01:48 PM

RobSeace
Administrator
From: Boston, MA
Registered: 2002-06-12
Posts: 3,826
Website

Re: distributed local ipc

The advantage of this over choosing one dynamic TCP server which everyone
uses until it goes down is that not all load is put on one server process
and it avoids one extra indirection.

I don't see the first as much of a real advantage, just the second...  It should be possible to have a server that copes with high traffic without very much difficulty...  If necessary, you can use multithreading or split off multiple child processes to deal with groups of clients as needed...  (Threads probably being a better fit to a use like this, where everyone has to talk to everyone else...)  You can increase the server's priority so it gets first crack at run time whenever it needs it...  Etc...

So, the only true advantage I see is cutting the server out of the I/O loop, and saving a single copy of every message sent...  And, that's only the case if every single message must always be retransmitted to all peers in all cases...  If, instead, some peers only need access to some of the data periodically, it will save on data transfer, since you only need to send one copy to the server now, and then peers can periodically request access to it whenever they need it, rather than forcing them all to be bombarded with all of it continuously, whether they actually care about it now or not...

But, I must say, it does indeed sound like multicast is the best fit overall for this design...  So, if you can find a way to use that, it may be the best way to go...  It's just that if you need true reliability, I don't know how much work you're going to need to do to get it...  If this is all purely local on a single machine, using the loopback interface, I suspect you won't drop many, if any, packets anyway...  But, if going across a real network of any kind, well good luck... ;-)

Offline

#12 2013-01-07 11:22 PM

i3839
Oddministrator
From: Amsterdam
Registered: 2003-06-07
Posts: 2,230

Re: distributed local ipc

I had mostly network load in mind, not CPU usage.

The days when CPU speed lagged behind bandwidth seem to be over. 10Gb Ethernet should
have been common for years by now. (I blame hard disks having been pegged
around 100MB/s, making anything faster than 1Gb/s pretty useless for widespread use.)

There is always a problem when the reader(s) can't keep up with the sender(s).
Either you drop messages, or one slow reader can slow down everything and
cause a lot of memory usage for buffering not-yet-received messages (also bad
for latency). So I'd argue that dropping packets isn't always a bad thing, though
in this case it's probably better to disconnect or kill a client that consistently
can't keep up with the traffic. If you want guaranteed delivery and handling
of all packets, then the whole system will run only as fast as the slowest node.
You have this choice no matter what kind of communication channel you use;
the same is true for shared memory.

The best approach and the right solution depend on the fine details of what your
library is used for, though. You have to choose what guarantees, if any, you
are prepared to give, and how you want to handle overload conditions.

Offline
