UNIX Socket FAQ

A forum for questions and answers about network programming on Linux and all other Unix-like systems


#1 2006-10-13 03:36 PM

dvmoinescu
Member
From: Bucharest
Registered: 2003-10-09
Posts: 40

Re: shared memory ?

Hello all,

I need some help implementing code to reduce the need to retrieve information from the SQL server. The problem is that the application server, which is multi-process and multi-threaded and capable of serving around 10,000 requests/second, constantly needs information from the database. I want to implement a cache mechanism to store queried information for future use (thus reducing the number of queries). Up to now, the application server was one process with n threads, so a malloc()-ed cache was the appropriate answer.

My question is: should I use shared memory to store some 6,000,000 entries of 100 bytes each? Or should I use some other method for caching the information? I was thinking about implementing some sort of binary file to store the information and fseek()-ing inside of it using some index.

These are my two ideas about caching the information:
1. a cache inside shared memory
2. a per-process index with offsets into a binary file.

If there is another approach, I would like to hear about it.
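
To make idea 1 concrete, here is roughly what I have in mind: one fixed-size POSIX shared memory segment holding the whole entry table, so every worker process can map it. The segment name and payload below are just placeholders (link with -lrt on older glibc):

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define ENTRY_SIZE  100
#define NUM_ENTRIES 6000000UL   /* 100 bytes * 6,000,000 = ~600 MB */

int main(void)
{
    size_t len = ENTRY_SIZE * NUM_ENTRIES;

    /* Create (or open) the segment; every process uses the same name. */
    int fd = shm_open("/query_cache", O_RDWR | O_CREAT, 0600);
    if (fd == -1) { perror("shm_open"); return 1; }

    if (ftruncate(fd, (off_t)len) == -1) { perror("ftruncate"); return 1; }

    char *cache = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (cache == MAP_FAILED) { perror("mmap"); return 1; }
    close(fd);   /* the mapping stays valid after the fd is closed */

    /* Entry i lives at a fixed offset, so lookup is pure arithmetic. */
    char *entry = cache + (size_t)42 * ENTRY_SIZE;
    memcpy(entry, "example payload", 16);

    munmap(cache, len);
    return 0;
}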

Thanx for your answers :)


#2 2006-10-13 06:48 PM

RobSeace
Administrator
From: Boston, MA
Registered: 2002-06-12
Posts: 3,847

Re: shared memory ?

Is the database server remote or on the same machine?  Because, if on the same
machine, I'd think at least the file-based cache would be major overkill, if not the
memory-based cache, as well...  Basically, in that case, I wouldn't think you could
really gain much over a direct query to the database server, by keeping your own
file-based cache...  Certainly not enough to be worth the trouble, I wouldn't think...
(Unless your database server truly and totally sucks, anyway... ;-))  A memory-based
cache could maybe be useful, if you're likely to be repeatedly accessing a relatively
small subset of the total entries on a regular basis...  Ie: if some lookups are more
popular than others...  If they're totally random, it may not make much sense to
even cache in memory, unless you've got absolutely tons of RAM to spare, such
that you can basically replicate the entire database in your own memory space...
(And, even then, you'd want to come up with an efficient scheme of accessing the
items you're interested in, so you aren't constantly crawling through Gigs of RAM
looking for the items you're interested in...  Eg: put them in a red/black tree, or
something similar...)
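
If you don't want to roll your own tree, POSIX already gives you tsearch() in
<search.h>...  glibc happens to implement it as a red/black tree, so a quick sketch
of the lookup side might look like this (the key/record layout is invented, and
tdestroy() is a GNU extension):

#define _GNU_SOURCE   /* for tdestroy() */
#include <search.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct entry {
    long key;
    char data[100];
};

static int cmp_entry(const void *a, const void *b)
{
    const struct entry *ea = a, *eb = b;
    return (ea->key > eb->key) - (ea->key < eb->key);
}

int main(void)
{
    void *root = NULL;

    struct entry *e = malloc(sizeof *e);
    if (!e) return 1;
    e->key = 1234;
    strcpy(e->data, "cached row");
    tsearch(e, &root, cmp_entry);          /* insert (or find existing) */

    struct entry probe = { .key = 1234 };
    void *found = tfind(&probe, &root, cmp_entry);
    if (found)
        printf("hit: %s\n", (*(struct entry **)found)->data);

    tdestroy(root, free);                  /* GNU extension */
    return 0;
}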


#3 2006-10-13 06:54 PM

i3839
Oddministrator
From: Amsterdam
Registered: 2003-06-07
Posts: 2,239

Re: shared memory ?

Both should work, and they're comparable in speed when mmap()ing the file. But as it's a cache, I don't think you want to read the data from disk, as that can be as slow as querying the database. So I'd go for shared memory, as that has a bigger chance of staying in memory (swapout happens later than file cache pruning). If going for a file then at least use mmap(), and then the difference from shared memory is practically zero, except that the data is also stored on disk.
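
A sketch of the mmap() variant, just to show that access becomes plain memory access once the file is mapped (the file name is an example):

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    int fd = open("cache.bin", O_RDWR);
    if (fd == -1) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) == -1) { perror("fstat"); return 1; }

    char *base = mmap(NULL, st.st_size, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, 0);
    if (base == MAP_FAILED) { perror("mmap"); return 1; }
    close(fd);

    /* Lookups are now pointer arithmetic on the mapping; the kernel
     * writes dirty pages back to the file on its own. */
    printf("first byte: %d\n", base[0]);

    munmap(base, st.st_size);
    return 0;
}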

But a 600 MB cache file looks like a bad idea to me; that looks more like replacing your database because it's too slow. With a cache you also need to consider stale data and concurrency issues, things normally handled by the db.
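
One way to handle the concurrency part across processes is a pthread rwlock living inside the shared segment itself, initialized once with PTHREAD_PROCESS_SHARED. A sketch, with an invented header layout:

#include <pthread.h>

struct cache_header {
    pthread_rwlock_t lock;   /* must live inside the shared mapping */
    /* ... entry table follows ... */
};

int cache_header_init(struct cache_header *hdr)
{
    pthread_rwlockattr_t attr;
    int rc;

    if ((rc = pthread_rwlockattr_init(&attr)) != 0)
        return rc;
    /* Allow the lock to be used by any process mapping the segment. */
    if ((rc = pthread_rwlockattr_setpshared(&attr, PTHREAD_PROCESS_SHARED)) != 0)
        return rc;
    rc = pthread_rwlock_init(&hdr->lock, &attr);
    pthread_rwlockattr_destroy(&attr);
    return rc;
}

/* Readers take the lock shared, writers exclusive:
 *   pthread_rwlock_rdlock(&hdr->lock);  ... lookup ...  pthread_rwlock_unlock(&hdr->lock);
 *   pthread_rwlock_wrlock(&hdr->lock);  ... update ...  pthread_rwlock_unlock(&hdr->lock);
 */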



#5 2006-10-16 10:34 AM

dvmoinescu
Member
From: Bucharest
Registered: 2003-10-09
Posts: 40

Re: shared memory ?

Thanx for the responses.

The SQL server runs on the same box because I experienced problems with the SQL server on another server. Even with an SQL connection manager, and with the SQL server handling around 200 concurrent connections, I realised why it is better to have a Unix socket connection instead of a TCP connection. So I'll stick with having the SQL server on the same box.
The problem when SELECTing lots of information is that from time to time there is an INSERT or UPDATE (a writer) that blocks all readers. I really must have non-blocking READs (SELECTs) regardless of writers. I cannot control the writers, as there are other scripts that update the information.
The malloc()-ed cache is a 16-bit hash containing pointers to 2^16 red-black trees, and it finds the information really fast.
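
Roughly, the dispatch looks like this. If I move it into shared memory I'll have to store offsets instead of raw pointers, since each process may map the segment at a different address (key type and hash simplified):

#include <stddef.h>
#include <stdint.h>

#define NBUCKETS (1u << 16)

struct shm_cache {
    uint32_t bucket[NBUCKETS];   /* offset of each tree root; 0 = empty */
    /* tree nodes are allocated from the space after this table */
};

static uint16_t hash16(uint64_t key)
{
    /* fold the key down to 16 bits; any decent mix works */
    key ^= key >> 32;
    key ^= key >> 16;
    return (uint16_t)key;
}

static void *node_at(struct shm_cache *c, uint32_t off)
{
    return off ? (char *)c + off : NULL;
}

void *lookup_root(struct shm_cache *c, uint64_t key)
{
    /* pick the tree for this key; walking it is the same red-black
     * search as in the malloc()-ed version, just offset-based */
    return node_at(c, c->bucket[hash16(key)]);
}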

So you are saying that I should use the shared memory solution? The SQL server uses 512 MB for indexes, so I could lower that to 128 MB and put all the information into memory? I have 2 GB of RAM. Will it work? I hope so :)


#6 2006-10-16 10:41 AM

dvmoinescu
Member
From: Bucharest
Registered: 2003-10-09
Posts: 40

Re: shared memory ?

I forgot to mention that each request means at least 1 SELECT, 1 INSERT into another table (asynchronous), and possibly generates 1 UPDATE (asynchronous).
Though the table is well indexed, such a great number of SELECTs just cannot be handled quickly because of the random writers.

