A forum for questions and answers about network programming on Linux and all other Unix-like systems


#1 2002-07-25 08:17 PM

From: Colombia
Registered: 2002-06-12
Posts: 353

Re: 3.8 - What to do to recv' unknown data size


This question was asked by Niranjan Perera ([email protected]).

When the size of the incoming data is unknown, you can either make the buffer as big as the largest possible (or likely) message, or you can resize the buffer on the fly during your read. When you malloc() a large buffer, most (if not all) variants of Unix will only allocate address space, not physical pages of RAM. As more and more of the buffer is used, the kernel allocates physical memory. This means that malloc'ing a large buffer will not waste resources unless that memory is actually used, so it is perfectly acceptable to ask for a meg of RAM when you expect only a few K.

On the other hand, a more elegant solution that does not depend on the inner workings of the kernel is to use realloc() to expand the buffer as required in, say, 4K chunks (since 4K is the size of a page of RAM on most systems). I may add something like this to sockhelp.c in the example code one day.
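The realloc() approach described above could be sketched roughly as follows. This is only an illustration, not code from sockhelp.c: the name read_all() and the CHUNK constant are made up for this example, and it assumes a blocking descriptor that is read until the peer closes the connection.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

#define CHUNK 4096  /* hypothetical growth step: 4K, as suggested above */

/* Read from fd until end of file, growing the buffer with realloc()
 * one chunk at a time. On success returns a malloc'd buffer (the caller
 * frees it) and stores the byte count in *len_out. Returns NULL on error. */
char *read_all(int fd, size_t *len_out)
{
    char *buf = NULL;
    size_t cap = 0, len = 0;

    assert(len_out != NULL);
    for (;;) {
        if (len == cap) {               /* buffer is full: add one chunk */
            char *tmp = realloc(buf, cap + CHUNK);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap += CHUNK;
        }
        ssize_t n = read(fd, buf + len, cap - len);
        if (n > 0) {
            len += (size_t)n;
        } else if (n == 0) {            /* peer closed: everything received */
            break;
        } else if (errno != EINTR) {    /* real error, not just a signal */
            free(buf);
            return NULL;
        }
    }
    *len_out = len;
    return buf;
}
```

Growing by a fixed chunk keeps the example close to the text. For very large transfers, doubling the capacity each time would mean fewer realloc() calls.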

From: Sujay

I am really failing to understand the malloc part ....help

From: Vic Metcalfe

Let's say your application does this:

  buf = (char *)malloc(10000000000);

Your machine has only 128MB of RAM. If the OS actually tried to allocate that much RAM, it would exhaust the system's resources. What most operating systems do is allocate only the memory that is used. On most systems that I've used, memory allocation is done in 4K "pages". The above malloc() doesn't actually allocate anything but address space. No physical RAM is allocated. So if I then added...

  buf[0] = 0;
  buf[5] = 0;
  buf[9999999999] = 0;

The first assignment would cause a page fault because the memory for it has not yet been allocated. The OS traps the page fault, sees that it did promise that memory to the application, allocates it and allows the assignment to continue. Now, assuming a page size of 4K, the first 4096 characters of the buffer have been allocated. The second assignment produces no page fault and no additional memory is allocated.

Now what do you expect to happen with the last assignment? The buffer at that position has not been mapped to physical memory yet, so it too produces a page fault. Once again the OS traps the fault, allocates the memory and allows the assignment to continue. Note that it does not allocate all the pages between the two blocks. The OS will have allocated only two pages, for a total of 8K of physical RAM.

If you know that the OS you are targeting does physical allocation of memory on demand, you can take advantage of the fact. In this example you can use it to create an expanding buffer with no tricky coding required. You could also use it for a very sparse array.

The more correct thing to do is realloc() the buffer as required. I've coded this sort of thing a few times, but of course now that I want to pull out an example to share with you I can't find one. It isn't complicated anyway: you just watch for the buffer to fill, and then realloc() it a bit bigger each time it fills up.

Hope this helps,
  Vic.
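For anyone who wants to try the three assignments, here is a small sketch. The function name touch_ends() is made up for this example. Whether the untouched pages in the middle really stay unbacked depends on the OS and its memory settings, so treat this as an experiment rather than a guarantee.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocate `size` bytes and touch only bytes 0, 5 and size-1.
 * On a system that allocates physical memory on demand, only the pages
 * holding those bytes need to be backed by RAM.
 * Returns 0 on success, -1 if malloc() fails. */
int touch_ends(size_t size)
{
    assert(size >= 6);
    /* volatile so the compiler does not optimize the stores away */
    volatile char *buf = malloc(size);
    if (buf == NULL)
        return -1;
    buf[0] = 0;          /* first touch of the first page */
    buf[5] = 0;          /* same page as buf[0] */
    buf[size - 1] = 0;   /* first touch of a page far from the first one */
    free((void *)buf);
    return 0;
}
```

Calling it with something like 64MB and watching the process's resident size in top or ps should show far less than 64MB in use, assuming the system behaves as described above.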

From: Garen Parham

A good way to handle data of unexpected length is to use a fixed-length buffer sized for the largest message you expect to receive. A single read() may still not get it all into the buffer, either because the buffer is too short or because there wasn't enough data in the kernel's receive queue at the time of the read(). The fixed-length buffer can then be used as a kind of ring queue: if you reach the end of the buffer without having received everything you expected (say, with a line protocol, you haven't yet seen the terminating \r\n or \n, i.e. CR-LF or LF), copy the incomplete part to the beginning of the buffer and update the write position to point just past it, so that the next read() appends to it, and so on.
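A rough sketch of that carry-over idea for a line protocol might look like this. The names struct linebuf, next_line() and BUFSZ are invented for the example, and it is simplified: it slides the leftover bytes to the front after every line rather than only when the end of the buffer is reached.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

#define BUFSZ 64   /* hypothetical size, kept small for illustration */

struct linebuf {
    char   data[BUFSZ];
    size_t used;       /* bytes currently held in data; start at 0 */
};

/* Copy the next LF- or CR-LF-terminated line from fd into out
 * (NUL-terminated, terminator stripped). Returns the line length,
 * -1 on end of file or read error, -2 if the line does not fit. */
long next_line(int fd, struct linebuf *lb, char *out, size_t outsz)
{
    assert(lb != NULL && out != NULL);
    for (;;) {
        char *nl = memchr(lb->data, '\n', lb->used);
        if (nl != NULL) {
            size_t len = (size_t)(nl - lb->data);
            size_t consumed = len + 1;          /* line plus its LF */
            if (len > 0 && lb->data[len - 1] == '\r')
                len--;                          /* accept CR-LF too */
            if (len >= outsz)
                return -2;
            memcpy(out, lb->data, len);
            out[len] = '\0';
            /* Move the unfinished remainder to the front of the buffer;
             * lb->used is now the write position for the next read(). */
            lb->used -= consumed;
            memmove(lb->data, lb->data + consumed, lb->used);
            return (long)len;
        }
        if (lb->used == sizeof lb->data)
            return -2;                          /* line longer than buffer */
        ssize_t n = read(fd, lb->data + lb->used, sizeof lb->data - lb->used);
        if (n <= 0)
            return -1;
        lb->used += (size_t)n;
    }
}
```

The caller keeps one struct linebuf per connection and calls next_line() until it returns a negative value. A partial line that arrives in one read() is simply completed by the next one.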

From: David Gillies

The technique I use for streaming data back from a connection
when I don't know how much data is coming back is simply
to loop until read() returns 0 (end of data, which I treat as a
non-fatal error) or a negative number (a real error).
I have a function ReadBufferedData which looks like this:



/*
  Given a socket to read from, read data until
  the socket is empty or the buffer is full.
  Return the number of bytes that were read.

  readSocket  I  the socket to read from
  buffer      O  the buffer to read into
  bufLen      I  the size of the buffer
  bytesRead   O  the number of bytes read
  errnoBack   O  the value of errno, if any,
                 encountered in the function

  Returns: status code indicating success - noErr = success
*/
OSErr ReadBufferedData(const int readSocket, char *const buffer,
                       const size_t bufLen, size_t *const bytesRead,
                       int *const errnoBack)
{
    OSErr readErr = noErr;
    size_t bytesLeft, bytesThisTime, bytesSoFar = 0UL;
    Boolean done = FALSE;

    /* ... */

    return readErr;
}

OSErr is a typedef for short (like on a Mac) with noErr=0.
I typically use this in a loop, going read-blocked with
select() each time I get socketEOFErr until I get
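Since the body of the function did not survive above, here is a sketch of what a loop with the same interface could look like. It is not David's original code: the name read_buffered() and the READ_OK / READ_EOF / READ_ERR status codes are made up, and plain int stands in for OSErr. On a blocking socket it returns only when the buffer is full or the peer closes; on a non-blocking socket it also returns when there is nothing more to read for now.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical status codes standing in for OSErr values. */
enum { READ_OK = 0, READ_EOF = 1, READ_ERR = -1 };

/* Read from fd until the buffer is full, the socket has no more data
 * (non-blocking sockets only), or the peer closes the connection.
 * Stores the byte count in *bytes_read and any errno in *errno_back. */
int read_buffered(int fd, char *buffer, size_t buf_len,
                  size_t *bytes_read, int *errno_back)
{
    size_t so_far = 0;
    int status = READ_OK;

    assert(buffer != NULL && bytes_read != NULL && errno_back != NULL);
    *errno_back = 0;
    while (so_far < buf_len) {
        ssize_t n = read(fd, buffer + so_far, buf_len - so_far);
        if (n > 0) {
            so_far += (size_t)n;
        } else if (n == 0) {
            status = READ_EOF;          /* peer closed the connection */
            break;
        } else if (errno == EINTR) {
            continue;                   /* interrupted by a signal: retry */
        } else {
            *errno_back = errno;
            /* EAGAIN/EWOULDBLOCK just means "empty for now" */
            if (errno != EAGAIN && errno != EWOULDBLOCK)
                status = READ_ERR;
            break;
        }
    }
    *bytes_read = so_far;
    return status;
}
```

A caller would typically wait in select() for the socket to become readable, call read_buffered(), process *bytes_read bytes, and repeat until it sees READ_EOF or READ_ERR.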

