UNIX Socket FAQ

A forum for questions and answers about network programming on Linux and all other Unix-like systems

You are not logged in.

#1 2006-05-31 02:16 PM

Lloyd
Member
Registered: 2006-02-13
Posts: 53

Re: STL for Tree

How can I make use of the tree available with C++? I don't know where it resides, please help me.

Can you suggest me any other data structure than linked list (like tree) to access the data elements fastly. My problem is I have multiple key fields. In a tree we arrange the elements based on a field's value. But in my case it is four fields. (I think, here I can't use a tree) so which is the efficient data structure available for me?

Offline

#2 2006-06-01 04:45 AM

Uzume
Administrator
Registered: 2002-08-30
Posts: 186

Re: STL for Tree

I am not entirely sure I understand but I might suggest a hash table.

Offline

#3 2006-07-05 07:24 PM

mlampkin
Administrator
From: Sol 3
Registered: 2002-06-12
Posts: 911
Website

Re: STL for Tree

That many STL impls actually are implementing the set(s) and map(s) as trees... 

Its only ( on some systems / compilers ) e.g. Microsoft when you declare a hash_ set or map that it actually uses what one typically thinks of as a hash table type style...


Michael


"The only difference between me and a madman is that I'm not mad."

Salvador Dali (1904-1989)

Offline

#4 2009-05-22 08:34 AM

yurec
Member
From: Singapore
Registered: 2006-11-16
Posts: 134

Re: STL for Tree

"In a tree we arrange the elements based on a field's value" - that is std::set.
"But in my case it is four fields" - I think you should take a look at boost::multi-index.

Regards,
Yurij

Offline

#5 2009-05-22 04:46 PM

jfriesne
Administrator
From: California
Registered: 2005-07-06
Posts: 348
Website

Re: STL for Tree

Lloyd;19453 wrote:

My problem is I have multiple key fields. In a tree we arrange the elements based on a field's value. But in my case it is four fields. (I think, here I can't use a tree) so which is the efficient data structure available for me?

I'm not clear about what you want to do.  Do you want to:

A) Look up items in the data structure by providing all four fields, e.g.:

   Item * GetItem(const keyType & k1, const keyType & k2, const keyType & k3, const keyType & k4);

or do you want to:

B) Look up items in the data structure by providing any one of the four fields:

   Item * GetItem(const keyType & k);  // k could be any of the four keys held by the returned item

If it's (A), I'd suggest making a struct or class out of the four keys and using that as your key; then you can use any data structure in the standard manner.

If it's (B), one option would be to just insert each item into four different data structures, and then look it up by doing a lookup in each data structure in turn until you find a match.

Jeremy

Offline

#6 2009-05-22 09:52 PM

i3839
Oddministrator
From: Amsterdam
Registered: 2003-06-07
Posts: 2,230

Re: STL for Tree

Guys, you are resurrecting a three years old thread.

(Probably because of the stupid spam bot that bumped it up.)

Offline

#7 2009-05-22 11:58 PM

jfriesne
Administrator
From: California
Registered: 2005-07-06
Posts: 348
Website

Re: STL for Tree

i3839;26752 wrote:

Guys, you are resurrecting a three years old thread.

(Probably because of the stupid spam bot that bumped it up.)

Well geez, it's not like we've got anything better do, except work... :^)

Offline

#8 2009-06-03 05:29 AM

mlampkin
Administrator
From: Sol 3
Registered: 2002-06-12
Posts: 911
Website

Re: STL for Tree

Um...

What eludes me about the resurrection is that a set / etc. from the STL allows a comparator to be specified... which could then be written to do the appropriate comparisons against 1 or more elements contained in the type / object to be stored...

The thing is that I am not anti-Boost but... I don't understand the rationale of adding an external library to duplicate functionality already included with the standard libraries... other than maybe save the programmer from typing two or three extra lines of code... ?!


Michael


"The only difference between me and a madman is that I'm not mad."

Salvador Dali (1904-1989)

Offline

#9 2009-06-03 08:49 AM

yurec
Member
From: Singapore
Registered: 2006-11-16
Posts: 134

Re: STL for Tree

Hi!

Do you mean it's better to use std::find (bla,bla,bla) instead of set.get<[keyType]>().find? And think about complex predicate? Anyhow i don't think it can be done efficient using a single set container. That's why i disagree with your suggestion, dear mr.mlapkin.

I found good example of usage multi_index_container here http://www.weiqigao.com/blog/2006/03/28/boost_multi_index_container.html

Although i have to correct this code a bit.

#include <boost/multi_index_container.hpp>
#include <boost/multi_index/member.hpp>
#include <boost/multi_index/ordered_index.hpp>
#include <string>

 struct Customer {
   Customer(const std::string & i_ssn, const std::string & i_name)
   :  m_ssn(i_ssn), m_name(i_name)
   {
   
   }
   
   std::string m_ssn;
   std::string m_name;
   };
   
 struct SSN { };
 struct Name { };
 
 typedef boost::multi_index_container<
   Customer,
   boost::multi_index::indexed_by<
   boost::multi_index::ordered_unique<
   boost::multi_index::tag<SSN>,
   boost::multi_index::member<Customer, std::string, &Customer::m_ssn>
   >,
   boost::multi_index::ordered_non_unique<
   boost::multi_index::tag<Name>,
   boost::multi_index::member<Customer, std::string, &Customer::m_name>
   >
   >
   >
   MyCustomerDB;

int main() 
{
 typedef MyCustomerDB::index<SSN>::type SSNIndex;
 typedef MyCustomerDB::index<Name>::type NameIndex;
 typedef SSNIndex::iterator SSNIterator;
 typedef NameIndex::iterator NameIterator;
 
 MyCustomerDB db;
 SSNIndex& ssnIndex = db.get<SSN>();
 NameIndex& nameIndex = db.get<Name>();
 
 Customer c1("333-44-5555", "John Smith");
 Customer c2("666-77-8888", "Robert Brown");
 
 db.insert(c1);
 db.insert(c2);
 
 SSNIterator ssnIter = ssnIndex.find("333-44-5555");
 c1.m_ssn = "999-00-1111";
 ssnIndex.replace(ssnIter, c1);

return 0;
}

Yurij

Offline

#10 2009-06-05 03:21 PM

mlampkin
Administrator
From: Sol 3
Registered: 2002-06-12
Posts: 911
Website

Re: STL for Tree

Um... yurec...

I have to ask whats up with the "Dear Mr Lampkin" routine... are you writing a response to my post or a love letter ?! 

:lol::P

Anyway... let me clarify... the user seemed to be saying they wanted a collection keyed on multiple fields within a single object... in which case I think a regular stl style set / similar would work fine ( and part(s) of the stored value can be specified as the key segment )... e.g. I don't think the following is complex at all:

// Some definition of class Foo here to be stored...
struct ltFoo
{
  bool operator()(const Foo f1, const Foo f2) const
  {
    // Compare f1.key field 1 to f2.key field 1 and
    // if equal compare f1.key field 2 to f2.key field 2
    // and if equal compare etc.

    return true | false;
  }
};
int main()
{
    set<Foo, ltFoo> Bar;

    // ... add some Foo

    // Just set the appropriate fields of the finderFoo where is a temp
    // obj or persisted per thread ( or similar ) if desired to be thread-safe
 
    Bar.find( finderFoo ); // returns an iterator to the matching obj

}

On the other hand...

If the user who posted the question several centuries ago wanted to not just key on multiple fields within an object... but also to key on specified and differing fields during find operations a la database style indexing operations... then maybe something like the suggested boost container would be appropriate...

But again... if you re-read the post - thats not what he / she said they wanted to do... and unfortunately... at this late date its a bit hard to get clarification... lol


Michael


"The only difference between me and a madman is that I'm not mad."

Salvador Dali (1904-1989)

Offline

#11 2009-06-05 09:27 PM

yurec
Member
From: Singapore
Registered: 2006-11-16
Posts: 134

Re: STL for Tree

Hi,

The task was to access elements of a container fast using multiple keys.
But you are right, only the user who posted the question can make it clear.

Regards,
Yurij

Offline

#12 2009-06-11 09:27 PM

Nope
Administrator
From: Germany
Registered: 2004-01-24
Posts: 385
Website

Re: STL for Tree

- yawns -

Boost, no Boost. That'll be no longer really relevant once the next specification is out. Because, basically the whole Boost Libs will be then part of the STL.

Allthough, at least I won't be effected by that. I use neither STL nor Boost. Both usually are much too slow for my taste.

Offline

#13 2009-06-11 10:01 PM

jfriesne
Administrator
From: California
Registered: 2005-07-06
Posts: 348
Website

Re: STL for Tree

Nope;26868 wrote:

I use neither STL nor Boost. Both usually are much too slow for my taste.

Too slow to develop with, or too slow at run time?

Assuming the latter.... I was under the impression that both were designed to be as efficient as possible, at least for the general case.  What is it that makes them slower than the code you use in their place?

-Jeremy (who doesn't use STL or boost either, but is curious)

Offline

#14 2009-06-12 07:46 AM

yurec
Member
From: Singapore
Registered: 2006-11-16
Posts: 134

Re: STL for Tree

Anyhow you shoud need some kind of structures, algorithms, smart ptrs (threre are much more staff like this in stl and boost). You say that it's not efficient enough. But i'm sure you make your own analogous reusable code for this kind of things. And If it's not just bla-bla about your mega efficient code, i think boost committee (and milions of other developers) will appreciate your great code!
Don't be so greedy to share it!
Waiting for a new library by mr.Nope ;-)

Best regards,
Yurij

Offline

#15 2009-06-12 09:04 PM

Nope
Administrator
From: Germany
Registered: 2004-01-24
Posts: 385
Website

Re: STL for Tree

rofl.

Too slow at run time and usually not that much faster to develop to accept the lost speed later. Besides, they also waste memory. And, especially if you think about the energy problem in the data centers, faster code means less needed energy, creating less heat and running costs.

No, sorry. No libs from me. You see, what makes self written code faster than general purpose libs, is the fact that the general purpose procedures/classes/functions carry a lot around you you just don't need in most cases. Even worse, often you still have to write additional code to get from them what you really need.

Take the tree example. A simple tree might be enough for a small amount of highly dynamic data. But once you enter the realm of huge data sets, you need a more complex tree design. For example one that balances itself, or not a binary tree but one that uses lists for nodes that share the same sorting key. There are so many kinds of trees, each with it's distinct strong and weak points.
Or another favourite of mine: databases. Surely is a db like MySQL or Oracle a mighty tool. But it's usually overkill, resource hungry and it is outside the application, making it mandatory to waste resources for the communication. I am working with search engines these days, and guess what, they all share one thing... they've all got their own DB engines highly specialised for search speed, because those general purpose ones are just way too slow.

I've written a lot code in the past 28 years and I've read a lot books too. So in most cases I can just throw things together for the needs I have. Often I can borrow parts of code from old projects, sometimes I have even something there just right for the new problem. I can't use that in the job though, there I have to write "code" in Perl, Python, C# and the like. But then, there it's useless to optimize too much anyway.

Offline

#16 2009-06-13 01:35 PM

yurec
Member
From: Singapore
Registered: 2006-11-16
Posts: 134

Re: STL for Tree

Sorrry, i was a bit sarcastic.
For sure there is much sense in your words.
I would say that the libs we are discussing (stl and boost are perfect in 95% ). And you are those 5%.
But for other 95% they are must know and must use.

And i probably  don't have as much experience as you. So that is only my point of view.

Yurij

Offline

#17 2009-06-13 06:09 PM

RobSeace
Administrator
From: Boston, MA
Registered: 2002-06-12
Posts: 3,826
Website

Re: STL for Tree

You just have to think about what your code is going to be doing...  If I'm writing an
end-user GUI app, I don't care as much about super-efficiency, so using a general
purpose solution will be fine as far as speed goes, and easier to code...  But, if I'm
writing a back-end server app of some kind, then I'm likely to care a lot more
about efficiency, and would have no qualms about rolling my own custom solution in
place of a general purpose solution, if I knew it'd be more efficient for my own
specific uses...

Of course, I write pretty much everything in straight C, so I'd never use STL or boost
for anything, anyway... ;-)  Damn OOP wussies... ;-)

Offline

#18 2009-06-14 03:14 PM

Nope
Administrator
From: Germany
Registered: 2004-01-24
Posts: 385
Website

Re: STL for Tree

Yeah, that's it. Server apps are things where I write complete own and optimized code. But also normal apps that need a lot cpu will get at least the power hungry part highly optimized.

For all other things STL might be ok. But I often do compare programming with composing music. And STL is like using pre-recorded sound snippets. To get the flow right you have to bend your code/music to match the needs of those. You'll inevitably get something that doesn't sound right and it isn't really your music either.
So yurec, if you use STL and Boost to finish work to earn money or a good grad, it's really ok. But honestly, if you do write something for yourself that you don't need asap, then go the way to write the one or other thing by yourself. You'll learn a lot and it's the only way to get experience and a better understanding of data structures and how to organize/process/manipulate them in a meaningfull way.
If you are content with a limited understanding and think you won't need or appreciate any thing any deeper, then you could also start using things like the curl lib and forget so low level tasks like socket programming. After all, there's almost for all a lib out there in the web.

:P och Rob. OOP, as well as the whole rest of OOAD isn't as bad as you might think. It actually makes team work and maintenance so much easier if done right. Just give it a try :P

Offline

#19 2009-06-14 06:10 PM

RobSeace
Administrator
From: Boston, MA
Registered: 2002-06-12
Posts: 3,826
Website

Re: STL for Tree

But I often do compare programming with composing music.

It's really a mixed discipline...  It's roughly equal parts rigid science/engineering, and
art/expression...  And, the end result is generally meant to be useful for some practical
purpose, rather than exist purely as a work of art (like music)...  Rather than music, I'd
say that architecture is a better comparison...  You can build a boring pre-fab house
without much thought, or you can design your own custom work of art building (which
is also practically useful as an actual building)...

You'll learn a lot and it's the only way to get experience and a better understanding of data structures and how to organize/process/manipulate them in a meaningfull way.

I agree, but it's also possible to carry this philosophy to extremes...  An assembly
programmer would probably look down on us C coders for being lazy and letting
the compiler do our work for us, instead of writing our own assembly code and
truly deeply understand the workings of the computer...  Embedded developers
(or kernel developers) probably look down on us for relying on a bloated libc with
so many useful features that they have to do without or reinvent on their own...
Basically, it's all a matter of perspective, and unless you're truly coding in raw
machine code, without relying on any external libraries at all, there's someone who
can look down on you, too...

The good thing about programming is you CAN make use of other people's hard
work, without always needing to reinvent the wheel...  Sometimes, your needs for a
wheel are really specific and no one has one that will quite do, so you have to
reinvent your own...  But, most of the time, someone else has already had that same
need and has one ready to use, so you can just grab it and get on with the important
design without worrying about the little fiddly bits...

Now, taken to extreme, you end up with the equivalent of the aforementioned pre-fab
house, which takes no skill or knowledge at all to design or build...  You have a
program with various pre-built bits of code wired together by someone without much
real understanding of how they work, and it's generally a crappy program...  But,
there has to be some compromise...  You need to know what parts are important to
focus your design skills on, and which are just trivial details you are better off using
a pre-built solution for...  To continue the architecture metaphor: you'd be foolish to
try to come up with your own special concrete mixture, or forge your own steel beams,
or grow your own trees to chop down for wood, etc...  Some things, you just need
to let someone else worry about, while you get on with the real important work...
The main trick of programming, I think, is making that determination about which
things are worth your time and which are not...  Lazy/poor programmers will think
nothing is worth their time, and will always use pre-built solutions anytime they can;
but, it can be equally bad to go the opposite direction and NEVER use any pre-built
solutions, too...  You end up wasting your time on trivialities, and never getting your
job done...  Yes, it's good to know how to do the low-level stuff, and everyone should
do so at least once in their life...  But, for everyday work, in general, you want to
avoid as much low-level drudge work as you possibly can, and concentrate on the
overall high-level design of your code...  Sometimes, you'll need to dig into the low-level
stuff, and you need to be able to recognize when that's the case...  But, most of the
time, you want to avoid it, if possible...

OOP, as well as the whole rest of OOAD isn't as bad as you might think. It actually makes team work and maintenance so much easier if done right. Just give it a try

Oh, I've had to try it, in college...  I wasn't impressed... ;-)  I'll take a well designed
C interface over most C++ classes any day...  It's not a horrible idea, I just don't
tend to think along that OOP mindset... *shrug*

Offline

#20 2009-06-16 02:40 AM

jfriesne
Administrator
From: California
Registered: 2005-07-06
Posts: 348
Website

Re: STL for Tree

RobSeace;26909 wrote:

Oh, I've had to try [OOP], in college...  I wasn't impressed... ;-)  I'll take a well designed C interface over most C++ classes any day...  It's not a horrible idea, I just don't tend to think along that OOP mindset... *shrug*

Most of OOP/C++ is syntactic sugar, but the one thing I'd really miss if I was forced to use straight C would be constructors/destructors, since without them automatic resource management is difficult (probably impossible) to achieve.  And without automatic resource management it's way to easy to make that one tiny slip that lands me in memory-leak and/or dangling-pointer hell....

Offline

#21 2009-06-16 01:42 PM

RobSeace
Administrator
From: Boston, MA
Registered: 2002-06-12
Posts: 3,826
Website

Re: STL for Tree

And without automatic resource management it's way to easy to make that one tiny slip that lands me in memory-leak and/or dangling-pointer hell....

*shrug*  Lots of people seem to feel the same way, but I really don't get what's so
bloody hard about it...  It just seems second nature to me, after all these years...  Do
I still fuck up and cause seg-faults now and then?  Of course...  But, I don't blame it
on the language being too hard to use; it's my own fault for screwing it up...  If I didn't
make bugs there, I'd be making them somewhere else...  As long as we have humans
writing code, we'll have bugs...  But, some people seem to look at every category of
bug as if it were a deficiency in the language, rather than just a mistake on the part of
the programmer...  They think, if we can just redesign the language to eliminate this
class of bug, we'll have a far better language, and far fewer bugs...  But, it never works
out that way, does it?  You just get lots of different types of bugs instead...  Plus, generally
the overhead from whatever method you used in the language to eliminate the original
class of bugs is too great to be tolerable for many uses...  And, sometimes, it doesn't
even truly eliminate those bugs completely, either...  It's just a never-ending game
of whack-a-bug, and if taken to its ultimate conclusion, would result in a language
with absolutely no flexibility at all...  It'd have a single command: "exit", and that's all
you are safely allowed to do; anything beyond that is deemed too complex for your
simple human bug-prone programmer mind to comprehend... ;-)

Offline

#22 2009-06-16 07:41 PM

jfriesne
Administrator
From: California
Registered: 2005-07-06
Posts: 348
Website

Re: STL for Tree

RobSeace;26940 wrote:

If I didn't make bugs there, I'd be making them somewhere else...  As long as we have humans writing code, we'll have bugs...

Sure, but that's an oversimplification.  While zero defects is probably an impossible goal, it's still worthwhile to select the tools that will let you minimize the number of bugs generated per unit of functionality created.

By eliminating resource tracking as a source of bugs, I am able to take all the time that I would have spent tracking down memory leaks and random memory corruption crashes and use that time for other things, such as creating more interesting bugs^H^H^H^H features instead (or I can just use it to go home at 5PM and have a life, rather than debugging all night).

But, it never works out that way, does it? You just get lots of different types of bugs instead.

I disagree, sometimes it does work that way.... I starting programming in C, and several years later started using C++ as well (well, to the subset of C++ that I like to use, anyway), and I find that it's much easier to write a program now that "just works" (i.e. without requiring heroic amounts of attention to detail, or hours of debugging) in C++ than in C.

Of course, not every new language feature is worth its own complexity cost, so you need to be very judicious in which features you add, and which ones you use in a given context.

Offline

#23 2009-06-16 08:54 PM

RobSeace
Administrator
From: Boston, MA
Registered: 2002-06-12
Posts: 3,826
Website

Re: STL for Tree

While zero defects is probably an impossible goal, it's still worthwhile to select the tools that will let you minimize the number of bugs generated per unit of functionality created.

Sure, but that's kind of hard to quantify, too...  I mean, if we could accurately count all
the bugs in our code, then we theoretically wouldn't have any left... ;-)

I disagree, sometimes it does work that way....

Sometimes, perhaps...  The addition of prototypes with ANSI C probably led to better
code than prototypeless K&R C, with no real downsides...  But, it just seems that most
of the time, these sort of "protect the programmers from themselves" features end up
tying the hands of programmers who know what they're doing, and encouraging new
programmers to learn less and less about how to do things properly, relying on their
language to hold their hands all the time...  Most importantly for new programmers, I
think it's vital that they be allowed to shoot themselves in the foot repeatedly, and
hopefully learn from it...  Once you get to a certain experience level, then sure you
can choose to make things cushier and easier for yourself, if you like...  But, it seems
like they try to push the easiest, hand-holdingest languages on the new programmers
first, and more and more, many of them never bother to advance beyond that stage,
and are unable to cope with a non-hand-holding environment at all...

Your specific constructor/destructor example wasn't even the main thing that inspired
my rant...  It just got me thinking of all the other hand-holding features everyone likes
to push whenever they trot out the old "C is a horrible language, and no one should
be using it to write apps anymore!" tired argument...  "Keeping track of how big my
buffers are so I don't overflow them is too HARD!  Waaaaaa!"  I want to slap people
every time I hear any variation on that...  Maybe if they learned to program properly
first, they'd be able to handle it...  But, no, they like to project their own weaknesses
onto EVERYONE, and since they can't handle it, NO ONE can, and no one should
be allowed to try! ;-/

But, anyway...  No, I don't have much against constructors/destructors, either...
Though, honestly, I don't see the huge benefit of them, either...  What's wrong with
simply creating *_init() and *_free() functions for whatever objects you are dealing
with, and put all the constructor/destructor guts in there, along with the allocation
and deallocation code?  That's what I generally do for complex structures and
similar things that need any behavior above and beyond simple malloc()/free()...
(And, if you want to allow stack/static versions, have versions that do the work but
without the allocation/deallocation...)  The only trick is remembering to call the
functions when you should...  But, don't you kind of have to do that with C++ anyway
(ie: remember to call "delete")?  So, what is the big improvement?

Offline

#24 2009-06-16 09:57 PM

jfriesne
Administrator
From: California
Registered: 2005-07-06
Posts: 348
Website

Re: STL for Tree

RobSeace;26944 wrote:

"Keeping track of how big my buffers are so I don't overflow them is too HARD!  Waaaaaa!"  I want to slap people every time I hear any variation on that...  Maybe if they learned to program properly first, they'd be able to handle it...

The problem is, even seasoned, professional programmers with decades of experience will occasionally make a mistake, or forget a corner case.  They will do it much less often than the newbies, to be sure, but nobody ever gets their defect ratio down to zero.

And when the resulting code gets out into the real world, it sooner or later is likely to be turned into an exploit, which could end up costing millions of dollars (and conceivably, even human lives).  So there is definitely something to be said for using a programming language in which buffer overflows are impossible to create.  (The tradeoff is usually execution speed, but in many cases the tradeoff is well worth it)

Though, honestly, I don't see the huge benefit of them, either...  What's wrong with simply creating *_init() and *_free() functions for whatever objects you are dealing with, and put all the constructor/destructor guts in there, along with the allocation and deallocation code?

The problem is that sooner or later you (or I, or somebody) will either forget to call a necessary _free() function, or will call it when they should not have called it.  In the former case, you end up with a memory leak (very hard to detect and track down) and in the latter case you end up with memory corruption (also very hard to detect and track down).  In both cases, sooner or later the program will crash, which is what we want to avoid.

The only trick is remembering to call the functions when you should...  But, don't you kind of have to do that with C++ anyway
(ie: remember to call "delete")?  So, what is the big improvement?

Well, that's exactly it .... in C++ you can arrange things so that you don't have to remember to call delete, but still be guaranteed that your object will be deleted when the last reference to it goes away.

As a quick example, here's a snippet of code in C:

// This code may, or may not, contain a bug!
   int s = socket(AF_INET, SOCK_DGRAM, 0);
   if (s >= 0)
   {
      DoSomething(s);
      close(s);
   }

Now the question is:  is that code buggy?  It could be... if DoSomething() calls close() on the socket, then the socket will be close()'d twice.  Or if DoSomething copies the socket FD value into a static variable somewhere, it could get accessed again after the code above called close() on it.  Either of those can be fatal errors, and the only way to ensure they don't occur is by exhaustively checking DoSomething() (and all the code that DoSomething calls) to ensure that they don't do those things.

But of course even if I examine DoSomething() carefully, I have no guarantee that next week some doofus (okay, probably me) won't modify DoSomething() so that it now close()'s the socket.  Doh!

Now, here is what I can do in C++ instead:

// This code is guaranteed not to contain a bug!
   int s = socket(AF_INET, SOCK_DGRAM, 0);
   if (s >= 0)
   {
      SocketHolderRef sRef(new SocketHolder(s));
      DoSomething(sRef);
   }

In this case, there is no possibility of the socket being leaked, and also no possibility of the socket being close()'d twice.  That is because the SocketHolder object contains a reference counter, which is incremented by the SocketHolderRef constructor and decremented by the SocketHolderRef destructor, and only when the reference count gets to zero will the SocketHolderRef destructor delete the SocketHolder object, at which point the SocketHolder destructor will call close() on the socket.

That means that I am guaranteed that DoSomething() will not close the socket (because I can grep my code and see that the only call to close() in the codebase is in the SocketHolder destructor).

It also means that if DoSomething wants to keep the socket open for later use, he can do so by making a copy of the SocketHolderRef object I passed to him.  Since my code never explicitly calls close() on the socket, it is guaranteed that the socket will remain valid for as long as he keeps his SocketHolderRef object.

The same pattern can be used for any dynamically allocated resource (memory, file handles, whatever).  The benefit is that the dynamically allocated resources are now cleaned up just as reliably/automatically as stack variables are, because they are cleaned up automatically by the SocketHolderRef destructors when the stack is unwound.  (and the reference-counting is there to correctly handle the case where more than one thing is referencing the resource).

Of course, it's still possible to subvert the mechanism (e.g. by copying the file descriptor out of the SocketHolder object and calling close() on it), but now you have to go out of your way to do so.  Generally speaking, you no longer have to even think about managing the cleanup of resources, it "just happens" at the appropriate time.  Because of that, it is now possible to write bug-free code even while half-asleep at 3AM... :^)

Jeremy

Offline

#25 2009-06-16 11:18 PM

RobSeace
Administrator
From: Boston, MA
Registered: 2002-06-12
Posts: 3,826
Website

Re: STL for Tree

The problem is, even seasoned, professional programmers with decades of experience will occasionally make a mistake, or forget a corner case.

Of course, but the difference is we don't (often ;-)) whine about it being "too hard" to
do the right thing...  We know that we just fucked up...  Mistakes will always happen,
as long as we humans have any creative input into the process of programming...
(And, even if/when we don't, I'm sure mistakes will still happen, because who will
have programmed the original program-writing bots?  Us error-prone humans... ;-)
Maybe they can evolve away from our flaws after a few generations, though...)  But,
using your mistakes as justification to condemn your language is rather silly...  As
the old saying goes, "It's a poor carpenter who blames his tools"...  If I repeatedly
hit myself in the thumb whenever I try to hammer a nail, and never seem to learn
anything from this, then maybe it's not my hammer that's to blame... ;-)

So there is definitely something to be said for using a programming language in which buffer overflows are impossible to create.

Maybe for some specific uses...  But, in general, if you're writing super-critical code
of the nature you're talking about, you should be doing things a lot more carefully
than your average run-of-the-mill software project, too...  The security will come
from that attention to detail (secure design, written by programmers with lots of
secure coding experience, extensively reviewed and tested), not from just letting
any old ignorant newbie coder off the street write the code in a supposedly "safe"
language, and then pray for the best...

Plus, eliminate buffer overflows in the end-user apps, and they'll just start finding
them in the "safe" language's runtime environment...  Which is typically written in,
*gasp*, an "unsafe" language like C!  (There are plenty of examples of buffer
overflows and int overflows and such in JRE/JVM...)  When you create a self-hosting
"safe" language, whose implementation is completely written in its own language,
then maybe you'll be closer to your ideal...

Besides, if all you care about is eliminating buffer overflows, there are libraries and
compilers that will take care of that for you in plain C, without needing a different
language to do it...  But, if writing super-critical software, you really need to worry
about a whole lot more than just buffer overflows, too...

Well, that's exactly it .... in C++ you can arrange things so that you don't have to remember to call delete, but still be guaranteed that your object will be deleted when the last reference to it goes away.

Interesting...  But, again, you can do reference-counting in straight C, too...  Look at
the GTK+ and glib code sometime; they're loaded with it...

Basically, instead of auditing the unknown DoSomething(), you propose completely
rewriting it to work with a completely different type of data?  I'm not sure how that
would be easier...  But, ok...  I still don't know why you need C++ for that:

typedef struct sock_ref_struct {
    int sock, cnt;
} sock_ref;

sock_ref *sock_ref_open (int family, int type, int protocol)
{
    sock_ref *ref = malloc (sizeof (sock_ref));
    if (!ref) return (NULL);
    ref->sock = socket (family, type, protocol);
    if (ref->sock < 0) { free (ref); return (NULL); }
    ref->cnt = 1;
    return (ref);
}

sock_ref *sock_ref_copy (sock_ref *ref)
{
    if (!ref) { errno = EINVAL; return (NULL); }
    if (ref->cnt < 1) { errno = EINVAL; return (NULL); }
    ref->cnt++;
    return (ref);
}

int sock_ref_close (sock_ref *ref)
{
    if (!ref) { errno = EINVAL; return (-1); }
    if (ref->cnt < 1) { errno = EINVAL; return (-1); }
    ref->cnt--;
    if (ref->cnt == 0) { close (ref->sock); free (ref); }
    return (ref->cnt);
}

...

sock_ref *mysock = sock_ref_open (AF_INET, SOCK_DGRAM, 0);
if (mysock) DoSomething (mysock);
sock_ref_close (mysock);

Yes, your way uses the stack, so you never have to explicitly call any equivalent
to sock_ref_close()...  But, is that really such a huge burden?  You know when you
need to...  And, it's perfectly safe to call it, without your worry about double-close()...
(Which typically isn't a huge problem, anyway...  Unless you've opened up other
FDs in between the two close()'s...)  And, the DoSomething() code will have to be
no more complex than with your way...  If it wants to hold its own reference, it'll have
to call sock_ref_copy() to increment the count, and then sock_ref_close() it when it
is done with it...

*shrug*  I mean, I see some interesting usefullness to your method, but I'm not
convinced it's a huge end-all/be-all deal-breaker type feature...

Offline

Board footer

Powered by FluxBB