UNIX Socket FAQ

A forum for questions and answers about network programming on Linux and all other Unix-like systems

#26 2009-06-17 12:01 AM

jfriesne
Administrator
From: California
Registered: 2005-07-06
Posts: 348

Re: STL for Tree

RobSeace;26946 wrote:

Of course, but the difference is we don't (often ;-)) whine about it being "too hard" to do the right thing...  We know that we just fucked up...

But the whole point of computer languages is to make it easier to do the right thing (and harder to do the wrong thing).  If it wasn't for that, we'd all be writing x86 opcodes directly in binary.  The only question is how far you want to take things (and the answer is, it depends on what you are trying to do).


As the old saying goes, "It's a poor carpenter who blames his tools"...

Yup, but it's also a poor carpenter who uses the wrong tool for the job.

Maybe for some specific uses...  But, in general, if you're writing super-critical code of the nature you're talking about, you should be doing things a lot more carefully than your average run-of-the-mill software project, too...

The problem is that any code that faces the Internet is now "critical code", since any buggy program (no matter how trivial its intended purpose) that accepts data from the Internet might be used as an attack vector.   But that doesn't mean that every code monkey out there writing a web application will suddenly become a genius programmer who never makes mistakes (no matter how much you or I would like them to).

The security will come from that attention to detail (secure design, written by programmers with lots of secure coding experience, extensively reviewed and tested), not from just letting any old ignorant newbie coder off the street write the code in a supposedly "safe" language, and then pray for the best...

Alas, any model that relies on programmer expertise for its security is doomed to failure.  Correct, secure coding in C is hard to get right, even when you're good at programming, and let's face it, most people aren't that good.

Plus, eliminate buffer overflows in the end-user apps, and they'll just start finding them in the "safe" language's runtime environment...  Which is typically written in, *gasp*, an "unsafe" language like C!  (There are plenty of examples of buffer overflows and int overflows and such in JRE/JVM...)

Very true... but that's still a win, because there are only a small number of "safe" runtime environments, and they are (for the most part) written by good programmers.  That means the target profile is much smaller:  there are only a few codebases that must be coded 100% correctly, rather than hundreds of thousands of codebases, and there are only a few dozen programmers that must know how to write secure code, rather than millions.

Interesting...  But, again, you can do reference-counting in straight C, too...  Look at the GTK+ and glib code sometime; they're loaded with it...

Oh, I know... and it sucks, because it's really easy to screw it up.  I had a brief foray into Objective-C (for iPhone programming) and it was appalling:  for every API call where you passed an object, you had to look up in the documentation whether that call would increment the object's reference count, or not, and make sure that your code did the right thing.  The upshot was that you still had to constantly worry about memory leaks and double-frees, because it was so easy to forget that function foo() would increment the refcount but function bar() would not.

Feh.  :^)

Basically, instead of auditing the unknown DoSomething(), you propose completely rewriting it to work with a completely different type of data?  I'm not sure how that would be easier...

Well, in my case, I don't have to rewrite DoSomething() because I wrote it to use the 'safe' ref-counted data types the first time.  Obviously it doesn't work quite so well when calling out to third-party code, but even then you can write a simple wrapper C++ API that gives you the desired 'safe' semantics.

Yes, your way uses the stack, so you never have to explicitly call any equivalent to sock_ref_close()...  But, is that really such a huge burden?  You know when you need to...

In simple code (e.g. a one-file test/example program), manual resource management is not a huge burden, and in that situation automating it is often not worth the effort.

However, in a large program with hundreds of APIs and thousands of files to dig through, trying to remember how every function expects resources to be managed really IS a huge burden to get right 100% of the time.   Especially in cases where you want to transfer resource ownership between multiple layers of API, and still be able to fail cleanly at any time without leaking any resources, it can get very complicated indeed, and the chances of messing something up and causing a problem increase dramatically.

And, it's perfectly safe to call it, without your worry about double-close()... (Which typically isn't a huge problem, anyway...  Unless you've opened up other FDs in between the two close()'s...)

Double-close is a big problem in a multithreaded environment, because a different thread may have reallocated the FD at *exactly* the wrong time, just after your first close() is executed but before your second close() is executed.  The reason the problem is so bad is that when the double-close() bug is present, the actual error happens so rarely that it's extremely difficult to track down, and when it does happen, the cause is not at all obvious.  :^(
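One common way to take the sting out of an accidental second close() in plain C is to invalidate the descriptor variable as part of closing it. The following is only a minimal sketch of that idea; close_once() is a hypothetical helper name, not something from the code discussed in this thread, and it only helps when each descriptor variable has a single owner (it does nothing for two threads sharing the same variable without a lock).

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical helper: closes *fd at most once.
 * After the first call *fd is -1, so a repeated call is a harmless no-op
 * instead of a close() on a descriptor number that another thread may
 * have been handed in the meantime. */
int close_once(int *fd)
{
    int rc = 0;

    assert(fd != NULL);
    if (*fd >= 0) {
        rc = close(*fd);
        *fd = -1;   /* mark as closed even if close() reported an error */
    }
    return rc;
}
```

Calling close_once(&sock) twice in a row then closes the socket once and does nothing the second time.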

And, the DoSomething() code will have to be no more complex than with your way...  If it wants to hold its own reference, it'll have to call sock_ref_copy() to increment the count, and then sock_ref_close() it when it is done with it...

In the C++ environment, sock_ref_close() never needs to be explicitly called, and sock_ref_copy() is just the assignment operator, so if you want to keep the resource open for yourself it's just a matter of:

static SocketHolderRef myRef = theRefArgument;

Of course that's just syntactic sugar, but not having to call sock_ref_close() is a big win because it makes it impossible to forget to call sock_ref_close(), which in turn means that you won't leak memory.

*shrug*  I mean, I see some interesting usefulness to your method, but I'm not convinced it's a huge end-all/be-all deal-breaker type feature...

I'm sure it's not.... but it is something that I find quite useful, that can't be done as well in C.

Jeremy

#27 2009-06-17 12:04 AM

RobSeace
Administrator
From: Boston, MA
Registered: 2002-06-12
Posts: 3,829

Re: STL for Tree

*sigh*  So, of course I had to go and demonstrate a bug in the tiny bit of C code I
typed, as if just to prove your point for you... ;-)

if (ref->cnt == 0) { close (ref->sock); free (ref); }
    return (ref->cnt);

Yeah, obviously you don't want to reference "ref" after freeing it... ;-/  Pretend I had a
"return (0);" inside the curly-braces, or something... ;-)
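For reference, a corrected version might look roughly like the sketch below. Only the field names cnt and sock and the function names sock_ref_copy() and sock_ref_close() come from this thread; the struct name, the sock_ref_new() constructor and the exact signatures are assumptions made up for the example, and nothing here is thread-safe.

```c
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Assumed layout; only the field names cnt and sock appear in the thread. */
struct sock_ref {
    int cnt;    /* number of outstanding references */
    int sock;   /* the descriptor being shared */
};

/* Hypothetical constructor: wraps an open descriptor with a count of 1. */
struct sock_ref *sock_ref_new(int sock)
{
    struct sock_ref *ref = malloc(sizeof *ref);

    if (ref != NULL) {
        ref->cnt = 1;
        ref->sock = sock;
    }
    return ref;
}

/* Take an additional reference. */
struct sock_ref *sock_ref_copy(struct sock_ref *ref)
{
    assert(ref != NULL);
    ref->cnt++;
    return ref;
}

/* Drop one reference; returns the number of references remaining. */
int sock_ref_close(struct sock_ref *ref)
{
    int remaining;

    assert(ref != NULL && ref->cnt > 0);
    remaining = --ref->cnt;     /* read the count BEFORE any free() */
    if (remaining == 0) {
        close(ref->sock);
        free(ref);              /* ref must not be touched after this */
    }
    return remaining;
}
```

The only real change from the buggy snippet is that the return value is copied into a local before the structure can be freed.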

#28 2009-06-17 12:44 AM

RobSeace
Administrator
From: Boston, MA
Registered: 2002-06-12
Posts: 3,829

Re: STL for Tree

But the whole point of computer languages is to make it easier to do the right thing (and harder to do the wrong thing). If it wasn't for that, we'd all be writing x86 opcodes directly in binary.

I always assumed the point of higher level languages was to keep us sane by not
having to think in assembly or machine language on a regular basis... ;-)

Yup, but it's also a poor carpenter who uses the wrong tool for the job.

True...  But, when you have an all-purpose tool that can literally do ANYTHING if you
know how to operate it correctly, it's hard to say "it" is the one that's wrong, if you
aren't capable of using it for that properly...  Especially when some have demonstrated
they can...  (Even though every once in a while, one might saw off his own leg with
it accidentally, or something... ;-))

But that doesn't mean that every code monkey out there writing a web application will suddenly become a genius programmer who never makes mistakes (no matter how much you or I would like them to).

No, I know...  And, again, it's not about making mistakes; EVERYONE makes
mistakes...  It's about accepting responsibility for them and fixing them, rather than
crying that it's too hard...

But, yes, some people perhaps shouldn't be allowed to play with the grown-ups'
toys, like C... ;-)  Fine, if they're the sort of code-monkey who just doesn't care and
isn't going to ever care, keep them on their padded playground, in full protective
gear...  But, their code is STILL going to suck, I guarantee you...  All the holes that
the language itself traps and deals with may go away, but they'll likely find some
other way to screw things up...

Alas, any model that relies on programmer expertise for its security is doomed to failure.

Any security model is doomed to failure, ultimately...  Given enough resources, an
attacker can break through anything...  Still, I'd rather have experienced programmers
writing my secure code rather than the disinterested web-monkeys from the previous
remark...  Do you think OpenSSH would become more secure if we took it away
from the experienced OpenBSD people, and let random Java coders rewrite it in
Java instead of C?

The upshot was that you still had to constantly worry about memory leaks and double-frees, because it was so easy to forget that function foo() would increment the refcount but function bar() would not.

That just sounds like a poor design to me, then...  And, ObjC != C, either...  It's as
different a language as C++, at least...

It's still perfectly possible to design a reasonable API that behaves the way you want
in plain C, I think...  The only thing you'd miss out on is the auto-close/free behavior
when running out of scope...  Which, admittedly is nifty and potentially very useful...
But, given that your scope is always going to be pretty limited (unless you're the type
who likes multi-thousand-line do-it-all functions), it doesn't seem too hard to remember
to do the explicit close/free call at the trailing "}" (or sooner)...

However, in a large program with hundreds of APIs and thousands of files to dig through, trying to remember how every function expects resources to be managed really IS a huge burden to get right 100% of the time.

It can be a pain, but if designed well from the start, it shouldn't be too bad...  Of
course, we've all had code drift from clean, simple designs to horrible, Frankensteinian
monster code over time, too...  So, yeah, I get what you're saying...

I just don't know how your way is any better...  That code still all has to agree to be
using your magic ref-counting class properly in order for it to all work together
properly...  So, it could instead just be all using a similar not-so-magic set of
ref-counting lib functions instead...  As long as everyone follows this one simple,
clear, hopefully never needing to change API, it should all work as well as your
way, with just as little worry... *shrug*

#29 2009-06-17 12:51 AM

i3839
Oddministrator
From: Amsterdam
Registered: 2003-06-07
Posts: 2,230

Re: STL for Tree

I don't have as much programming experience as either Rob or Jeremy, but
I started with C++ and ended up at C.

What I detest in both "solutions" is that they both make simple code more
complicated than needed.

Making code more complicated is never a good way of making code more
secure. And making things more complicated than needed is what C++ code
seems to do all the time.

In this example, a socket never comes alone, there's almost always some
other associated data. Then it's really easy to simplify resource handling
because the lifetime of both some state structure and the socket is the
same. Having a good code design with consistent rules is the right solution,
not some lame "let's wrap it and then forget about it". Bah.
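A minimal sketch of that kind of design in C, assuming a made-up struct conn with one create and one destroy function (all names here are hypothetical, chosen only for illustration): the socket and its associated buffer are set up together and torn down together, so there is exactly one place that closes and frees.

```c
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical per-connection state: socket and buffer share one lifetime. */
struct conn {
    int    sock;
    char  *buf;
    size_t buf_len;
};

/* Takes ownership of sock on success only; on failure the caller still owns it. */
struct conn *conn_create(int sock, size_t buf_len)
{
    struct conn *c;

    assert(buf_len > 0);
    c = calloc(1, sizeof *c);
    if (c == NULL)
        return NULL;
    c->buf = malloc(buf_len);
    if (c->buf == NULL) {
        free(c);
        return NULL;
    }
    c->sock = sock;
    c->buf_len = buf_len;
    return c;
}

/* The single place where the socket is closed and the memory freed. */
void conn_destroy(struct conn *c)
{
    if (c == NULL)
        return;
    if (c->sock >= 0)
        close(c->sock);
    free(c->buf);
    free(c);
}
```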

#30 2009-06-17 01:36 AM

jfriesne
Administrator
From: California
Registered: 2005-07-06
Posts: 348

Re: STL for Tree

RobSeace;26949 wrote:

True...  But, when you have an all-purpose tool that can literally do ANYTHING if you know how to operate it correctly, it's hard to say "it" is the one that's wrong, if you aren't capable of using it for that properly...  Especially when some have demonstrated they can...  (Even though every once in a while, one might saw off his own leg with it accidentally, or something... ;-))

Let's just say there is a reason why most people prefer to use a microwave oven to cook their food, rather than a flaming pit :^)

No, I know...  And, again, it's not about making mistakes; EVERYONE makes mistakes...  It's about accepting responsibility for them and fixing them, rather than crying that it's too hard...

But, it's also about learning from your mistakes, and improving your process so that you won't repeat them in the future.  The best way to avoid human error is to delegate tedious, error-prone tasks to the computer.... in fact, some would argue that that is the definition of programming.

Do you think OpenSSH would become more secure if we took it away
from the experienced OpenBSD people, and let random Java coders rewrite it in Java instead of C?

Nope, but I do think OpenSSH would probably be more secure if it was written in Java (by people as qualified in Java as the current OpenSSH authors are qualified in C), and of course if it was running on a bug-free JVM.

Not that I'm a Java fan, mind you, but C code is vulnerable to buffer overflows, and Java isn't.

That just sounds like a poor design to me, then...  And, ObjC != C, either...  It's as different a language as C++, at least...

Probably so, but I think a C API doing the same thing would eventually be liable to suffer from the same problem, unless its growth was managed very carefully.

But, given that your scope is always going to be pretty limited (unless you're the type who likes multi-thousand-line do-it-all functions), it doesn't seem too hard to remember to do the explicit close/free call at the trailing "}" (or sooner)...

Well, if you're the type that likes to have more than one exit point in his function, then you need to remember to repeat the close/free call at every exit point, or use a "goto" or something to skip to a cleanup section.  Either way is rather error prone and ugly.

I just don't know how your way is any better...  That code still all has to agree to be using your magic ref-counting class properly in order for it to all work together properly...

In the C++ case, "using it properly" means using the Ref objects almost the same way as one would use ints or floats.  (I say "almost" because my implementation doesn't include copy-on-write, but one could add that if one wanted).  So it actually is harder to get it wrong than it is to get it right, and therefore one tends to get it right more often than not.

-Jeremy

#31 2009-06-17 01:39 AM

jfriesne
Administrator
From: California
Registered: 2005-07-06
Posts: 348

Re: STL for Tree

i3839;26950 wrote:

In this example, a socket never comes alone, there's almost always some other associated data. Then it's really easy to simplify resource handling because the lifetime of both some state structure and the socket is the same.

Ah, but that doesn't solve the problem, it only moves it up one layer.  Now there needs to be some code that remembers to free the structure-that-contains-the-socket.  Also, the author of the structure-that-contains-the-socket needs to remember to close the socket when the structure is done with it.   In both cases, it's possible even for a good programmer (on a bad day) to forget to do these things, and presto, memory leak...

#32 2009-06-17 08:46 AM

yurec
Member
From: Singapore
Registered: 2006-11-16
Posts: 134

Re: STL for Tree

Impressive discussion, guys.
However, this is a C++ thread.

Regarding OOP versus non-OOP: after one of my early programs, a JavaScript chess game, I found that functional coding is a big mess in comparison to the OOP mindset.

The progress of languages that are easy to learn, fast to code in, and produce many times cheaper applications that satisfy user needs proves that OOP is a really good idea.

But there will still be cars that are handmade (exclusive, of the highest quality, and incredibly expensive).

#33 2009-06-17 10:47 AM

i3839
Oddministrator
From: Amsterdam
Registered: 2003-06-07
Posts: 2,230

Re: STL for Tree

jfriesne;26952 wrote:

Ah, but that doesn't solve the problem, it only moves it up one layer.  Now there needs to be some code that remembers to free the structure-that-contains-the-socket.  Also, the author of the structure-that-contains-the-socket needs to remember to close the socket when the structure is done with it.   In both cases, it's possible even for a good programmer (on a bad day) to forget to do these things, and presto, memory leak...

Umm, considering that it only needs to happen at top level, you would have
a memory leak in any other language as well, cause you would have dangling
references. If you actually use the thing for something then, oh horror, you
need to think of many things. Closing the socket and freeing the memory
are the least of your problems, and also the easiest to fix and detect when
it happens. Of course if your program has a crappy design and it all gets
very messy I'm sure you want some "help" in "getting it right".

yurec wrote:

The progress of languages that are easy to learn, fast to code in, and produce many times cheaper applications that satisfy user needs proves that OOP is a really good idea.

That is e.g. Python, not C++ or Java, and that is good for other reasons
than it supporting classes. Like making good built-in support for data
structures an integral part of the language, which in Java is sadly lacking.

The mistake you make is thinking that merely adding OOP support to a
language means that all code in that language is OOP and all code in others
isn't OOP. OOP is a way of programming, not a programming language
property.

Some of you are arguing for languages which have automatic memory
management. C++ isn't one of them. And Java isn't high level enough
to be much of an improvement, it's mostly just inflexible.

EDIT: Additionally, OOP is good because it adds structure to the code.
But the world isn't black and white, not being OOP doesn't mean that
the code should be an inconsistent mess.

#34 2009-06-17 01:33 PM

RobSeace
Administrator
From: Boston, MA
Registered: 2002-06-12
Posts: 3,829

Re: STL for Tree

Let's just say there is a reason why most people prefer to use a microwave oven to cook their food, rather than a flaming pit :^)

And, I think most people would also agree that barbecued food is far superior to
microwaved food, as well...  There's a trade-off you're making for ease and simplicity
and "safety" (perceived or real); and that's quality...

But, it's also about learning from your mistakes, and improving your process so that you won't repeat them in the future.

Of course, one would hope one learns from past mistakes...  It may not completely
prevent you from making them in the future (we're still only human, after all), but if
nothing else, you'll probably be able to recognize when you've made them a lot faster
and easier than before...

The best way to avoid human error is to delegate tedious, error-prone tasks to the computer....

Using code written by an error-prone human?? ;-)

No, I do completely agree...  And, I spend much of my time writing up simple tools
to automate various tedious tasks...  It's definitely a good idea, and something we
all should do...

Nope, but I do think OpenSSH would probably be more secure if it was written in Java (by people as qualified in Java as the current OpenSSH authors are qualified in C),

See, I think that's just a completely bullshit claim...  You're hypothesizing that it'd be
more secure merely because the language theoretically eliminates one class of bugs
not eliminated by C...  Despite the fact that the C developers religiously use only
"safe" bounds-checking functions, and so there's almost zero chance of that particular
class of bug popping up to start with...  Meanwhile, what new potential bugs does
Java bring with it that weren't an issue in C?  I'm not at all a Java person, so I can't
speculate with great authority, but just as an example, I've heard most Java code tends
to be multi-threaded for various reasons (the only way to deal with multiplexing
multiple sockets?)...  Multi-threaded code brings with it a whole HOST of potential
problems, with things like race conditions and locking problems and such...  So, just
maybe this hypothetical Java OpenSSH would end up being a whole lot LESS secure
than the C version...  Or maybe not...  The point is, you can't say...  And, trying to
claim it just based on hypothetical language features is pure bullshit...  The code and the
programmers who write it are all that matters; the language itself is almost irrelevant...
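To make "bounds-checking functions" concrete, here is a minimal sketch of the general style, using standard snprintf() as a stand-in. It is not taken from OpenSSH, and bounded_copy() is a hypothetical name: the point is only that the destination size is always passed along and truncation is reported rather than silently overrunning the buffer.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: copies src into dst, which holds dst_size bytes.
 * Returns 0 on success, -1 if the string had to be truncated or on error.
 * dst is NUL-terminated whenever snprintf() itself succeeds. */
int bounded_copy(char *dst, size_t dst_size, const char *src)
{
    int n;

    assert(dst != NULL && src != NULL && dst_size > 0);
    n = snprintf(dst, dst_size, "%s", src);
    if (n < 0 || (size_t)n >= dst_size)
        return -1;      /* output error, or src did not fit */
    return 0;
}
```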

and of course if it was running on a bug-free JVM.

Right...  And, such a thing exists where?

Probably so, but I think a C API doing the same thing would eventually be liable to suffer from the same problem, unless its growth was managed very carefully.

Well, that is part of our job: to carefully manage stuff like this...  Especially when it
comes to heavily used APIs...  They need to be well thought out and designed
properly from the start, and any future changes need to be very carefully planned
and executed...  If, over time, it morphs into something unpleasant to use, then it
simply needs to be scrapped and redesigned in a better way...  Half of my time is
spent scrapping and rewriting old shitty code...  (Probably more than half, these
days, since we're moving platforms, and instead of blindly porting old shit, I'm taking
the opportunity to redesign it wherever it makes sense...)

Well, if you're the type that likes to have more than one exit point in his function,

Usually a bad idea, and I avoid it whenever possible in real code...  (The quick and
dirty snippets posted above were just off-the-cuff junk code...)

then you need to remember to repeat the close/free call at every exit point,

If you need it for the life of the function, yes...  But, if you have limited need for it (as
in your simple example with scope limited to a tiny local block), you can just call it
once right at that point...  But, if you need it for the life of the function, then yes you
have to treat it the same as you would any other non-automatic data you need for
the life of the function...  We have to deal with this all the time with opened FDs,
allocated memory, etc...  It's nothing new or unusual or hard to deal with...

or use a "goto" or something to skip to a cleanup section. Either way is rather error prone and ugly.

I can see goto being "ugly", but how exactly is it "error prone"?  It's not likely you're
going to accidentally jump to some completely unrelated goto point somewhere...  It
seems pretty straightforward and error-free to me... *shrug*

However, I prefer to avoid that approach whenever possible, as well...  I'd generally
prefer to set some error code or similar, and then all future processing within the
function tests that before continuing...  Eg:

int some_func (...)
{
    int ret = 0;

    /* ... */

    if (some_bad_thing)
        ret = ERR_SOME_BAD_THING;

    if (!ret) {
        /* ... do more real work ... */
        if (some_other_bad_thing)
            ret = ERR_SOME_OTHER_BAD_THING;
    }

    if (!ret) {
        /* ... do more real work ... */
    }
    /* ... */

    /* ... now do whatever cleanup is needed ... */
    return (ret);
}

That can also get ugly in some cases...  In which case the goto approach is often
much cleaner and nicer, believe it or not...  (Just because goto should generally be
avoided doesn't mean it's pure evil and never has its uses...)
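For comparison, a minimal sketch of the goto approach, assuming a made-up function read_prefix() that needs both a file descriptor and a temporary buffer (the function and its parameters are hypothetical, chosen only to have two resources to release): every failure jumps to the same label, and the cleanup code is written once.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical example: read up to out_size bytes from path into out.
 * Returns the number of bytes read, or -1 on any failure. */
long read_prefix(const char *path, char *out, size_t out_size)
{
    long    ret = -1;
    int     fd = -1;
    char   *tmp = NULL;
    ssize_t n;

    assert(path != NULL && out != NULL && out_size > 0);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        goto cleanup;

    tmp = malloc(out_size);
    if (tmp == NULL)
        goto cleanup;

    n = read(fd, tmp, out_size);
    if (n < 0)
        goto cleanup;

    memcpy(out, tmp, (size_t)n);
    ret = (long)n;

cleanup:
    /* Both calls are safe whether or not the resource was acquired. */
    free(tmp);
    if (fd >= 0)
        close(fd);
    return ret;
}
```

Initializing fd to -1 and tmp to NULL up front is what lets a single cleanup label serve every exit path.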

Some people also define a macro which does all their clean-up code and returns
from the function, and just always use that macro anywhere they want to return...
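A rough sketch of the macro variant, with every name invented for the example (CLEANUP_RETURN and compare_copies() are hypothetical): the macro assumes the enclosing function calls its two buffers a and b, which is exactly the kind of hidden convention that makes this style fragile. Note that the return value has to be evaluated before the buffers are freed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical macro: evaluates val, frees the function's locals a and b,
 * then returns. val is computed first because it may still read a or b. */
#define CLEANUP_RETURN(val)          \
    do {                             \
        int cleanup_ret_ = (val);    \
        free(a);                     \
        free(b);                     \
        return cleanup_ret_;         \
    } while (0)

/* Returns 0 if the copies compare equal, 1 if they differ,
 * -1 if an allocation failed. */
int compare_copies(const char *x, const char *y)
{
    char *a = NULL;
    char *b = NULL;

    assert(x != NULL && y != NULL);

    a = malloc(strlen(x) + 1);
    if (a == NULL)
        CLEANUP_RETURN(-1);
    b = malloc(strlen(y) + 1);
    if (b == NULL)
        CLEANUP_RETURN(-1);

    memcpy(a, x, strlen(x) + 1);
    memcpy(b, y, strlen(y) + 1);
    CLEANUP_RETURN(strcmp(a, b) == 0 ? 0 : 1);
}
```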

Additionally, OOP is good because it adds structure to the code.

It's ONE way to add structure to code...  It's far from the only way, or necessarily
the "best" way...

And, yes, you can do OOP in C just as well as C++, just without some of the syntactic
sugar...  (Again, the GTK+ libs are a good example of this...)  It's just not the only
way to go, and you can have perfectly clear, structured, usable non-OOP APIs...

#35 2009-06-17 02:19 PM

i3839
Oddministrator
From: Amsterdam
Registered: 2003-06-07
Posts: 2,230

Re: STL for Tree

RobSeace;26962 wrote:


It's ONE way to add structure to code...  It's far from the only way, or necessarily
the "best" way...

And, yes, you can do OOP in C just as well as C++, just without some of the syntactic
sugar...  (Again, the GTK+ libs are a good example of this...)  It's just not the only
way to go, and you can have perfectly clear, structured, usable non-OOP APIs...


Exactly, thank you for better formulating what I tried to say.

Personally I think more in terms of modules and data structures and
their interactions, preferring to keep code and data as separate entities
in my mind instead of throwing it on one heap and calling it all objects.

#36 2009-06-17 05:52 PM

jfriesne
Administrator
From: California
Registered: 2005-07-06
Posts: 348

Re: STL for Tree

RobSeace;26962 wrote:

Despite the fact that the C developers religiously use only "safe" bounds-checking functions, and so there's almost zero chance of that particular class of bug popping up to start with...

Google says you are wrong: 

http://www.google.com/search?hl=en&clie … f&oq=&aqi=

I've heard most Java code tends to be multi-threaded for various reasons (the only way to deal with multiplexing multiple sockets?)...

That was a valid criticism of Java, but newer versions of Java added select()-style APIs so that you can now do single-threaded Java networking programs the same way you would in C/C++.

And, trying to claim it just based on hypothetical language features is pure bullshit...  The code and the programmers who write it are all that matters; the language itself is almost irrelevant...

I respectfully disagree.  In a language where it is impossible to make mistake X, mistake X will not be made.  Yes, it will always be possible for a lousy programmer to write buggy code.  But it is also true that some languages are more error-prone than others, and that the more error-prone a language is, the more errors people will make while using it.  (if you don't believe me, try re-writing one of your favorite C programs in brainfuck and see how things go :^))


Right...  And, [a bug-free JVM] exists where?

In the NSA's top-secret sealed vault, right next to the bug-free C compiler.... :^)

[Multiple return points are] usually a bad idea, and I avoid it whenever possible in real code...  (The quick and dirty snippets posted above were just off-the-cuff junk code...)

The reason they are usually considered a bad idea is because in procedural languages they make it really easy to forget to do your cleanup/rollback work.  Once your cleanup/rollback work is done for you automatically (by reference counting in C++ or by a garbage collector in Java, or whatever) they become a good deal safer and more usable.

But, if you need it for the life of the function, then yes you
have to treat it the same as you would any other non-automatic data you need for the life of the function...  We have to deal with this all the time with opened FDs, allocated memory, etc...  It's nothing new or unusual or hard to deal with...

You "have to deal with that stuff all the time" in C, but not in other languages.  It's something that a C++ compiler (or Java JVM or etc) can take care of for you, just like the C compiler takes care of mapping variables to registers.  With the computer handling that detail, those programmer-brain-cycles are freed up to worry about other things.

Some people also define a macro which does all their clean-up code and returns from the function, and just always use that macro anywhere they want to return...

Those are all ways to manually ensure the cleanup code is executed, and they are fine as far as they go... but IMHO they are all inferior to a system where the compiler automatically executes the cleanup code for you at the proper time, since when it is done automatically it is impossible (well, okay, much harder) to accidentally do it wrong.

Jeremy

#37 2009-06-17 08:42 PM

RobSeace
Administrator
From: Boston, MA
Registered: 2002-06-12
Posts: 3,829

Re: STL for Tree

Personally I think more in terms of modules and data structures and
their interactions, preferring to keep code and data as separate entities
in my mind instead of throwing it on one heap and calling it all objects.

Yes, exactly...  Me too...

Google says you are wrong:

Most of those are duplicate stories about the same 2 real bugs, which are the only
ones I find that anyone could generously call a "buffer overflow"...  One involves
memset() and the other memcpy()...  Neither are traditional standard "buffer overflows"
one normally thinks of (eg: using strcpy() or other unbounded function), but are due
to using incorrect bounds, usually resulting from integer overflow or similar...  And,
one only affects protocol 1 if a non-standard feature is enabled, and privilege
separation is disabled...  And, the other doesn't seem exploitable for anything...
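To illustrate the general class of mistake (this is not the actual OpenSSH code), the usual shape is a length computed by multiplication that silently wraps around, after which the memcpy() or memset() bound is wrong even though a bound was supplied. A minimal sketch of the standard guard, with checked_mul() as a hypothetical helper name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: computes a * b into *out.
 * Returns 0 on success, -1 (leaving *out untouched) if the product
 * would not fit in a size_t. */
int checked_mul(size_t a, size_t b, size_t *out)
{
    assert(out != NULL);
    if (b != 0 && a > SIZE_MAX / b)
        return -1;      /* a * b would wrap around */
    *out = a * b;
    return 0;
}
```

A caller would run the element count and element size through this check before handing the result to malloc() or memcpy().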

So, ok, 2 != 0, but I did say "almost zero", which I think is close enough... ;-)

In a language where it is impossible to make mistake X, mistake X will not be made.

Sure, but maybe mistake Y will be made instead, where it wouldn't have before...

Yes, it will always be possible for a lousy programmer to write buggy code.

Exactly...  And, I'd submit that the ONLY thing you'll ever get out of a lousy programmer
is lousy code...  So, trying to dumb down languages to allow for more lousy programmers
is a losing proposition: it will ultimately result in worse code...  Because, when you
get down to it, what matters most is the programmer writing the code...

Now, good programmers using such languages may indeed write better code than
they would in other languages...  But, my main gripe is that it does allow for more
and more subpar programmers to proliferate, because they can get away with a
lot more bad code that's handled for them...  Basically, like I said before, I would
like to see all new programmers restricted to "shoot yourself in the foot" languages,
like C, until they've learned how to actually program well, then allow them to move
on to the so-called "safe" languages, if they want...  But, things don't work out that
way...  It's like learning to drive on an automatic, then being completely unable to
ever drive a manual...  (Which, BTW, applies to me, too... ;-)  Of course, I haven't
driven anything in so long, I don't know if I even remember how, at this point...)

but IMHO they are all inferior to a system where the compiler automatically executes the cleanup code for you at the proper time, since when it is done automatically it is impossible (well, okay, much harder) to accidentally do it wrong.

Sure, having it done automatically at return time is certainly easier...  And, I wouldn't
be at all opposed to a C extension that allowed you to register functions to be called
at function return time (or even block closing time), similar to atexit()...  But, sometimes,
the overhead of a function call at that point is too much to bear, as well, so lots of
code would still not use it...  But, having the option to might be nice...  (And, if it let
you inline the functions, it'd remove any performance issues, too...)
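For what it's worth, GCC and Clang do offer something along these lines as a non-standard extension: the cleanup variable attribute, which runs a given function when the variable goes out of scope. A minimal sketch, assuming one of those compilers; close_fd() and probe_file() are hypothetical names made up for the example, and the used_fd parameter exists only so the effect can be observed from outside.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Cleanup handler: receives a pointer to the variable leaving scope. */
static void close_fd(int *fd)
{
    if (*fd >= 0)
        close(*fd);
}

/* Hypothetical example: returns 0 if path could be opened, -1 otherwise.
 * If used_fd is non-NULL it receives the descriptor number that was used. */
int probe_file(const char *path, int *used_fd)
{
    assert(path != NULL);

    /* Compiler extension (GCC/Clang), not standard C. */
    int fd __attribute__((cleanup(close_fd))) = open(path, O_RDONLY);

    if (used_fd != NULL)
        *used_fd = fd;
    if (fd < 0)
        return -1;      /* close_fd(&fd) still runs, and does nothing */
    return 0;           /* close_fd(&fd) runs here and closes the file */
}
```

Since the handler is an ordinary static function, the compiler is free to inline it, so the overhead concern may not apply in practice; that part is an expectation about optimizers rather than a guarantee.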

But, really, I think it's not THAT much of a burden to do things manually, either...
Once you understand C coding, you know you have to do such clean-up, so you
tend to code things with that in mind...  Yes, you'll sometimes screw up and forget
something...  But, I still refuse to see that as an indictment of the language...  Any
more than I'd blame a manual transmission car for the driver forgetting to shift to a
higher gear when they should...

Offline

#38 2009-06-17 10:34 PM

jfriesne
Administrator
From: California
Registered: 2005-07-06
Posts: 348
Website

Re: STL for Tree

RobSeace;26968 wrote:

But, I still refuse to see that as an indictment of the language...  Any more than I'd blame a manual transmission car for the driver forgetting to shift to a higher gear when they should...

It's not an indictment; C definitely has its place.  But if we're going to make car analogies, C is like an older car with a manual choke-control lever.  Newer cars all have automatic choke control, so that the user never has to deal with it... and (AFAIK) nobody who drives the newer cars misses the old manual method.

Offline

#39 2009-06-17 11:09 PM

RobSeace
Administrator
From: Boston, MA
Registered: 2002-06-12
Posts: 3,829
Website

Re: STL for Tree

and (AFAIK) nobody who drives the newer cars misses the old manual method.

Yeah, and that's why your analogy doesn't hold...  Because, I assure you, there are
MANY things I would dearly miss about C, if I were forced to write code in some other
language on a regular basis!  (And, I know I'm not alone in that, either...)

No one minds getting rid of useless and annoying features...  It's the ones that may
be hard to use properly, but are very powerful and useful in many cases that many
people would dearly miss if taken away...  That's exactly why I said manual transmission
vs. automatic...  There are a lot of people who refuse to drive automatics, and think a
manual gives you far more control over the car, and an experienced driver of one is
a lot safer than anyone in an automatic...

For instance, glibc has essentially taken away gets()...  No one misses it...  It was a
horrible idea from the start...  But, you take away my pointers (and the ability to
accidentally cause mayhem that comes with them), and I'll hurt you... ;-)  They're far
too useful to ever give up, and that's why I would never want to write Java code for
a living...

Offline

#40 2009-06-17 11:27 PM

jfriesne
Administrator
From: California
Registered: 2005-07-06
Posts: 348
Website

Re: STL for Tree

RobSeace;26970 wrote:

Because, I assure you, there are MANY things I would dearly miss about C, if I were forced to write code in some other
language on a regular basis!

Yeah, me too, which is why I don't use Java unless I have to.  OTOH, you can use C++ without losing access to any of the features of C, since C++ is (more or less) a superset of C.

That way, the low-level C stuff is there if you need that level of control, plus you also have access to higher-level methodologies for the times when they are more appropriate.

Of course, some people would go even further and write all their high-level code in Python, with callouts down to C for the high-performance bits... but never mind them :^)

Offline

#41 2009-06-18 11:38 AM

i3839
Oddministrator
From: Amsterdam
Registered: 2003-06-07
Posts: 2,230

Re: STL for Tree

jfriesne;26971 wrote:

Of course, some people would go even further and write all their high-level code in Python, with callouts down to C for the high-performance bits... but never mind them :^)

I don't mind using C++ just like C but with classes. But the rest mostly just
makes everything more complicated and unreadable (weird inheritance
schemes, templates, STL, etc).

Either you want full speed and full control, then use C or minimal C++. Or
if you want more "higher level" convenience, why the hell would you use
something like full C++ or Java then? Go straight to real high level stuff like
Python which can actually make programming quicker and/or easier. And if
you do have only a few high performance bits, then yes, implementing some
parts in C and using that from the rest gives the best of both worlds.

As a last point I'd like to mention that in reality it's often important which
standard/existing libraries exist to take a lot of work off your hands. The
language they are written in decides a lot.

Offline

#42 2009-06-18 02:14 PM

RobSeace
Administrator
From: Boston, MA
Registered: 2002-06-12
Posts: 3,829
Website

Re: STL for Tree

Yeah, me too, which is why I don't use Java unless I have to. OTOH, you can use C++ without losing access to any of the features of C, since C++ is (more or less) a superset of C.

That way, the low-level C stuff is there if you need that level of control, plus you also have access to higher-level methodologies for the times when they are more appropriate.

Yeah, sure...  But, C++ also has a few little quirks that are different than plain C,
which tend to irritate me...  Eg: it stupidly doesn't auto-cast to/from void*, but forces
you to manually cast...  Which defeats the entire purpose of void*!  (Maybe it's only
when casting FROM void* that it's stupid...  I forget...)  And, it only accepts ANSI style
function definitions, whereas it's my long-ingrained habit to use K&R style function
definitions (accompanied by ANSI prototypes, of course) most of the time (basically
for everything except static functions, which I do ANSI style to make the definition be
its own prototype)...  And, I think there are a few other quirks that annoy me enough
to stick with a plain C compiler...

Offline

#43 2009-06-18 06:51 PM

jfriesne
Administrator
From: California
Registered: 2005-07-06
Posts: 348
Website

Re: STL for Tree

RobSeace;26976 wrote:

Eg: it stupidly doesn't auto-cast to/from void*, but forces you to manually cast...  Which defeats the entire purpose of void*!  (Maybe it's only when casting FROM void* that it's stupid...  I forget...)

That, as they say, is a feature.  :^)  You are right, btw, it's only when casting from (void *) to (something_specific *) that you have to make the cast explicit.

That requirement is there to make sure that you really do mean to do the cast, and aren't just not paying close enough attention to the types of your variables.

In any case, the use of (void *) is discouraged in C++ (and I think in C as well) since it throws any notion of type-safety out the window.  It's better to use typed pointers whenever possible.  (In C, you sometimes need to use void pointers e.g. for things like qsort(), where a callback function needs to be able to operate on objects of various types.... but in C++ there are better/safer ways to do that, such as templates or abstract base classes)
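
As a rough sketch of the qsort() case in plain C: the comparison callbacks receive const void* and convert back to the real element type without a cast, which is exactly the conversion C++ would reject. The struct point type and the helper names are assumptions made up for this example.

```c
#include <stdlib.h>

struct point { int x; int y; };   /* hypothetical element type */

/* qsort() hands the callback untyped pointers to two elements. */
static int compare_ints(const void *a, const void *b)
{
    const int *ia = a;   /* implicit conversion from void*: fine in C, an error in C++ */
    const int *ib = b;
    return (*ia > *ib) - (*ia < *ib);
}

static int compare_points_by_x(const void *a, const void *b)
{
    const struct point *pa = a;
    const struct point *pb = b;
    return (pa->x > pb->x) - (pa->x < pb->x);
}

/* The same library routine sorts both element types. */
static void sort_ints(int *v, size_t n)
{
    qsort(v, n, sizeof v[0], compare_ints);
}

static void sort_points(struct point *v, size_t n)
{
    qsort(v, n, sizeof v[0], compare_points_by_x);
}
```

Nothing stops a caller from pairing compare_ints with an array of struct point, which is the type-safety hole being discussed.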

Offline

#44 2009-06-18 09:03 PM

RobSeace
Administrator
From: Boston, MA
Registered: 2002-06-12
Posts: 3,829
Website

Re: STL for Tree

That, as they say, is a feature. :^)

And, a damned annoying one...

That requirement is there to make sure that you really do mean to do the cast, and aren't just not paying close enough attention to the types of your variables.

That's a weak rationalization...  All it encourages is blind casting to shut the stupid
compiler up...

If I'm using a void*, then obviously it can represent a pointer to ANYTHING, by
definition...  So, making me explicitly cast it is just making me jump through stupid
hoops for no good reason...  It's not going to make anyone carefully examine what
that void* is really pointing at and verify that it's really ok to cast it; it's going to just
make them blindly cast it to whatever type they're trying to use it as...

(On a semi-related note: one of the greatest things GCC/glibc does is get rid of the
annoying casting to "struct sockaddr *" for all the socket functions, via its magic
"transparent union" construct...  That was also one of my long-standing annoyances
with socket coding, and I used to wish they'd made that void* when they switched
over all the other old K&R libc functions (which were generally char* originally)...)

In any case, the use of (void *) is discouraged in C++ (and I think in C as well) since it throws any notion of type-safety out the window.

In C++, perhaps...  But, in C?  That'd be news to me, if so...  I remember the bad
old days of the K&R era libc functions with char* everywhere (eg: malloc() returned
char*, memcpy() took a char*, etc.), and switching to void* was probably the greatest
thing to come out of ANSI C (possibly tied with full prototypes)...  Like you say, in C
there's really no other feasible way to have functions that can work on arbitrary
data, without forcing silly casting everywhere...  (In cases where there are a limited
number of datatypes that need to be supported, you could use the aforementioned
"transparent union" method with GCC, like the glibc socket functions do...  But, if you
need to work with literally ANYTHING, like malloc(), memcpy(), etc., you really want
void*, and nothing else will do...)  So, I really can't imagine anyone discouraging
its use in such cases...  Yes, it's somewhat dangerous, but it's extremely useful...
Like most of the rest of C's features... ;-)
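
For anyone who hasn't seen the "transparent union" trick, here is a small sketch of the technique, assuming GCC or Clang compiling as C. The address structs and address_bytes() are invented for illustration; these are not glibc's actual declarations.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical address types; only the leading "family" field is shared. */
enum { FAMILY_V4 = 4, FAMILY_V6 = 6 };
struct ipv4_addr { int family; unsigned char bytes[4]; };
struct ipv6_addr { int family; unsigned char bytes[16]; };

/* Compiler extension: a parameter of this type accepts any member type directly. */
typedef union __attribute__((transparent_union)) {
    struct ipv4_addr *v4;
    struct ipv6_addr *v6;
} any_addr_ptr;

/* Returns the number of address bytes, or 0 for an unknown family. */
static size_t address_bytes(any_addr_ptr addr)
{
    int family;

    /* Both structs start with an int, so copy it out before choosing a member. */
    memcpy(&family, addr.v4, sizeof family);

    if (family == FAMILY_V4)
        return sizeof addr.v4->bytes;
    if (family == FAMILY_V6)
        return sizeof addr.v6->bytes;
    return 0;
}
```

A caller writes address_bytes(&some_v4) or address_bytes(&some_v6) with no cast, while a pointer to any type not listed in the union is still rejected, which is the middle ground between per-type casts and a plain void*.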

Offline

#45 2009-06-19 12:12 AM

Uzume
Administrator
Registered: 2002-08-30
Posts: 186

Re: STL for Tree

OK where is the Objective-C section/discussion? Are there no Mac programmers out there (I know that is not the only use but that is the common usage today).

This is great discussion by the way...though I am not sure it should be attached to this ancient thread--but who cares?

Offline

#46 2009-06-19 04:18 AM

jfriesne
Administrator
From: California
Registered: 2005-07-06
Posts: 348
Website

Re: STL for Tree

RobSeace;26978 wrote:

That's a weak rationalization...  All it encourages is blind casting to shut the stupid compiler up...

You could say the same thing about any compiler error or compiler warning that can be suppressed or worked around.  The truth is, only a very careless programmer will do "blind casting to shut the stupid compiler up".  A responsible programmer will recognize that the errors are occurring because he is doing a type-unsafe (and potentially erroneous) operation, and he will make a careful decision about whether he really wants to make the cast or not.  If he does, he makes the cast explicit, which has the side benefit of serving as documentation that he made a conscious choice in the matter.  If not, he changes his code to do things in a more type-safe manner.  Either way, the code quality is improved.

Without the error, the programmer might never even notice his potential mistake -- at least not until things started crashing at run time, and then it might take him quite a while to figure out why.

But, if you need to work with literally ANYTHING, like malloc(), memcpy(), etc., you really want void*, and nothing else will do...)

True, but most code is a little more specialized than that.  So for most code you would use a (my_datatype_t *) or similar.

-Jeremy

Offline

#47 2009-06-19 10:44 AM

i3839
Oddministrator
From: Amsterdam
Registered: 2003-06-07
Posts: 2,230

Re: STL for Tree

Okay, although I don't care much about C++ forcing people to cast, I'll
bother to reply anyway...

The reason it's totally useless and hence only pollutes the code is because
the coder writing that code is very aware that it's a void*. And a void* isn't
typeless, it's still a pointer, so most of the type info is preserved.

The main issue is this: Dealing with void* has two sides, the one that puts
something of a certain type in it, and the other side which extracts it again. A
bug only happens when there is a miscommunication and both sides use two
incompatible types. Both code snippets are separate, and in both it's very clear
what type they expect. That is why forcing a cast is totally useless: it doesn't
prevent the miscommunication from happening, because looking at each piece of
code separately it's totally clear what type it is and should be cast to. It results
in code like:

type_a x = (type_a)y;

Useless, isn't it? If y's type changes, there's still a bug. If x's type changes,
the programmer either changes the type cast too, or does so after the compiler
whines. Either way, a bug goes unnoticed.

Even funnier, casting things makes things worse. As seen above it doesn't
solve a damn thing, but what it does do is add a potential for missing a class
of bugs: When y's type is changed from void* to type_b, you won't hear the
compiler complaining anymore.
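
A small C sketch of that last case, with made-up struct definitions standing in for type_a and type_b: once the source pointer stops being a void*, the uncast assignment would be diagnosed, while the habitual cast keeps compiling quietly.

```c
/* Hypothetical, unrelated types. */
struct type_a { int value; };
struct type_b { double value; };

/* Original code: y is a void*, so no cast is needed in C. */
static struct type_a *extract_original(void *y)
{
    struct type_a *x = y;
    return x;
}

/* Later, y's type changes from void* to struct type_b*. */
static struct type_a *extract_after_refactor(struct type_b *y)
{
    /* struct type_a *x = y;    <- a C compiler diagnoses this incompatible assignment */
    struct type_a *x = (struct type_a *) y;   /* the cast makes it compile without complaint */
    return x;
}
```

The second function still hands back a type_a pointer to what is really a type_b object, and nothing at compile time says so.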

Forced casting from void* is like a railing beside a mountain track that
collapses when you try to actually lean on it.

Offline

#48 2009-06-19 02:57 PM

RobSeace
Administrator
From: Boston, MA
Registered: 2002-06-12
Posts: 3,829
Website

Re: STL for Tree

Yeah, exactly what i3839 said...  He nailed it perfectly...  (But that won't stop me from
babbling some more, as well... ;-))

You could say the same thing about any compiler error or compiler warning that can be suppressed or worked around.

Yes, and I do say the same about other equally stupid compiler warnings...  For
instance, the compiler we have to use on QNX (some version of Watcom) chooses
to bitch and moan about unused arguments to all functions, with no easy way to
disable that one particular warning only...  So, every place we have a function with
an unused argument (which is often desirable, if not completely unavoidable; eg.
for conforming to an existing API, but for which you are only implementing a subset
of behavior, so you don't need all the args; a signal handler that doesn't care about
the passed in signal# would be a good example), we have to kluge around this
stupid broken-ass compiler by adding a fake reference to the arg somehow, usually
a do-nothing test like "if (arg) ;", which just clutters the code with ugly, pointless
bullshit...  Just like these forced casts from void* do... ;-/

Warnings should be reserved for actual bad things that should be avoided, not for
normal, perfectly fine behavior...  Spewing out warnings for everything just causes
us to become desensitized, so we end up either ignoring everything it's bitching
about (and so, may miss important warnings that are mixed in with the trivial bullshit), or
we just start proactively cluttering our code with useless crap like the above to shut
it the hell up, which leads to poorer quality code, which should NOT be the goal of
any compiler...  (And, I'm not just picking on Watcom; GCC is also extremely anal in
its warnings, by default...  The difference is, at least GCC lets you disable each
individual warning, according to your tastes, so with a bit of tuning, and an outrageously
long CFLAGS setting, you can eventually get reasonable behavior out of it... ;-))

The truth is, only a very careless programmer will do "blind casting to shut the stupid compiler up". A responsible programmer will recognize that the errors are occurring because he is doing a type-unsafe (and potentially erroneous) operation, and he will make a careful decision about whether he really wants to make the cast or not.

I'm sorry, but you're just crazy if you believe this...  Here's what will happen: the
first time they do a normal assignment of a void* to some specific type pointer, they
get the warning/error, and scratch their heads a bit, and finally give in and add the
explicit cast to shut it up...  Maybe that'll happen a couple more times after that, but
eventually they'll learn to proactively throw that explicit cast in there, because they
know the stupid compiler will bitch at them if they don't...  So, they do it, just as a
standard part of writing the code, almost without thinking about it, after a while...  It
becomes exactly like the stupid "if (unused_arg) ;" bits of code-clutter I mentioned
above: just an ingrained habitual thing you start doing, just as a preventative measure
so you don't have to deal with the obnoxious compiler complaints again...  It doesn't
improve the code in any way; in fact, it makes it far uglier, and serves absolutely no
logical purpose, other than to quiet the compiler...

Besides, weren't YOU the one saying to me that we can't count on programmers to
actually be competent in any way?  Yet now you're saying we should expect them
all to be "careful" and "responsible"??

If he does, he makes the cast explicit, which has the side benefit of serving as documentation that he made a conscious choice in the matter.

Oh, come on...  The fact that you declared a pointer of a certain type and assigned
a void* to it is all the documentation of conscious choice that you bloody well need!
Adding a duplication of that same exact type inside a cast doesn't add anything at
all beneficial to the situation...  It just clutters the code with useless, redundant crap...
I can't for the life of me imagine how one could go about "unconsciously" making
such an assignment, without actually intending to do so...

Offline

#49 2009-06-19 04:24 PM

jfriesne
Administrator
From: California
Registered: 2005-07-06
Posts: 348
Website

Re: STL for Tree

i3839;26982 wrote:

The reason it's totally useless and hence only pollutes the code is because the coder writing that code is very aware that it's a void*.

Is he?  How can you be sure?  In a non-trivial function, it's easy to accidentally use one variable when you meant to use another, and accidentally do something you didn't mean to do.  The reason that languages have type systems is to help catch that sort of error.

And a void* isn't typeless, it's still a pointer, so most of the type info is preserved.

I'd agree to that as long as you replace "most of" with "barely any of".  The fact is that without knowing the type of the data that is pointed to, there is almost nothing you can do with the pointer, other than check to see if it's NULL, or hand it to some other bit of code that does know its type.  So if that's not typeless, it's pretty close to it.

Forced casting from void* is like a railing beside a mountain track that collapses when you try to actually lean on it.

I take your point, but I think your argument is more an argument for avoiding void pointers entirely, since they are inherently unsafe.... :^)

Jeremy

Offline

#50 2009-06-19 05:07 PM

jfriesne
Administrator
From: California
Registered: 2005-07-06
Posts: 348
Website

Re: STL for Tree

RobSeace;26983 wrote:

So, every place we have a function with
an unused argument (which is often desirable, if not completely unavoidable; [...] we have to kluge around this stupid broken-ass compiler by adding a fake reference to the arg somehow, usually a do-nothing test like "if (arg) ;", which just clutters the code with ugly, pointless
bullshit...

There are better ways to deal with the warning... for one thing, you could have a

(void) arg;

which makes it more obvious that you are deliberately ignoring the argument, and didn't just botch an if statement.  In C++, you can also do this:

void MyFunc(int /*myUnusedArg*/)
   {
       [...]
   }

... and the compiler will know not to complain.
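
Applied to the signal-handler case mentioned in the quote, the plain C version might look like the sketch below. The handler and helper names, and the choice of SIGTERM, are arbitrary assumptions for the example.

```c
#include <signal.h>

/* Flag set from the handler; sig_atomic_t is the type the C standard provides for this. */
static volatile sig_atomic_t got_signal = 0;

/* The signature is dictated by signal(), but this handler doesn't care which signal fired. */
static void on_signal(int signo)
{
    (void) signo;       /* deliberately unused: silences the warning and says so */
    got_signal = 1;
}

/* Installs the handler and raises the signal once.
   Returns 1 if the handler ran, -1 if installation or raise() failed. */
static int install_and_raise(void)
{
    if (signal(SIGTERM, on_signal) == SIG_ERR)
        return -1;
    if (raise(SIGTERM) != 0)
        return -1;
    return (int) got_signal;
}
```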

Warnings should be reserved for actual bad things that should be avoided, not for normal, perfectly fine behavior...

I guess that all depends on what your definitions of "bad things" and "perfectly fine behavior" are.

I know that I have been helped a number of times by the "unused argument" warning, because it reminded me that there was a portion of the function that I had meant to write that used that argument, and I had forgotten to write it (or I wrote it but forgot to actually use the argument as an input to the logic, or I accidentally left the test input value in, instead of changing the code to use the parameter when I was done testing, etc.).  Without those warnings it would have taken me longer to detect and correct my error.

Spewing out warnings for everything just causes
us to become desensitized, so we end up either ignoring everything it's bitching about (and so, may miss important warnings that are mixed in with the trivial bullshit), or we just start proactively cluttering our code with useless crap like the above to shut it the hell up, which leads to poorer quality code, which should NOT be the goal of any compiler...

Or, you could consider the possibility that the warnings have merit, and change your programming practices to avoid the risks they are trying to warn you about....   [shrug]   Compiler writers aren't ALL idiots, you know... :^)

Besides, weren't YOU the one saying to me that we can't count on programmers to actually be competent in any way?  Yet now you're saying we should expect them all to be "careful" and "responsible"??

Any widely used language is going to encounter all kinds of programmers, at all possible skill levels.  Ideally, the language would help all of them to the extent possible:  It would allow the perfect programming geniuses to get their work done, while at the same time helping the fallible humans avoid as many common mistakes as possible.  The typical programmer is well-intentioned but human, which means he often makes mistakes, but when his mistakes are pointed out to him, he tries in good faith to correct them.  One of the compiler's jobs, then, is to point out these mistakes (and potential mistakes) so that the programmer knows there is a problem that needs correcting.

I can't for the life of me imagine how one could go about "unconsciously" making such an assignment, without actually intending to do so...

Really?  You can't imagine someone typing

void SomeFunc()
{
    struct MyType blah;
    struct MyType * p;
    struct MyType * t = &blah;
    void * tt;

    [.... 50 lines of other code omitted here, for clarity ...]

    p = tt;     /* programmer error here!  */
    p->val = 666;     /* will cause undefined behavior */
}

In C++, the above code will not compile.  The programmer will see the error, examine the code, and say "oops, I meant to type p = t, not p = tt", fix it, and go on with his life.

In C, the above code will compile without any errors or warnings.  Then at run time, it might crash.  Or it might just silently corrupt memory, causing a $100 million spacecraft to crash into a kindergarten full of kittens.  Or something :^)

Offline
