Earnestly thinking NULL is a mistake is a symptom
-
Continuing the discussion from NULL: the worst mistake of computer science:
It has festered in the most popular languages of all time and is now known by many names: NULL, nil, null, None, Nothing, Nil, nullptr. Each language has its own nuances.
Some of the problems caused by NULL apply only to a particular language, while others are universal; a few are simply different facets of a single issue.
NULL…
subverts types
is sloppy
is a special case
makes poor APIs
exacerbates poor language decisions
is difficult to debug
is non-composable
Assertion (based on Douglas Adams, The Restaurant at the End of the Universe):
“The story so far: In the beginning computers were created. This has made a lot of people very angry and been widely regarded as a bad move.”
Frustration with NULL is a symptom. The real world is messy...very messy. First go read Falsehoods Programmers Believe About Names (if you haven't already) and then we'll proceed.
Back now? Great. Here's your job: Make a program that handles every possible name (or lack thereof) in the world, so it can be stored with the canonical "Person" object.
Sound a bit hard? Of course it does. Because the world is messy. See, we like to imagine that our programs will be neat and clean, without all these nuisance decisions we're always having to add.
Won't happen. Because programs are about the real world.
Let me give you an example: I work for a healthcare system, and we had our own registration product. Naturally, it required the patient to have a name. But some patients don't have names. See item 40, and also this quote from the comments in the article:
@Mahmoud Al-Qudsi said:
An example for number 40, please ;-)
@Patrick said:
Someone born into slavery in the Sudan, a woman born in rural China, an American baby recovered after being born into a toilet, a feral child, an amnesiac, etc, etc.
Back to our registration system. When the first [amnesiac/toilet baby/unconscious victim with no ID] floated into our ED, well, that stupid edit on the name field was a problem, wasn't it? So our users invented a name (like "John Doe", but something else, I don't remember it now). That name is on, literally, hundreds of thousands of accounts. And we've had to program for it (IF NAME = 'DOE, JOHN' THEN PERFORM SPECIAL-BILLING...). And then the system started getting touchy about the fact that we had four "DOE, JOHN" patients in the system at the same time, so we started doing "DOE, JOHN A", "DOE, JOHN B", and sometimes the registration person had to go through several names to find one that wasn't in use right now. And then one day, a real person named "DOE, JOHN D" walked into the hospital...and we already had a fake person with that name in the hospital.
The real world is messy. Deal with it.
NULL is a halfway decent solution for messiness. Deal with it. Coming up with nonsense like "Optional<T>" just kicks the can down the road. Ultimately, you still have to deal with it.
Note: I've had my opinion revised since I wrote the above paragraph. Optional<T> has a definite value and I can see using it in many cases. It does not solve the null problem entirely, and it doesn't solve the messy world entirely. But it would help, by explicitly informing the programmer that a value is optional.
If you aren't willing to deal with the real world, why are you a programmer?
So, what proposals do we have for more easily dealing with messiness?
-
Fuck NULLs, TRWTF is an unique constraint on people's names.
-
Fuck NULLs, TRWTF is an unique constraint on people's names.
Yeah, we actually got rid of that quite a while back. Instead, the unique constraint is name+SSN+birthdate. But that doesn't really solve the problem either, because that amnesiac that just floated in doesn't have a name or an SSN or a birthdate...so you have to make something up...
Keep on kicking that can.
-
Coming up with nonsense like "Optional<T>" just kicks the can down the road.
FWIW, using NULL is just an efficient way to implement Optional<T> for pointer/reference types…
-
Seriously. C++ solved that problem 30 years ago with references (never null) and pointers (may be null). They're even typed nulls. But no, pointers are apparently hard to understand, so new languages have to come up with their own dumb solutions to a non-problem.
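For anyone who hasn't seen the split in practice, here is a minimal sketch of it. The `Person` type and both function names are made up for illustration; nothing here comes from a real codebase.

```cpp
#include <cstddef>
#include <string>

// Hypothetical record type, used only for this illustration.
struct Person {
    std::string name;
};

// A reference parameter: a well-formed caller always binds it to a real
// object, so there is nothing to check here.
std::size_t name_length(const Person& p) {
    return p.name.size();
}

// A pointer parameter: "no person" is a legal argument, so the function
// has to decide what that case means.
std::size_t name_length_or_zero(const Person* p) {
    if (p == nullptr) {
        return 0;
    }
    return name_length(*p);  // safe: p was checked above
}
```

The signature tells the caller which contract applies: the reference version cannot express "nobody", the pointer version can.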
-
pointers are apparently hard to understand
But magic to work with when you do understand. I am not going to say what I have done with pointers because it would be a very revealing statement @obeselymorbid and this is TDWTF where characters and integrity have been ripped to shreds within a post or two
-
C++ solved that problem 30 years ago with references (never null) and pointers (may be null).
That's C++ terminology; other languages don't use the words to mean the same thing.
-
NULL is a halfway decent solution for messiness. Deal with it. Coming up with nonsense like "Optional<T>" just kicks the can down the road. Ultimately, you still have to deal with it.
Null certainly has its use cases. The problem is, 99.9999999999999999999999999999999% of times you don't need it, but it's forced upon you 100% of times. And because you can't (in Java, couldn't) statically assert that your parameter isn't null, your code ends up with millions of unnecessary null checks or thousands of subtle bugs undetectable at compile time. Usually both.
-
I agree to everything you said, but making names unique is a really stupid idea even if every person does have a name.
-
your code ends up with millions of unnecessary null checks
Compilers are pretty good at optimising that sort of stuff out. Once you've dereferenced a pointer, it can't be NULL…
-
@Gaska said:
your code ends up with millions of unnecessary null checks
Compilers are pretty good at optimising that sort of stuff out. Once you've dereferenced a pointer, it can't be NULL…
And then you end up with something like this
-
That's C++ terminology; other languages don't use the words to mean the same thing.
Which is precisely my point. C++ got it right, and others chose to do things differently because C++ has cooties.
-
Compilers are pretty good at optimising that sort of stuff out.
I'm not talking about compilers - I'm talking about humans who later have to maintain your codebase.
-
And then you end up with something like this
Ok, that explains a lot to me (about the attitude of some programmers).
Which only leaves me with one question (which I suspect this entire post is about): FFS WHY?
It is bad enough trying to code for idiot users, without having to second-guess what second-guessing the compiler is going to do with it when you are done.
-
using NULL is just an efficient way to implement Optional<T> for pointer/reference types
The difference is that std::optional<T> has value semantics, not pointer/reference semantics.
-
The difference is that std::optional<T> has value semantics, not pointer/reference semantics.
I think you'll find that NULL has value semantics too.
An optional plus a reference type is isomorphic to a pointer type (ignoring all that pointer arithmetic stuff). Therefore a compiler could detect that situation and implement optional<ref> with a pointer (an efficient encoding), though the semantics at the language level would not see it. It's a multi-level mapping thing: I'm not just talking about C++ but also levels above and below it.
-
My point is that value semantics means language-level support for deep copying and no need to worry about ownership.
-
There you go: arguing on just one level at a time. Missing my points.
-
Oops, I misinterpreted the way you used the word 'levels' - sorry.
-
Which only leaves me with one question (which I suspect this entire post is about): FFS WHY?
Inlined/generic code which sometimes is used in possibly-null context and sometimes in never-null context.
-
An optional plus a reference type is isomorphic to a pointer type (ignoring all that pointer arithmetic stuff). Therefore a compiler could detect that situation and implement optional<ref> with a pointer (an efficient encoding), though the semantics at the language level would not see it.
Guess what language does exactly what you described here :P
-
99.9999999999999999999999999999999% of times you don't need it
Why not say never then, this is 100% for any practical purposes even if you live a thousand years and have to decide about using NULL 100 times a day.
-
Which is precisely my point. C++ got it right, and others chose to do things differently because C++ has cooties.
Well, maybe we'd look upon it more generously if C++ got a lot of stuff right instead of "one thing kind of right, and everything else far inferior to the C language it was trying to supplant".
-
Funny coming from a C# fan, a language so devoid of vision that even its name is a mangled distortion of C++'s ideas:
Pity they couldn't improve on it instead of bastardize it.
-
Pity they couldn't improve on it instead of bastardize it.
OH SNAP!
NO YOU DID'T!
Oh wait I don't give a shit what you think of me or C#. If you honestly think C# has no improvements over C++, well, you're welcome to believe whatever stupid wrong things you want. Kudos.
-
And then you end up with something like this
Ugh, I just read this. I grew disgusted when he posted the snippet in question:
The code looks like this:
struct sock *sk = tun->sk; // initialize sk with tun->sk
…
if (!tun) return POLLERR; // if tun is NULL return error
This code looks perfectly ok, right?
No it doesn't. You check if the stuff you use is not null BEFORE you use it, not after. This is basic logic, it's not even some tricky edge case. I don't even need to read further to guess what went wrong. The compiler said "well fuck you too" and removed the check because it happened after the user had already assumed the pointer could not be null by trying to read through it.
I did read further, and the idiot that wrote that then goes on to say "the compiler will introduce the vulnerability to the binary code, which didn't exist in the source code." Of course it already existed! What the hell is the program supposed to do when it tries to read from a null-pointer? What value is sk supposed to have before the function returns? Or is a seg fault the correct behavior? No compiler in existence can make up for bad code.
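To spell out the ordering being argued for, here is a rough sketch with the check moved in front of the dereference. `Sock`, `Tun`, `kPollErr` and `poll_tun` are hypothetical stand-ins, not the kernel's real definitions, and the return value is invented so the example has something to return.

```cpp
// Hypothetical stand-ins for the kernel types; only the shape matters here.
struct Sock { int id; };
struct Tun { Sock* sk; };

constexpr int kPollErr = -1;  // assumed error code, standing in for POLLERR

int poll_tun(const Tun* tun) {
    // Check first: nothing has been read through tun yet, so the compiler
    // has no grounds to assume it is non-null and drop this branch.
    if (tun == nullptr) {
        return kPollErr;
    }
    const Sock* sk = tun->sk;  // dereference only after the check
    if (sk == nullptr) {
        return kPollErr;
    }
    return sk->id;
}
```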
-
And then one day, a real person named "DOE, JOHN D" walked into the hospital...and we already had a fake person with that name in the hospital.
It's a good thing that hotel registration systems figured that out a long time ago or else guys banging their secretaries on their lunch break would have to get more creative than just "John Smith".
-
So our users invented a name (like "John Doe", but something else, I don't remember it now). That name is on, literally, hundreds of thousands of accounts. And we've had to program for it, (IF NAME = 'DOE, JOHN' THEN PERFORM SPECIAL-BILLING...). And then the system started getting touchy about the fact that we had four "DOE, JOHN" patients in the system at the same time, so we started doing "DOE, JOHN A", "DOE, JOHN B" and sometimes the registration person had to go through several names to find one that wasn't in use right now. And then one day, a real person named "DOE, JOHN D" walked into the hospital...and we already had a fake person with that name in the hospital.
So a problem that would have been fixed by allowing Null or Optional<string> as names.
Deal with it. Coming up with nonsense like "Optional<T>" just kicks the can down the road. Ultimately, you still have to deal with it.
? Isn't that a way to deal with it? What else do you need? By your logic it's not possible to solve anything because there's always a harder case.
Null (or Optional<t>, or None in Python, etc) are good for most cases. If a case comes up where that's not enough, you turn it into a custom data type with your own logic and data, that's why they were invented.
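A minimal sketch of what that could look like, assuming a C++17 compiler. The `Patient` type, the function names, and the "(unidentified)" placeholder are all invented for illustration and have nothing to do with the actual registration system.

```cpp
#include <optional>
#include <string>

// Hypothetical patient record; the field and function names are made up.
struct Patient {
    std::optional<std::string> name;  // empty means "name unknown"
};

// Special billing keys off the absence of a name, not off a magic string.
bool needs_special_billing(const Patient& p) {
    return !p.name.has_value();
}

// What to print on a wristband when there may be no name.
std::string display_name(const Patient& p) {
    return p.name.value_or("(unidentified)");
}
```

With this shape, a real patient called "DOE, JOHN D" is just another name, because the unknown case never occupies a slot in the space of real names.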
-
Falsehoods Programmers Believe About Names
All those things can be summed up as "allow any arbitrary Unicode string (including empty) as name, don't rely on it to identify people".
Except for two weird ones:
- People’s names fit within a certain defined amount of space.
Is there anyone around with >5000 characters on their name?
- People’s names are all mapped in Unicode code points.
What writing system is there that's not in Unicode? Is he talking about this guy?
https://www.youtube.com/watch?v=hNoS2BU6bbQ
-
No it doesn't. You check if the stuff you use is not null BEFORE you use it, not after. This is basic logic, it's not even some tricky edge case.
I thought the same thing - what code review wouldn't catch that?
-
What writing system is there that's not in Unicode? Is he talking about this guy?
People didn't have names before writing came along?
-
Coming up with nonsense like "Optional<T>" just kicks the can down the road. Ultimately, you still have to deal with it.
The difference between Optional<T> and null is the difference between "the compiler forces you to deal with it" and "you may or may not have to deal with it, go figure that out yourself without any compiler support". One is clearly better than the other.
-
Is there anyone around with >5000 characters on their name?
now that you have asked, probably yes. They may even be a Royal...
-
-
Does it? Looking at the Java 8 docs, Optional<T>.get() throws NoSuchElementException if the value of Optional<T> is null; a RuntimeException. Those are unchecked so...what exactly do I get?
Instead of:
if(foo != null) { foo.doSomething(); }
I get:
if(foo.isPresent()) { foo.get().doSomething(); }
-
Oh, we were talking about Java? In that case, I retract my statement.
-
Adolph Blaine Charles David Earl Frederick Gerald Hubert Irvin John Kenneth Lloyd Martin Nero Oliver Paul Quincy Randolph Sherman Thomas Uncas Victor William Xerxes Yancy Zeus Wolfeschlegelsteinhausenbergerdorffwelchevoralternwarengewissenhaftschaferswessenschafewarenwohlgepflegeundsorgfaltigkeitbeschutzenvorangreifendurchihrraubgierigfeindewelchevoralternzwolfhunderttausendjahresvorandieerscheinenvonderersteerdemenschderraumschiffgenachtmittungsteinundsiebeniridiumelektrischmotorsgebrauchlichtalsseinursprungvonkraftgestartseinlangefahrthinzwischensternartigraumaufdersuchennachbarschaftdersternwelchegehabtbewohnbarplanetenkreisedrehensichundwohinderneuerassevonverstandigmenschlichkeitkonntefortpflanzenundsicherfreuenanlebenslanglichfreudeundruhemitnichteinfurchtvorangreifenvorandererintelligentgeschopfsvonhinzwischensternartigraum, Senior.
Including white space, punctuation and 144 hidden (apparently double) "hyphens". My "on the fly" text editor of choice claims 1139 characters.
As this is supposed to be an unbroken world record, how do you it will remain so and will there be an ultimate limit?
-
Why not say never then
Because sometimes you want.
this is 100% for any practical purposes
Except it's not. Once every few months, you really need optional values, but all the other minutes of your life it's just burden. Especially if you remember empty strings are a thing too.
No it doesn't. You check if the stuff you use is not null BEFORE you use it, not after.
It's kernel code. Read from address 0 is perfectly legal (on asm level, not C level), and under some circumstances you encounter in this niche, even justified.
What writing system is there that's not in Unicode?
I'm pretty sure some Asian names require precise pronunciation that's not written down in those weird drawings they use for scrolls and signposts.
Does it? Looking at the Java 8 docs Optional<T>.get() throws NoSuchElementException if the value of Optional<T> is null; a RuntimeException. Those are unchecked so...what exactly do I get?
Just because Java fucked up the implementation doesn't mean the idea is bad. In Rust, for example, if you have value of Option<T>, you must explicitly check if there is a value before using it, or else your code won't compile.
-
@Kian said:
Pity they couldn't improve on it instead of bastardize it.
OH SNAP!
NO YOU DID'T!
Oh wait I don't give a shit what you think of me or C#.
this is funny because @Kian said nothing about @blakeyrat personally, yet he reacted as if insulting C# also insulted him - which is the best proof of his fanboyism.
-
this is funny because @Kian said nothing about @blakeyrat personally, yet he reacted as if insulting C# also insulted him - which is the best proof of his fanboyism.
Except I didn't do that.
But great theory otherwise. Keep it up and you'll learn to read in no time at all.
-
I did not mean to suggest the idea is bad, just in the case of Java I didn't see any benefit to the code I'd have to write using Optional<T>.
Having a compiler yell at you won't magically fix a large class of errors. If anything I just see a whole new rebirth of On Error Resume Next because Copy-Pasting a template for dealing with Null/Nothing/Nil/None being much less resistance than actually fucking thinking for a large portion of developers.
-
Well, if you declare everything Optional, even if it cannot be, then that will happen, yes. But the whole point of Optional is that in practice, very few variables can be null, so it only annoys developers in the few places where it needs to annoy them.
-
And then you end up with something like this
Yeah, I know: programmer expects compiler to go backward to previous statement to fix programmer's broken logical sequence.
It is bad enough trying to code for idiot users, without having to second-guess what second-guessing the compiler is going to do with it when you are done.
That's why the PL/1 model never took off. No one on earth had any idea what that compiler would do with anything.
The difference is that std::optional<T> has value semantics, not pointer/reference semantics.
But the people who proposed this still ignored the fact that sooner or later you have to know the difference. Suppose you have Person p, which is an Optional<T> and it has similar name and address members. You want to create a mailing label:
Label lab = new Label();
lab.setName(p.name());
lab.setAddress(p.address());
If p didn't have a valid name or address, you've just now created a bogus mailing label. The fact that it is sent to "No name / No address / No city / ?? / 00000-0000" doesn't work in the real world any better than "null / null / null / null / null".
You have to think about the fact that there will be person objects that don't have a name and address, and blaming that on null is just shortsighted.
My point is that value semantics means language-level support for deep copying and no need to worry about ownership.
That way you can copy bogus data without having to worry about the copy failing. Does this improve the output?
-
Is there anyone around with >5000 characters on their name?
Well, when you're writing my Person object, you can just set the length to 5000 and then you'll find out.
-
That way you can copy bogus data without having to worry about the copy failing. Does this improve the output?
std::optional<T>, which @LB_ is talking about, doesn't work that way. You're either copying an optional that doesn't contain a value, or you are making a copy of T (or moving them around, where applicable). This is unlike a T*, where you are copying the pointer (and the new pointer may not point to a value, i.e., it's null; or to the same T as the original pointer).
You still have to eventually deal with the fact that there's no T around - that hasn't changed from using pointer & null.
(IMO, I think it's a mistake to have the operator* and operator-> overloads with undefined behaviour when the optional doesn't contain a value.)
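A small sketch of the two points above, assuming C++17. The function names are invented for the example. It contrasts copying an optional with copying a pointer, and shows the checked accessor `value()`, which throws `std::bad_optional_access` on an empty optional instead of invoking undefined behaviour.

```cpp
#include <optional>
#include <string>

// Copying an optional copies the contained value, so the two are independent.
bool copies_are_independent() {
    std::optional<std::string> a = std::string("Alice");
    std::optional<std::string> b = a;  // deep copy of the string
    *b = "Bob";                        // fine here: b is known to hold a value
    return *a == "Alice" && *b == "Bob";
}

// Copying a pointer copies only the address; both names refer to one object.
bool pointers_alias() {
    std::string s = "Alice";
    std::string* p = &s;
    std::string* q = p;
    *q = "Bob";
    return *p == "Bob";
}

// value() is the checked accessor: it throws on an empty optional, whereas
// operator* on an empty optional is undefined behaviour.
bool empty_access_throws() {
    std::optional<std::string> empty;
    try {
        (void)empty.value();
        return false;
    } catch (const std::bad_optional_access&) {
        return true;
    }
}
```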
-
What writing system is there that's not in Unicode? Is he talking about this guy?
(clicked on the wrong @%#$@! reply button again)
There's actually quite a few. Because Unicode has the same problem as the rest of computer science. It started out with the simple and laudable goal of having a code point for each of the symbols in each of the world's alphabets.
Then it encountered the world. And combining diacritics. And languages that don't have written alphabets. And the need to represent things like non-breaking space. And, finally, of course, names like Derek <plunk>.
Unicode was supposed to be simple, then it met the world, now it's just as messy as the world.
@LB_ said:
I thought the same thing - what code review wouldn't catch that?
Sure didn't pass mine--got a world class wince.
Well, if you declare everything Optional, even if it cannot be, then that will happen, yes. But the whole point of Optional is that in practice, very few variables can be null, so it only annoys developers in the few places where it needs to annoy them.
What if you have to declare everything Optional because it can be?
-
If p didn't have valid a valid name or address, you've just now created a bogus mailing label.
I don't think we're on the same page. The code you wrote is not valid at all - std::optional is not a proxy type, it's just a container. You still need to first extract the data it might contain. In other words, you have to check it just like you would check a pointer/reference, but unlike pointers/references std::optional has value semantics. It's a wrapper.
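Roughly what "extract first" means for the mailing label case, as a sketch under assumptions: `Person`, `Label` and `make_label` are hypothetical, and here the optional parts are the individual fields rather than the whole person.

```cpp
#include <optional>
#include <string>

// Hypothetical types for the mailing-label example; names are illustrative.
struct Person {
    std::optional<std::string> name;
    std::optional<std::string> address;
};

struct Label {
    std::string name;
    std::string address;
};

// Label wants plain strings, so the optionals have to be unwrapped first.
// Passing p.name straight into Label would not compile.
std::optional<Label> make_label(const Person& p) {
    if (!p.name || !p.address) {
        return std::nullopt;  // caller must handle "no label possible"
    }
    return Label{*p.name, *p.address};
}
```

The missing-address case still has to be handled by somebody, but it shows up in the return type instead of on a printed label.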
-
-
I did not mean to suggest the idea is bad, just in the case of Java I didn't see any benefit to the code I'd have to write using Optional<T>.
34. Make your own Optional<T>.
56. ???
78. Profit.
Having a compiler yell at you won't magically fix a large class of errors.
It will solve the entire "I forgot to null check" class of errors. "Forgot" is the key word here.
If anything I just see a whole new rebirth of On Error Resume Next because Copy-Pasting a template for dealing with Null/Nothing/Nil/None being much less resistance than actually fucking thinking for a large portion of developers.
It won't make idiot developers' code any better, but it will make actual programmers' code better.
-
What if you have to declare everything Optional because it can be?
I think you're starting to invent stuff here to prove your point that Optional doesn't make sense. Sure, there are systems out there that have to deal with such cases, but in 80% of the applications out there 90% of the pointers are not nullable. Making that explicit helps programmers do the right thing.