Earnestly thinking NULL is a mistake is a symptom

CoyneTheDup

Continuing the discussion from NULL: the worst mistake of computer science:

It has festered in the most popular languages of all time and is now known by many names: NULL, nil, null, None, Nothing, Nil, nullptr. Each language has its own nuances.

Some of the problems caused by NULL apply only to a particular language, while others are universal; a few are simply different facets of a single issue.

NULL…
subverts types
is sloppy
is a special case
makes poor APIs
exacerbates poor language decisions
is difficult to debug
is non-composable

Assertion (based on Douglas Adams, The Restaurant at the End of the Universe):

“The story so far: In the beginning ~~the Universe was~~ computers were created. This has made a lot of people very angry and been widely regarded as a bad move.”

Frustration with NULL is a symptom. The real world is messy...very messy. First go read Falsehoods Programmers Believe About Names (if you haven't already) and then we'll proceed.

Back now? Great. Here's your job: Make a program that handles every possible name (or lack thereof) in the world, so it can be stored with the canonical "Person" object.

Sound a bit hard? Of course it does. Because the world is messy. See, we like to imagine that our programs will be neat and clean, without all these nuisance decisions we're always having to add.

Won't happen. Because programs are about the real world.

Let me give you an example: I work for a healthcare system, and we ~~have~~ had our own registration product. Naturally, it required the patient to have a name. But some patients don't have names. See item 40, and also this quote from the comments in the article:

@Mahmoud Al-Qudsi said:

An example for number 40, please ;-)

@Patrick said:
Someone born into slavery in the Sudan, a woman born in rural China, an American baby recovered after being born into a toilet, a feral child, an amnesiac, etc, etc.

Back to our registration system. When the first [amnesiac/toilet baby/unconscious victim with no ID] floated into our ED, well, that stupid edit on the name field was a problem wasn't it? So our users invented a name (like "John Doe", but something else, I don't remember it now). That name is on, literally, hundreds of thousands of accounts. And we've had to program for it, (IF NAME = 'DOE, JOHN' THEN PERFORM SPECIAL-BILLING...). And then the system started getting touchy about the fact that we had four "DOE, JOHN" patients in the system at the same time, so we started doing "DOE, JOHN A", "DOE, JOHN B" and sometimes the registration person had to go through several names to find one that wasn't in use right now. And then one day, a real person named "DOE, JOHN D" walked into the hospital...and we already had a fake person with that name in the hospital.

The real world is messy. Deal with it.

NULL is a halfway decent solution for messiness. Deal with it. Coming up with nonsense like "Optional<T>" just kicks the can down the road. Ultimately, you still have to deal with it.

Note: I've had my opinion revised since I wrote the above paragraph. Optional<T> has a definite value and I can see using it in many cases. It does not solve the null problem entirely, and it doesn't solve the messy world entirely. But it would help, by explicitly informing the programmer that a value is optional.

If you aren't willing to deal with the real world, why are you a programmer?

So, what proposals do we have for more easily dealing with messiness?

Maciejasjmj

Fuck NULLs, TRWTF is a unique constraint on people's names.

CoyneTheDup

@Maciejasjmj said:

Fuck NULLs, TRWTF is a unique constraint on people's names.

Yeah, we actually got rid of that quite a while back. Instead, the unique constraint is name+SSN+birthdate. But that doesn't really solve the problem either, because that amnesiac that just floated in doesn't have a name or an SSN or a birthdate...so you have to make something up...

Keep on kicking that can.

dkf

@CoyneTheDup said:

Coming up with nonsense like "Optional<T>" just kicks the can down the road.

FWIW, using NULL is just an efficient way to implement Optional<T> for pointer/reference types…

Kian

Seriously. C++ solved that problem 30 years ago with references (never null) and pointers (may be null). They're even typed nulls. But no, pointers are apparently hard to understand, so new languages have to come up with their own dumb solutions to a non-problem.

loose

@Kian said:

pointers are apparently hard to understand

But magic to work with when you do understand. I am not going to say what I have done with pointers because it would be a very revealing statement, @obeselymorbid, and this is TDWTF, where characters and integrity have been ripped to shreds within a post or two.

dkf

@Kian said:

C++ solved that problem 30 years ago with references (never null) and pointers (may be null).

That's C++ terminology; other languages don't use the words to mean the same thing.

Gąska

@CoyneTheDup said:

NULL is a halfway decent solution for messiness. Deal with it. Coming up with nonsense like "Optional<T>" just kicks the can down the road. Ultimately, you still have to deal with it.

Null certainly has its use cases. The problem is, 99.9999999999999999999999999999999% of times you don't need it, but it's forced upon you 100% of times. And because you can't (in Java, couldn't) statically assert that your parameter isn't null, your code ends up with millions of unnecessary null checks or thousands of subtle bugs undetectable at compile time. Usually both.
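
To make the "unnecessary null checks or subtle bugs" point concrete, here is a minimal Java sketch. The `Greeter` class and both methods are made up for illustration; the only real library call is `Objects.requireNonNull`. Nothing in either signature says whether `name` may be null, so the first method pays for a run-time check and the second fails only when a null actually arrives.

```java
import java.util.Objects;

// Hypothetical example class; not taken from any code discussed in this thread.
class Greeter {
    // The signature cannot express "never null", so the check happens at run time.
    static String greet(String name) {
        Objects.requireNonNull(name, "name must not be null");
        return "Hello, " + name;
    }

    // Forgetting the check still compiles; it breaks only when a null shows up.
    static int nameLength(String name) {
        return name.length(); // throws NullPointerException if name is null
    }
}
```

Calling `Greeter.greet(null)` fails fast with a message, while `Greeter.nameLength(null)` fails with a bare NullPointerException at the point of use.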

ChrisH

I agree with everything you said, but making names unique is a really stupid idea even if every person does have a name.

dkf

@Gaska said:

your code ends up with millions of unnecessary null checks

Compilers are pretty good at optimising that sort of stuff out. Once you've dereferenced a pointer, it can't be NULL…

ashkante

@dkf said:

@Gaska said:
your code ends up with millions of unnecessary null checks

Compilers are pretty good at optimising that sort of stuff out. Once you've dereferenced a pointer, it can't be NULL…

And then you end up with something like this

Kian

@dkf said:

That's C++ terminology; other languages don't use the words to mean the same thing.

Which is precisely my point. C++ got it right, and others chose to do things differently because C++ has cooties.

Gąska

@dkf said:

Compilers are pretty good at optimising that sort of stuff out.

I'm not talking about compilers - I'm talking about humans who later have to maintain your codebase.

loose

@ashkante said:

And then you end up with something like this

Ok, that explains a lot to me (about the attitude of some programmers).

Which only leaves me with one question (which I suspect this entire post is about): FFS WHY?

It is bad enough trying to code for idiot users, without having to second-guess what the compiler is going to do with it when you are done.

LB_

@dkf said:

using NULL is just an efficient way to implement Optional<T> for pointer/reference types

The difference is that std::optional<T> has value semantics, not pointer/reference semantics.

dkf

@LB_ said:

The difference is that std::optional<T> has value semantics, not pointer/reference semantics.

I think you'll find that NULL has value semantics too.

An optional plus a reference type is isomorphic to a pointer type (ignoring all that pointer arithmetic stuff). Therefore a compiler could detect that situation and implement optional<ref> with a pointer (an efficient encoding), though the semantics at the language level would not see it. It's a multi-level mapping thing: I'm not just talking about C++ but also levels above and below it.

LB_

My point is that value semantics means language-level support for deep copying and no need to worry about ownership.

dkf

There you go: arguing on just one level at a time. Missing my points.

LB_

Oops, I misinterpreted the way you used the word 'levels' - sorry.

Gąska

@loose said:

Which only leaves me with one question (which I suspect this entire post is about): FFS WHY?

Inlined/generic code which sometimes is used in possibly-null context and sometimes in never-null context.

Gąska

@dkf said:

An optional plus a reference type is isomorphic to a pointer type (ignoring all that pointer arithmetic stuff). Therefore a compiler could detect that situation and implement optional<ref> with a pointer (an efficient encoding), though the semantics at the language level would not see it.

Guess what language does exactly what you described here :P

dse

@Gaska said:

99.9999999999999999999999999999999% of times you don't need it

Why not say never then, this is 100% for any practical purposes even if you live a thousand years and have to decide about using NULL 100 times a day.

blakeyrat

@Kian said:

Which is precisely my point. C++ got it right, and others chose to do things differently because C++ has cooties.

Well, maybe we'd look upon it more generously if C++ got a lot of stuff right instead of "one thing kind of right, and everything else far inferior to the C language it was trying to supplant".

Kian

Funny coming from a C# fan, a language so devoid of vision that even its name is a mangled distortion of C++'s ideas:

Pity they couldn't improve on it instead of bastardize it.

blakeyrat

@Kian said:

Pity they couldn't improve on it instead of bastardize it.

OH SNAP!

NO YOU DIDN'T!

Oh wait I don't give a shit what you think of me or C#. If you honestly think C# has no improvements over C++, well, you're welcome to believe whatever stupid wrong things you want. Kudos.

Kian

@ashkante said:

And then you end up with something like this

Ugh, I just read this. I grew disgusted when he posted the snippet in question:

The code looks like this:
struct sock *sk = tun->sk; // initialize sk with tun->sk
…
if (!tun) return POLLERR; // if tun is NULL return error

This code looks perfectly ok, right?

No it doesn't. You check if the stuff you use is not null BEFORE you use it, not after. This is basic logic, it's not even some tricky edge case. I don't even need to read further to guess what went wrong. The compiler said "well fuck you too" and removed the check because it happened after the user had already assumed the pointer could not be null by trying to read through it.

I did read further, and the idiot that wrote that then goes on to say "the compiler will introduce the vulnerability to the binary code, which didn't exist in the source code." Of course it already existed! What the hell is the program supposed to do when it tries to read from a null-pointer? What value is sk supposed to have before the function returns? Or is a seg fault the correct behavior? No compiler in existence can make up for bad code.

Polygeekery

@CoyneTheDup said:

And then one day, a real person named "DOE, JOHN D" walked into the hospital...and we already had a fake person with that name in the hospital.

It's a good thing that hotel registration systems figured that out a long time ago or else guys banging their secretaries on their lunch break would have to get more creative than just "John Smith".

anonymous234

@CoyneTheDup said:

So our users invented a name (like "John Doe", but something else, I don't remember it now). That name is on, literally, hundreds of thousands of accounts. And we've had to program for it, (IF NAME = 'DOE, JOHN' THEN PERFORM SPECIAL-BILLING...). And then the system started getting touchy about the fact that we had four "DOE, JOHN" patients in the system at the same time, so we started doing "DOE, JOHN A", "DOE, JOHN B" and sometimes the registration person had to go through several names to find one that wasn't in use right now. And then one day, a real person named "DOE, JOHN D" walked into the hospital...and we already had a fake person with that name in the hospital.

So a problem that would have been fixed by allowing Null or Optional<string> as names.

@CoyneTheDup said:

Deal with it. Coming up with nonsense like "Optional<T>" just kicks the can down the road. Ultimately, you still have to deal with it.

? Isn't that a way to deal with it? What else do you need? By your logic it's not possible to solve anything because there's always a harder case.

Null (or Optional<T>, or None in Python, etc) are good for most cases. If a case comes up where that's not enough, you turn it into a custom data type with your own logic and data, that's why they were invented.
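
A minimal Java sketch of what such a custom type could look like for the registration case, assuming identity comes from a generated id rather than from the name. The `Patient` class, its factory methods and the display text are all hypothetical; only `java.util.Optional` and `java.util.UUID` are real library types.

```java
import java.util.Optional;
import java.util.UUID;

// Hypothetical patient record: identity comes from a generated id, never from the name.
final class Patient {
    private final UUID id;
    private final Optional<String> name;

    private Patient(UUID id, Optional<String> name) {
        this.id = id;
        this.name = name;
    }

    // A patient whose name is known at registration.
    static Patient named(String name) {
        return new Patient(UUID.randomUUID(), Optional.of(name));
    }

    // An unidentified patient: no placeholder such as "DOE, JOHN" gets invented.
    static Patient unidentified() {
        return new Patient(UUID.randomUUID(), Optional.empty());
    }

    UUID id() { return id; }

    Optional<String> name() { return name; }

    // Display text is derived when needed; the stored data stays honest.
    String displayName() {
        return name.orElse("(unidentified patient)");
    }
}
```

With this shape, four unidentified patients are four distinct records with no name, and a real "DOE, JOHN D" can be registered without colliding with any of them.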

anonymous234

@CoyneTheDup said:

Falsehoods Programmers Believe About Names

All those things can be summed up as "allow any arbitrary Unicode string (including empty) as name, don't rely on it to identify people".

Except for two weird ones:

People’s names fit within a certain defined amount of space.

Is there anyone around with >5000 characters on their name?

People’s names are all mapped in Unicode code points.

What writing system is there that's not in Unicode? Is he talking about this guy?
https://www.youtube.com/watch?v=hNoS2BU6bbQ

LB_

@Kian said:

No it doesn't. You check if the stuff you use is not null BEFORE you use it, not after. This is basic logic, it's not even some tricky edge case.

I thought the same thing - what code review wouldn't catch that?

Kian

@anonymous234 said:

What writing system is there that's not in Unicode? Is he talking about this guy?

People didn't have names before writing came along?

asdf

@CoyneTheDup said:

Coming up with nonsense like "Optional<T>" just kicks the can down the road. Ultimately, you still have to deal with it.

The difference between Optional<T> and null is the difference between "the compiler forces you to deal with it" and "you may or may not have to deal with it, go figure that out yourself without any compiler support". One is clearly better than the other.

loose

@anonymous234 said:

Is there anyone around with >5000 characters on their name?

now that you have asked, probably yes. They may even be a Royal...

tar

Adolph Blaine Charles David Earl Frederick Gerald Hubert Irvin John Kenneth Lloyd Martin Nero Oliver Paul Quincy Randolph Sherman Thomas Uncas Victor William Xerxes Yancy Zeus Wolfeschlegelsteinhausenbergerdorffwelchevoralternwarengewissenhaftschaferswessenschafewarenwohlgepflegeundsorgfaltigkeitbeschutzenvorangreifendurchihrraubgierigfeindewelchevoralternzwolfhunderttausendjahresvorandieerscheinenvonderersteerdemenschderraumschiffgenachtmittungsteinundsiebeniridiumelektrischmotorsgebrauchlichtalsseinursprungvonkraftgestartseinlangefahrthinzwischensternartigraumaufdersuchennachbarschaftdersternwelchegehabtbewohnbarplanetenkreisedrehensichundwohinderneuerassevonverstandigmenschlichkeitkonntefortpflanzenundsicherfreuenanlebenslanglichfreudeundruhemitnichteinfurchtvorangreifenvorandererintelligentgeschopfsvonhinzwischensternartigraum, Senior.

MathNerdCNU

Does it? Looking at the Java 8 docs Optional<T>.get() throws NoSuchElementException if the value of Optional<T> is null; a RuntimeException. Those are unchecked so...what exactly do I get?

Instead of:

if(foo != null)
{
     foo.doSomething();
}

I get:

if(foo.isPresent())
{
     foo.get().doSomething();
}
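
For comparison, Java 8's `Optional` also has methods that take the empty case as an argument, so the `isPresent()`/`get()` pair is not the only way to use it. A minimal sketch using the real `java.util.Optional` methods `map` and `orElse`; the `Foo` class, the `OptionalDemo` holder and the `describe` helper are hypothetical stand-ins for the `foo` above, and `doSomething()` is assumed to return a String.

```java
import java.util.Optional;

// Hypothetical stand-in for the type of `foo` in the snippets above.
class Foo {
    String doSomething() {
        return "did something";
    }
}

class OptionalDemo {
    // map() applies doSomething() only when a value is present;
    // orElse() supplies the result for the empty case, so there is no get() to forget.
    static String describe(Optional<Foo> foo) {
        return foo.map(Foo::doSomething).orElse("nothing to do");
    }
}
```

This still does not make the compiler reject a careless `get()`, so the objection stands for code written in the `isPresent()` style.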

asdf

Oh, we were talking about Java? In that case, I retract my statement.

loose

@tar said:

Adolph Blaine Charles David Earl Frederick Gerald Hubert Irvin John Kenneth Lloyd Martin Nero Oliver Paul Quincy Randolph Sherman Thomas Uncas Victor William Xerxes Yancy Zeus Wolfeschlegelsteinhausenbergerdorffwelchevoralternwarengewissenhaftschaferswessenschafewarenwohlgepflegeundsorgfaltigkeitbeschutzenvorangreifendurchihrraubgierigfeindewelchevoralternzwolfhunderttausendjahresvorandieerscheinenvonderersteerdemenschderraumschiffgenachtmittungsteinundsiebeniridiumelektrischmotorsgebrauchlichtalsseinursprungvonkraftgestartseinlangefahrthinzwischensternartigraumaufdersuchennachbarschaftdersternwelchegehabtbewohnbarplanetenkreisedrehensichundwohinderneuerassevonverstandigmenschlichkeitkonntefortpflanzenundsicherfreuenanlebenslanglichfreudeundruhemitnichteinfurchtvorangreifenvorandererintelligentgeschopfsvonhinzwischensternartigraum, Senior.

Including white space, punctuation and 144 hidden (apparently double) "hyphens". My "on the fly" text editor of choice claims 1139 characters.

As this is supposed to be an unbroken world record, how do you know it will remain so, and will there be an ultimate limit?

Gąska

@dse said:

Why not say never then

Because sometimes you want.

@dse said:

this is 100% for any practical purposes

Except it's not. Once every few months, you really need optional values, but all the other minutes of your life it's just a burden. Especially if you remember empty strings are a thing too.

@Kian said:

No it doesn't. You check if the stuff you use is not null BEFORE you use it, not after.

It's kernel code. Read from address 0 is perfectly legal (on asm level, not C level), and under some circumstances you encounter in this niche, even justified.

@anonymous234 said:

What writing system is there that's not in Unicode?

I'm pretty sure some Asian names require precise pronunciation that's not written down in those weird drawings they use for scrolls and signposts.

@MathNerdCNU said:

Does it? Looking at the Java 8 docs Optional<T>.get() throws NoSuchElementException if the value of Optional<T> is null; a RuntimeException. Those are unchecked so...what exactly do I get?

Just because Java fucked up the implementation doesn't mean the idea is bad. In Rust, for example, if you have a value of type Option<T>, you must explicitly check or unwrap it before using the value inside, or else your code won't compile.

Gąska

@blakeyrat said:

@Kian said:
Pity they couldn't improve on it instead of bastardize it.

OH SNAP!

NO YOU DIDN'T!

Oh wait I don't give a shit what you think of me or C#.

this is funny because @Kian said nothing about @blakeyrat personally, yet he reacted as if insulting C# also insulted him - which is the best proof of his fanboyism.

blakeyrat

@Gaska said:

this is funny because @Kian said nothing about @blakeyrat personally, yet he reacted as if insulting C# also insulted him - which is the best proof of his fanboyism.

Except I didn't do that.

But great theory otherwise. Keep it up and you'll learn to read in no time at all.

MathNerdCNU

I did not mean to suggest the idea is bad, just in the case of Java I didn't see any benefit to the code I'd have to write using Optional<T>.

Having a compiler yell at you won't magically fix a large class of errors. If anything I just see a whole new rebirth of On Error Resume Next, because copy-pasting a template for dealing with Null/Nothing/Nil/None is the path of much less resistance than actually fucking thinking for a large portion of developers.

asdf

Well, if you declare everything Optional, even if it cannot be, then that will happen, yes. But the whole point of Optional is that in practice, very few variables can be null, so it only annoys developers in the few places where it needs to annoy them.

CoyneTheDup

@ashkante said:

And then you end up with something like this

Yeah, I know: programmer expects compiler to go backward to previous statement to fix programmer's broken logical sequence.

@loose said:
It is bad enough trying to code for idiot users, without having to second-guess what the compiler is going to do with it when you are done.

That's why the PL/1 model never took off. No one on earth had any idea what that compiler would do with anything.

@LB_ said:
The difference is that std::optional<T> has value semantics, not pointer/reference semantics.

But the people who proposed this still ignored the fact that sooner or later you have to know the difference. Suppose you have Person p, which is an Optional<T>, and it has similar name and address members. You want to create a mailing label:
  Label lab = new Label();
  lab.setName(p.name());
  lab.setAddress(p.address());
If p didn't have a valid name or address, you've just now created a bogus mailing label. The fact that it is sent to "No name / No address / No city / ?? / 00000-0000" doesn't work in the real world any better than "null / null / null / null / null".

You have to think about the fact that there will be person objects that don't have a name and address, and blaming that on null is just shortsighted.

@LB_ said:
My point is that value semantics means language-level support for deep copying and no need to worry about ownership.

That way you can copy bogus data without having to worry about the copy failing. Does this improve the output?

CoyneTheDup

@anonymous234 said:

Is there anyone around with >5000 characters on their name?

Well, when you're writing my Person object, you can just set the length to 5000 and then you'll find out.

cvi

@CoyneTheDup said:

That way you can copy bogus data without having to worry about the copy failing. Does this improve the output?

std::optional<T> which @LB_ is talking about doesn't work that way. You're either copying an optional that doesn't contain a value, or you are making a copy of T (or moving them around, where applicable). This is unlike a T*, where you are copying the pointer (and the new pointer may not point to a value, i.e., it's null; or to the same T as the original pointer).

You still have to eventually deal with the fact that there's no T around - that hasn't changed from using pointer & null.

(IMO, I think it's a mistake to have the operator* and operator-> overloads with undefined behaviour when the optional doesn't contain a value.)

CoyneTheDup

@anonymous234 said:

What writing system is there that's not in Unicode? Is he talking about this guy?

(clicked on the wrong @%#$@! reply button again)
There are actually quite a few. Because Unicode has the same problem as the rest of computer science. It started out with the simple and laudable goal of having a code point for each of the symbols in each of the world's alphabets.

Then it encountered the world. And combining diacritics. And languages that don't have written alphabets. And the need to represent things like non-breaking space. And, finally, of course, names like Derek <plunk>.

Unicode was supposed to be simple, then it met the world, now it's just as messy as the world.

@LB_ said:

I thought the same thing - what code review wouldn't catch that?

Sure didn't pass mine--got a world class wince.

@asdf said:

Well, if you declare everything Optional, even if it cannot be, then that will happen, yes. But the whole point of Optional is that in practice, very few variables can be null, so it only annoys developers in the few places where it needs to annoy them.

What if you have to declare everything Optional because it can be?

LB_

@CoyneTheDup said:

If p didn't have a valid name or address, you've just now created a bogus mailing label.

I don't think we're on the same page. The code you wrote is not valid at all - std::optional is not a proxy type, it's just a container. You still need to first extract the data it might contain. In other words, you have to check it just like you would check a pointer/reference, but unlike pointers/references std::optional has value semantics. It's a wrapper.

http://en.cppreference.com/w/cpp/experimental/optional
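
The same point can be sketched in Java terms, with the caveat that this uses `java.util.Optional` rather than C++'s `std::optional`, and that `Person`, `Label`, `Labels` and `labelFor` are hypothetical names modelled on the mailing-label example. Calling `p.name()` directly on an `Optional<Person>` does not compile; the person has to be taken out first, and the empty case produces no label instead of a bogus one.

```java
import java.util.Optional;

// Hypothetical types modelled on the Person/Label example quoted above.
class Person {
    private final String name;
    private final String address;

    Person(String name, String address) {
        this.name = name;
        this.address = address;
    }

    String name() { return name; }
    String address() { return address; }
}

class Label {
    final String name;
    final String address;

    Label(String name, String address) {
        this.name = name;
        this.address = address;
    }
}

class Labels {
    // The lambda runs only when a Person is actually present;
    // an empty input gives an empty result, never a "null / null" label.
    static Optional<Label> labelFor(Optional<Person> p) {
        return p.map(person -> new Label(person.name(), person.address()));
    }
}
```

The caller still has to decide what to do when no label comes back, which is the "you still have to deal with it" part; the wrapper only makes that decision impossible to skip silently.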

Gąska

@blakeyrat said:

Except I didn't do that.

Yeah, you totally didn't say "me or C#".

Gąska

@MathNerdCNU said:

I did not mean to suggest the idea is bad, just in the case of Java I didn't see any benefit to the code I'd have to write using Optional<T>.

34. Make your own Optional<T>.
56. ???
78. Profit.
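
One possible shape for a home-made optional in Java, as a sketch only: the `Maybe` name and its `fold` method are invented here and are not what any poster in this thread proposed. The idea is that there is no `get()` at all, so the only way to reach the value is to also supply a handler for the empty case.

```java
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical home-made optional: no get(), so the value is reachable only
// through fold(), which demands a handler for the empty case as well.
abstract class Maybe<T> {
    private Maybe() {}

    abstract <R> R fold(Supplier<R> ifEmpty, Function<T, R> ifPresent);

    // Wraps a non-null value; null is rejected so "present" never means null.
    static <T> Maybe<T> of(final T value) {
        if (value == null) {
            throw new IllegalArgumentException("use Maybe.empty() for no value");
        }
        return new Maybe<T>() {
            @Override
            <R> R fold(Supplier<R> ifEmpty, Function<T, R> ifPresent) {
                return ifPresent.apply(value);
            }
        };
    }

    // The "no value" case.
    static <T> Maybe<T> empty() {
        return new Maybe<T>() {
            @Override
            <R> R fold(Supplier<R> ifEmpty, Function<T, R> ifPresent) {
                return ifEmpty.get();
            }
        };
    }
}
```

A call such as `Maybe.of("Ann").fold(() -> "no name", s -> "name: " + s)` has to spell out both branches, which is roughly the guarantee the `isPresent()`/`get()` style lacks.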

@MathNerdCNU said:

Having a compiler yell at you won't magically fix a large class of errors.

It will solve the entire "I forgot to null check" class of errors. "Forgot" is the key word here.

@MathNerdCNU said:

If anything I just see a whole new rebirth of On Error Resume Next, because copy-pasting a template for dealing with Null/Nothing/Nil/None is the path of much less resistance than actually fucking thinking for a large portion of developers.

It won't make idiot developers' code any better, but it will make actual programmers' code better.

asdf

@CoyneTheDup said:

What if you have to declare everything Optional because it can be?

I think you're starting to invent stuff here to prove your point that Optional doesn't make sense. Sure, there are systems out there that have to deal with such cases, but in 80% of the applications out there 90% of the pointers are not nullable. Making that explicit helps programmers do the right thing.