Earnestly thinking NULL is a mistake is a symptom

Kian

I don't see anything hard to reason about. I have to check if value exists anyway - the only difference is whether I check before I get the element or after I get the element.

Ok, imagine you have these functions:

optional<Bar> BarFactory();
optional<Baz> BazFactory(const Bar& bar);

// We receive an optional Bar so it's easy to combine with other functions that return optionals.
// So handy! If we have a Baz we do our thing, if we don't we do nothing
optional<Baz> Foo(optional<Bar> optBar) {
  if (optBar.IsValid()) {
    auto baz = BazFactory(optBar.Get());
    return baz;
  }
  else return {};
}

Now you do something like:

auto optBaz = Foo(BarFactory());

Quiz time: when optBaz is empty, is that because BarFactory failed, and thus returned nothing to Foo, or because BazFactory failed?

Bonus round:

auto optBar = BarFactory();
if (optBar.IsValid()) {
  auto optBaz = Foo(optBar.Get());
}

I gave a Bar to Foo, so it should have done its thing and optBaz now should definitely have something, right? Well, no. BazFactory could have still failed. Sure, Rust won't let me make the error of getting at the value if BazFactory failed.

So optional not only signals a value is optional, it's also going to have to signal error states. Which means optional will quickly become StatusOr<T>, where instead of a None state you'll have an OkEmpty, OkFull, and Error.
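For illustration, here is a minimal C++17 sketch of what a StatusOr-style type could look like. Everything in it is made up for the example and not taken from any real library: the Error enum, the bool flags that simulate failures, and the MakeBaz helper. It also folds OkEmpty and OkFull into a single ok state to stay short.

```cpp
#include <utility>
#include <variant>

// Hypothetical error codes; the thread never names any.
enum class Error { BarFailed, BazFailed };

// Hypothetical stand-in for StatusOr<T>: holds either a T or an Error.
template <typename T>
class StatusOr {
 public:
  StatusOr(T value) : state_(std::move(value)) {}
  StatusOr(Error error) : state_(error) {}

  bool ok() const { return std::holds_alternative<T>(state_); }
  const T& value() const { return std::get<T>(state_); }
  Error error() const { return std::get<Error>(state_); }

 private:
  std::variant<T, Error> state_;
};

struct Bar { int id; };
struct Baz { int barId; };

// Toy factories: the bool parameter only simulates success or failure.
StatusOr<Bar> BarFactory(bool succeed) {
  if (!succeed) return Error::BarFailed;
  return Bar{1};
}

StatusOr<Baz> BazFactory(const Bar& bar, bool succeed) {
  if (!succeed) return Error::BazFailed;
  return Baz{bar.id};
}

// Chains the two steps and forwards whichever error happened first.
StatusOr<Baz> MakeBaz(bool barSucceeds, bool bazSucceeds) {
  StatusOr<Bar> bar = BarFactory(barSucceeds);
  if (!bar.ok()) return bar.error();
  return BazFactory(bar.value(), bazSucceeds);
}
```

With something like this, the caller of MakeBaz can answer the quiz question by looking at error(), at the price of one explicit check per step.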

Basically, if you use exceptions, you simplify all this nonsense to:

// BarFactory and BazFactory are now constructors.
// They return a value or throw an exception

Baz Foo(const Bar& bar) {
  return Baz(bar);
}

And later:

auto baz = Foo(Bar());
// baz either has a value, or an exception was thrown and we never reach past this point.

There is no need to make piles of conditional blocks, tracking if a value exists or if some function failed at some point. Optional is basically one more in a long series of crutches people use to make up for not using exceptions. With exceptions, it's between useless and extremely rare.

@Gaska said:

We had this discussion the other month.

It's the argument that keeps on giving!

@Gaska said:

This particular thing is so esoteric that if I were to create C++ curriculum, I would put it after integer overflow handling.

Really? Not in the section talking about pointers, or arrays? That seems like a weird place to talk about it.

"So now that we covered overflow, remember when I said a pointer points to a valid object or is null? I lied, actually it can also point to one past the end of an array."
"Can we treat any given variable as an 'array' of length one?"
"Excellent question. Now let's move on to virtual inheritance."

LB_

@Kian said:

Not in the section talking about pointers, or arrays?

IMO pointers and arrays are among the last things I would teach. I'd teach the more useful stuff first. (Even if you don't agree with me, you should at least agree that std::vector should be taught before dynamic memory or arrays.)

Kian

Hmm, I don't know where in the curriculum I'd put those. On the one hand, yeah, new students would probably be better served by using std::array and std::vector for all their storage needs (fixed or variable respectively). On the other hand, they'll need to know about references soonish. Else they'll try to modify some string, and all their functions will operate on copies. And once you teach references, there might be a desire to explain pointers. So you can't push it too far back.

LB_

I'd probably teach references at the same time I teach functions and variables.

Captain

So optional not only signals a value is optional, it's also going to have to signal error states. Which means optional will quickly become StatusOr<T>, where instead of a None state you'll have an OkEmpty, OkFull, and Error.

Basically, if you use exceptions, you simplify all this nonsense ...

Yes, this is all true. In Haskell, a common newbie mistake is to make their own monads like:

data MyMonad a = Success a
               | InvalidSyntax 
               | InvalidSemantics 
               | SubmittedByGuyIHate 
               | Etc

This is a bit of an anti-pattern, because you can just do

data States = InvalidSyntax | InvalidSemantics | SubmittedByGuyIHate | Etc

and then use the Error monad:

data Either a b = Left a | Right b
type Error a b = Either a b

myValue :: Error States Int
myValue = ...

In other words, the monad lets you throw an error from the list of States (with Left) or return a good value with Right.

In this way, you can just make an enumeration of the errors and re-use the plumbing code instead of making a new monad for each.

Even so, Maybe is still useful, because sometimes you just don't care why a value doesn't exist.

dkf

@sloosecannon said:

Checked exceptions have to be handled somewhere. They don't have to be handled at the lowest level.

The difference between a checked exception and an unchecked exception is simple: checked exceptions are considered to be part of the contract of the method, whereas unchecked exceptions are considered to be usually a violation of the method contract. The former is useful in things like IO code where you really do need to be prepared for shit happening on any syscall (because it does sometimes, OK?) but the latter is pragmatically necessary because shit sometimes happens despite your best efforts. And programmers fuck up.

Java also separates out highly critical stuff as errors, which are where the shit hits the fan and you're probably going to have to kill the process soon. (They're usually for situations where an equivalent C++ process would be processing one of the less pleasant signals, such as SIGILL or SIGSEGV.) I'm not exactly convinced that separating errors and runtime exceptions was a good idea, but whatever.

C# says that all exceptions are unchecked, and that effectively the failure model is never part of the contract of a method. I can see where they're coming from, but I'm not convinced.

LB_

I think checked exceptions shouldn't be exceptions; the language should have a construct that forces you to deal with both the success and failure cases (just like Rust does with Optional). I don't know of any language that does that though, and the way it's done in Java just seems lazy.

Gąska

@Kian said:

Quiz time: when optBaz is empty, is that because BarFactory failed, and thus returned nothing to Foo, or because BazFactory failed?

You don't care. If you do care, you should use something else than optional. A right tool for a right job.

@Kian said:

So optional not only signals a value is optional, it's also going to have to signal error states. Which means optional will quickly become StatusOr<T>, where instead of a None state you'll have an OkEmpty, OkFull, and Error.

Yes, by necessity it does. And it serves a different purpose than plain optional now - that's why it's a different type.

@Kian said:

There is no need to make piles of conditional blocks, tracking if a value exists or if some function failed at some point.

But you have to write exactly as many catch blocks now as you would have written ifs.

@Kian said:

It's the argument that keeps on giving!

Because you constantly necro it.

@Kian said:

Really? Not in the section talking about pointers, or arrays? That seems like a weird place to talk about it.

The "absurdly specific exceptions from rules in C++ specification" chapter is about as difficult as, and IMO less important than, the "quirks of underlying hardware that can bite you through any number of abstraction layers and how to deal with them" chapter. It seems very simple to say "pointer immediately after the end of array is OK", but only until some student asks you why.

@Kian said:

And once you teach references, there might be a desire to explain pointers.

Remember that people new to programming don't have this brain worm that makes you think "references are const pointers in disguise".

dkf

@LB_ said:

I think checked exceptions shouldn't be exceptions; the language should have a construct that forces you to deal with both the success and failure cases (just like Rust does with Optional).

I'm also not convinced by that. A lot of IO exceptions are quite reasonably caught quite a bit higher up the stack than the frame they originate from. Which is fine: the code to read a byte probably doesn't know how to handle EOF nearly so well as the code that's reading a whole text document. Exceptions really simplify error handling a lot, so much so that programmers are more likely to take the effort to handle errors in the first place.

Using a result status value is much more likely to tempt people into just not handling the failure at all. There's decades of sad evidence to support that. Yes, you can make the error state bubble out by default (which is what Haskell's IO monad has been set up to do), but then you're really creating exceptions anyway, just with sillier syntax. (Error state value passing is one way that an exception system implementation might be coded under the hood, but writing it all out by hand is really annoying and easy to fuck up. Let the compiler/language engine do it.)

Gąska

@dkf said:

Exceptions really simplify error handling a lot, so much so that programmers are more likely to take the effort to handle errors in the first place.

As evident in the articles on TDWTF.

Kian

@Gaska said:

You don't care. If you do care, you should use something else than optional. A right tool for a right job.

No one writes a program so that it maybe does something. You write a program to do something. At some point, you have to resolve the ambiguity and use all these optional values. At any of those points, where you care, you might wonder why the thing you hoped would hold a value doesn't.

@Gaska said:

But you have to write exactly as many catch blocks now as you would have written ifs.

No you don't. Look at optional<Baz> Foo(optional<Bar> optBar) vs Baz Foo(Bar bar):

optional<Baz> Foo(optional<Bar> optBar) {
  if (optBar.IsValid()) {
    return BazFactory(optBar.Get());
  }
  else return {};
}

Baz Foo(const Bar& bar) {
  return Baz(bar);
}

Before you object that the optional version takes an optional<Bar> while the other one takes a const Bar&, keep in mind that that allows me to pass the output from the factory directly. Otherwise to call it I'd have to do the if check outside, so I have one if either way.

Basically, whenever you call a function that needs a value, you have a series of ifs (one per necessary parameter). With exceptions, you necessarily have a value, or you already threw an exception. The only places where you need catch clauses are where you actually have enough context to do something about it. Which is where in optional land you'd do some error handling instead of returning empty.
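To make the exception side of that comparison concrete, here is a small sketch. The bool flags on the constructors and the Run function are assumptions added only to simulate failures and to have one place with "enough context"; they are not part of the original example.

```cpp
#include <stdexcept>
#include <string>

struct Bar {
  // Hypothetical flag: simulates a failure while constructing a Bar.
  explicit Bar(bool succeed) {
    if (!succeed) throw std::runtime_error("Bar failed");
  }
};

struct Baz {
  // Hypothetical flag: simulates a failure while constructing a Baz.
  Baz(const Bar&, bool succeed) {
    if (!succeed) throw std::runtime_error("Baz failed");
  }
};

// No checks here: if we got this far, bar is a real Bar.
Baz Foo(const Bar& bar, bool bazSucceeds) {
  return Baz(bar, bazSucceeds);
}

// The single place that knows what to do about a failure.
std::string Run(bool barSucceeds, bool bazSucceeds) {
  try {
    Baz baz = Foo(Bar(barSucceeds), bazSucceeds);
    (void)baz;  // a real program would use baz here
    return "ok";
  } catch (const std::runtime_error& e) {
    return e.what();  // also tells us which step failed
  }
}
```

Foo itself contains no conditional and no catch; only Run, which decides what a failure means for the program, has a try block.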

You could "solve" that by making every parameter everywhere optional. But then optional would no longer signal optionality, it's just the default so you never have to write an if.

@Gaska said:

Because you constantly necro it.

I didn't necro it! It evolved organically out of a conversation about a feature that exists mostly to replace exceptions.

@Gaska said:

Remember that people new to programming don't have this brain worm that makes you think "references are const pointers in disguise".

Well, I was thinking in the context of function parameters. I guess you could defer that part and just tell them not to do it. I mean, what are the chances they'll ever need optional parameters? And you can just have two functions for that.

Gąska

@Kian said:

No one writes a program so that it maybe does something.

fopen() maybe opens file. find() maybe gives you the found element. MS Word save dialog maybe saves the file. CloneCD maybe copies the CD. And so on and so on. "Maybe" here means that it might fail. Reason for failure is sometimes very important, but sometimes it's enough for you to know that something has failed, one way or another - and sometimes it isn't really an error but expected behavior.

@Kian said:

At some point, you have to resolve the ambiguity and use all these optional values.

We're not talking about whether an optional has a value or not - we're talking about different kinds of not having the value. In practice, reason for failure is very rarely taken into consideration - and when it is, there are also tools to do it. Optional isn't one of them.

@Kian said:

optional<Baz> Foo(optional<Bar> optBar) {
  if (optBar.IsValid()) {
    return BazFactory(optBar.Get());
  }
  else return {};
}

Baz Foo(const Bar& bar) {
  return Baz(bar);
}

And you're handling the exceptions where?

@Kian said:

Before you object that the optional version takes an optional<Bar> while the other one takes a const Bar&, keep in mind that that allows me to pass the output from the factory directly. Otherwise to call it I'd have to do the if check outside, so I have one if either way.

You don't get it, do you? Foo() requires Bar, so it should be required argument, not optional. If it's optional, it suggests that the function works if there's no Bar - but what it really does is nothing. It doesn't do any work. In other words, it doesn't work. So Bar is necessary for it to work. Your example is bad design. Null check should happen on the caller's side. It's the caller's fucking responsibility to provide valid object for Foo() to work.

Not to mention your example is terribly unfair - not only in the second case you sidestep the whole issue with Bar not being there by using reference, which by definition requires Bar to be there, but also you pass Bar in one case by value and in the other you pass it by... well, reference. In Rust terms, first function consumes Bar and the other just borrows it - a big difference in semantics.

@Kian said:

I didn't necro it!

Yes you did. You started talking about error handling.

@Kian said:

It evolved organically out of a conversation about a feature that exists mostly to replace exceptions.

YOU brought it into this discussion in the first place!

@Kian said:

Well, I was thinking in the context of function parameters. I guess you could defer that part and just tell them not to do it. I mean, what are the chances they'll ever need optional parameters? And you can just have two functions for that.

OK, now I'm completely lost. What were you saying in the context of function parameters that I've taken out of the context of function parameters?

Salamander

@Kian said:

I say optional is bad in general. For example, in a map, the correct thing to do would be to check if a key exists and then get the value. But that's expensive, as it requires two searches or some complicated caching behavior. So as an optimization, you can return an optional type of some kind (for example, a pointer), which will either have the value you searched for, or be empty.

When your map is accessed by multiple threads, doing a check and then getting the value is completely the wrong thing to do.
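A minimal sketch of why the single call matters, assuming a hypothetical SharedMap wrapper around std::map guarded by one mutex (the thread only mentions a map.TryGet("bar") call, so the class and its other methods are invented here):

```cpp
#include <map>
#include <mutex>
#include <optional>
#include <string>

// Hypothetical thread-safe wrapper; not an API from any real library.
class SharedMap {
 public:
  void Set(const std::string& key, int value) {
    std::lock_guard<std::mutex> lock(mutex_);
    values_[key] = value;
  }

  void Erase(const std::string& key) {
    std::lock_guard<std::mutex> lock(mutex_);
    values_.erase(key);
  }

  // One lookup under one lock. With separate Contains() and Get() calls,
  // another thread could erase the key between the two.
  std::optional<int> TryGet(const std::string& key) const {
    std::lock_guard<std::mutex> lock(mutex_);
    auto it = values_.find(key);
    if (it == values_.end()) return std::nullopt;
    return it->second;  // copied out while the lock is still held
  }

 private:
  mutable std::mutex mutex_;
  std::map<std::string, int> values_;
};
```

The caller still has to check the returned optional, but the answer it checks is a snapshot that cannot be invalidated halfway through.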

@Kian said:

That happens to make writing code easier, but it makes reasoning about the code harder

How does it make reasoning about code harder?
It makes it very explicit that if you call map.TryGet("bar") it may not have any value.

@Kian said:

Some newer languages try to help by knowing about this ambiguity at the syntax level, and making sure you remember to check before using it. That helps. But then they try to extend this to types with optional members, or functions with optional variables, which is when you need to add a bit to say whether the value actually exists or not. So you have a bit of memory set aside for the object, but you don't actually know if the object exists.

Which is a fundamentally different concept from the pointer kind of optional. But despite these two concepts being fundamentally different (one is something that may exist, the other is something that may point to something that exists), because they seem to be used the same way, they are treated the same.

They are treated the same because the difference is an implementation detail behind saying that something may not exist.
Incidentally, how would you go about using null pointers on a system where every possible value of an int32 is a valid pointer?

Salamander

@Kian said:

I didn't necro it! It evolved organically out of a conversation about a feature that exists mostly to replace exceptions.

They don't replace exceptions. They have nothing to do with exceptions.
It's in the name: it's for cases where a value does not necessarily exist, and is not needed to exist.
It's got the same use case as nullable references, except it makes it explicit by having them type-checked by the compiler.

asdf

I say optional is bad in general. For example, in a map, the correct thing to do would be to check if a key exists and then get the value. But that's expensive, as it requires two searches or some complicated caching behavior. So as an optimization, you can return an optional type of some kind (for example, a pointer), which will either have the value you searched for, or be empty.

If you don't like that, I guess you don't like futures either then? Because it's basically the same concept. You're passing a handle around that may or may not point to an actual value up until the point where you actually need to use the value. Then, you check/wait for its existence there, in the correct place.

asdf

Yeah, that's where C++ really sucks. That's why I only claimed non-virtual destructors are not necessarily a problem.

LB_

I'm not sure how C++ sucks there - value semantics and polymorphism are incompatible concepts. It is literally impossible to design a programming language where copy assignment is a safe thing to do with polymorphic types. C++ doesn't try very hard to stop you from slicing yourself in half, but at least it lets you choose between value semantics and polymorphism on a case-by-case basis.
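For anyone who hasn't run into slicing, a small sketch with made-up Animal and Dog types shows what copying a polymorphic object by value does:

```cpp
#include <string>

struct Animal {
  virtual ~Animal() = default;
  virtual std::string Name() const { return "animal"; }
};

struct Dog : Animal {
  std::string Name() const override { return "dog"; }
};

// Takes its argument by value: only the Animal part of a Dog is copied,
// so the copy really is a plain Animal.
std::string NameByValue(Animal a) { return a.Name(); }

// Takes a reference: no copy is made and the dynamic type is preserved.
std::string NameByReference(const Animal& a) { return a.Name(); }
```

Passing a Dog to NameByValue compiles without complaint and returns "animal", while NameByReference returns "dog".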

If you know a language where it is safe to value copy assign a polymorphic type, I'd be interested to know - it would probably be highly restricted because of how many things can go wrong.

dse

@LB_ said:

I still don't understand why you would ever want to derive from any string class ever anyway

App developers want to add new functionality to strings all the time (I am not an app developer, but I have seen them adding things to NSString for all sorts of crap). It does not have to be through inheritance; in ObjC (eek) one can add it as a category, which is actually a nice feature.

LB_

Adding functionality is fine. Adding data members is an entirely different thing.

dkf

@LB_ said:

If you know a language where it is safe to value copy assign a polymorphic type, I'd be interested to know - it would probably be highly restricted because of how many things can go wrong.

Value semantics doesn't mean that you are dealing with actual values. You can get the same thing with handle-based implementations (which cope with polymorphism fine) if you use a copy-on-write modification system. Getting good speed with that is tricky, but doable.
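A rough C++ sketch of that idea, with invented Shape, Square and ShapeValue types. It assumes single-threaded use (checking use_count() is not a safe test when several threads share the handle) and makes no attempt at the "good speed" part:

```cpp
#include <memory>
#include <string>
#include <utility>

struct Shape {
  virtual ~Shape() = default;
  virtual std::unique_ptr<Shape> Clone() const = 0;
  virtual std::string Name() const = 0;
  virtual void Scale(int factor) = 0;
  virtual int Size() const = 0;
};

struct Square : Shape {
  explicit Square(int length) : side(length) {}
  std::unique_ptr<Shape> Clone() const override {
    return std::make_unique<Square>(*this);
  }
  std::string Name() const override { return "square"; }
  void Scale(int factor) override { side *= factor; }
  int Size() const override { return side; }
  int side;
};

// Hypothetical handle with value semantics: copies share one object
// until one of them asks for write access.
class ShapeValue {
 public:
  explicit ShapeValue(std::unique_ptr<Shape> shape) : shape_(std::move(shape)) {}

  const Shape& Read() const { return *shape_; }

  Shape& Write() {
    // Detach from the other handles before modifying anything.
    if (shape_.use_count() > 1) shape_ = shape_->Clone();
    return *shape_;
  }

 private:
  std::shared_ptr<Shape> shape_;
};
```

Copying a ShapeValue never slices, because the handle is what gets copied; the dynamic type is only duplicated through the virtual Clone().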

CoyneTheDup

@sloosecannon said:

@Kian said:
each function receives it and converts it,

is incorrect. Because they receive it but don't attempt to do anything with it. It just goes up to the caller.

I'm wondering if @Kian isn't thinking of the "conversion" some people do, which breaks the stack trace:

    try { ... something ... }
    catch (WhateverFailException e) {
         throw new MyException("blah failed");
    }

If you don't chain the stack trace, errors do get hard to find.
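The snippet above looks like Java, where the usual fix is to pass the caught exception along as the cause. C++ exceptions carry no stack trace, but the standard library has a comparable way to keep the original error attached. A sketch, with ReadConfig, StartUp and Describe as made-up names:

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical low-level operation that fails.
void ReadConfig() { throw std::runtime_error("file not found"); }

// Converts the error to a higher-level one, but keeps the original
// attached as the nested cause instead of dropping it.
void StartUp() {
  try {
    ReadConfig();
  } catch (...) {
    std::throw_with_nested(std::runtime_error("startup failed"));
  }
}

// Walks the chain from the outermost error inward and joins the messages.
std::string Describe(const std::exception& e) {
  std::string text = e.what();
  try {
    std::rethrow_if_nested(e);  // does nothing if e has no nested cause
  } catch (const std::exception& inner) {
    text += " <- " + Describe(inner);
  }
  return text;
}

std::string RunStartUp() {
  try {
    StartUp();
  } catch (const std::exception& e) {
    return Describe(e);
  }
  return "ok";
}
```

RunStartUp() returns "startup failed <- file not found", so whoever reads the log still sees where the failure started.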

@Kian said:

Quiz time: when optBaz is empty, is that because BarFactory failed, and thus returned nothing to Foo, or because BazFactory failed?

I would argue that the example is a problem due to a bad pattern of usage. A better use would be:

Baz Foo(optional<Bar> optBar) {
  if (optBar.IsValid()) {
    // BazFactory is assumed to return a Baz or throw here
    auto baz = BazFactory(optBar.Get());
    return baz;
  }
  else throw BazException();
}

(Assuming of course that a Baz cannot be created unless it contains a valid Bar.)

Optional serves the role of retrieval of sub members that are optional. This is a factory, which isn't the same thing at all: You call a BazFactory, you expect to get a Baz; and if one cannot be returned, then that should be an Exception of some kind.

I'm specifically arguing that returning an Optional from a factory is an anti-pattern. The reason that is so is because the Optional, in this case, is being used as a semaphore to indicate that BazFactory failed. But a failure should be handled with an Exception, not a semaphore.

This kind of changes the rest of your example, so I'll skip that.

Gąska

@CoyneTheDup said:

I would argue that the example is a problem due to a bad pattern of usage. A better use would be:

No it wouldn't be any better. You're still using optional type for required argument.

CoyneTheDup

@Gaska said:

@CoyneTheDup said:
I would argue that the example is a problem due to a bad pattern of usage. A better use would be:

No it wouldn't be any better. You're still using optional type for required argument.

Okay, yes, that's true. My focus was on the use of Optional as a semaphore for the result, and I really didn't think of that.

dkf

@CoyneTheDup said:

If you don't chain the stack trace, errors do get hard to find.

The other anti-pattern that I've seen rather a lot of is where someone catches an exception, logs it, and rethrows it on its merry way. In a complex application, this probably results in each exception being logged with a full stack trace 10 or 20 times. Yay.

Gąska

Option isn't that bad here, actually. Of course a better approach would be something that suggests there might be unwanted error conditions that shouldn't happen under normal circumstances - one way is exceptions, the other is the "StatusOr" @Kian proposed (though the name he chose sucks). In idiomatic Rust, it would be probably this:

fn foo(bar: &Bar) -> Result<Baz, MyError> {
    bar.get_baz().map_err(|_| MyError::Whatever)
}

Clear, concise, and no exceptions needed.

Kian

@Gaska said:

You don't get it, do you? Foo() requires Bar, so it should be required argument, not optional. If it's optional, it suggests that the function works if there's no Bar - but what it really does is nothing.

Hmm, I did misunderstand. I bundled up optionality with error handling, since one can be implemented in terms of the other. My bad.

@Gaska said:

Yes you did. You started talking about error handling.

Maybe because the main reason why a function might or might not return a value is because of errors.

@Salamander said:

Incidentally, how would you go about using null pointers on a system where every possible value of an int32 is a valid pointer?

I'd have a global one byte thing (or however small I can make it) serve as my null object. Then when some TryWhatever function fails, I'd have them point at the address of my null object. Then my nullptr comparisons would be against the address of that global. Since that will get resolved at link time, the comparison should be against a constant, which is nice. If I can get the global to be at zero, even better, although much would depend on what language I'm using and what the architecture is like.

Why, would you have doubled the size of every pointer instead?
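A rough sketch of that null-object idea in C++, assuming a hypothetical Widget type and TryFind function. The sentinel is given maximum alignment so that casting its address to Widget* and back is well defined, which costs some padding on top of the one byte:

```cpp
#include <cstddef>

// Hypothetical global whose address stands in for "no object".
// It is never read or written; only its address matters.
alignas(std::max_align_t) char kNullObject;

struct Widget { int value; };

Widget* NullWidget() { return reinterpret_cast<Widget*>(&kNullObject); }

bool IsNull(const Widget* p) {
  return reinterpret_cast<const char*>(p) == &kNullObject;
}

// Returns a pointer to the matching element, or the sentinel if none matches.
Widget* TryFind(Widget* items, int count, int wanted) {
  for (int i = 0; i < count; ++i) {
    if (items[i].value == wanted) return &items[i];
  }
  return NullWidget();
}
```

Nothing stops a caller from dereferencing the sentinel, so every use still needs an IsNull check first.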

@Gaska said:

(though the name he chose sucks)

I didn't choose it. I've actually seen it used when exceptions are disabled.

sloosecannon

@CoyneTheDup said:

I'm wondering if @Kian isn't thinking of the "conversion" some people do, which breaks the stack trace

Yeah he probably is. That's so outrageously stupid to do though - even if you're going to eat the exception at a certain level, you should pass the cause in so your stack trace includes it in the "caused by".

Salamander

@Kian said:

I'd have a global one byte thing (or however small I can make it) serve as my null object. Then when some TryWhatever function fails, I'd have them point at the address of my null object.

Now the compiler has to have an implicit "null" check around every single pointer access to determine if you can actually dereference it or not.
Which is now the size of a pointer + asm to check for "null", anywhere there isn't already a null check.

Using Optional means the compiler knows which pointers are always non-null, and which can be null, so it can skip the null checks altogether in the "required" case.
It can also combine your idea for a placeholder value, so that it takes only one word of extra space across the entire application.

Gąska

@Kian said:

Maybe because the main reason why a function might or might not return a value is because of errors.

A map not having a value is not an error. Neither is iterator coming to the end - just a thing to remember when someone claims Python is perfect.

@Kian said:

I didn't choose it. I've actually seen it used when exceptions are disabled.

Okay, the name they chose sucks.

Kian

@sloosecannon said:

Yeah he probably is.

Yeah I am. Not saying it's the right way to do it, but when forced to do something to get the compiler to shut up, some people just do the dumbest thing that works.

@Salamander said:

Now the compiler has to have an implicit "null" check around every single pointer access to determine if you can actually dereference it or not.

Not every one. If you have something like C++'s references, nearly all your pointers are already assumed non-null. If you were saying that you have to have some way to tell the compiler "this may be null" and "this is never null", we agree. I just disagreed on what the best way of saying "this may be null" is.

@Gaska said:

A map not having a value is not an error.

It's also not the main reason a function might not return a value.

Gąska

@Kian said:

It's also not the main reason a function might not return a value.

You must differentiate the cases where a function might not return a value and the cases where an error occurred that prevented the function from working right. The line between the two is very blurred and it's not always obvious (for example - is an empty config file an error or not? Or a listening socket that received no data - error or not?), so it's left to the developer's best judgement whether he treats a given code path as normal behavior or not. But once he finally decides which case it is, he's not free to do whatever he wants anymore - he should either stick to propagating errors (via returning or exceptions or whatever), or stick to nulls/optionals. Using optional for signaling errors is wrong, just as using errors for signaling normal conditions is (i.e. if an iterator reached the end, it shouldn't throw).

LB_

@dkf said:

Value semantics doesn't mean that you are dealing with actual values.

Sorry for the confusion - I'm used to C++'s value semantics, and I didn't realize 'value semantics' was such a vague term. I want to see another language where operator= and polymorphism are both present.

Gąska

Does interfaces-only polymorphism count too for you?

LB_

If you only allow inheriting abstract classes, I guess that solves the problem, but that's quite a huge restriction.

CoyneTheDup

@dkf said:

The other anti-pattern that I've seen rather a lot of is where someone catches an exception, logs it, and rethrows it on its merry way. In a complex application, this probably results in each exception being logged with a full stack trace 10 or 20 times. Yay.

Twenty times the log: twenty times the fun!

@LB_ said:

If you only allow inheriting abstract classes, I guess that solves the problem, but that's quite a huge restriction.

Second.

dkf

@Salamander said:

Now the compiler has to have an implicit "null" check around every single pointer access to determine if you can actually dereference it or not.

Would you mind if I did that check in hardware?

The current cost of such checks is one page taken out of the address space, and some complexity in the memory handling which the hardware makers and OS makers have sorted out for me. If something bad happens and an attempt to dereference NULL is done, the CPU traps it and my process gets a signal.

dkf

@LB_ said:

I'm used to C++'s value semantics, and I didn't realize 'value semantics' was such a vague term.

That's another reason it is good to be a polyglot. You understand the concepts behind things better, and don't mix them up with a particular language's interpretation of them.

Salamander

@dkf said:

Would you mind if I did that check in hardware?

That would be the non-wtfy solution, but if every pointer value is considered valid then it is most likely something not supported by whatever hardware you're compiling for, which was kinda where I was going with that point.

dkf

@Salamander said:

if every pointer value is considered valid

Then you're targeting something like an 8-bit or possibly a 16-bit platform. For 8-bit platforms, you might as well hand-verify the entire program, and some of them still didn't like you smashing around the zero page in the first place. (IIRC, the interrupt table was at the start of memory on the Z80.)

Scarlet_Manuka

@dkf said:

IIRC, the interrupt table was at the start of memory on the Z80.

More or less, depending on interrupt mode. NMI always went to 0066, maskable interrupts in IM 0 didn't automatically jump (you'd call RST nn usually, so you could avoid using RST 00 for your ISR in mode 0), IM 1 interrupts always went to 0038. It's only in IM 2 that the interrupt address was supplied by the interrupt vector.

Of course if you got a reset then execution always started off from 0000 again. So if a reset was possible you still wouldn't want to screw around with the start of memory. (Most of the Z80 micros had their ROM at the start of memory, so it wasn't an option anyway.)