Why does the C# compiler think I'm accessing unassigned variables?

Unperverted Vixen

Consider the simplified class at the end of this example. Right now, the compiler claims that in my call to B(), parameters b and c are accessing a variable that's unassigned.

If they're not assigned, then I would have bailed out in the d == null check.

Asides from defaulting b and c to null, is there any way to get it to realize that no, I'm not using unassigned variables?

Or alternately, am I the real , and it's possible for them to end up unassigned at the time of the call to B() that I'm missing?

namespace WantToGoToSpace
{
    class Test
    {
        public static string GetA(int key)
        {
            return null;
        }

        public static string B(string a, string b, string c, string d)
        {
            return null;
        }

        public static bool Busted(int key)
        {
            var a = GetA(key);
            string b;
            string c;
            string d = null;
            if (a != null)
            {
                b = GetA(key);
                if (b != null)
                {
                    c = GetA(key);
                    if (c != null)
                    {
                        d = GetA(key);
                    }
                }
            }
            if (d == null)
            {
                // Do some shared "we failed" logging, which is why I didn't just return immediately.
                return false;
            }

            if (d.Length > 0)
            {
                B(a, b, c, d);
            }

            return true;
        }
    }
}

Benjamin Hall

@unperverted-vixen could they be being garbage collected before then? I've had such weirdness before when I wasn't tracking the lifetime of variables and they were going out of scope somewhere before that. That's my only idea.

Unperverted Vixen

@benjamin-hall I’m getting compile failures, not runtime failures.

masonwheeler

@unperverted-vixen Eric Lippert had a couple very interesting articles on a related case.

ericlippert / Feb 21, 2011

Never say never, part one

This is part one of a two-part series. Part two is here. Can you find a lambda expression that can be implicitly converted to Func<T> for any possible T?

ericlippert / Feb 24, 2011

Never say never, part two

This is part two of a two-part series about determining whether the endpoint of a method is never reachable. Part one is here. A follow-up article is here. Whether we have a “never” ret…

They're not specifically about this, but they deal with the C# compiler's analysis to prove things about method calls. The relevant conclusion is that trying to give the compiler a way to prove things in the general case turns out to be equivalent to solving the Halting Problem, and so their decision was to use heuristics that might have false positives, which are annoying to the developer, but can't have false negatives, which could cause runtime issues and be annoying to the end-user.

What you're dealing with is one of those false positive cases. The compiler doesn't have a heuristic such that in one context you can define "if x then y" and then outside of that context you can say "if x" and it knows that y holds true.

Benjamin Hall

@unperverted-vixen Checking it in VS2017 I see the same behavior. I think what it's seeing is the code path a == null (so the first if fails to execute). Because of this, it's hitting the "unassigned variables" check before it hits the "well, d is null" check and bailing out. That is, since it can't assume that a isn't null and that d hasn't been adjusted elsewhere (maybe in another thread?) it bails on the first thing it sees.

One potential fix (WOMM): put the if (d.Length) branch into the nested ifs:

      if (a != null)
            {
                b = GetA(key);
                if (b != null)
                {
                    c = GetA(key);
                    if (c != null)
                    {
                        d = GetA(key);
                    }
                    if (d.Length > 0)
                    {
                        B(a, b, c, d);
                    }
                }
            }
        if (d == null)
                {
                    return false;
                }

This compiles fine.

dkf

@masonwheeler Basically, at the point where the semantic rule assigned-before (i.e., the safety guard for reading a variable) is evaluated, the hasn't been any flow-graph rewriting done, as that rewriting is a lot easier to do if you start with things that are known to be semantically correct. The standard way of describing the assigned-before rule for a converging flow graph is that a variable must be assigned-before on all paths through the graph without considering what the results of evaluating conditional expressions actually are, since that gets expensive (and is done later during the refinement/optimisation stages). It's relatively simple to convert the conceptual level for that into its implications for an if (loops are different).

The optimiser will be able to throw out the dummy initialisations of b and c once it has proved that the call of B must always follow the non-initialisation assignment to d, but they do need to be there until that reorganisation is done.

dkf

@benjamin-hall said in Why does the C# compiler think I'm accessing unassigned variables?:

This compiles fine.

But moves the “we failed” check to after the length check, and in this case will cause the code to generate a NullReferenceException. Ooops!

Benjamin Hall

@dkf said in Why does the C# compiler think I'm accessing unassigned variables?:

@benjamin-hall said in Why does the C# compiler think I'm accessing unassigned variables?:

This compiles fine.

But moves the “we failed” check to after the length check, and in this case will cause the code to generate a NullReferenceException. Ooops!

That's a micro-optimization. Really.

So rewrite that as

if (d != null && d.Length > 0) {
#stuff
}
else
{
return false;
}

and remove the other branch. That seems to meet all the criteria and show the basic point of having to have checked b and c before getting to the B(a, b, c, d) call.

Unperverted Vixen

@masonwheeler, @dkf Thanks for the links and background. Annoying, but understandable.

@Benjamin-Hall, @dkf Thanks for the thoughts on code fixes. The last version looks good to my eyes. I'll give it a whirl and make sure csc agrees. :)

ben_lubar

@masonwheeler I scrolled down on that text website and got an unskippable video ad. WHAT THE FUCK IS GOING ON THAT THOSE WORDS MAKE ANY SENSE?

masonwheeler

@ben_lubar Argh. I'm sorry that happened. Maybe use the contact info (I think there is some somewhere) to email him about bad advertisements on his site? :(

ben_lubar

@masonwheeler said in Why does the C# compiler think I'm accessing unassigned variables?:

@ben_lubar Argh. I'm sorry that happened. Maybe use the contact info (I think there is some somewhere) to email him about bad advertisements on his site? :(

I found a Chrome extension because that pushed me over the edge:

AutoMute

Automatically mutes each new tab the instant it is opened.

Now I'll never have to worry about it ever again!

Unperverted Vixen

@benjamin-hall said in Why does the C# compiler think I'm accessing unassigned variables?:

That seems to meet all the criteria and show the basic point of having to have checked b and c before getting to the B(a, b, c, d) call.

Alas, csc disagrees. :(

I ended up rewriting it a bit more - the four GetX methods (simplified version had one, but there's separate GetA/GetB/GetC/GetD methods) now have null checks built into them, instead of relying on the caller doing that check. Slightly more inefficient, but it's more reliable and readable code.