EnumerableExtensions

blakeyrat

        public static T FindFirstMinOrDefault<T, S>(this IEnumerable<T> enumerable, Func<T, S> selector) where S : IComparable<S>
        {
            if (enumerable == null)
            {
                throw new ArgumentNullException("Enumeration not instantiated");
            }

            if (selector == null)
            {
                throw new ArgumentNullException("Selector not specified");
            }

            T result;

            if (FindFirstNonNullOrDefault<T>(enumerable, out result))
            {
                S rank = selector(result);

                foreach (var item in enumerable)
                {
                    if (item != null)
                    {
                        S itemRank = selector(item);

                        if (rank != null && itemRank != null)
                        {
                            if (rank.CompareTo(itemRank) > 0)
                            {
                                result = item;
                                rank = itemRank;
                            }
                        }
                        else if (rank == null && itemRank != null)
                        {
                            result = item;
                            rank = itemRank;
                        }
                    }
                }
            }

            return result;
        }

Does this code do what it claims to do? (It looks as if it finds the max value, not the min. Look at the comparison operator used.)
How can I write this to be both more clear and more correct? Without screwing performance.

EDIT: FindFIrstNonNullOrDefault is:

        private static bool FindFirstNonNullOrDefault<T>(IEnumerable<T> enumerable, out T result)
        {
            foreach (T item in enumerable)
            {
                if (item != null)
                {
                    result = item;
                    return true;
                }
            }

            result = default(T);
            return false;
        }

Maciejasjmj

@blakeyrat said:

(It looks as if it finds the max value

No, you're just doing the comparison backwards - changing the item if the old rank is bigger than the new itemRank. I'd rename the variables to something more descriptive.

As for the rest - it seems fine, you can maybe conflate the innermost ifs into something like:

if (itemRank != null && (rank == null || rank.CompareTo(itemRank) > 0))
{
     result = item;
     rank = itemRank;
}

but it's up to you whether that makes the readability better or worse.

Also, that if (FindFirstNonNullOrDefault<T>(enumerable, out result)) check might be skippable, but I couldn't get it to work on my test cases, so maybe you're better off this way.

Magus

@blakeyrat said:

```
private static bool FindFirstNonNullOrDefault<T>(IEnumerable<T> enumerable, out T result)
{
foreach (T item in enumerable)
{
if (item != null)
{
result = item;
return true;
}
}
        result = default(T);
        return false;
    }
:wtf:?

Neither of these yield return, and the second randomly decides to return two things? No one would ever use it expecting the out param, because you can normally chain these things. It's acting like TryCast, not OrDefault!

All of this code is horrific.

Bort

Go to https://dotnetfiddle.net/ and paste this in:

Play with it and figure out if it's what you want

I couldn't save it because i don't want to have to sign up

using System;
using System.Collections.Generic;
using System.Linq;

public class Program
{
	public static void Main()
	{
		IEnumerable<int> values = new int []{1, 4, 1, 2, 6, 1, 4, 3, 6, 8, 1, 4, 2, 5};
		IEnumerable<int> indicies = Enumerable.Range(0, values.Count());
		IEnumerable<Pair> pairs = values.Zip(indicies, (v, i) => new Pair { Index = i, Value = v });
		Console.WriteLine(pairs.FindFirstMinOrDefault(x => x.Value).Index);
	}
}

public class Pair
{
	public int Index {get; set;}
	public int Value {get; set;}
}

public static class Extensions
{
	private static bool FindFirstNonNullOrDefault<T>(IEnumerable<T> enumerable, out T result)
	{
		foreach (T item in enumerable)
		{
			if (item != null)
			{
				result = item;
				return true;
			}
		}

		result = default (T);
		return false;
	}

	public static T FindFirstMinOrDefault<T, S>(this IEnumerable<T> enumerable, Func<T, S> selector)where S : IComparable<S>
	{
		if (enumerable == null)
		{
			throw new ArgumentNullException("Enumeration not instantiated");
		}

		if (selector == null)
		{
			throw new ArgumentNullException("Selector not specified");
		}

		T result;
		if (FindFirstNonNullOrDefault<T>(enumerable, out result))
		{
			S rank = selector(result);
			foreach (var item in enumerable)
			{
				if (item != null)
				{
					S itemRank = selector(item);
					if (rank != null && itemRank != null)
					{
						if (rank.CompareTo(itemRank) > 0)
						{
							result = item;
							rank = itemRank;
						}
					}
					else if (rank == null && itemRank != null)
					{
						result = item;
						rank = itemRank;
					}
				}
			}
		}

		return result;
	}
}

blakeyrat

@Magus said:

Neither of these yield return,

The first one can't, since it explicitly returns the lowest item (or default(T)).

The second one, it also explicitly returns only one value so yield is useless there, too.

But... good observation?

blakeyrat

@Maciejasjmj said:

No, you're just doing the comparison backwards - changing the item if the old rank is bigger than the new itemRank. I'd rename the variables to something more descriptive.

Thanks, that's what was throwing me. I guess neither way is "more correct" but generally you compare the new item to the old minimum, not the other way around.

@Maciejasjmj said:

Also, that if (FindFirstNonNullOrDefault<T>(enumerable, out result)) check might be skippable, but I couldn't get it to work on my test cases, so maybe you're better off this way.

I think that is there to handle the case where every single item in the enum is null, but the foreach also checks that so... I'm also not sure why it's necessary.

Might be there just to avoid assigning to result "too early" (i.e. before there's actually a result) by someone who hates short-circuiting.

blakeyrat

@Bort said:

Go to https://dotnetfiddle.net/ and paste this in:

Thanks, if I dig into this I'll make use of that.

The reason I'm asking is our codebase has a bunch of places where we pick the max value by doing a .Sort().First(); and my boss put out a "standing order" (so to speak) to replace those with these enum extensions he wrote. Because of efficiency.

Bort

@blakeyrat said:

my boss put out a "standing order" (so to speak) to replace those with these enum extensions he wrote. Because of efficiency.

I guess they should be more efficient as they only do a single pass.

By the way, morelinq has MaxBy/MinBy methods. If you're down with Nuget, it would make much more sense to just use that.

MoreLINQ/MoreLinq/MaxBy.cs at master · morelinq/MoreLINQ

Extensions to LINQ to Objects. Contribute to morelinq/MoreLINQ development by creating an account on GitHub.

Magus

Right, good point. Still absolutely atrocious, especially with the TryX pattern, which is almost never the right choice.

@Bort said:

I guess they should be more efficient as they only do a single pass.

More efficient than "Go through and sort all the things exactly once" dot "Get me thing[0]"? No, blakey's boss is stupid.

accalia

@Magus said:

More efficient than "Go through and sort all the things exactly once" dot "Get me thing[0]"?

which is Order O(n log(n)) at the best, as opposed to "walk through teh list and give me the smallest thing you find" which is O(n)

... I smell a

blakeyrat

We need MaxOrDefault, does it have those?

blakeyrat

@Magus said:

Right, good point. Still absolutely atrocious, especially with the TryX pattern, which is almost never the right choice.

What do you mean by "TryX pattern"? There's no exception handling in this code...
What would be a better choice? I explicitly asked that in the OP, you might recall.

Or are you posting just to show-off you're smarter than me?

@Magus said:

More efficient than "Go through and sort all the things exactly once" dot "Get me thing[0]"? No, blakey's boss is stupid.

How is he stupid? How could iterating an enum once possibly be slower than sorting the enum?

I 100% get that there is probably a better/less confusing way to implement this (thus my asking for it in the OP!), but the concept seems perfectly sound.

Magus

@blakeyrat said:

1) What do you mean by "TryX pattern"? There's no exception handling in this code...

Idiocy. You have programmed in C# before, right? Have you ever seen a method called TryParse?

Point is, it's almost always the wrong choice, and whoever wrote it should be ashamed.

@blakeyrat said:

How is he stupid? How could iterating an enum once possibly be slower than sorting the enum?

Because I misread it and was wrong. I'm interested to know how the benchmarks that sparked this change turned out, though.

blakeyrat

@Magus said:

Idiocy. You have programmed in C# before, right? Have you ever seen a method called TryParse?

Yes.

What does that have to do with the code we're talking about here?

The TryParse pattern is to attempt to do an operation, then swallowing the exception and returning a bool to indicate success instead. This code doesn't do that. At all.

@Magus said:

Point is, it's almost always the wrong choice, and whoever wrote it should be ashamed.

Well, ok, but what does that have to do with the code we are discussing!

@Magus said:

ecause I misread it and was wrong. I'm interested to know how the benchmarks that sparked this change turned out, though.

It's not about benchmarks (this function's actually called from only like 3-4 places in the code.) It's about "shit missing from LINQ but we use it all the time so we should write it".

Magus

@blakeyrat said:

The TryParse pattern is to attempt to do an operation, then swallowing the exception and returning a bool to indicate success instead. This code doesn't do that. At all.

It's equivalent, in that it's called the same way. I'm comparing the two, and the way you use them is not noticeably different. out is awful, and anything designed this way is unchainable. It can only be used in a conditional, and is nothing like the rest of linq.

@blakeyrat said:

It's not about benchmarks

It is, if you're doing it for efficiency's sake.

blakeyrat

@Magus said:

It's equivalent, in that it's called the same way.

The fuck?

@Magus said:

I'm comparing the two, and the way you use them is not noticeably different.

... right. calling TryParse and FindFirstMinOrDefault are not noticeably different. At all. Truly you are a genius.

@Magus said:

out is awful, and anything designed this way is unchainable.

Well, you're welcome to your opinion. Oh, you're talking about the little helper function? That returns false if every element in the enum is null?

I... guess that's vaguely like the TryX pattern, except (you know) not having the try keyword anywhere remotely close to it.

@Magus said:

It is, if you're doing it for efficiency's sake.

I would argue pulling the min value from an enum is a completely different operation than sorting an enum, and thus this code (or something like it) is more technically correct. Using Sort() when we don't want sorted results is dumb.

I'm filing you into the "this person has no idea what they are talking about" bucket I think. Between this, your complete inability to communicate what the hell you meant by the TryX pattern, and your "misreading" of the function so you could conveniently call me wrong and an idiot even though I was not-- three strikes, you're out.

Maciejasjmj

@blakeyrat said:

I think that is there to handle the case where every single item in the enum is null, but the foreach also checks that so... I'm also not sure why it's necessary.

It gets kind of wonky when assigning rank - selector(result) might throw if result is null, and rank is not assignable from null, so you'd need something like:

T result = default(T); //should be null except for pathological cases where you compare value types by property
S currentRank = (result == null ? default(S) : selector(result));

except that doesn't really work since default(S) might be, say, 0, and a list full of negative ranks will not return a value. So yeah, there should be a sensible solution that can skip the check, but I can't find a good one.

@blakeyrat said:

Might be there just to avoid assigning to result "too early" (i.e. before there's actually a result) by someone who hates short-circuiting.

I don't think you can really do that with Min(), you need to evaluate everything anyway.

@Bort said:

MaxBy/MinBy

Worth investigating, but I don't think it handles nulls in the list well.

@Magus said:

out is awful, and anything designed this way is unchainable

It's unchainable anyway, since it doesn't return an enumerable!

@Magus said:

It can only be used in a conditional, and is nothing like the rest of linq.

Unlike, say, Any(), or FirstOrDefault()... oh wait.

@blakeyrat said:

complete inability to communicate what the hell you meant by the TryX pattern

I think it's related to returning a bool for success/failure (or in this case, found/not found), and assigning the actual value returned to an out parameter, like the TryParse family.

This whole helper function looks like it boils down to someEnumerable.FirstOrDefault(x => x != null);, but there's an edge case for an empty list of a non-nullable type - FirstOrDefault() will happily return default(T) with no indication of whether said default(T) happened because nothing was found, or because that's the value found.

Magus

@Maciejasjmj said:

It's unchainable anyway, since it doesn't return an enumerable!

I chain things off First all the time. Because it returns a useful object.

Any doesn't, but that isn't strange, because it isn't trying to get anything other than a bool.

This thing is made to get an element, except only through the out parameter. Do you really not get my point?

This stuff is a mess to use, unless you're absolutely certain you'll only ever want to call it in a conditional, and that level of certainty is rare and usually misplaced.

fatbull

Disclaimer: I rarely use C#. Use with caution.

    public static T FindFirstMinOrDefault<T, S>(this IEnumerable<T> enumerable, Func<T, S> selector) where S : IComparable<S>
    {
        return enumerable.Where(x => x != null && selector(x) != null).DefaultIfEmpty()
            .Aggregate((a, b) => selector(a).CompareTo(selector(b)) <= 0 ? a : b);
    }

(For FindFirstMaxOrDefault use >= instead of <=.)

Maciejasjmj

@Magus said:

I chain things off First all the time. Because it returns a useful object.

Or throws a not-so-useful exception. And FirstOrDefault() makes you eat a null.

I assume edge cases of nulls in the list and empty lists are an important issue here, and that there's a good reason for why there isn't a where T : class on all those methods. So again - how, for a non-nullable type (say, int) do you distinguish an empty list from a list whose first value is zero, having only an int return value?

Magus

@Maciejasjmj said:

And FirstOrDefault() makes you eat a null.

Elvis that!

@Maciejasjmj said:

I assume edge cases of nulls in the list and empty lists are an important issue here, and that there's a good reason for why there isn't a where T : class on all those methods. So again - how, for a non-nullable type (say, int) do you distinguish an empty list from a list whose first value is zero, having only an int return value?

This just seems like a case of not using the right types.

Maciejasjmj

@Magus said:

This just seems like a case of not using the right types.

Those are generic methods. Granted, the "not null" part doesn't make much sense with value types, but "first or default" does.

Magus

@Maciejasjmj said:

but "first or default" does

Unless you need to distinguish between something being there and being zero. But since you can specify a default behavior to be int.MaxValue (afaik) and not use OrDefault, it isn't so bad.

blakeyrat

Just cast to int?

Jaloopa

No, you need to TryParse

Jaime

@Magus said:

Idiocy. You have programmed in C# before, right? Have you ever seen a method called TryParse?

Point is, it's almost always the wrong choice, and whoever wrote it should be ashamed.

TryParse doesn't catch the exception and convert it to a bool. TryParse calls the same NumberToInt32 that Parse calls, but instead of throwing an exception, it returns the bool returned from NumberToInt32 directly. If you simply ignore the out param, it's a better pattern than catching the exception from Parse. If you use the out param in a "while if there, I might as well grab this" way, it adds a bit of value.

Sometimes your code is a little clearer if you parse twice. Example:

var x = Int32.TryParse(y, out z) ? Int32.Parse(y) : 0;

I don't like the extra variable in the call to TryParse, so I would turn this into an library method like this:

public static int? ParseToNullable(string s)
{
    int i;
    if(Int32.TryParse(s, out i))
    {
        return i;
    }
    else
    {
        return null;
    }
}

Then you could simplify the previous example to this:

var x = Lib.ParseToNullable(y).GetValueOrDefault(0);

Maciejasjmj

@Jaime said:

TryParse doesn't catch the exception and convert it to a bool. TryParse calls the same NumberToInt32 that Parse calls, but instead of throwing an exception, it returns the bool returned from NumberToInt32 directly.

And catching exceptions is rather costly, not to mention pointless in this case. The only reasonable thing to do with a Parse exception is to wrap it and pass it upwards as a general failure of the process.

If the data being wrong can be dealt with at the level you're parsing it, then just check if it's valid.

Ragnax

@fatbull said:

Disclaimer: I rarely use C#. Use with caution.

Close, but not the same performance characteristics: consider selector being a rather expensive operation.

I think if ou want to build from existing LINQ operations, the following gets closest:

public static TResult FindFirstMinOrDefault<TSource, TResult>(
  this IEnumerable<TSource> source,
  Func<TSource, TResult> selector
)
  where
    TSource : class,
    TResult : class,
    TResult : IComparable<TResult>, 
{
  return source
    .Select(x => new { Item = x, Key = x == null ? null: selector(x) })
    .Where(x => x.Key != null)
    .Aggregate((a, b) => a.Key.CompareTo(b.Key) <= 0 ? a.Item : b.Item);
}

with additional overloads needed for the various source/struct combination of TSource and TResult.

Jaime

@Maciejasjmj said:

And catching exceptions is rather costly, not to mention pointless in this case. The only reasonable thing to do with a Parse exception is to wrap it and pass it upwards as a general failure of the process.

If the data being wrong can be dealt with at the level you're parsing it, then just check if it's valid.

Yes. My point was that's exactly what TryParse does. I said this in response to @Magus saying the person who wrote TryParse should be ashamed, yet TryParse does exactly the right thing.

Magus

I said the person who wrote something similar should be ashamed, and my problem the whole time has been with the out param.

Jaime

TryParse returns two pieces of data. Since C# doesn't have multiple returns and returning a struct ruins the simplicity of the if condition, there aren't many alternatives.

Magus

What's your point exactly? TryParse is the only case where this makes any sense, and I already said that. Let me guess, you're going to explain why tryparse isn't wrong again in response to me not saying it is?

Jaime

Are you suggesting that I (or anyone else) should have known that the exception you were referring to when you said "almost never the right choice" was TryParse? It certainly wasn't clear from what you wrote.

blakeyrat

Dude. There's been enough posts here to adequately demonstrate that Magus has no fucking clue what he's talking about.

Give up.

Bort

@Jaime said:

Since C# doesn't have multiple returns

Does any language? Do you mean returning a tuple and assigning each value to a different variable using tuple destructuring syntax?

C# doesn't have tuple destructuring syntax...

Choonster

@Bort said:

@Jaime said:
Since C# doesn't have multiple returns

Does any language? Do you mean returning a tuple and assigning each value to a different variable using tuple destructuring syntax?

C# doesn't have tuple destructuring syntax...

Lua does actual multiple returns, no tuples:

Lua 5.3 Reference Manual

I'm not sure if there are any statically-typed languages that do this.

Bort

@Choonster said:

Lua does actual multiple returns, no tuples

I think that statement is self-contradictory.

The syntax shown in the reference manual looks like tuple-destructuring-assignments (what's the proper term?) to me. I guess Lua does something different with the evaluation order.

Lua:

name, date = getBirthday()

Haskell:

(name, date) <- getBirthday

PleegWat

nope, it's proper multiple return. name = getBirthday() is equivalent to name, _ = getBirthday().

Bort

@PleegWat said:

nope, it's proper multiple return. name = getBirthday() is equivalent to name, _ = getBirthday().

That still looks like destructuring assignment syntax to me.

I guess I draw a distinct between what a function returns and what you can do with it in the calling code. Can you do x = y, x? Does the expression y, x mutliple-return in that case?

"Multiple Returns" sounds like it applies to something like generator co-routines - they actually return multiple times.

Captain

Use your debugger and figure it out.

Choonster

@Bort said:

@PleegWat said:
nope, it's proper multiple return. name = getBirthday() is equivalent to name, _ = getBirthday().

That still looks like destructuring assignment syntax to me.

I guess I draw a distinct between what a function returns and what you can do with it in the calling code. Can you do x = y, x? Does the expression y, x mutliple-return in that case?

"Multiple Returns" sounds like it applies to something like generator co-routines - they actually return multiple times.

A function can return any number of values. The right side of an assignment statement is a list of expressions whose values should be assigned to the variables on the left side of the statement.

If a function call is the final expression in an expression list (i.e. an assignment or function call), all of its return values are appended to the value list; otherwise only the first value is appended and the rest are discarded.

In an assignment statement, each variable is assigned one of the values from the value list. If there are fewer values than variables, the value list is padded with nils. If there are more values than variables, the excess values are discarded. The arguments of a function call are treated in the same way.

x = y, x would evaluate the expressions y and x, assign the value of y to the variable x and discard the value of x.

y, x isn't an expression itself, just a list of expressions.

The manual explains this much better than I can.

Zecc

@Bort said:

@Choonster said:
Lua does actual multiple returns, no tuples:

I think that statement is self-contradictory.

I don't see why you think it's self-contradictory. Contradictory would be returning both multiple values and a tuple. (unless the tuple is one of the multiple returned values

)

I don't think Lua even has tuples out of the box. It doesn't have arrays. It's all just associative tables.

And no destructuring assignment either.

Bort

@Zecc said:

Contradictory would be returning both multiple values and a tuple.

A tuple is a value that consists of multiple values. So when you return a tuple, you are returning multiple values. And when you say a function is returning multiple values, it's really returning a tuple (even in the form of a one-row Lua table) and destructuring it as a matter of course.

It's a fine, blurry line in the terminology. It's like saying an if statement can have multiple statements in its body. It can't. It can have exactly one statement. You can group multiple statements into a single statement using a pair of { } (in C-family languages, anyway).

YOU'RE FUCKING WRONG ZECC! NOW GO CURL UP IN A BALL AND CRY!

@Choonster said:

{more Lua nonsense}

It sounds like Lua put restrictions on features other languages already had (destructuring) and called it a new "feature" (multiple returns).

Can you do this in Lua?

(x, y), z = f(), g()

Zecc

@Bort said:

And when you say a function is returning multiple values, it's really returning a tuple (even in the form of a one-row Lua table) and destructuring it as a matter of course.

But Lua does not return a table, is what I was saying:

[quote=http://www.lua.org/manual/5.1/manual.html#lua_call]

void lua_call (lua_State *L, int nargs, int nresults);
[...]
The function results are pushed onto the stack when the function returns. The number of results is adjusted to nresults, unless nresults is LUA_MULTRET. In this case, all results from the function are pushed. Lua takes care that the returned values fit into the stack space.
[/quote]
Emphasis to show you there is a palpable difference between returning a bunch of values and returning a single pointer to a tuple on the stack mine.

@Bort said:

Can you do this in Lua?

(x, y), z = f(), g()

No, not even if you return a 2-tuple from f(). Which goes to prove your point that «It sounds like Lua put restrictions on features other languages already had (destructuring) and called it a new "feature" (multiple returns).», but also proves my point that they are different things.

Bort

@Zecc said:

void lua_call (lua_State *L, int nargs, int nresults);

there is a palpable difference between returning a bunch of values and returning a single pointer to a tuple on the stack

No, not even if you return a 2-tuple from f().

This conversation has gone farther down the pedantry rabbit whole than I want to go. There's definitely a meaningful difference in usage.

Zecc

@Bort said:

This conversation has gone farther down the pedantry rabbit whole than I want to go.

You are not alone.

Jaloopa

@Bort said:

"Multiple Returns" sounds like it applies to something like generator co-routines - they actually return multiple times.

You're thinking of stuff like yield return in C#? That's not what "Multiple return" is generally accepted to mean

Onyx

@Bort said:

This conversation has gone farther down the pedantry rabbit whole than I want to go.

FTFY

Bort

@Jaloopa said:

You're thinking of stuff like yield return in C#? That's not what "Multiple return" is generally accepted to mean

Right, I don't want to warp the terminology. I was just thinking that multiple return is not a useful concept. The only example shown (the Lua one) is an artificial restriction that only looks like a feature. An artificial restriction that I find confusing. Unless those multiple returns have some other semantics that a casual skim of the manual did not reveal.

PleegWat

@Jaloopa said:

@Bort said:
"Multiple Returns" sounds like it applies to something like generator co-routines - they actually return multiple times.

You're thinking of stuff like yield return in C#? That's not what "Multiple return" is generally accepted to mean

To be honest, 'multiple return' initially makes me think of functions like setjmp() in C - where a single call can return multiple times in the same process.