Question on Source Control and Alternatives-Testing Code

Dreikin

For my refactoring of a project of mine in relation to @blakeyrat's feedback here, I made and executed some code in LINQPad to see which way was most performant for doing something, out of the ones I'd come up with. The two pieces of code are along these lines:

void Main()
{
	var acc = new { tw = new List<long>(), ss = new List<long>() };
	for (int i = 0; i < 10000; i++)
	{
		var watch = new Stopwatch();
		var strings = RandomStrings(100);

		watch.Start();
		foreach (var s in strings)
		{
			var c = TW(s);
		}
		watch.Stop();
		acc.tw.Add(watch.ElapsedTicks);

		watch.Restart();

		foreach (var s in strings)
		{
			var c = SS(s);
		}
		watch.Stop();
		acc.ss.Add(watch.ElapsedTicks);
	}

	Console.WriteLine("TW:");
	Console.WriteLine("=====");
	Console.WriteLine($"Avg: {acc.tw.Average()}");
	Console.WriteLine($"Max: {acc.tw.Max()}");
	Console.WriteLine($"Min: {acc.tw.Min()}");
	Console.WriteLine();
	Console.WriteLine("SS:");
	Console.WriteLine("=====");
	Console.WriteLine($"Avg: {acc.ss.Average()}");
	Console.WriteLine($"Max: {acc.ss.Max()}");
	Console.WriteLine($"Min: {acc.ss.Min()}");
}

// Define other methods and classes here
List<string> RandomStrings(int count)
{
	var random = new Random();
	var strings = new List<string>();
	for (int i = 0; i < count; i++)
	{
		var sb = new StringBuilder();
		sb.Append(random.Next(5000).ToString());
		sb.Append(".00");
		sb.Append(random.Next(2) == 1 ? " " : "");
		sb.Append("asdf");
		strings.Add(sb.ToString());
	}
	return strings;
}

string TW(string s)
{
	return new string(s.TakeWhile(c => !Char.IsLetter(c)).ToArray());
}

string SS(string s)
{
	var index = s.TakeWhile(c => !Char.IsLetter(c)).Count();
	return s.Substring(0, index);
}

My question is: should I put them in source control somewhere (cleaned up, of course)? Or is just writing a comment that I did it (in order to justify the possibly odd-looking final choice) enough? Or something else?

Zecc

Create a gist — which is a form of source control I guess — and put a link to it in a comment in your code.

Or put a link to this thread.

Maciejasjmj

@dreikin I'd say you definitely can leave the timing harness out. As for the tested code itself, it probably belongs in the documentation of whatever function it's related to - if you're using XMLDocs, I'd put it together with an explanation of what it does and the conclusions you drew somewhere in the remarks section.

Although if you just put a //this is faster than <other code> close to the implementation I'd be fine with it too. I don't think I've ever had anyone tell me "well, prove it!" after I explained the reasoning behind my code.

Maciejasjmj

@zecc said in Question on Source Control and Alternatives-Testing Code:

Create a gist — which is a form of source control I guess — and put a link to it in a comment in your code.

Or that, although I'm personally wary of putting links to random places in my code due to link rot.

dkf

@maciejasjmj said in Question on Source Control and Alternatives-Testing Code:

Or that, although I'm personally wary of putting links to random places in my code due to link rot.

If the link is to a proper paper discussing it (or other equivalent), they can use a DOI to refer to it from elsewhere. The main advantage of those is exactly that they're things that have long-term organisational commitments to support; unlike general URLs, DOIs are considered to be academic-quality references, at least in terms of longevity and fixed-ness of what they're talking about.

Failing that, linking to an issue report isn't too terrible, assuming you're able to use some sort of ID in the link that will be maintainable if/when you move the issue to a new system (even if just by mentioning it in a comment field; some sort of search ought to still find it).

blakeyrat

@dreikin You should use the easiest-to-understand function ("TW" in this case, but for God's sake give it a name) and completely ignore your benchmark until it measurably impacts the performance of the application.

Dreikin

@blakeyrat said in Question on Source Control and Alternatives-Testing Code:

@dreikin You should use the easiest-to-understand function ("TW" in this case, but for God's sake give it a name) and completely ignore your benchmark until it measurably impacts the performance of the application.

In that vein, I decided to do what I should have done in the first place:

public static int IndexOf(this string s, Predicate<char> predicate)
{
    for (int i = 0; i < s.Length; i++)
    {
        if (predicate(s[i])) { return i; }
    }
    return -1;
}

Easier to read/understand and faster.

It's kind of annoying that I keep finding little holes like this in the standard library. The default Distinct can't take a lamdba, nor can IndexOf.

Maciejasjmj

@dreikin said in Question on Source Control and Alternatives-Testing Code:

In that vein, I decided to do what I should have done in the first place:

It looks like a good place to use a nullable type, if we're going with code reviews. Make sure the performance doesn't suffer due to boxing, but it shouldn't.

As for Distinct, I usually end up either writing my own DistinctBy method or pulling in MoreLinq. I can understand why LINQ folks didn't want to include that, as depending on the collection it's not defined which object you'll get if a duplicate shows up, and in case of generated enumerables it might not even be the same object twice.

Also, LINQ deprecates Predicate

Do use the new LINQ types “Func<>” and “Expression<>” instead of custom delegates and predicates, when defining new APIs.

masonwheeler

@maciejasjmj said in Question on Source Control and Alternatives-Testing Code:

Also, LINQ deprecates Predicate

Do use the new LINQ types “Func<>” and “Expression<>” instead of custom delegates and predicates, when defining new APIs.

Yeah, C# is TRWTF there. There's no good reason why it can't treat two delegate types of identical signatures as exactly equivalent and interchangeable. There's a bit of technical jiggery at the IL level that will get in the way, but it can be trivially worked around in the compiler. (I know this because the Boo compiler had a solution for this problem, which I ended up improving on as part of implementing async/await.)

Maciejasjmj

@masonwheeler said in Question on Source Control and Alternatives-Testing Code:

There's no good reason why it can't treat two delegate types of identical signatures as exactly equivalent and interchangeable.

I always figured they wanted to do something similar to typedefs, only nobody asked for it and eventually they just gave up and decided to use Func everywhere. Kind of like C# 7 is now trying to deprecate structs in favor of tuples.

masonwheeler

@maciejasjmj said in Question on Source Control and Alternatives-Testing Code:

I always figured they wanted to do something similar to typedefs, only nobody asked for it and eventually they just gave up and decided to use Func everywhere.

Problem is, there are places where Func doesn't work. Try making a Func type that will accept the int.TryParse method, for example...

Magus

@maciejasjmj said in Question on Source Control and Alternatives-Testing Code:

C# 7 is now trying to deprecate structs in favor of tuples which are a type of struct.

In general, tuples are useful for smudging things a bit when dealing with a complex mass of data for a short period of time, but they don't really replace a well designed class...

As for this topic, why would you use Linqpad anymore? C# Interactive is great, and it's just... there already!

Dreikin

@magus said in Question on Source Control and Alternatives-Testing Code:

As for this topic, why would you use Linqpad anymore? C# Interactive is great, and it's just... there already!

So is Linqpad on my machine. And I can write the whole thing out without having to go through the rigamarole of creating a file to send to interactive, not try to get it right as I'm creating it on the CLI.

hungrier

Speaking of things like DotNetFiddle, is there an in-browser one that can auto-fill usings?

Magus

@dreikin said in Question on Source Control and Alternatives-Testing Code:

go through the rigamarole of creating a file to send to interactive

Why would anyone do that?

@dreikin said in Question on Source Control and Alternatives-Testing Code:

not try to get it right as I'm creating it on the CLI.

You just type. It has VS's full completion!

Dreikin

@magus said in Question on Source Control and Alternatives-Testing Code:

@dreikin said in Question on Source Control and Alternatives-Testing Code:

go through the rigamarole of creating a file to send to interactive

Why would anyone do that?

@dreikin said in Question on Source Control and Alternatives-Testing Code:

not try to get it right as I'm creating it on the CLI.

You just type. It has VS's full completion!

Yes, you just type... In order. I tend to be a bit more back and forth than works well for that. Also, linqpad has the same autocompletion, including snippets and such.

Dreikin

@maciejasjmj said in Question on Source Control and Alternatives-Testing Code:

It looks like a good place to use a nullable type, if we're going with code reviews.

Where would that come into play? My best guess is with respect to the return type, where I chose -1 for the "not present" sigil value to be consistent with the standard IndexOf methods.

Maciejasjmj

@dreikin said in Question on Source Control and Alternatives-Testing Code:

@maciejasjmj said in Question on Source Control and Alternatives-Testing Code:

It looks like a good place to use a nullable type, if we're going with code reviews.

Where would that come into play? My best guess is with respect to the return type, where I chose -1 for the "not present" sigil value to be consistent with the standard IndexOf methods.

Hm, forgot about those. Though they predate nullables, and to me personally methods like these are a poster child for them - returning -1 as a "not found" value is just asking for an IndexOutOfRangeException somewhere along the way...

Consistency vs. improvement, I guess.

Dreikin

@maciejasjmj said in Question on Source Control and Alternatives-Testing Code:

@dreikin said in Question on Source Control and Alternatives-Testing Code:

@maciejasjmj said in Question on Source Control and Alternatives-Testing Code:

It looks like a good place to use a nullable type, if we're going with code reviews.

Where would that come into play? My best guess is with respect to the return type, where I chose -1 for the "not present" sigil value to be consistent with the standard IndexOf methods.

Hm, forgot about those. Though they predate nullables, and to me personally methods like these are a poster child for them - returning -1 as a "not found" value is just asking for an IndexOutOfRangeException somewhere along the way...

Consistency vs. improvement, I guess.

Yeah, I imagine the language and library would look a bit different if they could reset like Python 2->3. (Although PHP's model for language improvement might be better, since they don't seem to be having the same problem Python has with breaking changes.)

dkf

@magus said in Question on Source Control and Alternatives-Testing Code:

In general, tuples are useful for smudging things a bit when dealing with a complex mass of data for a short period of time, but they don't really replace a well designed class...

There are a number of developers about who really don't get this fundamental fact, and instead create horrors based on sending tuples all over the place under the misguided impression that this is faster. It's virtually never the case; the advantage of tuples is for when you need a class whose scope is very local and where you don't need any methods.

We've got some of this sort of code at work; the devs in question were former students of ours. They were convinced that they could do things faster by doing tuples all over the place. We've been converting their algorithms to use our production library — with classes! the horror! — and are getting two to three orders of magnitude of speedup, despite the whole thing still being in Python (and therefore disgustingly slow).