I think I've just found a case when a premature optimisation isn't necessarily evil
-
I'm now chest-deep in a sea of shit (Javascript, Selenium, Jasmine, Protractor, testing an Angular app, how shittier can it get?), refactoring and optimizing it.
The problem is that, the way it's currently written, each of the 50 tests takes 2 minutes on average. And while I fix them, I need to re-run them every so often. Being denied the simple luxury of hitting "Run" and seeing a result within seconds, what with the subtle intricacies, the race conditions, and the occasional outdated reference value (the easiest kind to fix), can seriously drain the productivity joy out of a soul.
I expect those paradigm zealots who believe in "get it to run", "get it to run right", "get it to run fast" may start throwing stones at me. Sweet zombie Jesus, it's a test suite. It shouldn't cost as much to maintain (or even more, in some parts) as the application it actually checks, dammit!
-
@wft Yeah, that sucks. You might as well be using a compiled language, and get type safety checks "for free".
-
If only there were some way to cache unchanged code so that it doesn't always have to be reprocessed...
Wait, isn't that what makefiles did decades ago?
-
How can a test take two minutes? What the hell is it doing? It's one thing to not optimize prematurely, it's another to pessimize at every step.
-
@wft said in I think I've just found a case when a premature optimisation isn't necessarily evil:
each of the 50 tests takes 2 minutes on average
That sounds really annoying. There's a case to be made for setting up the test fixture once and then running the individual tests within it; while there's an issue with possible inter-test interactions, if it speeds things up it can be worthwhile.
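In Jasmine terms that's the difference between beforeEach and beforeAll. A minimal plain-JS sketch of the trade-off (the fixture and the tests here are invented; a counter stands in for the slow setup work):

```javascript
// A counter stands in for slow setup work (booting a browser, seeding a DB...).
let setupCount = 0;

function expensiveFixture() {
  setupCount += 1;
  return { users: ['alice', 'bob'] };
}

// Per-test setup: the fixture is rebuilt for every test (Jasmine's beforeEach).
function runIsolated(tests) {
  return tests.map(test => test(expensiveFixture()));
}

// Shared setup: built once and reused (Jasmine's beforeAll) - faster, but a
// test that mutates the fixture can now affect the tests that run after it.
function runShared(tests) {
  const fixture = expensiveFixture();
  return tests.map(test => test(fixture));
}

const tests = [
  f => f.users.length === 2,
  f => f.users.includes('alice'),
];

runIsolated(tests); // setup ran twice
runShared(tests);   // setup ran once
```

Whether the speedup is worth the coupling depends on how badly the tests mutate the world they run in.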
-
@wft said in I think I've just found a case when a premature optimisation isn't necessarily evil:
how shittier can it get?
Terry Gilliam's Movie Brazil - Plumbers' Nasty End – 02:59
— girlsngear
-
@Kian said in I think I've just found a case when a premature optimisation isn't necessarily evil:
How can a test take two minutes? What the hell is it doing? It's one thing to not optimize prematurely, it's another to pessimize at every step.
Err, I've written one that takes two minutes...
Granted it's downloading a several-meg zip file via screen scraping.... And I only have one that takes that long..... But still.
-
@wft said in I think I've just found a case when a premature optimisation isn't necessarily evil:
each of the 50 tests takes 2 minutes on average
I hope they're not supposed to be unit tests, then.
-
I don't know if this applies to you, but when I do Node.js work I often run my working copy from a ramdisk. My unit tests (very I/O intensive because they create a fresh default database for each test run) execute nearly 100x faster there than on my HDD.
-
@sloosecannon said in I think I've just found a case when a premature optimisation isn't necessarily evil:
downloading a several-meg zip file via screen scraping
-
@wft This optimization isn't premature, though. The end user for the tests (you) is experiencing unacceptable performance that's inhibiting your effort to get work done. If this is premature, there's never a time when optimization would be mature.
-
@anotherusername Testing a data-import operation on a site that doesn't provide any form of API......
-
@sloosecannon the first problem is the fact that it's even producing a several-megabyte zip file in any sort of format that you can screen-scrape. What is it, base64 encoded? Why? And why does it take so long to download and base64-decode several megabytes?
Or is it more of "scrape a bunch of stuff out of the page and stuff it all in a local zip file"? That, I might actually be able to believe.
-
@anotherusername well, the screen scraping is more of "click on a bunch of web forms to make the server generate a zip to download"
But yeah, my code is a huge WTF to work around the even bigger WTF of having no API whatsoever....
-
@sloosecannon ah, yeah that would be a pain.
-
Bounds on the amount of time it takes to test your software should be part of the requirements. In that light, anything you do to optimize the tests and bring them in line with those requirements is just normal software work. Assuming the unit tests close the feedback loop in software development, high latency will result in instability.
</rant>
Why is my
<pre></rant></pre>
not displayed?
-
@anotherusername yeah.
The code involves loading the login page, getting a csrf token, submitting a POST, verifying the login was successful, loading another page to get another token, submitting a form on that page, triggering the server to generate a zip, then scraping the link from the response, then downloading it.
It's very prone to break and so I test the hell out of it.........
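That flow can be sketched with the HTTP client injected, so the happy path can be exercised against a stub instead of the real site. Everything here (URLs, field names, the regexes) is invented for illustration; the real markup obviously differs:

```javascript
// Hypothetical sketch of the scrape-and-download flow described above.
// The http argument supplies get(url) and post(url, body), both async.
async function downloadExport(http) {
  // 1. Load the login page and scrape the CSRF token out of the markup.
  const loginPage = await http.get('/login');
  const csrf = /name="csrf" value="([^"]+)"/.exec(loginPage)[1];

  // 2. Submit the login POST and verify it was successful.
  const loginResult = await http.post('/login', { csrf, user: 'u', pass: 'p' });
  if (!loginResult.includes('Welcome')) throw new Error('login failed');

  // 3. Load the export page for its own token, then trigger zip generation.
  const exportPage = await http.get('/export');
  const token = /name="token" value="([^"]+)"/.exec(exportPage)[1];
  const response = await http.post('/export', { token });

  // 4. Scrape the generated zip's link out of the response and download it.
  const link = /href="([^"]+\.zip)"/.exec(response)[1];
  return http.get(link);
}
```

Injecting the client is also what makes this testable without two minutes of real HTTP per run.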
-
@dse said in I think I've just found a case when a premature optimisation isn't necessarily evil:
</rant>
Why is my
<pre></rant></pre>
not displayed?
HTML sanitizer. Use `&lt;`.
-
@PleegWat said in I think I've just found a case when a premature optimisation isn't necessarily evil:
HTML sanitizer
I am no expert, but is it supposed to sanitize within the `<pre>` block?
-
@dse Ask @ben_lubar
-
@dse HTML doesn't parse `<pre>` any differently than it parses `<div>` with some CSS.
-
@PleegWat said in I think I've just found a case when a premature optimisation isn't necessarily evil:
use `&lt;`
Use ```
-
any cargo culting is a terrible thing.
the premature-optimisation saying is something to be understood; you're supposed to know when it's appropriate to ignore it.
I prefer cowboy coders to annoying cultists who tell me what to do without providing a reasonable reason.
-
@wft said in I think I've just found a case when a premature optimisation isn't necessarily evil:
Sweet zombie Jesus
You mean like:
Tim Minchin - Woody Allen Jesus – 04:58
— BestOfTimMinchin
-
it's a bit on the heavy side, but it has almost every problem a scraper runs into resolved. give it a look, maybe it'll help you
-
@fbmac said in I think I've just found a case when a premature optimisation isn't necessarily evil:
I prefer cowboy coders
I don't. I have to clean up the shit they leave everywhere.
At least if you do the right thing for the wrong reasons, when you're long gone, the right thing will still be there for someone else to maintain.
-
@Yamikuronue said in I think I've just found a case when a premature optimisation isn't necessarily evil:
At least if you do the right thing for the wrong reasons, when you're long gone, the right thing will still be there for someone else to maintain.
The cultist moron who forbade me from using "+" to concatenate strings in C# wasn't doing the right thing.
And the overengineered clusterfuck they had there wasn't easy to maintain either.
-
@fbmac said in I think I've just found a case when a premature optimisation isn't necessarily evil:
The cultist moron who forbade me from using "+" to concatenate strings in C# wasn't doing the right thing.
You're not meant to be doing that anyway; it's hideously inefficient. You should either be using `String.Format()`, a `StringBuilder`, or in C# 6, a formattable string.
-
@RaceProUK said in I think I've just found a case when a premature optimisation isn't necessarily evil:
You're not meant to be doing that anyway; it's hideously inefficient
while technically true, for short strings or operations that happen extremely infrequently the horribly inefficient way can actually be faster, as it doesn't have the object construction/teardown overhead.
whether it is or isn't in your particular case will require careful instrumentation and testing to determine.
also, if you are dealing with string constants only, you absolutely should use the + concatenation operator, because the C# compiler will perform that concatenation at compile time, where using `StringBuilder` or `String.Concat` will always be done at run time.
-
@accalia If I have a string constant that long, I'd prefer to make it a multiline string instead of a series of concatenated single-line string constants.
Then again, if I have a string constant that long, I'd probably want to rethink my design
-
@RaceProUK said in I think I've just found a case when a premature optimisation isn't necessarily evil:
You're not meant to be doing that anyway; it's hideously inefficient. You should either be using `String.Format()`, a `StringBuilder`, or in C# 6, a formattable string.
The compiler converts almost all "+" operations to `String.Concat` where it matters. The only ones that are a cardinal sin are the ones done in a loop. Forbidding "+" only serves to make code harder to read.
-
@accalia The `+` operator is the most efficient way to concatenate strings to give another string. It's only poor when you're doing many concatenation operations, and horrible when used in a loop; the problem is the shuffle back and forth between string and (internal) string builder. Avoiding the problem requires doing more work explicitly, or greater shenanigans with the coupling between immutable strings and mutable builders (which is problematic when threads are about).
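The loop case looks like this in JS terms (purely illustrative: modern JS engines optimise += internally, so this shows the shape of the two approaches rather than C# timings; the array plays the StringBuilder role):

```javascript
// Concatenation in a loop: under an immutable-string model, each +=
// re-copies everything accumulated so far.
function joinNaive(parts) {
  let out = '';
  for (const p of parts) out += p; // the "horrible in a loop" form
  return out;
}

// Builder-style: collect the pieces, then copy each character once at the end.
function joinBuilder(parts) {
  const buffer = [];
  for (const p of parts) buffer.push(p);
  return buffer.join('');
}

joinNaive(['fish', 'sodom', 'gomorrah']);   // "fishsodomgomorrah"
joinBuilder(['fish', 'sodom', 'gomorrah']); // same string, one final copy
```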
-
@RaceProUK said in I think I've just found a case when a premature optimisation isn't necessarily evil:
it's hideously inefficient. You should either be using String.Format()
`String.Format()` is not efficient at all. The reason you're told to use it is that it makes strings easy to translate.
-
The whole "premature optimization" thing always seemed like a silly term to me. What constitutes "premature" anyway? Am I supposed to avoid early returns and breaks because each one takes an extra 10 minutes of thought and testing? Or avoid thinking about an efficient way to model my data before I commit to it? Should I not use any indexes in my database before I notice I might need one? Where is the line between "premature optimization", "good planning" and "good code"? Who gets to decide?
-
@Jaime said in I think I've just found a case when a premature optimisation isn't necessarily evil:
The only ones that are a cardinal sin are the ones done in a loop.
No, you also see code like this sometimes:
```csharp
string a = "fish" + squirrel;
a = a + "sodom";
a = a + gomorrah;
a = a + "some other" + "stuff";
a = a + yaddayadda();
```
Yes, that code is bad and it should feel bad. Yes, only the ignorant write it like that. Yes, it occurs far too often, sometimes even from people who you'd think would know better…
-
@accalia said in I think I've just found a case when a premature optimisation isn't necessarily evil:
whether it is or isn't in your particular case will require careful instrumentation and testing to determine
but none of you cultists asked about the particular case.
oh yeah, let's set up some instrumentation to see if the concatenation in my website button click that happens in 1ms is good to go.
-
@fbmac said in I think I've just found a case when a premature optimisation isn't necessarily evil:
button click that happens in 1ms
if it happens in less than a thousandth of a second, who the flip cares if it's efficient? unless it's literally going to be clicked more than a thousand times a second, you care more about readability than performance. and in that case concatenation wins, particularly when the compiler is already going to be doing a fairly substantial amount of optimization for you anyway once you compile in release mode.
-
@accalia the definition of cargo culting, and that's what I'm complaining about, is that they don't know or care why they are doing something.
the cases where replacing the plus operator with something else like a StringBuilder matter are very rare in the type of software I was working on at the time.
-
@Onyx said in I think I've just found a case when a premature optimisation isn't necessarily evil:
The whole "premature optimization" thing always seemed like a silly term to me. What constitutes "premature" anyway? Am I supposed to avoid early returns and breaks because each one takes an extra 10 minutes of thought and testing? Or avoid thinking about an efficient way to model my data before I commit to it? Should I not use any indexes in my database before I notice I might need one? Where is the line between "premature optimization", "good planning" and "good code"? Who gets to decide?
There's a difference between "optimization" and "writing proper code". In the case of premature optimization, the rule refers to deferring any optimization that has a cost - usually in maintainability, occasionally in the time it takes to write the code, sometimes in another area of performance (like indexes slowing writes while speeding reads).
-
@Jaime said in I think I've just found a case when a premature optimisation isn't necessarily evil:
In the case of premature optimization, the rule refers to deferring any optimization that has a cost - usually in maintainability, occasionally in the time it takes to write the code, sometimes in another area of performance (like indexes slowing writes while speeding reads).
That warning is particularly about not putting effort into optimizing stuff that has no actual useful effect anyway. Don't optimize things that aren't on a critical path.
Making the code clear to read is independent, and virtually always a good plan…
-
@Onyx said in I think I've just found a case when a premature optimisation isn't necessarily evil:
What constitutes "premature" anyway?
Honestly, the warning is meant to combat the itch to meddle. "Well I might be able to refactor this loop to be faster... actually I bet I could use that new technique I heard about here... well what if I moved this all to separate threads..."
We like to tweak things endlessly, and the returns are often minimal. We should, once something works and is reasonably well written, stop and move on unless we find there's a real, honest performance gain to be had.
-
@dkf said in I think I've just found a case when a premature optimisation isn't necessarily evil:
@Jaime said in I think I've just found a case when a premature optimisation isn't necessarily evil:
The only ones that are a cardinal sin are the ones done in a loop.
No, you also see code like this sometimes:
```csharp
string a = "fish" + squirrel;
a = a + "sodom";
a = a + gomorrah;
a = a + "some other" + "stuff";
a = a + yaddayadda();
```
Yes, that code is bad and it should feel bad. Yes, only the ignorant write it like that. Yes, it occurs far too often, sometimes even from people who you'd think would know better…
This code isn't even that bad from a performance perspective. Iterative string concatenation "horribleness" is proportional to the length of the string squared. Your example (and 99% of similar blocks of code) only frivolously allocates a small amount of memory and only creates a few short-lived instances.
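A back-of-the-envelope model of that quadratic growth, under the naive assumption that every append copies the entire accumulated string (real runtimes vary):

```javascript
// Count characters copied when appending `pieces` chunks of length
// `pieceLength` to an immutable string, one += at a time.
function charsCopied(pieceLength, pieces) {
  let copied = 0;
  let accumulated = 0;
  for (let i = 0; i < pieces; i++) {
    copied += accumulated + pieceLength; // re-copy the prefix, add the piece
    accumulated += pieceLength;
  }
  return copied; // = pieceLength * pieces * (pieces + 1) / 2
}

charsCopied(10, 5);   // 150: five appends barely register
charsCopied(10, 100); // 50500 copies for a 1000-character result
charsCopied(10, 200); // 201000: double the pieces, roughly 4x the copying
```

Which is exactly the point: a handful of concatenations is noise, and only the loop over many pieces earns the "horrible" label.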
-
@wft said in I think I've just found a case when a premature optimisation isn't necessarily evil:
each of the 50 tests takes 2 minutes on average.
In my current project, we have over a thousand tests that run 2 minutes each. Granted, we don't often need to run all of them at once, and they can run in parallel (up to 50 at once, although that makes each of them run longer - still a net gain), but they're all still run as part of CI. We have 10 configurations that all need to be compiled and tested, and our CI system is so fucked up that despite really overblown infrastructure, it still takes 2 hours to make a single CI build. TWO HOURS. IN 200KLOC PROJECT.
Oh, and we have tests failing randomly more often than not. And even if they all pass, CI often breaks anyway for a number of reasons, half of them unknown. And we need to pass CI to ship. And we need to ship every feature to up to 5 different branches.
-
@Jaime It does depend on the context. When it's inside a method that's called from somewhere else a lot of times, it matters.
-
@Gąska said in I think I've just found a case when a premature optimisation isn't necessarily evil:
TWO HOURS. IN 200KLOC PROJECT.
Either you're doing something insanely elaborate in C++, probably involving Boost, or you've got a serious problem. (Like you can distinguish these cases.)
-
@Jaime said in I think I've just found a case when a premature optimisation isn't necessarily evil:
Iterative string concatenation "horribleness" is proportional to the length of the string squared.
That's really only true of runtimes that don't maintain an explicit length as part of their string data structure. Those that do should be able to do string concatenation somewhere between constant time and O(length) depending on how smart their memory allocators are.
-
@flabdablet said in I think I've just found a case when a premature optimisation isn't necessarily evil:
That's really only true of runtimes that don't maintain an explicit length as part of their string data structure.
Or where the string is a true immutable type, which also forces the full copy.
-
@dkf That depends on how immutability is actually implemented. If it's done by using shared read-only access to an underlying text pool, it's quite feasible to copy descriptors that identify chunks of text rather than the text itself; this makes most string operations take time dependent on the number of chunks composing any given string rather than its raw length in characters. If the descriptors are arranged in trees, that doesn't even need to be O(chunks) - it could be O(log chunks).
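A toy version of that descriptor idea (not taken from any real runtime): a node records its two halves and a cached total length, so concatenation never touches the underlying text, and only flattening walks the chunks.

```javascript
// Minimal rope-like node: leaves are plain JS strings, which conveniently
// already expose .length and .toString().
class Rope {
  constructor(left, right) {
    this.left = left;   // Rope or string
    this.right = right; // Rope or string
    this.length = left.length + right.length; // O(1), no text copied
  }
  toString() {
    // Flattening is the only operation that visits every chunk.
    return this.left.toString() + this.right.toString();
  }
}

const concat = (a, b) => new Rope(a, b); // O(1) regardless of lengths

const r = concat(concat('fish', 'sodom'), concat('gomorrah', '!'));
r.length;     // 18, maintained without reading the text
r.toString(); // "fishsodomgomorrah!"
```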
-
@flabdablet, you mean something like the ropes that the SGI STL had, but which never made it into any standard? I have not seen them in any other language's standard runtime either. Apparently the complexity of the code isn't worth the computational-complexity benefits, especially when you take into account that they are much worse for the caches.
-
@Bulb That kind of thing, yes.