Programming Language Learning Curves



  • While I understand and pretty much agree with your points, I'm a little surprised by this one. What is the basis for it?

    Sure, things can get complicated if you've got assignments on one or both of the sides, but that's not the main issue.

    The problem is referential opacity. The a in a == b can mean different things at different times. So, for example, there is no guarantee that

    a = 0;
    b = 0;
    a == b
    ...
    a == b
    

    because the meaning of a might have changed in those '...'

    This is probably a different point of view on languages than most other people around here have (and people might think it's a weak example, but it is what it is), since it treats expressions as definite descriptions or references (or some other similar construct).

    Edit: Wrong person-oed.

    Maybe in higher-level formal math. High school algebra students only recognize =.

    Not really. Everybody will understand := means "is defined by the expression", but hardly anybody uses it. It's actually kind of pointless, since a definition for a quantity will usually have words to explain what the symbols mean.



  • Remember, all this goes back to:
    [quote=Jaime]I read a study a long time ago that gave a pre-test to a bunch of programming students and followed their progress. The one question on the test that was most predictive of success was

    .. snip...

    It was the difference between looking at the program as whole and thinking that every statement should be simultaneously satisfied and thinking of a program as a series of steps.[/quote]
    The students that "didn't get it" looked at a computer program like it was work on an algebra problem instead of like a recipe. They don't get the whole idea of "meaning of a might have changed in those '...'", because that's not how it works in algebra class.
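    To make the contrast concrete, here's a small C sketch of my own (just an illustration, not something from the study): the marked line is an everyday program step, but read as an algebra equation it has no solution.

    ```
    #include <stdio.h>

    int main(void)
    {
        int a = 5;

        /* As a program step: "take the current value of a, add 1, store it
           back", so a becomes 6.  As an algebra equation, a = a + 1 has no
           solution -- which is exactly the clash between the two mental
           models. */
        a = a + 1;

        printf("%d\n", a);   /* prints 6 */
        return 0;
    }
    ```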


  • Banned

    If the teacher hasn't made it very very clear that programming is totally different from algebra and that it just looks similar, then that's the teacher's fault.



  • The students that "didn't get it" looked at a computer program like it was work on an algebra problem instead of like a recipe. They don't get the whole idea of "meaning of a might have changed in those '...'", because that's not how it works in algebra class.

    Plenty of programming problems are algebra problems. Trying to express their solution as a recipe is painful.

    For example, implementing Black-Scholes as a series of steps loses the clarity of the original formulation.
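    For what it's worth, here's a minimal C sketch of the closed-form call price (the function and parameter names are mine, not from any particular library). Written this way it maps almost term-for-term onto the textbook formula, which is exactly the clarity a step-by-step recipe version tends to lose.

    ```
    #include <math.h>

    /* Standard normal CDF via the C99 erfc function. */
    static double norm_cdf(double x)
    {
        return 0.5 * erfc(-x / sqrt(2.0));
    }

    /* Price of a European call: C = S*N(d1) - K*exp(-r*T)*N(d2). */
    double black_scholes_call(double S, double K, double r,
                              double sigma, double T)
    {
        double d1 = (log(S / K) + (r + 0.5 * sigma * sigma) * T)
                    / (sigma * sqrt(T));
        double d2 = d1 - sigma * sqrt(T);
        return S * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2);
    }
    ```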

    People who can write Black-Scholes without looking up the formula probably make 3-5 times what you do, on average. (No offense intended, but it's still true. Incidentally, I can do it and make almost nothing. But I am a career changer who is taking professional exams and had to learn how)


  • I survived the hour long Uno hand

    @Captain said:

    Everybody will understand := means "is defined by the expression"

    I guarantee Algebra I students will not.



  • @Gaska said:

    If the teacher hasn't made it very very clear that programming is totally different from algebra and that it just looks similar, then that's the teacher's fault.

    So... they did a study to investigate students' experiences learning programming. The people who did the study determined, from the data they collected, that there was a question in the pre-test that was a strong predictor of future success. You have decided that the results of the study are completely irrelevant and you just know, without any data, that the teachers were bad. Not only bad, but bad with a distribution such that only students who got a particular pre-test question wrong got bad teachers.



  • @Captain said:

    Trying to express their solution as a recipe is painful

    You're describing why many problems are best tackled with a functional paradigm rather than a procedural one. Great. That doesn't explain why a large portion of the student population of this planet has a hard time learning procedural programming. This study tried to tackle that question.


  • Banned

    @Jaime said:

    So... they did a study to investigate students' experiences learning programming. The people who did the study determined, from the data they collected, that there was a question in the pre-test that was a strong predictor of future success. You have decided that the results of the study are completely irrelevant and you just know, without any data, that the teachers were bad. Not only bad, but bad with a distribution such that only students who got a particular pre-test question wrong got bad teachers.

    Two things:

    1. If they got this question wrong, it means they don't know the difference between programming and algebra. The most probable cause is that no one has ever told them they're different.
    2. I think that a significant part of those who got this question right had previous experience with programming already.

    I hope you have a citation that disproves at least one of the two.



  • @Gaska said:

    If they got this question wrong, it means they don't know the difference between programming and algebra. The most probable cause is that no one has ever told them they're different.

    Certainly; the pre-test was given before the start of the class. Performance on that question was a strong predictor for how well they did in that class. (Also see below.) I think the point is that people who were able to half-infer, half-guess the meanings have more of a "programmer mindset", whatever that means.

    @Gaska said:

    I think that a significant part of those who got this question right had previous experience with programming already.

    I am almost certain that I've read and discussed in a reading group the same paper as Jaime is referencing. It was a couple of years ago, so my memory could falter, but I don't remember "the authors didn't control for prior experience" being an issue. And considering that the paper probably wouldn't have been published if they hadn't, I'm inclined to think that they did. I think that we had some other discussion about questions left open, but not that one.

    I will try to find the paper this evening; I think the fastest way will be for me to look through old emails on an account I don't have good access to currently.


  • Banned

    @EvanED said:

    Certainly; the pre-test was given before the start of the class. Performance on that question was a strong predictor for how well they did in that class.

    Maybe because no one told them, even after the initial test, for the whole semester (or however long the class took)?

    @EvanED said:

    I will try to find the paper this evening; I think the fastest way will be for me to look through old emails on an account I don't have good access to currently.

    I'd be very thankful.



  • Doesn't C also have "assign-ops"? And doesn't consistency require that X == Y means "assign to X the value of X = Y", an expression which I'm led to believe is simply whatever the current value of Y happens to be?
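    For reference, a small sketch of what C's assign-ops and assignment-as-an-expression actually do (== itself stays a plain comparison):

    ```
    #include <stdio.h>

    int main(void)
    {
        int x = 1, y = 3, z;

        x += y;          /* an "assign-op": shorthand for x = x + y          */
        z = (x = y);     /* assignment is an expression: (x = y) stores y in
                            x and evaluates to that stored value, so z
                            becomes 3 as well                                */

        printf("%d %d %d %d\n", x, y, z, x == y);   /* prints "3 3 3 1";
                                                       == only compares,
                                                       it assigns nothing    */
        return 0;
    }
    ```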



  • X == Y
    X = (X = Y)

    color-coded that for you.



  • @Yamikuronue said:

    @Captain said:
    Everybody will understand := means "is defined by the expression"

    I guarantee Algebra I students will not.

    I've had high school Math through Calculus and several years of college Calculus, Diff Eq, Linear Algebra and Advanced Engineering Mathematics, and I didn't know (or didn't remember) that meaning. (Of course, that was all decades ago, so it's quite possible I did know it at one time and had forgotten, but still not "everybody will understand.")



  • @Gaska said:

    @EvanED said:
    I will try to find the paper this evening; I think the fastest way will be for me to look through old emails on an account I don't have good access to currently.

    I'd be very thankful.

    So this turned out to be a much more complicated story than I thought.

    The paper was about CS learning, so I assumed it was something I encountered while participating in the teaching & learning CS reading group I was in as a grad student. But I looked at the list of papers we covered, and unless it's incomplete or has a rather different title, I apparently invented the discussion. There are two possibilities for where I was familiar with it from. The first possibility is that I ran across some paper discussing this experiment when I was looking for a paper to moderate a discussion on for the reading group. This is totally feasible; I was skimming through a bunch of papers on mental models and this research falls squarely into that category.

    However, I suspect the truth is different, and here's where things get hairy. From what I can reconstruct, the following is what happened. As a note, I haven't read most of the links I'm about to give you.

    In the mid-2000s, Saeed Dehnadi (a grad student of Richard Bornat) carried out the aforementioned experiment. They wrote up their experiments in a paper called "The Camel Has Two Humps." You can easily find it, but I won't link to it because of what I describe next. This paper, as I remembered, does actually control for prior programming experience (namely, they claim a cohort that doesn't have any). This paper made the rounds at the time, including showing up on our own @codinghorror's blog. (@codinghorror, you may want to amend your post according to the following. :-)) There's actually a reasonable chance this is where I originally came across it.

    Later research has had... mixed success bearing out the conclusions. This includes a paper (ASE 2008) by the authors for which the abstract says "We now report that after six experiments, involving more than 500 students at six institutions in three countries, the predictive effect of our test has failed to live up to that early promise." Another study (ITiCSE 2007) by other authors also didn't substantiate the results. This alone isn't such a strike against them; that's just how science works. And this sort of subject is psychology, which by its nature is very difficult to study.

    Furthermore, it sounds like the case isn't even completely closed; a number of researchers have tried to reproduce the results, and Dehnadi apparently conducted a meta-analysis (PPIG 2009) of these that has more positive results -- including that the test is a stronger predictor of end-of-course success than is prior programming experience.

    (Part of the reason for this is the interpretation of Dehnadi's test hasn't been accurately described in this thread yet. The predictor is not whether the person guesses the operation of = correctly. Rather, they are asked a number of similar questions, and the answers are compared. Subjects are grouped depending on whether they have consistent or inconsistent answers between questions. If you guessed wrong about how = behaves but guessed wrong in the same way across all questions, you would be put into the "consistent" pile -- and it's whether you appear in the consistent pile that predicts your end-of-course success.) [Typo fixed]
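    To illustrate, a question in roughly the style of Dehnadi's test might look like this (my paraphrase, not the exact wording from the test):

    ```
    /* Paraphrase of the style of question, not the real test item:
       after these statements execute, what are the values of a and b? */
    int a = 10;
    int b = 20;
    a = b;

    /* "a = 20, b = 20" is the conventional C answer.  Someone who answers
       "a = 20, b = 10" (a swap model) or "a = 30, b = 20" (an additive
       model) is wrong, but if they apply the same wrong model to every
       question they still land in the "consistent" group -- and it is
       membership in that group, not correctness, that predicted
       end-of-course marks. */
    ```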

    Here's the problem with that paper. Actually, "the problem" isn't quite right, because there are several:

    • "The Camel Has Two Humps" version was not and has not been published in a scientific sense (i.e. peer reviewed).
    • The 2006 paper is actually quite awful on a number of fronts; the most offensive (even if their experiments had been substantiated) is that it goes on to claim not only that the current courses aren't teaching the people who "fail" the pre-test CS, but that those who fail the pre-test are unteachable. They dramatically overclaim in many other ways as well.

    For the above reasons, the authors (or at least Bornat) have actually issued a formal retraction of the original 2006 paper. The retraction has more, but it effectively boils down to the fact that he was being treated (and, for a while, suspended from his job) for depression; here is his explanation:

    I took the SSRI for three months, by which time I was grandiose, extremely self-righteous and very combative – myself turned up to one hundred and eleven. I did a number of very silly things whilst on the SSRI and some more in the immediate aftermath, amongst them writing “The camel has two humps”. I’m fairly sure that I believed, at the time, that there were people who couldn’t learn to program and that Dehnadi had proved it. The paper doesn’t exactly make that claim, but it comes pretty close. Perhaps I wanted to believe it because it would explain why I’d so often failed to teach them. It was an absurd claim because I didn’t have the extraordinary evidence needed to support it. I no longer believe it’s true.

    I also claimed, in an email to PPIG, that Dehnadi had discovered a “100% accurate” aptitude test (that claim is quoted in (Caspersen et al., 2007)). It’s notable evidence of my level of derangement: it was a palpably false claim, as Dehnadi’s data at the time showed.

    So anyway, there are some references to chase. :-)


  • Discourse touched me in a no-no place

    @Jaime said:

    That's a compiler optimization that improves a program's memory footprint. It doesn't change the semantics of the language.

    If it changed the semantics, it would be a wrong compilation, and you entirely missed my point in the first place.


  • Java Dev

    @ben_lubar said:

    X = (X = Y)

    That's undefined in C, because you are assigning to X twice. Although I think people have mentioned somewhere around here that newer versions of C++ allow it.


  • Banned

    @PleegWat said:

    That's undefined in C

    Weird. No possible order of evaluation will make the result anything but Y.


  • Java Dev

    Which is why the most recent C++ standards allow it. But in C, it's undefined.


  • Banned

    If it's undefined, what are the possible interpretations besides X=Y?

    Evaluation order of function parameters is undefined too, but atan2(x, y) is guaranteed to always work the same.


  • Discourse touched me in a no-no place

    If the computation of the arguments is side-effect-free, it doesn't matter what order they're evaluated in. If they've got side-effects but the side-effects don't interfere, the order in which they are evaluated also doesn't matter. Both of these cases are defined.

    It's only when there are interfering side effects that it matters, and that's when the result of the call ceases to be defined. That's not the poor function's fault; it just works with what it is given.
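    To put the distinction in code (an illustrative sketch, reusing the atan2 example from above):

    ```
    #include <math.h>

    double fine(double x, double y)
    {
        /* No side effects in the arguments, so the unspecified evaluation
           order cannot matter: either order produces the same call. */
        return atan2(x, y);
    }

    double not_fine(double x)
    {
        /* Undefined behaviour: x is modified (x++) and read (the second
           argument) with no sequencing between the two argument
           evaluations -- this is the "interfering side effects" case. */
        return atan2(x++, x);
    }
    ```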


  • Banned

    1. Assignment X=Y results in X having the value of Y and evaluates to the value of Y.
    2. Assignment X=(whatever evaluates to Y) results in X having the value of Y and evaluates to the value of Y.
    3. Assigning Y to a variable that already has the value of Y results in the variable having the value of Y.

    So, no matter how insane an interpretation you choose, X will get the value of Y. There's no other way. It might be undefined, but the result is perfectly predictable and independent of anything. X=Y.


  • kills Dumbledore

    @da_Doctah said:

    Only PL/1, alone among all the languages I've worked in, used the same symbol for both and left it to context to tell them apart

    SQL?

    DECLARE @a int  = 1
    IF @a = 1 PRINT 'Hello World'

  • Discourse touched me in a no-no place

    @Gaska said:

    So, no matter how insane an interpretation you choose, X will get the value of Y. There's no other way.

    Having worked with compilers more now, I'd hesitate a bit before asserting that.

    @Gaska said:

    It might be undefined, but the result is perfectly predictable and independent of anything. X=Y.

    I'd agree that the nasal demons are highly likely to be kindly in this case, but I really wouldn't want to be too certain that it will hold in general. It would be strongly predictable in Java or C# because they have a strict evaluation order defined in their semantics (the optimizers can vary things when they issue the code, but they have to preserve the semantics) but C and C++ simply don't make that guarantee.


  • Discourse touched me in a no-no place

    @Jaloopa said:

    SQL?

    DECLARE is not standard SQL, but rather just one of the extensions to it.


  • Banned

    @dkf said:

    I'd agree that the nasal demons are highly likely to be kindly in this case, but I really wouldn't want to be too certain that it will hold in general.

    Look. This isn't "run Nethack" level of undefined. It's "do one assignment or the other" level of undefined - and since both assignments assign the same value, there's nothing to worry about. If it was like "a = b = c = b = a", then it would totally be nasal demon territory. But it's "x = x = y".


  • kills Dumbledore

    Just because it's undefined doesn't mean it's not going to have the same behaviour in almost all compilers. If the standard says it's undefined, that means that whatever the compiler does if it encounters it is legal. If I wrote a compiler that formats your hard drive when it encounters a line x = x = y it would be perfectly legal. Stupid, but legal. So in general, don't rely on any UB behaving sensibly


  • Banned

    @Jaloopa said:

    If I wrote a compiler that formats your hard drive when it encounters a line x = x = y it would be perfectly legal.

    No it wouldn't. Assignment is a perfectly defined operation. So is assignment. The only thing undefined is in what order they'll be carried out. And in this particular case it doesn't matter because either way, the result will be the same.


  • kills Dumbledore

    I don't know the C specification, but this would depend on whether the order of operations is undefined or the whole statement is undefined. If it's the former then you're right. If the latter then I am


  • Banned

    It would be very very bad if assignment was undefined.


  • kills Dumbledore

    There's a difference between assignment and multiple assignments in the same statement


  • Banned

    The only difference is that there are multiple assignments. Each of them works the same as if it was alone - except for the order of evaluation.


  • I survived the hour long Uno hand

    The whole thing is undefined. The standard says:

    Between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression.

    There's no sequence point between the first assignment and the second to the same variable. This particular example is the most benign type of UB, the type where it's obvious what should happen and just about all compilers agree, but the general case leads to some wonky examples like

    (Quoting because onebox is useless:)

    x += (x+=1)

    x+=x+=x+=2.5;


  • kills Dumbledore

    @Yamikuronue said:

    sequence point

    That's the keyword I was trying to remember.

    Common sense really doesn't come into what's legal when it comes to UB. If the standard says it's undefined, literally anything can happen


  • Discourse touched me in a no-no place

    @Jaloopa said:

    If the standard says it's undefined, literally anything can happen



  • If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined. If there are multiple allowable orderings of the subexpressions of an expression, the behavior is undefined if such an unsequenced side effect occurs in any of the orderings.

    and

    An assignment operator stores a value in the object designated by the left operand. An assignment expression has the value of the left operand after the assignment, but is not an lvalue. The type of an assignment expression is the type the left operand would have after lvalue conversion. The side effect of updating the stored value of the left operand is sequenced after the value computations of the left and right operands. The evaluations of the operands are unsequenced.

    So there is no sequence specified between the side effects in a multiple assignment, thus the behavior is undefined. Undefined in the nasal demons sense.
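    Spelled out for the statement under discussion (a sketch of the reasoning, not a claim about what any particular compiler will actually emit):

    ```
    int demo(int y)
    {
        int x = 0;

        /* Both assignments in "x = x = y" store to x, and in C those two
           side effects are unsequenced relative to each other, so the
           statement is undefined behaviour -- even though every "sensible"
           ordering would leave x equal to y.  (C++11 and later add the
           sequencing that makes it well-defined, which is the difference
           mentioned upthread.) */
        x = x = y;
        return x;
    }
    ```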

    Filed Under: Hanzo'd


  • Banned

    @Yamikuronue said:

    Between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression.

    Oh, I didn't know of that. I admit, I was wrong.



  • @EvanED said:

    ```
    while ((c = getchar()) != EOF)
    ```

    Is best handled using iterators. But there is one kind of construct where `=` not returning a value hurts:

    ```
    m = re1.match(line)
    if m:
        whatever(m)...
    else:
        m = re2.match(line)
        if m:
            whatever2(m)…
        else:
            m = re3.match(line)
            if m:
    ```

    which would be just so much easier to write and read and understand if it could be written

    ```
    if m = re1.match(line):
        whatever(m)...
    elif m = re2.match(line):
        whatever2(m)
    elif m = re3.match(line):
        ...
    ```



  • @Gaska said:

    It's "do one assignment or the other" level of undefined - and since both assignments assign the same value, there's nothing to worry about.

    Everybody, please welcome the… optimizer.

    Oh, hi, so what do we have here. The value of x depends on y. Oh, no, it actually depends on x (so it does not depend on y after all, because there can only be one side effect on a variable between sequence points). But x depending on x is trivial, we can skip that. Therefore the value of x is unchanged and the statement can be ignored.

    Now I am not sure whether this is what the dependency analysis in gcc will do in this case, but there are many cases where gcc's dependency analysis produces completely unexpected results when you feed it some undefined statements.


  • Banned

    As I said, I didn't know only one change can happen to a variable between sequence points. This invalidates all that I said before.


  • Banned

    @Bulb said:

    ```
    m = re1.match(line)
    if m:
        whatever(m)...
    else:
        m = re2.match(line)
        if m:
            whatever2(m)…
        else:
            m = re3.match(line)
            if m:
    ```

    Obligatory Rust:

    ```
    whatever(
        if let Some(m) = re1.match(line) { m }
        else if let Some(m) = re2.match(line) { m }
        else if let Some(m) = re3.match(line) { m }
    );
    ```



  • Oh, Rust is the new cool. The pinnacle of millennia of language development. Of course it handles the case nicely ;-P.


  • BINNED

    DeathStation 9000 or DS9K is sometimes also used as an adjective, as in "a DS9K endianness", meaning an endianness which is neither big-endian nor little-endian, like the American date format MM/DD/YYYY.

    Excuse me, there's a thread I need to visit 👿



  • I'm pretty sure (but not certain) that P.R. Halmos popularized the usage in the 1940s and 1950s. But I get your point. One student (me) in one school (my school) does is not data, just an anecdote.



  • @Bulb said:

    Everybody, please welcome the… optimizer.

    There's been some fairly indepth discussion of this, here, at least:

    http://what.thedailywtf.com/t/nasal-string-length/5250/94

    (I now relieve myself of the duty of discosearching for the rest of it—if I can't find a fucking URL which I know was posted multiple times, what chance do I have of finding anything useful...)


  • Discourse touched me in a no-no place

    @Captain said:

    One student (me) in one school (my school) does is not data, just an anecdote.

    Anecdata.



  • @Captain said:

    One student (me) in one school (my school) does is not data, just an anecdote.

    The sentence above has circumcise one too many verbs.



  • @da_Doctah said:

    The sentence above has circumcise one too many verbs.

    That's what I was thinking... but :shudder: not how I was thinking it.



  • @dkf said:

    DECLARE is not standard SQL, but rather just one of the extensions to it.

    OK... so....

    T-SQL?

    SELECT @fname = FirstName
    FROM Employees
    WHERE EmployeeID = 42
    

  • Java Dev

    Or just SQL?

    UPDATE TABLE SET COL1='Value' WHERE COL2='Other Value'
    

  • Discourse touched me in a no-no place

    @Jaime said:

    OK... so....

    T-SQL?

    Every DB vendor adds extensions to the baseline SQL standard.

