Incredible Machine Learning

Benjamin Hall

@dcon said in Incredible Machine Learning:

@Benjamin-Hall said in Incredible Machine Learning:

@cvi said in Incredible Machine Learning:

That said, don't just copy-paste those BibTeX entries (or whatever). Go over the damn references, and make sure they look alright and are consistent (abbreviating the venues the same way etc).

Sorta off topic, but when I did my PhD dissertation and sent it in for college-level editorial approval, I got rejected the first time. Why?

Because in two out of N (I forget how many) references in the reference section, I used (wait for it) two spaces after a period instead of one.

That was the only rejection reason. <sarc>Money well spent, I tell you what.</sarc>

Now you get PRs rejected because you used tabs instead of spaces. See, it was good training!

Or I have liquibase barf on me because there was a tab somewhere in the JSON changeset file. And when editing them in VIM and it auto-inserting a mix of tabs and spaces... <rage>
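For what it's worth, hunting down the stray tab before the tool does is a few lines of script. This is only a rough sketch: `find_tabs` and the sample changeset string are made up for illustration, and it knows nothing about Liquibase itself; it just reports where the tab characters are.

```python
def find_tabs(text):
    """Return (line_number, column) pairs, both 1-based, for every tab in text."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch == "\t":
                hits.append((line_no, col))
    return hits


# Hypothetical changeset fragment: the second line starts with a tab,
# the rest is indented with spaces.
sample = '{\n\t"databaseChangeLog": [\n    {"changeSet": {"id": "1"}}\n  ]\n}\n'

for line_no, col in find_tabs(sample):
    print(f"tab at line {line_no}, column {col}")
```

Run it over the file contents before committing, and fix whatever it reports.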

cvi

@Benjamin-Hall said in Incredible Machine Learning:

Because in two out of N (I forget how many) references in the reference section, I used (wait for it) two spaces after a period instead of one.

That seems slightly petty. Realistically speaking, how many people are going to read any given PhD thesis anyway? On average, it's going to be like the members of the thesis committee and, well, ..., yeah, that's about it.

Filed under: Sandwich/stapled theses FTW.

error

@dkf said in Incredible Machine Learning:

@remi said in Incredible Machine Learning:

(*) like "never use active form" (or "never use 'we'") which leads to awful passive-form-everything. Not only this makes sentences much more awkward to read, but it allows for weasel-wording everything. No, "the parameter" did not "had its value chosen" by some abstract and unspecified entity. You "chose the parameter", man-up to it and admit it. If that makes you cringe because you can't justify that choice, well maybe it's a good thing you notice it before the paper is actually published?

There should always be a minimum of at least one paragraph in active voice in a paper (excluding papers that are explicitly just summarizing the field) being the part where you claim that you did the work. Without that, it could just as well be done by someone else and then why the fuck are you writing a paper about it you plagiarizing bastard?!

@error_bot xkcd crowdsourcing

error_bot

xkcd said in https://xkcd.com/1060/ :

Crowdsourcing

We don't sell products; we sell the marketplace. And by 'sell the marketplace' we mean 'play shooters, sometimes for upwards of 20 hours straight.'

1060: Crowdsourcing - explain xkcd

Benjamin Hall

@cvi said in Incredible Machine Learning:

@Benjamin-Hall said in Incredible Machine Learning:

Because in two out of N (I forget how many) references in the reference section, I used (wait for it) two spaces after a period instead of one.

That seems slightly petty. Realistically speaking, how many people are going to read any given PhD thesis anyway? On average, it's going to be like the members of the thesis committee and, well, ..., yeah, that's about it.

Filed under: Sandwich/stapled theses FTW.

That's not an average, that's a maximum for most theses.

dkf

@Benjamin-Hall said in Incredible Machine Learning:

That's not an average, that's a maximum for most theses.

It's certainly both the modal and median value. And the long tail isn't very long.

dkf

@Benjamin-Hall said in Incredible Machine Learning:

@cvi said in Incredible Machine Learning:

That said, don't just copy-paste those BibTeX entries (or whatever). Go over the damn references, and make sure they look alright and are consistent (abbreviating the venues the same way etc).

Sorta off topic, but when I did my PhD dissertation and sent it in for college-level editorial approval, I got rejected the first time. Why?

Because in two out of N (I forget how many) references in the reference section, I used (wait for it) two spaces after a period instead of one.

And then you have papers like https://arxiv.org/abs/2105.05956 (which I was reading last week) where every section seems to have radically different formatting. It was when I saw the part in a different colour that I started to think that maybe, just maybe, the lead authors should have done something about fixing it.

Also some sections are useless filler. (Others are very good unless you're deep in the field. I'm far enough in to know the difference at a glance. And to wonder why some of these idiots haven't cited enough related work.)

remi

@Benjamin-Hall said in Incredible Machine Learning:

@remi Turgid scientific writing is one of my big dislikes about how science is communicated. And it's cargo cult/traditions all the way down--people do it now because that's what people have done.

Circling back to the topic of this thread, or at least of the sub-thread at hand, this very point is why the AI thing is unlikely to be much help. It will be trained on a lot of papers that include this bad writing, including some highly-cited papers (or whatever other measure they use to tell the AI which papers are well-written or not), and as a result it will likely consider that this bad writing is how you must write a paper, and perpetuate it. And if it catches on, people will use even more of this bad writing because "it's recommended by the AI thingy!"

A lot of the early 20th century papers, even for complex subjects, were easy to read and had personality showing through. That got lost somewhere.

Science was very different at the time, honestly. People were still fumbling in the dark with very little in the way of giants on whose shoulders to stand. Describing for the first time something that now makes up the second chapter of a textbook isn't the same as "more and more precise measurement" (I'm riffing on "all that remains is...", allegedly said by Kelvin but likely not; not that I think it's really relevant here, just to highlight how research has changed).

And there were some crummy papers at the time as well, either from the writing side or the scientific content (botched experiments and fanciful theories and the like). There is a huge survivorship bias at work here: we mostly remember the papers from great scientists, which often (not always, but more than average) are better written than those by piss-poor scientists. And some of those papers are actually unreadable, because sometimes despite being great scientists the author(s) didn't really understand what they were on to (the "fumbling in the dark" bit above...), and you need to read papers from one generation later to actually get a good explanation of the topic.

topspin

@remi there's also a huge amount of papers written by ESL speakers of which a significant fraction are also pretty terrible at English. (I include myself in the first but hope that while my English is non-idiomatic at times it's not quite as bad as some of the things I read)

That itself seems to perpetuate some styles and phrases that I consider somewhere between awful and plain wrong. I noticed that directly from collaborators writing things that sound at best like "how a German would translate this to English" rather than standard English and yet could not convince them to change because that's "how everyone does it." But also in other papers where it seemed that phrasing which sounds decidedly Indian-"English", for example, has caught on in the respective communities.

I don't want to start yet another fruitless debate about descriptive vs. prescriptive rules, but whoever thinks that there is no such thing as right or wrong in terms of language use, I disagree.

remi

@topspin also I've noticed that, weirdly, papers in trade publications that have low bars for review (or none) actually tend to be more enjoyable to read than "proper" peer-reviewed papers. Or rather, the average readability isn't better (and the average technical content is definitely much, much lower, since it's mostly empty marketing bullshit by companies trying to sell their tech), but the variance is much higher, meaning you sometimes get totally unreadable stuff, but also very enjoyable ones (*), whereas peer-reviewed journals tend to have a much tighter quality -- all papers are somewhat readable, but they're also all boring.

(*) these tend to be when someone with both a good technical understanding of the problem and long industry experience takes some time to try and "dumb down" some complex idea (either just as a summary paper, or to introduce some new development that they made). My guess is that the "no proper peer-review" actually allows people to loosen their writing standards a bit, and to use some not very rigorous, but extremely useful writing techniques (such as explaining something with some basic sketches or an analogy, both being technically wrong, but helpful when done right).

dkf

@topspin said in Incredible Machine Learning:

not quite as bad as some of the things I read

On the evidence of your writing here, it can't possibly be as bad as some of the shit I've read in papers purporting to be in English.

That itself seems to perpetuate some styles and phrases that I consider somewhere between awful and plain wrong. I noticed that directly from collaborators writing things that sound at best like "how a German would translate this to English" rather than standard English and yet could not convince them to change because that's "how everyone does it." But also in other papers where it seemed that phrasing which sounds decidedly Indian-"English", for example, has caught on in the respective communities.

German doesn't translate too weirdly into English. Polish and Italian are significantly worse (as I remember from collaborating on writing a paper with Germans, Poles and Italians…)

boomzilla

@topspin said in Incredible Machine Learning:

@remi there's also a huge amount of papers written by ESL speakers of which a significant fraction are also pretty terrible at English. (I include myself in the first but hope that while my English is non-idiomatic at times it's not quite as bad as some of the things I read)

If I didn't know otherwise I'd have no reason to believe you were ESL based on the writing in your posts here.

HardwareGeek

@boomzilla I've noticed there are a few more-or-less distinct categories of writing here.

Clear, grammatically correct English (whether native or fluent ESL doesn't matter)
Fluent but either doesn't bother to be exactly grammatically correct or has the occasional oops (typo or rewrote a sentence and missed changing a verb conjugation or whatever)
Subtly ESL
Obviously ESL
Drug- or heavy metal poisoning-induced stream of gibberish

Some writers drift between fluent ESL, subtly ESL and obviously ESL, depending on how much effort they put into a specific post.

topspin

@HardwareGeek said in Incredible Machine Learning:

Drug- or heavy metal poisoning-induced stream of gibberish

Those are usually grammatically correct, even if indecipherable.

error

@boomzilla said in Incredible Machine Learning:

@topspin said in Incredible Machine Learning:

@remi there's also a huge amount of papers written by ESL speakers of which a significant fraction are also pretty terrible at English. (I include myself in the first but hope that while my English is non-idiomatic at times it's not quite as bad as some of the things I read)

If I didn't know otherwise I'd have no reason to believe you were ESL based on the writing in your posts here.

I can spot ESL people because they use "whom" correctly.

Gribnit

@error said in Incredible Machine Learning:

@boomzilla said in Incredible Machine Learning:

@topspin said in Incredible Machine Learning:

@remi there's also a huge amount of papers written by ESL speakers of which a significant fraction are also pretty terrible at English. (I include myself in the first but hope that while my English is non-idiomatic at times it's not quite as bad as some of the things I read)

If I didn't know otherwise I'd have no reason to believe you were ESL based on the writing in your posts here.

I can spot ESL people because they use "whom" correctly.

Which is otherwise not used.

Applied Mediocrity

@HardwareGeek said in Incredible Machine Learning:

Obviously ESL and doesn't bother

(but try to imagine that the warthog is )

Steve_The_Cynic

@cvi said in Incredible Machine Learning:

just because some people misunderstood a stupid style guide from about 100 years ago isn't quite the right thing either.

The key thing they've misunderstood is that the said guide is a guide (a set of recommendations), not a rulebook.

Carnage

@Steve_The_Cynic said in Incredible Machine Learning:

@cvi said in Incredible Machine Learning:

just because some people misunderstood a stupid style guide from about 100 years ago isn't quite the right thing either.

The key thing they've misunderstood is that the said guide is a guide (a set of recommendations), not a rulebook.

That's how guides tend to work. Lots of examples in programming.

cvi

@Steve_The_Cynic said in Incredible Machine Learning:

The key thing they've misunderstood is that the said guide is a guide (a set of recommendations), not a rulebook.

Well, there's that too. Then again, said guide actually mentions that passive voice is occasionally useful/necessary (and includes examples of that).

topspin

@cvi said in Incredible Machine Learning:

@Steve_The_Cynic said in Incredible Machine Learning:

The key thing they've misunderstood is that the said guide is a guide (a set of recommendations), not a rulebook.

Well, there's that too. Then again, said guide actually mentions that passive voice is occasionally useful/necessary (and includes examples of that).

I’ve never understood the hate of passive voice. Sure, overdoing it (as complained about here) sucks too. But write one sentence in passive voice in Word and it’ll yell at you with blue grammar squiggles and suggest you use active even when that doesn’t make sense.
Okay, Word is pretty stupid, so maybe there’s really no surprise here.

sloosecannon

@HardwareGeek said in Incredible Machine Learning:

Fluent but either doesn't bother to be exactly grammatically correct or has the occasional oops (typo or rewrote a sentence and missed changing a verb conjugation or whatever)

Which can totally happen to native speakers too, although the failure mode is usually slightly different. I am kinda curious how much effort it takes to be "fluent-sounding" when communicating on a forum, since I've never had a need to personally (I technically learned a little Spanish in high school but.. my brain did not adapt to a new language well at all .......)

cvi

@sloosecannon said in Incredible Machine Learning:

Which can totally happen to native speakers too, although the failure mode is usually slightly different. I am kinda curious how much effort it takes to be "fluent-sounding" when communicating on a forum

I'm also wondering what exactly "fluent-sounding" means. There are different writing styles, depending on the occasion (think: informal vs formal writing). I imagine that somebody could pass as "fluent-sounding" when writing in a formal style, but immediately out themselves as ESL if attempting to write informally.

From personal experience, writing/talking in a technical context is much easier than in some everyday contexts. For some specialized technical topics, I know the right terms and phrases. But when it comes to some every-day things, I've not had to write/talk about those things. Consequently, my vocabulary is much smaller. (To be fair, talking is an entirely different story anyway. You could likely be deaf and still tell that I'm not a native speaker within the first handful of words.)

dkf

@cvi said in Incredible Machine Learning:

(To be fair, talking is an entirely different story anyway. You could likely be deaf and still tell that I'm not a native speaker within the first handful of words.)

Would you claim to not be a native speaker of human languages?
🤜 🤛

error

@cvi said in Incredible Machine Learning:

I'm also wondering what exactly "fluent-sounding" means.

It means every method returns this so you can write an entire program as one unreadable line.
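For anyone who missed the pun, a "fluent" interface is one where each method returns the object itself (`this`, or `self` in Python), so the calls chain. A minimal sketch; the `Query` class and its methods are invented for illustration and are not any real library's API:

```python
class Query:
    """Toy query builder: every method except build() returns self."""

    def __init__(self):
        self.parts = []

    def select(self, *cols):
        self.parts.append("SELECT " + ", ".join(cols))
        return self  # returning self is what makes chaining possible

    def from_(self, table):
        self.parts.append("FROM " + table)
        return self

    def where(self, cond):
        self.parts.append("WHERE " + cond)
        return self

    def build(self):
        # Terminal call: returns the finished string, not the builder.
        return " ".join(self.parts)


# The whole "program" as one line:
sql = Query().select("id", "name").from_("users").where("id = 1").build()
print(sql)
```

Whether that one line counts as readable is left as an exercise.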

cvi

@dkf said in Incredible Machine Learning:

Would you claim to not be a native speaker of human languages?
🤜 🤛

I sometimes wonder...

The somewhat emo take on that is that "talking" to machines/computers sometimes seems to come more naturally than talking to actual humans.

Gribnit

@topspin said in Incredible Machine Learning:

@cvi said in Incredible Machine Learning:

@Steve_The_Cynic said in Incredible Machine Learning:

The key thing they've misunderstood is that the said guide is a guide (a set of recommendations), not a rulebook.

Well, there's that too. Then again, said guide actually mentions that passive voice is occasionally useful/necessary (and includes examples of that).

I’ve never understood the hate of passive voice. Sure, overdoing it (as complained about here) sucks too. But write one sentence in passive voice in Word and it’ll yell at you with blue grammar squiggles and suggest you use active even when that doesn’t make sense.
Okay, Word is pretty stupid, so maybe there’s really no surprise here.

The classic bullshit never ends. Computing techs cycle, we noticed that, but also so do many others - plus the argument over passive voice vs active voice is not not an argument over a computing tech.

Gribnit

@dkf said in Incredible Machine Learning:

@cvi said in Incredible Machine Learning:

(To be fair, talking is an entirely different story anyway. You could likely be deaf and still tell that I'm not a native speaker within the first handful of words.)

Would you claim to not be a native speaker of human languages?
🤜 🤛

Shit, I remember it. It was nice. Everything was ball.

error

@Gribnit said in Incredible Machine Learning:

@dkf said in Incredible Machine Learning:

@cvi said in Incredible Machine Learning:

(To be fair, talking is an entirely different story anyway. You could likely be deaf and still tell that I'm not a native speaker within the first handful of words.)

Would you claim to not be a native speaker of human languages?
🤜 🤛

Shit, I remember it. It was nice. Everything was ball.

https://www.youtube.com/watch?v=jT1W6BrGdcI

Gribnit

@error I like the Jurassic Park approach to predators. They're very sporting, to precede their attack with a threat display.

error

@Gribnit said in Incredible Machine Learning:

The classic bullshit never ends. Computing techs cycle, we noticed that, but also so do many others - plus the argument over passive voice vs active voice is not not an argument over a computing tech.

Do you indent with tabs or are you wrong?

Gribnit

@error said in Incredible Machine Learning:

@Gribnit said in Incredible Machine Learning:

The classic bullshit never ends. Computing techs cycle, we noticed that, but also so do many others - plus the argument over passive voice vs active voice is not not an argument over a computing tech.

Do you indent with tabs or are you wrong?

I indent all code with tabs that is not indented with spaces.

error

@Gribnit said in Incredible Machine Learning:

@error said in Incredible Machine Learning:

@Gribnit said in Incredible Machine Learning:

The classic bullshit never ends. Computing techs cycle, we noticed that, but also so do many others - plus the argument over passive voice vs active voice is not not an argument over a computing tech.

Do you indent with tabs or are you wrong?

I indent all code with tabs that is not indented with spaces.

So you indent the code of all those who don't indent their code. Who indents your code?

dcon

@error said in Incredible Machine Learning:

Do you indent with tabs or are you wrong?

I'm right! (My employer is wrong) But doesn't matter:
VSCode: Ctrl+Shift+I
VSStud: Ctrl+K,Ctrl+D

What? You don't have a custom .clang-format file in the root of your branch?

Gribnit

@error said in Incredible Machine Learning:

@Gribnit said in Incredible Machine Learning:

@error said in Incredible Machine Learning:

@Gribnit said in Incredible Machine Learning:

The classic bullshit never ends. Computing techs cycle, we noticed that, but also so do many others - plus the argument over passive voice vs active voice is not not an argument over a computing tech.

Do you indent with tabs or are you wrong?

I indent all code with tabs that is not indented with spaces.

So you indent the code of all those who don't indent their code. Who indents your code?

Doesn't matter, mine is all in Whitespace:

izzion

@dcon said in Incredible Machine Learning:

VSStud:

And that's why all of us who use it are single, it's stealing all the women!

error

@izzion said in Incredible Machine Learning:

@dcon said in Incredible Machine Learning:

VSStud:

And that's why all of us who use it are single, it's stealing all the women!

error

@Gribnit said in Incredible Machine Learning:

@error I like the Jurassic Park approach to predators. They're very sporting, to precede their attack with a threat display.

It's pretty common for venomous creatures to signal prominently, since the usual goal is "don't fuck with me."

Gribnit

@error said in Incredible Machine Learning:

@Gribnit said in Incredible Machine Learning:

@error I like the Jurassic Park approach to predators. They're very sporting, to precede their attack with a threat display.

It's pretty common for venomous creatures to signal prominently, since the usual goal is "don't fuck with me."

Fair, but Nedry is less in the threat than the potted meat category.

Applied Mediocrity

@error said in Incredible Machine Learning:

"don't fuck with me."

I don't even need to explicitly signal it!

BernieTheBernie

@error said in Incredible Machine Learning:

@boomzilla said in Incredible Machine Learning:

@topspin said in Incredible Machine Learning:

@remi there's also a huge amount of papers written by ESL speakers of which a significant fraction are also pretty terrible at English. (I include myself in the first but hope that while my English is non-idiomatic at times it's not quite as bad as some of the things I read)

If I didn't know otherwise I'd have no reason to believe you were ESL based on the writing in your posts here.

I can spot ESL people because they use "whom" correctly.

How do you expect us to hide?

by replacing "whom" with "who"?
by incidentally using "whom" where "who" is to be used?

Whom could possibly help me here?

Shoreline

@boomzilla said in Incredible Machine Learning:

Not credible, to be more precise:

Casey Ross / Jun 2, 2021

Machine learning is booming in medicine. It's also facing a credibility crisis

Researchers analyzed hundreds of papers on machine learning models developed to combat the Covid-19 pandemic. They found every single one was fatally flawed.

The use of statistical models for stuff like this is kind of interesting but ultimately not surprising that you can't just throw lots of stuff at it and get magic out the other end.

Mo' like "medicine is catching up technologically as tech becomes increasingly stable and fewer people remember all the irradiations". #amirite

Gribnit

@BernieTheBernie said in Incredible Machine Learning:

@error said in Incredible Machine Learning:

@boomzilla said in Incredible Machine Learning:

@topspin said in Incredible Machine Learning:

@remi there's also a huge amount of papers written by ESL speakers of which a significant fraction are also pretty terrible at English. (I include myself in the first but hope that while my English is non-idiomatic at times it's not quite as bad as some of the things I read)

If I didn't know otherwise I'd have no reason to believe you were ESL based on the writing in your posts here.

I can spot ESL people because they use "whom" correctly.

How do you expect us to hide?

by replacing "whom" with "who"?

by incidentally using "whom" where "who" is to be used?

Whom could possibly help me here?

Whom is always correct, but it makes us feel bad.

BernieTheBernie

@topspin said in Incredible Machine Learning:

passive voice

Hungarian has no passive voice.
Indonesian has "object focus" which some Westerners may mistake for passive voice.
But the full splendor of Latin grammar cannot be enjoyed without the thorough knowledge of passive voice (and Accusativus cum Infinitivo, Supinum, ...).

error

@BernieTheBernie said in Incredible Machine Learning:

@error said in Incredible Machine Learning:

@boomzilla said in Incredible Machine Learning:

@topspin said in Incredible Machine Learning:

@remi there's also a huge amount of papers written by ESL speakers of which a significant fraction are also pretty terrible at English. (I include myself in the first but hope that while my English is non-idiomatic at times it's not quite as bad as some of the things I read)

If I didn't know otherwise I'd have no reason to believe you were ESL based on the writing in your posts here.

I can spot ESL people because they use "whom" correctly.

How do you expect us to hide?

by replacing "whom" with "who"?

by incidentally using "whom" where "who" is to be used?

Whom could possibly help me here?

It's basically the Voight-Kampff test to detect replicants.

The tortoise lays on its back, its belly baking in the hot sun, beating its legs trying to turn itself over, but it can't. Not without your help. But you're not helping.

I mean: you're not helping! Why is that, Leon?

Gribnit

@error you could at least tw that shit

error

@Gribnit said in Incredible Machine Learning:

@BernieTheBernie said in Incredible Machine Learning:

@error said in Incredible Machine Learning:

@boomzilla said in Incredible Machine Learning:

@topspin said in Incredible Machine Learning:

@remi there's also a huge amount of papers written by ESL speakers of which a significant fraction are also pretty terrible at English. (I include myself in the first but hope that while my English is non-idiomatic at times it's not quite as bad as some of the things I read)

If I didn't know otherwise I'd have no reason to believe you were ESL based on the writing in your posts here.

I can spot ESL people because they use "whom" correctly.

How do you expect us to hide?

by replacing "whom" with "who"?

by incidentally using "whom" where "who" is to be used?

Whom could possibly help me here?

Whom is always correct, but it makes us feel bad.

Whomst'd've'ly'yaint'nt'ed'ies's'y'es

BernieTheBernie

@error said in Incredible Machine Learning:

Voight-Kampff test

Electric sheeps will be dreamed of by whom ever I believe has been taken to the test.

topspin

@Applied-Mediocrity said in Incredible Machine Learning:

@error said in Incredible Machine Learning:

"don't fuck with me."

I don't even need to explicitly signal it!

Chanting: One of us!

topspin

@BernieTheBernie said in Incredible Machine Learning:

Supinum

TIL, and I had four years of Latin.

But then reading some descriptions tells me that the Supinum I is used, for example, in the future passive infinitive, and since I've just woken up from a nap this makes me dizzy, so in all likelihood I will already have forgotten about it the moment I click "submit".

Filed under:

One of the major problems encountered in time travel is not that of becoming your own father or mother. There is no problem in becoming your own father or mother that a broad-minded and well-adjusted family can't cope with. There is no problem with changing the course of history—the course of history does not change because it all fits together like a jigsaw. All the important changes have happened before the things they were supposed to change and it all sorts itself out in the end.

The major problem is simply one of grammar, and the main work to consult in this matter is Dr. Dan Streetmentioner's Time Traveler's Handbook of 1001 Tense Formations. It will tell you, for instance, how to describe something that was about to happen to you in the past before you avoided it by time-jumping forward two days in order to avoid it. The event will be described differently according to whether you are talking about it from the standpoint of your own natural time, from a time in the further future, or a time in the further past and is further complicated by the possibility of conducting conversations while you are actually traveling from one time to another with the intention of becoming your own mother or father.

Most readers get as far as the Future Semiconditionally Modified Subinverted Plagal Past Subjunctive Intentional before giving up; and in fact in later editions of the book all pages beyond this point have been left blank to save on printing costs.