Scientific Science

boomzilla

@BernieTheBernie said in Scientific Science:

The number of citations is a key measure of academic merit. As usual, when you have such a Key Performance Indicator, some people can start gaming the system and improve their KPI. Like an Indian Dental School:

https://www.science.org/content/article/did-nasty-publishing-scheme-help-indian-dental-school-win-high-rankings

It seems to still take some time till The Science will learn such a simple fact.
Anyways, it has been known in business for decades, and ... - so no need to complain about The Science failing too.

Lots of fun stuff in here:

Saveetha, which calls itself a “pioneer in undergraduate publications,”

LOL.

asserts Theodore Eliades, a professor of orthodontics at the University of Zürich and editor-in-chief of The Korean Journal of Orthodontics.

I'm admittedly not familiar with personnel decisions at journals but that one seems a bit odd.

For example, a 2022 paper about different techniques for machining steel—published by a student and a faculty member at the Saveetha School of Engineering in Materials Today: Proceedings—includes this seemingly benign language in its introduction: “Our team has a wider research knowledge and experience that has been converted into high impact publications.” It is followed by more than a dozen citations of unrelated Saveetha publications.

That's like classic margin and font embiggening shenanigans.

But in an email he noted that, over the past few years, the firm noticed a 17-fold increase in publications linked to the school, many in journals that Clarivate no longer includes in its database because of problematic editorial standards or other issues.

cvi

@boomzilla said in Scientific Science:

I'm admittedly not familiar with personnel decisions at journals but that one seems a bit odd.

It's all online today, so location doesn't matter as much. Might have randomly ended up as one of the key journals specializing in ${subjects}. The guy might have been in Korea for a while. Or have collaborated with people there. (Journal itself could be old enough to date back to physical print and simply be one of the surviving venues in the topic.)

Editor is a somewhat thankless job that mainly involves a pile of cat herding. Helps if whoever is doing it is recognized by the community. There's some prestige involved in it if it's at a good venue. Alternatively it's the kind of thing somebody ends up doing because everybody else who could do it already did their time.

If it's anything like the journals I know, if it's a good and venerable journal, there might be a bit of cash involved (order of a thousand bucks/year), otherwise you're lucky to get a free dinner out of it. It's not a job or career, it's a post you hold for some time. (And one doesn't really do any hands-on editing; some journals do have dedicated editors (-not-in-chief) that are paid to make publications worse, unless the submissions are in Word, in which case they're paid to make them marginally tolerable.)

Maybe orthodontics is really flush with cash, and things are different there, but I kinda doubt it.

Edit: As for massive self-citation and citation rings.... . People been playing that game for a long time. Most maybe not quite as blatantly as these guys, though. That said, the more people game the system, the less useful the metric becomes (and people are starting to recognize this already). I say more power to idiots like this. If they manage to make counting citations so useless that even beancounters stop doing it, it's a victory. (By "victory" I mean that we can move on to the next dumb metric and do the whole thing over.)

cvi

@BernieTheBernie said in Scientific Science:

It seems to still take some time till The Science will learn such a simple fact.

It's very well known. There's a whole pile of metrics of varying complexity and stupidity, each trying to fix a problem or other. Each will be gamed eventually (if it becomes important enough). I doubt anybody has really any illusions on that part.

FWIW, I think the raw number of citations matters less now than it did 15 years ago (and relying on just raw citation count is a pretty major red flag).

If something has no citations or very few, that's an indicator. What number passes for "low" will vary quite a bit between fields. Some fields, you're lucky to have half a dozen citations after a few years; in ML (apparently), if you're not raking them in by the dozens within months, you can basically write off your work.

Beyond that, one would typically look more at where the publications are. In most places, if you claim to be in ${field} and somebody else from that field takes a look at your publication list and recognizes no or almost no venues in which you've published, you're pretty fucked. There are also some official lists to help with this (but they're no better than the institution that publishes them - I've seen some people end up using Norway's list, which actually seems somewhat reasonable.)

The lists get gamed as well (of course). A common reaction is that people will simply refuse to publish at unlisted venues, because they count for nothing. (And they might try to go for something with a better ranking even if it's less appropriate.)

remi

@cvi said in Scientific Science:

It's all online today, so location doesn't matter as much.

Yes, that part doesn't seem a problem to me. Of course, as always with "science" (taken as a broad and single entity), there are actually tons of differences per field, so maybe this is a problem for them. But I doubt it.

Edit: As for massive self-citation and citation rings.... . People been playing that game for a long time.

IIRC some citation indexes explicitly exclude self-citation (which is something that's fairly easy to check), though of course that can still be gamed by e.g. citing papers from other people in your team. Then again, it also makes sense to cite papers from other people in your team as your research is very likely a continuation of their work, so some degree of self-citation is expected.

(this link says this is the case for some JIF but the link doesn't seem to work)

Most maybe not quite as blatantly as these guys, though.

That part really sounds icky to me. If it was a paper that I had to review (but of course, are those papers even reviewed at all?), I would scratch that and tell the authors to explain which aspects of those previous papers are relevant to this paper.

cvi

@remi said in Scientific Science:

That part really sounds icky to me. If it was a paper that I had to review (but of course, are those papers even reviewed at all?), I would scratch that and tell the authors to explain which aspects of those previous papers are relevant to this paper.

Yeah, this. You can usually get away with a half-relevant self-citation or two and some minor amount of plugging, it's expected. But going overboard should get you slapped down during the review.

On the other hand, why not just formalize this? The standard paper structure for us has a "Previous Work" or "Related Work" section. I'd just add an explicit "Unrelated Work" section. You get to plug whatever you want there and if reviewers ask for stupid references to be added ... well, that's the place for it.

remi

@cvi said in Scientific Science:

You get to plug whatever you want there and if reviewers ask for stupid references to be added ... well, that's the place for it.

Like a reviewer suddenly becoming far less anonymous by suggesting that "authors should have cited the totally unrelated paper of Me, Myself and me, 2023?"

Zerosquare

@remi said in Scientific Science:

Me, Myself and me I, 2023

Fixed your citation.

remi

@Zerosquare ah, yes, thanks! I knew there was a better way to complete that list of authors but couldn't think of it.

Arantor

@remi I was going to make some joke about pronouns then I remembered that “me” was still a pronoun.

jinpa

TIL where Spock's home planet got its name.

Journey to the Invisible Planet • Damn Interesting

The tangled history of humanity’s search for the solar system’s uncharted planets.

Bulb

@jinpa The writing's a bit sloppy, though. I'm always tripped by things like

[..] G is the gravitational constant (~0.0000000000667408) […]

Of course, G = 1. At least in Planck units. In SI units, it has a dimension, and the unit (m³ kg⁻¹ s⁻²) needs to be included for the value to make sense.

BernieTheBernie

@Bulb Are you trying to compete with ?

Steve_The_Cynic

@BernieTheBernie I might have raised the same objection about the lack of units...

BernieTheBernie

@Steve_The_Cynic Yes. Could have acre feet per pound per square-fortnight. Or Florida ounces instead of pounds.

Bulb

@BernieTheBernie When it comes to being sloppy with physics, yes.

BernieTheBernie

@LaoC said in Scientific Science:

https://www.science.org/content/article/fake-scientific-papers-are-alarmingly-common

They used a "tool" to tell fake from genuine articles:

Sabel’s tool relies on just two indicators—authors who use private, noninstitutional email addresses, and those who list an affiliation with a hospital. It isn’t a perfect solution, because of a high false-positive rate. Other developers of fake-paper detectors, who often reveal little about how their tools work, contend with similar issues.

"We know our methodology of using people's email addresses to detect fake papers is utter crap but others' ain't any better so it's fine and we get to publish it in Science"

And now a response has been received to the cräppy article:
https://www.science.org/doi/10.1126/science.adi7104

izzion

@BernieTheBernie said in Scientific Science:

@LaoC said in Scientific Science:

https://www.science.org/content/article/fake-scientific-papers-are-alarmingly-common

They used a "tool" to tell fake from genuine articles:

Sabel’s tool relies on just two indicators—authors who use private, noninstitutional email addresses, and those who list an affiliation with a hospital. It isn’t a perfect solution, because of a high false-positive rate. Other developers of fake-paper detectors, who often reveal little about how their tools work, contend with similar issues.

"We know our methodology of using people's email addresses to detect fake papers is utter crap but others' ain't any better so it's fine and we get to publish it in Science"

And now a response has been received to the cräppy article:
https://www.science.org/doi/10.1126/science.adi7104

The response gets very spicy at the end

Science should be on the vanguard of drawing attention to how machine intelligence can encode racism while perpetuating and exacerbating traditional inequities, not an accessory to such wrongs.

GuyWhoKilledBear

@boomzilla said in Scientific Science:

https://www.science.org/content/blog-post/always-same-warning-signs

STAT has a pretty wild story (for subscribers) about Laronde, a Boston/Cambridge area biotech firm that seems to have had some major problems reproducing the data that helped raise them hundreds of millions of dollars last year.

"Always the Same Warning Signs" is a very accurate headline for this piece.

dkf

@GuyWhoKilledBear quoted in Scientific Science:

STAT has a pretty wild story (for subscribers) about ...

I guess subscribers to Science can't usually cope with anything more spicy than a description of variations in the mating behaviours of annelid worms between different parts of Kansas.

LaoC

Theo Baker / Sep 1, 2023 / University

Stanford president resigns over manipulated research

Stanford President Marc Tessier-Lavigne will resign effective Aug. 31. He will also retract or issue lengthy corrections to five widely cited papers for which he was principal author after a Stanford-sponsored investigation found “manipulation of research data.”

BernieTheBernie

So I started reading an interesting article on the genetic basis of asymptomatic covid:

Jul 19, 2023

A common allele of HLA is associated with asymptomatic SARS-CoV-2 infection

Nature - The human leukocyte antigen allele HLA-B*15:01 is associated with asymptomatic SARS-CoV-2 infection due to pre-existing T cell immunity.

The article starts with emphasizing that "Studies have demonstrated that at least 20% of individuals infected with SARS-CoV-2 remain asymptomatic".
Eventually they get to the summary of the epidemiological part of their study: "Overall, one in five individuals (20%) who remained asymptomatic after infection carried HLA-B*15:01, compared with 9% among patients reporting symptoms."
And now I thought:
20% of HLA-B*15:01 remained asymptomatic while infected, which is as much as the average population according to "studies"? So what? Do almost 100% of the population carry that allele?

Um, things are different. Only after digging through the appendices can you learn that in this study only 136 out of 1428 patients stayed asymptomatic. That's 9.5%, rather less than "at least 20%". That's not mentioned at all in the article proper...

Extended Data Table 1 Study population demographics

Beyond that, they do not mention the overall allele frequency in their group. I'd rather like to see how many people had 0, 1, or both alleles of HLA-B*15:01 among symptomatic and asymptomatic patients. I could try to calculate that from the data given, but ...

BernieTheBernie

When you do research on dishonesty, should you yourself be honest or dishonest?
:yes:
https://www.science.org/content/article/after-honesty-researcher-s-retractions-colleagues-expand-scrutiny-her-work

ixvedeusi

@BernieTheBernie said in Scientific Science:

Overall, one in five individuals (20%) who remained asymptomatic after infection carried HLA-B*15:01

estimates P(HLA-B*15:01 | Asymptomatic) at 20%

@BernieTheBernie said in Scientific Science:

20% of HLA-B*15:01 remained asymptomatic while infected

estimates P(Asymptomatic | HLA-B*15:01) at 20%

That's not the same claim. To know if that first statement is meaningful or not, you'd need to know how prevalent HLA-B*15:01 is.
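
A minimal Python sketch of why the two conditionals differ. The counts are the rounded reconstruction worked out later in this thread (28 carriers among 136 asymptomatic, 126 among 1292 symptomatic), not figures taken from the paper's tables, and `flip_conditional` is just an illustrative helper name:

```python
def flip_conditional(p_b_given_a, p_a, p_b):
    """Bayes' rule: P(A | B) = P(B | A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b


# Counts reconstructed later in the thread (assumed, rounded from allele frequencies).
asym_total, sym_total = 136, 1292
asym_carriers, sym_carriers = 28, 126

n = asym_total + sym_total
p_asym = asym_total / n                            # share of asymptomatic patients
p_carrier = (asym_carriers + sym_carriers) / n     # share of allele carriers
p_carrier_given_asym = asym_carriers / asym_total  # what the paper's sentence states

# Flip the conditional to get the claim the post actually criticised.
p_asym_given_carrier = flip_conditional(p_carrier_given_asym, p_asym, p_carrier)

print(f"P(carrier | asymptomatic) = {p_carrier_given_asym:.3f}")  # 0.206
print(f"P(asymptomatic | carrier) = {p_asym_given_carrier:.3f}")  # 0.182

# Purely hypothetical: same study, but half of all patients carry the allele.
p_hypothetical = flip_conditional(p_carrier_given_asym, p_asym, 0.5)
print(f"P(asymptomatic | carrier), 50% prevalence = {p_hypothetical:.3f}")  # 0.039
```

With these counts the two probabilities land close together only because the asymptomatic share (about 9.5%) and the carrier share (about 10.8%) happen to be similar; the hypothetical 50% prevalence shows how far apart they can drift.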

O'course, I didn't read the article, only your post, so possibly you might have quoted that wrong; but your criticism as stated in your post doesn't hold.

BernieTheBernie

@ixvedeusi said in Scientific Science:

you'd need to know how prevalent HLA-B*15:01 is

See above. But let me give you a rough estimate based on their data: 90% of the population have no such allele, 10% carry one allele, 0.25% both alleles (note: two alleles per person, one paternal, one maternal).

No, I do not want to say that their paper is wrong or contains sloppy data. My criticism is that they ought to have presented their data better.

I.e. clearly stating: overall 1,428 patients, 136 asymptomatic, 1,292 symptomatic (and the percentages).
Next, show the allele frequency for all groups, i.e. something like (rough estimates as above) 136 total asymptomatic, 2 both alleles, 25 single allele, 109 non-carriers, and similar for symptomatic and total.

And causes of confusion would be reduced greatly...

BernieTheBernie

@BernieTheBernie Yes, I did it:
was by .
( Why don't we have a gun pointing to the right? Because our righties want to shoot lefties. )

From the appendices we learn that 136 people stayed asymptomatic. TFA says the frequency of HLA-B*15:01 in that group was 0.1103. That is, the allele was found 30 times. Which resolves to 26 people with one allele plus 2 people with 2 alleles (or 28:1, or 24:3, or similar); 108 asymptomatic people do not carry the allele at all.

Next, there are 1292 symptomatic people. TFA mentions an allele frequency of 0.0495 in that group, i.e. the allele was found 128 times. Which resolves to 124 people with a single allele and 2 with two alleles (or similar variations as above); 1166 non-carriers.
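
The step from allele frequency to head counts can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the two paragraphs above; the helper names are made up, and the cap of three two-allele carriers is an arbitrary assumption to keep the list short:

```python
def allele_copies(people, freq):
    """Allele copies implied by a frequency, at two alleles per person."""
    return round(2 * people * freq)


def carrier_splits(copies, max_two_allele=3):
    """All (single-allele carriers, two-allele carriers) pairs matching `copies`."""
    return [
        (copies - 2 * two, two)
        for two in range(max_two_allele + 1)
        if copies - 2 * two >= 0
    ]


asym_copies = allele_copies(136, 0.1103)
sym_copies = allele_copies(1292, 0.0495)

print(asym_copies, carrier_splits(asym_copies))
# 30 [(30, 0), (28, 1), (26, 2), (24, 3)]
print(sym_copies, carrier_splits(sym_copies))
# 128 [(128, 0), (126, 1), (124, 2), (122, 3)]
```

The frequencies alone do not pin down which split is the real one, which is why the post says "or similar".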

Next, we can sum that up for the total.

For the total, the Hardy-Weinberg law of allele distribution should be valid. That is, with the total allele frequency f,
non-carriers = total population * (1-f) * (1-f)
single allelic carriers = total population * f * (1-f) * 2
two allelic carriers = total population * f * f

We arrive at

genotype        asymptomatic   symptomatic    total
non-carrier     108 (79%)      1166 (90%)     1274 (89%)
single allele   26 (19.1%)     124 (9.6%)     150 (10.5%)
two alleles     2 (1.47%)      2 (0.15%)      4 (0.28%)
total           136            1292           1428

So the overall allele frequency of HLA-B*15:01 is 5.53%. I did not check different sources if they came to the same frequency.

According to the Hardy-Weinberg formula, we'd expect 1274.4 non-carriers, 149.3 single allelic carriers, and 4.37 two allelic carriers. Fits perfectly.
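
For anyone who wants to redo the Hardy-Weinberg check, here is a small Python sketch. It assumes the allele counts reconstructed above (30 copies among the asymptomatic, 128 among the symptomatic) and the observed split from the table; `hardy_weinberg` is an illustrative name, not anything from the paper:

```python
def hardy_weinberg(people, f):
    """Expected (non-carriers, single-allele, two-allele) counts for allele frequency f."""
    q = 1.0 - f
    return people * q * q, people * 2 * f * q, people * f * f


people = 136 + 1292           # all patients
copies = 30 + 128             # reconstructed allele copies (assumed, see above)
f = copies / (2 * people)     # pooled allele frequency, two alleles per person

expected = hardy_weinberg(people, f)
observed = (1274, 150, 4)     # counts from the table in this post

print(f"allele frequency: {f:.4f}")
for label, exp, obs in zip(("non-carriers", "single allele", "two alleles"), expected, observed):
    print(f"{label:14s} expected {exp:8.2f}  observed {obs}")
```

The three expected counts always add up to the total population, which is a quick sanity check on the arithmetic.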

In my original post on this article, I mixed up some percentages. Let me calculate for the group of HLA-B*15:01 carriers. Their total number is 154 people. 126 of them were symptomatic (82%), and 28 asymptomatic (18%).

Things would have been much easier to follow had they provided that little table.

kazitor

@BernieTheBernie said in Scientific Science:

Things would have been much easier to follow had they provided that little table.

Tables take up space. Nature is allergic to anything that takes up space, such as figures, data, detail, nuance, methodologies and reproducibility, or anything else that detracts from IMPACT!!!!!

Arantor

@kazitor but also tables are harder to reliably put in Word documents.

jinpa

Science Jokes:Why the chicken crossed the road according to scientists

GuyWhoKilledBear

@BernieTheBernie said in Scientific Science:

When you do research on dishonesty, should you yourself be honest or dishonest?
:yes:
https://www.science.org/content/article/after-honesty-researcher-s-retractions-colleagues-expand-scrutiny-her-work

I just went looking all over the website for the Garage version of this post where I saw a news story about this Cambridge/Boston area lab fabricating data and the title was "Always the Same Warning Signs."

Turns out, the post is in this topic and it's about a different Boston/Cambridge area lab that's also fabricating data.

Needless to say, the Same Warning Signs are present in this story too.

Maybe instead of doing this study, we should be researching why everything from Boston is terrible.

Bulb

@jinpa said in Scientific Science:

Science Jokes:Why the chicken crossed the road according to scientists

There are probably some good jokes out there, but the structure of the web site is abysmally depressing.

HardwareGeek

@Bulb Amazingly, the "simplified" mobile view is actually much less terrible.

Bulb

@HardwareGeek Is that a completely different structure? Because the part that irks me most is how the list points to random places across a bunch of joke pages.

Arantor

@Bulb it’s not random. It’s that he has multiple categorisations of each thing and the primary categorisation is not the one we see at first, e.g. the first 3 or so of the chicken jokes align to the first half of the “mathematics” category. I assume this follows for the others.

I am personally more bothered by the lack of correct character set.

jinpa

@Bulb said in Scientific Science:

There are probably some good jokes out there, but the structure of the web site is abysmally depressing.

I have countless complaints about web sites, (overlays, sticky headers, preventing the pasting of passwords, etc.), but a simple, old-fashioned amateur website that's easy to navigate doesn't bother me. (No, I don't design websites, so you can breathe easier.)

Any web page that has less than 100 lines (yes, I said that correctly) of HTML and no JS or anything fancy deserves a medal, IMO.

HardwareGeek

@jinpa said in Scientific Science:

but a simple, old-fashioned amateur website that's easy to navigate doesn't bother me.

Arantor

@jinpa said in Scientific Science:

Any web page that has less than 100 lines (yes, I said that correctly) of HTML

And presumably we're not including 'everything on a single line because MINIFICATION' in that?

jinpa

@Arantor Correct. But I think the "no JS" requirement more or less covers that.

Arantor

@jinpa no, I’ve seen geniuses minify their HTML for “performance” before now. Whether gzip performs better or worse is so variable at that point it almost doesn’t matter.

BernieTheBernie

@Arantor said in Scientific Science:

minify their HTML for “performance”

That's what old style programmers used to do.
A variable with 2 characters is slower than a single character variable.
Same holds true for function names.
And an

if (condition) doit();

is faster than

if (condition) 
    doit();

which is much faster than

if (condition) 
{
    doit();
}

Just ask Kevin & Co., they will be able to teach you!

Bulb

@BernieTheBernie Well, it's faster to write . You only write the first one if you have something to hide though.

jinpa

@BernieTheBernie said in Scientific Science:

@Arantor said in Scientific Science:

minify their HTML for “performance”

That's what old style programmers used to do.
A variable with 2 characters is slower than a single character variable.
Same holds true for function names.
And an
if (condition) doit();
is faster than
if (condition) 
    doit();
which is much faster than
if (condition) 
{
    doit();
}

I would have thought that even old-style compilers would compile those to the same thing. No? (I assume you're making a comparison and that we've moved off of discussing pure HTML since you have ifs.)

Arantor

@jinpa assume tongue is firmly in cheek here. The reference to “Kevin & co” is the giveaway - Kevin is someone we all know well who would believe those things to be true.

jinpa

@Arantor ?

BernieTheBernie

@Bulb said in Scientific Science:

@BernieTheBernie Well, it's faster to write . You only write the first one if you have something to hide though.

I am not a beginner at Intended WTFs.
If I wanted to hide something, it would rather look like

if (condition) doit();
{
    domore();
}

topspin

@jinpa No, just one of Bernie's notorious coworkers. (Kevin being , whereas Fritz and others being )

BernieTheBernie

@topspin Kevin is 3 years older than I am.

topspin

@BernieTheBernie said in Scientific Science:

@topspin Kevin is 3 years older than I am.

Huh, that is surprising given the stereotypes around Kevins and their prevalence after Home Alone aired.

So much for ass-u-ming...

izzion

Garisto, Dan / Jul 25, 2023

‘A very disturbing picture’: another retraction imminent for controversial physicist

Nature - Ranga Dias will have a second paper revoked. A journal’s investigation found apparent data fabrication.

Steve_The_Cynic

@izzion Looks like a top-tier infernal mess, that one.

dcon

@izzion Academia's "Publish or Perish". Oops. Wrong choice.