Monday, October 16, 2017

A PhD Nightmare: How a ‘Safe’ Paper Turned Into a ‘Horror’ Paper


The last paper from my PhD has recently been accepted for publication. The paper describes the impact of current and potential future land-use intensification on bird species richness in Transylvania, Romania. Although the paper is perhaps not groundbreaking, I always thought it was still a relevant contribution to the scientific literature, given our large field efforts, its statistical soundness and the quality of the writing. A solid paper. Instead, getting it published has been a tough ride.
While we thought bats were difficult to publish (see our previous blog post on a rejection journey five years ago), we have now seen that birds can be even harder to get into journals. Ironically, this paper was considered the ‘safe paper’ of my PhD work. I was one of those lucky students who were part of a well-planned research project with great supervision. The bird work of my PhD was carefully planned and designed, was based on pilot studies and was set in a region rich in (protected) bird species. Very soon, however, my ‘safe’ paper turned into my ‘horror’ paper, with high levels of frustration, shattered confidence, and – in the end – lots of sarcasm and laughter.
Here is the story of how my ‘safe’ paper turned into my ‘horror’ paper.
Journal 1: Submitted Dec 2013, rejected with review Feb 2014: Lacking novelty and generality, and lacking clarity and focus of the analysis.
Journal 2: Submitted Feb 2014, rejected with review Mar 2014: Too broad discussion and lacking strong conclusions/management recommendations.
After these first two rejections, we made major changes to the manuscript. We narrowed down the manuscript considerably by deleting a part on species traits, and worked on the clarity of our methods section.
Journal 3: Submitted May 2014, rejected without review: Not general enough in concept, scope and approach.
Journal 4: Submitted May 2014, rejected with review Sep 2014: Lacking novelty.
Journal 5: Submitted Oct 2014, rejected with review Dec 2014: Lacking novelty, and lacking clarity in the methodology and results. As one reviewer put it: having a more complicated and complex design than other studies should not stand for novelty in scientific research.
By the time the paper had been rejected five times, I was pretty desperate and frustrated to hear over and over that the study lacked novelty. I figured that we couldn’t change much about the novelty of our study’s outcome. However, another frequent critique concerned the clarity of the methods and results, something I thought we could improve. Therefore, to give the paper a fresh boost, we brought in a new co-author. We re-analysed the entire paper focusing solely on species richness (taking out a part on bird communities), rewrote the entire paper for clarity and to put it into a broader context, and even put in some pretty pictures to illustrate traditional farming landscapes. Now, with our paper in a new jacket, I was convinced we would be luckier in the review process.
Journal 6: Submitted Jun 2015, rejected with review Aug 2015: Methodology limited the study’s conclusion and its capacity to go beyond a regional example. For example, it was critiqued that the model averaging approach used poses limitations and regression coefficients should be used instead.
Journal 7: Submitted Aug 2015, rejected with review Sep 2015: Flawed study design which was deemed uncorrectable without significant reanalysis. Although reviewer 1 had significant problems with our study design, reviewer 2 seemed to be less unhappy: The study is well introduced (I particularly liked the introduction of traditional farming landscapes), the study design is appropriate, the analyses generally robust (although please see comment below), and the results clear, and the discussion well considered.
Journal 8: Submitted Nov 2015, rejected with review Dec 2015: Methodology – given our objectives and sampling design we used the wrong analytical unit.
Journal 9: Submitted Jan 2016, rejected with review Feb 2016: Lack of novelty, trivial findings and not taking into account the rarity of species (something we had excluded from the manuscript due to other reviewer comments).
Journal 10: Submitted Feb 2016, rejected with review Jun 2016: Goal of the work not addressed.
Journal 11: Submitted Sep 2016, Minor revisions Jan 2017, Submitted revised manuscript Jul 2017 (after maternity leave), Accepted Jul 2017. Hurrah, the reviewers liked the paper a lot!!
Having had 10 rejections on this paper, mostly after review, means that approximately 25 (!) reviewers were involved in getting this paper published. Importantly, probably half of those reviewers could have been satisfied with major revisions. As in the example under Journal 7, usually one of the reviewers did not dislike our paper that much, but I guess a single more negative review is enough for a rejection.
Even more interesting, we published two similar papers on butterflies and plants from the same region, based on the same study design and using similar analyses. While this paper on birds was continually criticised for methodology that was unclear, flawed, or limited, the other two papers received positive, constructive reviews without many complaints about their novelty or study design. I am still not sure why this paper had such a hard time (is it just birds, or something else?), but I am happy it is finally out there! Enjoy the reading, and you can always contact me for further clarifications on its methods or novelty :).

The IQ Test Wars: Why Screening for Intelligence is Still So Controversial

by Daphne Martschenko, The Conversation: https://theconversation.com/the-iq-test-wars-why-screening-for-intelligence-is-still-so-controversial-81428

For over a century, IQ tests have been used to measure intelligence. But can it really be measured? via shutterstock.com
Daphne Martschenko, University of Cambridge
John, 12 years old, is three times as old as his brother. How old will John be when he is twice as old as his brother?
Two families go bowling. While they are bowling, they order a pizza for £12, six sodas for £1.25 each, and two large buckets of popcorn for £10.86 each. If they are going to split the bill between the families, how much does each family owe?
4, 9, 16, 25, 36, ?, 64. What number is missing from the sequence?
These are questions from online Intelligence Quotient or IQ tests. Tests that purport to measure your intelligence can be verbal, meaning written, or non-verbal, focusing on abstract reasoning independent of reading and writing skills. First created more than a century ago, the tests are still widely used today to measure an individual’s mental agility and ability.

Education systems use IQ tests to help identify children for special education and gifted education programmes and to offer extra support. Researchers across the social and hard sciences also study IQ test results, looking at everything from their relation to genetics and socio-economic status to academic achievement and race.

Online IQ “quizzes” purport to be able to tell you whether or not “you have what it takes to be a member of the world’s most prestigious high IQ society”.

If you want to boast about your high IQ, you should have been able to work out the answers to the questions. When John is 16 he’ll be twice as old as his brother. The two families who went bowling each owe £20.61. And 49 is the missing number in the sequence.
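The three answers can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes the £10.86 is the price of each bucket of popcorn, which is the reading that produces the stated £20.61.

```python
from decimal import Decimal
from math import isqrt


def age_when_twice(older_age: int, younger_age: int) -> int:
    """Age of the older sibling when he is exactly twice the younger's age."""
    # Solve older + x = 2 * (younger + x), which gives x = older - 2 * younger.
    years_ahead = older_age - 2 * younger_age
    return older_age + years_ahead


def share_per_family(families: int = 2) -> Decimal:
    """Split the bowling bill evenly between the families."""
    pizza = Decimal("12.00")
    sodas = 6 * Decimal("1.25")
    popcorn = 2 * Decimal("10.86")  # assumption: £10.86 per bucket
    return (pizza + sodas + popcorn) / families


def missing_square(sequence: list) -> int:
    """Fill the single gap (None) in a run of consecutive perfect squares."""
    gap = sequence.index(None)
    first_root = isqrt(sequence[0])  # 4 is 2 squared, so the run starts at 2
    return (first_root + gap) ** 2


john = 12
brother = john // 3  # John is three times as old as his brother
print(age_when_twice(john, brother))
print(share_per_family())
print(missing_square([4, 9, 16, 25, 36, None, 64]))
```

Using `Decimal` rather than floating point keeps the pence exact when the bill is halved.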

Despite the hype, the relevance, usefulness, and legitimacy of the IQ test are still hotly debated among educators, social scientists, and hard scientists. To understand why, it’s important to understand the history underpinning the birth, development, and expansion of the IQ test – a history that includes the use of IQ tests to further marginalise ethnic minorities and poor communities.

Testing times

In the early 1900s, dozens of intelligence tests were developed in Europe and America claiming to offer unbiased ways to measure a person’s cognitive ability. The first of these tests was developed by French psychologist Alfred Binet, who was commissioned by the French government to identify students who would face the most difficulty in school. The resulting 1905 Binet-Simon Scale became the basis for modern IQ testing. Ironically, Binet actually thought that IQ tests were inadequate measures for intelligence, pointing to the test’s inability to properly measure creativity or emotional intelligence.

At its conception, the IQ test provided a relatively quick and simple way to identify and sort individuals based on intelligence – which was and still is highly valued by society. In the US and elsewhere, institutions such as the military and police used IQ tests to screen potential applicants. They also implemented admission requirements based on the results.

The US Army Alpha and Beta Tests screened approximately 1.75m draftees in World War I in an attempt to evaluate the intellectual and emotional temperament of soldiers. Results were used to determine how capable a soldier was of serving in the armed forces and to identify which job classification or leadership position he was most suitable for. Starting in the early 1900s, the US education system also began using IQ tests to identify “gifted and talented” students, as well as those with special needs who required additional educational interventions and different academic environments.

Ironically, some districts in the US have recently employed a maximum IQ score for admission into the police force. The fear was that those who scored too highly would eventually find the work boring and leave – after significant time and resources had been put towards their training.

Alongside the widespread use of IQ tests in the 20th century was the argument that the level of a person’s intelligence was influenced by their biology. Ethnocentrics and eugenicists, who viewed intelligence and other social behaviours as being determined by biology and race, latched onto IQ tests. They held up the apparent gaps these tests illuminated between ethnic minorities and whites or between low- and high-income groups.

Some maintained that these test results provided further evidence that socioeconomic and racial groups were genetically different from each other and that systemic inequalities were partly a byproduct of evolutionary processes.

Going to extremes

The US Army Alpha and Beta test results garnered widespread publicity and were analysed by Carl Brigham, a Princeton University psychologist and early founder of psychometrics, in a 1922 book A Study of American Intelligence. Brigham applied meticulous statistical analyses to demonstrate that American intelligence was declining, claiming that increased immigration and racial integration were to blame. To address the issue, he called for social policies to restrict immigration and prohibit racial mixing.

A few years before, American psychologist and education researcher Lewis Terman had drawn connections between intellectual ability and race. In 1916, he wrote:
High-grade or border-line deficiency … is very, very common among Spanish-Indian and Mexican families of the Southwest and also among Negroes. Their dullness seems to be racial, or at least inherent in the family stocks from which they come … Children of this group should be segregated into separate classes … They cannot master abstractions but they can often be made into efficient workers … from a eugenic point of view they constitute a grave problem because of their unusually prolific breeding.
There has been considerable work from both hard and social scientists refuting arguments such as Brigham’s and Terman’s that racial differences in IQ scores are influenced by biology.

Critiques of such “hereditarian” hypotheses – arguments that genetics can powerfully explain human character traits and even human social and political problems – cite a lack of evidence and weak statistical analyses. This critique continues today, with many researchers resistant to and alarmed by research that is still being conducted on race and IQ.

But in their darkest moments, IQ tests became a powerful way to exclude and control marginalised communities using empirical and scientific language. Supporters of eugenic ideologies in the 1900s used IQ tests to identify “idiots”, “imbeciles”, and the “feebleminded”. These were people, eugenicists argued, who threatened to dilute the White Anglo-Saxon genetic stock of America.

A plaque in Virginia in memory of Carrie Buck, the first person to be sterilised under eugenics laws in the state. Jukie Bot/flickr.com, CC BY-NC

As a result of such eugenic arguments, many American citizens were later sterilised. In 1927, an infamous ruling by the US Supreme Court legalised forced sterilisation of citizens with developmental disabilities and the “feebleminded,” who were frequently identified by their low IQ scores. The ruling, known as Buck v Bell, resulted in over 65,000 coerced sterilisations of individuals thought to have low IQs. Those in the US who were forcibly sterilised in the aftermath of Buck v Bell were disproportionately poor or of colour.

Compulsory sterilisation in the US on the basis of IQ, criminality, or sexual deviance continued formally until the mid 1970s when organisations like the Southern Poverty Law Center began filing lawsuits on behalf of people who had been sterilised. In 2015, the US Senate voted to compensate living victims of government-sponsored sterilisation programmes.

IQ tests today

Debate over what it means to be “intelligent” and whether or not the IQ test is a robust tool of measurement continues to elicit strong and often opposing reactions today. Some researchers say that intelligence is a concept specific to a particular culture. They maintain that it appears differently depending on the context – in the same way that many cultural behaviours would. For example, burping may be seen as an indicator of enjoyment of a meal or a sign of praise for the host in some cultures and impolite in others.

What may be considered intelligent in one environment, therefore, might not in others. For example, knowledge about medicinal herbs is seen as a form of intelligence in certain communities within Africa, but does not correlate with high performance on traditional Western academic intelligence tests.

According to some researchers, the “cultural specificity” of intelligence makes IQ tests biased towards the environments in which they were developed – namely white, Western society. This makes them potentially problematic in culturally diverse settings. The application of the same test among different communities would fail to recognise the different cultural values that shape what each community values as intelligent behaviour.

Going even further, given the IQ test’s history of being used to further questionable and sometimes racially-motivated beliefs about what different groups of people are capable of, some researchers say such tests cannot objectively and equally measure an individual’s intelligence at all.

Used for good

At the same time, there are ongoing efforts to demonstrate how the IQ test can be used to help those very communities who have been most harmed by it in the past. In 2002, the execution across the US of criminally convicted individuals with intellectual disabilities, who are often assessed using IQ tests, was ruled unconstitutional. This has meant IQ tests have actually prevented individuals from facing “cruel and unusual punishment” in US courts of law.

In education, IQ tests may be a more objective way to identify children who could benefit from special education services. This includes programmes known as “gifted education” for students who have been identified as exceptionally or highly cognitively able. Ethnic minority children and those whose parents have a low income are under-represented in gifted education.

There is ongoing debate about the use of IQ tests in schools. via shutterstock.com

The way children are chosen for these programmes means that Black and Hispanic students are often overlooked. Some US school districts employ admissions procedures for gifted education programmes that rely on teacher observations and referrals or require a family to sign their child up for an IQ test. But research suggests that teacher perceptions and expectations of a student, which can be preconceived, have an impact upon a child’s IQ scores, academic achievement, and attitudes and behaviour. This means that teachers’ perceptions can also have an impact on the likelihood of a child being referred for gifted or special education.

The universal screening of students for gifted education using IQ tests could help to identify children who otherwise would have gone unnoticed by parents and teachers. Research has found that those school districts which have implemented screening measures for all children using IQ tests have been able to identify more children from historically underrepresented groups to go into gifted education.
IQ tests could also help identify structural inequalities that have affected a child’s development.

These could include the impacts of environmental exposure to harmful substances such as lead and arsenic or the effects of malnutrition on brain health. All these have been shown to have a negative impact on an individual’s mental ability and to disproportionately affect low-income and ethnic minority communities.

Identifying these issues could then help those in charge of education and social policy to seek solutions. Specific interventions could be designed to help children who have been affected by these structural inequalities or exposed to harmful substances. In the long run, the effectiveness of these interventions could be monitored by comparing IQ tests administered to the same children before and after an intervention.

Some researchers have tried doing this. One US study in 1995 used IQ tests to look at the effectiveness of a particular type of training for managing Attention Deficit/Hyperactivity Disorder (ADHD), called neurofeedback training. This is a therapeutic process aimed at trying to help a person to self-regulate their brain function. Most commonly used with those who have some sort of identified brain imbalance, it has also been used to treat drug addiction, depression and ADHD. The researchers used IQ tests to find out whether the training was effective in improving the concentration and executive functioning of children with ADHD – and found that it was.

Since its invention, the IQ test has generated strong arguments in support of and against its use. Both sides are focused on the communities that have been negatively impacted in the past by the use of intelligence tests for eugenic purposes.

The use of IQ tests in a range of settings, and the continued disagreement over their validity and even morality, highlights not only the immense value society places on intelligence – but also our desire to understand and measure it.

Daphne Martschenko, PhD Candidate, University of Cambridge

This article was originally published on The Conversation.