Cardiovascular Research 53 (2002) 279–285 / locate / cardiores
Language and publication in Cardiovascular Research articles
R. Coates*, B. Sturgeon, J. Bohannan, E. Pasini
Centro Linguistico dell’Università di Brescia, Contrada Santa Chiara, 25100 Brescia, Italy
P.G.C.E. L’Università degli Studi di Bergamo, Bergamo, Italy
Fondazione ‘Salvatore Maugeri’, IRCCS Gussago, Italy
Received 15 October 2001; accepted 5 November 2001
Background: The acceptance rate for authors whose mother tongue is not English is generally much lower than for native
English-speaking authors. The scientific quality of an article is obviously the principal reason for publication. However, is
editorial rejection based purely on scientific grounds? English mother-tongue writers publish more than non-mother-tongue
writers, so are editors discriminating linguistically? We therefore decided to survey language errors in manuscripts submitted
for publication to Cardiovascular Research (CVR). Method: We surveyed language errors in 120 medical articles which had been
submitted for publication in 1999 and 2000. The language ‘error’ categories were divided into three principal groups
(grammatical, structural and lexical), which were then further sub-divided into key areas. The articles were corrected without
any knowledge of the author’s nationality or of the corrections made by other language researchers. After an initial
correction, a sample of the papers was cross-checked to verify reliability. Results: The control groups of US and UK authors
had almost identical acceptance rates and overall ‘error’ rates, indicating that the language categories were also objective
for the other nationalities. Although there was no direct relationship between the acceptance rate and the number of language
errors, there was a clear indication that badly written articles correlated with a high rejection rate. The US/UK acceptance
rate of 30.4% was higher than for all the other countries. The group with the lowest acceptance rate (9%, Italian) also had
the highest error rate. Discussion: Many factors could influence the rejection of an article. However, we found clear
indications that carelessly written articles could often have either a direct or a subliminal influence on whether a paper was
accepted or rejected. On equal scientific merit, a badly written article will have less chance of being accepted, even if the
editor who rejects the paper does not necessarily identify language problems as a motive for rejection. A more detailed look
at the types and categories of language errors is needed. Furthermore, we suggest the introduction of standardised guidelines
in scientific writing. © 2002 Elsevier Science B.V. All rights reserved.
1. Introduction
Various editors of important medical journals [1–3] have
indicated the importance of well-written scientific research.
Today, research written in English is the principal means of
spreading scientific knowledge. The subject of publication
and the nationality of authors has been touched on in this
journal in the past [4], and physicians whose native
language is not English have additional problems when
presenting work for publication. Publications are available
which look at the problem from a strictly medical approach
to the IMRAD (Introduction, Materials, Results And
Discussion) structure [5] or which give English
mother-tongue doctors an outline of how to write medical
articles [6–9]. However, an analysis of the influence of
language on publication in medical research has, to our
knowledge, not been made.
*Corresponding author.
E-mail address: [email protected] (R. Coates).
How can we define an article which is ‘well written’?
Given the large number of non-native English language
physicians, this question should be answered in two words:
‘simple’ and ‘clear’. Unfortunately this is exactly the
opposite of how many writers, even native English users,
write medical research. We therefore decided to analyse the
language problems which could affect the clarity of
medical writing.
In co-operation with Cardiovascular Research, we
analysed 120 IMRAD articles which had been presented
for publication by authors of eight different nationalities.
Given that we were looking for problems which made an
article difficult to understand, we had to consider style as
well as grammatical errors.
0008-6363/02/$ – see front matter © 2002 Elsevier Science B.V. All rights reserved.
PII: S0008-6363(01)00530-2
R. Coates et al. / Cardiovascular Research 53 (2002) 279 – 285
Some authors have indicated general language areas
which could create problems for comprehension [10]. The
problem with style is, however, objectivity; what is
difficult to understand for one editor might be perfectly
acceptable for another. It should also be noted that we give
a strict definition of ‘error’ categories in the Methods
section, and that this definition is limited to this research.
2. Methods
2.1. Grammatical errors
2.1.1. Passives
Different sources [7–9] have indicated that high passive
use makes a text obscure to read and difficult to understand.
We generally agree with this statement, although it
should be noted that in a medical text which follows an
IMRAD structure there are times when use of the passive
is necessary. Indeed, a very clear ratio of passive to active
has been shown for each of the IMRAD sections [11].
Thus, ‘A separate group of animals underwent coronary
artery perfusion . . . ,’ would count as one active use.
‘Coronary artery perfusion was performed on a separate
group of animals . . . ’ would count as a passive use.
The total number of verbs was then counted and the
result expressed as a single figure, i.e. passive verbs
divided by active verbs.
Generally, once a subject is introduced, it is quite normal
to use the passive. For this reason the Methods section
usually has a high passive to active ratio, as one subject is
used throughout, e.g. ‘We studied our patient group over a
six-month period. The group was interviewed at the
beginning of the study and informed consent was obtained
according to guidelines from our ethics department. The
group was sub-divided . . . ’ etc.
Furthermore, we looked at the overall average active to
passive ratio to simplify these ratios. This method was a
little simplistic but gave a clear indication of whenever use
of the passive was more than for an English language
2.1.2. Tense
We simplified our definition of verb tense use as much
as possible, using certain US sources [6,7] as the basic
criteria. Day [7] underlined that in a scientific work any
reference to a work which had been previously published,
and therefore accepted by the scientific community, should
be written in the present tense. Any reference to current
research work (i.e. that carried out and described by the
author) should be in the simple past tense (i.e. what the
authors did). The exceptions were the introduction of new
concepts (the present perfect tense), tables (the simple
present tense) and ‘reporting verbs’, e.g. said, found,
discovered etc. (the simple past tense). This ‘neatly’
simplifies tense use in scientific work, especially in
the Discussion section, where authors switch back and
forth between their current research and the published
scientific literature.
However, it should be noted that this is a simplification
of possible verb tense use. Indeed, the ‘IMRAD’ structure
is itself a simplification of how medical work could be
presented. Many currently published texts (especially those
by UK English authors) use a considerably more complex scheme.
2.1.3. General grammar problems
These were all grammatical errors which should have
been eliminated either by mother-tongue colleagues or
by professional translators before submission to an editor.
They included, for example, third-person errors, plural
errors and preposition errors.
2.2. Structural errors (syntax)
We again restricted our categories to countable and
objectively verifiable groups.
2.2.1. Long sentences
We came across sentences which were often a paragraph
long, or in which the original subject was lost by the time
the end of the sentence was reached. For practical reasons we
placed in this category any sentence which had more than one
subordinate clause (unless there was a clearly defined reason,
e.g. a list of procedures in an experiment).
2.2.2. Word order
By mixing up a simple subject + verb + complement
structure, a sentence very often became totally
incomprehensible. We included in this category split infinitives,
out-of-place subordinate clauses, etc.
It should be noted that often just one or two confused
sentences could make a complete IMRAD section very
difficult to understand, so these categories could
be relatively more important than others.
2.3. Lexical errors (word choice)
2.3.1. Jargon
These were words (or groups of words) which were
unnecessarily obscure or complicated for no apparent
reason. Thus a child would become a ‘paediatric patient’,
experimental mice were ‘sacrificed’ instead of killed etc.
Note that specific medical terminology was very seldom a
problem for any nationality and was not included in this
category. To be sure that the authors of this work
considered the same words as jargon, each word was
written down and agreed as such by consensus. Individual
prepositions, articles etc. were considered in the grammar
2.3.2. Noun misuse
A common specific lexical problem was the use of a
group of words with a noun when either a verb, adjective
or adverb would have been clearer and simpler [12]. Thus:
‘a recovery was achieved in a quick way . . . ’ instead of
‘being a quick recovery . . . ’
‘we studied the data in a statistical manner . . . ’ instead
of ‘statistically,’
‘we made an analysis of the data . . . ’ instead of
‘analysing the data . . . ,’ etc.
Most lexical corrections were not necessarily ‘true
errors’ as such. However, they were often pompous and
unclear, thus obscuring the scientific data presented. Many
of these items have been noted by various authors on this
subject [6–9]. Used sparingly, many of these words would
not even be noticed; however, in abundance and together with
other structural and grammatical errors, the result could
confuse the scientific message of an article. One hundred
and eighteen articles were surveyed (two were discounted
for procedural reasons) and the general error counts
together with acceptance percentages were noted (see
Table 1).
Each article was read twice and the errors in each of the
categories were counted. In the first reading, the passive–
active rate was counted and in the second all other errors
were counted. The research was ‘blind’: we had no idea
which country the articles came from, as all references to
country / hospital etc. as well as the title and authors had
been taken out by the editor of CVR. A sample of 10% of
the articles was double-checked by a second researcher
who had no idea of the error counts of the first. Finally a
third researcher (a medical doctor) was consulted to check
lexical items which had been considered errors and give a
third opinion on any of the categories.
We divided the language categories first into the three
principal linguistic areas, i.e. grammatical, structural and
lexical. We then divided each of these groups into more
specific areas following indications both of leading editors
(Day [7], Zeiger [6] and O’Connor [8]) on clarity in
medical texts and of our own previous findings (Coates
[12,13]). Given that nearly all articles should have already
been checked for spelling errors, printing errors etc., we
had to define exactly what we considered ‘errors’ to be.
The number of errors per nation can be seen in Table 2.
The overall error count was deemed to be reliable with a
standard deviation of ±4.2 errors in articles with an
average of 39 errors.
However, the tense category was more difficult to agree
upon. As stated before, the original criteria for tense errors
were a simplification to make writing easier. In written
English there are numerous different ways of expressing
scientific knowledge and especially of discussing it. While
some authors were confident with this usage, others
(notably non-English speakers) were not. Therefore we
were unable to find a clear definition of when tense use
constituted an error or not.
This was interesting in our control groups (US and UK),
as the only relevant difference between the two groups was
in the tense group (apart from active–passive use). This
would suggest that US writers prefer the simpler definition
we have already given, while British writers were more
willing to use more complex forms. We were unable to
judge, however, whether one form was clearer or ‘better’
than the other. We did nevertheless include the data in the
tables as a point of reference.
The fact that both the error rate and the acceptance rate
for the two control groups were very similar would suggest
that they did offer a good comparison. Furthermore,
although some of these studies were far from perfect, they
did represent a standard. Thus the average of the two
control groups (22.5 errors per article) was considered the
zero point, i.e. what a normal native English speaker would
produce. The average of the other nationalities’ errors
(40.3) was considered the ‘upper error limit’, i.e. with more
errors than this, language would begin to make an
article difficult to read, as can be seen in Table 3.
Table 1
Publication and average ‘error’ rates
% Acceptance
Overall error rate / article
United States
United Kingdom
Table 2
General error frequencies (average no. errors / article)
General grammar errors
Long sentences
Word order
Noun misuse
Table 3
Percentage of articles with low and high ‘error’ rates per nationality
% of articles with fewer than control average errors (22.5)
% of articles with more than total average errors (40.3)
United States
United Kingdom
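The zero point and upper error limit described in the Methods lend themselves to a short worked example. The following is only a minimal sketch in Python: the two threshold values (22.5 and 40.3) come from the text, while the function names and the sample error counts are hypothetical and are not the study's data.

```python
# Thresholds taken from the text; everything else here is illustrative.
CONTROL_AVERAGE = 22.5    # "zero point": mean errors per article, US/UK controls
UPPER_ERROR_LIMIT = 40.3  # mean errors per article for the other nationalities


def classify(errors: float) -> str:
    """Place one article in a band according to its total error count."""
    if errors < CONTROL_AVERAGE:
        return "low"
    if errors > UPPER_ERROR_LIMIT:
        return "high"
    return "intermediate"


def band_percentages(error_counts: list[float]) -> dict[str, float]:
    """Percentage of articles below the control average and above the upper limit."""
    if not error_counts:
        raise ValueError("at least one article is required")
    total = len(error_counts)
    low = sum(1 for e in error_counts if classify(e) == "low")
    high = sum(1 for e in error_counts if classify(e) == "high")
    return {"low": 100 * low / total, "high": 100 * high / total}


# Hypothetical error counts for four articles from one nationality.
print(band_percentages([10, 30, 50, 20]))
```

With these made-up counts, half of the articles fall below the control average and a quarter exceed the upper error limit, which is the kind of per-nationality figure that Table 3 reports.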
3. Results
As might be expected, there was no direct relationship
between the average number of language errors and the
acceptance rate (see Table 1). However, there was,
naturally, a large difference between mother-tongue writers
and the other countries in the number of articles
with low error rates (below the control average number of
errors). These would represent those articles where
language (hopefully) would have no effect, either good or
bad, on the scientific value of a work. Furthermore, there
were considerably more articles with a high error rate
(above the total average of errors) from countries whose
publication rates were considerably lower than the control
(see Table 3). The exception was the Spanish
papers, which had a small number of both high and low error
3.1. Grammatical errors
3.1.1. Passive–active ratio
This ratio varied greatly both between individual authors
and between national averages. We found that normal
ratios for our controls were as follows (i.e. passive verbs
divided by active verbs): Abstract, 0.6; Introduction, 0.7;
Materials and methods, 2.0; Results, 0.67; Discussion 0.6.
Average passive:active use per nationality can be seen in
Table 4.
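The ratio used here (passive verbs divided by active verbs) can be illustrated with a small sketch. This is a minimal example in Python under stated assumptions: the control ratios are those quoted above, but the function names and the verb counts passed in are hypothetical, and the verbs are assumed to have been counted by hand as in the study.

```python
# Control ratios quoted in the text (passive verbs divided by active verbs).
CONTROL_RATIOS = {
    "Abstract": 0.6,
    "Introduction": 0.7,
    "Materials and methods": 2.0,
    "Results": 0.67,
    "Discussion": 0.6,
}


def passive_active_ratio(passive: int, active: int) -> float:
    """Return passive verbs divided by active verbs for one section."""
    if active <= 0:
        raise ValueError("at least one active verb is required")
    return passive / active


def sections_above_control(counts: dict[str, tuple[int, int]]) -> list[str]:
    """List the sections whose ratio is higher than the control ratio.

    `counts` maps a section name to (passive verbs, active verbs).
    """
    return [
        section
        for section, (passive, active) in counts.items()
        if passive_active_ratio(passive, active) > CONTROL_RATIOS[section]
    ]


# Hypothetical hand counts for two sections of one article.
example = {"Introduction": (9, 10), "Results": (5, 10)}
print(sections_above_control(example))
```

In this made-up article only the Introduction uses the passive more heavily than the controls. As the text goes on to stress, such a frequency figure is only a rough indicator and says nothing about whether the subject of each passive is clear.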
Table 4
Average passive / active use rate
Average passive / active ratio
United States
United Kingdom
An interesting point was that the British authors used the
passive considerably more than the US authors. This would fit
the general description that the US authors tended to prefer a
simpler style. However, there was no indication that this
affected the acceptance rate at all. The high passive use by
German authors reflected the high natural passive use in
the German language.
In reality, active–passive use is a much more complex
subject than mere frequency. When a subject is unequivocal
(for example, when the subject of an experiment in the
Materials section has been well introduced), passive use is
perfectly acceptable. It is only when an author had
different subjects that this created problems for comprehension.
3.1.2. Tense
Given the procedural problems in agreeing on a
definition of ‘tense’ errors in scientific work, we had to
conclude that this category was not an
accurate indicator of problems. We suggest that more
research should be carried out to reach a clear definition of
what editors expect in this category. Indeed, there was no clear
difference between the nationalities. Interestingly, the
Japanese and the Swedes had fewer corrections in this
group than the control. This probably reflected the
simpler constructions used by these authors.
3.1.3. General grammatical errors
Given that these errors should have been eliminated
before presentation for publication, this category could be
considered a ‘general sloppiness’ category. Furthermore,
errors in this category would be picked up almost instantly by
most referees and editors. It is therefore not surprising that
the group with the highest average in this category
(Italian authors) also had the highest general error count
and the lowest acceptance rate (Table 2).
3.2. Structural errors
Although the smallest group in terms of numbers, the
structural category was probably more important in terms
of overall comprehension. Thus only a few long rambling
sentences (often as long as a paragraph) would sometimes
make a whole article incomprehensible, whereas a
relatively large number of lexical ‘errors’ would often have
no effect on an otherwise well-written article.
3.2.1. Long sentences
These were possibly the single most obvious problem
especially as the controls wrote relatively short sentences.
Indeed, several sentences seemed to try to hide rather than
clarify medical data! The following was a typical example.
‘The soluble form of B2 micro-globulin (B2 m) HLA class
I heavy chain (FHC) consists of three size variants,
namely the intact lipid-soluble 43 kDa heavy chain (A
variant), released through a shedding process; the truncated water soluble 39 kDa heavy chain (B variant),
which lacks the trans-membrane segment and is produced
by an alternative RNA splicing and the 34–36 kDa (C
variant), which lacks the trans-membrane and intratoplasmatic portion of the molecules.’
Simply breaking such sentences into a number of
smaller ones, or using suitable connectors, would have made
them considerably easier to understand. German
and French authors had twice as many long sentences
as the other nationalities.
3.2.2. Word order
Given the importance of word order in English (with no
agreeing nouns / adjectives, declining verbs etc.), this category was very important for simple comprehension. Given
that the word order of the controls was very simple, this
problem was even more evident. The French and Italians
had the highest ‘error’ rate in this category.
Some examples:
‘In all patients, bioptic material was taken and was studied in the
period from December 1999 to May 2000.’
corrected to
‘Bioptic material was taken from all patients in the
period between December 1999 and May 2000. It was then studied.’
‘Brown detected, after LSD-treatment, by in-situ hybridisation,
striking regional and cellular differences in the
rabbit spinal cord.’ (Was it Dr. Brown or the rabbits who
had had the LSD treatment?)
corrected to
‘Brown detected striking regional and cellular differences,
by in-situ hybridisation, after LSD treatment in the
rabbit spinal cord.’
3.3. Lexical
3.3.1. Jargon
Numerically this group was the biggest source of
‘errors’ although taken independently many of the words
considered here as being unnecessarily complex could be
perfectly acceptable. However, we took as our criteria the
following points: (1) did the word / s appear in one of the
numerous lists of ‘words to avoid’ [6–9] already published? (2) Did a simpler word / s exist? (3) Did the word / s
unnecessarily complicate the text? We then sub-divided
these word categories into three groups, (a) confusing
words, (b) unnecessary words, (c) inaccurate words.
For the sake of uniformity, each lexical ‘error’ was
written down independently by the two language specialists
and further checked by the medical specialist. We therefore
also noted these lexical items in some of the ‘well-written’
articles (i.e. those with fewer errors than the control mean).
In context, many of
these articles could be considered to be without any real
mistakes at all. Many lexical ‘errors’ only really
became a potential problem for editors when they were
added to other errors, thus creating a ‘fog of words’.
3.3.2. Noun misuse
We divided this category from the jargon category
because noun misuse was generally very widespread.
This involved using nouns when either a verb, adjective
or adverb would have been simpler, easier to understand
and less pretentious. Furthermore this ‘error’ also mirrored
correct language use in the other languages.
‘are in agreement with . . . ’ rather than ‘agree with’ (a noun used instead of a verb)
‘The care of the patient . . . ’ rather than ‘patient care’ (a noun used instead of an adjective)
‘in recent years . . . ’ rather than ‘recently’ (a noun used instead of an adverb)
For an indicative list of some examples see Table 5.
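The noun-misuse pattern can be made concrete with a simple phrase lookup. The sketch below, in Python, is purely illustrative: it uses only phrase pairs quoted in this article, the function name `suggest_simpler` is hypothetical, and the study itself relied on manual reading by language specialists rather than any automated check.

```python
import re

# Phrase pairs quoted in this article: noun construction -> simpler form.
SIMPLER_FORMS = {
    "are in agreement with": "agree with",
    "the care of the patient": "patient care",
    "in recent years": "recently",
    "in a statistical manner": "statistically",
}


def suggest_simpler(text: str) -> list[tuple[int, str, str]]:
    """Return (position, phrase found, simpler alternative), sorted by position."""
    lowered = text.lower()
    hits = []
    for phrase, simpler in SIMPLER_FORMS.items():
        for match in re.finditer(re.escape(phrase), lowered):
            hits.append((match.start(), phrase, simpler))
    return sorted(hits)


sentence = "Our results are in agreement with studies published in recent years."
for _, phrase, simpler in suggest_simpler(sentence):
    print(f"'{phrase}' could be written as '{simpler}'")
```

A lookup of this kind can only flag candidates. Whether a given phrase actually obscures the text remains a matter of judgement, which is why the lexical items in this study were agreed by consensus.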
4. Discussion
The purpose of this study was not to try to find a direct
correlation between language errors and acceptance rates.
Obviously all papers are accepted or rejected on scientific
merit rather than literary skill. However, we wanted to
pin-point certain language problem areas which could
either directly or subliminally affect the possibility of an
otherwise sound medical work being rejected. Summarising
the language areas we looked at, we can say the following:
1. Passive use. Apart from the Materials (patients) section,
the norm in medical articles was to have as high an
active–passive ratio as possible. However, if the subject
is clearly defined then the passive is acceptable.
2. Tense. We were unable to outline an objective tense use
in this study, principally due to the different use of tense
by US and UK authors. However, we would prefer the
simplest possible use of tense as outlined by Day [7],
i.e. the past tense to refer to the current work being
described and the present tense to describe other
published work (see exceptions). This does not cover all
potential English use but it does considerably simplify
the task of writing, especially if the author is not
mother-tongue English.
3. General grammar errors. When a research paper is
presented for publication, there should be no general
grammar errors. If there are, it means that the work has
not been checked by either mother-tongue colleagues or
professional scientific writers. A computer spell checker
alone is not enough!
4. Long sentences. Avoid sentences with more than one
subordinate clause. Shorter sentences in English denote
a simple style and clearer science.
5. Word order. In English the word order is fundamental
for understanding due to the lack of declensions or
agreeing adjectives, nouns etc. Thus a simple word
structure (in simple sentences), i.e. subject + verb +
object, would be easier to understand.
6. Jargon. Given that only a relatively small circle of
doctors will be comfortable with the precise vocabulary
of any given specialisation, there is already a lot of
effort required to understand a text without complicating
general language. If a simpler alternative exists, use it.
7. Noun misuse. Given the formation of many European
languages, the ‘misuse’ of nouns in English was very
common. However, in English these structures tend to
be overly complex, if not necessarily an error. The verb
is the strongest means of conveying meaning in English.
Table 5
Lexical categories
Lexical category
(a) Confusing word use
Paediatric patient
Studies in the scientific literature
It could be hypothesised
The above mentioned . . .
Experience a meaningful response
Until healing occurs
At variance with
The termination of
The number was fewer
In recent years
In a first step
Not in a specific way
By optical observation
Are in agreement with
The necessity of
Is the possibility to
For the treatment of
In detecting the presence of
Until healed
Were fewer
Not specifically
To treat
On detecting
(b) Inaccurate word use
(c) Unnecessary word use
Noun misuse
(a) instead of adjectives (participles)
(b) Instead of adverbs
(c) Instead of a verb
Generally we did not find a direct correlation between
the number of ‘errors’ written and the final acceptance rate
of articles presented for publication to Cardiovascular
Research. There was, however, a closer relationship between
the number of well-written articles and acceptance
rates. That is, a well-written article would be judged solely
on its scientific merit without any language interference.
The partial exception was the Spanish articles,
none of which were very well written although a high
proportion were reasonably written. Thus it would seem
that a well-written medical article was one which had as
little ‘language interference’ as possible, i.e. as simple as
possible.
A large-scale, cross-reference survey including such data
as study design and data management is needed to
indicate exactly how important language interference is in
medical writing. We suggest that further work be done on
this subject to make its importance clear to publishing doctors.
Furthermore, we would suggest that standardised, if not
universal, guidelines be drawn up to make the work of both
medical writers and editors easier.
[1] Smith R. The case for structuring the discussion of scientific papers.
BMJ Educ 1999;318:1224–1225.
[2] Horton R. The rhetoric of research. Br Med J 1995;310:985–987.
[3] Rennie D. The present state of medical journals. Lancet
[4] Opthof T. Submissions, publications and reviewers from Europe:
focus on Spain. Cardiovasc Res 1999;43:265–267.
[5] Hall G, editor, 2nd ed, How to write a paper, BMJ Books, 1998.
[6] Zeiger M. In: 2nd ed, Essentials of writing biomedical research
papers, 1999.
[7] Day R. In: 5th ed, How to write a scientific paper, 1998.
[8] O’Connor M. In: 2nd ed, Writing successfully in science, 1992.
[9] Goodman N, Edwards M. In: 2nd ed, Medical writing: a prescription
for clarity, 1997.
[10] Kirkman J. Writing in English for an international readership. BMJ
Educ Debate 1996;313:1321–1323.
[11] Heslop J. Tense and other indexical markers in the typology of
scientific tests in English. In: Hoedt J et al., editor, Pragmatics and
LSP, Copenhagen School of Economics, 1982, pp. 83–103.
[12] Coates R. The use of jargon in Italian scientific research. In: II
National Congress A.N.C.E, 2000.
[13] Coates R. Errors in medical publications. In: 1st theoretical–practical course of ‘scientific writing’, Bari: IRCCS, 2001.