1.1 Introduction
There are
competing tensions in the phonetic transcription of disordered speech. First,
there is the need to produce as accurate a transcription as possible to aid in
the analysis of the speech of the patient being investigated, and to inform the
patterns of intervention that will be planned in remediation. Opposed to this
requirement is the problem of reliability; it has been suggested that the more
detailed a transcription is, the less reliable it tends to be. This research
project has two main aims: to test whether the recent introduction of
specialist symbols for aspects of atypical speech can produce reliable
transcriptions, and whether aspects of acoustic instrumentation can help resolve
disagreements in transcription. The overall objective, therefore, is to produce
a set of measures which, if undertaken by clinical phoneticians and speech and
language therapists, will improve the description of disordered speech and
thereby facilitate appropriate therapeutic measures.
1.2 Transcription
1.2.1 Narrow and Broad Transcription
As just noted,
accurate phonetic transcription is required for adequate analysis of the
patterns of disordered speech in a patient, and therefore for effective and efficient
intervention by the therapist. This importance has been highlighted by many
researchers in the field. For example, Carney (1979) warned of the dangers of
inappropriate abstraction in transcription by using broad, less detailed,
symbolization reflecting the phonological units of the target pronunciation,
and thereby running the danger of over- or underestimating the patient's
phonological abilities.
Such dangers are discussed also in
Buckingham and Yule (1987), who note (p.123) "without good phonetics,
there can be no good phonology". Their focus of interest is in 'phonemic
false evaluation', a process whereby listeners assign sounds to a particular
category or sound unit of the target system, ignoring differences at a phonetic
(or 'sub-phonemic') level. With speech disordered patients this often results
in sounds being categorized as belonging to units other than that intended by
the speaker. Buckingham and Yule stress the importance of accurate phonetic
description to allow for an analysis that distinguishes between a disorder that
involves phonological simplification (e.g. complete merger of certain
phonological units), and one where phonetic differences between the patient's
and the target system need to be highlighted.
Ball (1988) and Ball, Rahilly and Tench
(1996) illustrate some of the problems associated with a broad transcription of
disordered speech data. An example from the latter source is given below, using
material from disordered child phonology:
Speaker B.
Age 6;9. Broad Transcription.
pin [pɪn] ten [ten]
bin [pɪn] done [tʌn]
cot [kɑːt] pea [piː]
got [kɑːt] bee [piː]
This data set
suggests that there is a collapse of phonological contrast: specifically the
contrast between voiced and voiceless plosives in word-initial position. This
clearly leads to homonymic clashes between, for example, 'pin' and 'bin' and
'cot' and 'got' respectively. As word-initial plosives have a high functional
load in English, such a loss of the feature contrast [±voice] in this context
clearly requires treatment. It would appear from this data that an initial stage
of treatment would concentrate on the establishment of the notion of contrast
with these sounds, before going on to practise the phonetic realization of this
contrast.
However, if we look at a narrow
transcription of the same data, the picture alters.
Speaker B.
Age 6;9. Narrow Transcription.
pin [pʰɪn] ten [tʰen]
bin [pɪn] done [tʌn]
cot [kʰɑːt] pea [pʰiː]
got [kɑːt] bee [piː]
It is clear
from this transcription that there is not, in fact, a loss of contrast between
initial voiced and voiceless plosives. Target voiceless plosives are realized
without vocal fold vibration (voice), but with aspiration on release (as are
the adult target forms). The target voiced plosives are realized without
aspiration (as with the adult forms), but also without any vocal fold
vibration. It is this last difference that distinguishes them from the target
form. For, while adult English 'voiced' plosives are often devoiced for some of
their duration in initial position, totally voiceless examples are rare.
The narrow transcription shows, therefore,
that the difference between the speaker's pronunciation of these sounds and the
target is minimal. The notion of contrast does not need to be established and,
as aspiration is the main acoustic cue used by adults to perceive the
difference between these groups of plosives, the child's speech may well sound
only slightly atypical.
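The difference between the two transcriptions can also be checked mechanically. The following is a minimal sketch, not part of the proposed method: it groups the eight target words by their transcribed form and reports which of them fall together. The function name `homophone_sets` and the dictionary layout are illustrative assumptions, and the vowel symbols are a best reading of the example; only the treatment of the initial plosive matters.

```python
from collections import defaultdict

# Transcribed forms for the eight target words. The vowel symbols are
# illustrative; only the initial plosive is relevant to the argument.
BROAD = {
    "pin": "pɪn", "bin": "pɪn", "ten": "ten", "done": "tʌn",
    "cot": "kɑːt", "got": "kɑːt", "pea": "piː", "bee": "piː",
}
NARROW = {
    "pin": "pʰɪn", "bin": "pɪn", "ten": "tʰen", "done": "tʌn",
    "cot": "kʰɑːt", "got": "kɑːt", "pea": "pʰiː", "bee": "piː",
}

def homophone_sets(transcriptions):
    """Return sorted groups of target words that share one transcribed form."""
    by_form = defaultdict(list)
    for word, form in transcriptions.items():
        by_form[form].append(word)
    return sorted(sorted(words) for words in by_form.values() if len(words) > 1)

print(homophone_sets(BROAD))   # [['bee', 'pea'], ['bin', 'pin'], ['cot', 'got']]
print(homophone_sets(NARROW))  # []
```

Under the broad transcription three pairs appear to be homophones; once aspiration is recorded, none remain.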
1.2.3 Transcriber Reliability
While it can
be demonstrated that narrow transcriptions of disordered speech are important
to avoid the kinds of misanalysis shown above, there is also evidence to
suggest that narrow phonetic transcription — as opposed to broad — often
produces problems of reliability. Reliability as used in this context has two
main exponents: inter-judge and intra-judge reliability. Inter-judge
reliability refers to measures of agreement between separate transcribers when
dealing with the same data. Agreement measures usually involve one-to-one
comparison of the symbols and the diacritics used, though it is possible to
refine such measures through including features such as 'complete match',
'match within one phonetic feature' (such as voice, place, etc.), and
'non-match'. Intra-judge reliability refers to measures of agreement between
first and subsequent transcriptions of the same material by the same judge.
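The three-way agreement measure described above can be sketched as follows. This is an illustration under stated assumptions rather than the scoring procedure of any published study: the feature table `FEATURES` covers only six consonants and uses a simple voice/place/manner description, and the names `compare_segments` and `agreement` are hypothetical. The two transcriptions are assumed to be already aligned segment by segment.

```python
# Hypothetical feature table for a handful of consonants (an assumption for
# illustration; a real study would need the full IPA and extIPA inventories).
FEATURES = {
    "p": ("voiceless", "bilabial", "plosive"),
    "b": ("voiced", "bilabial", "plosive"),
    "t": ("voiceless", "alveolar", "plosive"),
    "d": ("voiced", "alveolar", "plosive"),
    "s": ("voiceless", "alveolar", "fricative"),
    "z": ("voiced", "alveolar", "fricative"),
}

def compare_segments(a, b):
    """Classify two symbols as 'complete', 'near' (one feature apart) or 'non-match'."""
    differing = sum(x != y for x, y in zip(FEATURES[a], FEATURES[b]))
    if differing == 0:
        return "complete"
    return "near" if differing == 1 else "non-match"

def agreement(first, second):
    """Percentage of each match category over two aligned transcriptions."""
    if not first or len(first) != len(second):
        raise ValueError("transcriptions must be non-empty and aligned")
    counts = {"complete": 0, "near": 0, "non-match": 0}
    for a, b in zip(first, second):
        counts[compare_segments(a, b)] += 1
    return {category: 100.0 * n / len(first) for category, n in counts.items()}

judge_1 = ["p", "t", "s", "d"]
judge_2 = ["p", "d", "z", "p"]
print(agreement(judge_1, judge_2))
# {'complete': 25.0, 'near': 50.0, 'non-match': 25.0}
```

The same function serves for intra-judge reliability if the two arguments are the first and the repeat transcription by one judge.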
In order for narrow transcriptions to be
trusted as bases for analysis and remediation, it is important that issues of
reliability are dealt with. However, as Shriberg and Lof (1991) have noted,
barely three dozen studies from the 1930s onwards have addressed this issue.
Their own study was based on a series of transcriptions of different patient
types undertaken for other purposes over several years. They used a series of
different transcriber teams to undertake broad and narrow transcriptions of the
data, and then compared results across consonant symbols, vowel symbols and
diacritics (these last based on the Shriberg and Kent, 1982, system). Their
results cover a large range of variables, but in essence there is a good level
of agreement (inter- and intra-judge) with broad transcription, but on most
measures narrow transcription does not produce acceptable levels of
reliability.
Shriberg and Lof's (1991) study is
clearly an important one, but suffers from the fact that the data used was not
primarily intended for such an investigation, that the symbols utilized lacked
the recent developments towards a comprehensive set for atypical speech sounds,
and that access to acoustic instrumentation was not available.
1.2.4 The extIPA symbols
One of the
problems encountered with the transcription of disordered speech is that the
transcriber is likely to need to deal with non-normal speech sounds while using
a transcription system devised only to deal with the speech sounds of natural
language. The International Phonetic Alphabet (IPA) is the symbol system used
by most clinical phoneticians and speech-language therapists. It was, however,
drawn up to transcribe the range of speech sounds found normally in language.
There are numerous possible speech sounds not recorded in natural language that
nevertheless occur with relative frequency in a range of speech disorders. One
possible explanation, then, for the sorts of reliability results reported in
Shriberg and Lof (1991) could lie in the fact that transcribers have not always
been adequately equipped to undertake narrow transcription through a lack of
specialist symbolization.
Ball (1988, 1991), Duckworth et al
(1990) and Ball et al (1994) have charted the development of specialist symbol
systems for disordered speech from the 1970s to the present. This work
culminated in the adoption of the 'Extensions to the International Phonetic
Alphabet for the transcription of disordered speech and voice quality', now
known by the abbreviation 'extIPA'. This system is described in Duckworth et al
(1990), with additions noted in Bernhardt and Ball (1993); examples of the
system in use with a variety of atypical segmental and suprasegmental speech
are available in Ball (1991) and Ball et al (1994).
The extIPA system introduces a range of
new symbols and diacritics to cope with non-normal place and manner of
articulation, phonatory activity, nasalization, nasal and velopharyngeal
friction, and reiteration, together with means of marking prosodic
features such as voice quality, tempo, loudness and pausing. A range of
atypical speech, including children's articulation disorders, cranio-facial
disorders, fluency problems and acquired neurogenic disorders in adults can be
covered by these symbols, though it must be recalled that many phonologically
disordered patients may never use such atypical sounds.
These symbols are gradually being
introduced into the training of speech-language therapists in Britain, though
until now there has been no research undertaken to see whether the use of this
dedicated symbol set has enabled high inter- and intra-judge reliability scores
to be obtained in the transcription of speech disordered patients. It may well
be the case that the use of this symbolization will allow transcribers to avoid
the tendency to abstract away from 'difficult' sounds to a symbol used for a
more familiar similar sound, caused by the lack of a symbol specifically for
the sound in question. On the other hand, we may find an 'overload' effect, in
that transcribers will find it difficult to learn and/or apply a still larger
set of symbols than the standard IPA set.
It is one of the aims of this study to
investigate the effect of the extIPA system on reliability measures in the
transcription of disordered speech. To this end, the speech of a variety of
patient types will be investigated. Of most interest will be speech that
contains at least some atypical sounds, so that reliability in the specific
subset of the extIPA symbols can be investigated as well as overall. Nevertheless,
patients with less severe disordered speech will also be included to see
whether the additional training the transcribers receive in extIPA might aid
their abilities with the IPA itself.
1.2.5 Instrumental Analysis
Shriberg and
Lof (1991) conclude their study by pointing to a future 'marriage' of
instrumental phonetic description with impressionistic transcription (see also
Ball 1988), to overcome the problems of narrow transcription reliability.
Recent studies show that this development is beginning to occur with some
clinical phonetics cases (Klee and Ingrisano 1992; Ball and Rahilly
1993). Recent software development also highlights the growing use of computer
technology as an aid to speech analysis and transcription. Most notable in this
regard is the Kay Elemetrics CSL Phonetics Tutorial, which provides spectrographic
and electropalatographic traces for a wide range of IPA symbols that can be
matched with traces captured by the user to aid in the correct assignment of
symbols. This system is currently being extended for Kay by the applicant to
the extIPA symbol set.
Another aim of this study will be to determine how
far the CSL Tutorial program can aid in transcribing the speech samples
collected. This system will be referred to after impressionistic transcriptions
have been analysed, to see to what extent — if any — it can resolve
uncertainties and inconsistencies between transcribers and between
transcriptions of the same transcriber.
1.3 Research Questions
The research
questions to be addressed in this project are as follows.
1) What level
of inter-judge reliability is found in the narrow phonetic description of
disordered speech using additional symbols specifically designed for this area
(the extIPA symbols)?
2) What level
of intra-judge reliability is found in the narrow phonetic description of
disordered speech using the extIPA?
3) What
relation, if any, exists between inter- and intra-judge disagreements and the
type and severity of speech disorder, and the type of speech sample (i.e.
word-list as opposed to spontaneous speech)?
4) To what
extent can consensus be reached on disagreements in transcription through
accessing acoustic instrumental analyses of the speech samples, and what sound
types are most liable to such agreement?
5) How can the
results of the project inform a training and analysis programme to maximise
reliability in the narrow transcription of disordered speech?
1.4 Data Collection and Method
1.4.1 Initial Training
The research
assistants will undergo a short period of intensive training in the use of the
extIPA symbols, conducted by the applicant.
1.4.2 Subjects
Subjects will
be accessed through existing links with local Speech and Language Therapy
services in both Health Centres and Hospitals, and through patients working
with other members of the academic staff in the Department of Communication.
Selection criteria are those of type of
disorder and severity of disorder. We intend investigating five types of speech
disorder that should illustrate a range of atypical speech sounds covered by
extIPA symbols. These are: child articulation disorders, cranio-facial
disorders (cleft palate), adult disfluency (stuttering), adult apraxia of
speech, and adult dysarthria. As it is the disordered speech that is the focus
of the study, there is no requirement to match subjects in terms of age, sex,
time since onset etc.
In terms of severity, we wish to include
both severe and moderately disordered data in our analyses. To this end we will
seek to select two subjects in the severe grouping and two in the moderate
grouping for each disorder type, resulting in 20 subjects in toto. The
classification of subjects into moderate and severe groupings before
undertaking an analysis of their speech is not, of course, straightforward. In
this regard we will rely on the judgements of the subjects' speech and language
therapists and our own informal assessment. Permission from the University
Ethics Committee has been obtained for the use of subjects' speech in this
study.
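The sampling design can be laid out explicitly as a check on the subject total. The sketch below simply enumerates the cells described above; the variable names are illustrative assumptions and carry no significance beyond this example.

```python
from itertools import product

DISORDER_TYPES = [
    "child articulation disorder",
    "cranio-facial disorder (cleft palate)",
    "adult disfluency (stuttering)",
    "adult apraxia of speech",
    "adult dysarthria",
]
SEVERITIES = ["moderate", "severe"]
SUBJECTS_PER_CELL = 2  # two subjects per disorder type and severity grouping

# One cell per combination of disorder type and severity.
cells = list(product(DISORDER_TYPES, SEVERITIES))

# One slot per subject to be recruited into each cell.
slots = [
    (disorder, severity, n)
    for disorder, severity in cells
    for n in range(1, SUBJECTS_PER_CELL + 1)
]

print(len(cells), len(slots))  # 10 20
```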
1.4.3 Data
The data to be
collected will be of two types. First, spontaneous speech will be elicited from
the subjects. This will naturally differ in amount and topic from subject to
subject, but for the purposes of narrow phonetic transcription, a large amount
of such material is not necessary. To aid direct comparability each subject
will also be required to undertake a standard picture elicitation procedure (in
this case that of the Edinburgh Articulation Test). This will also allow us to
investigate the claim of Shriberg and Lof (1991) that continuous speech
produces higher reliability scores in narrow transcription than do word-lists.
The data will be recorded on high
quality portable digital audio tape (DAT) recorders. Where possible,
subjects will be recorded in the Phonetics Laboratory of the University of
Ulster; otherwise a quiet area will be utilized to ensure good quality
recordings.
Video recordings will be made of all
data acquisition sessions, as visual information is important for transcribing
certain sounds (including atypical sounds such as linguolabials and
dentolabials).
1.4.4 Analysis
All recordings
will be transcribed by all three researchers as soon as possible after they are
made. Transcriptions will be repeated after two months to minimize the effect
of memory of the first transcription session. Transcriptions will be only of
segmental information; an examination of reliability in suprasegmental
(prosodic) transcription is beyond the scope of this project. The
transcriptions will be narrow (i.e. aiming to include the maximum amount of
information); there will be no comparison with broad transcriptions as we know
from previous studies (see Shriberg and Lof 1991) that they consistently
produce high reliability scores, though as noted above their accuracy is doubtful.
Finally, the focus of the transcription is on the consonant system; precise
values of vowels will not be sought, though features such as nasality will be
marked.
To assess reliability, a straightforward
matching procedure will be undertaken. Unlike Shriberg and Lof (who were
comparing broad and narrow transcription ratings) we do not intend to
discriminate between symbols and diacritics; the match will be between
segments, whether these are represented by a symbol alone or a symbol plus
diacritic. In cases of mismatch, we will note whether the mismatch is near
(within one phonetic feature) or not near (more than one phonetic feature
different).
As with Shriberg and Lof (1991),
agreement tables for symbols and for phonetic features (such as place, voicing
etc.) will be drawn up, with word position (initial, medial and final)
identified. Percentage agreements will be worked out together with measures of
near agreement. Non-parametric
inferential statistics will be used to support trends in the data. As well as
inter- and intra-judge reliability, we will examine the relationships between
subject type and severity of disorder, and examine disagreement trends between
the three judges.
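An agreement table of the kind described might be assembled along the following lines. This is a hedged sketch only: the records are invented to show the bookkeeping, the function name `pairwise_agreement` is hypothetical, and exact symbol identity is used as the match criterion (near agreement would need a feature comparison in place of the equality test).

```python
from collections import defaultdict
from itertools import combinations

# Each record gives a word position and the symbol chosen by judges A, B and C.
# The records are invented purely for illustration.
RECORDS = [
    ("initial", {"A": "pʰ", "B": "pʰ", "C": "p"}),
    ("initial", {"A": "t", "B": "t", "C": "t"}),
    ("medial", {"A": "s", "B": "ʃ", "C": "s"}),
    ("final", {"A": "k", "B": "k", "C": "k"}),
    ("final", {"A": "n", "B": "n", "C": "m"}),
]

def pairwise_agreement(records):
    """Percentage of exact symbol matches per word position, pooled over judge pairs."""
    matches = defaultdict(int)
    totals = defaultdict(int)
    for position, symbols in records:
        for first, second in combinations(sorted(symbols), 2):
            totals[position] += 1
            matches[position] += symbols[first] == symbols[second]
    return {pos: 100.0 * matches[pos] / totals[pos] for pos in totals}

for position, percent in pairwise_agreement(RECORDS).items():
    print(f"{position:8s} {percent:5.1f}%")
```

Pooling over the three judge pairs gives one figure per word position; keeping the pairs separate instead would show whether one judge accounts for most of the disagreement.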
Following each transcription, acoustic
instrumental analysis of relevant parts of the tape will be undertaken using
the Kay Elemetrics CSL system of the Phonetics Laboratory of the University of
Ulster. This will concentrate on examining areas of disagreement at both the
inter- and intra-judge level. The transcribers may access the Kay Phonetics
Tutorial programs for both IPA and extIPA symbols to help in examining these
disagreements, and if consensus can be reached through this procedure, it will
be noted separately. We will then examine any trends in reaching consensus through
the use of acoustic instrumentation.
1.5 Strategic Implications
One of the
main aims of this project is to provide principled guidance in the undertaking
of narrow phonetic transcription of disordered speech. It is hoped that a
programme may be drawn up to guide clinical phoneticians and speech and
language therapists, as well as lecturers on communicative disorders degree
courses, on how best to approach this task. It should show which sound types (both
normal and disordered) regularly demonstrate high levels of disagreement, and
which sound types seem most amenable to the aid of instrumental analysis. It is
expected it will also demonstrate the value of the extIPA symbols and the
extIPA CSL Tutorial, and so aid in the further dissemination of this new tool.
1.6 Expected Outputs
Results will
be made available to all those with an interest in the outputs of the research
through a variety of channels. A final report on the project will form the
basis of a workbook in phonetic description of disordered speech aimed at
students and therapists to improve their skills in this area.
We would also aim to publish several
papers in the academic journals in the field of communication disorders and
phonetics. The applicant has considerable experience in publishing in this
area, and would be able to aid the research assistants to increase their
publishing profile.
We would aim to present papers at the
Annual Congress of the International Clinical Phonetics and Linguistics
Association (Montreal, 1998), at the XIV International Congress of Phonetic
Sciences, Berkeley (August 1999), and at an appropriate annual Convention of
the American Speech-Language-Hearing Association. Work in progress would be
reported to the irregular meetings of the British and Irish Group of the
International Clinical Phonetics and Linguistics Association (ICPLA-BIG).
References
Ball, M. J. (1988) The contribution of speech pathology to the
development of phonetic description. In Ball, M. J. (ed.), Theoretical
Linguistics and Disordered Language. London: Croom Helm.
Ball, M. J. (1991) Recent developments in the transcription of
non-normal speech. Journal of Communication Disorders, 24, 59-78.
Ball, M. J. and Rahilly, J. (1993) Transcribing disfluent speech: a case study.
ICPLA North-West Pacific Regional Group Meeting, University of British Columbia.
Ball, M. J., Code, C., Rahilly, J. and Hazlett, D. (1994) Non-segmental
aspects of disordered speech: Developments in transcription. Clinical Linguistics and Phonetics, 8,
67-83.
Ball, M. J., Rahilly, J. and Tench, P. (1996) The Phonetic
Transcription of Disordered Speech. San Diego: Singular Press.
Bernhardt, B. and Ball, M. J. (1993) Characteristics of atypical speech
currently not included in the Extensions to the IPA. Journal of the International
Phonetic Association, 23, 35-38.
Buckingham, H. W. and Yule, G. (1987) Phonemic false evaluation:
theoretical and clinical aspects. Clinical Linguistics and Phonetics, 1,
113-25.
Carney, E. (1979) Inappropriate abstraction in speech-assessment procedures.
British Journal of Disorders of Communication, 14, 123-35.
Duckworth, M., Allen, G., Hardcastle, W. and Ball, M. J. (1990)
Extensions to the International Phonetic Alphabet for the transcription of
atypical speech. Clinical Linguistics and Phonetics, 4, 273-80.
Klee, T. and Ingrisano, D. (1992) Clarifying the transcription of
indeterminable utterances. Paper presented at ASHA Convention, San Antonio.
Shriberg, L. and Kent, R. D. (1982) Clinical Phonetics. New York:
Macmillan.
Shriberg, L. and Lof, G. (1991) Reliability studies in broad and narrow
transcription. Clinical Linguistics and Phonetics, 5, 225-79.
2.1 Summary
The literature amply illustrates the problem of
inaccurate description of disordered speech through imprecise phonetic
transcription of clinical speech material. Such inaccurate description will
often result in wrong diagnosis and thus an inappropriate management programme
being implemented. There is also the danger of inaccurate prognosis, with a
knock-on effect on resource planning.
This project will investigate inter- and intra-judge
reliability measures for the narrow transcription of a range of disordered
speech types when transcribers are trained in the use of the new symbols:
'Extensions to the International Phonetic Alphabet for the Transcription of
Disordered Speech' (extIPA). It will further ascertain the effect of access to
acoustic phonetic data on the resolution of transcription disagreements.
3.1 Aims
1) To investigate what level of inter-judge
reliability is found in the narrow phonetic description of disordered speech
using additional symbols specifically designed for this area (the extIPA
symbols).
2) To investigate what level of intra-judge
reliability is found in the narrow phonetic description of disordered speech using
the extIPA.
3) To ascertain what relation, if any, exists between
inter- and intra-judge disagreements and the type and severity of speech
disorder, and the type of speech sample (i.e. word-list as opposed to
spontaneous speech).
4) To evaluate the extent to which consensus can be
reached on disagreements in transcription through accessing acoustic
instrumental analyses of the speech samples, and what sound types are most
liable to such agreement.
5) To produce a programme of training and analysis to
maximise reliability in the narrow transcription of disordered speech.
3.2 Method
The research assistant will undergo a short period of
intensive training in the use of the extIPA symbols and acoustic analysis
(where necessary), conducted by the applicants. The research assistant will be
responsible for the data collection.
Subjects will be accessed through existing links with
local Speech and Language Therapy services in both Health Centres and
Hospitals, and through patients working with other members of the academic
staff in the School of Behavioural & Communication Sciences. Selection
criteria are those of type of disorder and severity of disorder. We intend
investigating five types of speech disorder that should illustrate a range of
atypical speech sounds covered by extIPA symbols. These are: child articulation
disorders (developmental verbal dyspraxia), cranio-facial disorders (cleft
palate), adult disfluency (stuttering), adult apraxia of speech, and adult
dysarthria. As it is the disordered speech that is the focus of the study,
there is no requirement to match subjects in terms of age, sex, time since
onset etc.
In terms of severity, we wish to include both severe
and moderately disordered data in our analyses. To this end we will seek to
select two subjects in the severe grouping and two in the moderate grouping for
each disorder type: resulting in 20 subjects in toto. The classification of
subjects into moderate and severe groupings before undertaking an analysis of
their speech is not, of course, straightforward. In this regard we will rely on
the judgements of the subjects' speech and language therapists and our own
informal assessment. Permission from the University Ethics Committee will be
obtained for the use of subjects' speech in this study.
The data to be collected will be of two types. First,
spontaneous speech will be elicited from the subjects. To aid direct
comparability each subject will also be required to undertake a standard
picture elicitation procedure. This will also allow us to investigate claims
that continuous speech produces higher reliability scores in narrow
transcription than do word-lists. The data will be recorded on high quality
digital audio tape (DAT) recorders.
3.3 Analysis
All recordings will be transcribed by all three
researchers as soon as possible after they are made. Transcriptions will be
repeated after three months to minimize the effect of memory of the first
transcription session. Transcriptions will be only of segmental information; an
examination of reliability in suprasegmental (prosodic) transcription is beyond
the scope of this project. The transcriptions will be narrow, i.e. aiming to
include the maximum amount of information. The transcription will cover both
the consonant and the vowel systems.
Following each transcription, acoustic instrumental
analysis of relevant parts of the tape will be undertaken in the Phonetics
Laboratory of the University of Ulster. This will concentrate on examining
areas of disagreement at both the inter- and intra-judge level. We will then
examine any trends in reaching consensus through the use of acoustic
instrumentation.
3.4 Timescale
Year 1: training of RA in use of extIPA symbols and
acoustic instrumentation; commencement of data acquisition.
Year 2: further data acquisition; data analysis and
re-analysis sessions.
Year 3: completion of analysis sessions; preparation
of transcription guidelines programme; dissemination of results; preparation of
final report.
4.1 Novelty
No research has been undertaken on the use of the
extIPA symbols in narrow transcription of disordered speech. While work exists
on transcription reliability in both normal and disordered speech using the
ordinary International Phonetic Alphabet, no attempts have been made to provide
guidelines in transcription linking transcription with acoustic
instrumentation.
4.2 Significance
Virtually all practising Speech-Language Therapists
utilise phonetic transcription in their description of the speech of their
clients, as few have access to instrumental techniques. This research therefore
is important, as it will provide explicit, principled guidance in the
undertaking of narrow phonetic transcription of disordered speech. It is hoped
that a programme may be drawn up to guide clinical phoneticians and speech and
language therapists, as well as lecturers on communicative disorders degree
courses, on how best to approach this task. It should show which sound types (both
normal and disordered) regularly demonstrate high levels of disagreement, and
which sound types seem most amenable to the aid of instrumental analysis.