The ATA Chronicle - May/June 2021 - 25

The politics behind language must also be acknowledged when conducting terminology work.

[...] corpora intended to produce [...] a major challenge and source of bias is the use of stereotypical identification methods that too narrowly define who gets to be considered "native." For instance, a person with the last name Smith may or may not speak English as a native language. They may also speak Spanish as a native language, or any other language spoken on the planet. Assuming who a native speaker is based on superficial data like people's names does not and should not suffice for professional terminology work.

Here are some tips to achieve well-formed corpora free of bias:

Concentrate on the volume of words. To substantiate whether a term is indeed part of the special language used by a community of SMEs, a wide variety of texts is needed. Terminologists should look for an unbiased sample of texts, size being one indication of this. According to Khurshid Ahmad, professor of computer science at Trinity College Dublin, and Margaret Rogers, professor of translation and terminology studies and director of the Centre for Translation Studies at the University of Surrey, empirical studies suggest that the vocabulary used in special-language texts is much smaller than the vocabulary in general-language texts. Therefore, starting off with a small corpus in a technical field is a good way to avoid bias.5

Make sure the texts included are not translations. Why is it important not to include translations in your corpora? According to Mona Baker, emeritus professor of translation and intercultural studies and modern languages and cultures at the University of Manchester, universal characteristics of translation include the tendency to lean toward explicit communication, simplified language, and a safe middle between covert and overt translation.6 These characteristics mean that translations do not replicate the way SMEs communicate together in a single language and are therefore not good candidates for technical terminology. (When producing client-specific terminology based upon past translations, however, a successful project is not possible unless past translations are consulted.)

Include many authors, and make sure those authors are SMEs writing for other SMEs. Why is this important? Well, if Microsoft documents are the only documents included in a corpus intended to produce IT terminology, the term extraction results will be biased toward Microsoft jargon rather than producing the shared special language used by IT professionals no matter their company affiliation. Additionally, SMEs employ simplified language in texts in which they explain their trade to lay people. When SMEs address other SMEs, they use the special language of their trade naturally, so corpora should be filled with documents produced with this audience in mind to more accurately produce the technical terminology of a trade.

At the core of terminology management is an understanding of how people conceptualize objects and ideas and how that can be leveraged to influence audiences and lead them to take action.

Terminology Extraction: Human Validation Required

Terminology extraction is more successfully conducted when one can critically engage with the types of challenges being navigated, which machines cannot do at this point. To understand these challenges, we'll start by looking at how extraction is carried out by a machine. During machine extraction, the large batch of words in the corpus is analyzed for groups of words that follow the patterns of how terms are normally expressed in that language. Most terms are nouns. Those nouns are single words or compounds. So, when teaching a machine to carry out extraction for English, the machine would be taught to identify frequent occurrences of single words or compounds that follow these patterns (among others): noun + noun; noun + noun + noun; noun + of + noun, etc.

Despite claims about the ability of artificial intelligence to replicate and even replace humans, it's important to keep in mind that automatic extraction results are currently far from a point at which they can be used without human validation. For languages in which the patterns used to construct terms are well identified and have been taught successfully to machines, automatic extraction produces many collocations, or groupings of words that frequently appear in a corpus but are not actually terms. Figure 5 on page 26 shows the results from a very small corpus of computer-assisted translation (CAT)-related documentation. The results include a number of invalidated terms, including "in the document" and "the number of" among additional collocations. As you can tell, the configuration that produced the extraction results has simply not yet been refined by a qualified human to produce higher quality results.

Out-of-the-box term extractors are currently not widely available in all languages. Anecdotally, I've observed poorer results from automatic extractions for languages like Korean and Thai, if extraction is
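
To make the pattern-based approach described in this section more concrete, here is a minimal sketch in Python of how a machine might count frequent single nouns and compounds that follow the English patterns named in the article (noun + noun; noun + noun + noun; noun + of + noun). It is an illustration only, not the method used by any particular extraction tool. It assumes the text has already been split into words and tagged with parts of speech by some other tool; the tag name "NOUN", the pattern list, the function names, and the `min_count` threshold are all hypothetical choices made for this example.

```python
from collections import Counter

# Hypothetical pattern inventory based on the article's English examples.
# "NOUN" matches any token tagged as a noun; "of" matches the literal word.
PATTERNS = [
    ("NOUN",),
    ("NOUN", "NOUN"),
    ("NOUN", "NOUN", "NOUN"),
    ("NOUN", "of", "NOUN"),
]


def slot_matches(slot, word, tag):
    """Return True if a (word, tag) pair fits one slot of a pattern."""
    if slot == "NOUN":
        return tag == "NOUN"
    return word.lower() == slot


def extract_candidates(tagged_tokens, min_count=2):
    """Count word sequences that fit a pattern and keep the frequent ones.

    tagged_tokens is a list of (word, tag) pairs produced by an
    assumed upstream part-of-speech tagger.
    """
    counts = Counter()
    for start in range(len(tagged_tokens)):
        for pattern in PATTERNS:
            window = tagged_tokens[start:start + len(pattern)]
            if len(window) < len(pattern):
                continue  # not enough tokens left for this pattern
            if all(slot_matches(s, w, t) for s, (w, t) in zip(pattern, window)):
                counts[" ".join(w.lower() for w, _ in window)] += 1
    return {term: n for term, n in counts.items() if n >= min_count}


# Tiny hand-tagged sample; a real corpus would be far larger.
sample = [
    ("The", "DET"), ("translation", "NOUN"), ("memory", "NOUN"),
    ("stores", "VERB"), ("segments", "NOUN"), (".", "PUNCT"),
    ("A", "DET"), ("translation", "NOUN"), ("memory", "NOUN"),
    ("speeds", "VERB"), ("up", "ADP"), ("translation", "NOUN"),
    (".", "PUNCT"),
]

candidates = extract_candidates(sample)
for term, count in sorted(candidates.items()):
    print(term, count)
```

Note that the sketch returns "memory" and "translation" as separate candidates alongside "translation memory." The output is only a list of candidates: deciding which of them are genuine terms of the trade is the kind of validation step that, as the article argues, still requires a qualified human.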
