The ATA Chronicle - September/October 2021 - 33

RESOURCE REVIEW by Yuri Balashov
OPUS-CAT: A State-of-the-Art
Neural Machine Translation
Engine on Your Local Computer
Neural machine translation (NMT) is one of the success stories of deep
learning and artificial intelligence. Revolutionary innovations in
computational architectures made in 2015-2017 have led to dramatic
improvements in the quality of machine translation (MT) and changed
the field forever. Some professional translators welcome these changes
with enthusiasm, others less so. But everyone has to deal with them.
Historically, the relationship between human translation and MT has
been uneasy and complicated, but an increasing number of players in
both fields are now coming to view it as synergistic.1
To leverage this synergy,
MT developers,
disappointed with the
quality of crowdsourced
evaluation of MT output,
are heeding advice from
professional translators.
The latter, in turn, are using
MT suggestions alongside
translation memory (TM)
matches to facilitate work
on their projects, especially
since many computer-assisted
translation (CAT) tools now
have plugins for popular
generic MT engines. And one
should never forget that the
multilingual data used to train
MT systems has, at the end of
the day, a very human origin.
Most general-purpose
MT systems are trained on
everything they can find,
scrape, clean, and align
from the huge amount of
multilingual data available.
But most professional
human translation is
specialized, which is what
gives it real value. Yet the
desire to automate everything
is unstoppable. Tech giants,
custom MT providers, and
larger language services
providers are actively working,
even as you read this, on
domain adaptation: the
fine-tuning of MT systems
to particular areas such as
software localization, clinical
trials, or pharmaceuticals.
Domain adaptation is a hot
topic in academic MT research
and figures prominently
at the annual machine
translation competitions.
To make use of the MT
adaptation offered by
commercial companies,
translators are encouraged
to upload their TMs and
term bases to a cloud-based
engine in the hope
that the system will learn,
for example, to translate
"conductor" as "electric
conductor" and not "music
conductor." Normally this
requires a paid subscription,
with confidentiality of the
uploaded resources promised
in return. There's no reason
to distrust such assurances.2
Yet some translators may
hesitate to press the Upload
button and commit their
golden super-confidential
resources, resulting from
years of painstaking work and
generating steady revenue,
to the cloud. And some may
want to know more about
what happens, technically
speaking, to the uploaded
data on the other side of the
internet connection.
One could handle all such
issues by installing, training,
and using a customizable NMT
system, such as OpenNMT or
Marian NMT (see the links in
the sidebar on page 37), on a
local machine. But one should
be prepared to use command
line programming and
deal with very intimidating
installation and debugging
pipelines. Even if the process
is successful, before the
system is fine-tuned for a
specific domain (e.g., oil and
gas, aerospace engineering)
it needs to be trained from
scratch on 10+ million
parallel sentences from
generic bilingual corpora,
such as OPUS. Doing it on a
central processing unit (CPU)
machine would take several
months (and you would need
to put it in a freezer). Using
a graphics processing unit (GPU)
might be a better option. But
to harness its full power, you
need to know what to do with
it, which is another pain. It's safe
to say that 99% of us don't
have the requisite skills to do
any of that.
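The "parallel sentences" mentioned above are simply aligned source/target pairs, which is also what a TM stores. As a minimal sketch of that idea, the Python snippet below pulls such pairs out of a TM exported in the TMX exchange format. The function name tmx_pairs, the sample segment, and the English-German language pair are illustrative assumptions; this is not a description of how OpenNMT, Marian NMT, or any commercial provider ingests data.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this namespaced key.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def tmx_pairs(tmx_text, src="en", tgt="de"):
    """Return (source, target) segment pairs found in a TMX document.

    Hypothetical helper for illustration only. Language codes are compared
    on their primary subtag, so "en-US" matches "en".
    """
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter("tu"):  # one translation unit per aligned segment
        segs = {}
        for tuv in tu.iter("tuv"):  # one variant per language
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also collects text nested inside inline tags.
                segs[lang.split("-")[0]] = "".join(seg.itertext()).strip()
        if segs.get(src) and segs.get(tgt):
            pairs.append((segs[src], segs[tgt]))
    return pairs


# Tiny made-up TM with a single unit, for demonstration only.
SAMPLE = """<tmx version="1.4"><header/><body>
<tu>
  <tuv xml:lang="en-US"><seg>The conductor is damaged.</seg></tuv>
  <tuv xml:lang="de-DE"><seg>Der Leiter ist defekt.</seg></tuv>
</tu>
</body></tmx>"""

if __name__ == "__main__":
    for source, target in tmx_pairs(SAMPLE):
        print(source, "|", target)
```

A generic engine needs millions of pairs like this one; a domain-specific TM supplies far fewer, but in exactly the terminology the translator cares about.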
What if someone else did
all of that for us? And we
could simply enjoy having a
pre-trained state-of-the-art
NMT system on our PC, totally
free, in a familiar Windows
environment giving us an
opportunity to fine-tune it
with our local and exclusive
resources? And, even better,
integrate it into our CAT tools?
Sounds too good to be true?
Enter OPUS-CAT.
What's OPUS-CAT?
OPUS is one of the largest
collections of publicly
available bilingual corpora
in many language pairs,
widely used for training
MT systems. OPUS-MT, an
ongoing project led by Jörg
Tiedemann, a professor of
language technology at the
University of Helsinki, and
funded by several European
Union and local agencies, is
a growing repository of over
a thousand pre-trained NMT
models intended to be used
with Marian NMT, an efficient
and robust framework written
in C++, which can run on
Windows computers. OPUS-CAT
is a collection of MT tools
built around Marian NMT
and developed by Tommi
Nieminen,3 an experienced
professional translator and
MT researcher. It's largely due
to his unique combination of
skills that we have OPUS-CAT
at our fingertips.
American Translators Association 33
http://www.ata-chronicle.online
