
with densely labeled tracks that allows more fine-grained audiovisual analysis for AD.
Dataset
Overview of the MuMMER Dataset
The MuMMER dataset [35] was built to facilitate research in robot perception, particularly for the evaluation of perception approaches. It consists of audiovisual recordings of people interacting with the robot in an open environment using a WoZ approach. The data were collected using the Pepper robot [48]. As shown in Figure 2(a), the robot has one frontal color camera and one Asus Xtion depth camera, which operate at 8 and 5 frames per second (fps), respectively, at a resolution of 640 × 480. In addition, during the recording, an external moving Intel D435 camera was placed on top of the tablet, and a static Kinect v2 camera was fixed behind the robot to capture the entire scene. Figure 2(b)-(d) shows the views from the three cameras. Having both a moving and a static camera gives the dataset the advantages of both perspectives: the Intel camera moves with the robot, while the Kinect camera observes the scene from a fixed viewpoint. The Kinect and D435 cameras were set to 15 fps at resolutions of 960 × 540 and 1,280 × 720, respectively. The audio streams were recorded with the four-channel microphone arrays of the Kinect and the robot at a sampling rate of 48 kHz. Further technical details can be found in the MuMMER article [35].
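For reference, the recording setup can be summarized as a small configuration structure. The following Python sketch is our own illustration (the names are hypothetical); the numeric values come from the specifications quoted above.

```python
# Illustrative summary of the MuMMER recording setup (hypothetical schema;
# the numeric values are taken from the dataset description above).
SENSORS = {
    "robot_color": {"fps": 8,  "resolution": (640, 480)},   # Pepper's frontal color camera
    "robot_depth": {"fps": 5,  "resolution": (640, 480)},   # Asus Xtion depth camera
    "kinect_v2":   {"fps": 15, "resolution": (960, 540)},   # static, behind the robot
    "intel_d435":  {"fps": 15, "resolution": (1280, 720)},  # moving, on top of the tablet
}
AUDIO = {"channels": 4, "sample_rate_hz": 48_000}  # Kinect and robot microphone arrays
```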
The MuMMER dataset scenario was designed such that
two or more participants interact with the robot (chatting,
quizzing, performing a guidance task, telling the news, or
telling jokes), while additional people are simply spectating.
Unlike most HRI datasets, the MuMMER dataset
allows the interactors to speak to each other and the
robot. These scenarios involve a multiparty conversation
between the robot and people. Different participants were involved in the conversation to make the scenario more realistic: passersby in the background, troublemakers trying to distract the robot's attention, people leaving the field of view while the robot is giving a direction, and, potentially, people coming back after a while to ask the robot about a direction or recommendation it made earlier. These settings make the MuMMER dataset unique and more realistic. The dataset was recorded in 33 sessions, for a total duration of 1 h 29 min. The shortest session is 1 min 6 s, and the longest spans 5 min 6 s. Each frame contains at most nine people. In total, 28 participants and 22 protagonists took part in the scenario recordings. Most frames show two people, while 25% of them have at least one spectator not interacting with the robot [35]. The MuMMER dataset consists of 80,488 Kinect color frames, 80,865 Kinect depth frames, 47,023 robot color frames, 23,450 robot depth frames, 80,346 Intel color frames, and 80,310 Intel depth frames. Overall, it contains 506,713 annotated faces with their identities.
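As a quick sanity check (ours, not from [35]), the Kinect color frame count is consistent with the reported total duration at 15 fps:

```python
# 80,488 Kinect color frames at 15 fps should cover roughly 1 h 29 min.
kinect_color_frames = 80_488
fps = 15
seconds = kinect_color_frames / fps        # ~5,365.9 s
minutes, secs = divmod(round(seconds), 60)
print(f"{minutes} min {secs} s")           # -> "89 min 26 s", i.e., ~1 h 29 min
```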
E-MuMMER Dataset
We ignored the depth frames and selected only the Kinect color frames and the Intel color frames, for a total of 2 h 58 min. We skipped the video recorded by the robot color camera for two reasons: it was recorded at 8 fps, which does not match the other cameras, and it contains fewer frames with visible face regions than the Kinect and Intel cameras. Most of its frames miss the speaking person's face because people stand very close to the robot [as shown in Figure 2(b)], so their upper body is not visible. DaVinci Resolve software [49] was used to fix the audiovisual synchronization problems that arose during the scenario recording.
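To see why the 8-fps robot stream is awkward to combine with the 15-fps streams, consider aligning frames across cameras by nearest timestamp. The sketch below is a generic illustration of such an alignment (the function is our own); E-MuMMER itself was synchronized with DaVinci Resolve, not with this code.

```python
import bisect

def nearest_frame(timestamps, t):
    """Index of the frame in a sorted timestamp list closest to time t."""
    i = bisect.bisect_left(timestamps, t)
    if i == 0:
        return 0
    if i == len(timestamps):
        return len(timestamps) - 1
    # Pick whichever neighbor is closer to t.
    return i - 1 if t - timestamps[i - 1] <= timestamps[i] - t else i

# Example: pair each 8-fps robot frame with its closest 15-fps Kinect frame.
kinect_ts = [k / 15 for k in range(150)]  # 10 s of Kinect timestamps
robot_ts = [k / 8 for k in range(80)]     # 10 s of robot timestamps
pairs = [(t, kinect_ts[nearest_frame(kinect_ts, t)]) for t in robot_ts]
```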
Addressee Annotation
The addressee labels are generated by human annotators using the interface shown in Figure 3. Each frame contains a bounding box around each visible face. The activity timeline below the window depicts the audio waveform used to label speech start and end timestamps. The speech segment start and end timestamps $[(t_{s_0}, t_{e_0}), (t_{s_1}, t_{e_1}), \ldots, (t_{s_n}, t_{e_n})]$ were selected manually.
The activity of each face was labeled according to the activities within that boundary, where $t_{s_0}$ indicates the start timestamp of the segment at index zero, $t_{e_0}$ indicates the end timestamp of the segment at index zero, and $n$ is the number of speech segments in the video.
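Conceptually, each annotated speech segment can then be viewed as a (start, end, face identity, label) record. The following Python schema is purely illustrative, not the released E-MuMMER annotation format:

```python
from dataclasses import dataclass

@dataclass
class SpeechSegment:
    """One manually selected speech segment (hypothetical schema)."""
    t_start: float  # t_{s_i}: segment start timestamp, in seconds
    t_end: float    # t_{e_i}: segment end timestamp, in seconds
    face_id: int    # identity of the annotated face
    label: int      # speech activity type, e.g., 0 = addressing the robot

# Example: a face addressing the robot between 2.4 s and 5.1 s.
segments = [SpeechSegment(t_start=2.4, t_end=5.1, face_id=3, label=0)]
```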
Figure 2. (a) An example scene and camera settings. (b)-(d) The views from the three cameras [35]: the (b) robot (R), (c) Kinect v2, and (d) Intel D435.
There are two speech activity types in E-MuMMER: "Speaking to the robot/addressing the robot," labeled as "0," and
