IEEE Systems, Man and Cybernetics Magazine - April 2020 - 11

VAT family of algorithms). Various members of the VAT
family have been applied to many applications, including
image segmentation, urban mobility, transportation,
speech processing, biomedical applications, social media,
and Web data analytics, on a variety of real-life data sets
with diverse characteristics and properties.
We hope that this detailed and systematic survey of the
VAT family of algorithms and their applications will help
researchers choose a useful member of the VAT family to
help them understand structural details in their data. This
article includes pseudocode for a suite of 25 algorithms in
the VAT family of models, and MATLAB implementations of selected algorithms are available on GitHub [1].
Clustering Tendency as a Tool for Exploratory Data Analysis
Unsupervised data-mining techniques, such as data clustering, are an important part of EDA, which aims to summarize
and visualize the main characteristics of the data before
developing complex statistical models and testing various
hypotheses about structure in the data. Recent advances in
sensing and storage technology and dramatic growth in
applications on the Internet, digital imaging, video surveillance, and the Internet of Things (IoT) have accelerated the
growth of data collection. With the ever-increasing availability of data across different disciplines, data clustering as a
fundamental tool for EDA has gained more significance.
Data clustering aims to divide the data into several groups
such that data points in each group are more similar to each
other in some well-defined sense than to the points in other
groups. Various clustering techniques have been developed
over the years by researchers in many fields, including taxonomists, social scientists, psychologists, biologists, statisticians,
mathematicians, engineers, computer scientists, and medical
researchers [2]. Some of the most popular clustering
approaches include hierarchical clustering (agglomerative and
divisive), centroid-based approaches (k-means, fuzzy c-means,
and so on), density-based algorithms [e.g., density-based spatial clustering of applications with noise (DBSCAN) and ordering points to identify the clustering structure (OPTICS)], and
distribution-based clustering [expectation maximization,
Gaussian mixture model (GMM), and so on] [3]-[5].
A natural question to ask before applying any clustering
method to a data set is, "Does this data set contain any clusters and, if so, how many?" A major issue with unsupervised
machine learning is that clustering methods will return
clusters that satisfy the constraints of the algorithm that
produces them, even if the data do not contain any clusters.
Blindly applying a clustering analysis to a data set will
divide the data into clusters, even if there are none, because
that is what the algorithm is supposed to do. Therefore,
before applying a clustering approach to a data set, the analyst must decide whether or not the data set contains meaningful clusters (i.e., nonrandom structures).
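To make this point concrete, the following minimal Python sketch (not taken from the article) runs a plain Lloyd-style k-means on points drawn uniformly at random from the unit square, i.e., data with no cluster structure. The function name kmeans, the choice of k = 3, and the sample size of 300 are illustrative assumptions.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers are k distinct data points
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster empties out
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels

# Structureless data: 300 points drawn uniformly from the unit square.
rng = random.Random(42)
uniform_points = [(rng.random(), rng.random()) for _ in range(300)]

labels = kmeans(uniform_points, k=3)
cluster_sizes = [labels.count(j) for j in range(3)]
print(cluster_sizes)  # a partition is reported although the data have no clusters
```

The algorithm partitions all 300 points into the requested groups without complaint, which is why the output of a clustering method alone cannot be taken as evidence that clusters exist.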
The issue of determining whether clusters are
present as a step before actual clustering is called the
clustering-tendency-assessment problem. Unfortunately,
this has received very little attention in the pattern-recognition and exploratory-data-analysis literature. Some techniques for clustering-tendency assessment are discussed
in [3] and [6], and they can be broadly split into two categories: statistical and visual. The statistical approaches to
clustering-tendency assessment, such as the dip test [7],
the Silverman test [8], and the Hopkins statistic [9], apply the random-position hypothesis to check whether or not the data
are generated from a continuous uniform distribution. A
detailed description of such techniques can be found in [3]
and [10]. Although these statistical approaches determine
whether it is worth looking for clusters in the given data
set by applying clustering algorithms, they provide little
information about how many clusters to look for, an input
required by many clustering algorithms. Moreover, statistical tests determine only whether the data fail to satisfy a
distributional assumption, so they impose a strong constraint on the definition of clusters in the data.
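As an illustration of the statistical route, the sketch below computes one common form of the Hopkins statistic in Python. It is a simplified version under stated assumptions, not the formulation of [9]: uniform probe points are drawn from the bounding box of the data, nearest-neighbor distances are raised to the power of the data dimension, and the convention is chosen so that values near 0.5 suggest spatial randomness while values near 1 suggest clustering. The function name hopkins and the sample size m are hypothetical choices.

```python
import math
import random

def hopkins(points, m, seed=0):
    """Hopkins-style statistic: about 0.5 for uniform data, near 1 if clustered."""
    rng = random.Random(seed)
    dim = len(points[0])
    lows = [min(p[i] for p in points) for i in range(dim)]
    highs = [max(p[i] for p in points) for i in range(dim)]

    def nn_dist(q, exclude=None):
        # Distance from q to its nearest data point (optionally skipping one index).
        return min(math.dist(q, p) for idx, p in enumerate(points) if idx != exclude)

    # w: nearest-neighbor distances for m randomly chosen data points.
    sample_idx = rng.sample(range(len(points)), m)
    w = [nn_dist(points[i], exclude=i) ** dim for i in sample_idx]

    # u: nearest-data-point distances for m uniform probes in the bounding box.
    probes = [tuple(rng.uniform(lo, hi) for lo, hi in zip(lows, highs))
              for _ in range(m)]
    u = [nn_dist(q) ** dim for q in probes]

    return sum(u) / (sum(u) + sum(w))

rng = random.Random(1)
uniform_data = [(rng.random(), rng.random()) for _ in range(400)]
clustered_data = (
    [(rng.gauss(0.25, 0.02), rng.gauss(0.25, 0.02)) for _ in range(200)]
    + [(rng.gauss(0.75, 0.02), rng.gauss(0.75, 0.02)) for _ in range(200)]
)

h_uniform = hopkins(uniform_data, m=40)
h_clustered = hopkins(clustered_data, m=40)
print(round(h_uniform, 2), round(h_clustered, 2))
```

Note that the statistic separates the two data sets but says nothing about how many clusters the second one contains, which is the limitation described above.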
Another class of clustering-tendency-assessment
approaches uses visual techniques to indicate whether or
not the data set contains possible clusters and, if so, how
many clusters to seek. Bezdek and Hathaway [11] introduced a visual approach for assessing cluster tendency,
VAT, that can be used in all cases involving numerical
data. VAT uses a variant of Prim's algorithm [12] to perform matrix reordering (seriation) of the pairwise dissimilarity matrix to generate a reordered dissimilarity matrix,
which, when viewed as a monochrome image (called an
RDI, or cluster heat map), shows possible clusters in the
data set by dark blocks along the diagonal.
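The reordering idea can be sketched in a few lines of Python. This is a minimal illustration of a Prim-like VAT ordering, assuming a symmetric dissimilarity matrix stored as nested lists; it is not the article's pseudocode or the MATLAB code referenced in [1], and the names vat_order and reorder are hypothetical. The ordering starts at one end of the largest dissimilarity and then repeatedly appends the unvisited object closest to any object already ordered.

```python
def vat_order(D):
    """Return a VAT-style ordering of object indices for dissimilarity matrix D."""
    n = len(D)
    # Start from an object that attains the largest dissimilarity in D.
    first = max(range(n), key=lambda i: max(D[i]))
    order = [first]
    remaining = [j for j in range(n) if j != first]
    # dist_to_ordered[j]: smallest dissimilarity from j to any ordered object.
    dist_to_ordered = list(D[first])
    while remaining:
        nxt = min(remaining, key=lambda j: dist_to_ordered[j])
        order.append(nxt)
        remaining.remove(nxt)
        for j in remaining:
            dist_to_ordered[j] = min(dist_to_ordered[j], D[nxt][j])
    return order

def reorder(D, order):
    """Apply the same permutation to the rows and columns of D."""
    return [[D[p][q] for q in order] for p in order]

# Two well-separated 1-D groups, listed in interleaved order.
x = [0.0, 10.0, 0.5, 10.5, 1.0, 11.0]
D = [[abs(a - b) for b in x] for a in x]

order = vat_order(D)
D_star = reorder(D, order)
print(order)  # → [0, 2, 4, 1, 3, 5]
```

In D_star, the two 3 x 3 diagonal blocks contain only dissimilarities of at most 1, while the off-diagonal blocks contain values of 9 or more; displayed as a grayscale image with small values dark, this gives the two dark diagonal blocks that signal two clusters.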
The literature on the VAT family of algorithms and their
applications is large and varied, with papers published in
many different journals and conference proceedings. This
diversity makes it difficult for researchers to follow recent
developments and determine the applicability of these
algorithms to their data. We believe that a survey of the
VAT family of algorithms and their applications is timely
and hope that it will be helpful for researchers choosing a
useful visualization algorithm for EDA.

Figure 1. The data scatterplot and VAT image for N = 5,000 Gaussian clusters: (a) data set of N = 5,000 and (b) VAT image of I(D*_E(X)).
