IEEE Power & Energy Magazine - May/June 2018 - 22
One approach is to convert each long series of demand data
to a single two-dimensional (2-D) point that can be plotted
in a simple scatter plot.
least typical 1% of points in black. The remaining points are
divided into two groups with the orange points more typical
than the yellow points. The blue numbers show the ranking
of anomalous points. The most anomalous point (number 1)
corresponds to the data shown in Figure 3.
The colors can also be interpreted as corresponding to
highest density regions in the original K-dimensional space.
This way of plotting the data easily allows us to see the anomalies, identify any clusters of observations in the data, and
examine any other structure that might exist.
Visualizing PMU Data
Since the first prototype PMUs were developed by Virginia
Tech in 1988, networked PMUs have been rapidly deployed
in the last few years. As of early 2016, China and the United
States have the world's largest PMU networks, with each totaling more than 2,000 PMUs in operation. Unlike the existing
supervisory control and data acquisition systems that provide
measurements every 2-4 s, PMUs can report data, with accurate and precise time stamps, 10-60 times per second. Consequently, we receive large volumes of high-dimensional PMU
data continuously, day in and day out. Taking 30 PMUs, for
example, the system operator needs to manage approximately
15 MB of data per minute, 20 GB per day, 140 GB per week,
or 7 TB per year. The volume of PMU data will increase dramatically when thousands of PMUs are installed.
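As a rough check on these figures, the short sketch below scales the quoted 15 MB per minute to longer horizons. It assumes decimal units (1 GB = 1,000 MB) and a 365-day year, neither of which is stated in the text, and the function name `volume_mb` is purely illustrative.

```python
# Scale the quoted 15 MB/min figure (for 30 PMUs) to longer horizons.
# Assumptions: decimal units (1 GB = 1000 MB) and a 365-day year.
MB_PER_MINUTE = 15.0


def volume_mb(minutes: float, mb_per_minute: float = MB_PER_MINUTE) -> float:
    """Return the data volume in MB accumulated over the given minutes."""
    return minutes * mb_per_minute


per_day_gb = volume_mb(60 * 24) / 1000
per_week_gb = per_day_gb * 7
per_year_tb = per_day_gb * 365 / 1000

print(f"{per_day_gb:.1f} GB/day, {per_week_gb:.1f} GB/week, {per_year_tb:.2f} TB/year")
```

The results come out slightly above the rounded values quoted in the text, which suggests those values are order-of-magnitude estimates rather than exact conversions.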
The problem of "too much data, too little information"
must be solved, as it is becoming increasingly difficult
for system operators to make use of the raw PMU data for
real-time decision making. On the one hand, there is an
explosion in the availability of high-rate data streams due
to advances in PMU monitoring devices, leading to data
overload. On the other hand, there is limited understanding
of how to extract actionable information from these data-intensive monitoring devices for real-time monitoring and
control purposes. Big data visual analytics offers a way forward, helping to convert these big data streams into actionable insight in real time and aiding the development
of next-generation energy management systems. In this section, we will demonstrate the most basic dimension reduction technique, PCA, as a fundamental tool for the initial
steps of visualizing PMU data.
A Simple Dimension Reduction Tool: PCA
PCA, first proposed in 1901, is one of the most popular
dimension-reduction techniques. Using PCA, we can remove
the correlation between the variables and select only a few
linearly uncorrelated variables to represent the original
data. We can view PCA as a form of orthogonal rotation,
where the new axes capture the maximum variance of
the data. The orthogonal directions of maximum variance
can be identified by carrying out an eigenvalue and eigenvector
analysis of the covariance matrix of the sample data, so that
the maximum variance corresponds to the largest eigenvalue. The transformed new variables are called the principal components, and the first few principal components
can explain most of the variance of the data. Thus, we only
require a reduced set to represent most of the information
from the original data.
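The procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `pca_eig` and the synthetic six-channel data (a shared signal plus small per-site noise) are assumptions made only for the example.

```python
import numpy as np


def pca_eig(X: np.ndarray, n_components: int):
    """PCA via eigen-decomposition of the sample covariance matrix.

    X has shape (n_samples, n_variables). Returns (scores, components,
    eigenvalues, mean), with eigenvalues sorted from largest to smallest.
    """
    mean = X.mean(axis=0)
    Xc = X - mean                             # center each variable
    cov = np.cov(Xc, rowvar=False)            # (n_variables, n_variables)
    eigvals, eigvecs = np.linalg.eigh(cov)    # symmetric input, ascending order
    order = np.argsort(eigvals)[::-1]         # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    components = eigvecs[:, :n_components]    # new orthogonal axes
    scores = Xc @ components                  # principal-component scores
    return scores, components, eigvals, mean


# Synthetic stand-in for six correlated measurement channels (hypothetical).
rng = np.random.default_rng(0)
common = rng.normal(50.0, 0.05, size=(1000, 1))       # shared system signal
X = common + rng.normal(0.0, 0.005, size=(1000, 6))   # small per-site noise

scores, components, eigvals, mean = pca_eig(X, n_components=2)
explained = eigvals / eigvals.sum()
print("fraction of variance explained by PC1:", explained[0])
```

Because the six synthetic channels share one dominant signal, nearly all of the variance falls on the first principal component, which is the situation in which a reduced set of components represents the data well.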
For event detection and diagnosis purposes, we define
two statistics, T² and Q. T², constructed from the principal
components, is associated with the PCA model space and
represents the significant variation of the original data. Q
represents the squared error of the model mismatch and the
variation of the data within the residual subspace. Applying
PCA to PMU data, we can analyze many measurement
sets from various locations simultaneously. We will demonstrate the elegance and the beauty of PCA through two
case studies, selected from power networks in Great Britain and Ireland.
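The text does not give formulas for the two statistics, so the sketch below assumes the definitions commonly used in PCA-based monitoring: T² as the sum of squared scores scaled by their eigenvalues, and Q as the squared norm of the residual left after projecting onto the retained components. The helper names `fit_pca` and `t2_and_q` and the synthetic data are hypothetical.

```python
import numpy as np


def fit_pca(X: np.ndarray, k: int):
    """Return (mean, loadings P, eigenvalues) for the k leading components."""
    mean = X.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(X - mean, rowvar=False))
    order = np.argsort(eigvals)[::-1][:k]     # indices of the k largest
    return mean, eigvecs[:, order], eigvals[order]


def t2_and_q(X: np.ndarray, mean: np.ndarray, P: np.ndarray, lam: np.ndarray):
    """Assumed definitions: T2 = sum(t_i^2 / lambda_i), Q = ||x - P t||^2."""
    Xc = X - mean
    T = Xc @ P                                # scores in the model space
    t2 = np.sum(T**2 / lam, axis=1)
    residual = Xc - T @ P.T                   # part the model cannot explain
    q = np.sum(residual**2, axis=1)
    return t2, q


# Train on synthetic "normal" data: six sites sharing one system signal.
rng = np.random.default_rng(1)
common = rng.normal(0.0, 0.05, size=(2000, 1))
X_train = 50.0 + common + rng.normal(0.0, 0.005, size=(2000, 6))
mean, P, lam = fit_pca(X_train, k=1)

normal = np.full((1, 6), 50.0)
system_dip = np.full((1, 6), 49.6)            # all sites drop together
local_dev = normal.copy()
local_dev[0, 5] = 49.6                        # one site departs on its own

t2, q = t2_and_q(np.vstack([normal, system_dip, local_dev]), mean, P, lam)
print("T2:", t2)
print("Q: ", q)
```

Under these assumptions, a deviation shared by all sites moves along the model direction and shows up mainly in T², while a deviation at a single site breaks the learned correlation and shows up mainly in Q.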
Case 1: Visualizing Frequency Data
to Distinguish Multiple Events
in the Great Britain Networks
The data used here were recorded from six sites in the Great
Britain networks with a 10-Hz sampling rate through the
OpenPMU project, with one located in southern England,
one in Manchester, and four in the Orkney Islands. The well-documented event on 30 September 2012 saw a loss of
load at 02:28. Later in the same day, a Great Britain-France
interconnector trip event at 15:03 resulted in a Great Britain
frequency drop from 49.97 to 49.60 Hz in a matter of 10 s.
The initial rate of change of frequency (ROCOF) activated
ROCOF-based islanding protection, erroneously disconnecting distributed generation.
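To make the ROCOF quantity concrete, the sketch below estimates it by finite differences on a 10-Hz frequency series. The linear ramp from 49.97 to 49.60 Hz over 10 s is an idealization used only for illustration; the recorded trace is not reproduced here, and its initial ROCOF would differ from this average value. The function name `rocof` is hypothetical.

```python
import numpy as np


def rocof(freq_hz: np.ndarray, sample_rate_hz: float) -> np.ndarray:
    """Rate of change of frequency in Hz/s, estimated by finite differences."""
    return np.gradient(freq_hz, 1.0 / sample_rate_hz)


SAMPLE_RATE_HZ = 10.0
# Idealized linear drop over 10 s (101 samples at 10 Hz); an assumption.
freq = np.linspace(49.97, 49.60, 101)

df_dt = rocof(freq, SAMPLE_RATE_HZ)
print(f"average ROCOF: {df_dt.mean():.3f} Hz/s")
```

On real PMU data, the differences would normally be taken on a smoothed series, since sample-to-sample noise is amplified by differentiation.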
We can group data from this single day into four different
classes: normal data, loss of load, generation dip, and islanding event. To visualize this in Figure 5, we have plotted
seven days of data randomly selected from two locations to
obtain frequency coverage for normal operating conditions.
It ranges from 49.8 to 50.2 Hz, represented by the black dots
surrounded by the red box; this depicts the 99.9% confidence