IEEE Power & Energy Magazine - May/June 2018 - 62
measurement unit (PMU) data. It proves that the necessary
and sufficient condition for the instability of a multimachine system is that the kinetic energy of at least one pair of
complementary groups exceeds the corresponding potential energy
barrier. The stability limit of the multigeneration system
model depends on the most critical pair of groups. Thus,
the causal reasoning between the perturbed trajectory set
(a big data set) and the system stability margin (a single scalar)
supplements the experiential and statistical interpretations
of disturbed trajectory stability. In this way, experience-based analysis is promoted to rigorous mechanism
analysis. In other words, the EEAC determines the objectively existing yet little-known causal relationship in the
The EEAC consists of three algorithms: static EEAC
(time-variant factors are omitted), dynamic EEAC (time-variant factors are partially considered), and integrated EEAC
(time-variant factors are fully considered). The accuracy of
these algorithms increases successively, at the cost of additional computational burden. We can apply the three EEAC
algorithms sequentially to balance accuracy against computational burden in system stability analysis. The question
remains why the stability in many cases (denoted type A
cases) can be accurately calculated using the analytic static
EEAC, while other cases (denoted type B cases) require the
dynamic or integrated EEAC to judge their stability. By
analyzing and verifying the intermediate results from a large
number of cases, we find that the static EEAC is in error
relative to the dynamic EEAC because its step size is too
large to reflect the time variability of the mapping system. The
computational burden can be dramatically reduced when we
distinguish type A from type B cases.
When applying the EEAC to a practical power system,
we can use statistical techniques to determine whether
the whole analysis process can be terminated directly at the
static EEAC stage or at the dynamic EEAC stage.
For example, when the results given by the dynamic EEAC
are sufficiently close to those given by the static EEAC, it is
unnecessary to use the integrated EEAC. Thus, the computation cost can be significantly reduced.
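The staged procedure described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the three stage functions, the `is_type_a` screening predicate (standing in for the statistical technique), and the tolerance `tol` are hypothetical placeholders, and each stage is assumed to return the stability margin as a single scalar.

```python
from typing import Any, Callable, Tuple

# Assumption: every EEAC stage maps a study case to one scalar stability margin.
MarginFn = Callable[[Any], float]


def cascaded_margin(
    case: Any,
    static_eeac: MarginFn,
    dynamic_eeac: MarginFn,
    integrated_eeac: MarginFn,
    is_type_a: Callable[[Any, float], bool],
    tol: float = 0.05,
) -> Tuple[float, str]:
    """Run the stages in order of cost and stop as early as possible.

    Returns the margin and the name of the stage that produced it.
    """
    # Stage 1: cheapest, analytic result.
    m_static = static_eeac(case)
    if is_type_a(case, m_static):  # hypothetical statistical screening
        return m_static, "static"

    # Stage 2: partially accounts for time-variant factors.
    m_dynamic = dynamic_eeac(case)
    if abs(m_dynamic - m_static) <= tol:
        # Static and dynamic results agree closely, so the integrated
        # stage is unnecessary.
        return m_dynamic, "dynamic"

    # Stage 3: most accurate and most expensive.
    return integrated_eeac(case), "integrated"
```

Passing the stages in as callables keeps the sketch independent of any particular EEAC code; the only logic shown is the early-termination rule.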
We can see that the EEAC theory reflects a good combination of causal reasoning and statistical analysis.
Statistical analysis helps to find the mechanism behind high-dimensional big data; causal reasoning is able to reveal the
principle of stability and enable analytical treatment.
In turn, statistical analysis can be applied to speed up
the analytical treatment.
Big Data for Power Equipment
Big data technology also improves power equipment management. The data accumulated in the production management system and the energy management system are expanding.
The large volume of electric power equipment monitoring
data provides valuable information that can be extracted
using big data techniques. Data cleaning and fault diagnosis
techniques are two typical applications in power equipment management.
Power equipment big data include
1) data from the equipment itself: nameplate parameters, equipment inspection data, maintenance and operation information, and defect, fault, and maintenance records
2) power equipment monitoring data: power generation and storage condition
3) power grid data: power system dispatching information, bus load and generation data, real-time voltage and current data, and active and reactive power
4) external data: geographic information system data and meteorological data.
Data can become inaccurate because of anomalies and missing data points, which may be introduced accidentally or deliberately during equipment data acquisition,
transmission, and storage. Data cleaning ensures the validity,
consistency, and integrity of the whole data set. Figure 7 shows
the main procedures of the data cleaning method applied in
the power system. It contains three main processes: 1) missing data monitoring and completion, 2) anomaly data detection
and correction, and 3) data quality evaluation.
The missing data are first detected and categorized as
either small or large vacancies. Simple algorithms such
as interpolation estimation, smoothing, or regression
can fill small vacancies; intelligent algorithms such
as cluster analysis, the artificial neural network (ANN), and
SVM are better suited to large vacancies. Second, the
anomaly states of the data are detected by calculating the
change rate, mean, variance, and range of the monitoring data
under different anomaly modes. Filtering algorithms are then
used to remove noise from the anomalous data. Finally, the data
quality evaluation index set is used to judge whether the modified data are within a normal interval. If the data set does not pass the
quality evaluation, it should be cleaned again until
the quality requirements are met.
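The three cleaning stages can be illustrated with a small sketch. It is only one possible reading of the procedure, under stated assumptions: missing points are marked with `None`, a "small" vacancy is a run of at most `max_len` points, linear interpolation stands in for the simple completion algorithms, a three-point median stands in for the filtering step, and the quality index is simply the share of points inside a normal interval. All function names and thresholds are hypothetical. Large vacancies are deliberately left unfilled, since they call for the intelligent algorithms mentioned above.

```python
from statistics import median


def find_gaps(series):
    """Return (start, end) index pairs, end exclusive, for each run of None."""
    gaps, start = [], None
    for i, v in enumerate(series):
        if v is None and start is None:
            start = i
        elif v is not None and start is not None:
            gaps.append((start, i))
            start = None
    if start is not None:
        gaps.append((start, len(series)))
    return gaps


def fill_small_gaps(series, max_len=2):
    """Linearly interpolate interior gaps of at most max_len points."""
    out = list(series)
    for start, end in find_gaps(series):
        interior = start > 0 and end < len(series)
        if interior and end - start <= max_len:
            left, right = series[start - 1], series[end]
            span = end - start + 1
            for k in range(start, end):
                out[k] = left + (right - left) * (k - start + 1) / span
    return out  # larger gaps stay None for a more capable method


def correct_anomalies(series, max_step):
    """Replace isolated spikes (large jump from both neighbours) by a 3-point median."""
    out = list(series)
    for i in range(1, len(series) - 1):
        a, b, c = series[i - 1], series[i], series[i + 1]
        if None in (a, b, c):
            continue
        if abs(b - a) > max_step and abs(b - c) > max_step:
            out[i] = median((a, b, c))
    return out


def quality_ok(series, low, high, min_valid=0.95):
    """Pass when enough points are present and inside the normal interval."""
    good = sum(1 for v in series if v is not None and low <= v <= high)
    return good / len(series) >= min_valid


def clean(series, low, high, max_step, max_rounds=3):
    """Repeat completion and correction until the quality check passes."""
    for _ in range(max_rounds):
        series = correct_anomalies(fill_small_gaps(series), max_step)
        if quality_ok(series, low, high):
            break
    return series
```

For instance, `clean([1.0, None, 3.0, 4.0, 50.0, 6.0, 7.0], low=0, high=10, max_step=10)` fills the single missing point by interpolation and replaces the spike at 50.0 with the median of its neighbourhood. The round limit `max_rounds` is an added safeguard so the loop cannot run indefinitely on data that never pass the check.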
Big data technologies also help improve power equipment
condition assessment and fault diagnosis, reducing the probability of equipment failure and strengthening the reliability and
economy of the power system. Traditional fault diagnosis
methods rely mostly on a single parameter threshold, so
a considerable number of false-positive and false-negative
results cannot be avoided. A fault diagnosis method based on
a large data set and multidimensional feature parameters
can effectively improve the diagnostic accuracy and reduce
the probability of equipment failure.
First, the correlations between the various monitoring data
and the fault mode, fault location, and fault severity are
evaluated using correlation analysis algorithms such as gray
correlation. The key features of these correlations are identified.
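As an illustration of this first step, the sketch below computes gray relational grades between a reference sequence (for example, a fault severity index) and several monitoring features, then ranks the features. It assumes the standard gray relational analysis formulation with min-max normalization and the conventional distinguishing coefficient of 0.5; the source does not specify which variant is used, and the function name is hypothetical.

```python
def _minmax(seq):
    """Scale a sequence to [0, 1]; a constant sequence maps to all zeros."""
    lo, hi = min(seq), max(seq)
    if hi == lo:
        return [0.0 for _ in seq]
    return [(v - lo) / (hi - lo) for v in seq]


def gray_relational_grades(reference, features, rho=0.5):
    """Return {feature name: grade}; a higher grade means a closer relation.

    reference: list of numbers (e.g., a fault severity index over samples)
    features:  dict mapping a feature name to a list of the same length
    rho:       distinguishing coefficient, conventionally 0.5
    """
    ref = _minmax(reference)
    # Absolute differences between the normalized reference and each feature.
    deltas = {
        name: [abs(r - x) for r, x in zip(ref, _minmax(seq))]
        for name, seq in features.items()
    }
    all_d = [d for ds in deltas.values() for d in ds]
    d_min, d_max = min(all_d), max(all_d)
    if d_max == 0:
        # Every feature matches the reference exactly after scaling.
        return {name: 1.0 for name in deltas}
    # Grade = mean of the per-sample gray relational coefficients.
    return {
        name: sum((d_min + rho * d_max) / (d + rho * d_max) for d in ds) / len(ds)
        for name, ds in deltas.items()
    }


def rank_features(grades):
    """Feature names ordered from most to least related."""
    return sorted(grades, key=grades.get, reverse=True)
```

Features with the highest grades would be the candidates for the key features retained for diagnosis. Where to cut the ranking is a design choice this sketch does not make.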
Then, the change patterns of the monitoring data under
different failure modes are identified using supervised or
unsupervised learning algorithms such as high-dimensional