By Sunil Kumar, Hari Babu, Shankar Kumar Choudhary, Vinit Kumar Mathur
and Hridayeshwar Jha
Background
A Data Warehouse (DWH) was recently installed at the LD2 and Slab Caster
Shop to make data available to plant engineers in a user-friendly manner
and to support analysis linked to improvement projects. The DWH is
designed to store large volumes of data generated on different computer
(hardware / software) platforms in the several sections of the shop,
including hot metal pretreatment, primary and secondary steelmaking,
slab casting and conditioning. The Data Warehouse facility was developed
using IBM technology that includes DB2 UDB for Windows, DB2 Warehouse
Manager, DB2 OLAP Server and Intelligent Miner. COGNOS Impromptu was
employed to develop standard reports. COGNOS Impromptu and PowerPlay
are end-user tools for viewing reports and analysis cubes respectively.
Thus, in addition to storing vast volumes of data, the DWH facility is
equipped with a number of special tools for conducting data analysis.
The focus of this paper is Data Mining technology, which makes available
a wide range of special techniques for problem solving and knowledge
creation.
What is Data Mining?
Data Mining, in simple terms, is the extraction of previously unknown
and/or potentially useful information from a historical database. Data
Mining algorithms are based on techniques such as hypothesis generation,
classification, prediction, association rules or affinity grouping, and
clustering. A brief description of these techniques is presented in
Table 1 for reference. The algorithms are quite powerful and have the
ability to process large quantities of data and discover meaningful
patterns and rules about a business process. The real power of these
techniques is their inherent ability to reveal “hidden” patterns or
relationships even in complex data-sets that can be associated with a
high degree of uncertainty.
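To make one of the listed techniques concrete, the following is a minimal sketch of how an association rule can be scored by its support and confidence, the two standard measures for such rules. It is illustrative only: the records and condition names (`high_sulphur`, `long_treatment`, `reblow`) are hypothetical, and this is not necessarily how Intelligent Miner computes its rules.

```python
def support(transactions, items):
    """Fraction of records that contain every item in `items`."""
    items = set(items)
    hits = sum(1 for t in transactions if items <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support of antecedent and consequent together, divided by the
    support of the antecedent alone (an estimate of P(consequent | antecedent)).
    Assumes the antecedent occurs at least once."""
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)


# Hypothetical records: each set lists conditions observed on one heat.
# These are made-up values for illustration, not plant data.
records = [
    {"high_sulphur", "long_treatment"},
    {"high_sulphur", "long_treatment", "reblow"},
    {"high_sulphur"},
    {"reblow"},
]

# Score the candidate rule "high_sulphur -> long_treatment".
s = support(records, {"high_sulphur", "long_treatment"})
c = confidence(records, {"high_sulphur"}, {"long_treatment"})
print(f"support={s:.2f}, confidence={c:.2f}")
```

A rule is usually considered interesting only when both measures exceed thresholds chosen by the analyst; the thresholds themselves depend on the problem being studied.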
The methodology
followed for Data Mining analysis is presented schematically in Figure
1. The steps are not clear-cut, and a number of iterations are required.
Before Data Mining analysis can start, it is necessary to describe the
data at hand, i.e., summarize its statistical attributes (mean and
standard deviation values), visually review the data using charts and
graphs, and look for potentially meaningful links among variables.
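The descriptive step above can be sketched in a few lines of Python. This is a minimal example under stated assumptions: the column names (`tap_temp_c`, `blow_time_min`) and all values are hypothetical, and the Pearson correlation coefficient is used here as one simple way to look for a link between two variables; the paper does not prescribe a specific measure.

```python
import math
import statistics


def describe(values):
    """Return the mean and sample standard deviation of a numeric column."""
    return {"mean": statistics.mean(values), "std": statistics.stdev(values)}


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length columns."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical heat records (made-up numbers, not plant data).
heats = {
    "tap_temp_c": [1650.0, 1662.0, 1645.0, 1671.0, 1658.0],
    "blow_time_min": [16.0, 17.5, 15.5, 18.0, 17.0],
}

# Summarize every column, then check one pair of variables for a linear link.
summary = {name: describe(col) for name, col in heats.items()}
r = pearson(heats["tap_temp_c"], heats["blow_time_min"])

for name, stats in summary.items():
    print(f"{name}: mean={stats['mean']:.2f}, std={stats['std']:.2f}")
print(f"correlation={r:.3f}")
```

A coefficient close to +1 or -1 only suggests a linear relationship worth examining further; it does not replace the visual review with charts and graphs.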
Data Mining and Other Analysis Techniques
Data Mining is different from other analysis techniques such as OLAP
(on-line analytical processing) and statistical methods. Nevertheless,
these tools are complementary to each other and when applied

Fig 1: Methodology of Data Mining Analysis