The American Society of Mechanical Engineers’ (ASME) risk-based inspection methodologies are being used to optimize and prioritize equipment overhaul, maintenance, and upgrade decisions. Hartford Steam Boiler Inspection and Insurance Co. (HSB) collaborated with ASME in developing these guidelines, and it used the ASME methodologies to develop its risk-based decision tools for steam turbine generators. The ASME Risk-Based Inspection Guidelines define five primary steps in developing risk-based programs. These are system definition, qualitative risk assessment, system assessment ranking, inspection program development, and economic optimization. To differentiate between turbines and generators in several types of service, the team designed a questionnaire that requires the owner or operator to identify equipment design features, monitoring capabilities, past operating and failure history, as well as current operating experience, inspection, and maintenance practices. The STRAP program is presently in the beta-testing phase, where 30 different turbines representing eight different manufacturers and three different industries have been analyzed. Full implementation of the program is expected to occur in the fall of 1998.
A typical 500-megawatt steam turbine generator undergoes a major inspection and repair outage every five or six years. During these outages, nominally five weeks in length, the plant produces no electricity and incurs inspection and repair costs usually in excess of $1 million. Yet the deregulation of the power generation industry has put intense economic pressure on power producers to reduce the cost of generating electricity. To achieve such cost reductions, the power producers have been stretching the five-year period between outages to eight, 10, or even 12 years, often with little or no technical justification for the increase.
In 1995, the Hartford Steam Boiler Inspection and Insurance Co. (HSB) set out to develop a qualitative risk-based assessment tool for steam turbine generators to address this issue. Called the Turbine Outage Optimization Program (TOOP), this tool was conceived to quantify risk and use it to estimate the proper time interval between major inspections. A team of experts was assembled that included personnel from New Brunswick Power, Northern Indiana Public Service Co., Wisconsin Public Service Corp., Radian Corp., and HSB.
The risk-based methods used were based on lessons from ASME’s Center for Research and Technology Development (CRTD), which in 1988 initiated efforts to develop risk-based inspection standards. ASME defined risk as the probability of failure multiplied by the consequence of the failure.
This definition was used in the ASME Risk-Based Inspection Guidelines that were developed for the nuclear and fossil industries in the early 1990s.
HSB collaborated with ASME in developing these guidelines, and it used the ASME methodologies to develop its risk-based decision tools for steam turbine generators. When similar tools were required for the steam turbines used in critical process applications and for manufacturing facility electrical equipment, HSB used the same ASME methodologies to develop them.
The ASME Risk-Based Inspection Guidelines define five primary steps in developing risk-based programs. These are system definition, qualitative risk assessment, system assessment ranking, inspection program development, and economic optimization.
The first part of the process is to define the overall system and pertinent lower-level subsystems. After system boundaries and appropriate subsystems have been defined, applicable failure modes and probabilities of failure must be established. Finally, the failure mode consequences (repair or replacement costs, lost production time, and similar losses) for each subsystem component must be defined.
Once the subsystems, failure modes, failure rates, and consequences have been defined, the risk values are calculated by subsystem component, subsystem, and total system. From these results, the risk levels are ranked to identify the highest-risk subsystem components and subsystems. The system total risk is used to benchmark, or risk-rank, with comparable systems. From the risk rankings, it is relatively easy to prioritize and justify maintenance decisions and develop inspection plans to use company resources more effectively.
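The roll-up described above can be sketched in a few lines of Python. This is only an illustration of the probability-times-consequence definition; the component names, probabilities, and dollar figures are invented for the example and are not taken from any HSB model.

```python
from collections import defaultdict

# Hypothetical entries: (subsystem, component, failure mode,
# annual failure probability, consequence in dollars).
ENTRIES = [
    ("LP turbine", "last-stage blades", "fatigue", 0.02, 2_000_000),
    ("LP turbine", "rotor", "stress corrosion cracking", 0.005, 5_000_000),
    ("HP turbine", "nozzle block", "erosion", 0.05, 300_000),
    ("Generator", "stator winding", "electrical breakdown", 0.01, 4_000_000),
]


def risk(probability, consequence):
    """Risk as defined in the text: probability of failure times consequence."""
    return probability * consequence


def roll_up(entries):
    """Return (items ranked high to low, per-subsystem totals, system total)."""
    items = []
    by_subsystem = defaultdict(float)
    for subsystem, component, mode, p, c in entries:
        r = risk(p, c)
        items.append((r, subsystem, component, mode))
        by_subsystem[subsystem] += r
    items.sort(reverse=True)  # highest-risk item first
    total = sum(by_subsystem.values())
    return items, dict(by_subsystem), total


ranked, subsystem_risk, system_risk = roll_up(ENTRIES)
```

The ranked list identifies the highest-risk component and failure-mode combinations, while the system total is the figure that would be benchmarked against comparable systems.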
A number of approaches can be used to establish failure rates and consequences. Failure probability and consequence data may be obtained from company failure data, manufacturer’s data, industry data, and traditional reliability analyses, including failure-mode effects and criticality analyses, fault tree analyses, event tree analyses, and hazard and operational analyses. When such data are not available, a team of industry experts may be assembled to estimate the failure probabilities and consequences based on their experience.
In most risk assessments, only the correct orders of magnitude for failure rates and consequences are required to effectively calculate risk and to risk-rank the system and subsystem results. For these types of analyses, the risk assessment is defined as a qualitative assessment. When there are detailed analyses and data to support the failure rates and consequences, the risk results are defined as a quantitative assessment. As a practical matter, usually extensive data are not available, particularly for low-probability and high-consequence events, so it is necessary to rely on expert opinion and appropriate analyses.
A typical power generation steam turbine, as shown on page 72, consists of a high-pressure (HP) steam turbine, intermediate-pressure (IP) turbine, low-pressure (LP) turbine, and electrical generator. For the purposes of HSB’s TOOP analysis, these major system components are broken down into subcomponents and their corresponding failure modes: fatigue, creep, stress corrosion cracking (SCC), erosion, foreign object damage (FOD), overload, electrical breakdown, and so forth, along with the probabilities of failure and their consequences. The parts analyzed include only internal turbine and generator subcomponents that require major disassembly for inspection and repair.
Failure rates and consequence data were obtained from HSB’s insurance claims database, from the team of experts, and from available industry data. In TOOP, “consequence” was defined as the cost to repair or replace a failed subcomponent as a result of an unscheduled outage.
To differentiate between turbines and generators in different types of service, the team designed a questionnaire that requires the owner or operator to identify equipment design features, monitoring capabilities, past operating and failure history, as well as current operating experience, inspection, and maintenance practices. From these responses, the baseline subcomponent failure probabilities in the model are raised or lowered, based on the specifics of the unit being analyzed.
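One simple way to picture how questionnaire responses might raise or lower a baseline probability is a set of multiplying factors. The sketch below assumes that approach; the article does not say how TOOP actually applies the responses, and the question names and factor values here are hypothetical.

```python
# Hypothetical adjustment factors keyed by questionnaire answer.
# Values below 1.0 lower the baseline probability; values above 1.0 raise it.
FACTORS = {
    "vibration_monitoring": {True: 0.8, False: 1.2},
    "prior_blade_failure": {True: 1.5, False: 1.0},
}


def factors_for(answers):
    """Look up the factor for each answered question."""
    return [FACTORS[question][answer] for question, answer in answers.items()]


def adjust_probability(baseline, multipliers):
    """Scale a baseline failure probability by each factor, capped at 1.0."""
    p = baseline
    for m in multipliers:
        p *= m
    return min(p, 1.0)


answers = {"vibration_monitoring": True, "prior_blade_failure": True}
adjusted = adjust_probability(0.02, factors_for(answers))
```

In this example, good monitoring lowers the baseline while a history of blade failure raises it, so the net effect depends on the unit's specific answers.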
Commercial database software was used to perform the risk calculations and to risk-rank and sort the results by subcomponent and failure mode. The largest risk item is plotted first, the risk of the next largest item is added to that risk, and so on for all the subcomponents and associated failure modes. A typical risk ranking for an LP turbine might show, for instance, that the first 12 items represent approximately 90 percent of the total risk. The most effective place to apply company resources would then be on maintenance practices, inspections, and improvements that can reduce the risk—failure mode probability or consequence, or both—of these 12 subcomponent-failure mode combinations. These high-risk areas become the basis for developing risk reduction recommendations, as well as conducting what-if analyses of the recommendations to quantify the cost benefits.
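The cumulative ranking described above, in which the largest risk item comes first and each following item is added to the running total, can be sketched as follows. The function names and the sample risk values are illustrative assumptions; the original analysis used commercial database software rather than code like this.

```python
from itertools import accumulate


def cumulative_risk_share(risks):
    """Sort risks high to low and return the running share of total risk."""
    ordered = sorted(risks, reverse=True)
    total = sum(ordered)
    if total == 0:
        return []
    return [running / total for running in accumulate(ordered)]


def items_to_reach(risks, share=0.90):
    """Count how many top-ranked items are needed to reach the given share."""
    for count, cumulative in enumerate(cumulative_risk_share(risks), start=1):
        if cumulative >= share:
            return count
    return len(risks)


# Hypothetical risk values for six subcomponent and failure-mode combinations.
sample_risks = [50, 30, 10, 5, 3, 2]
top_items = items_to_reach(sample_risks, share=0.90)
```

The count returned by `items_to_reach` corresponds to the short list of combinations on which maintenance and inspection resources would be concentrated.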
The total cumulative risk of the turbine can then be benchmarked with other similar turbines analyzed by TOOP. LP turbines with low calculated risks are candidates for inspection outage intervals of nine to 12 years, while higher-risk LP turbines are candidates for intervals of three or four years. This plot also allows comparing units on a common basis within a company for prioritizing resources to the higher-risk units.
The TOOP model has provided excellent results and has served well as an independent assessor of steam turbine generator risks. More than 70 units have been analyzed, representing eight steam turbine and nine generator manufacturers, sizes from 14 to 820 MW, operating hours from 12,800 to 328,000, and manufacture dates and metallurgies from 1954 to 1994. A relatively normal distribution of LP turbine risk was achieved with the risk model, regardless of the turbine manufacturer, size, operating hours, and age. Similar results have been achieved for HP turbines, IP turbines, and generators.
These results have been appreciated by TOOP clients.
Dave Fewkes of New Brunswick Power in Fredericton indicated that “TOOP has been an invaluable tool for supporting proper engineering evaluations to extend outages and lower operations and maintenance costs, for finalizing inspection schedules for each turbine component, and for managing equipment historical data.” Terry Jensky of Wisconsin Public Service Corp. in Green Bay indicated that TOOP analyses have saved his company more than $3.3 million for nine of its fossil steam turbine generators over the life of the machines. Depending on the size and age of their machines, other clients have also achieved considerable savings. For many machines, TOOP has estimated that one to two major outages may be eliminated over the life of a turbine and generator.
Critical Process-Industry Turbines
While the time interval between outages is the major concern for the power generation industry, the availability of steam turbine applications during production operations is the primary concern in the manufacturing and process industries. The photo below shows a typical process steam turbine rotor. When steam turbines are heavily integrated into plant processes, loss of the steam turbine will shut down the process and result in substantial lost production and revenue. Examples of this situation include boiler feed pumps, line shafts for the pulp and paper industry, blowers and generators for the iron and steel industry, and compressors for the refinery, petrochemical, and chemical process industries. In these and other process industries, the cost of the steam turbine, on a relative basis, represents only a fraction of the process plant’s assets. However, each day of lost production attributable to the turbine can result in lost revenue that may reach $1 million per day. This is the critical factor to consider when developing a risk-based analysis tool for the process industries.
To quantify the risks associated with steam turbines in critical process service and assist maintenance staffs in making decisions for these critical turbines, HSB assembled a team of manufacturing, process (refinery and petrochemical), turbine repair, consulting engineering, and insurance industry experts to develop a qualitative risk-based tool, which was named STRAP (Steam Turbine Risk Assessment Program). The team included rotating equipment experts from the Dow Chemical Co., Equilon Enterprises LLC, CF Industries, Stone Container Corp., Hickham Industries, Revak Turbomachinery, Radian Corp., and HSB.
The development process for this risk model was nearly the same as that used for TOOP. A system including the associated components and subcomponents was established for the steam turbine. Because of the wide variety of steam turbine configurations used in the different process industries, these turbines were separated into five different size and speed classes and four different operating regimes. Failure probabilities and consequences were established for each of the different turbine classes, operating regimes, subcomponents, and applicable failure modes.
Risk consequence here is lost production time, expressed in terms of days or in equivalent lost revenue, and/or added expense per day. This approach was used because the cost of the equipment is considered to be inconsequential compared with the amount of lost production revenue. This differs from the situation in the power industry as reflected in TOOP, where consequence was expressed as the cost to repair or replace a failed subcomponent during an unscheduled outage. To account for the differences in service between units, the STRAP model also uses a detailed questionnaire. The responses are used to raise or lower the baseline subcomponent failure probabilities and consequences based on the specifics of the unit being analyzed.
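The difference from the TOOP consequence definition can be made concrete with a short sketch. It assumes consequence is simply outage days multiplied by the daily loss (lost revenue plus any added expense); the function names and figures are hypothetical and are not STRAP's internal formulas.

```python
def lost_production_consequence(outage_days, revenue_per_day,
                                added_expense_per_day=0.0):
    """Consequence in dollars: days lost times the daily loss."""
    return outage_days * (revenue_per_day + added_expense_per_day)


def production_risk(probability, outage_days, revenue_per_day,
                    added_expense_per_day=0.0):
    """Risk = probability of failure times the lost-production consequence."""
    consequence = lost_production_consequence(
        outage_days, revenue_per_day, added_expense_per_day
    )
    return probability * consequence


# Hypothetical case: 1 percent annual failure probability, a 10-day outage,
# and $1 million of lost revenue per day.
example_risk = production_risk(0.01, 10, 1_000_000)
```

Note that the repair cost does not appear at all, which reflects the article's point that equipment cost is small next to lost production revenue.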
As in the power generation risk model, once the baseline steam turbine subcomponent failure modes, probabilities of failure, and associated consequences have been established, then risk can be calculated. Risk-ranking results for a typical turbine are displayed in the screen shot on page 73. The chart displays the percentage of risk by major component; it also ranks the subcomponent risk’s contribution to the total component risk and ranks the failure-mode contribution percentage for each specific subcomponent. From this information, the risk drivers for each of the subcomponents can be identified, and appropriate recommendations can be established to reduce risk in these areas. Risk reduction recommendations and what-if analysis capabilities are provided in the model, as well as the capability of evaluating the return on investment for implementing the recommendations.
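A what-if evaluation of a recommendation can be illustrated with a simple calculation: estimate the risk removed by the recommendation and compare it with the cost of implementing it. The sketch below assumes a basic annualized return-on-investment formula; the article does not describe how the model computes return on investment, so the formula and numbers are assumptions.

```python
def risk_reduction(p_before, p_after, consequence):
    """Annual risk removed when a recommendation lowers failure probability."""
    return (p_before - p_after) * consequence


def simple_roi(annual_risk_reduction, annual_cost):
    """Net annual benefit divided by annual cost (an assumed, simplified ROI)."""
    if annual_cost <= 0:
        raise ValueError("annual_cost must be positive")
    return (annual_risk_reduction - annual_cost) / annual_cost


# Hypothetical recommendation: added monitoring cuts the annual failure
# probability from 4 percent to 1 percent on a $1 million consequence,
# at a cost of $10,000 per year.
benefit = risk_reduction(0.04, 0.01, 1_000_000)
roi = simple_roi(benefit, 10_000)
```

Running the same calculation for several candidate recommendations gives a common basis for choosing among them.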
As with TOOP, the total risk for the steam turbine can be compared on a relative basis with other steam turbines in a company’s inventory as well as in the industry as a whole. A turbine can be compared with other units in its class, within a company, with comparable industry units, with similar manufacturer’s units, and with similarly driven equipment. This information is important because the results can be used to develop risk-driven inspection plans and to prioritize maintenance actions, spares support, and other turbine decisions for a plant’s or corporation’s fleet of steam turbines.
The STRAP program is presently in the beta-testing phase, where 30 different turbines representing eight different manufacturers and three different industries have been analyzed. Full implementation of the program is expected to occur in the fall of 1998. The results to date have been excellent. One major petrochemical company has saved more than $300,000 for spares as a result of STRAP analyses conducted for it during the beta-testing phase. Spares that would normally have been procured were found to be low in risk (not critical); this justified not making the purchase and saved the company money.
Risk In An Electrical System
A major manufacturer asked HSB to adapt the risk-based process to evaluate the primary electrical distribution system in one of its plants, to determine the change in risk that would occur if some built-in redundancies were reduced and maintenance activities optimized. For this analysis, the risk-based methodologies developed for the previous two programs were used. Failure frequencies for transformers, switches, and other components were obtained from IEEE Standard 493, the HSB claims database, and the manufacturer’s own experience. For this model the major consequences of concern were lost production time, and equipment replacement and installation costs.
Since there were a number of electrical equipment rooms in the one plant, and decisions had to be made about one equipment room relative to another, the risk assessment process had to be more quantitative. Therefore, fault and event trees were constructed to determine failure rates and consequences more accurately. In addition, once the risks were calculated, safety and economic considerations had to be weighed together. To accomplish this, a decision analysis software tool was used to perform pairwise comparisons as the basis for decision making.
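The effect of reducing redundancy can be illustrated with the standard AND and OR gates of a fault tree. The sketch below assumes independent component failures and uses invented probabilities; it is not the tree HSB built for this plant.

```python
from math import prod


def and_gate(probabilities):
    """All independent inputs must fail, as with redundant components."""
    return prod(probabilities)


def or_gate(probabilities):
    """Any single independent input failing causes the event."""
    return 1.0 - prod(1.0 - p for p in probabilities)


# Hypothetical annual failure probabilities.
TRANSFORMER = 0.01
SWITCH = 0.02

# Two redundant transformers feeding one switch: supply is lost if both
# transformers fail or if the switch fails.
with_redundancy = or_gate([and_gate([TRANSFORMER, TRANSFORMER]), SWITCH])

# Same circuit with the second transformer removed.
without_redundancy = or_gate([TRANSFORMER, SWITCH])

change_in_probability = without_redundancy - with_redundancy
```

Multiplying each top-event probability by its consequence gives the change in risk that removing the redundant unit would produce.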
These programs were a direct result of participating in the CRTD risk-based analysis process in ASME and then aggressively pursuing the company’s own needs, using what had been learned from the CRTD program. The use of risk-based analysis tools combines the technical and reliability factors of equipment with financial consequences so that the limited company resources available can be applied to the equipment that has the greatest need. The risk-based analysis method is an excellent way to help manage operations and maintenance activities, with cost benefits for almost any industry and type of equipment.
While we have used the term “risk,” risk reduction is, in most cases, analogous to increased reliability. This is especially so in an effective risk management program, where the risk analysis results are effectively used to perform the appropriate inspections and maintenance at the right time.