This post is contributed by Cindy Caldwell, ES&H Senior Technical Advisor, Environment Health Safety & Security, Pacific Northwest National Laboratory. With a strong technical and operations background, Cindy has immersed herself in seeking to understand the human and organizational dynamics that shape culture and organizational behaviors. She has been an active contributor and practitioner in the DOE human performance and safety culture improvement activities, the technical lead in the laboratory enterprise risk management initiative, and is pursuing a PhD in organizational science. My thanks to Cindy for these thoughtful and sensitive insights on the Fukushima Daiichi catastrophe and what we may gain from a greater understanding of sensemaking and resilience.
Last fall I attended the winter meeting of the American Nuclear Society where I heard a firsthand account of the initial response at the Fukushima Daiichi power plant following the devastating earthquake and tsunami. I was emotionally touched by the heroic efforts of the workers who happened to be on shift that day. More recently I attended the annual meeting of the Health Physics Society and once again had the opportunity to learn more about the actions and reactions of the workers at Fukushima Daiichi. My experience as a reactor operator at the Department of Energy’s Fast Flux Test Facility caused me to reflect upon the strength of character and capacity of the workers, which were vital for a positive outcome in the worst imaginable situation. In the following essay I describe the first twenty-eight hours of the event at Fukushima Daiichi through the lens of sensemaking and provide a reliability perspective on lessons learned from the event. I have based my ideas on notes from classroom presentations, investigational papers and personal accounts. I have also tried to simplify technical details and apologize in advance for any unintended misrepresentations.
Sensemaking is a collective effort among a group of people to comprehend complex events. Through the interaction of our individual cognitive processes, the actions we choose to take, and the reflection that is done in partnership with others we form a shared reality that we take further action upon. High reliability relies on sensemaking to improve organizational resilience by adjusting to demands and detecting and correcting errors. Low probability, high consequence events defy interpretations and impose severe demands on sensemaking. People think by acting, and to sort out a crisis requires an action that simultaneously generates a response that is used for sensemaking and also contributes to the unfolding crisis. Unwitting escalation of crises is especially likely in technologies that are complex, highly interactive, non-routine, and poorly understood (Weick, 1988).
The Fukushima Nuclear Accident: The First Twenty-Eight Hours
On March 11, 2011 at 14:46 the largest earthquake in Japan’s recorded history struck. The epicenter was approximately 180 km from Fukushima Daiichi. When the earthquake struck, the three operating nuclear reactors on site at Fukushima Daiichi all automatically shut down or “scrammed”. After a reactor scram, residual heat caused by the decay of fission products must be removed by cooling systems to prevent the fuel rods from overheating and failing (IAEA, 2011). Maintaining enough cooling to remove the decay heat in the reactor was the main priority for the workers as the events unfolded on that Friday afternoon.
The earthquake destroyed the external power supply to the power station. As designed, the back-up emergency diesel power generators started and provided the electricity to essential equipment. However, less than one hour after the earthquake a series of tsunami waves arrived that flooded the diesel generators and power distribution panels causing them to fail (IAEA, 2011).
The damage caused by the tsunami created circumstances that exceeded the power plant’s preexisting emergency management framework and made the response extremely difficult. The workers were faced with a difficult situation where the plant status at multiple units at Fukushima Daiichi simultaneously worsened minute by minute. The tsunami resulted in a loss of almost all the functions required for accident response, such as lighting, plant monitoring equipment, communication measures, and reactor cooling equipment (TEPCO, 2011a).
The estimated tsunami height of 13 m was much greater than the design basis of 6.1 m. The seafloor displacement caused amplification of multiple tsunami waves that was not previously predicted by any expert (Tateiwa, 2012). Since the systems and the procedures in place for responding to an accident were dependent upon the use of this equipment and these power sources, it was extremely difficult to respond to the accident on site. The workers had to respond flexibly to the changing situation. Environmental challenges included powerful aftershocks, tsunami debris interfering with outdoor work, and loss of lighting in the main control room, station buildings, and the field. The debris from the tsunami blocked vehicle access so vehicles could not be used to move heavy equipment. Work at night was in complete darkness amidst hundreds of aftershocks and hazards such as open manholes. Power station workers gathered car batteries, brought them into the main control room and used them to provide power to critical indications of the reactor status (Tateiwa, 2012).
The primary goal for the workers at Fukushima Daiichi was to manage the core temperature to ensure the fuel cladding remained intact and operational for as long as possible. Since the operators lost most of their cooling capabilities due to the loss of power, they had to use whatever cooling system capacity they had to get rid of as much heat as possible. As long as the heat production exceeded the heat removal capacity, the pressure continued to increase as more water boiled into steam. So to protect the integrity of the vessel and containment, the operators began preparations to vent steam to control the pressure and inject water to keep the fuel covered. The workers relied on their past experience and knowledge of the plant systems to address the dire situation since procedures did not exist for opening valves using batteries, compressors and gas cylinders or injecting water into the reactor core using fire engines (TEPCO, 2011a). Despite these heroic efforts, it is estimated that approximately four hours after the tsunami struck, the temperature of some of the fuel rod cladding was hot enough to initiate a reaction between the zircaloy cladding and water. This oxidizing reaction produced hydrogen gas (Tateiwa, 2012).
At 20:50 on March 11, junior operators were sent to the flooded basement of the turbine building to configure the diesel-driven fire pump for cooling water injection into the reactor. Before injection started, the pump ran out of fuel and subsequently could not be restarted (Tateiwa, 2012). At the same time, preparations were being made to configure a line using the fire engine. At around 4:00 on March 12 freshwater was injected into the reactor using the fire engine (TEPCO, 2011b).
Working in parallel to relieve the pressure inside the containment vessel, three teams of operators headed to the field at around 9:00 on March 12 to begin venting. Containment vessel venting was done manually. Radiation levels in the reactor building were significantly elevated. Six men (three teams) were selected to open the valves. The first team manually opened the primary containment vessel vent valve. The second team was unable to open the suppression chamber vent valve due to high radiation levels and working conditions. Eventually this valve was opened remotely (Tateiwa, 2012).
At around 12:00 on March 12, the supply of fresh water was running low and preparations were made to inject seawater into the reactor using three fire engines and water from the tsunami. However, at around 15:36, right before the lineup was completed, the Unit 1 reactor building exploded. At some point during the event, enough hydrogen gas had built up inside the containment that when it was vented outside containment to the air, an explosion occurred. It is thought that the hydrogen leaked from the containment vessel top flange into the reactor building due to the high pressure (Tateiwa, 2012). Design experts had not predicted that enough hydrogen could accumulate in the reactor building and lead to an explosion and worker injuries. The workers once again had their fundamental beliefs about the safety margin shattered. The explosion caused worker injuries and setbacks such as damage to the temporary water hoses and cables. As the incident wore on, increased radiation levels and fatigue resulted in fear and despair among many of the responders (Tateiwa, 2012).
The explosion damaged the hoses that were to be used for injecting seawater. Injured workers needed to be rescued and carried out. Prior to reentry into the area a team conducted radiation measurements and visual inspection to assess damage and habitability. Hoses needed to be newly laid, so new hoses were gathered from the field’s fire hydrants and highly radioactive debris was cleared. A new seawater injection lineup was completed and the injection of seawater began at around 19:04 on March 12.
At this time, twenty-eight hours into the event, the crisis continued to escalate. During the next three days two other hydrogen explosions occurred at Units 3 and 4 (TEPCO, 2011b). The initial shift of operators continued working around the clock for days into the event. It was an unprecedented industrial accident.
Factors Influencing Sensemaking: Defense in Depth
One of the fundamental tenets of nuclear power plant design is “Defense in Depth.” This approach provides a plant design that can withstand severe catastrophes, even when several systems fail. Just as I was trained thirty years ago, the operators at the plant were trained to believe the safety margin provided by defense in depth engineering sufficiently lowered reactor facility risk and the possibility of a catastrophic accident.
When the crisis struck, the workers first tried to make sense of the situation and looked for reasons to enable them to stay the “normal” course. Their reasons were drawn from institutional training, expectations and acceptable justifications. The operating crew of the plant expected that their defense in depth systems design margin would mitigate and control the situation. When these reasons did not help, then sensemaking helped them identify alternative actions.
In the first hours of the crisis the workers’ identities were also challenged. They began their shift as highly trained nuclear technicians who monitored system parameters and worked to a strict set of procedures whose step-by-step actions had been carefully analyzed. Immediately after disaster struck they transformed into courageous soldiers facing unimaginable scenarios that required them to think and act independently with minimal communications and ingeniously employ every means necessary to prevent disaster. Defense in depth had broken down and the unfolding crisis was under the direct control of the human action of those working on shift that day. The following statements were made by workers in the control room (Tateiwa, 2012; TEPCO, 2011b):
“On March 11, 2011 at 14:46, a large earthquake struck the Fukushima Daiichi NPS. When the earthquake struck, we took refuge under desks and I told operators to hold on. As soon as the earthquake subsided, I could see a green light from my position indicating that a scram had already begun. I confirmed that the emergency power diesel generator had started up and was running and parameters in the main control room were OK, so I thought that the worst was over.”
“After this (around when the tsunami arrived), power lights began to flick, and then I saw they all turned off. The emergency power was shut off, and all of the lights on the main control room panel started to turn off. I did not know what happened. My fears were confirmed when operator was running into the main control room and yelling we’re being flooded with seawater”.
“In an attempt to check the status of Unit 4 diesel generator, I was trapped inside the security gate compartment. Soon the tsunami came and I was minutes away from being drowned when my colleague smash opened the window and saved my life.”
“As the tsunami engulfed us, the emergency power became unusable and lights in the main control room were reduced to one emergency light (making it possible to just barely see within the darkness).”
“We lost the power, and I felt that we could not do anything. The other operators looked nervous. They yelled, “we can’t do anything, why are we still here!?” However I bowed my head and asked them to remain and they did.”
Factors Influencing Sensemaking: Communications and Environmental Conditions
The more information a person has during an event, the more likely they will be able to make the appropriate changes needed to avoid or lessen the crisis. When a crisis unfolds the initial response can have dire consequences for those involved because it sets off a series of actions in a context that allows little room for error (Weick, 1988). In this case harsh environmental conditions and minimal communications hampered the operators’ ability to piece together data and understand what was happening and thus determined the trajectory of the crisis.
Communication became difficult. The site emergency response center lost all remote monitoring capability to determine the status of the plant. Mobile phones were normally used for communication within the power station; however, these could not be used due to the loss of power. Communication between the main control room and the site emergency response center at the power station was restricted to two land lines (TEPCO, 2011b). Apart from some cases in which the radios on fire engines were available, information in the field could not be obtained until workers who went to the field returned to report the conditions (TEPCO, 2011a).
It took a great deal of time to retrieve any data on the reactor’s condition and the data was limited. Additionally, some equipment was exposed to conditions that greatly exceeded the environmental conditions that it was designed for. In many cases it was difficult to understand plant status based on independent instrument readings (TEPCO, 2011a). Amidst these circumstances, the power station, utilizing its accumulated knowledge and experience, came up with response actions to inject water into the reactor and open vent valves to stabilize the power plants, and implemented these measures under extremely poor conditions in the field. The following statements were made by workers in the control room (Tateiwa, 2012; TEPCO, 2011b):
“Radiation levels in the main control room rose therefore the Shift Supervisor ordered us to put on the charcoal filtered masks and protective suits. We moved closer to the Unit 2 side where radiation levels were lower and continued to monitor the situation.”
“That was only way to restore the instruments at that time due to loss of time. Car batteries were begun to gather. However, carrying the batteries was difficult due to their weight. It was the worst situation ever.”
“Because the power was lost, we had to vent by manually opening the valves. However, due to high radiation exposure in the field we had to gather who could engage in venting work, and the Shift Supervisor allocated each team. Even though we had full protective gear, the radiation levels were quite high therefore we did not let young operator go.”
“In total darkness, I could hear the unearthly sound of the safety relief valve dumping steam into the torus. I stepped on the torus to open the suppression chamber valve, and my rubber boot melted”
“Unit 3 could explode anytime soon, but it was my turn to go to the main control room. I called my dad and asked him to take good care of my wife and kids should I die.”
Factors Influencing Sensemaking: Capacity and Teamwork
Clearly the power plant’s capacity to respond was severely depleted. The number and experience base of workers available to respond and interpret the situation was limited and there was an immediate contraction of authority that reduced the overall level of competence directed at the problem as well as an overall reduction in the use of action to develop meaning. However, those left to respond at the station were the most competent to directly deal with the crisis and the reduction in communication at the field level necessitated more independent thinking that likely accelerated the decision making process.
The workers were focused on protecting the integrity of the reactor vessel and containment by venting steam to control pressure and injecting water to keep the fuel covered. These justifications provided a common goal and probably sufficient structure for people to get their bearings and then create fuller, more accurate views of what was happening.
Opportunities to Strengthen Reliability
Reliability contends that safe operations are possible through the use of organizational design and management techniques such as decentralized decision making, redundancy and uniformity, training and organizational learning. Anticipatory and resilience mechanisms stress the collective capacity of the group to compensate for individual weaknesses. Understandably, the post Fukushima accident analysis has focused on design and severe accident management to prevent future accidents. Have we adequately evaluated the human element?
Weick (1988) argues that by striving to make technology operator-proof we move the dynamics of enactment to an earlier point in time where incomplete designs are enacted into unreliable technology. To complement technology, it is equally important to consider strengthening the anticipatory and resilience mechanisms used by workers. Expanding on research by Weick and others, I suggest that accident response could have been enhanced by actively cultivating a collective mindset that builds upon institutional memory. The collective mindset is able to hold an integrated big picture of events in the moment. There is a high level of situational awareness among members that allows for a healthy response to novel situations and crises (Weick, Sutcliffe, & Obstfeld, 1999). Institutional memory allows individuals to see alternative possibilities and make connections.
Weick (1988) pointed out that people can see only those categories and assumptions that they store in cause maps built up from previous experience. If those cause maps are varied and rich, people should see more, and good institutional memory would be an asset. However, if cause maps are filled with only a handful of overworked justifications, then perception should be limited and inaccurate, and good memory would be a liability. A collective memory in reactor operations could be cultivated by taking more steps to assure that the composition of the operating crews has the right blend and depth of collective experience as well as overlapping knowledge.
In addition, the use of and familiarity with standard operating procedures are fundamental in the training of reactor operators but should be considered from the perspective of building the capacity to respond effectively to a potentially diverse and changing set of stimuli (mindfulness). Levinthal and Rerup (2006) maintained that the effectiveness of the process of mindfulness is dependent on the richness of the set of well-rehearsed routines available for the construction of novel recombinations.
Finally I suggest that more emphasis should be placed on the unanticipated. Weick (1988) argued that to increase resilience it is important to look for and exaggerate all possible human contributions to crises in order to spot previously unnoticed contributions that can be leveraged in the future. The relative importance of such exaggeration could be discovery of unexpected places to gain control over crises. Encouraging people to think out of the “box” allows them to discover potential crises of which they may be the primary agents of control. For reactor operators this means performing mental drills that go beyond the conventional simulator scenarios.
The tsunami of March 11th was beyond all expectations. The extreme difficulties that the operators on the site faced at Fukushima Daiichi were incredible: loss of all the safety systems, loss of practically all the instrumentation, lack of human resources, lack of equipment, total darkness inside buildings, tsunami debris, hydrogen explosions and high levels of radiation. The image unfolding before the operating crew on March 11, 2011 was beyond their ability to envision and challenged their ability to respond. Despite the inadequacies of the defense in depth provisions, the responders collectively made sense of the situation and took action that ultimately prevented catastrophic failure of the reactor vessel.
The Fukushima accident has recharged the debate over whether tightly coupled, highly complex systems such as nuclear power plants can mitigate catastrophic accidents through reliability. The nuclear power industry’s efforts to enhance the capability for reliability were evidenced in the Fukushima response, but I suggest that there are additional steps that can be taken to increase the capacity of a crew of operators to respond to a catastrophic event and help to mitigate the inevitable consequences of complex technology.
International Atomic Energy Agency (2011). Mission Report: The Great East Japan Earthquake Expert Mission. IAEA International Fact Finding Expert Mission of the Fukushima Dai-ichi NPP Accident Following the Great East Japan Earthquake and Tsunami. Retrieved from http://www-pub.iaea.org/MTCD/Meetings/PDFplus/2011/cn200/documentation/cn200_Final-Fukushima-Mission_Report.pdf
International Atomic Energy Agency (2012). One year on: The Fukushima Nuclear Accident and Its Aftermath. Retrieved from http://www.iaea.org/newscenter/news/2012/fukushima1yearon.html
Tateiwa, K. (2012, July). Fukushima Nuclear Accident: A TEPCO Nuclear Engineer’s Perspective. Presented at the AAHP course at the Annual Meeting of the Health Physics Society, Sacramento, CA.
Levinthal, D., & Rerup, C. (2006). Crossing an apparent chasm: Bridging mindful and less-mindful perspectives on organizational learning. Organization Science, 17(4), 502-513.
Tokyo Electric Power Company (2011a). Fukushima Nuclear Accident Analysis Report (Interim Report). Retrieved from http://www.tepco.co.jp/en/press/corp-com/release/betu11_e/images/111202e14.pdf
Tokyo Electric Power Company (2011b). Fukushima Nuclear Accident Investigation Report (Interim Report – Supplementary Volume). Retrieved from http://www.tepco.co.jp/en/press/corp-com/release/betu11_e/images/111202e16.pdf
Weick, K. E. (1988). Enacted sensemaking in crisis situations. Journal of Management Studies, 25(4), 305-317.
Weick, K. E., Sutcliffe, K. M., & Obstfeld, D. (1999). Organizing for high reliability: Processes of collective mindfulness. In R. I. Sutton & B. M. Staw (Eds.), Research in Organizational Behavior (Vol. 21, pp. 81-123).