Step 1: Securing the Site
Once the emergency response plan has been executed, the first step is to secure the site. Securing the site is first a safety measure: It ensures that no one gets hurt, or that no one else gets hurt. It is just as important to preserve as much of the evidence, or “clues” if you will, as possible. That might mean taking measures to ensure access to the incident site is appropriately restricted and controlled. Remember that an Incident Investigation is much less dramatic than what you would see in, say, an episode of “CSI,” but it is also much more thorough and certainly cannot be wrapped up neatly in an hour like it would be on the television show. It can take days, weeks or perhaps months to get a complete picture of what caused an incident. To get to the bottom of what caused an incident, a thorough Incident Investigation should follow a methodical scientific approach.
Step 2: Witness Testimony
The second step, after the location is secured, is to get eyewitness testimony as soon as possible. You can determine the truthfulness or reliability of the information later, but in the early going it is important to talk to witnesses and hear their accounts while the events are fresh in their minds. Just like any good detective, the investigator must be objective about the circumstances and will need to talk to all the witnesses to determine the who, what, where, when, why and how. Later, you can talk to key people who might not have witnessed the incident but can offer insight into the area where the incident occurred, the technology present, employee and maintenance schedules, etc. In the early going, however, it’s important to hear and document firsthand accounts.
Step 3: Selecting a Lead Investigator
In conjunction with Step 2, if an investigator has not been identified yet, then it is time to choose one. Depending on the significance of the event, it is important to decide whether to use internal or external investigators and determine the investigation team where applicable. In the case of a major event, an external investigator might be preferable. You can also choose a combined internal and external approach to examining the incident. Regardless of what you decide, it’s important to appoint a lead investigator to help the Incident Investigation run smoothly. Often supervisors conduct investigations, but the U.S. Occupational Safety and Health Administration (OSHA) recommends a collaborative approach to an investigation. “To be most effective,” OSHA explains, “investigations should be conducted by a team in which managers and employees work together since each brings different knowledge, understanding and perspectives to an investigation. Working together will also encourage all parties to ‘own’ the conclusions and recommendations and to jointly ensure that corrective actions are implemented in a timely manner.”
Steps 4 & 5: Notifying Authorities and Evaluating Legal & Insurance Issues
We won’t spend too much time on this here, but after you’ve selected your investigative team, it’s important that you contact your legal team, insurance company and, especially, the pertinent regulatory authorities in your country.
In the United States, OSHA, for example, has strict guidelines for reporting injuries and deaths. If there’s a death, it must be reported within eight hours, and if there’s an amputation, loss of an eye or hospitalization, it must be reported within 24 hours. To help get organized, OSHA also has a handy incident investigation form to help companies fill out their documentation. The Canadian Federal Workers’ Compensation Service, on the other hand, requires companies to report incidents within three days; and under the U.K. Health and Safety Executive agency’s Reporting of Injuries, Diseases and Dangerous Occurrences Regulations (RIDDOR) rules, incidents must be reported within 10 days.
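To make those reporting windows concrete, here is a minimal Python sketch that turns them into a deadline calculation. The time limits are the ones quoted above; the authority codes, event labels and the `reporting_deadline` function are our own illustrative names, and counting the window from the time of the incident is a simplifying assumption. Always confirm the current rules with the regulator itself.

```python
from datetime import datetime, timedelta

# Reporting windows as quoted in the text above. The keys are illustrative
# labels invented for this sketch, not official codes.
REPORTING_WINDOWS = {
    ("US-OSHA", "fatality"): timedelta(hours=8),
    ("US-OSHA", "amputation"): timedelta(hours=24),
    ("US-OSHA", "eye_loss"): timedelta(hours=24),
    ("US-OSHA", "hospitalization"): timedelta(hours=24),
    ("CA-FWCS", "incident"): timedelta(days=3),
    ("UK-RIDDOR", "incident"): timedelta(days=10),
}


def reporting_deadline(authority, event_type, occurred_at):
    """Return the latest time by which the event should be reported.

    Assumption: the window starts at occurred_at.
    """
    key = (authority, event_type)
    if key not in REPORTING_WINDOWS:
        raise ValueError(f"No reporting window recorded for {authority}/{event_type}")
    return occurred_at + REPORTING_WINDOWS[key]


occurred = datetime(2024, 3, 4, 14, 30)
print(reporting_deadline("US-OSHA", "fatality", occurred))  # 2024-03-04 22:30:00
```

A lookup table like this is easy to extend with other authorities and keeps the deadlines in one place for the whole team.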
Selecting a Team
Similar to a schoolyard draft for teammates where you’d likely select a soccer/football player rather than a tennis player for a game of kickball, you want to select the best people for the investigative job. That means determining who would be the most helpful to the team by having a good idea of what needs to be investigated. Is it a liquid process incident? Gas? Mechanical? Ask yourself: “What level of expertise would come in handy?” Keep in mind that you might have to pull in different experts at different times during the investigation ranging from process engineers to mechanical engineers to metallurgy specialists, etc.
It’s important to explain to the team not to jump to conclusions. That said, the team’s job is to form a working hypothesis of what caused the incident and why.
Step 6: Develop a Working Hypothesis & Sequence of Events
Now that you’ve tapped a lead investigator, selected a team, interviewed the key witnesses and contacted the appropriate people and regulatory organizations about the event, it’s time to get out your Sherlock Holmes magnifying glass and start the investigation. You’ll want to take lots of photographs, collect as many relevant samples as you can and possibly enlist a disassembly team to take apart equipment that needs to be examined thoroughly. Make sure your team wears the proper personal protective equipment when investigating, and don’t discard anything unless it has been properly documented and examined by the investigative team.
Forming a Working Hypothesis
Before conducting any science experiment, it’s important to develop a hypothesis of what you think happened. The same holds true with any Incident Investigation. For example: “My hypothesis is that the valve on the gas compressor failed because it was old and that caused too much gas to enter the separator vessel. This led to an excessive pressure buildup.” (This is obviously a much simpler hypothesis than what one would actually produce during a real investigation.) The next step is to test the hypothesis or, in the case of an Incident Investigation, figure out why an incident or near-miss occurred.
It’s not uncommon for there to be multiple hypotheses in the beginning of an investigation, but it’s important to boil these down to one working hypothesis through a screening method. One way of doing that is to have the team assign each candidate hypothesis a score or point total based on the evidence and research. For instance, on a 10-point scale, how likely is this hypothesis? If it’s not very likely, the team assigns it a low number, but if it’s very likely, it gets a higher number. Keep in mind that, in many cases, the working hypothesis will evolve as more information comes to light.
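The scoring approach can be sketched in a few lines of Python. This is only one possible way to do the screening: It assumes each team member scores every hypothesis from 1 to 10 and that the team simply averages those scores. The function name `rank_hypotheses`, the hypotheses and the scores are all made up for illustration.

```python
from statistics import mean


def rank_hypotheses(scores_by_hypothesis):
    """Average each hypothesis's 1-10 likelihood scores; most likely first."""
    ranked = []
    for hypothesis, scores in scores_by_hypothesis.items():
        if not scores or any(not 1 <= s <= 10 for s in scores):
            raise ValueError(f"Scores for {hypothesis!r} must be on the 1-10 scale")
        ranked.append((hypothesis, mean(scores)))
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Hypothetical scores from three team members (illustrative data only).
team_scores = {
    "Compressor valve failed due to age": [8, 7, 9],
    "Operator bypassed a pressure alarm": [3, 4, 2],
    "Relief device was undersized": [5, 6, 4],
}

ranking = rank_hypotheses(team_scores)
working_hypothesis = ranking[0][0]
for hypothesis, score in ranking:
    print(f"{score:4.1f}  {hypothesis}")
```

Rerunning the ranking whenever new evidence changes the scores is a simple way to let the working hypothesis evolve with the investigation.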
EXPERT TIP: Most companies focus on the incident itself with good reason, but one area that sometimes gets overlooked is organizing a disassembly team to take apart equipment to be analyzed later. You want to include workers on the team who can handle the job without contaminating or getting rid of any potential evidence about what might have caused the incident.
Validating Your Working Hypothesis
As we mentioned in the beginning of this e-book: Things are not always as they seem. That’s why after establishing a working hypothesis, it’s time to go out and see if it holds true.
While conducting interviews and gathering facts, begin the process of event sequencing. This is a critical step in determining what happened first, second, third, etc. This will help you sniff out the prime root cause, which we will discuss shortly. Every incident has a sequence of events that took place to cause it. Think of it this way: If you’re in a car accident, there are multiple things that had to happen for the accident to take place. Maybe you were driving 30 mph, but the car behind you was tailgating. A deer ran out into the street, so you hit the brakes to avoid it. The car behind you was too close, so it couldn’t stop in time. This is a simple example, but remember that an Incident Investigation can often be quite complex.
“Eliminating the immediate causes is like cutting weeds, while eliminating the root causes is equivalent to pulling out the roots so that the weed cannot grow back.” —OSHA’s “Incident Investigations: A Guide for Employers”
One commonly used playbook for event sequencing is the Sequentially Timed Events Plotting (STEP) process developed by Kingsley Hendrick and Ludwig Benner Jr. in the book “Investigating Accidents With STEP.”
In figure 3 below, you will find an example schematic for part of a gas compression train. Figure 4 on the opposite page gives an example of the STEP process being used to “describe” a rupture of the V-2 vessel. The description is based on the working hypothesis of the investigative team, supported by the factual data and the established timeline of the incident.
As the book states, “[A]ccidents should be investigated in a way that is compatible with the way a productive process is designed. All processes involve actors, whether people or things, who act to introduce changes. Those actions are called events. Changes of state occur as events interact during processes. Processes start with the first event which initiates a change of state and end when a new state or outcome has been reached.”
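The idea of actors and events can be sketched as a small data structure. The Python below is a loose illustration inspired by that description, not the STEP worksheet as the book defines it. It reuses the car accident example from earlier in this step, and the timings, the `Event` fields and the `build_worksheet` helper are all assumptions made for the example.

```python
from collections import defaultdict
from typing import NamedTuple


class Event(NamedTuple):
    t: float      # seconds from an arbitrary start (illustrative values)
    actor: str    # the person or thing that acted
    action: str   # what the actor did


def build_worksheet(events):
    """Group events into one time-ordered row per actor."""
    rows = defaultdict(list)
    for event in sorted(events, key=lambda e: e.t):
        rows[event.actor].append(event)
    return dict(rows)


# Events recorded in the order witnesses mentioned them, not in time order.
events = [
    Event(2.0, "Lead driver", "brakes hard"),
    Event(0.0, "Following car", "tailgates lead car"),
    Event(1.5, "Deer", "runs into the street"),
    Event(2.8, "Following car", "fails to stop in time"),
]

timeline = sorted(events, key=lambda e: e.t)   # what happened first, second, third
worksheet = build_worksheet(events)            # one row per actor
for actor, row in worksheet.items():
    print(f"{actor}: " + " -> ".join(f"[{e.t:.1f}s] {e.action}" for e in row))
```

Keeping a separate row for each actor makes gaps easier to spot, such as a stretch of time in which nobody has documented what a key person or piece of equipment was doing.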
Step 7: Finding the Prime Root Cause and Assigning Relevant Corrective Actions
There can be any number of root causes when investigating an incident, but there will be just one prime root cause. And this is what investigators are after—answering the “how” and the “why” of the incident. There’s a lot of literature out there on the topic going back decades, including Herbert Heinrich’s “five domino model”—a sequential accident model designed to determine what caused an accident—and hazard and barrier analysis, which offers strategies to prevent accidents. The goal is, to use OSHA’s terminology, to do more than “cut the weeds” that caused an incident and, instead, “uproot” it completely by determining the root cause.
Today, we often see companies employ the “5 Why” methodology discussed in Figure 3, Causal Tree Analysis, Fishbone Diagrams (also known as Cause and Effect Diagrams) and Fault Tree Analysis (FTA). The goal of these methodologies is to get to the root cause of the incident beyond the easy-road answer of saying “It was a human error” or something to that effect. Keep in mind that we’re not saying human errors don’t play a role in an incident, but it’s important to go beyond the obvious to establish things like “Was the operator properly trained?” and “Were there changes involved in the process?”
Figure 4: Why x 5
When a child asks “why” over and over again, it might grate on the nerves a bit, but keep in mind that asking lots of questions is part of a child’s learning process. The same question-and-answer approach can help an investigator learn what the root cause of an incident or near-miss was, so don’t be afraid to ask lots of questions.
Investigations often don’t go far enough in identifying the root cause. In fact, they often end when it is determined that someone can be blamed for an incident. Sometimes the root cause is listed as “operator made a mistake while following procedure X,” and the action listed is to “provide more training.” But the root cause has not been identified: Why was the mistake made? What other underlying factors were in play?
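To show what pushing past the first answer looks like, here is a minimal Python sketch of a “why” chain. The starting statement comes from the example above, but every answer in the chain is hypothetical and invented for illustration, as is the `ask_why` helper. In a real investigation, each answer has to be backed by evidence.

```python
def ask_why(problem, answers, max_depth=5):
    """Follow recorded answers to "why?" from a problem statement.

    Returns the chain of causes, stopping after max_depth answers or when
    no further answer has been recorded.
    """
    chain = [problem]
    while len(chain) <= max_depth and chain[-1] in answers:
        chain.append(answers[chain[-1]])
    return chain


# Hypothetical answers gathered by the team (illustrative only).
answers = {
    "Operator made a mistake while following procedure X":
        "The procedure step was ambiguous",
    "The procedure step was ambiguous":
        "The procedure was not updated after an equipment change",
    "The procedure was not updated after an equipment change":
        "The change was made without a management-of-change review",
    "The change was made without a management-of-change review":
        "No one was assigned to own that review",
}

chain = ask_why("Operator made a mistake while following procedure X", answers)
for depth, cause in enumerate(chain):
    print("  " * depth + cause)
```

In this made-up chain, “provide more training” would address only the first line, while the last line points to a gap in how changes are managed.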
Establishing a root cause is the key to putting a meaningful corrective plan in place to ensure the incident does not occur again. However, don’t underestimate the possibility that multiple root causes led to the incident. There are a number of causation theories to consider. Heinrich’s Domino Theory of Causation says accidents are the result of a chain of events that fall in line much like dominoes. The Multiple Causation Theory is similar but says there could be multiple contributing causes and subcauses that led to the incident. There are also several root-cause theories, including the aforementioned “5 Whys.”
A serious incident is traumatic for everyone involved, especially for the people who witnessed the event and/or knew someone who was hurt or even killed. Often, an investigator has to be the shoulder to cry on while gathering as much information as possible from the key witnesses. Being compassionate yet thorough is most important. It’s not an easy job, but it’s an important one. It’s incumbent upon investigators to follow and enforce strict protocol to ensure the integrity of the investigation. In the end, the goal is to not only figure out what caused the incident but also to implement the proper procedures, protocols, etc., to ensure it doesn’t happen again. What we know is that if proper attention is not paid to causal factors and contributory causes, it will negatively impact the amount of knowledge gained from the incident.
In our experience, companies often fail to follow through on implementing prescribed corrective actions, and not doing so can compromise the future safety of the organization. An investigator cannot force a company to implement corrective actions following an incident, but an investigator can offer the company critical steps it can take to help prevent future incidents and explain the importance of taking those steps. It’s then up to the organization to take the reins on mitigating any potential future risks.
Without taking the proactive steps prescribed based on the Incident Investigation, the case will never be properly closed.
“It is not necessary to be solely aware of how an incident occurred without a detailed understanding of the mechanisms of occurrence and how it might have been prevented.” —Nigel Hyatt, P.Eng. (Ontario), chartered engineer U.K., member of the Institution of Chemical Engineers, retired Process Safety Management expert