Beyond Human Failure
Avoiding the oversimplification of accidents
When an employee falls from height, it always means great suffering for many people. Such incidents frequently result in severe injuries or even death.
What is particularly bitter for the workers who survive and for the grieving families of those who do not is that very often these accidents could have been prevented in the first place.
The following cases are drawn from practical consulting experience.
A worker carried out his work on a facade using travelling scaffolding, which had a work platform made of wooden boards at a height of about 10 metres. He had received a formal work permit for this job and a supervisor was present at the site. When a consultant observed the work during a routine visit with the factory manager, he advised the manager to stop it immediately, after establishing the following facts:
- The simple boards of the platform, at a height of 10 metres, started to bend under the weight of the worker.
- The base of the scaffolding was too small in relation to its height, and the scaffolding was secured against tipping over only by a single thin wire attached to the facade.
- There was no ladder at the scaffolding.
- The workers had to use the rungs of the scaffolding for climbing up and down. They wore safety harnesses for fall protection, which were supposed to be hooked onto the rungs of the scaffolding. However, because the rungs were larger in diameter than the hooks, the hooks did not close around the rungs properly.
Consequently, the factory manager stopped this work before an accident could happen. The consultant made him aware of the following dangers and the manager realised the risks himself:
- Falling hazard for the worker when climbing up or down
- Danger of the scaffolding tilting over
- Danger of breaking through the wooden boards
Overhaul work took place in a coal-fired power plant. For this purpose, workers installed hanging scaffolding inside the steam boiler at a height of about 50 metres, including an inserted floor made of wooden panels. Electric lamps served as lighting. After finishing the repair work, the workers began to remove the temporary assembly. The team took out one wooden panel and used the opening to throw down surplus plastic material. After finishing this job, they neither closed nor marked the opening, nor did they document its existence. Fate, unfortunately, took its course.
Workers on the following shift continued the disassembly, unaware of the opening. Because of the black walls and the great height of the boiler’s ceiling, the lighting was very weak. In the course of the shift, a worker stepped into the opening and fell 50 metres.
Technicians had to carry out repair work on the service storey of a factory. The plant components that required repair were in an area with a pipeline shaft. The shaft contained several pipelines, all of which ran vertically through every storey up to the top of the building. The service floor was clean and well lit. The risk assessment for this area of the storey had identified a fall hazard. Therefore, a permanent barrier was installed as a technical means of stopping people from entering. In addition, a warning sign was put up pointing out the risk. At the time of the accident, the foreman was seeking approval from the division manager to enter this area and work there. Meanwhile, the other technician, left on his own, put on his harness and crossed the barrier without hooking in. He fell from a height of more than nine metres.
As part of a routine job for a client, a team of workers was to erect a working platform inside a high tank, on the tank’s internal installations. To do this, the first worker had to enter the tank from a stable platform outside via an inspection manhole. He carried out preparatory activities while standing on the installations at great height. According to the safety regulations, he should have been wearing a safety harness, properly attached outside before he entered the opening. A second worker should also have been present as flagman.
The team arrived at the site and unloaded the tools. When everything was ready for work in the tank, the colleagues suddenly realised that the safety harnesses were still down in the car. One worker, relying on his experience from previous assignments in this tank, entered. He slipped and fell.
Is human failure the single cause for these accidents? No, this would be too simple an explanation.
When we start looking at the causes of these accidents, we can easily jump to the conclusion that human failure was the main reason in all examples:
- In Finland, safety experts and supervisors did not realise the falling hazards or real danger of tilting – they did not integrate the necessary measures into the existing risk management system
- In Poland, after finishing the task, the team did not secure the opening in the floor properly, nor did they inform the colleagues about it
- In France, a worker ignored the existing warning signs, relying instead on his own risk evaluation
- In Germany, several men worked together in disregard of the safety rules, relying simply on their own previous experience
One tends to explain the incidents by the misconduct of the people involved:
- “If only the responsible managers in Finland had reflected on the situation more and asked themselves whether the safety measures were sufficient!”
- “If only the team in Poland had put more effort into securing the floor opening after they had finished their job. This way they would have technically prevented the falling hazard!”
- “If only the technician in France had trusted the risk assessment of the experts by hooking in his safety harness!”
- “If only the team in Germany had made each other aware of the falling hazard, thus preventing one colleague from climbing into the vessel without a safety harness!”
If we just ticked ‘human failure’ in the accident investigation report, however, and carried on with our daily routine, we would be taking the easy way out. “Human failure is never the cause, but always the consequence of a deeper cause.” (James Reason)
Indeed, we easily recognise the human factor in the actions or omissions that led to these falls from height. Yet the four cases differ significantly in the reasons behind the behaviour. Recognising this important fact inevitably leads to different measures for preventing a recurrence. This means we have to look closely at the deeper reasons that allowed or even promoted these dangerous behaviour patterns.
In Finland, the responsible managers acted in the firm belief that they had established a safe working environment for the worker on the scaffolding. They were even proud of their management system, which provided elaborate risk management for their factory: from wearing safety harnesses and requiring written approval to on-site supervision. Only when they learned about the details did they realise that the hooks of the harnesses were too small for the rungs. Moreover, they recognised that the wire securing the scaffolding to the building was not strong enough. In this case, the failure clearly lies in the lack of knowledge among all experts involved. Because of this lack, they were not able to check the effectiveness of the measures.
Apparently, the quality of the training of the responsible managers is the real cause of the risk that has now been uncovered. It represents a latent weakness of the entire safety management system, for preventive measures can only be as good as the experts who carry out the risk assessments and define the safety measures. Consequently, improving qualifications is the key to enhancing the subsequent levels of the system. In the organisation of this example, it is common practice to define, implement and supervise risk-minimising measures when risks are recognised.
In Poland, the case is more complicated. Since an opening at a height of 50 metres presents an obvious danger, we must conclude that the workers removing the floor panel were aware of the high risk. Despite this, they repeatedly went to the edge of the opening while doing their work. When throwing down plastic material, they themselves did not fall. They were aware of the danger and behaved in a self-assured manner. However, we can speak of a latent weakness of the safety management system in this case, as the complete team left the site despite being fully aware of the hazardous opening left behind.
Looking further into the work processes of this specific shift, we find even more weak points: there were no detailed guidelines or instructions for removing the plastic components. It was the team itself that decided to speed up the workflow by taking out a floor panel so that the material could be dropped quickly. They neither conducted a risk assessment nor sought advice from a safety expert. For these reasons, they never secured the floor opening, although throwing down plastic components posed a real danger for this shift as well. Moreover, lifting the parts over a fall protection barrier of reasonable height would not have posed an ergonomic challenge.
Throughout this work, the shift supervisor did not question the way the team worked, nor did any member of the management become aware of it. Thus, the opening that remained, undocumented and unmentioned, was only one more link in a chain of latent failures of the whole system, starting with the lack of a concrete plan for the work. We conclude that the failure is symptomatic of an organisation in which safety at work has a low significance.
Top management has obviously given safety a low priority. Line management has adopted this low priority without reflecting upon it. In this way, they created the basis within the management system for unsafe behaviour through improvisation and for dangerous conditions such as missing fall protection. Corrective measures have to start with top management, which must assign safety the necessary importance and implement a completely new safety management system.
When we look at the example in France, we notice a different kind of failure. In this case, a risk assessment had been carried out beforehand and had identified the risks adequately. Risk-minimising measures had been implemented, both technical (the barrier) and organisational (the approval system). In spite of these exemplary measures, one worker decided to ignore and violate the existing rules. Unlike mistakes, which happen subconsciously and involuntarily, this violation was a conscious and voluntary act. The example thus indicates that the cause lies in the person’s distorted perception of risk, combined with a disregard for authority and expertise. Rule violations cannot be prevented completely. However, we can create conditions that make them more difficult. In the example above, the service company to which the worker belonged lost the current contract. It was also barred from any future work on the client’s factory site. The client rightly justified this decision by the service company’s failure to create a unified sense of safety among its employees.
The case in Germany is particularly sad, as it makes obvious that the team knew and tolerated the violation of safety rules. The way to the car was far, it was a public bank holiday, and the work was an unplanned repair. It was certainly not a coincidence that none of the workers had a safety harness at the working site. The foreman did not even order the workmen to go and fetch their safety equipment.
In this context, we would like to quote James Reason again: “Human behaviour is never the cause, but always the consequence of a deeper cause.”
The previous example illustrates that several causes led to this conscious misconduct.
On the one hand, the case reveals a complete underestimation of risks combined with an overestimation of one’s own abilities. Each worker in this service team had been in the tank once or even several times to install a working platform. They all thought they knew the site, and especially the structure of the tank, well enough. Moreover, they relied on their long experience and on their ability to work at great height without making mistakes.
On the other hand, the team had been newly put together for this assignment because of the public bank holiday. Many colleagues had already taken this day off. The team leader, although he knew all the men, had never been their leader before. The same applied to the workers: they knew each other, but had never worked as a team before. Thus, the fact that the safety equipment was missing at the site and that one worker entered the vessel without a safety harness is clearly the result of a lack of responsibility on the part of the team leader. At the same time, apparently no member of the team was brave enough to stop the colleague’s wrong behaviour, to which the leader tacitly consented. Strong, responsible leaders give high priority to safety, issue clear orders on safety rules and demand that their team members abide by these rules.
All these cases have one aspect in common: the persons acted according to their own assumptions, which defined their actions. In Finland, the managers assumed they had implemented all necessary measures. In Poland, the workers assumed that covering the floor opening was not their duty, and that the opening would probably not present a problem until the disassembly was finished. The workers in France and Germany assumed their know-how and capabilities were sufficient for mastering the risks.
However, the established processes and routines created blind spots in terms of risk awareness, resulting in underestimating the real dangers. Expectations are notions based on thoughts, steering our way of thinking and acting. These notions are responsible for bringing order into our way of thinking; they provide us with criteria for interpreting situations as we perceive them.
Problems arise when we approach situations with ‘routine thinking’. This does not mean that we really understand the problem, only that we interpret it on the basis of familiar criteria. Routines are based on expectations and therefore frequently contain ‘mental traps’. This is simply human nature: we like to expect things that are familiar and prefer to shut out contrary perceptions. We tend to look for confirmation of our assumptions, expectations and decisions, as this gives us a feeling of safety.
At the same time, we avoid or play down evidence to the contrary. We are inclined to do this even more when the situation is complex. Then we tend to simplify matters and rely once again on the familiar. Quite often, we succeed with this!
So, what happens if this is not the case?
High Reliability Organisations
At this point, we would like to draw attention to so-called High Reliability Organisations (HROs). HROs operate in high-risk industries. They have learned to deal with human inadequacies because of their high potential for catastrophe or because they have already gone through painful, catastrophic experiences. For HROs, attentiveness means dealing mindfully and thoughtfully with complex situations, work processes and mistakes.
Fire brigades, for instance, perform services under highly dangerous conditions, in which every mistake can lead to a catastrophe. Operation controllers of fire brigades know that a high percentage of the information that triggers a fire brigade deployment is either incomplete or false. They make assumptions about the real situation in order to act efficiently, but they continuously question these assumptions as soon as they arrive on the scene. In this way they update and correct their decisions and actions based on real and valid information.
Since human beings are not able to predict or qualify each and every situation, companies have to introduce certain routines which help employees to stay aware and prevent them from relying too soon on a single process or a simple solution, which would make them inflexible in unexpected circumstances.
Despite their permanently high risks, well-organised HROs suffer fewer incidents and accidents than other organisations in similar situations.
Even in the best HROs, however, mistakes do happen. It is then crucial for them to learn quickly from these mistakes and to react to the new situation in a flexible way.
As the four previous examples show, risky situations develop because humans are not able to know everything about their respective situations or potential problems. The key to improvement lies in building the organisation in such a way that misperceptions and potential mistakes can be detected at an early stage. Once a potential failure has been recognised, it is of great importance to discuss and assess all of the possible consequences openly, even the apparently absurd ones. This is the basis for acting in a flexible way in any situation.
In most organisations, however, this is not the case, for various reasons:
- Fear of repercussions (“My boss will never forgive me!”)
- Care for a relationship (“My colleague is always willing to stand in for me and to support me! Who knows if he will still do this when I criticise him now?”)
- Clash of interests (“If I do this now, the production will stand still and we won’t be able to achieve our daily target!”)
Our experience has shown that organisations often have to relearn how to hold open discussions. Frequently this involves a long process stretching over all levels of the hierarchy and demanding a great deal of awareness. Nevertheless, the effort is definitely worth it: not only does it save oneself and others from suffering, it also fosters many creative solutions in other areas.
Risk awareness and error detection require experience and up-to-date know-how. Error detection is a crucial step on the way to becoming an HRO. In order to achieve the best possible results in this process, organisations have to employ well-qualified people who are willing to learn. These people have to develop a fine sense for collecting, analysing and processing information on actual work processes and on weak warning signals of potential mistakes. In doing this, they have to become aware that all humans, including themselves, are biased by their assumptions and that these assumptions always have to be questioned to prevent simplification. At the same time, employees should appreciate and respect expert knowledge.
High Reliability Organisations permanently analyse the current state of affairs and look for improvements:
- Do we regularly update our routines and the assumptions on which they are based?
- Do we have routines that demand updates?
- Do we recognise weak signals for unexpected incidents or deviations from our expectations?
- Do we question the adequacy of our assumptions when we detect weak signals?
Checklists have proven to be a valuable tool in daily operations, helping to detect critical and defective situations systematically. They are useful not only in concrete work situations but also for operational processes. By using checklists, we can monitor different kinds of critical circumstances. For instance:
- If there have been important organisational changes recently (e.g. change of manager or team leader)
- If employees are allowed to ask critical questions
- If all employees involved possess the same level of information and knowledge on a situation
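As a rough illustration, the three checks above could be kept as a small, machine-readable checklist. The following is only a minimal Python sketch: the names `ChecklistItem`, `PRE_JOB_CHECKLIST` and `open_points`, the question wording and the yes/no answer convention are assumptions made for this example and do not describe any particular organisation's tool.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ChecklistItem:
    question: str
    # Answer that indicates an uncritical state (assumed convention).
    safe_answer: bool


# Questions paraphrased from the list above; the wording is illustrative.
PRE_JOB_CHECKLIST = [
    ChecklistItem("Have there been important organisational changes recently?", False),
    ChecklistItem("Are employees allowed to ask critical questions?", True),
    ChecklistItem("Do all employees involved have the same level of information?", True),
]


def open_points(answers: list[bool]) -> list[str]:
    """Return the questions whose answers deviate from the uncritical state."""
    if len(answers) != len(PRE_JOB_CHECKLIST):
        raise ValueError("one answer per checklist item is required")
    return [
        item.question
        for item, answer in zip(PRE_JOB_CHECKLIST, answers)
        if answer != item.safe_answer
    ]


if __name__ == "__main__":
    # Example: a new team leader was appointed, everything else is in order.
    for question in open_points([True, True, True]):
        print("Needs discussion:", question)
```

A flagged question is not a verdict. It is a prompt to stop and talk before the job starts, which is the purpose the checklist serves in the text above.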
Additionally, James Reason recommends that managers ask three questions about the human-system interfaces in order to detect unexpected incidents:
- A practical question: Which activities involve the most intense contact between humans and the system, meaning where a mistake can influence the system immediately and directly?
- A question concerning the activities with the highest risk for the system: Which activities have the highest risks?
- A question about frequency: How often are these risky activities carried out?
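One possible way to put these three questions to work is to rate each activity on all three and compare the results. The Python sketch below is an assumption-laden illustration, not a method proposed by Reason: the 1-to-5 scales, the multiplication of the three ratings, the names `Activity` and `rank`, and the example activities with their ratings are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    name: str
    contact: int    # 1-5: how directly a mistake affects the system (assumed scale)
    risk: int       # 1-5: how severe the possible consequences are (assumed scale)
    frequency: int  # 1-5: how often the activity is carried out (assumed scale)

    def __post_init__(self) -> None:
        for value in (self.contact, self.risk, self.frequency):
            if not 1 <= value <= 5:
                raise ValueError("ratings must be between 1 and 5")

    def priority(self) -> int:
        # Multiplying the ratings is one simple choice, not an established formula.
        return self.contact * self.risk * self.frequency


def rank(activities):
    """Return the activities sorted from highest to lowest priority."""
    return sorted(activities, key=lambda a: a.priority(), reverse=True)


if __name__ == "__main__":
    # Hypothetical activities and ratings, for illustration only.
    activities = [
        Activity("Routine inspection from a fixed walkway", 2, 2, 5),
        Activity("Erecting a platform inside a tank", 5, 5, 2),
        Activity("Dismantling scaffolding inside a boiler", 4, 5, 1),
    ]
    for activity in rank(activities):
        print(activity.priority(), activity.name)
```

The resulting order only indicates where managers might look first. The ratings themselves are assumptions and, in the spirit of the article, should be questioned regularly.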
Finally, we would like to quote Roberts and Bea, who identified three characteristics that organisations can implement to enhance their reliability:
- HROs aggressively seek to know what they do not know: investment of resources to train and re-train staff to enhance technical competence and enable them to anticipate and respond appropriately to unexpected events. They also analyse accidents and near misses to identify the types of accidents that happen in the organisation and target the aspects of the system that require redundancies.
- HROs balance efficiency with reliability: they use incentive schemes to balance safety with profits and enable employees to make decisions that are safe in the short-term and profitable in the long-term.
- HROs communicate the big picture to everyone: they have effective communication channels so that they can quickly access expertise in emergencies and communicate the big picture to everyone. They also have well-defined procedures for both normal and emergency situations with well-known decision rules as to when they should be used.
The transformation of an established organisation into an HRO, or the introduction of some HRO principles, can be a long and not always easy path. In the end, it is certainly worth the effort: a more robust and, through continuous improvement, more sustainable organisation emerges, with greater attentiveness and a high level of risk awareness.