L-SYS

Byzantine Fault Tolerance Explanation

Generally, electric Vertical Takeoff and Landing (eVTOL) aircraft have propulsion systems that cannot tolerate a total loss of vertical thrust greater than 25% following a single failure. This type of propulsion system requires a quadruplex rather than a triplex system architecture unless end system voting is implemented. End system voting is challenging, and it is not possible if timing and coordination are required.
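The arithmetic behind the triplex versus quadruplex choice can be sketched as follows. This is a minimal illustration that assumes vertical thrust is shared equally between channels and that a failed channel loses its entire share; the names thrust_loss_fraction, is_tolerable and MAX_TOLERABLE_LOSS are hypothetical and do not come from any particular aircraft program.

```python
# Assumption: thrust is split evenly across channels and a failed
# channel contributes no thrust at all.
MAX_TOLERABLE_LOSS = 0.25  # the 25% limit stated in the text


def thrust_loss_fraction(channels: int, failed: int = 1) -> float:
    """Fraction of total vertical thrust lost when `failed` channels drop out."""
    return failed / channels


def is_tolerable(channels: int) -> bool:
    """True if a single channel failure stays within the thrust-loss limit."""
    return thrust_loss_fraction(channels) <= MAX_TOLERABLE_LOSS


for name, channels in (("triplex", 3), ("quadruplex", 4)):
    loss = thrust_loss_fraction(channels)
    print(f"{name}: single failure loses {loss:.1%}, tolerable={is_tolerable(channels)}")
```

Under these assumptions a single triplex channel failure removes about a third of the thrust, which exceeds the limit, while a single quadruplex channel failure removes exactly a quarter.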

To explain why, this article first discusses the concept of Byzantine fault tolerance and then gives pictorial representations of system architectures that have different levels of fault tolerance, including Example 3, which is fully Byzantine fault tolerant.

Byzantine Fault Tolerance

Imagine that the grand Eastern Roman Empire, also known as the Byzantine Empire, has decided to capture a city. There is fierce resistance from within the city, and the Byzantine army has completely encircled it. The army has many divisions, and each division has a general. The generals communicate with each other, and with the lieutenants within their divisions, only through messengers.

All the generals or commanders have to agree upon one of two plans of action: the exact time to attack all at once or, if faced with fierce resistance, the time to retreat all at once. The army cannot hold on forever. An attack or retreat without full strength means only one thing: an unacceptable, brutal defeat.

If all generals and messengers were trustworthy, the solution would be very simple. However, some of the messengers and even a few generals or commanders are traitors: spies or even enemy soldiers. There is a very high chance that they will not follow orders or will pass on an incorrect message.
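The effect of a single traitor can be illustrated with a small sketch. It assumes three generals who each follow the majority of the orders they know about, with G1 and G2 loyal and G3 possibly a traitor; the function names and the "attack"/"retreat" strings are hypothetical and chosen only for illustration.

```python
from collections import Counter


def decide(own_order, received_orders):
    """A loyal general follows the majority of all the orders it knows about."""
    votes = [own_order] + list(received_orders)
    return Counter(votes).most_common(1)[0][0]


def loyal_decisions(g3_to_g1, g3_to_g2):
    """Decisions of loyal generals G1 and G2, given what G3 tells each of them."""
    g1_own, g2_own = "attack", "retreat"
    g1 = decide(g1_own, [g2_own, g3_to_g1])
    g2 = decide(g2_own, [g1_own, g3_to_g2])
    return g1, g2


# G3 is honest and sends the same order to everyone: the loyal generals agree.
honest = loyal_decisions("attack", "attack")

# G3 is a traitor and sends conflicting orders: the loyal generals split.
traitor = loyal_decisions("attack", "retreat")

print("honest G3: ", honest)
print("traitor G3:", traitor)
```

In this sketch the traitor never needs to send an obviously invalid message. Sending different but plausible orders to different generals is enough to break agreement between the loyal ones.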

Example 1: System Architecture That Can Tolerate <33% Failure of a Function

This aircraft level system architecture is insufficient for most eVTOLs.

  • A Cyclic Redundancy Check (CRC) is assumed between nodes. Therefore, corruption between transmitter and receiver would require a failure of the CRC.
  • Messages can be transmitted directly
  • Messages can be transmitted indirectly via intermediary equipment, e.g. a network switch
  • Each channel acts upon the same data. Therefore, with the exception of the failed channel, exact consensus is achieved
  • This architecture is adequate unless the failed channel must be prevented from transmitting an erroneous output
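The scheme in the list above can be sketched as follows. This is only an illustration under stated assumptions: messages are protected with a CRC-32 from Python's zlib module, every channel receives the same three messages, and the channels use a simple majority vote. The helper names (frame, unframe, vote) and the "thrust=..." payloads are hypothetical.

```python
import zlib
from collections import Counter


def frame(payload: bytes) -> bytes:
    """Append a CRC-32 so the receiver can detect corruption in transit."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def unframe(message: bytes):
    """Return the payload if the CRC matches, otherwise None."""
    payload, crc = message[:-4], int.from_bytes(message[-4:], "big")
    return payload if zlib.crc32(payload) == crc else None


def vote(payloads):
    """Strict majority over all channels; None if no value reaches it."""
    valid = [p for p in payloads if p is not None]
    if not valid:
        return None
    value, count = Counter(valid).most_common(1)[0]
    return value if 2 * count > len(payloads) else None


# Hypothetical triplex exchange: channel C has failed and broadcasts a wrong value.
broadcast = {"A": b"thrust=70", "B": b"thrust=70", "C": b"thrust=10"}
received = [unframe(frame(p)) for p in broadcast.values()]

# The healthy channels vote on the same data, so they reach the same result.
voted = {name: vote(received) for name in ("A", "B")}

# A single bit flipped in transit is rejected by the CRC check.
damaged = bytearray(frame(b"thrust=70"))
damaged[0] ^= 0x01
rejected = unframe(bytes(damaged)) is None

print(voted, rejected)
```

Note that nothing in this sketch stops the failed channel C from acting on, or transmitting, its own erroneous value. The vote only ensures that the healthy channels agree with each other.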

Example 2: System Architecture That Can Tolerate <25% Failure of a Function

This aircraft level system architecture is sufficient for most eVTOLs.

  • A similar scheme can be applied to a 4-channel as opposed to a 3-channel architecture
  • There is still the possibility the failed channel can transmit an erroneous output

Example 3: System Architecture That Can Tolerate 0% Failure of a Function (fully Byzantine Fault Tolerant)

This aircraft level system architecture is sufficient for most eVTOLs and electric and hybrid-electric powertrains on larger commercial aircraft. It is actually very similar to the system architecture of Full Authority Digital Engine Controllers (FADECs) that have been used for turbine engines on these aircraft. Also, it is frequently implemented as part of an Integrated Modular Avionics (IMA) system architecture (see Hover, Inc single channel and dual channel IMA computing platforms).

  • A possible solution is for 1A and 1B to be paired and 2A and 2B to be paired
  • 1A and 2B can shut down their paired channels if there is a disagreement. Therefore, the failed channel cannot transmit an erroneous output
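The pairing idea above can be sketched as a pair of lanes that only produces an output while both lanes agree. This is a minimal sketch, not a description of any specific FADEC or IMA product: the class name SelfCheckingPair, the exact-equality comparison and the latched shutdown are all assumptions made for illustration (a real system would typically compare within a tolerance and a time window).

```python
from dataclasses import dataclass


@dataclass
class SelfCheckingPair:
    """Two lanes (e.g. 1A and 1B) whose output is suppressed on disagreement."""
    name: str
    enabled: bool = True

    def output(self, lane_a, lane_b):
        """Return the command while the lanes agree; latch off on the first disagreement."""
        if self.enabled and lane_a != lane_b:
            self.enabled = False  # assumption: shutdown is latched, not self-resetting
        return lane_a if self.enabled else None


pair1 = SelfCheckingPair("1")
pair2 = SelfCheckingPair("2")

# Both pairs healthy: both produce the command.
healthy = (pair1.output(70, 70), pair2.output(70, 70))

# Lane 1B computes a wrong value: pair 1 goes silent, pair 2 carries on.
faulted = (pair1.output(70, 10), pair2.output(70, 70))

# Pair 1 stays silent even if its lanes later agree again.
latched = pair1.output(70, 70)

print(healthy, faulted, latched)
```

The design choice illustrated here is that a failed pair goes silent rather than producing a wrong output, so the remaining pair never has to outvote an erroneous value.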