Re: N-version software compared to others

From:         Tom Speer <speer%do.edw@mhs.elan.af.mil>
Organization: 412th Test Wing / TSFF
Date:         24 Jun 96 00:42:05 

Charles Radley wrote:
> ...It is less effective than some techniques, more effective than
> others.  It can be accomplished  more easily than Formal Methods, but
> is probably less effective....

I missed the beginning of this thread, and with responses from the likes
of Prof Littlewood, you've got it from the authority.  But here's a
yeoman practitioner's experience:

The STOL/Maneuver Technology Demonstrator (an F-15B modified with
canards, 2D vectoring/reversing nozzles, and an integrated
flight/propulsion control system) was the first aircraft in the US to
fly without a dissimilar backup system (the British FBW Jaguar
preceded it).  The aircraft is
now at NASA Dryden as the ACTIVE F-15.  I worked in the SMTD program
office, and was responsible for overseeing the development of the IFPC
hardware and software.  Naturally, we were concerned about the generic
software fault problem, since the software was flight critical and
common to all 4 channels.

I looked into the pros and cons of N-version programming, and ran
across some interesting studies.  The one that impressed me the most
was one in which a couple of universities teamed up to provide something
like 16 independent programmer teams to code software from a common
spec.  I believe the problem given them had to do with SDI decoy
discrimination.  From these teams' products, many triplex combinations
were formed and tested against a "golden" version produced by the
instructors.  The triplex versions were more reliable than single
versions.  However, they also found that the triplex voting did _not_
preclude common software errors and that the software errors were
statistically correlated.

The reason for this striking finding is that all versions were coded
from common requirements (naturally).  Where the requirements
specification is ambiguous or hard to understand, it's difficult for
everyone.  Plus, programmers tend to have similar backgrounds, so if
one person has a misconception, many others are likely to share it.
The result is that people tend to make many of the same mistakes.
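
A rough sketch of the arithmetic behind that finding, using made-up
numbers rather than anything from the study: suppose a small fraction
of inputs hit a "hard" part of the spec where every version is likely
to be wrong, and the rest are "easy".  The names frac_hard, p_hard and
p_easy are hypothetical, and the model pessimistically assumes that any
two wrong versions outvote the correct one.

```python
def majority_fail(p):
    """Probability that at least 2 of 3 versions fail on an input,
    when each fails independently with probability p on that input."""
    return 3 * p**2 * (1 - p) + p**3


def single_fail(frac_hard, p_hard, p_easy):
    """Average failure probability of one version over all inputs."""
    return frac_hard * p_hard + (1 - frac_hard) * p_easy


def triplex_fail(frac_hard, p_hard, p_easy):
    """Average failure probability of a 2-of-3 voter when the versions
    share the same hard inputs (the source of the correlation)."""
    return (frac_hard * majority_fail(p_hard)
            + (1 - frac_hard) * majority_fail(p_easy))


# Hypothetical numbers: 1% of inputs are hard, and on those every
# version has a 50% chance of being wrong.
frac_hard, p_hard, p_easy = 0.01, 0.5, 0.001

p_single = single_fail(frac_hard, p_hard, p_easy)
p_naive = majority_fail(p_single)   # pretends failures are independent
p_actual = triplex_fail(frac_hard, p_hard, p_easy)

print(f"single version:             {p_single:.6f}")
print(f"triplex, assumed independent: {p_naive:.6f}")
print(f"triplex, shared hard inputs:  {p_actual:.6f}")
```

With these assumed numbers the triplex is still somewhat better than a
single version, but far worse than the independence calculation would
suggest, because nearly all of its failures come from the shared hard
inputs.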

So, if you have to convince a certification authority that there is no
possibility of a software fault, then formal methods are your only hope.
You can't rule it out with N-version programming.

That being said, there's a tradeoff between reliability and
availability.  For example, a secondary control system, such as the one
for flaps or slats, has to have high reliability, but its availability
requirement may be less stringent.  It may be highly important that faults be detected so as
to avoid the possibility of asymmetric deployment or retraction.  But it
may be acceptable to have more false alarms which result in the flaps
being frozen in position because the aircraft can still be landed
safely whether they are up or down.  For such an application, N-version
programming may be appropriate because there may be a greater chance of
an error in one or the other channel, but the chance of a common error
is acceptably low.  It's the same as with twin engines: you have about
twice the chance of an engine failure, but a much lower chance of total
thrust loss, compared to a single-engine aircraft.
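
The twin-engine comparison can be worked through in a few lines.  This
is only a sketch: the per-unit failure probability p is a made-up
number, and the two units are assumed to fail independently, which is
exactly the assumption that correlated software errors undermine.

```python
def either_fails(p):
    """Probability that at least one of two independent units fails
    (an engine out, or a channel disagreement that freezes the flaps)."""
    return 2 * p - p * p


def both_fail(p):
    """Probability that both independent units fail together
    (total thrust loss, or an error common to both channels)."""
    return p * p


p = 1e-3  # hypothetical per-unit failure probability

print(f"single unit fails:   {p:.6f}")
print(f"at least one of two: {either_fails(p):.6f}")
print(f"both of two:         {both_fail(p):.6f}")
```

The first figure for the pair is roughly double the single-unit value,
which is the price paid in false alarms or lost availability; the
second is the small product term, and it stays small only while the
failures really are independent.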

The primary control system of a fly-by-wire aircraft must have high
availability because the "fail safe" state is only marginally more
acceptable than the failure state.  In this application, it may be more
desirable to slant the risk toward the undetected failure than to have
false alarms.  So single version software may be the way to go.  The
consequences of failure also mean that considerable resources can be
devoted to the development of this version in order to reduce the
chance that such an undetected failure exists, so formal methods are
appropriate.

On the STOL/MTD, we used an autocode generator to create the control law
code directly from graphical input of the block diagrams.  The autocode
generator used a library of thoroughly validated routines which
corresponded to the diagram blocks.  By eliminating human involvement as
much as possible in the coding, we were taking a step in the direction
of formal methods, although I can't claim anything like formal methods
were used.  So far as I know, this was the first time autocoded software
was used in a primary flight control system.
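
To illustrate the general idea of autocoding from a block diagram, here
is a toy sketch.  It is not the STOL/MTD toolchain: the block types,
the BLOCK_LIBRARY templates, the diagram, and the gains are all
invented for the example.  Each diagram block maps to one fixed,
pre-checked code template, and the generator only fills in names and
constants.

```python
# Hypothetical library: one vetted code template per diagram block type.
BLOCK_LIBRARY = {
    "gain":  "{out} = {k} * {a}",
    "sum":   "{out} = {a} + {b}",
    "limit": "{out} = max({lo}, min({hi}, {a}))",
}


def generate(name, inputs, diagram, output):
    """Emit Python source for a function that evaluates the diagram.
    Blocks must be listed in evaluation order."""
    lines = [f"def {name}({', '.join(inputs)}):"]
    for block_type, fields in diagram:
        lines.append("    " + BLOCK_LIBRARY[block_type].format(**fields))
    lines.append(f"    return {output}")
    return "\n".join(lines) + "\n"


# Invented toy control law: scaled stick input plus scaled pitch-rate
# feedback, limited to a fixed command range.
DIAGRAM = [
    ("gain",  {"out": "stick_cmd", "k": 2.0, "a": "stick"}),
    ("gain",  {"out": "damping", "k": -0.5, "a": "pitch_rate"}),
    ("sum",   {"out": "raw", "a": "stick_cmd", "b": "damping"}),
    ("limit", {"out": "cmd", "a": "raw", "lo": -10.0, "hi": 10.0}),
]

source = generate("control_law", ["stick", "pitch_rate"], DIAGRAM, "cmd")
print(source)

# Compile the generated source and pull out the resulting function.
namespace = {}
exec(source, namespace)
control_law = namespace["control_law"]
```

The point of the design is that the only hand-written code is the
small template library, which can be validated once; everything else is
produced mechanically from the diagram.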

Autocode generators have become pretty much a part of routine control
system synthesis, as much for their convenience as because of the
increased reliability.  They can now produce code which is comparable in
speed and memory to hand coding, so there is little reason not to use
them.

I believe it's possible that formal methods could be developed to the
point where they actually speed up the development process and can be
justified as much on the basis of reducing product cycle times as on
their contribution to safety.  But this will require them to be used
right from the outset in the development of the requirements, and this
culture change may well be a far bigger impediment than the creation of
the actual tools themselves.

Something to look forward to.

TS