Shoddy Software Is Eating The World, And People Are Dying As A Result

from the sometimes-it's-too-late-to-issue-an-update dept

Two recent crashes involving Boeing 737 Max jets are still being investigated. But there is a growing view that anti-stall software used on the plane may have caused a "repetitive uncommanded nose-down", as a preliminary report into the crash of the Ethiopian Airlines plane puts it. Gregory Travis has been a pilot for 30 years, and a software developer for more than 40 years. Drawing on that double expertise, he has written an illuminating article for the IEEE Spectrum site, entitled "How the Boeing 737 Max Disaster Looks to a Software Developer" (free account required). It provides an extremely clear explanation of the particular challenges of designing the Boeing 737 Max, and what they tell us about modern software development.

Airline companies want jets to be as cost-effective as possible. That means using engines that are as efficient as possible in converting fuel into thrust, which turns out to mean engines that are as big as possible. But that was a problem for the hugely-popular Boeing 737 series of planes. There wasn't enough room under the wing simply to replace the existing jet engines with bigger, more fuel-efficient versions. Here's how Boeing resolved that issue -- and encountered a new challenge:

The solution was to extend the engine up and well in front of the wing. However, doing so also meant that the centerline of the engine's thrust changed. Now, when the pilots applied power to the engine, the aircraft would have a significant propensity to "pitch up," or raise its nose.

The solution to that problem was the "Maneuvering Characteristics Augmentation System," or MCAS. Its job was simply to stop the human pilots from putting the plane in a situation where the nose might go up too far, causing the plane to stall -- and crash. According to Travis, even though the Boeing 737 Max has two flight management computers, only one is active at a time. It bases its decisions purely on the sensors that are found on one side of the plane. Since it does not cross-check with sensors on the other side of the plane, it has no way of knowing if a sensor is producing wildly inaccurate information. It assumes that the data is correct, and responds accordingly:

In a pinch, a human pilot could just look out the windshield to confirm visually and directly that, no, the aircraft is not pitched up dangerously. That's the ultimate check and should go directly to the pilot's ultimate sovereignty. Unfortunately, the current implementation of MCAS denies that sovereignty. It denies the pilots the ability to respond to what's before their own eyes.

Like someone with narcissistic personality disorder, MCAS gaslights the pilots. And it turns out badly for everyone. "Raise the nose, HAL." "I’m sorry, Dave, I’m afraid I can’t do that."

The coders who wrote the MCAS software for the 737 Max don't seem to have worried about the risks of using sensors from just one side in the computer's determination of an impending stall. This major design blunder may have cost the lives of hundreds of people, and shows that "safety doesn’t come first -- money comes first, and safety's only utility in that regard is in helping to keep the money coming," according to Travis. But he points out that it also reveals something more general, and much deeper: the growing use of software code that is simply not good enough.

I believe the relative ease -- not to mention the lack of tangible cost -- of software updates has created a cultural laziness within the software engineering community. Moreover, because more and more of the hardware that we create is monitored and controlled by software, that cultural laziness is now creeping into hardware engineering -- like building airliners. Less thought is now given to getting a design correct and simple up front because it's so easy to fix what you didn’t get right later.

Every time a software update gets pushed to my Tesla, to the Garmin flight computers in my Cessna, to my Nest thermostat, and to the TVs in my house, I'm reminded that none of those things were complete when they left the factory -- because their builders realized they didn't have to be complete. The job could be done at any time in the future with a software update.

Back in August 2011, Netscape co-founder and VC Marc Andreessen famously wrote that "software is eating the world". He was almost right. It turns out that shoddy software is eating the world, sometimes with fatal consequences.

Follow me @glynmoody on Twitter, Diaspora, or Mastodon.


Filed Under: 737 max, eating the world, safety, software, unfinished, updates
Companies: boeing


Reader Comments



  • icon
    James Burkhardt (profile), 26 Apr 2019 @ 9:39am

    Video Games are a great example of this phenomenon

    All you have to see to understand this debacle is major AAA video game releases of the last few years. Video games have been pushed out the door incomplete more often because day-one patches can fix the issues, or even with the expectation that the issues can be fixed down the line. Anthem is a prime example: it's a buggy mess with threadbare content. The developers planned to fix it all in post.

    For all the bugs you can find in older games when patches were not an option, AAA releases were more willing to delay release than put out a buggy mess.


    • identicon
      Anonymous Coward, 26 Apr 2019 @ 9:49am

      Re: Video Games are a great example of this phenomenon

      "Embedded" software such as that in cars and planes as well as just about any electronic device you can buy is held to a higher standard. Where there is no room for error the quality benchmark is much more stringent. Still, bugs get through the testing a lot more these days than they used to.

      While it is true that there is no such thing as bug-free software there is no excuse for shipping incomplete and poorly tested product, even video games. The backlash against companies shipping non-life-threatening software is ramping up. That against companies shipping product that affects lives is far more violent. Hopefully we'll see this swelling wave of attention on software quality produce better results in the future. This trend desperately needs a reversal.


      • identicon
        Anonymous Coward, 26 Apr 2019 @ 1:08pm

        Re: Re: Video Games are a great example of this phenomenon

        Back when I was involved with embedded systems development, the standards were already starting to slip, due to the fact that you could push OTA updates instead of having to visit each device with a JTAG unit in order to update them.

        Teslas, for example, really worry me, as it's too easy to broadcast a bad update, or have a bug in the update process itself that results in a faulty update. There's no diagnostic on-site to verify everything is operating in-spec other than the one built in to the device you're updating.

        So I really hope the wave of attention is larger than the wave of both lax coding standards and kneejerk political reaction that's keeping pace.


      • identicon
        Anonymous Coward, 26 Apr 2019 @ 1:26pm

        Re: Re: Video Games are a great example of this phenomenon

        "Embedded" software such as that in cars and planes as well as just about any electronic device you can buy is held to a higher standard.

        Except when it isn't. Present case highlighted.


    • icon
      Wyrm (profile), 26 Apr 2019 @ 11:26am

      Re: Video Games are a great example of this phenomenon

      Definitely my first thought.
      This is particularly obvious when core features break down (e.g. main quest in an RPG, collision detection in... just about any game). These should be evident in a simple play-through. Run through your game once, that's the bare minimum of testing.
      It's a little more acceptable when it's about edge cases (e.g. optional side quest with complicated requirements, jumping into a wall while strafing and using a consumable in an action game). It's less obvious, so I can forgive them overlooking this at launch.

      Now, the article talks about software with real-world life-and-death consequences, but it is indeed a similar symptom of bug-tolerance in software development that should not have been left to grow.


      • identicon
        Anonymous Coward, 30 Apr 2019 @ 7:29am

        Re: Re: Video Games are a great example of this phenomenon

        There's also a problem of a culture of instant gratification that fuels video game glitches; gamers demand games be released instantly and with the latest and greatest of visuals. If you spend additional time playtesting a game for stability and leave it to wait, gamers will potentially move on to something else. There's an intense pressure on video game developers not only to develop quality, but also to develop quickly.

        Indie games tend to be pretty reliable, but that is partly because they're not usually massive in scale, so there are fewer bugs to run into. When it comes to large AAA titles, Japanese companies in general are more reliable at ensuring stability in their coding, and have more rigorous playtesting processes, whereas large American software houses tend to be more lax about this sort of thing.


        • icon
          Uriel-238 (profile), 30 Apr 2019 @ 10:04am

          US gaming industry is not merely lax but outright stupid.

          Recent news of the 100+ hour workweek crunches mandated of devs and QA for Mortal Kombat 11 actually exemplifies what is typical in the current US AAA game industry, even when it's clear that people so overworked will produce shoddier results.

          Games in the US run on the hearts of forsaken children. From what I understand, music and movie studios are pretty similar. CGI labs can be so cruel.


    • icon
      Darkness Of Course (profile), 26 Apr 2019 @ 5:02pm

      Re: Video Games are a great example of this phenomenon

      Oh, you poor thing, poor thing.

      The cost of developing AAA games has blazed new trails - into the clouds. Once games hit billion dollar numbers the suits showed up in force. They only care about spreadsheets, check lists, and the bottom line. Most of them care nothing about gaming, gamers, and the industry itself.

      Support indies. Vote with your wallet.

      BTW, Anthem is a bogus example that most gamers realized was going to be a disaster up front. Not figuring that out is on you. Pay attention.

      Don't buy from EA, Ubisoft, and any publisher that has performed badly FOR YOU. Wait until the game stabilizes. If it doesn't, hey money saved.

      In this day and age, never pre-order.


      • icon
        Uriel-238 (profile), 26 Apr 2019 @ 6:29pm

        Exercise vigilance with gaming

        Part of the problem is that living in our society requires way, way too much vigilance on the part of the end-user. When a company messes up, we're expected to eat the damage. When we mess up, we're expected to pay service fees. These are enforced through Terms of Service that reflect the same sort of unbalanced agreements that were common between the West and the Far East (China, Japan, Korea) at the turn of the 20th century. Thank you, gunboat diplomacy.

        The problem is the same here. The companies have the gunboat. They've agreed with each other to use the same exploits and to reduce competition so that end-users have no real choice in products necessary to live and work in the US. Even the capitalist pretense is a farce, and companies are given the benefit of the doubt whether they release a broken game or crash an airliner with shoddy software.

        It's going to happen again and again and again, because for the public there is no representation in our regulatory agencies. Maybe, eventually, enough people will die that there will be backlash, but actual reform is way out of the question.


  • identicon
    Luke Out, 26 Apr 2019 @ 9:39am

    Don't they have pitch indicators?

    In a pinch, a human pilot could just look out the windshield to confirm visually and directly that, no, the aircraft is not pitched up dangerously.

    First, NO, you can't judge pitch relative to the ground even at low level. You must have an objective indicator. And what if it's cloudy or dark? -- That one sentence is enough to totally discount the rest. It's just simply trivially WRONG.

    Now, my take on the crashes is


  • identicon
    Anonymous Coward, 26 Apr 2019 @ 9:48am

    No evidence the coders didn't worry

    The coders who wrote the MCAS software for the 737 Max don't seem to have worried about the risks of using sensors from just one side in the computer's determination of an impending stall.

    I don't know of any evidence for this. Coders can't just run around making whatever major architecture changes they feel like. Such things have to be approved at higher levels, and for all we know the coders raised concerns and were told it wasn't important or would be fixed in a future update.


  • identicon
    Anonymous Coward, 26 Apr 2019 @ 9:58am

    what's the problem & fix

    hard to figure out the main point and purpose of this article:

    Software Sucks ?

    Software writers Suck ?

    Boeing Sucks ?

    The world now heavily depends upon computer software/firmware/hardware -- it's everywhere.
    Much of it is defective/shoddy when initially used, but most is not critical to life or death issues. Important software is in a constant update evolution cycle.

    Boeing screwed up and killed people. That is not rare. Airplanes have been crashing for over a century from design errors.
    Human error and misjudgements killed people by the many millions throughout history.

    What exactly do you want done ?


    • identicon
      AricTheRed, 26 Apr 2019 @ 10:12am

      Re: what's the problem & fix

      "What exactly do you want done ?"

      Really?, I mean Really? are you effing serious?

      When One builds a product with a SINGLE point of failure that will be FATAL it is a fatal DESIGN ERROR!!!

      yeah, I know I'm shouting, but people don't listen, like you noted in your comment.


      • identicon
        Anonymous Coward, 26 Apr 2019 @ 1:51pm

        Re: Re: what's the problem & fix

        yes, yes -- we and Boeing know MCAS design error caused these 737 crashes.

        Old news.

        However, the article title ("Shoddy Software Is Eating The World,") directly implies some general problem with "Software" worldwide.

        What is that problem ?


        • icon
          Thad (profile), 26 Apr 2019 @ 2:12pm

          Re: Re: Re: what's the problem & fix

          Do you have a complaint about anything in this article besides the headline?


    • identicon
      Moz, 26 Apr 2019 @ 2:54pm

      Re: what's the problem & fix

      The problem is that overwhelmingly, inevitably, when customers are faced with "safe or cheap" they opt for cheap.

      Now, you could argue that that's the result of capitalist propaganda from a very early age, pushing the lie that money is the measure and more is better. But that's a complex message that can't be turned into a 3 word political slogan or 20 second video ad, so it's obviously wrong or irrelevant or both.


      • identicon
        Anonymous Coward, 27 Apr 2019 @ 5:35am

        Re: Re: what's the problem & fix

        "The problem is that overwhelmingly, inevitably, when customers are faced with "safe or cheap" they opt for cheap."

        and then there is a federal certification process in order to mitigate the possibility of design/production errors causing a catastrophe ...... what happened?


        • identicon
          Anonymous Coward, 29 Apr 2019 @ 5:33am

          Re: Re: Re: what's the problem & fix

          Boeing was given a copy of the rubber stamp......


    • identicon
      Henry93, 26 Apr 2019 @ 4:24pm

      Re: what's the problem & fix

      Unlike hardware, software is supposed to do exactly what you tell it to do. Most airplanes that fail do so because the hardware does not work as designed. When software fails, it only fails because of design.

      If hardware was designed to the standards of most software, if you turned your door handle the wrong way, your house would start on fire.

      Software is not hard to make correctly. Correct code is almost always easier to read and understand than incorrect code. And even if you can't understand it, someone else who can, can make an easy-to-use API around that code.


      • identicon
        Anonymous Coward, 26 Apr 2019 @ 6:20pm

        Re: Re: what's the problem & fix

        In software, I doubt there is this correct/incorrect dichotomy to which you refer ... and hardware too.

        Famous saying, not sure who the author is:
        . There is more than one way to do it.


      • identicon
        Anonymous Coward, 27 Apr 2019 @ 2:11am

        Re: Re: what's the problem & fix

        Software is not hard to make correctly.

        Software will do exactly what it is told to do, and can be correct, in that it meets specifications, like MCAS did, and fail because the specifications are wrong.

        Most airplanes that fail do so because the hardware does not work as designed.

        Some have failed because of design flaws, like the Lockheed Electra.

        Also, it is worth considering that hardware can be tested as a whole, by various forms of shake and bake, that is, stress testing to see if anything breaks, and this may locate software problems as well. But taken in isolation from its use, software can only be tested detail by detail, including individual combinations of inputs to see if anything goes wrong.

        Also, how many cars have been recalled because of mechanical design errors?


        • identicon
          Anonymous Coward, 27 Apr 2019 @ 5:37am

          Re: Re: Re: what's the problem & fix

          "But taken in isolation from its use, software can only be tested detail by detail, including individual combinations of inputs to see if anything goes wrong."

          So parametric testing was not done then?


          • identicon
            Anonymous Coward, 27 Apr 2019 @ 8:02am

            Re: Re: Re: Re: what's the problem & fix

            Testing with samples in and out of range is often done, but for each input range at least six tests are required: max value, just above max range, max range, at least one test between min and max, min range, just below min range, and min value. Exhaustive testing requires 6^n tests, where n is the number of inputs: with 5 inputs that is 7,776 tests, and with 6 inputs 46,656 tests. So while the correct response for each input can be tested, it rapidly becomes impossible to test all combinations of input conditions.
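The arithmetic in the comment above can be checked with a few lines of Python. This is only an illustrative sketch: the figure of six boundary classes per input comes from the comment, while the function names and the use of plain class indices in place of real test values are assumptions made for the example.

```python
from itertools import product

# Six boundary classes per input is the comment's figure, not a standard.
CLASSES_PER_INPUT = 6


def exhaustive_test_count(num_inputs: int, classes: int = CLASSES_PER_INPUT) -> int:
    """Number of combinations when every input is tried at every boundary class."""
    return classes ** num_inputs


def enumerate_cases(num_inputs: int, classes: int = CLASSES_PER_INPUT):
    """Yield every combination as a tuple of class indices, one index per input."""
    return product(range(classes), repeat=num_inputs)
```

Calling `exhaustive_test_count(5)` and `exhaustive_test_count(6)` reproduces the 7,776 and 46,656 quoted above, and each extra input multiplies the total by six again.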


            • identicon
              Anonymous Coward, 28 Apr 2019 @ 8:07am

              Re: Re: Re: Re: Re: what's the problem & fix

              "Exhaustive testing requires 6^n tests where n is the number of inputs."

              How many inputs from the one sensor are there to test, is it digital or analog?


              • identicon
                Anonymous Coward, 29 Apr 2019 @ 4:21am

                Re: Re: Re: Re: Re: Re: what's the problem & fix

                How many inputs are there to a flight control system? Bugs are more likely to hang out in higher-level control functions with many inputs than in a module dealing with a single sensor input.


                • identicon
                  Anonymous Coward, 29 Apr 2019 @ 5:39am

                  Re: Re: Re: Re: Re: Re: Re: what's the problem &

                  Are we discussing the testing related to one sensor system or the entire airframe?


                  • identicon
                    Anonymous Coward, 29 Apr 2019 @ 7:19am

                    Re: Re: Re: Re: Re: Re: Re: Re: what's the problem &

                    The entire airframe, as a single sensor failure prevented the pilots from controlling the aircraft, and single sensor failure should not do that.

                    There are ways of detecting a probable sensor or control failure, but that has to be carried out in higher-level software with multiple inputs. I.e., if a control output does not cause a sensor reading change, either the control or the sensor has failed.
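The check described in the comment above (a control output that produces no change in the sensor reading points to a failed control or a failed sensor) might be sketched as follows. The function name, the sign convention, and the `min_response` threshold are all hypothetical, chosen for illustration; nothing here is taken from any real flight-control code.

```python
def response_is_plausible(commanded_delta: float,
                          reading_before: float,
                          reading_after: float,
                          min_response: float = 0.1) -> bool:
    """Return False when a nonzero command produced no matching change in the reading.

    A False result cannot say which part failed, only that the control
    or the sensor (or both) is not behaving as expected.
    """
    if commanded_delta == 0:
        return True  # nothing was commanded, so there is nothing to check
    observed = reading_after - reading_before
    direction = 1 if commanded_delta > 0 else -1
    # The reading must move in the commanded direction by at least min_response.
    return observed * direction >= min_response
```

For instance, a nose-down command of -2.0 that leaves the reading stuck at 10.0 returns `False`, while the same command with the reading moving from 10.0 to 8.5 returns `True`.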


  • identicon
    Anonymous Coward, 26 Apr 2019 @ 10:30am

    Yes, this looks like it was a single point failure.

    Failures happen; it is just the way things work. So a good design takes this into account and mitigates the danger with backups that are polled at regular intervals, with some sort of comparison going on between them in order to ascertain their functional capability. This is, or at least it used to be, standard operating procedure for manned aircraft. When was this abandoned?


  • identicon
    Jason, 26 Apr 2019 @ 10:31am

    Being both a pilot and a software engineer myself, there has been a lot to absorb in both of these crashes. But as is the case with nearly all major aircraft accidents, there is more to it than just simply blaming poor software or hardware design.

    The regulations involved in getting aircraft certified are voluminous, and getting approval requires a huge amount of work. Or at least that's the theory, and according to some reports the FAA delegated a lot of its oversight responsibilities back to Boeing itself. (A decent summary article is here.)

    The design of the MCAS system feels inadequate to me, and it's getting a justifiable level of concern placed on it. But the natural follow-up question is how a flawed design like that was approved in the first place. If the FAA had been keeping closer tabs on things, it's possible that system would have never been put into production in its current form, and these accidents could have been avoided.

    There is almost always a chain of failures that lead up to a crash, and "shoddy software", if you like, certainly played its part. But it wasn't the only link in that chain.


    • identicon
      Glyn Moody, 26 Apr 2019 @ 11:21am

      Re:

      Yes, you're right, and the (long) original article makes that clear. But for Techdirt, I concentrated on the software angle because of the wider point about today's software.


    • identicon
      MathFox, 26 Apr 2019 @ 12:37pm

      Re:

      I've read multiple accounts on the MCAS design and in my opinion the biggest issues point to shoddy system design. The specifications given to the software designers told them to use a single AoA sensor; the cut-out switch for MCAS also cut out the electric trim switches of the pilot; manual trim required too much force to be feasible.
      It could be that the software was a bit more fanatic in trimming than the system specifications required, but I'm waiting for the investigation reports for confirmation.

      My opinion is that a safety-critical subsystem was bolted on an existing plane without properly analysing the consequences. Boeing was in a rush to keep up with Airbus.


      • identicon
        Anonymous Coward, 26 Apr 2019 @ 1:53pm

        Re: Re:

        The specifications given to the software designers told them to use a single AoA sensor;

        Citation, please.


        • identicon
          MathFox, 26 Apr 2019 @ 2:31pm

          Re: Re: Re:

          I'll happily dig out a citation if you could provide me with access to the sources :-)

          I inferred: Boeing allowed the software to use only one AoA source, otherwise the issue would have been found in a design review.
          And if Boeing skipped design reviews for safety critical software... they deserve the beating they'll get.


    • identicon
      Anonymous Coward, 26 Apr 2019 @ 1:43pm

      Re:

      It was not just software, the underlying design was at fault for allowing a single point failure. One single sensor can cause a crash ... think about that. A multi million dollar thing can be destroyed by a single sensor malfunction, that is insane.


      • identicon
        Anonymous Coward, 28 Apr 2019 @ 3:44am

        Re: Re:

        It was not just software, the underlying design was at fault for allowing a single point failure. One single sensor can cause a crash ... think about that. A multi million dollar thing can be destroyed by a single sensor malfunction, that is insane.

        I'm not sure why you are concluding, based on the information publicly released, that it was more than a software architecture/design issue.

        The 737 has two AoA sensors. It appears it was a software specification (i.e. BA) failure, or a software architecture, software design, or buggy software implementation failure, that led to the software ignoring the 2nd, functioning, AoA indicator and accepting input from only a single sensor per flight (apparently, from other articles, every time a new flight is begun, the system swaps sensors to receive input from for the duration of the flight). It is software that decides which sensor to use.


        • identicon
          Anonymous Coward, 28 Apr 2019 @ 4:08am

          Re: Re: Re:

          from other articles, every time a new flight is begun, the system swaps sensors to receive input from for the duration of the flight). It is software that decides which sensor to use.

          And that is insane design, as the pilot should have the option of selecting which sensor to use. Note, in a two sensor system, comparing sensors only indicates that one is faulty when they disagree, therefore manual selection becomes imperative.


          • icon
            Eldakka (profile), 29 Apr 2019 @ 3:53am

            Re: Re: Re: Re:

            Note, in a two sensor system, comparing sensors only indicates that one is faulty when they disagree, therefore manual selection becomes imperative.

            No. If you have 2 sensors and they disagree with each other, all the pilot knows is that one of them is faulty. They do not know which one, therefore the ability to manually select one of them is pointless.

            In this case, if you know a sensor is faulty, the procedure is to shut down MCAS entirely, at which point the existence of the AoA sensors becomes irrelevant as no flight systems are now dependent on them.

            MCAS is not a system necessary to fly the 737 MAX. It is a system that is intended to make the MAX fly like earlier 737 models so that new type certification training is not necessary. If an airline was willing to conduct flight training for the MAX as if it was a new aircraft, they could do away with MCAS entirely. Therefore once a pilot knows something is wrong with MCAS or its input systems (AoA sensors) AND they know the system even exists, the procedure is to just turn the wretched thing off.
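The two-sensor argument in the comment above can be made concrete with a small sketch. Two readings can reveal that something is wrong but not which sensor is wrong, so the only safe automatic response is to stop acting. Isolating a single bad reading takes at least three sensors and a vote. The function names and the 5-degree tolerance below are hypothetical, and this is not a description of how MCAS or any certified system works.

```python
from statistics import median


def sensors_disagree(left: float, right: float, tolerance: float = 5.0) -> bool:
    """True when two readings differ by more than the (assumed) tolerance."""
    return abs(left - right) > tolerance


def automation_may_act(left: float, right: float, tolerance: float = 5.0) -> bool:
    """Hypothetical gate: allow automatic trim only while both sensors agree.

    On disagreement there is no way to tell which sensor is faulty,
    so the automation steps aside rather than picking one.
    """
    return not sensors_disagree(left, right, tolerance)


def voted_value(readings: list[float]) -> float:
    """With three or more sensors, the median outvotes a single bad reading."""
    if len(readings) < 3:
        raise ValueError("voting needs at least three readings")
    return median(readings)
```

With readings of 4.0 and 74.5 degrees, `automation_may_act` returns `False`. Adding a third reading of 5.0 lets `voted_value` settle on 5.0 and ignore the outlier.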


        • identicon
          Anonymous Coward, 28 Apr 2019 @ 8:12am

          Re: Re: Re:

          There may be two sensors but they are not polling them both during critical maneuvers. One sensor could fail during liftoff and you want the pilot to manually override that failure?

          A fully redundant system does not rely upon single sensor polling in order to determine whether a malfunction has occurred, and that polling is not manual.


    • identicon
      Anonymous Coward, 26 Apr 2019 @ 2:14pm

      Re:

      so if we had more "watchers" monitoring the primary software engineers at Boeing -- the error would likely have been caught ?

      How many layers of software "watchers" should be routinely used ?

      Are government watchers better than others ?

      Boeing no doubt already has an extensive formal system of quality-control, safety monitors, and management approvals designed to prevent software design errors.
      But something in that layered chain failed.
      That's the key -- not just some general indictment of software.


  • icon
    Wyrm (profile), 26 Apr 2019 @ 11:33am

    This looks like it could all lead to an accidental robot-apocalypse. ("Oops, I killed my owner because I mistook it for the chicken I was supposed to cook.")

    Guys, if we are to be killed by robots, please at least make it intentional. Please don't have historians (aliens or robots) make fun of us for wiping ourselves from the face of Earth with shoddy programming.


  • icon
    TKnarr (profile), 26 Apr 2019 @ 11:48am

    I question the article's accuracy because of basic flaws. For instance, the 737 MAX airframe isn't unstable. It's the standard 737 airframe, which is very aerodynamically stable. The new engines also aren't a problem; the plane flies perfectly well with them. Sudden increases in engine thrust cause a pitch-up condition, but no worse than on some other aircraft and well within what a trained pilot can handle. Compensating for that pitch-up condition in software isn't a hard job, as long as the software can tell whether its pitch data is accurate or not (which is a well-known and solved problem with sensors of all sorts). The failure's at a much higher level than the aircraft design or the software:

    1. At least one airline (Southwest) wanted to have pilots able to fly the 737 MAX using their existing 737 certifications without having to be recertified on the new model. That would let them save money on pilot training. That required that the MAX have the same flight characteristics as the older 737 models, which in turn required that it not be subject to the pitch-up condition the new engines caused.
    2. Boeing management decided to oblige by making the pitch-up compensation a permanent part of the flight-control software which couldn't be easily disabled or overridden. This made detection of failed pitch sensors critical, because that would be the only way for the system to detect that it was trying to respond to faulty data and take itself out of the loop.
    3. To improve revenue, Boeing management (likely a different section) decided to make only a single pitch sensor standard and offer the second sensor as an extra-cost option. This removed the only way for the MCAS to detect a failed pitch sensor, because the failure state of the sensors is indistinguishable from a normal flight condition (level flight).

    That was what led to an airplane where the MCAS could command the aircraft to dive into the ground and the flight crew couldn't counteract it even if they were experienced enough to recognize what was happening. I can't even blame the engineers (hardware and software) at Boeing, because the only option they'd've had would've been to quit rather than build to management's requirements which would leave them out of work and wouldn't even stop the plane from being built (Boeing would just hire engineers who were hungry enough to just do the work without arguing).

    Were it me making the decisions on how to handle this, I'd start with the engineers and work my way up the chain to the executives ultimately responsible for the requirements decisions. Those are the people I'd be charging with multiple counts of negligent homicide and looking to get life in maximum security prison without possibility of parole for. I'd also track down the DERs who did the certification on the MAX and give them good long sentences in prison too as an object lesson to all the others that their job is defined by the law, not what management says, and there are worse things that can happen than just getting terminated.
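
    The sensor cross-check described in points 2 and 3 can be illustrated with a minimal sketch. This is not Boeing's code or the MCAS logic; the function names and the 5-degree threshold are hypothetical, chosen only to show why a second sensor is what makes a failed one detectable:

```python
import math
from dataclasses import dataclass
from typing import Optional

# Hypothetical threshold: how far two readings may differ before both are distrusted.
MAX_DISAGREEMENT_DEG = 5.0


@dataclass
class CrossCheckResult:
    value: Optional[float]  # agreed reading in degrees, or None if untrusted
    trusted: bool


def cross_check(left_deg: float, right_deg: float,
                max_disagreement: float = MAX_DISAGREEMENT_DEG) -> CrossCheckResult:
    """Compare two redundant sensor readings.

    With only two sensors we cannot tell WHICH one failed, only that they
    disagree, so the safe response is to stop trusting the data.
    """
    # Reject non-numeric garbage (NaN, infinity) before comparing.
    if not (math.isfinite(left_deg) and math.isfinite(right_deg)):
        return CrossCheckResult(value=None, trusted=False)
    if abs(left_deg - right_deg) > max_disagreement:
        return CrossCheckResult(value=None, trusted=False)
    return CrossCheckResult(value=(left_deg + right_deg) / 2.0, trusted=True)


def automation_enabled(left_deg: float, right_deg: float) -> bool:
    """The automatic correction stays in the loop only while the sensors agree."""
    return cross_check(left_deg, right_deg).trusted
```

    With a single sensor there is nothing to compare against, so a check like this cannot be written at all, which is the point of item 3.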

    • identicon
      Anonymous Coward, 26 Apr 2019 @ 1:50pm

      Re:

      "the 737 MAX airframe isn't unstable"

      • That claim was not made, did you read the post? The engine position had to be moved forward, which affected the aerodynamics enough to warrant a corrective measure. Too bad that measure was insufficient.

      "I'd start with the engineers and work my way up the chain to the executives ultimately responsible for the requirements decisions."

      • You have obviously not worked a large engineering project as an engineer. Engineers have a way to make their recommendations known but management rarely gives them any credence. For example, look at the Columbia disaster. It is most likely that an engineer told management of the single point failure but it did not make a difference. Happens a lot.

      • icon
        TKnarr (profile), 26 Apr 2019 @ 2:43pm

        Re: Re:

        In the original post I read the author did explicitly make the claim that the airframe was unstable. I'll have to see if I can find the link again.

        As for the engineers being ignored by management, that's why I'd start with the engineers and trace the chain up until I found the people at the top responsible for the actual decision about the requirements rather than targeting the engineers at the bottom who, as I said, have the choice of meeting management's requirements or losing their jobs.

  • icon
    ECA (profile), 26 Apr 2019 @ 12:16pm

    Not too long ago... (sensors)

    A person made a model rocket, and he used a mercury sensor inside to tell when it reached its highest point, so it could deploy the parachute.

    He found a problem: halfway up, the chute would pop out.
    The force of the blast pushed the mercury DOWN, which was fine, but when the engine stopped and the rocket was supposed to coast UP HIGHER, the mercury went UP as that force was released, and that triggered the chute.

    And he had problems figuring it out.

    The gap between the human logic of HOW it should work, how the sensor actually behaves, and how the computer reads the input is AMAZING...
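
    To make the rocket story concrete, here is a toy one-dimensional simulation. Every number in it (thrust, mass, drag coefficient, burn time) is made up, and the "switch" is modeled simply as firing when the sensed acceleration along the rocket's axis goes negative. It is a sketch under those assumptions, not a model of any real sensor:

```python
def simulate(thrust_n=10.0, mass_kg=0.1, burn_s=1.5, drag_k=0.002,
             dt=0.001, g=9.81):
    """Return (time the naive switch fires, time of true apogee) in seconds.

    All default parameters are illustrative guesses, not measured values.
    """
    t, v, h = 0.0, 0.0, 0.0
    switch_fired_at = None
    while True:
        thrust = thrust_n if t < burn_s else 0.0
        drag = drag_k * v * abs(v)            # always opposes the motion
        # What the mercury "feels": thrust and drag, but NOT gravity,
        # because the sensor and the rocket fall together.
        sensed = (thrust - drag) / mass_kg
        if switch_fired_at is None and sensed < 0:
            switch_fired_at = t               # mercury slides toward the nose
        accel = sensed - g                    # actual acceleration of the rocket
        v += accel * dt
        h += v * dt
        t += dt
        if t > burn_s and v <= 0:             # velocity reaches zero: true apogee
            return switch_fired_at, t


if __name__ == "__main__":
    fired, apogee = simulate()
    print(f"switch fires at {fired:.2f} s, apogee at {apogee:.2f} s")
```

    In this model the switch trips at engine burnout, while the rocket is still climbing fast, which matches the "chute pops out halfway up" behavior. One common mitigation, offered here only as a suggestion, is to ignore the switch until a timer or a second condition confirms the coast phase is over.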

  • identicon
    michael, 26 Apr 2019 @ 1:53pm

    Not the engineers

    "I believe the relative ease -- not to mention the lack of tangible cost -- of software updates has created a cultural laziness within the software engineering community."

    It's clear that the author isn't a software engineer, because we don't make the final decisions on mechanical behaviors related to software. This is done at the management level, where beancounting reigns supreme.

    If I got a bonus every time I was ordered to inject an obvious and documented usability problem into a system, I'd be retired already.

    • identicon
      Anonymous Coward, 26 Apr 2019 @ 3:12pm

      Re: Not the engineers

      I've been told that I was not a team player.

      Management got mad because I told them the spec they wanted to change was a contractual item and would require re-negotiation of the contract. The out-of-spec root cause was incorrect assembly: production did not follow their own procedures. But somehow it was all my fault - LOL. You hardly ever get a gold star for doing the right thing.

    • identicon
      Anonymous Coward, 26 Apr 2019 @ 4:50pm

      Re: Not the engineers

      Note also that the article says "coders", and some people draw a distinction between coders and programmers. (I imagine this is more common at large bureaucratic companies.) A coder converts design specifications to working computer programs, without any real input into the design process.

    • icon
      ECA (profile), 29 Apr 2019 @ 11:35am

      Re: Not the engineers

      Don't you love concept over design... and not listening to those who KNOW the hardware and what it can't do, or its failures??

      You have a sensor registering the air input at the nose, and it notices a SLOWDOWN when in an UP position...
      It decides to adjust things...
      But the sensor only works when the AIR is hitting the nose, and the computer can't tell if it's going fast enough, because you are bringing the nose down until YOU GET your speed back up (it's called a dive).
      Anyone know how the old biplanes had to fly, because they had CRAP engines?? You DIVE to gain speed.

      Any programmers here think they could have run simulations a LONG time ago to figure this out??

      Oh!! And another thing... Ever had a boss decide to change a sensor and NOT tell you whether it was a proven part??

  • identicon
    Châu, 26 Apr 2019 @ 4:26pm

    Open Source

    There's no perfect solution, but if airplane software were open source, more people could inspect it and its systems, find bugs faster, and create fixes faster.

  • identicon
    Lawrence D’Oliveiro, 26 Apr 2019 @ 5:46pm

    No Need For An IEEE Spectrum Account ...

    ... archive.org has a copy of the article here.

  • identicon
    Rekrul, 26 Apr 2019 @ 6:12pm

    Gee, I wonder if the software's EULA had the standard disclaimer that the company isn't responsible for any damage their program might cause. It sure is lucky for Boeing that software is a protected class of product that no company has to take responsibility for!

    Seriously(?) though, I've noticed a trend with software over the years. I don't know if the fault lies with the programmers or with management, but it often seems that the software is written to use the simplest and most direct method for doing something with little thought given to error checking. That sort of thing only seems to get corrected after the software is put into use and problems arise.

    It's like designing a car with no roof because it never rains in the lab. Or no seat belts because the testers never crashed it. Or no lights because they only ever tested it during the day...

  • icon
    Uriel-238 (profile), 26 Apr 2019 @ 6:16pm

    Decisions based on sensors on one side of the plane

    I only flew flight simulators, but remember how conspicuously redundant the instruments were. The point was really clear: instrumentation was a common point of failure in flight, and so it was important to confirm with as many metrics as possible, including looking out the window (weather permitting) and making sure your analog instruments and digital instruments said the same thing.

    Lack of redundant instruments, and instruments that tell how things should be rather than how things are, makes for the kind of failures that melt down nuclear reactors, not to mention crash airplanes.

    This is both aviation and engineering 101. Measure twice, cut once. Especially when lives are at stake.

  • identicon
    Anonymous Coward, 26 Apr 2019 @ 8:15pm

    All Hail the Apocalypse Woodpecker!

    Recall Weinberg's Second Law:

    If builders built buildings the way programmers wrote programs, then the first woodpecker that came along would destroy civilization.

    • Gerald Weinberg, 2010 (or earlier)

  • icon
    AEIO_ (profile), 26 Apr 2019 @ 9:00pm

    "Shoddy Software Is Eating The World, And People Are Dying As A Result"

    Really? Color me surprised. The only difference is that computers run more things nowadays, thus more people are affected in a computer (pun) "crash".

    I'll just leave this here; it's only been published for a few decades. The one about the virtual attack plane flipping upside down while crossing the equator is funny.

  • identicon
    GERALD L ROBINSON, 27 Apr 2019 @ 9:50am

    Video Games are a great example of this phenomenon

    This leads to the reasons that autonomous cars are inherently dangerous. The SW development process has a lot of reviews, but any SW complex enough to do anything really interesting cannot be debugged using the normal development cycle.

    The 737 is an example of poor engineering on Boeing's part. Making it difficult for the pilot (driver) to override the computer is criminally bad design! Yet this is the goal of current auto design. Years ago I was involved with industrial automation of a chemical plant. We used three computers which voted. If one was wrong it was taken offline and a spare brought in. All the computers had access to all the instruments; they just used different instruments as priority.
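
    The three-computer voting arrangement mentioned above can be sketched roughly as follows. The details of the chemical plant's system are not given, so the function name, the tolerance, and the median-based rule here are all assumptions made for illustration:

```python
from typing import List, Optional, Sequence, Tuple


def vote(readings: Sequence[float],
         tolerance: float) -> Tuple[Optional[float], List[int]]:
    """Two-out-of-three vote over redundant readings.

    Returns (agreed_value, outvoted_indices):
      - agreed_value is the median when at least two units agree with it,
        or None when there is no majority;
      - outvoted_indices lists the units that disagree with the median and
        are candidates for being taken offline.
    """
    if len(readings) != 3:
        raise ValueError("this sketch only handles exactly three units")
    median = sorted(readings)[1]
    outvoted = [i for i, r in enumerate(readings)
                if abs(r - median) > tolerance]
    if len(outvoted) > 1:
        # Both other units disagree with the median: nobody has a majority.
        return None, outvoted
    return median, outvoted


if __name__ == "__main__":
    value, bad = vote([100.0, 250.0, 100.2], tolerance=0.5)
    print(value, bad)  # unit 1 is outvoted and could be swapped for a spare
```

    The design choice worth noting is that three units allow the system to identify WHICH one is wrong, whereas two units can only reveal that they disagree.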

    One possible solution is to require all code for cars to be open source, including all design/decision documents. Implement a bounty program for finding errors.

  • icon
    Calvin (profile), 29 Apr 2019 @ 2:30am

    Imagination gap

    As a retired software engineer, I'd say this issue falls into what I call 'The Imagination Gap'.

    Software engineers test for all the conditions they can imagine; unfortunately, they can't imagine all the conditions that the software will face in the real world.

    • identicon
      Anonymous Coward, 29 Apr 2019 @ 5:44am

      Re: Imagination gap

      And then there are those tests that management said will never happen so why test them .....
