Who is responsible for QA (Quality Assurance), i.e. testing, of the GCMs (Global Climate Models), and who SHOULD be?

"The results of my direct correspondence about these matters with several of the US agencies and personnel involved in GCM development and applications, and analyses of CO2 and Climate Science, with a strict focus on software Verification and SQA, have shown that Verification and SQA are not considered necessary aspects of any analyses of these important issues. My correspondence has been with appropriate personnel at NASA and GFDL, among others. Correspondence with editors of several of the high-impact Journals in Climate Science, Journals such as Nature, Science, and those published by the AGU, AMS, AAAS, for examples, indicate the same results. These Journals do not have Verification and SQA requirements in place for the papers submitted for publication." – from a letter by Dan Hughes to Rona Birnbaum

"Third, they have undergone an extensive peer-review process and been validated by numerous scientific bodies."–Rona Birnbaum
Chief, Climate Science & Impacts Branch

http://danhughes.auditblogs.com/2009/06/10/epa-hq-and-software-quality-assurance/

As my personal training is in the area of computer science, I am appalled that one so high up in the food chain thinks that an "extensive peer-review" process is a valid substitute for actual code testing.

Does this dimwit have any idea what the process of regression testing encompasses? How is it that a peer reviewer, caught up with the time requirements of their own projects, has the time to write a "valid" test plan and carry it out to the extent required to validate a GCM? She is obviously out of her mind and out of her league. She would be best advised to sit this one out and leave it to the experts in this area.

How many of you out there even realize that today’s software systems are NEVER bug-free? When you have a million lines of code, it is impossible to keep all errors out.

Considering that fact, now think about how many times the AGW supporters have tried to convince you that you should believe based on the output of these models. Absolutely baffling: if they are so accurate, then why have they never been tested?

So what do you think: who SHOULD be responsible for ensuring QA in these systems on which we are basing BILLION-dollar decisions?
pegminer:

Show me one test plan for the dozen or so GCMs that exist.

I have personally taken part in software testing. Do you know the slightest thing about testing or what cases you need to check for when designing one? Or do you think that because you have written a few VB applications for your varied scientific projects that you are now qualified to decide when code has been tested?

Just because you have studied science does not mean you know a da** thing about software design or validation.
The GCM code is publicly available. Where are the test plans? As is illustrated in the excerpt from Dan Hughes's letter, nobody feels that validation is "necessary". Well, why the h*** not? We aren’t talking about spending 0 on an operating system that might flash you the blue screen of death every 5 minutes; we are talking about a multi-billion dollar investment to fight global warming, based heavily on the output of this software.
pegminer:

I am beginning to believe that you are naive enough to believe that software testing can actually consist almost entirely of running a program to "see what happens."

This is not how software testing is done. The sooner those of you in the Climate Science community, and those following you blindly, realize this, the sooner we can get some actual work done.
Hi Dawei:

There are any number of errors that can occur in software, from failing to check array bounds, to initializing variables incorrectly, to using multiple pointers to access the same variable, with one of them intermittently changing its value. Then the next time you use the variable it has changed for no discernible reason. If you're lucky it is a catastrophic error. Then there is the whole problem that 0 is not necessarily 0. For example, when you cast a float (decimal value) to an integer and back you lose data. Let's say I have the number 3.247237 and I actually need it to 4 decimal places. But somewhere somebody has cast the value to an integer. Now the value is 3. Maybe it was automatically cast back and in the process became something like 3.0000002, because floating-point values are not always represented exactly. Just imagine what could happen if enough numbers go through this process.

Or maybe the writer of the function failed to document that this function truncates values at 2 decimal places?
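
To make the truncation point concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions (the variable names and the list of 1000 identical readings are made up, and none of this comes from an actual GCM). It shows that a cast to an integer throws the fractional digits away for good, that comparing a floating-point result to exactly 0 is fragile, and that the loss adds up when many values go through the same cast. Note that in Python the round trip itself gives back exactly 3.0; the inexact representation shows up in ordinary arithmetic such as 0.1 + 0.2.

```python
import math

value = 3.247237

# Casting to int silently drops the fractional part.
truncated = int(value)           # 3
restored = float(truncated)      # 3.0; the digits .247237 cannot be recovered

# What the caller actually wanted: four decimal places.
rounded = round(value, 4)        # 3.2472

# Most decimal fractions are not stored exactly in binary floating point,
# so testing a computed result for exact equality with 0 is fragile.
residual = 0.1 + 0.2 - 0.3
print(residual == 0.0)                              # False
print(math.isclose(residual, 0.0, abs_tol=1e-12))   # True

# The loss compounds when many values pass through the same truncation.
readings = [value] * 1000        # hypothetical data, for illustration only
exact_total = math.fsum(readings)
truncated_total = sum(int(r) for r in readings)
print(exact_total - truncated_total)                # the accumulated loss
```

A documented contract ("this function truncates to N decimal places") plus a test that checks the contract is what would catch this kind of silent data loss.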
pegminer:

You are a great one for telling others to stick to their area of expertise. I suggest you stick to yours. My education is in software engineering. I don’t care what you think will "pass" as testing. There is a way to do it and a way not to do it.
Software testing has been well studied and documented. So like I said, because you have written a few lines of code in your lifetime you seem to feel that qualifies you in the area of software design and test?

If there is no test plan how can you know it was tested?
For example, let’s say I have a function that receives a pointer to an array of functions. The first time I test it, it works. But in a subsequent iteration someone has deleted the function my procedure needs. When I tested it, it worked. Does that mean my code is valid, even though I have failed to perform the regression tests that would have caught the error? Oh, but that's right, you have reinvented the wheel. It is OK to test any old which way.
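
As a rough sketch of what such a regression test could look like, here is a small Python example using the standard `unittest` module. Everything in it is hypothetical: the step names `advect` and `diffuse`, the table `STEP_FUNCTIONS`, and the helper `run_step` are stand-ins I made up for the "array of functions", not anything from a real model. The idea is simply that the tests are kept and re-run after every change, so a deleted entry or a changed result is caught instead of being assumed to still work.

```python
import unittest

# Hypothetical step functions standing in for the "array of functions".
def advect(state):
    return [x + 1.0 for x in state]

def diffuse(state):
    return [x * 0.5 for x in state]

# Hypothetical dispatch table; a later edit could remove an entry by accident.
STEP_FUNCTIONS = {"advect": advect, "diffuse": diffuse}

def run_step(name, state, table=STEP_FUNCTIONS):
    """Look up a step by name in the table and apply it to the state."""
    return table[name](state)

class RunStepRegressionTest(unittest.TestCase):
    def test_required_steps_still_registered(self):
        # Fails as soon as someone deletes a function the procedure needs.
        for name in ("advect", "diffuse"):
            self.assertIn(name, STEP_FUNCTIONS)
            self.assertTrue(callable(STEP_FUNCTIONS[name]))

    def test_known_results_unchanged(self):
        # Pins previously verified outputs so later edits cannot alter them silently.
        self.assertEqual(run_step("advect", [1.0, 2.0]), [2.0, 3.0])
        self.assertEqual(run_step("diffuse", [1.0, 2.0]), [0.5, 1.0])
```

Saved in a file, these tests can be run with `python -m unittest` after every change. Testing once by hand and never again is exactly the gap a regression suite is meant to close.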
Pegminer:

You are an arrogant, self-absorbed hypocrite.
From one of your previous posts: “I think it’s time for the deniers to leave science to the scientists.” http://answers.yahoo.com/question/index;_ylt=Ahgn.VldVTXn6Vvy0QYLkfTty6IX;_ylv=3?qid=20090809024405AAuoKea&show=7#profile-info-NqGkbDetaa

I think it is time for you to leave computer science to computer scientists, such as Dan Hughes and those of us actually trained in the field.
Pegminer:

Although you are not a trained computer scientist, you are willing to use your limited knowledge about computing from the few basic programs you have written within the framework of your research as a testimony to how much you know about software testing. If there are “other ways” of testing, then tell me, what is a “valid” test?

You’re a climate scientist, right? Exactly how much time do you spend in a week writing test cases? How many test plans have you written to test the code? If you have “peer-reviewed” software, when do you go back and test it again to be sure it hasn’t broken since your last test? You are a peer reviewer, right? How much time do you have in a week to allot to quality assurance of other people’s work? How much of your precious time are you willing to spend on testing in a week?
An excerpt from the university at which I received my degree: “Less than 10 percent of computer science programs in the U.S. are accredited by the Computer Science Accreditation Commission.” My university is one of that less than 10%. I know for a fact that it is very easy to write unmaintainable spaghetti code. I have worked with individuals who create such atrocities.

As I have shown you verifiably in the excerpt from Dan Hughes's letter, the science “has not been done.” The most important ingredient, testing, has been left out. Would you fly in an airliner with untested software systems?
I wanted to point out one last thing for people to consider if you come across this question.

Pegminer asserts that the testing of the GCMs is far greater than anything he has seen at a simulation company.

This assertion from a climate scientist is in direct conflict with the evidence, which shows that SQA is "not deemed necessary".

If the response of the agencies that produce the GCMs to a letter inquiring about SQA is that it is "not deemed necessary", then how does pegminer expect us to take him at his word?

In all of pegminer’s ramblings not once was the issue of the agencies not deeming testing necessary addressed, which was the root of the question.
In all of the answers only part of the question got answered. I feel that the problem of testing the GCMs and the data is an important one. We must not let a few ego-filled scientists wanting the limelight steal the show prematurely. Let the science and the methods mature.

So here is my suggestion:

Let’s establish an independent organization that is solely responsible for verification of GCMs. Let this organization be responsible for developing tests and performing them on any code used in GCM calculations.

Let’s also establish an independent Climate Data Standards Committee, with whose standards all experiments used to dictate public policy must comply.

We need to make the science better before it can be used to form the basis of billion-dollar decisions.


2 Responses to “Who is responsible for QA (Quality Assurance), i.e. testing, of the GCMs (Global Climate Models), and who SHOULD be?”

  1. amancalledchuda says:

    Yes, I couldn’t agree more.

    I wonder what the result would be if these GCMs were put through the same double-blind tests that drugs companies have to go through.

    One team supplies the GCM. Another team supplies the parameters to go into it. And a third team runs the model to produce the result. None of the teams have any knowledge of who they are working with.

    I can’t help but think that the whole Global Warming panic would disappear almost overnight if this were to happen – which is why the Global Warming Liars are making sure it never does, of course. Hardly very scientific.

    Would you happily take a drug on the strength of the claims and promises of the manufacturer?

    Of course you wouldn’t. You’d insist that the drug is properly tested first. That’s science.

    With Global Warming, you are expected to trust the word of the people involved. That’s religion.

    As ever with Global Warming – don’t believe the hype.

    :::EDIT:::

    Response to Dawei.

    You raise a very interesting point in your middle paragraph that applies to many aspects of Global Warming, not just computer models. What you’re saying is that, all things being equal, around 50% of the “problems” with any data will show too much warming, while around 50% of the time they will show too little warming, or even cooling.

    Is that what you’re saying?

    The problem with the issues you mention is that, in Global Warming, the “bugs”, “mistakes”, “errors”, etc. are only ever found when they throw up anomalous data that shows too little warming. If the result of the data is the warming that was expected, then the attitude is just “That’s what we expected, so that’s fine.” And no one checks the data.

    Think about it; how many times have you heard of data in the area of Global Warming being flagged as anomalously too high? It almost never happens – and when it does, it’s the sceptics that find it, not the alarmists.

    But, as you point out, it should be a 50/50 split, shouldn’t it?

    Take the ARGO data as a recent example. It initially showed rapid ocean cooling, so the data was checked and checked until errors were found. After correction the result was still slight cooling, so the checks continued. I’m not actually sure what the current situation is, but realclimate.org is saying that the ARGO data now shows warming – though I’ve seen a recent paper that says it’s still showing cooling. Regardless, the point is: had the data shown warming as expected, or even higher than expected, do you honestly think that the problems would have been found?

    Just look at the GISS “corrections” to the global temperature data. All things being equal, the positive corrections should be matched by similar negative corrections, but that’s not the case, is it? The “corrections” amount to a shocking +0.5°F!

    So, in summary, what you’re saying is absolutely correct, but, sadly, Global Warming “science” (I use the word very loosely) is so corrupt that the errors are only ever spotted (or even looked for) if they result in data that contradicts the Global Warming hypothesis.

    Are you comfortable with that situation Dawei?

    As a final comment, I note with astonishment that (at the time of editing) I have 2 thumbs down for the first part of my answer. So at least two people think it would be a bad idea to have double-blind tests done on GCMs to ensure that the predictions they are making are accurate.

    I rest my case!
