Freesteel Blog » How Excel also destroys the low carbon economy

How Excel also destroys the low carbon economy

Friday, January 5th, 2018 at 5:25 pm Written by:

Most software related to engineering and construction is woefully out of date, time wasting, and under-deployed. New on my list of examples is the Passive House Planning Package (PHPP), the de facto standard for designing and retrofitting energy efficient houses.

It comes as a massive multi-tab 7.9Mb unfriendly Excel spreadsheet. An example (rendered into PDF) looks like this.

The purpose of this software to “build up a useful interactive understanding of the design” in terms of materials, wall insulation thicknesses, windows direction facing into the sun, etc. and so forth. The results have been validated to a statistical average (but with up to a factor of 2 error), there’s a huge industry of consultants and training materials around it, and it’s trusted by the experts who seem pretty happy with its format.

The problem is that building it in the Excel platform fundamentally cripples its capability. And by being a paid-for product, not an open source program, they prevent any software developer, who is up to date with the efficient and more modern methods of production, from making improvements. (Instead these software developers end up devoting their time to perfecting a remote control light bulb and writing more lines of code than would ever be found in a Javascript-based port of the PHPP.)

Some background.

I recently paid consultants to fill a PHPP spreadsheet with the parameters of my house and received back a printout with some numbers. As this was insufficient for the purpose of “building up a useful interactive understanding of the design”, I further paid the full price of £140 to the Passive House Institute for an empty copy of the PHPP and was then able to obtain a copy of the spreadsheet with my numbers in it from the consultant.

You can tell instantly that this system should never be in Excel, and must be supplied as an on-line or semi-online service for several straightforward reasons.

Firstly, because Excel can’t preform basic operations like accessing an API through the internet, this spreadsheet has to come with its own out-of-date copies of data sets relating to climate variables and building materials.

Here is a section from the 800 row table of values of monthly average temperatures at different locations around the world.

We get just 22 stations to approximate the whole of the UK, and the data is sourced from Meteonorm who would like to sell you a 250Mb downloadable lump of their data, and have only as recently as 2015 worked out how to provide the obvious online webservice API.

In his 2012 article “A proposed method for generating high resolution current and future climate data for Passivhaus design”, Dr R MacLeod of Loughborough University explained that it used to be much worse:

According to the Passivhaus Institute (PHI) the verification of a Passivhaus design must be carried using the Passivhaus Planning Package (PHPP). As a result, this quasi-steady state software is the de facto software used for both the design and compliance predictions of Passivhaus buildings in the UK and around the world.

In the original PHPP models, (PHPP04 and PHPP07) climate data for the UK was derived from TRY (Test Reference Year) datasets for half a dozen locations. In most cases, this data was thought to be adequate for Passivhaus verification based on calculation of the mean annual heating demand. However, since it is possible to obtain Passivhaus certification based on peak loads, questions were raised about the appropriateness of using only a single UK climate data set (Manchester) as a proxy for calculations across the entire UK.citation[2010] This situation has recently evolved with the production of 22 UK regional datasets developed using Meteonorm (MN) interpolation, which have been cross-checked against EPW climate files and ratified by PHI.

Basically, the demands of a low energy house are especially sensitive to the climate, and it’s going to make a big difference if you’re in a windy hillside or in the middle of a big city (both of which occur in the catchment of each weather station).

There is no reason to compromise and do it like this, when institutions like the MetOffice host enormously rich datasets of climate measurements and predictions for each square kilometre of the country.

Now, there is a work-around with a separate climate data tool for generating the spreadsheet numbers to cut and paste back into the PHPP, but this is a poor hack.

There’s more. Here is a snapshot from a second massive out-of-date embedded table of Passivhaus components.

Why is this not seamlessly implemented through an online database of components with prices and filtered by the proximity of suppliers and links that can generate purchase orders? If the Passivehaus Institute did this, they could charge a commission on orders that would totally cover the costs of providing the PHPP as a free online service to everyone, while reducing road miles and money wasted on advertising. So you’re a builder’s merchant who is importing the latest Compacfoam tech from Austria: you add this product into the list, and anyone using the online PHPP in your distribution area will find out about you at the point at which they know exactly what quantity they want to order. Business sales efficient.

Secondly, Excel is such a restricted platform for integration it can’t handle incoming data, or do any graphics with a friendly visualization of a house showing different sized coloured arrows denoting heat transfer in and out through the roof, ground, and so forth like they have on the front cover of every brochure.

Once again, for the importing of data from an architectural model, there is a hacky work-round but only for Sketchup. This is inadequate.

And finally, being in Excel means it’s a dead model; you might as well print it out on a piece of paper and nail to a tree in the rain. Meanwhile, the PHI maintains a comprehensive worldwide Passive House Database that includes 140 houses in the UK, like this one in Bristol which lists its vital statistics, such as: “23 kWh /(m2a ) calculated according to PHPP”.

Many of these houses will have smart energy meters and thermostats controlled by voice activated Amazon products, and could be running a realtime statistical simulation of the energy consumption based on the PHPP calculations and returning data of inaccuracies back to the PHI for further research — if the PHPP code was in anything other than Excel.

Mathematical functions that are implemented in an actual programming language, and not terminally mixed up with the data in a spreadsheet format, can be put to many other uses.

The PHI could have as their target that every new house build in Europe gets its own permanent record in the database that is sufficient to rerun their simulations. This comes as a marginally free byproduct of BIM and is a significant step up from the inexcusably crude EPC measurements. This database would provide a baseline target that could either be checked by the gas utility on aggregate or your household’s talking robot with access to the heating system to monitor when matters are diverging. If there is an exceptionally cold winter, the increase in fuel use would be predictable when running with a set of live and validated models. Remove the guesswork and you can free up the waste caused by wrong supply, improper policy and long term systemic underperformance.

That is my theoretical critique of this Excel platform in terms of it’s inefficiencies and the lost opportunities. Now I’m going to go through its underlying implementation.

But first, if the problem is so obvious, why has the Passive House Institute not done anything about moving away from Excel?

I don’t know how this particular organization works, but I have observed patterns with regards to supplies of engineering type software across the sectors enough to guess, both from within the organization of Autodesk, and by observing the severe lack of up to date tech deployed in other niche products such as ClearRoute.

Here is what seems to happen.

Firstly, there is no competition among products that address a particular engineering problem. Often this is because the software has become part of a regulatory or accreditation step, so you have to use it anyway for one part of the process, and you might as well use it for everything else once you’ve paid for it in terms of money and training.

The main established supplier capitalizes on trust on the basis that it’s the default choice and there won’t be any repercussions for anyone if you choose it and it goes wrong. For example, If you use de facto standard concrete beam strength simulation calculation software, and the beam breaks, then it’s not your fault because it’s a decision everyone else would take and they can’t criticize you for it. It’s not the software’s fault because it’s been used in the field for 30 years, which proves it has an acceptable margin of risk. Responsibility is diffused.

Not so if you used a newer rival product, then all the responsibility would be on your own head, quite literally, and nobody wants the consequences of that — especially if someone else is paying for it. There is no independent standards agency for software, like there is for materials and other hardware components, to give users the confidence to change product.

In general this engineering software tends to be closed source, which means it is impossible for an independent software engineer with no conflict of interest (ie not employed by the company) to see into it and refute the claims that the software is sound.

Secondly, the users have such a deep sense of dependence on the software that they don’t ever want to complain to the supplier that development has fallen far behind the state of the art. You get a lot of rationalization and an unwillingness even to imagine how things could be so much better and cheaper for them. Often the user isn’t ultimately burdened by the excess expense; the cost of the software — including billable time wasted by its inefficiency — is one part of a larger project, and it can be passed on to the end client.

Thirdly, the software supplier doesn’t hear any demands from their customers that things need to be better. It’s a monopoly, nobody is going to say to them: “I’m changing from you to product B, because product B has this feature you haven’t got which I have been telling you about for the last five years.” They either use this, or nothing, and this is certainly better than nothing. That is the low bar it’s easy to reach.

And finally, the supplier will tend to be managed by a conservative who lacks any software innovation, because it’s not their area. Dr So-and-so who founded the company in 1983 didn’t know anything other than Excel at the time, and doesn’t believe any other programming language will be beneficial, because he doesn’t know anything about it and isn’t that interested. His ill-informed opinion cannot be challenged within the organization because he controls the rights to the codebase and only allows people who agree with him to do any developments.

In their report to me on my house, the consultants described the PHPP software as:

[A] powerful tool used to assist in the design and quality assurance of low energy buildings. The PHPP consists of a series of interlinked worksheets, typically presented in Microsoft Excel Format. Values within the software are calculated using a series of building physics based algorithms which have been derived from dynamically simulated models. When detailed information is entered into the PHPP, accurate and reliable results can be generated. The calculations in the PHPP occur instantaneously, ie when an entry is changed in the model the immediate effect upon the energy balance of the building can be seen, ensuring that the user can optimize the design in the most efficient manner.

Two things here. The results are instantaneous because the model isn’t doing anything other than a few multiplications and divisions with no actual iterative or monte carlo simulations. And secondly, to really optimize designs you should see graphs of outcome against an input parameter (eg wall insulation thickness) so you could spot the point of best trade-off from the curve rather than having to try out the different numbers one at a time. Something as basic as a FOR-loop, this cannot be done.

Let’s open up this spreadsheet for my house as the consultants left it to me.

First select the Existing Dwelling Variant

Now I’m going to work from the Verification tab where it’s possible to set the Interior temperature winter [C] (look at the middle top row).

When you change it from 20degC to 30degC you get this:

That’s weird. For a higher interior temperature the heating demand has gone down from 434 to 430, when you’d expect it to go up. (Even having read the documentation I’m still not sure of the exact difference between “heating demand” and “heating load”.)

This number is from cell Q77 on the Heating tab, where it is set to O77/O9 = QH/floor_area.

For fans of Excel, I’ve simplified all the equations from their horrendous original forms, which look more like:

=IF(AND(ISNUMBER(O10),ISNUMBER(O77),O9>0),O77/O9,””)

Now, QH = QL – QG = heat_losses – heat_gains, which is 41273-8852 at 30degC and 41561-8872 for 20degC.

The second number (8852 at 30deg) changes slightly due to a change by 0.3% in the Utilization factor heat gains in cell O72, which is set to the average of the 12 percentage values in row 116 whose series of questionable equations eventually depend on a differing monthly Ground Temp in Row 76 that is copied from $Climate.row32, which in turn is copied from $Ground.row110 (Winter).

You explain to me what the Monthly Average temperature for July in Winter means, because I can’t make sense of it.

These cells contain the (simplified) formula:

=P9 - (H86*(P9 - P11) + H87*P12*COS(2*PI()*(E92 - P13 - H85)/12))/H88

Where P9 is the specified constant average indoor temperature of either 20 or 30degC — which accounts for the variation between 8872 and 8852.

We will examine the other terms of this ground temperature formula later on.

The larger term in cell O48 (the one that’s 41273 for 30degC and 41561 for 20degC) is QT+QV=33857+7416 (vs 34146+7416 for 20degC).

Now, QV represents the ventilation heat losses, which you’d expect to change if you put the thermostat up, but it evidently lacks a dependency. There might be a dependency on temperature when there is a mechanical ventilation value, but this could be a omission.

The larger QT value (that’s 33857 for 30degC) comes from a simple multiplication of area, U-value and kKh/a Gt value in which only the “Floor slab / Basement ceiling” number changes from 45 at 30degC to 61 at 20degC — meaning that as I turn the interior temperature of the house up I lose less heat through the floor. How does that work?

Also, why don’t the numbers for the non-floor parts of the house change when I change the temperature? They do so on the “Annual heating” tab, but not in this monthly heating table. On the “Annual heating” tab the Gt value is $Climate.K10+(O6-20)*I59*0.024), where O6 is the thermostat temperature (20 or 30), so it depends on this factor.

This bug is explained below.

Those monthly dependencies depend on these orange German tables of various environmental monthly temperatures that are initially folded away.

… whose difference is taken from the Median Indoor temperature in row 87, which looks like this:

So this row of temperatures is not updated from the number 30degC that is given in cell O6, which is a bug.

But then why does the Floor slab Gt value change in cell M17?

Well, it comes from cell AH99 (see orange panel above) which is set to the sum of that whole row99 labeled as “HeizgrSt. Grund”, and these numbers update because they are the difference between the Median indoor temperature (incorrectly hardcoded to 20) and the Ground Temp in row 76 times the number of hours in the month divided by 1000.

The Ground Temp is taken from row 32 on the Climate tab, which in this case is taken from row110 of the Ground tab, as we discussed above.

Here it is again:

The (simplified) formula in the first cell is:

=P9 - (H86*(P9 - P11) + H87*P12*COS(2*PI()*(E92 - P13 - H85)/12))/H88

… where P9 is the Average indoor temperature (30degC or 20degC), and most of the other numbers have labels like the “Steady-state transmittance LS” (H104), and “Phase shifting of Te,m” (P13), and “Average ground surface temperature Tg,avg” (P11).

Now, that Average ground surface temperature is an interesting one because it’s suspiciously set to $Climate.AE27 + 1.

Note that all the other fudge-factors and adjustments make the pretense of being driven by physical processes, while here we just get an unexplained round number of +1 on the climate value. Not a good sign.

We go to this cell AE27, inexplicably called Mittelwert.

…and it’s the average of row24 of monthly Exterior temperatures from the row of data the location GB0008a-Fairfield.

It happens to get there with the formula

=INDEX(W219:W1008, MATCH(Z18, I219:I1008, 0)) - AB24

Now, AB24 has the small value of 0.108 and is equal to (D18-J23)*0.006, which is the difference in altitude between the weather station and the house multiplied by the environmental lapse rate (=0.006 degC/m at sea level). You have to know that this is an insignificant factor in comparison to, say, the urban heat island effect — which is totally unaccounted for.

The other part of the formula (“INDEX(W219:W1008…”) pulls a line from the 800 row table of values for different locations around the world, mentioned at the beginning of this post.

I don’t expect many people to get this far through this article, which is why I put all the conclusions at the top.

Basically, if you’re computerizing some paper accounts of a small business and it’s before 1993, then Excel is probably the right tool for you. Otherwise you’re just trying to bang screws in with a hammer and it’s not a pretty sight.

It is the duty of all serious craftsmen, such as modern day software engineers, to cringe violently when they see this sort of thing happening.

Unfortunately we tend to get ignored because the experts don’t think we know what we’re talking about. And that’s why the software for controlling an open source light bulb is ten times better than what passes for the state of the art in building planning design.

5 Comments

  • 1. Graeme Nisbet replies at 11th January 2018, 9:32 am :

    Looks like an excellent business opportunity.

  • 2. Clive replies at 19th January 2018, 7:32 pm :

    You can do HTTP calls from within Excel – maybe not quite so seamlessly as in a language where it’s designed in from scratch, but it’s not too hard. Been doing that for years 🙂

  • 3. Julian replies at 22nd January 2018, 6:53 pm :

    Lose the words “from scratch” and I’d agree with that statement.

    There seems to be no canonical documentation page about it (to avoid random Stackoverflow snippets involving random calls like: CreateObject(“WinHttp.WinHttpRequest.5.1”)

    Can you find any proper documentation that I could refer the PHPP programmers to, to sort out the ridiculous situation of these out-of-date lookup tables in their product?

    I’ll be disappointed if it doesn’t document the further abomination of Excel running as a webserver and serving out rows of data as binary blobs of excel to insert as new tabs.

  • 4. Clive replies at 5th February 2018, 9:56 pm :

    I don’t think it’s that much harder than in straight Java, and probably easier than C++.

    I agree there’s probably no definitive documentation from MS on how to do exactly what you after. We probably worked it out many years ago from whatever equivalent of Stackoverflow was around back then, and now have the libraries to do it so don’t need to worry again.
    That CreateObject call is key though 🙂

    (MS have documented the library you need to use :
    https://msdn.microsoft.com/en-us/library/windows/desktop/aa385331(v=vs.85).aspx

    and how to call libraries :

    https://msdn.microsoft.com/en-us/library/office/bb687915.aspx

    but I’d recommend starting from a blog post/stackoverflow for a first go at putting it all together)

    I suspect you might hit a different problem though – the spreadsheet people will probably say “But it has to work without an internet connection”, at which point they’ll ignore anything else you have to say.

  • 5. Julian replies at 6th February 2018, 8:23 pm :

    “It has to work without an internet connection” is usually the last refuge of a scoundrel.

    Those links don’t seem to detail anything in particular.

    Do you have a small Excel spreadsheet that performs a simple http request and fills in a row? If I get one I can send it to them as a demonstration and see what they say.

Leave a comment

XHTML: You can use these tags: <a href="" title=""> <abbr title=""> <acronym title=""> <blockquote cite=""> <code> <em> <strong>