Rolling out new software or how not to schedule things



  • I am part of a 2 person + 1 manager team rolling out some new PLC programming software for a customer, and having just got off a conference call I am sitting here looking at a head-shaped indentation on my desk.

    For those of you unfamiliar with PLC programming software, think of the PLC as a proprietary 3rd party embedded system which requires the vendor's own IDE in order to write and download binary code to the controller - no compilation step involved. On the controller end is effectively a very robust VM that runs the binary code. It is even locked down to the point that the controller records the version of the IDE used to write the program - so you can't edit the program with an older version of the IDE (if you need to use a previous IDE, all you can do is blast the entire controller with the last version of the program written with that IDE). While this may sound horrible when compared to the openness of general programming, the locked-down nature does give you some nice benefits, such as on-the-fly program updates and rock-solid operation to the point that only a physical hardware crash will take a PLC offline. And in a manufacturing environment, where a crashed system could break physical things causing major $$$$ damage or even maim or kill people, the locked-down nature is worth putting up with.

    The project that I am working on is to upgrade the customer from an older version of the IDE (now unsupported for about 8 years) that was basically a stand-alone system to the latest and greatest version, which also has a client-server change management system so you can keep track of what version of code is running on what system. I am working on the server side of things while my colleague is busily upgrading the programs to the latest version of the IDE. Which has led to the following WTFs:

    • The manager is the person who sold the system to our client - he is a salesman for this product. He sold the customer on the idea that they would need to supply the server machine. He also sold them on the scripted functionality that would allow the IDE to go out to each PLC and check whether the version in the machine matches the one in the change management system (this lets you catch maintenance people who are creating rogue copies of the code that are not stored on the server - and yes, that does happen). The trouble is that at the moment the Change Management software and the IDE can't live on the same box, as they are at different revision levels, which precludes them from sharing some code. If they were at the same level then we could put them on the one box. After the purchase order was made, the manager asked me to look over it to see if what we are supplying was reasonable. At that point I had to inform him that they will need 2 servers, or 2 VMs, or anything with a 2 in it in order to meet the requirements that *he* sold them.
    • In the meantime my busy little colleague has been running around upgrading the PLC code to the latest and greatest IDE. The trouble is that without a server he has nowhere to put this code. And we are not likely to get a server to work on until the end of this month.
    • While all this has been happening, the maintenance staff who do the day-to-day PLC programming are still using the old IDE. Remember what I said about how you can't program the PLC with an older version of the IDE? Well, it seems that the people who use the PLCs think that it's better to have a usable system than the latest system - so whenever they need to do work on a PLC they are blasting it back to the previous version. This is not unreasonable, as they are not even going to get any training on the new IDE until the end of *next* week (and yes, it is different enough that they will need that training).
    • Just to add insult to injury, I spent a bit of time over the weekend trying to set up a VMware test system so we could show the customer's engineering manager what the final system would look like and how their maintenance staff would use it (and yes, that means he never even saw the software running in the purchase order phase). I installed the Change Management server, then applied the service pack, and then applied the latest SIM (Software Improvement Module - i.e. a patch). Nothing worked. Blow away the VM, rinse and repeat. Same thing. Then on my third attempt I discovered that I had to do the initial install, start the package, and only then apply the service pack. Then I slapped on the SIM, which caused things not to work again - albeit for a different reason, and at the time I didn't know it was the SIM's fault. Tried some other options and then called tech support this morning ... "Oh yes, SIM 14 breaks the system, I know that SIM 11 works - you should try that one. SIM 15 will be out sometime in the future and does fix the issue". Thanks for the wasted weekend.
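
    To make the "rogue copy" check from the first point concrete, here is a minimal sketch of the comparison step. It is an illustration under assumptions, not the vendor's script: the names `fingerprint`, `audit` and `AuditResult` are made up, and the program images are passed in as plain bytes because the step that actually reads a program back out of a PLC is vendor-specific and not shown here.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditResult:
    plc_name: str
    status: str  # "match", "rogue", or "unknown"


def fingerprint(blob: bytes) -> str:
    """Return a SHA-256 digest so two program images can be compared cheaply."""
    return hashlib.sha256(blob).hexdigest()


def audit(running: dict[str, bytes], checked_in: dict[str, bytes]) -> list[AuditResult]:
    """Compare the image running on each PLC with the copy held by change management.

    Both arguments are hypothetical stand-ins: `running` maps a PLC name to the
    bytes read back from the controller, `checked_in` maps it to the latest
    bytes stored on the server.
    """
    results = []
    for name in sorted(running):
        if name not in checked_in:
            status = "unknown"  # the server has never seen this PLC
        elif fingerprint(running[name]) == fingerprint(checked_in[name]):
            status = "match"
        else:
            status = "rogue"  # someone changed the PLC without checking in
        results.append(AuditResult(name, status))
    return results
```

    Anything reported as "rogue" is a controller whose running program differs from the last checked-in copy, which is exactly the case the customer was promised they could detect.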

    So the customer was sold a dream, we have no project plan (or completion date), we have no idea as to where we are going to put the final system, and my colleague is running around making changes that are being reverted by the customer so that they can still do their day job.

    Finally, I'm amused by the fact that the back end of the Change Management server is built on VSS, and that is not even the TWTF in this post!



  • @OzPeter said:

    At that point I had to inform him that they will need 2 servers, or 2 VMs, or anything with a 2 in it in order to meet the requirements that he sold them.
    So you guys bought a Pentium 2 and everything was great forever, the end.



  • I feel sorry for your desk. It probably disagrees about PLCs not causing major $$$$ damage.



  • @OzPeter said:

    maim or kill people
    @OzPeter said:
    blast the entire controller
    @OzPeter said:
    blasting it back
    @OzPeter said:
    Blow away the VM

     I like these bits.

     It's like BOOM BANGGG BLAST VVOOOSH KABLAMMO and weaponized frying pans.

     



  • So who did the change management that allows the CMS to be incapable of managing a changed IDE? Shouldn't the CMS be integrated with the IDE so that they don't release a new IDE without testing it against the CMS? The rest sounds like Dilbert's MFU 1 and MFU 2.

    @OzPeter said:

    rock-solid operation to the point that only a physical hardware crash will take a PLC offline

    What exactly do you mean by "physical" here?  Some people use "physical" when perhaps "actual" would be better.  Do you mean "physical" as "There's a funny smell and we can see flames coming out the side holes," or "Please send somebody over to press the reset button"?



  • Of course, on embedded systems where …

    Oh, RATS!

    (well, you were all THINKING it, right?)



  • @Qwerty said:

    So who did the change management that allows the CMS to be incapable of managing a changed IDE? Shouldn't the CMS be integrated with the IDE so that they don't release a new IDE without testing it against the CMS? The rest sounds like Dilbert's MFU 1 and MFU 2.

    The IDE is bonded to the PLC, and stores all its programming information in a binary blob. The CMS is a separate product (made by a different team from the IDE/PLC team) that versions binary blobs - and by version I don't mean stores diffs - it just stores a zipped copy of each blob you check in. But the CMS can be hosted in the IDE's UI. It's a crazy system - especially when I have had to deal with bugs and missing features that are the result of the CMS and IDE teams not really communicating. One scary thing is that, for historical reasons, the CMS in some ways has better support for the programming files of their competitors' IDEs!

    In re-reading your comments I realized what you were actually targeting. Yes, it is a crazy Dilbert world that I reside in. In fact there is a matching CMS release for the latest IDE release. However, it is unofficial and not supported. You enable it by adding some command line parameters to the installer. However, we have been informed that no patches will be made to the CMS side of that release in the future - so we can't install that for the customer. It sounds crazy, but I believe the reasoning is that they are trying to split the CMS totally off the IDE, and this non-release of the latest product is meant to push us in the direction of the "blessed" version of the CMS client, which "only" relies on ActiveX and IE. But I have no idea when they will come up with a CMS and an IDE that can reside on the same box. Already the IDE can't reside on the same box as another one of their products. This other product is a report generator that pumps out all sorts of data via HTTP for management viewing (written in Java but only "supports" IE - go figure). The problem is that the IDE co-opts port 80 for its own communications use and locks out the web server. The only solution is to move the web server to a port other than 80, which then starts to get fun when people want to find other things hosted on the same web server.

    (continuing from the first para .. yeah I'm tired) Anyway, the overall CMS/IDE architecture has the advantage that if you can package something into a binary blob you can throw it at the CMS, and hence you can use the CMS not only for your own products but for those of your competitors. Though it does mean that in order to report on diffs you have to extract two separate versions and let the native IDE toolset work over them to tell you the differences. Throw in some scripting for manipulating blobs and you have a fairly flexible system that can handle a multitude of proprietary binary formats (although the scripting language of choice for this product is VBScript - yes I know). The product does suck in a lot of ways, but it is actually one of the better ones around.
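
    The storage model described above (whole compressed snapshots, no stored diffs, comparison delegated to a format-specific tool) can be sketched roughly as follows. This is a toy model under assumptions, not the product's design: `BlobStore`, `check_in`, `extract` and `compare` are invented names, `zlib` stands in for whatever zip format the real CMS uses, and `native_diff` stands in for the IDE's own compare tool.

```python
from __future__ import annotations

import zlib
from typing import Callable, TypeVar

T = TypeVar("T")


class BlobStore:
    """Keeps every checked-in version of an opaque binary file as a full snapshot."""

    def __init__(self) -> None:
        self._versions: dict[str, list[bytes]] = {}

    def check_in(self, name: str, blob: bytes) -> int:
        """Compress and store the whole blob; return its 1-based version number."""
        history = self._versions.setdefault(name, [])
        history.append(zlib.compress(blob))
        return len(history)

    def extract(self, name: str, version: int) -> bytes:
        """Return the original bytes of the given 1-based version."""
        return zlib.decompress(self._versions[name][version - 1])


def compare(
    store: BlobStore,
    name: str,
    old: int,
    new: int,
    native_diff: Callable[[bytes, bytes], T],
) -> T:
    """Extract two versions and hand them to a format-specific diff tool.

    The store never interprets the blob itself, which is why it can hold any
    vendor's files as long as some external tool understands them.
    """
    return native_diff(store.extract(name, old), store.extract(name, new))
```

    The trade-off shows up directly in the sketch: the store needs no knowledge of any file format, but every comparison costs two full extractions plus a run of the external tool.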

    @Qwerty said:

    What exactly do you mean by "physical" here?  Some people use "physical" when perhaps "actual" would be better.  Do you mean "physical" as "There's a funny smell and we can see flames coming out the side holes," or "Please send somebody over to press the reset button"?

    Yes, I really mean a fire-out-of-the-side-holes type of failure. I can't quote numbers off the top of my head, but PLC failures due to software, in the way that say Windows computers fail, are rare. I would expect more hardware failures. This is because the systems are so locked down that it is nearly impossible for the equivalent of a userland process to take the system down. But then again, I suppose this view is probably biased by my having to support stuff on Windows rather than a more robust OS.

