8.5.1 is still broken for large applications

A while ago Erik Brooks blogged about a major problem introduced in Notes/Domino 8.5.1. IBM reacted quite quickly and released a fix for the problem.
Erik has posted a summary of where we are with the issue here: 8.5.1 UNFAIL - Part 2: The Fix.

Basically, the problem introduced in 8.5.1 is that the behaviour of all variants of GetSOMETHINGByKey (for instance GetAllDocumentsByKey) has changed compared to earlier releases of Notes/Domino. LotusScript code may now fail with "The collection has become invalid" if the view used was updated while the code was running.
If you have an application with "lots" of document updates, this issue will show up at your site.
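
To make the failure mode concrete, here is a minimal sketch of one defensive pattern: retrying the lookup a bounded number of times when the collection is reported invalid. It is written in Python rather than LotusScript so that it is self-contained, and every name in it (FlakyView, CollectionInvalidError, lookup_with_retry) is hypothetical. It only simulates the behaviour described above; it is not IBM's fix and not a tested workaround for Notes/Domino.

```python
class CollectionInvalidError(Exception):
    """Stand-in for the LotusScript error "The collection has become invalid"."""


class FlakyView:
    """Hypothetical view whose lookup fails while its index is being updated."""

    def __init__(self, docs, failures):
        self.docs = docs
        self.failures = failures  # number of lookups that fail before one succeeds

    def get_all_documents_by_key(self, key):
        # Simulate a view update invalidating the collection mid-lookup.
        if self.failures > 0:
            self.failures -= 1
            raise CollectionInvalidError("The collection has become invalid")
        return [d for d in self.docs if d["key"] == key]


def lookup_with_retry(view, key, max_attempts=3):
    """Retry the lookup up to max_attempts times, then re-raise the last error."""
    for attempt in range(1, max_attempts + 1):
        try:
            return view.get_all_documents_by_key(key)
        except CollectionInvalidError:
            if attempt == max_attempts:
                raise
```

In an application with hundreds of such calls, a wrapper like this would still have to be introduced at every call site, which is why a fix in the product itself matters.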

A fix was released, but it is only available on request from Lotus Support. But at least you can have it.

Now it looks like the fix is actually only a halfway fix. It apparently only addresses the issue when the LotusScript is executed on the Lotus Domino server, in an agent for instance. But the original problem is also seen when GetSOMETHINGByKey is executed from a Lotus Notes client. See comments 27, 28 and 29 on the original blog entry: 8.5.1 FAIL. Your code may just break.

Most of the organisations that I know who have Notes/Domino Applications will experience this problem. This is a major bug, and it does not sound like it is scheduled for a quick fix.


Comments

1 - @1 Wolfgang: That probably explains why the technote URL that Keith Brooks links to does not work today.

This is not a minor problem. There are probably tens of thousands of LotusScript code lines that contain GetSOMETHINGByKey.
Written by Jens Bruntt at 12:24:49 on 24-03-2010

2 - To answer some questions.

@2. Correct. The tech note was withdrawn due to an issue with the current hot fix.

@7 The reason to open a PMR is for tracking purposes. Hot fixes do not go under the same extensive testing that a fix pack or point release would have to go through (they do at a later point when merging into FP/Point release).

So if an issue does occur then support will chase it up with all the customers. I (and others) spent this morning contacting customers and updating them on the status of the hot fix.

I don't have any details on hand regarding a client fix (apologies). The process to generate these is somewhat different and non-trivial. So the server fix would normally be built first in such an instance.

I can say that development are well aware of how critical this situation is and are working on it around the clock.

I've mentioned this before, but support are committed to making sure you are looked after. So if you feel your PMR is not being handled correctly please ask to speak to a Manager.
Written by Simon O'Doherty at 23:13:45 on 24-03-2010

3 - @8 Thank you for the clear answers. To me it sounds like you are handling the issue in a very professional manner. It makes sense to me.
Written by Jens Bruntt at 07:00:15 on 25-03-2010

4 - The problem is getting more and more ridiculous: I requested the fix from IBM support on Monday. Today I got the information that the fix has been withdrawn due to major problems with it.

So: No fix for the Domino server! And for the client there is "no need" according to IBM support.

Regards
Wolfgang
Written by Wolfgang Haderlein at 11:27:41 on 24-03-2010

5 - I just counted: in just one of my applications (my largest application here at work, but just one of several) I have 300+ instances of GetAllDocumentsByKey() and GetDocumentByKey(). 179 and 148 respectively, to be exact. Most of that code is running on the clients.
Written by Karl-Henry Martinsson at 15:37:43 on 24-03-2010

6 - @4 Chad: Maybe I don't understand it: what different requirements have to be discussed for a client fix for us as a customer? There is a huge bug that has to be fixed, and we want to deploy the fix to all our client installations regardless of whether every client uses an application that has the problem.

And: I think it is very strange that we have to create a PMR for this. I created one, but I want an official statement from IBM on why this is necessary. Sorry, but if Microsoft acted like IBM does in this case, a lot of people would shout loudly.
Written by Wolfgang Haderlein at 20:26:06 on 24-03-2010

7 - I agree. This has been a huge problem for us. We need the fix on the client as well. It's the client that executes as much of this code for us as the server.
We applied the original hotfix on our iSeries servers. That did seem to correct the problem from the scheduled agents. However, Richard Schwartz tweeted last night that the hotfix has been pulled due to a regression bug. That's the first I've heard of that. As far as I know the server fix has worked well for us.
But I'm continuing to talk to support about the need for a client fix. I originally thought/hoped that the client would simply be included in the original hotfix release date. When it didn't show up we were told there's no client fix because no one ASKED for a fix on the client. Wow!
So I emailed someone else our business impact case and I'm hopeful that something might get done.

Now I see comments in the original Erik Brooks thread that there is a client hotfix? So I'm just getting more confused about this.

Lotus - We NEED a client fix. The thought of refactoring all the code is unfathomable!
Written by David Leedy at 12:36:05 on 24-03-2010

8 - @4 Chad: Thanks for clarifying things.
So, it sounds like a fix for the client will be available in the same time frame as the server fix.

More information on why the current fix has been pulled here: { Link }
Written by Jens Bruntt at 14:36:33 on 24-03-2010

9 - We have identified some problems with the most recent fix and are working to correct those before making it available again. That's why the Technote is temporarily unavailable. If you have installed that fix, we recommend uninstalling it until the newer version is available.

The code to fix the problem is common to the client and server, so this is not a case of fixing one and leaving a vulnerability on the client side unaddressed. There are different requirements for building a client fix that should be discussed with your Support representative when you open a PMR.

If you have any questions, feel free to reach me at chads@us.ibm.com.
Written by Chad Scott at 13:20:54 on 24-03-2010

10 - @8: Thank you for the answer, but a general problem remains: we have to decide between waiting for a fix pack (with extensive testing, but with the risk that we run into the problem) and applying a hotfix (with less testing) and running into other problems (caused by the hotfix). As a result, deploying a hotfix is extremely risky for me.

Strange situation, isn't it? I know this is a basic risk of using software. But in this case the problem has been known for a few weeks, is critical and is, in my opinion, worth a timely fix pack.

But okay: I hope IBM will provide a valid fix soon.
Written by Wolfgang Haderlein at 09:58:09 on 25-03-2010