1
votes

We have a legacy PowerBuilder 12.1 Classic application with an Oracle 11g back end, and are experiencing performance issues in production that we cannot reproduce in our test environments.

The window in question has shared grid/freeform DataWindows and buttons to open other response windows, which when closed cause the grid to re-retrieve.

The grid has a very expensive query behind it, several columns receive their values from function calls with some very intense SQL within, however it still runs within a couple seconds, even in production.

The only consistency in when the errors occur is that it seems to be more likely if they attempt to navigate to the other windows quickly. The buttons that open said windows are assuming that a certain instance variable is set with the appropriate value from the row in focus in the grid. However, in this scenario, the instance variable has not yet been set, even though it looks like the row focus change has occurred. This is causing null reference exceptions that shouldn't be possible.

The end users' network connectivity is often sluggish, and their hardware isn't any less capable than ours. I want to blame the network, but I attempted to reproduce this myself in development by intentionally slowing down the SQL so that I could attempt to click a button, however everything happened as I expected: clicking the button didn't happen until after retrieve and all the other events finished.

My gut tells me that for some reason things aren't running synchronously when they should, and the only factor I can imagine is the speed of the SQL, whether from the query being slow, or the network being slow, but when I tried reproducing that effect things still happened in the proper sequence. The only suspect code is that the datawindow ancestor posts a user event called ue_post_rfc from rowfocuschanged, and this event does a Yield(). ue_post_rfc is where code goes instead of rowfocuschanged.

Is there any way Yield() would cause these problems, without manifesting itself in test environments, even when SQL is artificially slowed?

1

1 Answers

1
votes

While your message may not give enough information to give you a recipe to solve your problem, it does give me a hint towards a common point of hard-to-diagnose failures that I see often in PowerBuilder systems.

The sequence of development events goes something like this

  • Developer develops code where there is a dependence on one event firing before another event, often a dependence through instance or global variables
    • This event sequence has been something the developer has observed, but isn't documented as a guaranteed sequence (like the AcceptText() sequence or the Update() sequence are documented)
    • I find this a lot with posted events, and I'm not talking about event and post-event where post-event is posted from event, but more like between post-ItemChanged and post-GetFocus
  • Something changes the sequence of events, breaking the code. Things that I've seen change non-guaranteed sequences of events include:
    • PowerBuilder version change
    • Operating system change
    • Hardware change
    • The application running with other applications taxing the system resources
  • Whoever is now in charge of solving this, has no clue what is going on or how to deal with it, so they start peppering the code with Yield() statements (I've literally seen comments beside a Yield() that said "I don't know why this works, but it solves problem X")
    • Note that Yield() allows any and all events in the message queue to be processed, while this developer really wants only one particular event to get through
    • Also note that the commonly-seen-in-my-career DO ... LOOP UNTIL (NOT Yield()) could loop infinitely on a heavily loaded system
  • Something happens to change the event sequence again
  • Now when the Yield() occurs, there is a different sequence of messages in the queue to be processed, and not the message the developer had wanted to be processed
  • Things start failing again

My advice to get rid of this problem (if this is your problem) is to either:

  • Get rid of the cross-event dependence
  • Get rid of event sequence assumptions
  • Manage the event sequence yourself

Good luck,

Terry


P.S. Here's a couple of quotes from your question that make me think of Yield() (not that I don't love the opportunity to jump all over Yield() grin)

The only consistency in when the errors occur is that it seems to be more likely if they attempt to navigate to the other windows quickly.

Seen this when the user tries to initiate (let's say for example) two actions very quickly. If the script from the first action contains a Yield(), the script from the second action will both start and finish before the first action finishes. This can be true of any combination of user actions (e.g. button clicks, menu clicks, tabs, window closings... you coded with the possibility that the window isn't there anymore after the Yield() was done, right? If not, join the 99% of those that code Yield(), don't, and live dangerously) and system events (e.g. GetFocus, Deactivate, Timer)

My gut tells me that for some reason things aren't running synchronously when they should

You're right. PowerBuilder (unless you force it) runs synchronously. However, if one event is starting before another finishes (see above), then you're going to get behaviours that look like asynchronous behaviours.

There's nothing definitive in what you've said, but you did ask about Yield(). The really kicker to nail this down is if you could reproduce this with a PBDEBUG trace; you'd see which event(s) is(are) surprising you. However, the amount that PBDEBUG slows things down affects event sequences and queuing, which may or may not be helpful.