
Process Mapping - Forums

Sharing 19 years of knowledge and experience

 
Metastorm BPM forums
TiredDeveloper

Member
Registered:
Posts: 23
Reply with quote  #1 
In our production MBPM 9.3.2 environment, we quite often get 'deadlock' errors. It is something that usually goes away fairly quickly, but occasionally persists for a longer period (up to one hour) with some users experiencing multiple errors of this sort, plus slow speeds.

The error occurs across several processes and actions and a typical error looks like this:-

Failed to execute deployed method 'ListOptions', using entity 'ddStatus5'. Exception 'Transaction (Process ID 294) was deadlocked on lock | communication buffer resources with another process and has been chosen as the deadlock victim. Rerun the transaction.' occurred when attempting to 'BusinessObjectStore'.

The error doesn't appear to be restricted to a particular type of form element, though it happens most frequently with dropdowns, grids or other fields populated when rendering the web form. There doesn't appear to be one particular table involved, and sometimes the SQL queries returning the data are very simple.

Has anyone experienced this type of problem, and can you describe a method to mitigate these deadlock errors?
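For anyone digging into this, SQL Server can write full deadlock graphs (victim, resources, statements involved) to the error log with trace flag 1222. This is a standard SQL Server diagnostic, not anything MBPM-specific, and it requires sysadmin rights; a minimal sketch:

```sql
-- Log detailed deadlock information to the SQL Server error log.
-- The -1 makes the flag global to the instance.
DBCC TRACEON (1222, -1);

-- ...reproduce the deadlock, then inspect the error log:
EXEC sp_readerrorlog;

-- Turn the flag off again when done.
DBCC TRACEOFF (1222, -1);
```

The logged graph names the tables, indexes and statements on both sides of the deadlock, which is usually enough to tell whether it is application SQL or MBPM's internal processing.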
BMellert

Guru
Registered:
Posts: 688
Reply with quote  #2 
WOW, this is interesting.  We have also had a significant rise in deadlock issues ...

and it is causing some folders in sub-processes to "disappear" if the deadlock happens to occur when trying to exit the sub-process and return to the main process.  We have to manually find these conditions (the users don't always realize the folders are disappearing, only if they are actively watching for them), fix them in the back end (update eFolder.eStageName), open the folder and have it reprocess.  I was working on an automated process using a flag, but then realized I don't have all of the pieces to get this working correctly.

I've asked our DBAs if they can see anything (no word yet) and I have not yet raised anything with OpenText, but perhaps it's time we both did.  I'd be curious if you find anything out, and I will do the same.
thomas

Senior Member
Registered:
Posts: 79
Reply with quote  #3 
How are the dropdown lists populated? From a BO with an underlying query? 

It's probably not the best solution out there, but when we ran into these issues, especially with field controls causing deadlocks, we would add query hints at table level, e.g. FROM Table WITH (NOLOCK).
Since the data in dropdowns etc. probably won't be changing all the time, the risk of dirty reads can largely be ignored. 

Not sure if this would be the fix for your issues as well. 
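As a concrete sketch of the hint described above (table and column names here are invented for illustration; in MBPM the hint would need to be added wherever the BO or dropdown query is defined):

```sql
-- Hypothetical dropdown lookup query. WITH (NOLOCK) reads uncommitted data
-- and takes no shared locks, so the read cannot participate in a
-- reader/writer deadlock -- at the cost of possible dirty reads.
SELECT StatusCode,
       StatusDescription
FROM   dbo.StatusLookup WITH (NOLOCK)
ORDER  BY StatusDescription;
```

For near-static lookup data this trade-off is usually acceptable; for transactional tables it is not, since a dirty read can return rows that are later rolled back.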
TiredDeveloper

Member
Registered:
Posts: 23
Reply with quote  #4 
The dropdowns I've looked at appear to use a direct SELECT Query, but it is a very simple one.  It also appears with some Grids which use a BO.

It would be a large task for me to go through all the processes and make changes, but it is perhaps something that could be done organically when other change requests are made.

I'm working with the DBA to see if we can identify any issues on the SQL Server database which might be causing this. 
BMellert

Guru
Registered:
Posts: 688
Reply with quote  #5 
We use (NOLOCK) in most statements due to SQL's tendency to lock on reads.

However, our deadlock errors occur when MBPM is doing its internal stuff (normal back-end processing like processing timed events, regenerating actions, etc.)
BMellert

Guru
Registered:
Posts: 688
Reply with quote  #6 

We set READ_COMMITTED_SNAPSHOT over the weekend.

Now we wait to see whether it helps.  I will post an update either way.
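For reference, the setting is enabled per database with standard T-SQL (the database name below is illustrative); the ALTER needs near-exclusive access to the database, which is why a quiet weekend is the usual window:

```sql
-- Enable read-committed snapshot isolation: readers see row versions kept in
-- tempdb instead of taking shared locks, which removes most reader/writer
-- blocking. WITH ROLLBACK IMMEDIATE kills open connections so the ALTER
-- can take the exclusive lock it needs.
ALTER DATABASE [MBPMDatabase]
SET READ_COMMITTED_SNAPSHOT ON
WITH ROLLBACK IMMEDIATE;

-- Verify the setting took effect:
SELECT name, is_read_committed_snapshot_on
FROM sys.databases
WHERE name = 'MBPMDatabase';
```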

TiredDeveloper

Member
Registered:
Posts: 23
Reply with quote  #7 
I'll be interested to see if it helps.  We're moving our DB to a new server soon and also upgrading from SQL Server 2008 to SQL Server 2012.  I'm waiting to see if this has any positive effect.  Will let you know.
BMellert

Guru
Registered:
Posts: 688
Reply with quote  #8 
Well, it seemed to have helped at first; however, after 2.72 days (including a lightly-used Sunday) we nearly ran out of version store in our tempdb, which nearly halted the entire instance, including several other databases.  It seems MBPM never released or committed at least one transaction on each engine server when it started, and SQL Server keeps accumulating row versions until the oldest open transaction is released.

The only way we got around this was to restart the engine services on both servers.  (Restarting the service on only one server didn't help as the other server also had a transaction from the last reboot.)  Once the second service was restarted, the database started recovery of the version store and tempdb.  Unfortunately, this appears to have impacted a few users, but it seems very few since each service bounce took about 11 seconds. We didn't have much choice as we were about to crash several databases. 

We had not expected this situation, and it doesn't seem to be mentioned in other posts elsewhere.

The default Windows service doesn't seem to allow automatic restarting of the services, though there may be ways to script it using PowerShell.  (We hesitate to do so, so we hope there is a better alternative.) 

We have recovered for the moment after bouncing the services (while users were connected, but we didn't have a choice) and are going to monitor the ramp-up for now, but we can't restart services while users are working heavily.
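For anyone hitting the same wall, the transaction pinning the version store and the version-store growth can both be inspected with standard DBCC commands and DMVs (database name illustrative):

```sql
-- Oldest active transaction in the MBPM database -- the one preventing
-- version-store cleanup.
DBCC OPENTRAN ('MBPMDatabase');

-- Snapshot/RCSI transactions currently holding row versions, oldest first.
SELECT session_id, transaction_id, elapsed_time_seconds
FROM sys.dm_tran_active_snapshot_database_transactions
ORDER BY elapsed_time_seconds DESC;

-- Total version-store space reserved in tempdb, in KB (pages are 8 KB).
SELECT SUM(version_store_reserved_page_count) * 8 AS version_store_kb
FROM sys.dm_db_file_space_usage;
```

Watching `version_store_kb` over time shows whether the store is being cleaned up or ramping toward filling tempdb, and the session_id from the DMV identifies which engine connection to chase.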

BMellert

Guru
Registered:
Posts: 688
Reply with quote  #9 

We set this setting a couple of weeks ago.  Ever since, MBPM has been eating up tons of tempdb space: the very first transaction started by the engine service (it runs CREATE PROCEDURE esp_get_attachment_76 every time the service starts) never commits, releases or terminates, so all transactions after it keep their row versions.  This has nearly killed our entire instance, not just our MBPM database, on a few occasions.

We've taken to bouncing the service on each server at least daily, which at least allows the database to catch up again.  However, this is strictly a band-aid, so we are going to have to turn the setting off and go back to dealing with failing MBPM processes due to deadlocks.

BMellert

Guru
Registered:
Posts: 688
Reply with quote  #10 
We turned off READ_COMMITTED_SNAPSHOT last night since MBPM with this setting nearly killed the instance by filling tempdb.

The tempdb issue went away and the database is working well again.
Alas, deadlocks within MBPM occurred almost immediately after restarting the services.
praxkan

Veteran
Registered:
Posts: 142
Reply with quote  #11 
Do you use DataTables/DataSets/Lists anywhere from Mstm.SelectSQL?

For whatever reason, when you return anything but a Text object from SelectSQL, Metastorm seems to hold a lock on the table in question within that object (!!!!)

So if you have a few of these instances running in parallel, deadlocks are everywhere, timeouts etc. 

Realistically, I've not found a satisfactory workaround for this on the e-work side. About the only thing that worked was to move EVERYTHING that's not a SELECT statement into a stored procedure, and all complex DB operations outside Metastorm.

:/
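A minimal sketch of that workaround (procedure and table names are invented for illustration): move the non-SELECT work server-side so the engine only ever issues a short procedure call, and the write locks live inside a tightly-scoped transaction in SQL Server:

```sql
-- Hypothetical example: do update work in a stored procedure instead of
-- through the engine, keeping the transaction as short as possible so
-- locks are held only for the duration of the UPDATE itself.
CREATE PROCEDURE dbo.usp_UpdateFolderStatus
    @FolderId  INT,
    @NewStatus NVARCHAR(50)
AS
BEGIN
    SET NOCOUNT ON;
    BEGIN TRANSACTION;
        UPDATE dbo.FolderStatus       -- illustrative table
        SET    StatusCode = @NewStatus
        WHERE  FolderId   = @FolderId;
    COMMIT TRANSACTION;
END;
```

The process script then calls the procedure and returns nothing but text, so no table-holding object ever lives inside the engine.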
BMellert

Guru
Registered:
Posts: 688
Reply with quote  #12 
The funny thing is that we haven't changed the MBPM release: we didn't have many deadlock issues before, but we've had lots since.

The oldest transaction is an MBPM "create procedure" for eattachment...76 when the engine starts.  That is before our code set runs.

As soon as we turned off that database setting, it took care of tempdb growing and nearly crashing the instance ... but the deadlocks appeared almost immediately.  The interesting thing is it can go hours with no reported deadlock issues, then there are a few.  (I'd think if it was something like returning .List etc. it would be more consistent ... and would have happened all along.)

We've uploaded two sets of database trace files to OT.  Hopefully they can identify something, as the indications imply to me that the issues are in internal MBPM processing.  Time will tell.