Technote (FAQ)
Question
The following is a list of answers to frequently asked questions (FAQ) about memory-related problems in IBM Integration Bus (IIB) and WebSphere Message Broker (WMB), for new and experienced users.
Answer
Index of questions in this document:
- 1. Is there a memory leak in my DataFlowEngine?
- 2. Why is my DataFlowEngine memory usage not reducing on the UNIX systems?
- 3. How can I monitor JVM memory usage of a DataFlowEngine process?
- 4. How can I tune JVM heapsize for a DataFlowEngine, bipbroker and biphttplistener processes?
- 5. How can I monitor memory usage on Windows and UNIX?
- 6. I see an abend with an out of memory error. What could be causing this failure?
- 7. How do I isolate the problem of memory growth to a particular message flow?
- 8. How does WebSphere Message Broker (WMB) handle the memory usage within a DataFlowEngine (DFE) Process?
- 9. What is memory fragmentation in a DataFlowEngine process?
- 10. How can I tell if the out of memory is caused due to the DataFlowEngine running out of JVM Heapsize or total process memory?
- 11. Are there any useful references in the area of broker/execution group memory usage?
- 12. What should I do in a broker to process a very large message?
- 13. What is MQSI_THREAD_STACK_SIZE and do I need to set it?
- 14. Are there any needs for modifying the system kernel parameters for WMB/IIB?
- 15. How do you set the limit of total memory consumed by the DataFlowEngine process?
- 16. What kind of memory usage patterns are expected on Integration nodes and Integration servers during runtime processing? Is there a formula to calculate/predict memory usage of a message flow?
- 17. When should I consider splitting the message flows across multiple EGs and what are its benefits?
- 18. Is the DataFlowEngine process size expected to grow if the same message is processed through the same message flow and undergoes the exact same processing again and again?
- 19. What are the key factors and metrics that will determine the amount of CPU, memory and disk space used/required for IIB9?
- 20. How do I know if my Integration Node/Server is running out of memory resources?
- 21. How can I troubleshoot a problem with my Integration node processes running out of memory?
- 22. Are there any patterns or samples available based on which I can design my message flows optimally?
- 23. How does deploying a message flow application with CMF (compiled message flow) enabled Vs disabled affect the DataFlowEngine memory usage?
- 24. How can I limit memory usage on the DataFlowEngine(DFE), is there an operating system or kernel parameter setting?
- 25. What is the effect of additional instances on memory usage?
12. What should I do in a broker to process a very large message?
The following resources describe techniques for handling very large messages:
- http://www.ibm.com/support/knowledgecenter/SSMKHH_9.0.0/com.ibm.etools.mft.doc/ac67176_.htm
- http://www.ibm.com/support/knowledgecenter/SSMKHH_9.0.0/com.ibm.etools.mft.doc/ac20702_.htm
- http://www.ibm.com/developerworks/websphere/library/techarticles/0505_storey/0505_storey.html
1. Is there a memory leak in my DataFlowEngine?
During message processing, it is possible for execution groups to continue to grow in memory usage. This can be due to a number of factors and does not necessarily indicate that there is a memory leak. Question 8 below describes the execution group memory growth behavior and the reasons for it.
2. Why is my DataFlowEngine memory usage not reducing on the UNIX systems?
DataFlowEngine acquires memory while processing messages through its message flows. The amount of memory required to process messages depends on the size of the messages. After some processing, the DataFlowEngine acquires enough memory that it can internally re-use the memory for all the subsequent messages that are processed through its message flows. The UNIX process holds on to the memory anticipating that it could be required for subsequent processing of messages and to avoid performance impact due to releasing and reacquiring memory.
3. How can I monitor JVM memory usage of a DataFlowEngine process?
JVM memory usage by an execution group can be monitored using resource statistics. The following product documentation page gives more information on this.
http://www.ibm.com/support/knowledgecenter/SSMKHH_9.0.0/com.ibm.etools.mft.doc/bj43370_.htm
- Use the WebSphere Message Broker Explorer (MBX) and/or the WebUI to enable and view the resource statistics
- Register subscriptions and use command line parameters to monitor the resource statistics:
http://www.ibm.com/support/knowledgecenter/SSMKHH_9.0.0/com.ibm.etools.mft.doc/bj43320_.htm
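Resource statistics can also be switched on from the command line with mqsichangeresourcestats. The snippet below only prints the command you would run; the broker name MB1 and execution group EG1 are example names.

```shell
# Print the command that activates resource statistics for one execution
# group. MB1 and EG1 are example names; substitute your own.
BROKER=MB1
EG=EG1
echo "mqsichangeresourcestats $BROKER -c active -e $EG"
```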
- Export the following environment variable in the broker service ID's profile and restart the broker to monitor the garbage collection (GC) in the broker:
export MQSIJVERBOSE=-verbose:gc
OR
- You may run the following command at the execution group level and restart just the execution group -
On AIX and Linux:
mqsichangeproperties <broker> -e <EG> -o ComIbmJVMManager -n jvmSystemProperty -v "-verbose:gc -Xverbosegclog:<path to gc.trc>"
On Sun Solaris:
mqsichangeproperties <broker> -e <EG> -o ComIbmJVMManager -n jvmSystemProperty -v "-verbose:gc -Xloggc:<path to gc.trc>"
Once the output is collected, you can use the IBM Support Assistant (ISA) workbench, a GUI-based application, to view the GC activity in graphical form. The ISA workbench can be downloaded from the following IBM site:
http://www.ibm.com/software/support/isa/download.html
After download, go to Update > Find updates and add-ons and install "IBM Monitoring and Diagnostic Tools for Java - Garbage Collection and Memory Visualizer". You can then use this tool to analyze the verbose GC output in the stdout file.
4. How can I tune JVM heapsize for a DataFlowEngine, bipbroker and biphttplistener processes?
JVM maximum and minimum heapsize can be tuned using the mqsichangeproperties command. The following online documentation gives more information:
http://www.ibm.com/support/knowledgecenter/SSKM8N_8.0.0/com.ibm.etools.mft.doc/ac55070_.htm
For biphttplistener process:
http://www.ibm.com/support/docview.wss?uid=swg21626362
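As a sketch, the JVM heap for an execution group is controlled through its ComIbmJVMManager object. The snippet below only prints the mqsichangeproperties commands you would run; the broker name MB1, execution group EG1, and heap sizes are illustrative assumptions, and the execution group must be restarted for the change to take effect.

```shell
# Build the mqsichangeproperties calls to tune the JVM heap for one execution
# group. BROKER/EG names and heap sizes are examples; values are in bytes.
BROKER=MB1
EG=EG1
MIN_HEAP=$((32 * 1024 * 1024))    # 32 MB minimum JVM heap
MAX_HEAP=$((512 * 1024 * 1024))   # 512 MB maximum JVM heap
echo "mqsichangeproperties $BROKER -e $EG -o ComIbmJVMManager -n jvmMinHeapSize -v $MIN_HEAP"
echo "mqsichangeproperties $BROKER -e $EG -o ComIbmJVMManager -n jvmMaxHeapSize -v $MAX_HEAP"
```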
5. How can I monitor memory usage on Windows and UNIX?
At any given point, you can check the memory usage for processes in the following way:
Windows:
Ctrl-Alt-Delete > Start Task Manager > Processes > Show processes for all users, then go to the process "DataFlowEngine" and look at the field "Memory (Private Working Set)".
If you want to continuously monitor the memory usage, then see the following link for the Windows Sysinternals process utilities: http://technet.microsoft.com/en-us/sysinternals/
UNIX:
ps -aelf | grep <PID for DataFlowEngine>
If you want to continuously monitor the memory usage, then the above command may have to be incorporated into a simple shell script.
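Such a script could look like the following minimal sketch; it samples a process's virtual and resident set sizes at fixed intervals. The PID defaults to the current shell's own PID purely so the sketch runs anywhere; in practice you would pass the DataFlowEngine PID.

```shell
# Sample the virtual (VSZ) and resident (RSS) size of a process at fixed
# intervals. PID defaults to this shell's own PID; substitute the
# DataFlowEngine PID in practice.
PID=${1:-$$}
SAMPLES=${2:-3}
INTERVAL=${3:-1}   # seconds between samples
i=0
while [ "$i" -lt "$SAMPLES" ] && kill -0 "$PID" 2>/dev/null; do
  # On most UNIX ps implementations, vsz/rss are reported in KB
  ps -o vsz= -o rss= -p "$PID" | awk -v ts="$(date '+%H:%M:%S')" \
    '{ printf "%s VSZ=%sKB RSS=%sKB\n", ts, $1, $2 }'
  i=$((i + 1))
  sleep "$INTERVAL"
done
```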
6. I see an abend with an out of memory error. What could be causing this failure?
An out of memory error in an abend file indicates the process has terminated as it could not acquire more memory to continue processing. The memory required for processing could be native memory or java heap memory. Investigation would be required to identify why the process was unable to acquire more memory. If using java compute nodes, it is a good practice to tune JVM Heapsize for the environment. Refer to the questions below to gain better understanding of this subject to be able to carry out the investigation.
7. How do I isolate the problem of memory growth to a particular message flow?
Separating the message flows that are deployed to an Execution group into multiple Execution Groups would allow you to recreate the problem with isolated sets of message flows deployed to their own execution groups and thus allow you to identify the specific message flows that may be contributing towards memory growth for that execution group.
To identify a potential memory leak, the same input message should be repeatedly sent into the message flow. Processing the same input message over and over again drives the same message flow paths each time, so we would expect the DataFlowEngine process memory usage to plateau very quickly; continued growth under identical input would then point to a genuine leak.
8. How does WebSphere Message Broker (WMB) handle the memory usage within a DataFlowEngine (DFE) Process?
When considering memory usage within a DataFlowEngine process there are two sources that the storage is allocated from, and these are :
1. The DataFlowEngine main memory heap
2. The operating system
When message flow processing requires some storage, then an attempt is first made to allocate the memory block required from the DataFlowEngine's heap. If there is not a large enough contiguous block of storage on the heap, then a request will be made to the operating system to allocate more storage to the DataFlowEngine for the message flow. Once this is done, then this would lead to the DataFlowEngine's heap growing with the additional storage, and the message flow will use this extra storage.
When the message flow has completed its processing, then it issues a "free" on all its storage and these blocks will be returned to the DataFlowEngine's heap ready for allocation to any other message flows of this DataFlowEngine. The storage is never released back to the operating system, because there is actually no programmatic mechanism to perform such an operation. The operating system will not retrieve storage from a process until the process is terminated. Therefore the user will never see the size of the DataFlowEngine process decrease, after it has increased.
When the next message flow runs, then it will make requests for storage, and these will then be allocated from the DataFlowEngine heap as before. Therefore there will be a re-use within the DataFlowEngine's internal storage where possible, minimizing the number of times that the operating system needs to allocate additional storage to the DataFlowEngine process. This would mean there may be some growth observed on DataFlowEngine's memory usage which is of the size of the subsequent allocations for message flow processing. Eventually we would expect the storage usage to plateau, and this situation would occur when the DataFlowEngine has a large enough heap such that any storage request can be satisfied without having to request more from the operating system.
9. What is memory fragmentation in a DataFlowEngine process?
As explained in the question above, at the end of each message flow iteration, storage is freed back to the DataFlowEngine memory heap, ready for re-use by other threads. However, some objects created within the DataFlowEngine last the life of the process and therefore reside at fixed points in the heap for that time. This leads to what is known as fragmentation, which reduces the size of the contiguous storage blocks available in the DataFlowEngine when an allocation request is made. In other words, the DataFlowEngine has free memory blocks available, but they are too fragmented to satisfy the requests made during message processing. In most cases, the requesters of storage require a contiguous chain of blocks in memory. It is therefore possible for a message flow to make a storage request that the DataFlowEngine's heap cannot satisfy, not because there is too little free storage in total, but because the free storage is fragmented such that the required contiguous block does not fit into any of the "gaps". In this situation, a request has to be made to the operating system to allocate more storage to the DataFlowEngine so that the block can be allocated.
However, when unfreed blocks remain on the DataFlowEngine's heap then this will fragment the heap. This means that there will be smaller contiguous blocks available on the DataFlowEngine's heap. If the next storage allocation cannot fit into the fragmented space, then this will cause the DataFlowEngine's memory heap to grow to accommodate the new request.
This is why small increments may be seen in the DataFlowEngine even after it has processed thousands of messages. In a multi-threaded environment there will be potentially many threads requesting storage at the same time, meaning that it is more difficult for a large block of storage to be allocated.
For example,
Some message flows implement BLOB domain processing which may result in the concatenation of BLOBs. Depending on how the message flow has been written, this may lead to fragmentation of the memory heap, because when a binary operation such as concatenation takes place, both the source and target variables need to be in scope at the same time.
Consider a message flow that reads in a 1MB BLOB and assigns this to the BLOB domain. For the purposes of demonstration, this ESQL will show a WHILE loop that causes the repeated concatenation of this 1MB BLOB to produce a 57MB output message. Consider the following ESQL :
DECLARE c, d CHAR;
SET c = CAST(InputRoot.BLOB.BLOB AS CHAR CCSID InputProperties.CodedCharSetId);
SET d = c;
DECLARE i INT 1;
WHILE (i <= 56) DO
SET c = c || d;
SET i = i + 1;
END WHILE;
SET OutputRoot.BLOB.BLOB = CAST(c AS BLOB CCSID InputProperties.CodedCharSetId);
As can be seen, the 1MB input message is assigned to the variable c, which is then also copied to d. The loop then concatenates c and d and assigns the result back to c on each iteration. This means that c will grow by 1MB on every iteration. Since this processing generates a 57MB blob, one may expect the message flow to use around 130MB of storage. The main aspects of this are the ~60MB of variables in the compute node, and then 57MB in the Output BLOB parser which will be serialized on the MQOutput node.
However this is not the case. This ESQL will actually cause a significant growth in the DFE's storage usage due to the nature of the processing. This ESQL encourages fragmentation in the memory heap. This condition means that the memory heap has enough free space on the current heap, but has no contiguous blocks that are large enough to satisfy the current request. When dealing with BLOB or CHAR Scalar variables in ESQL, these values need to be held in contiguous buffers in memory.
Therefore when the ESQL SET c = c || d; is executed, in memory terms this is not just a case of appending the value of d, to the current memory location of c. The concatenation operator takes two operands and then assigns the result to another variable, and in this case this just happens to be one of the input parameters. So logically the concatenation operator could be written SET c = concatenate(c,d). This is not valid syntax but is being used to illustrate that this operator is like any other binary operand function. The value contained in c cannot be deleted until the operation is complete since c is used on input. Furthermore, the result of the operation needs to be contained in temporary storage before it can be assigned to c.
10. How can I tell if the out of memory is caused due to the DataFlowEngine running out of JVM Heapsize or total process memory?
- When a DataFlowEngine reaches the JVM heap limits, it typically generates a javacore and a heapdump, along with a Java out-of-memory exception in the EG stderr/stdout files.
- When the DataFlowEngine runs out of total process memory, it may cause the DataFlowEngine to go down or the system to become unresponsive.
11. Are there any useful references in the area of broker/execution group memory usage?
See the following references:
- https://developer.ibm.com/answers/questions/184554/why-does-the-iib-wmb-biphttplistener-process-occup.html
- https://www.ibm.com/developerworks/community/blogs/aimsupport/entry/important_facts_about_garbage_collection_in_websphere_message_broker?lang=en
13. What is MQSI_THREAD_STACK_SIZE and do I need to set it?
Purpose: For any given message flow, a typical node requires about 2KB of the thread stack space. Therefore, by default, there is a limit of approximately 500 nodes within a single message flow on the UNIX platforms and 1000 nodes on Windows. This limit might be higher or lower, depending on the type of processing being performed within each node. If a message flow of a larger magnitude is required, you can increase this limit by setting the MQSI_THREAD_STACK_SIZE environment variable to an appropriate value (the broker must be restarted for the variable to take effect). This environment variable applies to the whole broker, so MQSI_THREAD_STACK_SIZE is used for every thread that is created within a DataFlowEngine process. If the execution group has many message flows assigned to it and a large MQSI_THREAD_STACK_SIZE is set, the DataFlowEngine process can require a large amount of storage just for stacks.
In WMB, it is not just the execution of nodes that can cause a build-up on a finite stack; any processing that leads to deeply nested or recursive processing can cause extensive use of the stack. You may therefore need to increase the MQSI_THREAD_STACK_SIZE environment variable in the following situations:
a) When processing a large message that has a large number of repetitions or nesting.
b) When executing ESQL that recursively calls the same procedure or function. This can also apply to operators; for example, if the concatenation operator is used a large number of times in one ESQL statement, this can lead to a large stack build-up.
However, it should be noted that this environment variable applies to all the message flow threads in all the execution groups, as it is set at the broker level. For example, if there are 30 message flows and this environment variable is set to 2MB then that would mean that 60MB would be reserved for just stack processing and thus taken away from the DataFlowEngine memory heap. This could have an adverse effect on the execution group rather than yielding any benefits. Typically, the default of 1 MB is sufficient for most of the scenarios. Therefore we would advise that this environment variable NOT be set unless absolutely necessary.
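The arithmetic behind the 60MB figure above can be sketched directly:

```shell
# Reproduce the worked example above: 30 message flow threads, each with a
# 2 MB MQSI_THREAD_STACK_SIZE, reserve 60 MB of the process address space.
STACK_BYTES=$((2 * 1024 * 1024))    # MQSI_THREAD_STACK_SIZE of 2 MB
THREADS=30                          # one thread per message flow
TOTAL_MB=$(( (STACK_BYTES * THREADS) / 1024 / 1024 ))
echo "Reserved for thread stacks: ${TOTAL_MB} MB"
```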
Please refer to the following link for more information:
http://www.ibm.com/support/knowledgecenter/SSMKHH_9.0.0/com.ibm.etools.mft.doc/ac55020_.htm
14. Are there any needs for modifying the system kernel parameters for WMB/IIB?
In WMB/IIB there are no suggested kernel settings for tuning the operating system. However, WebSphere MQ and some database products do have such recommendations, and WMB/IIB runs in the same environment as those products. Hence, it is best to check and tune your environment as guided by those applications.
15. How do you set the limit of total memory consumed by the DataFlowEngine process?
There is no functionality within the Broker product to set a maximum memory limit on an execution group. Operating systems may offer such functionality, and you should look into the ways of doing this on your platform (for example, HP-UX). However, it should be noted that if a DataFlowEngine is limited to a value that it needs to exceed, it will terminate when its request for storage is refused.
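For example, on Linux most shells can place a virtual memory limit on the processes they start. This is an operating system facility, not a broker setting, and the flag and units vary by platform, so treat the following as an illustrative sketch only.

```shell
# Attempt to cap the address space of processes started from this shell to
# ~2 GB (ulimit -v takes KB on Linux). A DataFlowEngine started under such a
# limit terminates when a storage request is refused. The error is ignored if
# the environment already imposes a lower, unraisable limit.
ulimit -v 2097152 2>/dev/null || true
ulimit -v
```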
16. What kind of memory usage patterns are expected on Integration nodes and Integration servers during runtime processing? Is there a formula to calculate/predict memory usage of a message flow?
It is hard to predict memory usage patterns during runtime processing, as a number of variables are involved, such as:
Overall message flow design - nodes, ESQL, Java, Parsers etc. used in the message flow
Size of the messages getting processed
Number of messages getting processed
Other message flows deployed to Integration Servers
Any internal or external monitoring/tracing/statistics etc running
If there are multiple message flows running within an Integration Server, then the patterns of memory allocations/frees cannot be predicted. There is no formula for the memory usage of a message flow or Integration Server (DataFlowEngine).
However, there are some practical recommendations that may be followed as good operating practices to optimize the memory usage and overall performance of your message flows. Please see the links below:
Good operating practice: Configuration recommendations to reduce memory usage
Techniques to Minimize Memory Usage with IIB
IBM Integration Bus Configuration recommendations to reduce memory usage
17. When should I consider splitting the message flows across multiple EGs and what are its benefits?
It is advisable to evaluate and size your Integration Servers appropriately with regard to the number of message flows deployed to them. There are numerous combinations; for example, you can allocate all of the message flows to one or a few Integration Servers, have an Integration Server for each message flow, or split message flows by project across different Integration Servers.
Before considering what the best approach for your environment may be, it is important to understand some of the characteristics of Integration Servers and additional instances.
A key factor in deciding how many Integration Servers to allocate is going to be how much free memory you have available on the IIB machine.
A common allocation pattern is to have one or possibly two Integration Servers per project and to then assign all of the message flows for that project to those Integration Servers.
By using two Integration Servers you build in some additional availability should one Integration Server fail. Where you have message flows which require large amounts of memory in order to execute, perhaps because the input messages are very large or have many elements to process, you are recommended to keep all instances of those message flows in a small number of Integration Servers rather than spread them over a large number of Integration Servers. Otherwise, every Integration Server could require a large amount of memory as soon as one of the large messages is processed by a flow in that Integration Server.
Please refer to the IIB Good Practices and Good operating practice strategy.
18. Is the DataFlowEngine process size expected to grow if the same message is processed through the same message flow and undergoes the exact same processing again and again?
While this is the recommended scenario to test for establishing benchmarks, it is expected that the DataFlowEngine process may show some size increase in such a scenario. The reason is that there is no way to guarantee that the same pattern of allocations will occur in exactly the same locations as it did the first time, especially when you consider that the DataFlowEngine is a multi-threaded process in which multiple threads allocate and free storage at the same time. This means that when the message flow allocates the storage it requires, it is highly unlikely to get exactly the same allocations and sizes it did during the first run. However, the memory usage is expected to plateau after a few runs.
The DataFlowEngine is expected to reach a plateau when its heap is of a sufficient size to satisfy all allocation requests from any pattern/size of allocation from its threads.
The maximum plateau value would be at the point where all message flows are processing at the same time and are processing their largest message and executing the longest path within the message flow.
19. What are the key factors and metrics that will determine the amount of CPU, memory and disk space used/required for IIB9?
The blog on Techniques to minimize memory usage with IIB describes some of the key factors to consider to determine the memory usage in your environment.
The IIB V9 Performance reports document some of the metrics around CPU and Performance as well.
20. How do I know if my Integration Node/Server is running out of memory resources?
There are two types of memory used by IBM Integration Bus: native heap memory and JVM heap memory. Hence, if the Integration Node/Server is running out of memory resources, it could be due either to native memory exhaustion, where the system runs out of resources, or to JVM exhaustion, where the IIB processes run out of JVM heap.
When the system runs out of resources, the system on which IIB is running could slow down or crash. From an IIB perspective, you would typically see abend files in the <WMB workpath>/common/errors directory and possibly a core dump. The system log should indicate that the system ran out of memory.
When an IIB process runs out of JVM heap, it is expected to generate an abend file, core dump, javacore and Java heap dumps. The stderr file (UNIX) or console.txt (Windows) typically logs java.lang.OutOfMemoryError exceptions.
These are the typical symptoms of an IIB process running out of memory.
Please refer to the Self guided troubleshooting section on IIB mustgather for Problems with Memory growth and IIB mustgather for problems with high CPU and Performance.
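As a quick first check for these artifacts, the work path can be scanned from a shell; /var/mqsi below is an assumed default, so adjust the path for your installation.

```shell
# Look for recent abend files and Java dumps under the broker work path.
# The default path below is an assumption; override it with MQSI_WORKPATH.
WORKPATH="${MQSI_WORKPATH:-/var/mqsi}"
ls -lt "$WORKPATH/common/errors" 2>/dev/null | head -5
find "$WORKPATH" \( -name 'javacore*' -o -name 'heapdump*' \) 2>/dev/null | head -5
echo "scan of $WORKPATH complete"
```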
21. How can I troubleshoot a problem with my Integration node processes running out of memory?
For memory growth problems, it is easier to isolate the problem if the message flows deployed to the affected Integration Server can be distributed across other Integration Servers and the problem recreated. This helps to eliminate message flows which are not exhibiting the problem. The troubleshooting technique may vary between native memory issues and Java memory issues.
Please refer to the Self guided troubleshooting section on IIB mustgather for Problems with Memory growth and IIB mustgather for problems with high CPU and Performance.
22. Are there any patterns or samples available based on which I can design my message flows optimally?
See the Message splitter pattern used to split large XML messages into smaller chunks for processing.
See the Large messaging sample demonstrating a scenario of processing of large messages with repeating structures and splitting them into smaller chunks.
23. How does deploying a message flow application with CMF (compiled message flow) enabled vs. disabled affect the DataFlowEngine memory usage?
When message flows are compiled in the BAR file, the Toolkit incurs the memory and performance cost of in-lining all of the message flow's constituent parts into a single CMF file.
However, when a .msgflow file is deployed with the CMF option unchecked, the Integration node runtime compiles the message flow into its runtime form.
Also, when a message flow or a subflow is redeployed, its definitions are updated by the Integration node runtime, and all affected flows have their runtime XML re-inlined.
While this means that the Integration node runtime administration component needs extra memory and CPU time to complete this task on each deploy, this is counteracted by the fact that the deploy messages are smaller, so deploys can be processed much faster. If care is taken to keep the message flow size small, it does not affect the runtime memory usage.
Please refer to the topic: Deploying message flows to integration servers
24. How can I limit memory usage of the DataFlowEngine (DFE)? Is there an operating system or kernel parameter setting?
There is no functionality within IIB to set a maximum memory limit on an Integration Server. Operating systems may offer such functionality (for example ulimits) and you should look into the ways of establishing such limits on your particular Operating System.
Note that if a DataFlowEngine process limit is set lower than the amount of memory the process needs, the process terminates when its request for storage is refused.
For IIB itself, we do not suggest any kernel settings for tuning the operating system. However, WebSphere MQ, DB2 and similar products do make such recommendations, and those recommendations must be followed when IIB runs in the same environment as these applications.
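As a hedged illustration of such an OS-level limit, the following sketch uses the POSIX `ulimit` built-in. The 4 GB figure is purely an assumption for the example, not a recommendation; size any real limit from your own flow and message profile:

```shell
# Sketch only: the 4 GB data-segment cap below is an illustrative
# assumption, not an IBM recommendation.

# Inspect the limits currently inherited by processes started in this shell.
ulimit -a

# Lower the soft data-segment limit (value is in KB) for processes started
# from this shell, e.g. before starting the Integration node. A DataFlowEngine
# that tries to allocate beyond this limit terminates when the storage
# request is refused, as described above.
ulimit -S -d 4194304
```

Because `ulimit` applies per shell, the limit must be set in the environment from which the Integration node is started for it to affect the DataFlowEngine processes.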
25. What is the effect of additional instances on memory usage?
The additional instances property of a message flow determines the number of threads that the flow can use at run time to process messages. Each message flow instance occupies its own storage for all run-time variables, bitstreams, message trees, database connections and interactions, and so on. When additional instances are used, the memory allocated to the message flow threads is likely to take more time to reach its plateau value. The plateau value is the point at which the memory heap has enough storage to satisfy all requests from its threads. Once this value is reached, the memory usage of the DataFlowEngine process stabilizes.