Nov 6th, 2011, 10:01 PM
Memory usage: Message Payload - File Vs. Byte
My application reads large files from which Spring Integration (SI) Messages are constructed. Each Message is then transformed to a byte array and put through various JMS queues for processing. I've been running into OutOfMemory errors of late, and I suspect the cause is the large payloads (byte arrays) being held in memory. I want to find out whether File payloads would be more memory efficient than byte array payloads.
What do you think is the most efficient way to process such large files? What payload would be ideal for running the messages through the JMS queues? The most common tasks in my application are to XSL-transform or StAX-parse the file. I'd really appreciate any input. Thanks.
Nov 7th, 2011, 01:15 AM
I would use File. The File is only a pointer to the file on disk, whereas the byte array is the full content of the file (which could thus range from 1 byte to several mega- or gigabytes)... I would read the file when needed, in a memory-efficient way.
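As a rough illustration of reading the file "when needed in a memory efficient way", here is a minimal sketch using the JDK's StAX API. The class name StreamingFileReader and the method countElements are made up for this example; the point is only that the parser pulls events from an InputStream, so the file is never held in memory as one byte array.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of Spring Integration.
public class StreamingFileReader {

    /** Counts start tags with the given local name, reading the file incrementally. */
    public static long countElements(Path file, String elementName)
            throws IOException, XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // The stream is opened only when the content is actually needed.
        try (InputStream in = Files.newInputStream(file)) {
            XMLStreamReader reader = factory.createXMLStreamReader(in);
            try {
                long count = 0;
                // Pull one event at a time; only the parser's buffer is in memory.
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && elementName.equals(reader.getLocalName())) {
                        count++;
                    }
                }
                return count;
            } finally {
                reader.close();
            }
        }
    }
}
```

With a File payload, a downstream component would call something like StreamingFileReader.countElements(file.toPath(), "order") only at the point where it needs the content.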
Nov 7th, 2011, 08:21 PM
Thanks Marten, I really appreciate your response.
Nov 8th, 2011, 05:02 AM
Using java.io.File is definitely a better choice than sending the file content as the payload, but only if the message moves between components on the same system. Since you are using JMS, which possibly involves remote systems too, using a File object is not a very good idea.
Consider using Claim Check and pass a more generic structure as the key in the message.
Nov 9th, 2011, 03:30 AM
Thanks for the response. Yes, I understand that a File payload only makes sense when the file lives on the same system. I'm considering a centralized repository (a disk location) to which the files will be written. I want to be able to stream a file's contents and never load the entire file into memory at any point. Please correct me if I'm wrong: if I just use Spring's Claim Checks, I would not be able to stream the file's contents.
Nov 9th, 2011, 10:35 AM
There are two issues we are talking about.
1. Moving the file to the central repository (which we want to do by not loading the complete file in memory)
2. Sending messages between systems without sending the actual file as the payload but some kind of reference to the file.
Claim check is the answer to the second issue: the key is some UUID that uniquely identifies a file on that centralized resource, so the contents can be looked up when you actually need to access them.
I feel the claim-check-in and claim-check-out elements cannot be used here, because they store the message in the MessageStore and pass the message id (which is a UUID) as the payload of the message; the original message is later retrieved from the MessageStore using this UUID. So at some point you will have to create a Message with the complete contents as the payload, which is what we don't want.
So if I were in your place, I would use the claim check pattern, but implement it using other core components and adapters in Spring Integration.
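To make the hand-rolled claim check idea concrete, here is a minimal sketch in plain Java. FileClaimStore, checkIn and checkOut are hypothetical names, not Spring Integration APIs, and the sketch assumes the central repository is a directory that every participating system can reach (for example a shared mount).

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;

// Hypothetical file-backed claim store; the repository is assumed to be a shared directory.
public class FileClaimStore {

    private final Path repository;

    public FileClaimStore(Path repository) throws IOException {
        // Create the repository directory if it does not exist yet.
        this.repository = Files.createDirectories(repository);
    }

    /** Copies the file into the repository and returns the claim key. */
    public UUID checkIn(Path source) throws IOException {
        UUID claim = UUID.randomUUID();
        // File-to-file copy: the application never holds the content as a byte[].
        Files.copy(source, repository.resolve(claim.toString()));
        return claim;
    }

    /** Opens the stored file for streaming; the caller must close the stream. */
    public InputStream checkOut(UUID claim) throws IOException {
        return Files.newInputStream(repository.resolve(claim.toString()));
    }
}
```

The message sent over JMS would then carry only claim.toString() as its payload, and a consumer would call checkOut and stream the content at the point where it is needed.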
To address the first problem, we can use the <ftp:outbound-channel-adapter> to transfer the file to that centralized location (assuming your centralized location is a filesystem). This adapter transfers the file to the remote location over FTP, so the complete file will not be loaded into the memory of the sending process.
See here for more details on FTP adapters
Let me know what you think; that should make things clearer and help us come up with a solution.
Nov 14th, 2011, 01:42 AM
Thanks for the detailed response, I really appreciate it. Sorry to respond late, I've been out.
Yes, you're right, these are two different problems. I believe using File payloads and streaming the file's contents when necessary will solve both of them. This way, I wouldn't even need to use Spring's Claim Checks. In a way, I'm still using the Claim Check pattern, as I'm writing the file to the file system and just keeping a File pointer as the payload. I'd just not be using the claim check tags :-)