Mar 23rd, 2009, 12:27 PM
Two tables to XML
Dear spring users,
I am currently writing a program using spring batch (2.0) that take some data from a (Sybase) database an put them into an XML file.
The input data are in two tables. The first one (let's call it Teams) contain a list of team name and ids. The second one (let's call it Players) have 3 fields : an id, a foreign key linking the 'Teams' table, and a player name.
What I already did is :
- to retrieve data from the first table (Teams), with a JdbcPagingItemReader;
- to put the data in a Java object using a rowMapper;
- to create a fake collection of players in the rowMapper, this way my Teams object are fully filled;
- to put all the Teams data and fake players in an XML file, which have at the end of the execution the good structure.
What I would like to do, as you can imagine is to fetch the Players data and put them into the Team object collection. I am not working with any ORM, and it wouldn't be an option to go in this way. Furthermore, I really looking for a solution which would stream data from the database to the XML (the amount of input data is very large) as it is the Spring Batch spirit. I hope not having to query database for each 'Teams' row...
Although this problem seems very simple, I wasn't able to solve it. Thank you in advance for any help.
Mar 23rd, 2009, 04:20 PM
If I were implementing it, I would use the Team to drive it, and return it from the ItemReader. I would then implement an ItemProcessor that depended upon a dao that can pull back Players for a given team, which I would use to populate it. I would then pass this to my xml writer. Seems pretty straightforward to me, unless I'm missing something?
Mar 24th, 2009, 04:02 AM
Thank you very much for your reply. I have thought about your solution but unless I'm missing something, that's not exactly what I'm looking for. Here is why.
What you suggest is :
- first, to retrieve a page of data from the Teams table (let say 1000 Teams), through the PagingItemReader ;
- then, to feed the Team Java objects with their respective player, with an external DAO called by an ItemProcessor.
But (correct me if I'm wrong) it seems that in the second step I will query the database for each team. Thus, the final count of queries will be 1/page of Teams + 1/Team. What I'm looking for is doing only 2 query by page of Teams :
- one to retrieve a page of Teams ;
- a second one to retrieve the Players associated to the Teams just fetched.
Is it a way to do it like this?
I hope I'm clear in my explanations....
Thank you in advance for your help.
Mar 24th, 2009, 08:52 AM
I understand what you're saying. Essentially you're talking about a performance optimization. I hate to be cliche but I'll have to quote Knuth here: "Premature optimization is the root of all evil". I understand that there are extra database hits here, but depending upon your performance requirements, is this really an issue? Have you performance tested it? As long as prepared statements are used correctly (and they should be through the jdbc template) the access plan shouldn't have to be regenerated, leaving only the roundtrip database time, which depending up on where the job is running in relation to the database may or may not be an issue. Furthmore, you should be able to return all the players with one database hit, so essentially one hit per team. (Plus the paging hits which depend upon page size)
Once you have performance tested it, and assuming the performance was an issue, your options are likely somewhat limited, due to the 1 to many relationship of players to teams. If the numbers of players is relatively small, you could cache it (or something along those lines) Either way, you're not really limited by Spring Batch. However, without seeing all the details, including performance requirements and databases sizes and layout, it's hard to tell. Still, it's so quick to create a simple DAO, you really should performance test it first. If it doesn't work, my gut feeling is that you'll need to partition with the same approach anyway, which should be trivial given the database driven nature of it. You could then toss a process at the end to cat the files and put in some document begin/end tags.
Mar 25th, 2009, 04:57 AM
lucas and Y, thanks for your contributions. I had difficulties with data transfer to XML and needed nearly a week to get it done.