Hi, I am wondering whether, and how, I can speed up my db inserts. At this pace I am looking at 30 days to complete!
I was running this via Spring Batch but removed that for this test case.
It splits up a million or so XML records and inserts them into the graph. One record has 15 sub-element types with an average of 30 sub-elements. I then link to another two nodes for year, and each record gets added to another node that represents the source. Some sources (150-odd) contain a few records, some tens of thousands.
I have allocated a large amount of memory (2GB) to the standalone neo4j database.
The code is as follows, with timings shown.
I tried various indexing combinations but couldn't speed this up. The save is the main time-consuming part, although finding the source and adding the record could also do with speeding up.
Code:
// parseXML: ~1ms
neoRecord = new Record(...); // ~1ms
y = yearRepository.findAllByPropertyValue("year", 2012).singleOrNull(); // ~30ms
if (y == null) {
    y = new Year(2012);
    yearRepository.save(y); // not sure how long this takes; it rarely happens, so it doesn't matter
}
neoRecord.setYear(y); // ~1ms
Source s = sourceRepository.findAllByPropertyValue("sourceId", foo).single(); // ~150ms
s.addRecord(neoRecord); // ~700-750ms
// here I create the (up to) 15 sub-elements and add them to the record:
// ~1ms, surprisingly fast, including some more XML bits
ElementType[] ets = xml.getETArray();
Set<SubElement> subElements = new HashSet<SubElement>();
for (ElementType et : ets) {
    SubElement c = new SubElement();
    c.setFoo(fooString);
    c.setRecord(neoRecord);
    subElements.add(c);
}
neoRecord.setSubElements(subElements);
// (repeated 15 times)
recordRepository.save(neoRecord); // ~3000-4000ms
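One idea for the two lookups, as a minimal sketch only: with around 150 sources and very few distinct years, the result of each find could be kept in an in-memory map so the repository is only hit once per key. `LookupCache` is a hypothetical helper (not part of Spring Data Neo4j), and the loader function stands in for a call such as the `sourceRepository` lookup above. Whether a cached entity can safely be reused across saves and transactions is an assumption that would need checking.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical helper: remembers the result of a slow lookup per key. */
final class LookupCache<K, V> {
    private final Map<K, V> cache = new HashMap<>();
    private final Function<K, V> loader; // stands in for the repository call
    private int loads = 0;               // how often the slow lookup actually ran

    LookupCache(Function<K, V> loader) {
        this.loader = loader;
    }

    /** Returns the cached value, running the slow lookup only on a miss. */
    V get(K key) {
        V value = cache.get(key);
        if (value == null) {
            value = loader.apply(key);
            loads++;
            if (value != null) {
                cache.put(key, value); // misses that return null are not cached
            }
        }
        return value;
    }

    int loadCount() {
        return loads;
    }
}
```

In the import loop this would be created once, outside the per-record code, with the loader wrapping the existing find call, and `get(foo)` would replace the direct lookup.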
I would appreciate any tips or suggestions. A few ms here and there will make a big difference.
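To put rough numbers on that, here is a small sketch of the arithmetic. It assumes a round 1,000,000 records processed one after another; `InsertSavings` and `hoursSaved` are names made up for illustration.

```java
/** Hypothetical helper: total time saved when each record gets a bit faster. */
final class InsertSavings {

    /** Hours saved over the whole import for a given per-record saving in ms. */
    static double hoursSaved(long recordCount, double savedMsPerRecord) {
        double totalMs = recordCount * savedMsPerRecord;
        return totalMs / 1000.0 / 3600.0; // ms -> seconds -> hours
    }

    public static void main(String[] args) {
        long records = 1_000_000L; // assumed round figure for "a million or so"
        System.out.printf("1 ms per record saved:   %.2f hours%n", hoursSaved(records, 1));
        System.out.printf("150 ms per record saved: %.2f hours%n", hoursSaved(records, 150));
    }
}
```

Under those assumptions, 1 ms per record is worth about 17 minutes overall, and removing a 150 ms lookup is worth a little under 42 hours.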
Regards,
Mark
Code:
@NodeEntity
public class Year {
    @Indexed(unique=true)
    String year;
    ...
}

@NodeEntity
public class Record {
    @Indexed
    private Long id;
}

@NodeEntity
public class Source {
    @Indexed
    @RelatedTo(direction=Direction.OUTGOING, type=RelationshipTypes.CONTAINS)
    Set<Record> records = new HashSet<Record>(16, 1f);
}

