Feb 21st, 2012, 09:06 AM
Saving complex property hierarchies as nodes with simple relations in Neo4j
Just getting my head around Neo4j and the Spring extension, using the Neo4j 1.6 milestone from Eclipse.
I have POJOs that I want to store in Neo4j that don't only contain simple properties. Neo4j stores objects (node/relation) as a flat list of simple properties.
I thought about using the @RelatedTo annotation on all complex properties (which are other classes). A POJO made up of a hierarchy of complex properties, which can themselves contain complex properties, would then be stored as a hierarchy of simple relationships in Neo4j.
For example, Class Dog has properties
@GraphId Long id
@Fetch @RelatedTo(type = "OWNS") Owner owner
@Fetch @RelatedTo(type = "OWNS") Set<Bone> bones
Owner has properties
@GraphId Long id
@Fetch @RelatedTo(type = "OWNS") Address address
Using @Fetch automatically pulls the related node (the default direction is outgoing) and does this recursively down the complex property tree.
Seems like an easy way to store complex POJOs (which I will be generating from XML schemas with XJC/JAXB). The downsides are deletion and the added complexity of searching down the property tree. I am also wondering about performance problems.
Basically I am forcing Neo4j to be more like my POJOs.
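To make the idea above concrete, here is a minimal plain-Java sketch of a POJO hierarchy stored as nodes with flat properties and typed outgoing relationships, plus a recursive eager fetch. It is only an illustration: `GraphSketch`, `Node`, `relateTo`, `fetchEagerly` and `sampleDog` are hypothetical names, and nothing here is the Neo4j or SDN API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory stand-in for a graph; not the Neo4j or SDN API.
class GraphSketch {

    // A node holds only simple properties plus typed outgoing relationships.
    static final class Node {
        final String label;
        final Map<String, Object> properties = new LinkedHashMap<>();
        final Map<String, List<Node>> outgoing = new LinkedHashMap<>();

        Node(String label) {
            this.label = label;
        }

        // Adds an outgoing relationship of the given type and returns this node.
        Node relateTo(String type, Node target) {
            outgoing.computeIfAbsent(type, t -> new ArrayList<>()).add(target);
            return this;
        }
    }

    // Follows outgoing relationships recursively, as an eager fetch would.
    static Set<Node> fetchEagerly(Node start) {
        Set<Node> visited = new LinkedHashSet<>();
        collect(start, visited);
        return visited;
    }

    private static void collect(Node node, Set<Node> visited) {
        if (!visited.add(node)) {
            return; // already seen: guards against cycles
        }
        for (List<Node> targets : node.outgoing.values()) {
            for (Node target : targets) {
                collect(target, visited);
            }
        }
    }

    // Dog -OWNS-> Owner -OWNS-> Address, and Dog -OWNS-> Bone.
    static Node sampleDog() {
        Node address = new Node("Address");
        address.properties.put("city", "Springfield"); // made-up sample value
        Node owner = new Node("Owner").relateTo("OWNS", address);
        Node bone = new Node("Bone");
        return new Node("Dog").relateTo("OWNS", owner).relateTo("OWNS", bone);
    }

    public static void main(String[] args) {
        for (Node node : fetchEagerly(sampleDog())) {
            System.out.println(node.label);
        }
    }
}
```

The sketch also shows where the performance worry comes from: an eager fetch touches every node reachable from the start node, so a deep property tree means a large load on every read.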
Feb 21st, 2012, 06:00 PM
You aren't forcing Neo4j to be like POJOs. You just have nodes for Dog, Owner, Bone, and Address. Each has its own properties, and each can have relationships to the others via different relationship types.
Sometimes you might have the Relationship itself have properties.
Say the Dog has a relationship to an Address node, and you store the owner information from the Owner object inside that relationship. Owner is then no longer a node but a relationship, and the properties of Owner become properties of the Dog-Address relationship. You can do that if you want to. Sometimes it makes sense, sometimes it doesn't.
With Spring Data Neo4j you could annotate your Owner class with @RelationshipEntity and use the @RelatedToVia annotation on the Owner property in Dog.
As far as performance goes, Neo4j traverses between nodes very quickly, and the cost of a traversal depends on the part of the graph you actually visit rather than on the total number of nodes. Inserts and deletes are not much of a worry either. It is just creating and removing nodes and relationships.
Feb 21st, 2012, 06:02 PM
And there isn't an issue with POJOs helping define your nodes. Nodes are more OO-like to me than a SQL relational database in third normal form.
Feb 22nd, 2012, 02:42 AM
I'm not sure if the generic "OWNS" relationship-type really represents your domain correctly.
Perhaps a more concrete relationship-type (like LIVES_AT, etc.) would help you model your domain better?
What is the percentage of create, update, and query operations in your use-cases?
Also Neo4j only stores the properties and relationships that are needed.
Performance-wise, SDN is mostly focused on interacting with the graph in a simple and clean manner.
If you need high-performance queries (where Neo4j shines), those will be done with graph traversals, Cypher or Gremlin, and only the relevant results will be converted back to SDN entities.
Feb 22nd, 2012, 03:56 AM
Agree with Mark that Neo4j is more OO than what I am doing currently, which is saving the meat-n-potatoes of my JAXB POJOs into XML columns in DB2.
When I generate with XJC (JAXB), the result is a set of top-level CIs (configuration items) that extend an abstract root Element as concrete classes like Software, Publication, or Environment. Each of these CIs (POJOs) gets an XmlRootElement annotation, earmarking it as a root class that I allow to be rolled out to the client via RestEasy. The CIs will have a @GraphId acting as a unique identifier. Whether or not a CI instance is logically unique (this is determined by the service layer) depends on a composition of CI properties (complex or simple).
Every CI can have a hierarchy of complex properties that either reference other CIs (by type) or do not. This is where the idea of the CI POJO owning its data breaks down for me. A complex property that does not reference another CI POJO will not have a @GraphId, since I don't care whether it is unique, and I want the property to be discarded automatically if the "owning" CI POJO deletes the property or is itself deleted. Complex properties that reference other CI POJOs seem like a natural fit for Neo4j. The only thing I want to watch is that I relate to persisted nodes only (otherwise a save on the "owning" CI could persist a related node that might not be logically unique).
The relationship types don't have to be anything special (a generic "owns" is fine). I will be creating Relation CI POJOs as well and allowing for @RelatedToVia so I can "dynamically" connect CI elements to one another via concrete Relation CIs. This is the basis for my CMDB implementation. These @RelatedToVia relationships will never be fetched eagerly, while complex properties will always be fetched eagerly (with the risk of pulling too much, but I have control right now over the generation of my CIs).
The problem is (because I haven't tested enough) that I don't know how Spring Neo4j will handle relations. Say I create a Software CI with a complex property that doesn't reference another CI, and that complex property is a Set. If I add to, update, or delete from that Set while detached (on the client), does Spring Neo4j reflect these adds, updates, and deletes back to the graph automatically? I assume not (Hibernate doesn't work this way when detached). That means I have to do a total compare of the detached object and the persisted one (on update).
Stuff like that I am unsure about.
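For the "total compare" of a detached Set against the persisted one, a minimal plain-Java sketch could look like the following. `SetDelta` and `between` are hypothetical names introduced for illustration, and this is not how SDN itself detects changes; it only shows which relationships a manual merge would have to create and delete.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper; SDN's own delta detection is not shown here.
class SetDelta<T> {
    final Set<T> added;   // in the detached copy but not persisted yet
    final Set<T> removed; // persisted but no longer in the detached copy

    private SetDelta(Set<T> added, Set<T> removed) {
        this.added = added;
        this.removed = removed;
    }

    // Compares the persisted members with the members of the detached copy.
    static <T> SetDelta<T> between(Set<T> persisted, Set<T> detached) {
        Set<T> added = new HashSet<>(detached);
        added.removeAll(persisted);
        Set<T> removed = new HashSet<>(persisted);
        removed.removeAll(detached);
        return new SetDelta<>(added, removed);
    }

    public static void main(String[] args) {
        Set<String> persisted = new HashSet<>(Arrays.asList("a", "b"));
        Set<String> detached = new HashSet<>(Arrays.asList("b", "c"));
        SetDelta<String> delta = between(persisted, detached);
        System.out.println("create relationships to: " + delta.added);
        System.out.println("delete relationships to: " + delta.removed);
    }
}
```

The comparison relies on the elements' equals and hashCode, so it only finds added and removed members. A member whose own properties changed while detached would need a separate property-by-property compare.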
Feb 22nd, 2012, 07:33 AM
Please try to speak of relationships, otherwise it is confusing. I also assume that your "complex properties" are actually modeled as relationships?
Btw. you might look into Willie Wheeler's Skybase blog posts, as this is also a CMDB. He's in the process of writing a book and might share the SDN for CMDB chapter with you.
Regarding your questions:
* please note that @GraphId is used for storing the internal Neo4j id (a Long value), which cannot be changed in type or value
* if you want to add your own unique properties you would want to index the properties and then check if they already exist in the index (I'm hopefully going to work on integrating the new neo4j unique-entity features this week).
* I would try to work as little with detached entities as possible as detaching (and re-attaching) them adds additional complexity
* I would try to start using the spring-data-aspects and work with entities within transactions, that is the cleanest and simplest way
* SDN tries to do the delta detection for your collections of related-node-entities and also relationship-entities. So if you modify a collection (add, delete) it will detect the delta and also add / delete relationships in the graph
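The "index the property, then check whether it already exists" advice from the list above can be sketched in plain Java as a get-or-create lookup. This is only an illustration under stated assumptions: `UniqueIndexSketch` and `getOrCreate` are hypothetical names, and a HashMap stands in for a real Neo4j index.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for an index on a unique business key; not the Neo4j index API.
class UniqueIndexSketch<K, V> {
    private final Map<K, V> index = new HashMap<>();

    // Returns the entity already indexed under the key, or creates and indexes a new one.
    V getOrCreate(K key, Function<K, V> factory) {
        V existing = index.get(key);
        if (existing != null) {
            return existing;
        }
        V created = factory.apply(key);
        index.put(key, created);
        return created;
    }

    int size() {
        return index.size();
    }

    public static void main(String[] args) {
        UniqueIndexSketch<String, Object> cis = new UniqueIndexSketch<>();
        // The key format is made up for this example.
        Object first = cis.getOrCreate("software:neo4j:1.6", key -> new Object());
        Object second = cis.getOrCreate("software:neo4j:1.6", key -> new Object());
        System.out.println(first == second); // prints "true"
    }
}
```

The sketch is not thread-safe. Against a real graph the check and the create would have to happen inside one transaction, otherwise two concurrent saves could both miss the lookup and create duplicates.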
I think before worrying too much I would start with a spike that covers the simple cases and, if those work w/o issues, iterate to the more complex ones (probably test-driven). When you encounter any blockers you can get back to us.
Feb 22nd, 2012, 08:40 AM
:-) Exactly. Will sit down and run through a bunch of test cases and return with concrete questions. BTW, I saw Skybase while hunting around for info. Wheeler has a cool concept that I ran into yesterday. Will check out his site before reinventing the wheel.
Will post back when I have played around more with Neo4j. But thanks for the Spring wrapper. It makes diving into NoSQL a lot easier!!!