-
Feature Request
-
Resolution: Done
-
Blocker
-
2.3.0.Final
-
None
-
Medium
The JPA connector deletes unused (large value) records, and currently this cleanup is called whenever a node is removed. This can be costly, and it is unnecessary to call it that frequently. After all, it is perfectly okay for unused records to remain in the database for a moderate amount of time. This is especially true of the LargeValueEntity records in the simple model, since a LargeValueEntity is used to store a single property value and is keyed by the SHA-1 hash of the value. It is fine if the value happens not to be used at the moment, and future updates may actually reuse that same property value, at which point the LargeValueEntity becomes used again.
There is another issue with the current approach. The deletion of unused LargeValueEntity records may be expensive (especially in MySQL, where we've had to use a hack-like workaround; see MODE-691), and will almost certainly block all other activities (depending, of course, upon the transaction isolation level). In fact, deleting a large number of LargeValueEntity records may take some time and could cause timeout issues (this may be the cause of MODE-1066, though this is still under investigation). Therefore, we shouldn't be doing these deletes every time a node is deleted.
It would be far better to reclaim the "garbage" periodically, asynchronously, and at a fairly long interval. However, the JPA source shouldn't have to implement all of this logic. The ModeShapeEngine should provide a mechanism that automatically calls garbage collection on the sources that need it, so that the connectors only need to implement the garbage collection logic itself (and not the logic of doing it periodically or asynchronously).
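As a rough illustration of the proposed split of responsibilities, here is a minimal sketch of what such a mechanism could look like. It is not ModeShape's actual API: the `GarbageCollectable` interface and the `GarbageCollectionScheduler` class are hypothetical names made up for this example, and the interval is left to the caller. The engine-side scheduler owns the timing and threading, while a source only supplies the cleanup logic.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical interface: implemented only by sources that have garbage to reclaim
// (for example, a JPA source deleting unused large-value records).
interface GarbageCollectable {
    void collectGarbage() throws Exception;
}

// Hypothetical engine-side component: runs garbage collection on all registered
// sources on a background thread, at a fixed delay between runs.
final class GarbageCollectionScheduler implements AutoCloseable {
    private final List<GarbageCollectable> sources = new CopyOnWriteArrayList<>();
    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor(runnable -> {
                Thread thread = new Thread(runnable, "source-gc");
                thread.setDaemon(true); // never keeps the JVM alive on its own
                return thread;
            });

    GarbageCollectionScheduler(long interval, TimeUnit unit) {
        // Fixed delay (not fixed rate), so a slow collection never overlaps the next one.
        executor.scheduleWithFixedDelay(this::collectAll, interval, interval, unit);
    }

    void register(GarbageCollectable source) {
        sources.add(source);
    }

    // One collection pass. A failure in one source is reported and does not
    // prevent the remaining sources from being collected.
    void collectAll() {
        for (GarbageCollectable source : sources) {
            try {
                source.collectGarbage();
            } catch (Exception e) {
                System.err.println("Garbage collection failed: " + e);
            }
        }
    }

    @Override
    public void close() {
        executor.shutdownNow();
    }
}
```

With something along these lines, removing a node would no longer trigger any deletes. The connector would register itself once, and the unused records would be reclaimed in the background whenever the scheduler fires.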