Enhancement
Resolution: Unresolved
Major
None
Given that listing all S3 objects in a bucket and checking each one for an "unused" header is not very efficient, I propose using another mechanism to regularly clean up unused objects.
It would rely on two features provided by S3:
- lifecycle management: http://docs.aws.amazon.com/AmazonS3/latest/dev/object-lifecycle-mgmt.html
- object tagging: http://docs.aws.amazon.com/AmazonS3/latest/dev/object-tagging.html
Summary: automated, regular removal of S3 objects based on the value of a ModeShape-specific tag.
In practice, a lifecycle rule such as the following would be set up for (or by) ModeShape:
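<LifecycleConfiguration>
  <Rule>
    <ID>ModeShape Garbage Collection</ID>
    <Status>Enabled</Status>
    <Filter>
      <Tag>
        <Key>unused</Key>
        <Value>true</Value>
      </Tag>
    </Filter>
    <!-- A rule needs an action to have any effect; the 1-day value is an assumption. -->
    <Expiration>
      <Days>1</Days>
    </Expiration>
  </Rule>
</LifecycleConfiguration>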
The main advantage would be to delegate the clean-up process to S3, freeing ModeShape from iterating over a (possibly) humongous list of objects.
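On the other hand, the interval at which this clean-up takes place is controlled by S3 rather than by ModeShape: AFAIK lifecycle rules are evaluated about once every 24h, so tagged objects would not be removed immediately.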
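For the rule to match anything, ModeShape would still have to tag each object once it becomes unused. As a sketch, assuming the tag is applied through the S3 PutObjectTagging operation and that the key/value pair mirrors the filter in the rule above, the request body would look roughly like this:

```xml
<!-- Assumed tag set; key and value must match the lifecycle rule's filter. -->
<Tagging>
  <TagSet>
    <Tag>
      <Key>unused</Key>
      <Value>true</Value>
    </Tag>
  </TagSet>
</Tagging>
```

Note that PutObjectTagging replaces the object's whole tag set, so any other tags on the object would have to be included in the same request.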