Since listing all S3 objects in a bucket and checking each one for an "unused" header is not very efficient, I propose using another mechanism to regularly clean up unused objects.
It would rely on 2 features provided by S3:
- lifecycle management: http://docs.aws.amazon.com/AmazonS3/latest/dev/object-lifecycle-mgmt.html
- object tagging: http://docs.aws.amazon.com/AmazonS3/latest/dev/object-tagging.html
Summary: automated, regular removal of S3 objects based on the value of a ModeShape-specific tag.
In practice, such a lifecycle rule would be set up on the bucket either by ModeShape itself or on its behalf.
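As a rough sketch of what the tag and the rule could look like, the snippet below builds both payloads as plain dictionaries in the shape accepted by the S3 PutObjectTagging and PutBucketLifecycleConfiguration operations. The tag key/value (`modeshape.unused` / `true`), the rule ID and the one-day expiration are assumptions made up for illustration; they are not existing ModeShape names or settings.

```python
import json

# Hypothetical tag name and value; ModeShape would pick its own.
UNUSED_TAG_KEY = "modeshape.unused"
UNUSED_TAG_VALUE = "true"


def unused_tagging():
    """Tag set marking an object as unused (PutObjectTagging payload shape)."""
    return {"TagSet": [{"Key": UNUSED_TAG_KEY, "Value": UNUSED_TAG_VALUE}]}


def cleanup_lifecycle_configuration(days=1):
    """Lifecycle configuration expiring every object that carries the unused tag.

    Note: "Days" is counted from object creation, not from the moment the
    tag is applied, so objects older than `days` become eligible for removal
    as soon as they are tagged.
    """
    return {
        "Rules": [
            {
                "ID": "modeshape-unused-cleanup",  # hypothetical rule name
                "Filter": {
                    "Tag": {"Key": UNUSED_TAG_KEY, "Value": UNUSED_TAG_VALUE}
                },
                "Status": "Enabled",
                "Expiration": {"Days": days},
            }
        ]
    }


if __name__ == "__main__":
    print(json.dumps(cleanup_lifecycle_configuration(), indent=2))
```

With boto3, for example, these dictionaries could be passed as `Tagging=` to `put_object_tagging` and as `LifecycleConfiguration=` to `put_bucket_lifecycle_configuration`. The same structure maps onto the equivalent request objects in the AWS SDK for Java.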
The main advantage would be to delegate the clean-up process to S3, sparing ModeShape from iterating over a (possibly) huge list of objects.
On the other hand, the interval at which this clean-up takes place is controlled by S3, not by ModeShape: in practice, every 24h AFAIK.