Which naming scheme would give optimal performance on S3?

An application stores hourly log files from thousands of instances of a high-traffic
website. Which naming scheme would give optimal performance on S3?

A.
Sequential

B.
HH-DD-MM-YYYY-log_instanceID

C.
YYYY-MM-DD-HH-log_instanceID

D.
instanceID_log-HH-DD-MM-YYYY

E.
instanceID_log-YYYY-MM-DD-HH

19 Comments on “Which naming scheme would give optimal performance on S3?”

  1. venkat sai says:

    Yes, B is the right option. The main reason is that its prefix is the most random, so performance would be higher in this case.

    A – Doesn’t make sense.
    C – The key starts with YYYY, which would be the same for every key, so good performance would be difficult to achieve.
    D & E – The first two characters of the instance ID (i-) would be the same for every key.

  2. BDA says:

    D. The random hostname prevents hammering a specific partition, and the HH-DD that follows the hostname is more random than the date in E.

    B will hammer a partition once per day at HH-DD

    A changes the I/O pattern, so it does not apply.

    C is just as bad as A

    E is almost as good as D, but YYYY will not be as random as D.
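
The prefix arguments in the comments above can be checked with a small sketch. It assumes hypothetical instance IDs of the form i- plus eight hex digits, an arbitrary example timestamp, and a deliberately simplified model in which keys that share their first few characters are treated as landing together. It is not a description of how S3 actually partitions keys, and `make_keys` and `distinct_prefixes` are illustrative names, not part of any AWS API.

```python
from datetime import datetime

# Hypothetical key templates for options B-E from the question.
SCHEMES = {
    "B": "{hh}-{dd}-{mm}-{yyyy}-log_{iid}",
    "C": "{yyyy}-{mm}-{dd}-{hh}-log_{iid}",
    "D": "{iid}_log-{hh}-{dd}-{mm}-{yyyy}",
    "E": "{iid}_log-{yyyy}-{mm}-{dd}-{hh}",
}


def make_keys(instance_ids, ts):
    """Return {option: [key, ...]} with one key per instance for the hour in ts."""
    parts = {
        "hh": f"{ts.hour:02d}",
        "dd": f"{ts.day:02d}",
        "mm": f"{ts.month:02d}",
        "yyyy": f"{ts.year:04d}",
    }
    return {
        name: [template.format(iid=iid, **parts) for iid in instance_ids]
        for name, template in SCHEMES.items()
    }


def distinct_prefixes(keys, length):
    """Count how many different leading substrings of the given length occur."""
    return len({key[:length] for key in keys})


# Made-up instance IDs: "i-" plus 8 hex digits (an assumption for illustration).
instance_ids = [f"i-{(n * 2654435761) % 2**32:08x}" for n in range(1000)]
keys = make_keys(instance_ids, datetime(2024, 7, 14, 9))  # arbitrary example hour

# Print one sample key per option and the number of distinct 6-character prefixes.
for name, scheme_keys in keys.items():
    print(name, scheme_keys[0], distinct_prefixes(scheme_keys, 6))
```

Under these assumptions, every B and C key written in a single hour shares the same leading characters, while D and E keys start to differ right after the common i-. That is the effect both commenters are arguing about.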

