Data Storage for Photographers – Oct. 1st Thurs Meetup

Hi everyone!

Below are notes from my talk at last week’s 1st Thursday Meetup. Thanks to Christine for hosting, and to everyone who was able to be there for your input. Mike drew the original flow diagram of what we talked about; I have updated it with some additions, although my drawing skills do not quite rise to the level of his 🙂

Digital Asset & Data Storage Talk
FTP First Thursdays – Thursday Oct. 1, 2009

Photography Digital Asset Storage

 

Notes

Areas of concern:

  1. Data Corruption
  2. Hardware Failure/Theft/Loss
  3. Disaster Recovery

How to address some of those concerns:

  1. Data redundancy (data backup, multiple copies)
  2. Hardware redundancy (Dual Cards, RAID Arrays, Drobo Systems, etc.)
  3. Multiple site storage (Onsite + Offsite)
  4. Additionally: i) Anti-virus software; ii) Firewall implementation; iii) UPS battery backup for workstations/servers/external drives

Tips for Originals (Before & During the Very First Import):

  1. If your camera supports dual cards, use the 2nd as a backup.
  2. If shooting on location, create 2 backup copies of the originals if you can and keep them separate (e.g. one with you and one back at the hotel). On longer shoots especially, you might also want to courier a copy back to the studio/office/home.
  3. Write your name, phone # and possibly e-mail address on your media cards.
  4. Always do a visual inspection after getting data off the card to make sure the images transferred properly.
  5. Ideal flow would be MEDIA CARD -> BACKUP -> [PRIMARY STORAGE + VISUAL INSPECTION]. (Note: Copying from the backup to the primary, rather than the other way around, means that a visual inspection of the primary also confirms the data in the backup.)
  6. If you decide to dedicate a set of media cards to an event, do NOT let that event be the first use of the cards. If they happen to be defective from the factory, that is not when you want to find out. More broadly, never use brand-new, never-been-used cards for an event or important shoot.
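
As a complement to the visual inspection in tips 4 and 5, the copy can also be compared against the card byte for byte using checksums. Below is a minimal Python sketch of that idea, not a finished tool: the function names and the example folder paths are hypothetical, and it assumes the card and the backup use the same folder layout.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(source_dir, copy_dir):
    """List files under source_dir that are missing or different in copy_dir."""
    source_dir, copy_dir = Path(source_dir), Path(copy_dir)
    problems = []
    for source_file in sorted(source_dir.rglob("*")):
        if not source_file.is_file():
            continue
        relative = source_file.relative_to(source_dir)
        copy_file = copy_dir / relative
        if not copy_file.is_file():
            problems.append((relative.as_posix(), "missing"))
        elif file_digest(source_file) != file_digest(copy_file):
            problems.append((relative.as_posix(), "differs"))
    return problems
```

Something like `find_mismatches("/Volumes/CARD", "/Volumes/Backup/shoot")` (hypothetical paths) would return an empty list when every file on the card has an identical copy in the backup. A matching checksum only proves the copy is identical to what was on the card, so it does not replace actually looking at the images.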

Tips for Derivative Files (After the Original Import):

  1. The derivative files that are being edited and have not yet been delivered to the client are considered your Working Files. (Note: Everything produced by working on your original files is a derivative of the originals.)
  2. Working files are critical in that a fair amount of post work has been done on them but they have not yet been delivered to the client.
  3. Because they are working files, these files see a lot more day-to-day change, and those changes need to be backed up at least daily.
  4. If the working files are on an external drive attached to the workstation, that drive needs to be treated just as if it were an internal drive and should be backed up (internal and external drives are equally vulnerable to data corruption and hardware failure).
  5. Files for shoots that have been delivered to clients can then be moved to the Archive, which does not need as fast a storage medium as the Working Files since archived files are more static in nature.
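
To illustrate tip 3, here is a rough Python sketch of a daily pass that copies only the working files that are new or have changed since the last run. The function name is hypothetical, and the "changed" test (different size or newer modification time) is a simplifying assumption. Dedicated backup software does considerably more, such as keeping older versions.

```python
import shutil
from pathlib import Path


def backup_changed_files(working_dir, backup_dir):
    """Copy new or changed files to backup_dir; return the relative names copied."""
    working_dir, backup_dir = Path(working_dir), Path(backup_dir)
    copied = []
    for source_file in sorted(working_dir.rglob("*")):
        if not source_file.is_file():
            continue
        relative = source_file.relative_to(working_dir)
        target = backup_dir / relative
        if target.exists():
            src_stat, dst_stat = source_file.stat(), target.stat()
            # Assumption: same size and not newer than the backup means unchanged.
            if (src_stat.st_size == dst_stat.st_size
                    and src_stat.st_mtime <= dst_stat.st_mtime):
                continue
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source_file, target)  # copy2 also preserves timestamps
        copied.append(relative.as_posix())
    return copied
```

Note that this sketch overwrites the previous backup copy and never deletes anything from the backup, so it protects against drive failure but not against saving over a good edit with a bad one.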

RAID Arrays:

  1. RAID 0 = Striped Array = Bad idea! (no redundancy, primary benefit is speed but single drive failure kills all the data)
  2. RAID 1 = Mirrored Array (writes the same data to multiple drives => data is intact as long as one drive is good but array size is that of 1 drive)
  3. RAID 5 = Striped Array with Distributed Parity (stripes the data across multiple drives but can survive the loss of one drive because of the parity data)
  4. RAID 6 = Striped Array with Dual Distributed Parity (similar to RAID 5 but can survive failure of 2 drives)
  5. RAID 0+1 = Mirrored Striped Array. Usually two drives are striped (as in RAID 0) and then that set is mirrored onto another set of 2 drives. Pretty fast since it does not require parity calculations, but it is an inefficient use of storage space. It will survive a drive failure in either striped set, but not failures in both sets at the same time.
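
The trade-offs above can be put into numbers. The Python sketch below is only an illustration under simple assumptions (identical drives, no controller overhead, hypothetical function names). The first function reports usable capacity and how many drive failures each level is guaranteed to survive; the second shows the XOR idea behind single parity, which is how RAID 5 can rebuild one lost drive.

```python
def raid_summary(level, drive_count, drive_size_tb):
    """Return (usable_tb, drive_failures_always_survived) for identical drives."""
    minimum_drives = {"0": 2, "1": 2, "5": 3, "6": 4, "0+1": 4}
    if level not in minimum_drives:
        raise ValueError(f"unsupported RAID level: {level}")
    if drive_count < minimum_drives[level]:
        raise ValueError(f"RAID {level} needs at least {minimum_drives[level]} drives")
    if level == "0":
        return drive_count * drive_size_tb, 0          # all space, no redundancy
    if level == "1":
        return drive_size_tb, drive_count - 1          # every drive is a full copy
    if level == "5":
        return (drive_count - 1) * drive_size_tb, 1    # one drive's worth of parity
    if level == "6":
        return (drive_count - 2) * drive_size_tb, 2    # two drives' worth of parity
    # RAID 0+1: half the drives mirror the other half; a second failure
    # in the other striped set would lose the array.
    if drive_count % 2:
        raise ValueError("RAID 0+1 needs an even number of drives")
    return (drive_count // 2) * drive_size_tb, 1


def xor_blocks(blocks):
    """XOR equal-length byte blocks together (the basis of single parity)."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for index, byte in enumerate(block):
            result[index] ^= byte
    return bytes(result)


# Parity demo: three data blocks plus one parity block.
data = [b"IMG1", b"IMG2", b"IMG3"]
parity = xor_blocks(data)
# Pretend the drive holding data[1] died; rebuild it from the survivors.
rebuilt = xor_blocks([data[0], data[2], parity])
```

For example, `raid_summary("5", 4, 2)` describes four 2 TB drives in RAID 5: 6 TB usable, surviving one drive failure. In the parity demo, `rebuilt` ends up equal to the lost block `data[1]`.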

RAID vs Drobo:

  1. Traditional RAID arrays expect all the drives in the array to be the same size (with mixed sizes, every drive is generally treated as if it were the size of the smallest one), which makes it a bit of a headache to expand an array once it starts to run out of space. To grow the array this way, ALL the drives need to be replaced with larger ones and the data migrated very carefully.
  2. One of the Drobo system's major selling points, on the other hand, is the ability to dynamically grow the array and to mix and match drives of varying sizes. However, it does this using proprietary technology that is part of the hardware, which means that a failure of the Drobo hardware or its firmware will leave you unable to read your drives until you get a replacement from Data Robotics. (This is in contrast to the open standards used in a true RAID array.)
  3. For folks who are not very technical, this situation is not much different from a hardware failure in a RAID array. It is therefore critical to note that neither a RAID array nor a Drobo system is a backup system. Both provide some measure of redundancy at the drive level, but they still present a single point of failure, and the data on them needs to be replicated/backed up elsewhere.
  4. Just having a RAID/Drobo does not mean you have a backup system!
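
To make point 1 concrete, here is a small Python sketch of what happens when drives of different sizes go into a conventional RAID 5 array. It assumes the common behaviour where every drive is treated as being the size of the smallest one; the function name is hypothetical, and Drobo's own proprietary scheme is deliberately not modelled.

```python
def conventional_raid5_usable(drive_sizes_tb):
    """Return (usable_tb, unused_tb) for a RAID 5 array built from mixed drives."""
    if len(drive_sizes_tb) < 3:
        raise ValueError("RAID 5 needs at least 3 drives")
    smallest = min(drive_sizes_tb)
    # One drive's worth of space goes to parity; the rest is usable.
    usable = (len(drive_sizes_tb) - 1) * smallest
    # Space beyond the smallest drive's size on each larger drive is left idle.
    unused = sum(drive_sizes_tb) - len(drive_sizes_tb) * smallest
    return usable, unused
```

With two 1 TB and two 2 TB drives, `conventional_raid5_usable([1, 1, 2, 2])` gives 3 TB usable and 2 TB sitting idle, which is why growing such an array usually means replacing every drive.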

Other Tips:

  1. You need an offsite copy of your backups in order to recover from a catastrophic loss of data such as a fire, natural disaster or theft. Insurance will usually replace your hardware, but not your images/data, and that may be when you need them the most: to keep the money coming in and your business going.
  2. Use non-destructive image editing software such as Lightroom or Aperture.
  3. When picking software and/or deciding on file formats, consider: i) platform neutrality, ii) ease of migration, iii) future-proofing.
  4. If you happen to shoot high-profile clients or subjects whose images you absolutely cannot afford to have made public, you might want to consider some level of data encryption. That way, if your storage media fall into the wrong hands, the data is inaccessible.

Highly recommend getting and reading – “The DAM Book: Digital Asset Management for Photographers” by Peter Krogh (O’Reilly). This book covers more than just your data storage but the whole Digital Asset ecosystem for photographers – from Metadata to Storage Organizing to Workflow.

See also – Drobo Best Practices – http://www.drobo.com/support/best_practices.php