Monday, 8 November 2010

Storage in Forensic Labs

As you probably appreciate, the Sausage Factory type of computer forensics lab has to store and retain vast quantities of data. In the early days, even in the Sausage Factory, we imaged individual hard drives to individual hard drives. But because of the volume of data and the economics of this methodology we realised that we had to use some form of centralised storage. That was in 2002, and since then we have picked up a few tips along the way.

I know of a number of LE labs that have invested large sums (£100k plus) buying their storage area networks. Unfortunately, further down the road they could not afford to increase capacity, had maintenance issues, or had other difficulties exacerbated by the sheer complexity of their setup. At the other end of the scale I know of sizeable outfits who stick to imaging to hard drives because they believe that they would never acquire the budget to go down the centralised storage route.

I believe there is a middle ground. It is possible to buy 26TB of useable RAID6 storage (32TB raw), a Server and a backup solution for circa £15k. This solution is scalable with further units of 26TB useable storage costing circa £7k each. With a sensible set of operating procedures this type of solution will remain serviceable and fit for purpose for a number of years.
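As a rough illustration of the capacity figures above, here is a minimal sketch of the RAID6 arithmetic. The post does not say how 26TB usable is reached from 32TB raw; the sketch assumes sixteen 2TB drives with one kept as a hot spare, which is only one plausible explanation. The function name and the hot spare parameter are hypothetical.

```python
def raid6_usable_tb(drive_count: int, drive_tb: float, hot_spares: int = 0) -> float:
    """Usable capacity of one RAID6 set.

    RAID6 spends two drives' worth of space on parity. Any hot spares
    sit idle until a drive fails, so they add no capacity either.
    """
    data_drives = drive_count - hot_spares - 2
    if data_drives < 1:
        raise ValueError("not enough drives for a RAID6 set")
    return data_drives * drive_tb


# Assumption: 16 x 2TB drives, one of them a hot spare (not stated in the post).
usable = raid6_usable_tb(16, 2.0, hot_spares=1)
print(f"usable: {usable} TB")

# Cost per usable TB for an expansion unit at circa £7k.
print(f"approx. £{7000 / usable:.0f} per usable TB")
```

Without a hot spare the same function gives 28TB, so the quoted 26TB may equally reflect formatting overhead or a different layout.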

The observant amongst you will have counted nine RAID enclosures in the picture. The youngest unit is a Jetstor 516F which, when equipped with sixteen 2TB enterprise class SAS hard drives, provides 26TB usable storage and costs less than £10k. The oldest Infortrend unit is over five years old (and does not store production line data any longer). None of these units have ever lost data. They routinely recover from the inevitable hard drive failures. Although these units are not in the same league as EMC et al, they are manufactured for the enterprise and in my experience have longevity built in. It is possible to provide similar levels of storage even cheaper with consumer grade equipment, but this would probably be a false economy.

All of these units are directly attached (via fibre) to a server. I have found that both Intel and HP manufacture (and support) servers that will probably last forever. Again I look after servers that have not missed a beat in five years.

Although I have found that this type of kit will last I think it is sensible to plan to cycle replacement of primary production line equipment over a three to four year period. Since 2002 I have learnt a lot about this type of kit but have also found that choosing a supplier that will hold your hand when necessary can be particularly useful. In the UK I have found that VSPL understand the needs of LE computer forensic labs and most importantly have always been available to support me when required.

This type of setup, in my experience, has worked well in supporting the production line nature of our forensics work. However, it does require a certain way of operating, which I would sum up in two points. First, storage performance is best placed alongside processor performance, on the forensic workstation. Second, if you want data resilience, keep two copies of your data (in one form or another) at all times.

Obviously there is a little bit more to it than that. If you are interested in finding out more please let me know.


7 comments:

Marek said...

Nice article with quite a lot of results and ideas that we have also come across over the last few years when thinking about how to store our incident data.
But unfortunately one of the most important points from my point of view is missing: how do you transfer the data, e.g. hard drive images, from this self-made NAS to a forensic workstation for analysis?
Do you transfer the original image, work on it locally on the workstation, and transfer the results back when finished? Or is the data always held identically on the workstation and the NAS?

DC1743 said...

Marek,

We image to the centralised storage (via a mapped network drive). We then synchronise the image to the investigator's workstation (using Syncback) and verify it. At this stage you have two separate copies of your data. The investigator carries out their work, making backups of case files, reports etc locally. Extracted data is not normally backed up (as it can always be re-extracted).
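For readers wondering what the verify step amounts to, it could be sketched roughly as below. This is an illustration only and not how Syncback itself verifies a copy; the function names and the choice of SHA-256 are assumptions.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Hash a file in chunks so large images never have to fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(master_path, working_path):
    """Return True when the workstation copy is byte-identical to the master."""
    return file_digest(master_path) == file_digest(working_path)
```

A call such as `copies_match(master, working)` would be given the image path on the mapped network drive and the synchronised copy on the workstation; both paths are whatever your own setup uses.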

At the conclusion of the case the work product is synced back to the centralised storage and then the individual case is archived to tape. At this point the workstation storage can be freed up for new cases. Two copies of the data (storage and tape) are retained until the case is finalised (in court or whatever).

The network is gigabit ethernet which we found sufficient (because you are not running cases across the network).

Richard

Anonymous said...

Nice... What's the power consumption for the rack?

p said...

A downside to using RAID6 over filesystems such as ZFS is that you don't have end-to-end protection against data corruption. When your image hashes don't match because one of the drives has a 1 instead of a 0, you can no longer use that evidence (and there may be other consequences).

The only viable option I see for litigation-bound computer forensics is some sort of monthly hashing script to verify that the files have not changed. ZFS is great but Solaris comes with its own implementation problems.
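A periodic hashing script of that kind might look something like the sketch below. It records a SHA-256 for every file under a storage root, then later reports any file that is missing or whose hash has changed. The function names are hypothetical and scheduling (for example a monthly job) is left out.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash one file in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its current SHA-256."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = sha256_of(full)
    return manifest


def find_changes(root, manifest):
    """Re-hash every file in the manifest and list anything missing or altered."""
    problems = []
    for rel_path, expected in sorted(manifest.items()):
        full = os.path.join(root, rel_path)
        if not os.path.isfile(full):
            problems.append((rel_path, "missing"))
        elif sha256_of(full) != expected:
            problems.append((rel_path, "hash mismatch"))
    return problems
```

The manifest would need to be stored somewhere other than the array it describes, otherwise the same corruption could affect both. Note that this only detects a changed file; it does not repair it, which is where a second copy comes in.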

HP said...

Many Hi Tech Crime Units went over to networked storage a few years ago, like the Sausage Factory. I considered this activity somewhat boys' toys; simplistic and a generalisation, I know, but it is an expensive solution compared to imaging to single hard drives. My (now ex) unit stayed with single hard drives against cries of "what if we have a big job?". I bought a big lump containing 5 RAIDed drives to cover this and used it less than a handful of times in five years. Perhaps we don't do big jobs! I would suggest a lot of units could get away with imaging to hard drives and save a lot of money on infrastructure and the overheads in time and money of maintaining the network storage.

I recently visited the best organised private CF company I have ever seen and was a little surprised to find that they did not use any network storage for drive images.

H

Anonymous said...

Hi,

Have used these Jetstors for a while, but did have one fail once, showing 3 discs had failed at the same instant on a power-off restart, so could not recover the RAID. Got the data back but it cost £18.5k to re-assemble the RAID. Although these units have triplicated PSUs etc, there must be a common mode failure possibility somewhere?
Posted as ANON as I no longer work for the agency that had the problem.

Great devices, good product and good service from VSPL.

regents said...

Good article on how to deal with the ever-expanding data we acquire. We either do one for one or else use a 2TB drive for larger / multiple jobs. We would be reluctant to commit all the data to one NAS - perhaps if the data keeps increasing!