Forensics from the sausage factory: March 2009

Tuesday, 24 March 2009

Seagate Barracuda Firmware problems

As has been widely reported elsewhere many Seagate Barracuda hard disk drives have faulty firmware that causes them to effectively freeze.

In our lab we have 12 of the affected Seagate Barracuda 7200.11 1TB drives and 50 Seagate ES.2 1TB drives. Two of the 7200.11 have failed with symptoms that suggest that faulty firmware was the cause. I have flashed the firmware of all of the affected working drives and hope that I can look forward to a long fault free period.

One of the failed 7200.11 drives was one third of a Raid 0 array in one of our forensic workstations. This workstation needed an OS reinstallation (onto a separate single drive) and temporarily some data on the OS drive (normally backed up to the Raid) only existed on the Raid. During subsequent software installations a number of reboots were required triggering the firmware bug and a failed raid. Oh bother!

Normally data on our Raid 0 arrays is backed up but due to the aforementioned OS reinstall there was a small amount of data that lost. I therefore had to find a way to unfreeze the locked Seagate Barracuda. A considerable trawl of the internet led me to this post. Gradius2 details a fix involving a significant amount of down and dirty electronics and low level hard drive programming. Not having all the necessary adapters/serial cables etc led me to call Disk Labs. They quoted £1000 - that was a no go then. Luckily Hugh Morgan at Cy4or successfully repaired the drive broadly following Gradius2's fix for significantly less. I reintroduced the drive to the two other drives in the Raid 0 array and Bob's Your Uncle!

Sunday, 22 March 2009

Monetizing Helix

The forensics community has benefitted from the free Linux forensic distro Helix3 for some time. This distro was developed by Drew Fahey and distributed via e-fense.com (archived Helix 3 website). I suppose, like many free things, the issue of how to you support it and develop it when you are not making money from it became an issue for e-fense. I was under the impression that a revenue stream was available via Helix3 training courses (run by CSI Tech in the UK). I know that both Nick Furneaux and Jim Gordon were very busy with these courses, and having attended one myself, I thought they were a great success.

Anyhow it seems that training provision wasn't enough. Late 2008 e-fense invited e-fense helix forum members to make donations. Unsurprisingly take up wasn't that great. This resulted in a slightly hectoring email from e-fense announcing that Helix3 was now only available to those who subscribed for access to their forum. The subscription is around US$20 per month. So be it but as someone who has already paid circa US$1000 for a training course to use a product I cannot now download without subscription I am left feeling slightly disappointed.

Nothing stands still in this arena however. I have posted in the past about WinFE and some subsequent comments led me to a Grand Stream Dreams blog post written by Claus Valca. He referred to two free forensic Linux distros:

Perhaps one of these is the new helix?

It seems one or two others have commented on the same subject -it seems they are not planning to subscribe either.

I noticed a bit late in the day that there is an extensive thread over at Forensic Focus about this issue also.

Friday, 20 March 2009

Facebook Chat Forensics

Background
Facebook has a built in instant messaging facility which has grown in popularity along with the Facebook social networking site itself. Many cases involve potential grooming offences in which the use of instant messaging needs to be investigated.

The instant messaging facility creates a number of artefacts which are easily found and I know have been commentated on elsewhere. The purpose of this blog post is to suggest a methodology to automate the discovery and reporting of Facebook messages.

For those who have not looked at this area in detail yet messages are cached in small html files with a file name P_xxxxxxxx.htm (or .txt). These messages can be found in browser cache, unallocated clusters, pagefiles, system restore points, the MFT as resident data and possibly other places. It is possible for the messages to be cached within the main Facebook profile page (although I have never seen them there - the main facebook page does not seem to be cached that often).

An example of a message is shown below:

for (;;);{"t":"msg","c":"p_1572402994","ms":[{"type":"msg","msg":{"text":"Another Message","time":1237390150796,"clientTime":1237390150114,"msgID":"3078127486"},"from":212300220,"to":1123402994,"from_name":"Mark PPPPPP","to_name":"Richard XXXX","from_first_name":"Mark","to_first_name":"Richard"}]}

The bulk of the message is in fact formatted as JavaScript Object Notation normally referred to as JSON. The format is a text based and human readable way for representing data structures. The timestamps are 13 digit unix timestamps that include milliseconds - they can be divided by 1000 to get a standard unix timestamp.

Although keyword searches will find these messages they are difficult to review particularly if you are only interested in communication between selected parties. Having found relevant hits you then have to create a sweeping bookmark for each one. For these reasons I follow the following methodology.

Suggested Methodology

Create a Custom File Type within the Encase Case Processor File Finder module entitled Facebook Messages using the Header "text":" and the footer }]} making sure GREP is not selected.

Run the file finder with the Facebook Messages option selected.
When the file finder completes you will have a number of text files in your export directory (providing there are messages to be found).
These text files are in the form of the example above. They do not have Carriage Return and Line Feed characters at the end of the text. We need to remedy this by utilising a DOS command at the command prompt.
At the command prompt navigate to the directory containing your exported messages (please note Encase creates additional sub directories beneath your originally specified directory).
Then run the following command:
FOR %c in (*.txt) DO (Echo.>>%~nc.txt)
This command adds a Carriage Return and Line Feed to the end of the extracted message.
Next we want to concatenate the message text files into one file using the command at the DOS prompt: copy *.txt combined.txt
Alternatively create (or email me for) a batch file that executes these two commands direct from windows.
An additional file combined.txt will be created in your export directory.
Launch Microsoft Excel and instigate the Text Import Wizard specifying Delimited with the Delimiter being a comma and the text qualifier " .
Put the data into your worksheet (or cell J3 of my pre-formatted worksheet).
All that's needed now is to tidy up the worksheet with some Excel formulas the full details of which can be found within my example pre-formatted worksheet. The formula to process the time values (which are Unix time stamps) is (RIGHT(K2,13))/1000/86400+25569 where K2 is the cell containing the source time data.
Perform a sanity check and remove obviously corrupt entries.
It can be seen below that after applying a data sort filter you can sort by time or user.

The spreadsheet also allows you to de-duplicate the found messages. In my recent case over half the recovered messages were duplicates. In Excel 2007 these duplicate (rows) are easily removed (Data/DataTools/Remove Duplicates). In Excel 2003 an add-in called The Duplicate Master will do this for you.

Further Thoughts
Non Encase users may be able to use an alternative file carver (e.g. Blade) to carve out the messages. I am sure that the header and footer could be refined a bit to reduce false positives, however for me the ratio of legitimate versus false positives is OK. UPDATE 22nd April 2009 - non encase users may wish to look at my more recent post.

I have the pre-formatted spreadsheet in template form. Please email me for a copy (with a brief explanation of who you are - thanks).

To further investigate the data you recover you may wish to check out http://www.facebook.com/profile.php?id=xxxxxxx. Just substitute the xxxxx with the User ID's you recovered.

Enscript Method
I have collaborated with Simon Key and now have an enscript to parse out JSON objects including messages. It outputs to a csv spreadsheet and in my tests parsed 160GB in about an hour. It might not be as tolerant of corrupt strings as the method detailed above. The script will only run in 6.13 or newer. I have a template that tidies up the formatting of the csv- email me if you want a copy.

References and thanks
http://coderrr.wordpress.com/2008/05/06/facebook-chat-api
http://video.google.com/videoplay?docid=6088074310102786759
http://json.org/
http://en.wikipedia.org/wiki/JSON
http://www.wilsonmar.com/datepgms.htm#UNIXStamp

Thanks to Glenn Siddall for sparking my interest and providing me with some notes of his research.
Thanks to Mark Payton for his assistance in researching this.

Thursday, 19 March 2009

The need for speed

We are lucky in our lab that our workstations are upgraded on a regular basis so once in use we don't often make many changes.

The most important bits of my current spec are as follows:

Supermicro X7-DWA-N fitted into a Supermicro CSE-733TQ-645 chassis
Two Intel Xeon X5482 processors
16GB DDR2 800 Ram
Western Digital 300GB VelociRaptor10,000 rpm hard drive for the OS
3 x 1TB Samsung HE103UJ Spinpoint F1 hard drives in a RAID 0 array
Microsoft XP 64 bit
256MB ATI FirePro V3700 GPU

At one time our primary forensic software Encase would max out the processor when carrying pretty much any process. With the advent of multi-core dual processors we aim to max out one core (which on my box is 13% cpu utilisation in task manager). As processors get faster and faster I have noticed that often the CPU core is not maxing out. Something else is slowing us down! We store our Encase evidence files on the Raid 0 array (and just before someone posts a comment about the lack of data resilience etc., the way our lab is set up all the data on my Raid 0 array is mirrored elsewhere). We do this for speed and capacity. When Encase (and most other forensic utilities for that matter) is processing it has a voracious appetite for data. Just look at the Read bytes value in Task Manager. The multi-core processors allow us to run other forensic programs (FTK, Netanalysis hstex etc etc) along with Encase, we can even run other instances of Encase, and because we can - we do. The net result of all these programs running is that they compete to read data from the Raid 0 array in my case (and from wherever you store yours in yours) - the net result once your data storage is maxed out is things slow down. It follows then that performance can be increased by having faster data storage.

One way to achieve this would be faster hard drives. We use sata hard drives for capacity reasons and to an extent cost. SAS hard drives are faster but don't provide the capacity. So as things stand three hard drives in a Raid 0 array was the best that could be done. I decided to see how I could make some improvement.

Currently the three hard drives (and the OS drive) connect to the Intel ESB2 raid controller integrated on the motherboard. Conventional wisdom would have it that by adding a fourth hard drive to the raid 0 array would speed things up.

HD Tach details an average sequential read speed of around 200 MB/s for a three drive array utilising the default stripe size (128kb) with NTFS formatted with the default cluster size.

Adding a fourth drive slowed the sequential read speed to around 180 MB/s.

I tested a variety of different stripe sizes and aligned the partitions but came to the conclusion that the Intel ESB2 controller just did not scale up to four drives very well. The arrays were created via the utility accessed via the controller bios during boot up. This utility is very basic and does not allow much configuration. Intel also provides a Windows utility called Intel Matrix Storage Console. When running this utility I found that by default Volume Write-back cache was disabled. Enabling it made a significant improvement.

Conventional wisdom has it that a hardware Raid controller would improve performance over the Intel ESB2 and in my testing this seems to be the case. I have used an Areca 1212 PCI-E raid card and achieved a sequential read speed of over 600 MB/s.

This array has four 1TB sata hard drives with a 64kb stripe, is partition aligned at 3072 sectors and has one NTFS volume with the default cluster size. Using Syncback to write to the array from our file server across a copper gigabit ethernet network produces some pretty impressive network utilization stats.

Sunday, 8 March 2009

Yahoo mailbox

An MLAT request brought CD-R to my door recently. The OIC informed me that the CD contained a Yahoo mailbox but wanted my help because he could not read them. I found that the CD contained a tar.gz file.

Once this archive was unpacked I saw it contained two very large text files. These files were generic Mbox files. The next problem was how to view the contents. I found that Apple Mail would happily import Mbox files (File/Import Mailboxes) however I live in a mainly windows world so needed a Windows method for the OIC to preview the emails.

Thunderbird came to mind, however although this program uses the mbox format for its mailboxes it does not offer an easy way to import them. I did track down an extension to Thunderbird that provided this functionality but it only worked on one of my two mbox files. I also found that Opera 9 would also import my mbox files.

The problem with both Thunderbird and Opera is that the boxes available to the OIC in this case, and our customers in general, mostly do not have these programs installed. Ideally a way of getting the email messages into Outlook Express would be the best bet. The solution to this is provided by using the Mid Michigan Computer Forensics Group's M2CFG Yahoo! Email/Text Parser. This program parses out the email messages into .eml files which can be dragged into Outlook Express (and a number of other Email clients).

As it turned out the two mboxes I had extracted for the OIC were so full of emails with attachments that it was too complicated for him to process efficiently. So they came back to me to investigate. I added the mbox text files into Encase v6.12.1 and searched for email with the mbox option selected which resulted in Encase parsing out the emails and attachments very well. Reporting them was another matter!

Forensics from the sausage factory

Tuesday, 24 March 2009

Seagate Barracuda Firmware problems

Sunday, 22 March 2009

Monetizing Helix

Friday, 20 March 2009

Facebook Chat Forensics

Thursday, 19 March 2009

The need for speed

Sunday, 8 March 2009

Yahoo mailbox

Where to find me

Factory Clock

Websites I go back to

Blogs I read

Podcasts I listen to

Blog Archive

Forensics from the sausage factory

Tuesday, 24 March 2009

Seagate Barracuda Firmware problems

Sunday, 22 March 2009

Monetizing Helix

Friday, 20 March 2009

Facebook Chat Forensics

Thursday, 19 March 2009

The need for speed

Sunday, 8 March 2009

Yahoo mailbox

Where to find me

Factory Clock

Please Subscribe To This Blog

Websites I go back to

Blogs I read

Podcasts I listen to

Blog Archive