Sunday, 21 March 2010

Flock shepherds in a Life of Grime

After writing that title I am wondering if I wouldn't be better employed writing crossword clues. Back at the sausage factory I am bogged down investigating a digital Life of Grime. You have probably done a case like it - there are several hard drives in the suspect's box, all with multiple partitions. Every nook and cranny is stuffed with IPoC and IVoC in a semi-organised way. It is almost as if all the material was neatly stored at one time, until my suspect got lazy and started storing stuff on the floor, in the hall, under the table, that sort of thing. There are new folders after new folders - you don't see a New Folder (13) every day. Not content with filling up his hard drives, my suspect had also felt the need to back his stuff up to CD-R or DVD. Repeatedly.

The sheer quantity of files and folders in this case presents many problems and I thought I would share a few of them with you. Most of the problems are linked to pictures, that is the vast quantity of pictures, all of which have to be categorised and counted.

As many of my readers will know, in a case like this C4P is an essential tool (going off at a tangent - googling C4P led to the discovery that c4p dot com is an online community for swingers - and now having said that we had better watch out for some dodgy google ads appearing to the right ->> but hey they might improve my minuscule return from them ;-) ). C4P stores the results of categorisation, whilst the case is being worked on, in an Access database that has a 2GB file size limit. Depending on how deeply your MDP folder is buried within your case folder structure, this equates to about 1.9 million picture files. A long standing problem has been that if the C4P graphics extractor enscript carves out more than about 1.9 million picture files, C4P itself fails to create a case because the underlying Access database maxes out. In this case well over 2.1 million images were carved from the optical media alone, and I was pleased to discover that C4P version 4.01 at least dealt with the maxing out issue fairly gracefully by creating a .c4p4 file and then a second .c4p4 file for the overspill. Case creation failed, but I was subsequently able to create a new case by double clicking on the first .c4p4 file, which is associated with C4P. Doing this gave me a case with exactly 1,900,000 pictures which, when viewed pragmatically, is better than starting again. It is clear that the Access database limit will become more of a problem as mainstream hard disks become bigger and bigger, and I was pleased to discover that a beta version (4.03) of C4P is available that utilises an SQLite database (and whilst at it, runs happily on 64 bit boxes). This version does not limit the number of pictures you can have in one case.
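To put some rough numbers on that limit, here is a back-of-the-envelope sketch in Python. It assumes the 2GB limit and the 1.9 million picture ceiling quoted above; the function names are mine and nothing here comes from C4P itself. The bytes-per-record figure is only an average implied by those two numbers, and will shift with the length of the paths being stored.

```python
import math

# Figures taken from the post; both are approximations.
ACCESS_LIMIT_BYTES = 2 * 1024**3     # 2GB Access file size limit
OBSERVED_MAX_PICTURES = 1_900_000    # rough ceiling before case creation fails


def bytes_per_record(limit_bytes=ACCESS_LIMIT_BYTES,
                     max_records=OBSERVED_MAX_PICTURES):
    """Average database space per picture implied by the two figures."""
    return limit_bytes / max_records


def cases_needed(picture_count, max_records=OBSERVED_MAX_PICTURES):
    """How many Access-backed cases a given number of carved pictures needs."""
    return math.ceil(picture_count / max_records)


if __name__ == "__main__":
    print(f"~{bytes_per_record():.0f} bytes per picture record")
    print(f"{cases_needed(2_100_000)} cases for 2.1 million pictures")
```

On those assumptions each picture costs a little over a kilobyte of database space, which is why deeper (longer) paths lower the ceiling.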

In a case like this another C4P bugbear also rears its ugly head. The latest C4P graphics extractor enscript (4.03) carves out embedded thumbnails within jpgs, so for every picture file you may potentially have three carved images - the original, a thumbnail and a preview image. I am not sure why the latest script does this, as a feature of earlier scripts was that they didn't carve embedded images, precisely to avoid duplication. It is that word - duplication - that is the problem. C4P will allow you to create a report directly from the program, but if you need to compile statistics (how many Level 1s, 2s etc.) the report will be inaccurate because the embedded thumbnails and previews are counted alongside their parent pictures. I do not use C4P for reporting and instead use the C4P Import enscript to bring the C4P results back into my Encase case. Once in Encase, sorting the C4P bookmarks by file offset provides a good indicator of how many embedded images there are. Luckily it is a fairly simple exercise to remove the duplicate bookmarking - simply selecting (blue checking) each category folder in turn and tagging the selected files, and then rebookmarking the files concerned from Entries view, will effectively sort out most duplication. Obviously different considerations will apply to pictures embedded into files for other reasons.
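If you would rather sanity check the counts outside Encase, the same idea can be sketched in a few lines of Python. This is only an illustration: the (source file, file offset, category) tuples are a made-up layout, not a C4P or Encase export format, and it assumes each bookmark can be tied back to the picture file it was carved from. It keeps the bookmark with the lowest file offset for each source file (normally the original picture) and ignores the rest.

```python
from collections import Counter


def count_without_embedded(bookmarks):
    """Count one bookmark per source file, keeping the lowest file offset.

    bookmarks: iterable of (source_file, file_offset, category) tuples.
    This layout is hypothetical and only serves the illustration.
    """
    first_seen = {}
    for source_file, file_offset, category in bookmarks:
        current = first_seen.get(source_file)
        if current is None or file_offset < current[0]:
            first_seen[source_file] = (file_offset, category)
    return Counter(category for _, category in first_seen.values())


if __name__ == "__main__":
    sample = [
        ("pic001.jpg", 0, "Level 1"),      # the original picture
        ("pic001.jpg", 4096, "Level 1"),   # embedded thumbnail
        ("pic001.jpg", 20480, "Level 1"),  # embedded preview
        ("pic002.jpg", 0, "Level 2"),
    ]
    print(count_without_embedded(sample))
```

It would not be suitable for images carved from unallocated clusters or from container files, where many unrelated pictures share one source.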

Moving away from C4P, another issue in this case is establishing where the pictures came from. Obviously establishing which browsers are being used is a job that needs doing, and an Enscript that does just that has been written by Mark Woan over at Woanware. The WebBrowserInformationFinder enscript outputs to the console, enabling you to copy and paste into your contemporaneous notes - very neat. Talking of contemp notes, I use John Douglas's Case Notes - another very neat program. Looking at the problem a little wider, it is worth considering some other angles. The registry contains a few pointers too, and Harlan Carvey's post Browser Stuff covers this in more detail.

As it turned out, my suspect used Flock as his default browser. This browser is built upon Firefox 3 and stores internet history and cache in a profile in the same way as Firefox. In the XP box I was looking at, the profile folder was stored at the path C:\Documents and Settings\USER_NAME\Application Data\Flock\Browser\Profiles\xxxxxxxx.default. I have analysed this with NetAnalysis 1.50. More on NetAnalysis, Firefox 3 and the tools available to analyse this browser's artefacts in my next post.
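For a quick look at the history without a dedicated tool, here is a minimal Python sketch. Firefox 3 keeps its history in an SQLite file called places.sqlite, with visits in the moz_historyvisits table and URLs in moz_places; I am assuming Flock's profile holds the same file with the same tables, which you should verify on your own exhibit. The function name is mine, and it should only ever be pointed at an exported working copy, never the original evidence.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def read_history(places_db):
    """Return (url, title, visit time in UTC) rows from a places.sqlite copy.

    Assumes the Firefox 3 schema: moz_places joined to moz_historyvisits,
    with visit_date stored as microseconds since 1 January 1970 UTC.
    """
    conn = sqlite3.connect(places_db)
    try:
        rows = conn.execute(
            "SELECT p.url, p.title, v.visit_date "
            "FROM moz_historyvisits AS v "
            "JOIN moz_places AS p ON p.id = v.place_id "
            "ORDER BY v.visit_date"
        ).fetchall()
    finally:
        conn.close()
    return [
        (url, title, EPOCH + timedelta(microseconds=visit_date))
        for url, title, visit_date in rows
    ]
```

Calling read_history on the exported copy and printing each row gives a simple chronological list of visits that can be pasted into contemporaneous notes and cross-checked against the output of your main tool.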


1 comment:

Kush Wadhwa said...

Please let me know from where can I get C4P enscript pack.

Thanx,

Kush Wadhwa