Monday, 28 June 2010

Safari Internet History round up

The last few posts all concern the recovery of internet history created by the Safari browser. I like to think of internet history in the wider sense and consider any artefact that demonstrates that a user visited a URL at a particular time.

Recovering Safari browser history from unallocated deals with history.
Safari browser cache -examination of Cache.db deals with the cache.
Never mind the cookies lets carve the crumbs - Safari Cookie stuff looks at Cookies.
Safari History - spotlight webhistory artefacts examines Spotlight snapshots of web pages accessed with Safari.

To round things up I will briefly list some other files or locations that may provide internet history created by the Safari browser (the ~ denotes the path is within a user profile)

Used to store details of the last browser session allowing a user to select Reopen All Windows from Last Session from the safari history menu.


Used to store the associations between websites and their favicons.

~/Library/PubSub/Feeds/............... .xml
~/Library/Caches/ Previews

TopSites is a gallery of recently visited web sites. The binary TopSites.plist details the websites featured in this gallery. The image representing each webpage is stored within the Webpage Previews folder. This folder also stores any Quicklook representation of a webpage, for example when managing Bookmarks or reviewing History. File names of files in the Webpage Previews folder are the MD5 of the associated URL. Safari monitors whether a page has altered since it was last viewed and appends a blue star to the TopSites view for those sites that have. The xml files in PubSub/Feeds are connected with the monitoring.
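If the preview file names really are the MD5 of the URL, a list of URLs from history records can be matched against them. The following is a minimal sketch, not a tested procedure: the function name and the sample URL are hypothetical, and it assumes the hash is taken over the UTF-8 bytes of the exact URL string and written as lowercase hex.

```python
import hashlib


def preview_name_for(url: str) -> str:
    """Return the hex MD5 digest of a URL (assumed to be the preview file's base name)."""
    return hashlib.md5(url.encode("utf-8")).hexdigest()


# Hypothetical URLs, e.g. taken from recovered history records.
candidate_urls = ["http://www.example.com/"]

# Map digest -> URL so that file names found in the folder can be looked up.
lookup = {preview_name_for(u): u for u in candidate_urls}
for digest, url in lookup.items():
    print(digest, url)
```

A file name that does not appear in the lookup simply means the URL is not in the candidate list, or that the assumptions about the hashed string are wrong.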

An xml plist the contents of which are self explanatory.
~/Library/Caches/Metadata/Safari/History/.tracked filenames.plist
A binary plist that may be connected to Safari spotlight web history artefacts.

Tuesday, 22 June 2010

Never mind the cookies lets carve the crumbs - Safari Cookie stuff

Safari versions 3, 4 and 5 amalgamate cookie data into one large file, Cookies.plist, stored at the path ~/Library/Cookies. This plist is an XML plist. The Encase Internet History search will parse these files and, when set to Comprehensive search, will find fragments of them in unallocated. However, perhaps due to its lack of granularity, this search takes forever to run across a Mac and in my experience often fails to complete.

As is becoming a recurring theme with my Safari examinations I have turned to Blade to carve out Safari Cookie data from unallocated. The Cookie.plist consists of an array of dictionary objects.

Using Apple's Property List Editor it can be seen that this Cookie.plist has an array of 7074 Dictionary objects. Each Dictionary object is a Cookie in its own right.

Looking at the underlying XML you can see how each dictionary object is structured.

In creating a recovery profile I considered whether I wanted to carve out deleted cookie plists in their entirety or whether I should carve each dictionary object separately. These dictionary objects are fragments of the cookie.plist (hence the crumb reference in the title; after all, fragments of cookies are clearly crumbs). I decided that it would be a more thorough search if I carved for the dictionary objects themselves and the following Blade data recovery profile did the business (this data is extracted from Blade's audit log, another neat feature).

Profile Description: Safari Cookie records
ModifiedDate: 2010-06-17 06:33:30
Author: Richard Drinkwater
Version: 1.3.10168
Category: Safari artefacts
Extension: plist
SectorBoundary: False
HeaderSignature: \x3C\x64\x69\x63\x74\x3E\x0A\x09\x09\x3C\x6B\x65\x79\x3E\x43\x72\x65\x61\x74\x65\x64\x3C\x2F\x6B\x65\x79\x3E\x0A\x09\x09\x3C\x72\x65\x61\x6C\x3E
HeaderIgnoreCase: False
HasLandmark: True
LandmarkSignature: <key>Expires</key>
LandmarkIgnoreCase: False
LandmarkLocation: Floating
LandmarkOffset: 0
HasFooter: True
Reverse: False
FooterIgnoreCase: False
FooterSignature: \x3C\x2F\x73\x74\x72\x69\x6E\x67\x3E\x0A\x09\x3C\x2F\x64\x69\x63\x74\x3E\x0A
BytesToEOF: 19
MaxByteLength: 9728
MinByteLength: 200
HasLengthMarker: False
UseNextFileSigAsEof: True
LengthMarkerRelativeOffset: 0
LengthMarkerSize: UInt16
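
For readers without Blade, the header, landmark and footer logic of the profile can be roughly approximated in a few lines of Python. This is only a sketch of the idea and not a description of how Blade works internally. The byte strings are the decoded signatures from the profile; the function name is hypothetical, and the sketch assumes the data to be searched fits in memory.

```python
# Signatures decoded from the hex values in the profile above.
HEADER = b"<dict>\n\t\t<key>Created</key>\n\t\t<real>"
FOOTER = b"</string>\n\t</dict>\n"
LANDMARK = b"<key>Expires</key>"
MAX_LEN = 9728  # MaxByteLength
MIN_LEN = 200   # MinByteLength


def carve_cookie_records(data: bytes):
    """Yield (offset, record_bytes) for each candidate cookie dictionary in data."""
    pos = 0
    while True:
        start = data.find(HEADER, pos)
        if start == -1:
            return
        body = start + len(HEADER)
        window_end = start + MAX_LEN
        foot = data.find(FOOTER, body, window_end)
        next_hdr = data.find(HEADER, body, window_end)
        # Accept the footer only if no new header starts before it,
        # a simplified stand-in for UseNextFileSigAsEof.
        if foot != -1 and (next_hdr == -1 or foot < next_hdr):
            end = foot + len(FOOTER)
            record = data[start:end]
            if len(record) >= MIN_LEN and LANDMARK in record:
                yield start, record
            pos = end
        else:
            pos = body
```

The generator could be fed a chunk of unallocated space read from an image, writing each record out as its own .plist file.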

Processing the Carved Files

If your case is anything like mine you will carve out thousands and thousands of individual cookies (or at least the cookie data represented in XML). There are a number of options to process this data further.

Option 1

  • Drag output into Encase as single files.
  • Run Encase Comprehensive Internet History search.
  • View results on records tab.

There are two issues with this method. Firstly, Encase does not parse the Cookie created date, which is stored as a CFAbsolute timestamp. Secondly, there is the issue of duplicates: you will have thousands and thousands of them. These can be managed by hashing the carved files. I would also recommend running the data recovery profile over any live cookie.plists, loading the output into Encase as single files, hashing the output and then creating a hash set. This hash set will allow you to spot, among the cookies carved from unallocated, any that are additional to those in the live cookie plists.
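
A CFAbsolute timestamp counts seconds from 1 January 2001 00:00:00 UTC, so the created date can be converted by hand where a tool does not do it. A minimal sketch (the function name and sample value are hypothetical):

```python
from datetime import datetime, timedelta, timezone

# Reference date for CFAbsoluteTime values.
CF_EPOCH = datetime(2001, 1, 1, tzinfo=timezone.utc)


def cfabsolute_to_utc(seconds: float) -> datetime:
    """Convert a CFAbsolute timestamp (seconds since 2001-01-01 UTC) to a datetime."""
    return CF_EPOCH + timedelta(seconds=seconds)


# Example value as it might appear in the <real> element of a cookie record.
print(cfabsolute_to_utc(298468800.0).isoformat())
```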

Option 2

  • Concatenate the contents of each output folder by navigating to the folder at the command prompt and executing the command copy *.plist combined.plist.
  • With a text editor add the plist header and array tag at the beginning of combined.plist and the closing plist and array tags at the end.
  • Make sure the formatting of combined.plist looks OK with a text editor.
  • Process combined.plist with Jake Cunningham's safari cookie plist parser.
  • The utility is run from the command prompt using a command in the form
    >[path to Safari_cookies.exe] [path to combined.plist] > cookies.txt
  • This parses the plist into the file cookies.txt
  • This text file may contain many thousands of Cookies. Ideally it would be nicer to port this data into a spreadsheet. To do this (there is probably a far more elegant way, BTW) I open cookies.txt in a hex editor (PSPad Hex) and delete all the carriage return/line feed pairs (0D0A). I then find the string Path [50617468] and replace it with 0D0A7C50617468, in other words preface Path with a carriage return/line feed and the pipe symbol |. I then find the strings Domain, Name, Created, Expires and Value and replace each in turn with the same string prefaced with | (e.g. |Domain, |Name etc.).
  • I then use Excel's text import wizard to import the edited cookies.txt setting the delimiter to the pipe symbol | only.
  • This results in each row relating to one cookie. You can then utilise Excel's very powerful duplicate removal tool.
Both the Mac and Windows versions work OK and the utility converts the CFAbsolute formatted cookie created timestamp.
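
As an alternative to the hex editor and Excel steps in Option 2, the carved fragments could be parsed directly with Python's plistlib and written to a CSV file. This swaps in a different technique from the one described above and is only a sketch: the function names are hypothetical, and it assumes each carved file holds a single dictionary with the keys Domain, Name, Path, Created, Expires and Value. Each fragment is wrapped in its own plist header and array so that one malformed carve does not spoil the rest.

```python
import csv
import plistlib
from datetime import datetime, timedelta, timezone
from pathlib import Path

PLIST_HEAD = (b'<?xml version="1.0" encoding="UTF-8"?>\n'
              b'<plist version="1.0">\n<array>\n')
PLIST_TAIL = b"</array>\n</plist>\n"
FIELDS = ["Domain", "Name", "Path", "Created", "Expires", "Value"]
CF_EPOCH = datetime(2001, 1, 1, tzinfo=timezone.utc)


def parse_fragment(fragment: bytes):
    """Parse one carved <dict>...</dict> fragment; return None if it is malformed."""
    try:
        return plistlib.loads(PLIST_HEAD + fragment + PLIST_TAIL)[0]
    except Exception:
        return None


def cookies_to_csv(folder: str, out_csv: str) -> int:
    """Write one de-duplicated row per cookie; return the number of unique cookies."""
    seen = set()
    with open(out_csv, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(FIELDS)
        for path in sorted(Path(folder).glob("*.plist")):
            cookie = parse_fragment(path.read_bytes())
            if not isinstance(cookie, dict):
                continue
            created = cookie.get("Created")
            if isinstance(created, (int, float)):
                # Convert the CFAbsolute created time to a readable UTC string.
                cookie["Created"] = (CF_EPOCH + timedelta(seconds=created)).isoformat()
            row = tuple(str(cookie.get(name, "")) for name in FIELDS)
            if row not in seen:
                seen.add(row)
                writer.writerow(row)
    return len(seen)
```

Calling cookies_to_csv on a Blade output folder would give a file that opens directly in Excel, with the duplicates already removed.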

Tuesday, 15 June 2010

Safari History - spotlight webhistory artefacts

June is Safari month here in the Sausage Factory and this post is the third in the series. Just imagine having an observation point in the house across the road from your suspect. When the suspect surfs the internet the man in the OP (with the help of a good pair of binoculars) makes notes of what he reads on screen (OK, he may use a long lens instead of binoculars and take photos, but bear with me). Essentially this is exactly what Spotlight does when a user utilises the Safari web browser (versions 3, 4 and 5) to view web pages: it writes the URL, Web Page Title and all the text content in the web page into a file.

  • The filenames of these files are in the format URL.webhistory
  • Their internal structure is that of a binary plist with three strings to each record Full Page Text, Name and URL
  • They are stored at the path ~/Library/Caches/Metadata/Safari/History
  • The file created date of these files represents the time that the URL was first visited (since History was last cleared)
  • The file modified date represents the time that the URL was last visited
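
Because these files are binary plists, a live or recovered copy can be read with Python's plistlib. A minimal sketch, assuming the top level of the plist is a dictionary holding the three strings listed above (the function name is hypothetical):

```python
import plistlib


def read_webhistory(path: str) -> dict:
    """Return the URL, Name and Full Page Text strings from a .webhistory file."""
    with open(path, "rb") as fh:
        record = plistlib.load(fh)  # detects the binary plist format itself
    return {
        "URL": record.get("URL"),
        "Name": record.get("Name"),
        "Full Page Text": record.get("Full Page Text"),
    }
```

The created and modified dates should still be taken from the file system metadata reported by the forensic tool, since the dates on an extracted working copy will not reflect the original values.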

It can be seen that it is possible to deduce information from these files that amounts to internet history, and therefore it may be appropriate to consider this data along with records extracted from history.plist and cache.db files.

Recovery from Unallocated
These files are deleted when a user clears Safari history. However, it is possible to recover them from unallocated. Using my file carver of choice, Digital Detective's Blade, I wrote an appropriate Data Recovery Profile (which I will happily share with you upon request).


Running this profile resulted in the recovery of over ten thousand files. I then added the recovered files into Encase as single files. I noticed that a small percentage of these files had the text content stored as ASCII and not Unicode text. At this stage I am not sure why.

Investigation of Live and Recovered Spotlight Webhistory Files using Encase
If you review these files using Encase you will see the relevant data in the View (bottom) pane: the URL is at the start of the file, followed by the text in unicode and then the webpage title near the end of the file. If the content is relevant, reporting on it is a pain, as potentially three sweeping bookmarks are required using two different text styles. The unicode text sweeping bookmark is also likely to be truncated due to its length. Therefore reviewing any number of these files this way is not a good plan.

The eagle eyed amongst you will have observed that in my Blade Data Recovery Profile I gave the recovered files a plist file extension (as opposed to a webhistory file extension). This is because these files have a binary plist structure and I use Simon Key's binary Plist Parser v3.5 enscript to parse them. This excellent enscript offers the option to create a logical evidence file containing a file for each plist name/value pair. I run the enscript with this option, add the logical evidence file back into my case and then review the contents with just a unicode text style selected, bookmarking as appropriate. This method is much quicker and removes the need to mess about with unicode formatting. It also makes keyword searching easier. For example, to view all URLs, green plate (set include) your logical evidence file, apply a sort to the name column in the table pane and scroll down so that each URL appears in turn in the view pane. Use a similar method for the Full Page Text and Name items.


Miscellaneous Information in relation to the webhistory file format
Prior to considering the Plist Parser enscript to parse these files I briefly looked at its format with a view to tempting some programming friends to write me a parser. I established that

  • The file is a binary plist. I do not want to go too far into the intricacies of how these plists are assembled. We are interested in objects within the object table. Binary plists use marker bytes to indicate object type and size. The objects we are interested in are strings, either ASCII or unicode. Looking at Apple's release of the binary plist format (scroll about a fifth of the way down the page) it can be seen that the Object Format Marker byte for ASCII strings found in this file is in binary 01011111, followed by an integer count byte. In hex these marker bytes as seen in this file are 5Fh 10h. The Object Format Marker byte for unicode strings found in this file is in binary 01101111, followed by an integer count byte. In hex these marker bytes as seen in this file are 6Fh 11h.
  • The byte immediately prior to the URL (generally starting http) and after the marker 5Fh 10h, decoded as an 8 bit integer, denotes the length of the URL. However, if the URL is longer than 255 bytes the marker will be 5Fh 11h, indicating that the following two bytes store the length as a 16 bit big endian integer.
  • Following the URL there is a marker 6Fh 11h. The next two bytes, decoded as a 16 bit big endian integer, give the number of characters of text extracted from the web page. Multiply by 2 to calculate the length in bytes of the unicode text element of the record.
  • Following the unicode text element is a marker 5Fh 10h. The next byte, immediately prior to the webpage title and decoded as an 8 bit integer, denotes the length of the webpage title.
  • The last four bytes of the file, read as a 32 bit big endian integer, are the record size (detailing the number of bytes from the start of the URL to the end of the fifth byte from the end of the file).
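
The marker rules above can be sketched as a small decoder for a single string object. This is an illustration of the marker logic only, not a full binary plist parser: the function name is hypothetical, and it assumes unicode strings are stored as UTF-16 big endian, which is how the binary plist format encodes them.

```python
def read_string_object(buf: bytes, offset: int):
    """Decode the ASCII (5xh) or unicode (6xh) string object at offset.

    Returns (text, offset_of_next_byte).
    """
    marker = buf[offset]
    kind, nibble = marker >> 4, marker & 0x0F
    if kind not in (0x5, 0x6):
        raise ValueError(f"not a string marker: {marker:#04x}")
    pos = offset + 1
    if nibble == 0x0F:
        # The count follows as an integer object: 10h = 1 byte, 11h = 2 bytes, etc.
        int_marker = buf[pos]
        if int_marker >> 4 != 0x1:
            raise ValueError(f"expected integer marker, got {int_marker:#04x}")
        size = 1 << (int_marker & 0x0F)
        count = int.from_bytes(buf[pos + 1:pos + 1 + size], "big")
        pos += 1 + size
    else:
        # Strings shorter than 15 characters keep the count in the marker itself.
        count = nibble
    if kind == 0x5:
        end = pos + count
        return buf[pos:end].decode("ascii"), end
    end = pos + count * 2  # two bytes per character
    return buf[pos:end].decode("utf-16-be"), end
```

Called repeatedly from the offset of the first 5Fh marker, it would return the URL, the page text and the title in turn for a file laid out as described.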

Example file format



Tuesday, 8 June 2010

Safari browser cache - examination of Cache.db

Following on from my post about Safari browser history I want to touch upon Safari cache. My suspect is running Mac OSX 10.5.6 Leopard and Safari 3.2.1. This version stores browser cache in an sqlite3 database under ~/Users/User_Name/Library/Caches/. Earlier releases of version 3, and versions 1 and 2, store cache in a different format and/or a different place. The Episode 3 Shownotes of the Inside the Core Podcast cover this succinctly so I will not repeat it here, but FWIW I have cached Safari artefacts in all three forms on the box I have examined. Currently Netanalysis and Encase do not parse the Safari Cache.db file so another method is required.

Safari Cache.db basics
What follows I believe relates to versions 3, 4 and 5 of Safari running in Mac OSX.
The file contains lots of information including the cached data, requesting URL and timestamps. The file is a Sqlite3 database file which has become a popular format to store cached browser data. The cache.db database contains four tables. For the purposes of this post think of each table as a spreadsheet with column headers (field names) and rows beneath representing individual records.
Two tables are of particular interest:

  • cfurl_cache_blob_data
  • cfurl_cache_response

cfurl_cache_blob_data contains one very notable field and a number of slightly less useful ones. The notable field is receiver_data which is used to store the cached item itself (e.g. cached jpgs, gifs, pngs, html et al ) as a BLOB. A BLOB is a Binary Large OBject. Two other fields request_object and response_object contain information relating to the http request/response cycle also stored as a BLOB which when examined further are in fact xml plists. The entry_ID field is the primary key in this table which will allow us to relate the data in this table to data stored in other tables.

cfurl_cache_response contains two notable fields - request_key and time_stamp. The request_key field is used to contain the URL of the cached item. The time_stamp field is used to store the time (UTC) the item was cached. The entry_ID field is the primary key in this table which will allow us to relate the data in this table to data stored in cfurl_cache_blob_data.

In a nutshell cfurl_cache_blob_data contains the cached item and cfurl_cache_response contains metadata about the cached item.

Safari cache.db examination methods
I would like to share three different methods using SQL queries and a few different tools.

Safari cache.db examination methods - contents quick and dirty
Safari cache.db examination methods - metadata quick and dirty
Safari cache.db examination methods - contents and metadata

Safari cache.db examination methods - contents quick and dirty
Depending on what you wish to achieve there are a number of different methods you can adopt. As regular readers will know I work on many IPOC cases. If all you want to do is quickly review the contents of cache.db (as opposed to the associated metadata) I cannot recommend any application more highly than File Juicer. This application runs on the Mac platform (which I know is a gotcha for some) and parses out all cached items into a neat folder structure.

I drag the File Juicer output folders into Encase as single files and examine the contents further there. File Juicer is not a forensic tool per se but the developer has at least considered the possibility that it may be used as such. If using a Mac is not an option a Windows app SQL Image Viewer may suffice (with the caveat that I have not actually tested this app).

Safari cache.db examination methods - metadata quick and dirty
Sometimes overlooked is the fact that most caches contain internet history in the form of URLs relating to the cached item. The cfurl_cache_response table contains two fields, request_key and time_stamp, containing useful metadata. We can use an SQL query to parse data out of these fields. I use (for variety more than anything else) two different tools (i.e. one or the other) to carry out a quick review of metadata.

Method A using Sqlite3 itself (scroll down to the Precompiled Binaries for Windows section)

  • extract your cache.db file into a folder
  • copy sqlite3.exe into the same folder [to cut down on typing paths etc.]
  • launch a command prompt and navigate to your chosen folder
  • Type sqlite3 cache.db
  • then at the sqlite prompt type .output Cache_metadata.txt [this directs any further output to the file Cache_metadata.txt]
  • at sqlite prompt type Select time_stamp, request_key from cfurl_cache_response; [don't forget the semi colon]
  • allow a moment or three for the query to complete the output of its results
  • Launch Microsoft Excel and start the Text Import Wizard selecting (step by step) delimited data, set the delimiters to Other | [pipe symbol] and set the Column data format to Text
  • Click on Finish then OK and Bob's your uncle!
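
The same query can be scripted with Python's built in sqlite3 module, writing straight to a CSV file instead of going through the text import wizard. A minimal sketch (the function name is hypothetical; it opens the database read only, but it should still only be run against an extracted working copy):

```python
import csv
import sqlite3
from pathlib import Path


def export_cache_metadata(db_path: str, out_csv: str) -> int:
    """Write time_stamp and request_key for every cached item to a CSV file."""
    # Open the extracted copy read only so the query cannot alter it.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    con = sqlite3.connect(uri, uri=True)
    try:
        rows = con.execute(
            "SELECT time_stamp, request_key FROM cfurl_cache_response "
            "ORDER BY time_stamp"
        ).fetchall()
    finally:
        con.close()
    with open(out_csv, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["time_stamp", "request_key"])
        writer.writerows(rows)
    return len(rows)
```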


Method B using SQLite Database Browser as a viewer in Encase

  • from your Encase case send the Cache.db to SQLite Database Browser
  • on the Execute SQL tab, in the SQL string field, enter Select time_stamp, request_key from cfurl_cache_response
  • Review results in the Data returned pane
  • Alternatively, to export the whole table, send the Cache.db from your Encase case to SQLite Database Browser
  • File/Export/Table as CSV file
  • Select the cfurl_cache_response Table name
  • Open exported CSV in Excel and adjust time_stamp column formatting (a custom date format is required to display seconds)

Safari cache.db examination methods - contents and metadata
What we need to do here is extract the related data from both tables, in other words be able to view the time stamp, URL and the cached object at the same time. This can be done using SQLite2009 Pro Enterprise Manager. This program has a built in BLOB viewer that will allow you to view the BLOB data in hex and via an image (as in picture) viewer if appropriate.

  • Once you have launched the program open your extracted Cache.db file
  • In the Query box type (or copy and paste) all in one go
    SELECT cfurl_cache_blob_data.entry_ID,cfurl_cache_blob_data.receiver_data, cfurl_cache_response.request_key,cfurl_cache_response.time_stamp
    FROM cfurl_cache_blob_data, cfurl_cache_response
    WHERE cfurl_cache_blob_data.entry_ID=cfurl_cache_response.entry_ID

  • Then key F5 to execute the query
  • This will populate the results tab with the results
  • To view the cached object BLOB data in the receiver_data field highlight the record of interest with your mouse (but don't click on BLOB in the receiver_data field). This will populate the hex viewer (bottom left) and the BLOB viewer (bottom right).
  • To view a full sized version of a cached image click with your mouse on BLOB in the receiver_data field which launches a separate viewing window
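
Where a GUI viewer is not available, the same join can be run from Python and each cached object written out to its own file alongside its URL and time stamp. This is a sketch under stated assumptions: the function name and the .bin naming scheme are hypothetical, and only the table and field names described in this post are relied upon.

```python
import sqlite3
from pathlib import Path

QUERY = """
SELECT b.entry_ID, r.time_stamp, r.request_key, b.receiver_data
FROM cfurl_cache_blob_data AS b
JOIN cfurl_cache_response AS r ON b.entry_ID = r.entry_ID
"""


def extract_cached_items(db_path: str, out_dir: str):
    """Write each receiver_data BLOB to <entry_ID>.bin and return an index of them."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    # Read only connection to the extracted working copy.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    con = sqlite3.connect(uri, uri=True)
    index = []
    try:
        for entry_id, time_stamp, url, blob in con.execute(QUERY):
            if blob is None:
                continue
            # Guard against a value that was stored as text rather than as a BLOB.
            data = blob if isinstance(blob, bytes) else str(blob).encode("utf-8")
            target = out / f"{entry_id}.bin"
            target.write_bytes(data)
            index.append((entry_id, time_stamp, url, target.name))
    finally:
        con.close()
    return index
```

The returned index ties each exported file back to its request_key and time_stamp, and the output folder can then be reviewed in a tool that identifies files by signature.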


SQLite Database File Format weblog - Extracting data from Apple Safari's cache
Inside the Core Episode 3 Show Notes
Define relationships between database tables -Techrepublic

Sunday, 6 June 2010

Recovering Safari browser history from unallocated

One of my cases involves the examination of an Apple Mac running Mac OSX 10.5.6 Leopard. The primary web browser in use is Safari version 3.2.1. Typically with Safari I run the Comprehensive Internet History search in Encase, but in this case the search would not complete so I had to consider another method to recover and review internet history. Browsing history is stored in a binary plist, ~/Users/User_Name/Library/Safari/History.plist, however the live one was empty. I recalled from a much earlier case that you can carve deleted plists from unallocated. I had documented a method for doing this elsewhere, but at the time of writing that resource is still offline.

One of the best file carvers around is Blade and I decided to use it to recover the deleted History.plists. Blade has a number of pre-configured built in Recovery Profiles but there wasn't one for Safari. However one of the neat things about Blade is that you can write your own profiles and share them with others. In conversation I had found out that Craig Wilson had written a Safari history.plist recovery profile which he kindly made available to me (after all why re-invent the wheel). I imported it into my copy of Blade and I was then good to go.


Another really neat feature with Blade is that you can run it across the Encase evidence files without having to mount them. Having done this in my case Blade recovered over three thousand deleted History.plist files. I then loaded the recovered plist files into Netanalysis 1.51 resulting in over 300,000 internet history records to review. Cool.