Thursday, 1 May 2008 and search.xml

I have mounted a drive with EnCase PDE and run Histex 2.11.0002 against it, loaded the resulting index.dat files into NetAnalysis, and have 577004 URL records to review - joy! So I start by applying the Advanced Search Engine Criteria filter and I only have 27000 of those, including lots of URLs like:


and so on..

The URL file name for each was in the form search[1].xml, search[2].xml, etc. I had not seen this before, so a quick play with Google revealed that this domain had been around in 2006... must be the length of the queue!

So what is it all about then? Well, in this case it seems to be down to the Google Toolbar and its auto-suggestion feature, which updates as you type into the toolbar search box.

The drop-down suggestion box contains the content of the associated XML file.

A new search.xml file is created each time the auto-suggestion list changes, which appears to happen whenever an additional character is typed into the toolbar field. The contents of these search.xml files are not necessarily good evidence in themselves, but the URL that led to their creation is.
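
Since each typed character seems to produce its own URL, the run of URLs can be lined up to show a search term being built one keystroke at a time. Here is a rough Python sketch of that idea, not a tested forensic tool. It assumes the URLs have been exported as a plain list in time order and that the partial search term sits in a query parameter called q; both are assumptions on my part, so check them against your own records. The host name in the sample data is made up.

```python
from urllib.parse import urlsplit, parse_qs


def query_term(url, param="q"):
    """Return the first value of `param` in the URL's query string, or None.

    The parameter name "q" is an assumption; adjust it to match the URLs
    actually recovered.
    """
    values = parse_qs(urlsplit(url).query).get(param)
    return values[0] if values else None


def typing_runs(urls, param="q"):
    """Group consecutive URLs whose terms grow by extending the previous one.

    Each run is a list of partial terms, e.g. ["f", "fo", "for"], which
    suggests a term being typed one character at a time.
    """
    runs = []
    previous = None
    for url in urls:
        term = query_term(url, param)
        if term is None:
            continue  # no search term in this URL, ignore it
        extends = (
            previous is not None
            and term.startswith(previous)
            and len(term) > len(previous)
        )
        if extends:
            runs[-1].append(term)
        else:
            runs.append([term])  # a new term has been started
        previous = term
    return runs


# Hypothetical records, in time order (host and parameters are made up).
sample = [
    "http://suggest.example.com/complete/search?q=f&output=toolbar",
    "http://suggest.example.com/complete/search?q=fo&output=toolbar",
    "http://suggest.example.com/complete/search?q=for&output=toolbar",
    "http://suggest.example.com/complete/search?q=c&output=toolbar",
    "http://suggest.example.com/complete/search?q=ca&output=toolbar",
]

for run in typing_runs(sample):
    print(" -> ".join(run))
```

A backspace or a pasted term will break a run under this simple prefix rule, so treat the grouping as a quick way to triage a long list of records rather than as a finding in itself.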

1 comment:

Anonymous said...

It may be worth noting a few other things in relation to this. Firstly, whilst the URLs show how a search string is built up (great evidence that a human is conducting the searches), the search is not actually submitted until you hit enter. To show that this has happened, look for the more familiar "" URL which contains the full search term.

Tested using the toolbar and the Firefox (v3) search box (look at output= in the URL to determine which was used).

Nice blog btw!