Beyond the What - Getting to Why

Knowing "What" is important. It is a 'necessary but not sufficient condition', to take a page from our mathematician friends. We are brought in to investigate to find the "Why". Yet so many investigative reports we read stop short and simply present the "What".

Let me explain.

There are two approaches to an investigation.

The first is the most common - recover the evidence, then list the details of what was found. Turn it over and look at it, note its characteristics, determine where it was found... In the world of the 5 "W"'s, this investigation is the "What" part.

Most digital forensic reports that I have read in the last decade spend all of their time on the "What".

This image was created on Monday, December 12, 2016 at 10:03am local time.

This document was found in the Recycle Bin of the c: drive of the Computer.

This email was sent to Sally Jones by Bob Smith - content provided below.

And despite there always being hundreds and sometimes thousands of pages to these reports - these "What" reports - the reader is left to their own devices to figure out the other "W"'s for themselves.

At the end of the tomes, you can feel the authors slapping the dust off their hands and hear their self-congratulations for a job well done.

But these reports and these investigators have missed the point. Data without interpretation is "noise".

Our approach takes a larger view. Of course we have "What", but we add Who, When, Where and How.

"Who, When, Where and How" are more than just a subset of "What".

Dropping the sender's name and email address into the report that presents the 'email', along with the date and time it was sent and a note that it came from Microsoft Outlook 2010, is really just more "What" reporting. It is easy and, if that's all that is provided, it is lazy. It is not analysis.

Analysis comes from taking these aspects of "What" and putting them into context.

Profiles of the evidence that originates with a specific party (their emails, documents and web browsing history) show who they are and what they are up to. An analysis of the channels of communication between two parties (calls, texts, emails, chats) can give insight into who is influencing whom - whose idea was this "harebrained" scheme anyway?

Communication patterns can uncover the "mastermind". Sure, we might have the data from only 6 of the parties. But a pattern analysis based on a theme can demonstrate that a seventh person, perhaps missing from the evidence entirely, led the development of the idea. Presenting this graphically, and adding the dimension of time, can vividly show the evolution of the idea and point clearly to its "promoter".
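To make the idea concrete, here is a minimal sketch of one way such a theme-based count could be started. It is not our actual tooling: the party names, the messages, the "pricing plan" theme and the helper `theme_participation` are all invented for illustration, the example is scaled down to three parties, and simple keyword matching stands in for a much richer analysis.

```python
from collections import Counter
from datetime import datetime

# Parties whose data we actually hold (hypothetical names).
custodians = {"alice", "bob", "carol"}

# (sent, sender, recipients, text) - every message here is invented.
messages = [
    (datetime(2018, 1, 8, 9, 0), "alice", ["bob", "dave"],
     "Dave, your pricing plan looks workable"),
    (datetime(2018, 1, 9, 14, 0), "bob", ["carol", "dave"],
     "Looping in Carol on the pricing plan"),
    (datetime(2018, 1, 10, 11, 0), "carol", ["alice"],
     "Lunch on Friday?"),
    (datetime(2018, 1, 11, 16, 0), "carol", ["dave"],
     "Dave, what is next on the pricing plan?"),
]

def theme_participation(messages, theme):
    """Count how often each party appears on messages that mention the theme."""
    counts = Counter()
    for _sent, sender, recipients, text in messages:
        if theme.lower() in text.lower():
            counts.update([sender, *recipients])
    return counts

counts = theme_participation(messages, "pricing plan")

# Parties who are active on the theme but whose own data we do not hold.
outside = {party: n for party, n in counts.items() if party not in custodians}

for party, n in counts.most_common():
    flag = "  <- not among our custodians" if party in outside else ""
    print(f"{party}: {n}{flag}")
```

A count like this proves nothing on its own. It only tells the investigator where to look next, and the interpretation still has to be made in the context of the case.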

One of the most powerful techniques we use comes from the eDiscovery world and before that from the paper evidence world. Our lawyers always want a timeline. Put the documents in chronological order please - every time.

Why do they want that? Because, while the document itself is important - it is the "What" - the dimension of time is what tells the story. It is not sufficient to know Winnie-the-Pooh's height, weight and circumference, and the size of the entrance to Rabbit's rabbit hole, to appreciate Pooh getting stuck in the door. We need the story. And we need it to be accurate, complete and consistent if we are going to entertain or, more importantly, persuade the listener.

With digital investigations, the data replaces the 'documents' in the eDiscovery story. Of course, documents make up part of the digital evidence. MS Word documents, PDF files and email - the traditional 'documents' - are found on computers, mobile devices and cloud stores, and they are all still documents in the eDiscovery/paper world. But there is so much more information to be had in the "data" from a digital investigation.

For a start, there is the "simple stuff". In many cases a "business letter" has a date written on the first page. In the eDiscovery/paper world that's the date that is used to place this bit of evidence in the timeline. This is a valuable piece of information and must be captured. But…

When we recover that business letter from digital storage media, like a USB thumb drive, the hard drive in a laptop or a file server, we get so much more.

A Word document has a "creation date" - when the file was either first drafted or when the file was copied to the data storage volume (call it a drive letter) where it was recovered from.

It also has a last modified date. When was the file last saved - perhaps think of this as the date of the final version stored on the hard drive.

Right off the bat we have three dates - the letter's date on the first page, the creation date and the last modified date. Are all three important? They might be very important.

Placing this information about the document in a timeline might show that (a) the letter was started three weeks BEFORE the date on the first page of the letter, and (b) the letter was last edited two months AFTER the date on the first page of the letter. Was the letter genuine? Can we learn about the thinking process, dare we say "intentions", of the author knowing it was started weeks before and last modified months after it was dated?
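The arithmetic behind that observation is simple, and a short sketch shows it. The three dates below are hypothetical values chosen to match the scenario above (started three weeks before, last edited about two months after). The labels and helper names are ours, not taken from any particular forensic tool.

```python
from datetime import date

# Hypothetical values for one recovered letter; none come from a real case.
events = [
    ("date typed on page one", date(2017, 3, 1)),
    ("file created", date(2017, 2, 8)),
    ("file last modified", date(2017, 5, 3)),
]

def build_timeline(events):
    """Return the events sorted chronologically."""
    return sorted(events, key=lambda e: e[1])

def offsets_from_letter_date(events, anchor_label="date typed on page one"):
    """Days between each event and the date written on the letter."""
    anchor = dict(events)[anchor_label]
    return {label: (when - anchor).days for label, when in events}

timeline = build_timeline(events)
offsets = offsets_from_letter_date(events)

for label, when in timeline:
    print(f"{when.isoformat()}  {label}  ({offsets[label]:+d} days)")
```

Sorted this way, the date on the page stops being the whole story and becomes one point among three. Keep in mind that, as noted above, a creation date may reflect when the file was copied to the volume rather than when it was first drafted.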

If we don't have this information available in a timeline, we can't see anything but the date on the first page - we are potentially misled and the true events are missed.

Digital evidence provides far more insight than this simple example. With regard to a Word document we can usually determine when the document was last printed, who the author was, how many minutes the document was worked on and much more.
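For the technically curious, here is a rough sketch of where some of those values live. It assumes the file is a modern .docx (a zip package), not the older binary .doc format, and it reads only the built-in property parts `docProps/core.xml` and `docProps/app.xml`. The function name `read_docx_properties` is ours. These values are written by the application and can be edited or stripped, so a real examination would corroborate them against other sources.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the built-in property parts of a .docx package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
    "ep": "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties",
}

def _text(root, path):
    """Return the text of a child element, or None if it is absent."""
    node = root.find(path, NS)
    return node.text if node is not None else None

def read_docx_properties(source):
    """Read a few built-in properties from a .docx path or file-like object."""
    props = {}
    with zipfile.ZipFile(source) as package:
        names = set(package.namelist())
        if "docProps/core.xml" in names:
            core = ET.fromstring(package.read("docProps/core.xml"))
            props["author"] = _text(core, "dc:creator")
            props["last_modified_by"] = _text(core, "cp:lastModifiedBy")
            props["created"] = _text(core, "dcterms:created")
            props["modified"] = _text(core, "dcterms:modified")
            props["last_printed"] = _text(core, "cp:lastPrinted")
        if "docProps/app.xml" in names:
            app = ET.fromstring(package.read("docProps/app.xml"))
            # TotalTime is the editing time in minutes, as recorded by the application.
            props["editing_minutes"] = _text(app, "ep:TotalTime")
    return props
```

Calling `read_docx_properties("letter.docx")` on a hypothetical file returns a dictionary of strings, with `None` for any property the document does not carry.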

For some, we might be too far into the technical weeds already. You don't need to understand that the area of a circular rabbit hole can be calculated as "Pi*r^2" to appreciate the story of Pooh getting stuck.

For the story to exist at all you need someone to point out that the hole is too small for Pooh. The work of the digital investigator is to provide that context.

Computers and mobile devices are machines, and almost everything they do leaves a trace. Unlike humans, who can sit quietly thinking, computers can't compute without instructions, without logging the actions they are made to take, without leaving behind a happy hunting ground of evidence.

Are these elements of evidence "documents"? No, not in the traditional sense.

They are, nonetheless, substantial, reproducible evidence.

When was the computer turned on? Who logged in? What did the user of the computer do first? Web browsing - to what sites? What did they search for? Did the user delete any documents? Which documents? From where? Was a USB memory key inserted into the laptop? Can it be identified? What was copied to it? When? Did they save an Excel spreadsheet? At what time? Which spreadsheet? Did they open Facebook and start an online chat? With whom? What was said?

And on and on, nothing for miles and miles but tall grass and weeds.

We integrate all of this data into a Grand timeline. Sometimes for just one user, sometimes for a group of users across different devices.

Much of it is not relevant: "Hi mom, I'll come see you on Saturday". Other events might be vital: user SmithB copied the "Master Contact List.doc" file on Wednesday at 5:10pm to a USB stick labelled "Bob's Stuff". Our machete, guided by our understanding of the case and by working closely with counsel, cuts a path.
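As a rough illustration of that integration step, the sketch below merges events from several sources into one chronological list. The `Event` structure, the source labels, the specific date and the extra logon and USB events are assumptions made for the example, and it assumes every timestamp has already been normalized to the same time zone, which is a real piece of work in practice.

```python
import heapq
from datetime import datetime
from typing import NamedTuple

class Event(NamedTuple):
    when: datetime
    source: str
    user: str
    description: str

def grand_timeline(*sources):
    """Merge several per-source event lists into one chronological list."""
    ordered = (sorted(events, key=lambda e: e.when) for events in sources)
    return list(heapq.merge(*ordered, key=lambda e: e.when))

# Hypothetical events for a single Wednesday; labels and times are invented.
laptop = [
    Event(datetime(2018, 2, 28, 17, 10), "laptop file system", "SmithB",
          'Copied "Master Contact List.doc" to USB stick "Bob\'s Stuff"'),
    Event(datetime(2018, 2, 28, 8, 55), "laptop logon records", "SmithB",
          "Logged in"),
]
phone = [
    Event(datetime(2018, 2, 28, 12, 30), "phone messages", "SmithB",
          "Text: Hi mom, I'll come see you on Saturday"),
]
usb_history = [
    Event(datetime(2018, 2, 28, 17, 8), "laptop USB history", "SmithB",
          "USB stick inserted"),
]

timeline = grand_timeline(laptop, phone, usb_history)
for event in timeline:
    print(f"{event.when:%a %H:%M}  [{event.source}]  {event.user}: {event.description}")
```

Merging is the mechanical part. Deciding which of these lines matter is the judgment call, and that is where the understanding of the case comes in.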

It is our view that digital forensics is about providing this context and assisting counsel in sorting out all the data to provide meaning not just of the documentary data but of the actions taken by users.

What about "Why"?

"Why" is the important part.

Once you have established the other "W"'s and the "H", "Why" is the difference between winning the case and losing it.

Pooh ate too much honey. That's why he got stuck in Rabbit's doorway.

We know the events: the arrival, the reluctant invitation, the eating from one honey pot after another after another, the departure. The problem arises from the timeline of events. The solution is also shown in the timeline. What did people, or animals I suppose, do?

The story flows from the data. The why is much easier to understand and explain in the context of a timeline.

Pooh LOVED honey and he had little self-control. Given the events told as a story, everyone accepts that was the cause.

When your case has a bear stuck in a door, consider asking us to help you reach the "Why", efficiently.

Steve Ellwood, March 1, 2018