Do today's organizations have a data problem? It's a big question that means a lot to Nuix—after all, if the answer is 'no' and we've just been assuming it all along, we've got a major problem on our hands. I think it's safe to say that's not the case, but there's a lot more to the story than just saying "We all have lots and lots of data."
Organizations are challenged not just by volume, but by the diversity of data they need to make sense of. It's a fairly straightforward assumption that you can tell certain stories by looking at user-generated information like email messages and documents, while seeing something else entirely if you focus on log files, network traffic, or even structured information found in databases.
A while ago, we started thinking about data categories, or dimensions, to make sense of gross oversimplifications like 'massive volumes.' It became fairly apparent that organizations often had a handle on each dimension, ten of them by our count, but only individually. Data management, investigation, discovery, and security tools each concerned themselves with mastering one aspect of data, or maybe a few, rather than the sum total.
Fighting Data Blindness
Imagine walking into Citizens Bank Park, home of my Philadelphia Phillies, or really any baseball or football field (American or otherwise) in the world. What would you see if your eyes were only capable of seeing in shades of green? You'd get a basic understanding of the field's shape thanks to the green grass, but you'd be missing important details. At a baseball field, you wouldn’t see the yellow foul poles, white baselines, and red clay of the base paths. I think it's safe to say those are important inputs to understanding what's happening in a baseball game. The same goes for football: Where are the boundary lines? The goal posts? Where did the crowd go?
Tools that only ‘see’ one dimension of data are equally limited, a data blindness that’s a considerable detriment when you consider what’s at stake. Nuix gives you a full-spectrum view of your data, providing visibility into all ten dimensions, including:
1. Human-generated Content
Often given the most attention in investigations and litigation, human-generated content includes email messages and common documents, PDFs, spreadsheets, and presentations.
2. Multimedia
Multimedia is rampant, and will continue to be so as organizations capture more audio and video for business purposes. As compelling as audio and video are for the end user, extracting intelligence from them in an investigation is a challenge.
3. Digital and Mobile Forensic Data
Forensic evidence from devices includes many of the other artifacts on this list, but packaged as forensically protected images. It also includes information about user and system activity, such as Windows Registry files and unallocated or file slack space.
4. Network Data
Data flowing over the network can often paint an interesting anecdotal picture, alluding to activities occurring on individual devices or locations within the enterprise.
5. Log Data
Log data can be saved by virtually any connected device within the network, giving investigators or incident responders an even greater wealth of intelligence on user and device activity.
6. User Data
How a user interacts with the endpoints on a network is very often the key to unlocking a case. Would it help investigations to capture a suspect’s keystrokes in real time, or be alerted if they print a large volume of material at the same time as plugging in an unauthorized USB storage device?
7. Communication Patterns
Seeing an email saying “I’m sick and can’t come in today” doesn’t raise many red flags. But in an investigation into intellectual property theft, for example, what if that email were correlated with a text message reading “I’ve got the files, when can I meet you for payment?” Communication patterns aren’t always this clear cut, but they can be a huge intelligence source when viewed in context with each other.
8. Structured Data
Structured data (records found in well-organized databases) is often overlooked because of that very organization, but that doesn’t mean it can’t offer a vast wealth of intelligence.
9. Enterprise and Cloud Repositories
Who isn’t concerned with the cloud? Enterprises are turning to all manner of cloud storage solutions to help them scale their own capabilities and reduce costs. Storing data in the cloud doesn’t mean you no longer need access to it. Add on enterprise archival systems and these repositories can make or break a case.
10. Real-time Feeds
Third-party intelligence feeds give you industry-level context into why something is happening, whereas activity on social media can give you valuable intelligence about your users (or even attackers), sometimes before they take any action that targets the organization.
Managing Risk across All Dimensions
Organizations are responsible for managing risk across all ten dimensions of data. Doing so can take an inordinate level of effort to protect the data, investigate when something goes awry, and produce results for litigation.