Saturday, April 28, 2012

What Biosecurity Can Learn From Infosec

Recently there has been debate over research on the H5N1 strain of influenza. This is the strain sometimes called avian flu or bird flu. Many researchers have been studying all they can about the disease, while other researchers, institutes and governments are trying to prevent more research. The arguments on both sides are complex and nuanced, and each side has many valid points.

While I don't want to recap the entire argument, I'll try to summarize each position in a sentence or two. The pro- side believes that legitimate research will help us deal with any eventual version of the virus that can spread from human to human. The con- side believes that research makes it more likely that a very deadly strain might make its way out of the lab, or that terrorists or governments will be able to develop a weaponized strain more quickly.

Current Situation
While public science on H5N1 may cease, the virus itself will keep evolving. Life will always experiment with new forms. Eventually one of these may be a variant of H5N1 that is successful in spreading human-to-human. That is, it finds a new evolutionary niche it can exploit.

And certainly it's to be expected that organizations currently researching biotechnology for warfare would continue. What we saw with chemical weapons in the First World War was private research done by corporations being co-opted for use by the military. Bioweapons research groups may even reduce or discontinue their work in the presence of public research, because any weapon is likely to be much less effective if it is well understood and can be readily combated.

Comparison to Infosec
There has always been a lot of research on finding security vulnerabilities in software. Some researchers look for vulnerabilities so issues can be fixed. Some researchers look for vulnerabilities so they can break into systems. And so in the Infosec community we used to have discussions similar to those going on in the Biotechnology and Biosecurity community now.

That debate always used to remind me of a bad movie chase scene. When the person fleeing sees a big rock coming up quickly, he stays calm and turns at just the right angle - a near miss. When the person chasing sees it he panics and throws his arms in front of his face - a gratuitous explosion ensues.

Fortunately in the Infosec industry we have mostly moved toward the first course - staying calm and taking just the right angle. But for a while we, too, had lots of people who tried to make the rock go away by hiding from it. Many times this was software developers who reacted more violently to the legitimate research than to the criminal research! And the software developers have benefited by having much more robust and secure products.

Benefits of research and publication
Research helps in that it:
  • identifies potential issues before they are found in the wild
  • allows us to prepare for likely strains before we see them
  • lets us refine the methods of doing this kind of research
  • gives us a better idea of the actual threat level - more or less severe than imagined
  • shows us indicators of what an outbreak might look like
  • shows us indicators of what an attacker might need to create a bioweapon
Publication helps in that it:
  • publicizes the fact that these risks exist and are being studied
  • attracts more scientists
  • attracts more funding
  • allows results to be peer reviewed
  • identifies those working in the field to facilitate collaboration

Alleviating fears
One fear is that the research may be co-opted by a nation state for biowarfare. But I would argue that the antidote for that is more research, not less. The same fear exists in Infosec and that's just the way we've tried to deal with it. Hundreds of people doing security research, banging away on software. They're not going to find all of the bugs, but they're likely to find quite a few of them. Looking at it from the other direction, if governments quash open, public research, the only people who will be looking for a bug will be the ones looking for a weapon. And of course you can't legislate nature not to find a virulent strain.

But as in the Infosec community, there is still a question of the right amount of disclosure. Some in both fields advocate a full disclosure stance. That is, every detail should be published as soon as it is known. Others advocate a more limited disclosure policy: publicly releasing only enough information to describe the issue and to protect against it, and releasing certain technical details only to those who will be a part of the solution to the problem.

Publishing technical details is important for a couple of reasons. First, it ensures that the results can be replicated. This part of the scientific process is critical to the reliability of the results, as well as to identifying potentially significant but unknown variables or mistakes. Second, it provides foundations upon which future scientists can improve their techniques. Process and methodological innovation are critical to the scientific process, especially in this case where nature and bioterrorists are continually improving their results.

In many minds, the biggest fear is that we do nothing. If nature or a bioweapons group creates a viable threat, our lack of preparation will doom us to a greater impact. But if we understand the H5N1 virus well then we will either have defenses in place or can quickly take action. And as previously mentioned, public research may discourage groups from trying to develop weaponized versions in the first place.

Biotechnology should be looking (particularly in the case of Avian H5N1 Influenza) to increase scientific study and publication, rather than suppress it. The more scientists who work on it and publish their results, the more likely we are to find a way to defeat both natural and unnatural strains of the disease. But certain technical details should be limited to a smaller group who rigorously review those details to make sure legitimate researchers stay ahead of the alternatives.


Update 2012-05-07: Since I published this article, I've had some discussions and there have been some new developments.
  • First, the paper in question has been published, and Nature has written a good article explaining the circumstances around publishing the controversial article.
  • Second, I want to make clear that what I'm advocating is for Biosecurity to review our discussions and debate and apply them to their own situation. In other words, learn from our mistakes, successes and thought processes to speed up and improve their own.
  • Third, I've replaced "Biotechnology" with "Biosecurity" where it seems appropriate, in order to clarify to whom I am referring. I know Biotech spends billions and has well developed processes in place for research. We in Infosec can probably learn from their process.

Friday, April 20, 2012

Cybercrime Does(n't) Pay?

Earlier in the week a couple of Microsoft researchers released a study of cybercrime financial loss statistics (Sex, Lies and Cybercrime Surveys - PDF link). Effectively their research indicated that bad sampling, survey and statistical methods have led to a number of dubious results. I think most of us who are involved in the industry have known this intuitively for a while. Any time you have metrics purportedly for the same thing that vary by factors of 10-1000, that says something isn't quite right. 
The conclusion of the paper is essentially that estimates of the cybercrime economy are grossly exaggerated. And they make the point well enough that I won't belabor it here. Go read the actual article (linked above). I'm more interested in how this applies to other areas and studies. Here are a couple of points I think are particularly relevant, as well as a couple of others.
  1. Heavy Tails. Means (averages) are most useful when all the data are clustered closely around that number. When the distribution is very wide, you're going to have a problem getting people to understand what the results mean. For example, if I said that the average cost of a DVD player is $100 it doesn't tell you anything meaningful about the market for DVD players. That's because the costs range broadly, so the mean is almost arbitrarily in the middle somewhere.
  2. Garbage In, Garbage Out (GIGO). Since the data in these studies is typically collected by sending surveys, it's impossible to verify its integrity. In some cases people outright lie, but in others they simply don't know true costs and are just guessing. Their guesses may be higher or lower than the actual cost, but since a reported loss can never be negative, the errors don't cancel out and the overall trend almost necessarily pushes the number higher than the true value. By how much is impossible to know.
  3. Attribution. It's not easy to know where fraud came from. How do you know that somebody stole your credit card number from an online database, versus going through your trash or copying it at the restaurant down the street? This kind of attribution is especially hard for consumers who often can only know about an incident after actual fraud or if they are issued a new credit card. If both things happen within a year or so, the consumer is likely to think one caused the other, though as we know correlation does not imply causation.
  4. Self-Selecting Population. The people who respond have at least one thing in common - they return surveys. They may have other things in common, like a tendency to overestimate numbers, to be particularly susceptible to cybercrime, or any number of other things that could influence the validity of these studies.
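
To make the heavy tails point a bit more concrete, here is a minimal sketch in Python. Every number in it is made up for illustration (the 1,000 respondents, the dollar amounts and the population size are my assumptions, not figures from the Microsoft paper). It shows how a single large answer can dominate a survey mean, and how that carries through when the mean is scaled up to a whole population.

```python
from statistics import mean, median

# Hypothetical survey: 1,000 respondents report their annual loss in dollars.
# 990 report nothing, 9 report a modest loss, and 1 reports a very large one.
reported_losses = [0] * 990 + [500] * 9 + [250_000]

# Hypothetical population size used for the extrapolation.
POPULATION = 200_000_000


def summarize(losses):
    """Return the mean, the median, and the share of the total owed to the largest answer."""
    total = sum(losses)
    return {
        "mean": mean(losses),
        "median": median(losses),
        "top_share": max(losses) / total if total else 0.0,
    }


def extrapolate(losses, population):
    """Scale the sample mean up to a whole population, as survey-based estimates do."""
    return mean(losses) * population


with_outlier = summarize(reported_losses)
without_outlier = summarize(reported_losses[:-1])  # drop the single largest answer

estimate_with = extrapolate(reported_losses, POPULATION)
estimate_without = extrapolate(reported_losses[:-1], POPULATION)

print("mean with outlier:   ", with_outlier["mean"])
print("mean without outlier:", without_outlier["mean"])
print("median (both cases): ", with_outlier["median"], without_outlier["median"])
print("share of total from one respondent:", with_outlier["top_share"])
print("national estimate with outlier:   ", estimate_with)
print("national estimate without outlier:", estimate_without)
```

In this toy data set the median respondent lost nothing, yet one unverifiable answer accounts for nearly all of the total. If that one answer is a bad guess, the extrapolated figure is a bad guess too, just with more zeros on the end.
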
This isn't just a problem that affects cybercrime statistics, though. The Ponemon Institute annually puts out a similar report on losses due to breaches (as well as a report on cybercrime). Their methodology is similar to the ones discussed in the Microsoft paper, and therefore suffers from some of the same flaws. To get consistent results over time that show a trend consistent with expectation, I suspect that some data manipulation goes on, which would add yet another layer of bad science (if true - I only have my gut instinct to go on, not any facts).
One group that tries obsessively to get the science right is Verizon Business, which puts out an annual breach report. It uses much better science and statistics and can be counted on to have some rigor. Results can vary wildly year-over-year because they are always introducing new data populations (several groups contribute their figures, most serving a different demographic), but that will normalize somewhat in the next 5 years as their data set gets big enough that new populations skew it less. The raw data is collected and published openly so there can be some good peer review of it, an important step for ensuring validity of results and conclusions.

But these studies don't have to be done poorly. By changing the way the researchers go about it, they could get much better results. For example, if instead of going to consumers with a survey, the authors had been able to get the information from banks the numbers would have likely been very different. The banks would have objective measurements of customer losses, a properly large sample size, a randomly distributed population, and would likely know more about the source of the fraud.

Although many of these studies fail at basic science, I'm hopeful that the information security industry will get better, both at true academic research and at coming up with accurate public metrics for many of the most important data. We'll get there as we mature as an industry, but it will take a while. Until then, stay skeptical.

Thursday, April 19, 2012

Back: Better, Faster, Stronger

I'm back! After about a 4-year hiatus in this space, I plan on remaking my place in the security blogosphere. Not that I haven't been active since then in security - I have! And I've been involved in the community, too. But this space has been conspicuously vacant as I've tried to maintain a relatively low profile.

But now I'll be back to saying it publicly, rather than sending it through a corporate lens or self-censoring. I'll be posting as often as I find the time to cobble something together. If the past 4 years of output is anything to judge by, that will probably be a lot of stuff coming your way! And I'm going to try to play around with the content, format and delivery too. Keep it loose and entertaining, as well as informative.

One key to that, I think, will be to make better use of social media. I'm going to start off with Twitter, as that tends to be where most of my colleagues and peers gather. So if you haven't already, hit me up @beauwoods.