<This is part 1 of a 3 part series>
I have been meaning to comment on the idea of herd intelligence, which I originally noticed via Hoff (here) and Shimel (here), who referenced an article by Matt Hines of InfoWorld (here) commenting on a research note by Andrew Jaquith (here – requires a Yankee Group account). I haven't read the note itself, so my comments are based on the materials that orbit around his work. Within the various interpretations there are some valuable ideas, but there are so many flaws with the concept that it is difficult to know where to begin.
Bio-organisms are a poor metaphor for technology
First is simply the use of the term "herd." Setting aside the comparison of technology to bio-organisms, a herd is genetically predisposed to raise the importance of the group over the individual. At first glance there may be nothing wrong with that, but remember that a percentage of the herd must die if the herd is to survive.
Anyone who has children and shared a moment viewing the circle of life, brought into our living rooms through the Discovery Channel, has probably had to respond to the little ones. My son asked his question following the brutal attack on a young wildebeest that had been stalked by a lion. As soon as the attack began the herd fled: seemingly hundreds of huge bulls, with horns and hooves, running in a panic away from a lone lion. The question finally comes, "Dad, why don't the wildebeest just attack the lion? Why do they let their own family die?" and we all know the answer: "Because some must die for the wildebeest to live; if there were no predators, all the prey would die of starvation brought on by overpopulation." So in nature some must die for the herd to live, but this is a ridiculous metaphor for computers and clearly doesn't work in the technology world. There is no evolutionary pull for the laptop of some guy in accounting to succumb to digital predators so that the organization can stave off total annihilation.
Another issue with the term herd intelligence is that it isn't intelligence at all; it is a primitive response to millions of years of predator/prey relationships, and herd animals do not even have a concept of specialization to improve the group's chances of survival. If one must compare the distributed and collective intelligence of computing systems to a bio-organism, then look to swarm insects such as bees or ants, which are not genetically predisposed to let the individual die and which, when attacked, will counter-attack in force and give their own lives to protect the hive/nest/lair. They have clearly mastered the idea that a lone individual is weak, but that when combined with thousands or hundreds of thousands of other individuals, some specialized to perform certain tasks, they become extremely powerful, efficient, and evolved to the point that they are able to support the success of the individual as well as the group.
But this post is not just a reaction to a poor choice of organism-based metaphor for technology. There are other problems with the idea: relying on vendor cooperation in a competitive free market; the trend toward distributed computing, which adds complexity to centralized processing; the proliferation of intermittently connected mobile devices; the notion that distributing honeypots will provide enough visibility into targeted attacks; and, most troubling in my mind, the fact that it keeps organizations perpetually on the defensive.
Organizations cannot rely on vendors to cooperate
Part of the herd intelligence idea put forth by Jaquith calls for vendors and customers to cooperate and share intelligence in an orgy of goodwill and the desire to improve security defenses.
“It will become vital for vendors to aggregate threat data using customers’ computers….The idea is simple, according to the analyst. If attackers are going to attempt to create different attacks for nearly every individual user, then security software vendors must use their customers’ machines as their eyes and ears for discovering and addressing those variants.”
Aside from the obvious security problems with aggregating customer data between vendors, not to mention issues with contractual privacy agreements or other legal protections, there is the issue of vendors actually cooperating.
Vendor cooperation is a nice idea, but it is neither new nor effective. Actionable intelligence, especially the kind that would result in dynamically reconfiguring a computing device, is competitive. Symantec and Trend already tout their large global SOCs, not to mention the size of their install bases, so what benefit would they receive from sharing information with McAfee, or even with a small vendor like Sana?
Intelligent centralized servers vs. intelligent distributed agents
The computing environment that IT is expected to secure has changed dramatically over the decades. Most recently we have seen an increasingly porous perimeter through which information and systems pass like water through a sieve. Given the proliferation of powerful mobile computing devices with wireless capabilities, the ubiquitous nature of internet connectivity, and the difficulty an overtaxed IT department faces in simply knowing these systems exist, let alone actually managing them, the idea that distributed computing devices would send information back to a centralized location for aggregation, correlation, and processing, with the resulting "intelligence" then redistributed back to the devices themselves, is simply outdated. Jaquith forwards this idea:
“When an unknown binary attempts to execute, the client-side agent sends detailed telemetry information to a remote centralized server and asks whether it is good, bad, or unknown,” said Jaquith. “The server makes a disposition decision based on all the collective history accumulated by the herd. By pooling information about all executing programs across its installed base, the herd makes smarter decisions and can confer immunity faster to new variants.”
The concept of dynamically reconfiguring a system based on environmental variables is extremely powerful, and I will explore this in greater detail below and in future posts, but there are far too many obstacles for heavy back-end systems with centralized intelligence and processing capabilities trying to command and control pseudo-intelligent agents. In addition to the problems of fidelity, there are also issues of timeliness and quality of control. Evolving attacks require real-time collective intelligence to provide a dynamic response; central command-and-control points will not work, and they become less effective as more of the computing infrastructure moves farther from the core.
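To make the centralized model concrete, here is a minimal sketch of the lookup flow Jaquith describes: a client agent hashes an unknown binary and asks a central server for a disposition pooled from the installed base. All class names, verdict labels, and the majority-vote rule are my own illustrative assumptions, not anything from the research note.

```python
import hashlib

class DispositionServer:
    """Pools verdicts reported by the installed base (the 'herd')."""

    def __init__(self):
        self.history = {}  # binary hash -> {"good": count, "bad": count}

    def report(self, binary_hash, verdict):
        counts = self.history.setdefault(binary_hash, {"good": 0, "bad": 0})
        counts[verdict] += 1

    def disposition(self, binary_hash):
        # Decide from collective history; never-seen binaries stay "unknown".
        counts = self.history.get(binary_hash)
        if counts is None:
            return "unknown"
        return "bad" if counts["bad"] > counts["good"] else "good"

class ClientAgent:
    """End-point agent that phones home before allowing execution."""

    def __init__(self, server):
        self.server = server

    def on_execute(self, binary_bytes):
        h = hashlib.sha256(binary_bytes).hexdigest()
        return self.server.disposition(h)

server = DispositionServer()
malware = b"\x90\x90evil"
server.report(hashlib.sha256(malware).hexdigest(), "bad")
agent = ClientAgent(server)
print(agent.on_execute(malware))    # -> bad
print(agent.on_execute(b"benign"))  # -> unknown
```

Note that every decision requires a round trip to the server, which is exactly the timeliness and connectivity problem described above: an intermittently connected laptop gets no answer at all.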
The only viable option for collective intelligence in the future is through the use of intelligent agents, which can perform some base level of analysis against internal and environmental variables and communicate that information to the collective without the need for centralized processing and distribution. Essentially the intelligent agents would support cognition, cooperation, and coordination among themselves built on a foundation of dynamic policy instantiation. Without the use of distributed computing, parallel processing and intelligent agents there is little hope for moving beyond the brittle and highly ineffective defenses currently deployed.
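By contrast, the distributed alternative can be sketched as agents that gossip observations directly to peers and reach a disposition locally, with no central server in the loop. The peer topology, quorum threshold, and all names here are illustrative assumptions, not a real protocol.

```python
class PeerAgent:
    """Intelligent agent that shares observations peer-to-peer and
    decides locally, with no centralized processing."""

    def __init__(self, name):
        self.name = name
        self.peers = []
        self.observations = {}  # binary hash -> set of agents reporting it bad

    def join(self, other):
        self.peers.append(other)
        other.peers.append(self)

    def observe_bad(self, binary_hash):
        self._record(binary_hash, self.name)
        for peer in self.peers:  # gossip the observation outward
            peer._record(binary_hash, self.name)

    def _record(self, binary_hash, reporter):
        self.observations.setdefault(binary_hash, set()).add(reporter)

    def disposition(self, binary_hash, quorum=2):
        # Local decision: act once enough independent peers agree.
        reporters = self.observations.get(binary_hash, set())
        return "bad" if len(reporters) >= quorum else "unknown"

a, b, c = PeerAgent("a"), PeerAgent("b"), PeerAgent("c")
a.join(b); b.join(c); a.join(c)
a.observe_bad("deadbeef")
b.observe_bad("deadbeef")
print(c.disposition("deadbeef"))  # -> bad (two peers reported it)
```

The design point is that agent `c` blocks the binary without ever contacting a central server; intelligence lives at the edge, which is what dynamic policy instantiation requires.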
Using end-point intelligence to improve security defenses
When I was still with Gartner I wrote a research note on the need for network security tools to integrate endpoint intelligence to increase accuracy, performance, and effectiveness. The concept was simple: security controls should dynamically reconfigure themselves based on environmental intelligence provided by the collective. For example, IDS/IPS systems could integrate with vulnerability assessment tools to gain knowledge of the computing devices they are protecting; they would understand the state of those devices and reconfigure themselves based on that knowledge.
The main driver was originally to reduce false positives, since at the time IDSes were highly inaccurate and prone to a high rate of them (actually, IDS/IPS products still are). If a Solaris attack is launched against a Windows machine, does it matter? If someone attacks a vulnerability that no longer exists on the end-point, does it count as a security incident? Would you feel the same at 3 in the morning?
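The triage logic behind those questions is easy to sketch: check each alert against the target's actual OS and patch state before waking anyone up. The asset inventory, alert fields, and CVE values below are invented for illustration, not drawn from any real product.

```python
# Hypothetical target-aware alert filter: suppress IDS alerts for attacks
# that cannot succeed against the destination host's OS or patch state.

assets = {
    "10.0.0.5": {"os": "windows", "vulns": {"CVE-2003-0352"}},
    "10.0.0.9": {"os": "solaris", "vulns": set()},
}

def triage(alert):
    target = assets.get(alert["dst"])
    if target is None:
        return "investigate"          # unknown host: keep the alert
    if alert["target_os"] != target["os"]:
        return "suppress"             # e.g. Solaris exploit vs. a Windows box
    if alert["cve"] not in target["vulns"]:
        return "suppress"             # vulnerability no longer present
    return "investigate"

solaris_attack = {"dst": "10.0.0.5", "target_os": "solaris", "cve": "CVE-2001-0797"}
real_threat = {"dst": "10.0.0.5", "target_os": "windows", "cve": "CVE-2003-0352"}
print(triage(solaris_attack))  # -> suppress
print(triage(real_threat))     # -> investigate
```

The hard part in practice is not this conditional logic but keeping the asset data fresh and trustworthy, which is the data-quality problem discussed below.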
Some argue that they want to see this data to prepare themselves for an impending attack; the argument goes that a series of port scans or similar activity somehow signals that a successful attack is coming. The problem is that these "non-events" happen far too frequently to be meaningful: networks are under constant attack, probing, poking, rattling, shaking, and inspection, and most of it results in non-events.
There are vendors that have tried to integrate end-point intelligence with the IDS, either pre-incident, where end-point intelligence is used to dynamically reconfigure security defenses, or post-incident, where end-point data is used to drive more intelligent reporting. When nCircle offered an IDS solution they would forward VA data to the IDS, which would dynamically reconfigure what it was looking for based on that context: a "target-aware IDS." Sourcefire does this with their IPS, and back in the late '90s I worked on a project at McAfee called "Active Security" which attempted to use CyberCop scanner data to automatically change firewall settings based on vulnerability findings.
This is a step in the right direction but it hasn’t been widely adopted. The challenge comes from the quality of the data, the problems with integrating disparate systems (network and host), and the architectural limitations of the systems attempting to provide this level of cross functional cooperation.
Collective intelligence, or swarm intelligence if you prefer (btw, the term swarm intelligence is also used in robotics and other technology-related disciplines), is the next evolutionary step in information security. This is where we will see innovation in the security industry, and in the next post I will dig into the details of a distributed computing, parallel processing architecture based on intelligent agents that are context, state, location, and environmentally aware, with the ability to cooperate and coordinate with each other independent of a central processing system. With such an architecture it is only a small leap to completely weaponize IT and turn a weak defensive posture into a highly efficient offensive security strategy.