Heuristic analysis
Heuristic analysis is a method that many computer antivirus programs use to detect previously unknown computer viruses, as well as new variants of viruses already in the wild.
Heuristic analysis is also an expert-based analysis: an expert
determines the susceptibility of a system to a particular threat or
risk using various decision rules or weighting methods. Multi-criteria
analysis (MCA) is one such weighting method. This approach differs from
statistical analysis, which is based on the available data and
statistics.
How it works
Most antivirus programs that utilize heuristic analysis perform this
function by executing the programming commands of a questionable
program or script within a specialized virtual machine,
thereby allowing the antivirus program to internally simulate what
would happen if the suspicious file were to be executed while keeping
the suspicious code isolated from the real-world machine. It then
analyzes the commands as they are performed, monitoring for common
viral activities such as replication, file overwrites, and attempts to
hide the existence of the suspicious file. If one or more virus-like
actions are detected, the suspicious file is flagged as a potential
virus, and the user is alerted.
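The idea of running commands in isolation and watching for virus-like actions can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the "program" is a list of invented commands (`write`, `copy_self`, `set_hidden`), the "virtual machine" is a plain dictionary standing in for a file system, and the name `flag_by_behaviour` is hypothetical. A real antivirus emulator works on machine instructions and is far more involved.

```python
# Toy illustration only. A "program" is a list of (operation, argument)
# tuples, and the "virtual machine" is a dictionary standing in for a
# file system, so nothing here touches the real disk.
SUSPICIOUS_ACTIONS = {"replicate", "overwrite", "hide"}


def emulate(program, virtual_fs):
    """Apply each command to the virtual file system and log what it did."""
    observed = []
    for operation, argument in program:
        if operation == "write":
            # Writing to a file that already exists counts as an overwrite.
            observed.append("overwrite" if argument in virtual_fs else "create")
            virtual_fs[argument] = "data"
        elif operation == "copy_self":
            virtual_fs[argument] = "copy of program"
            observed.append("replicate")
        elif operation == "set_hidden":
            observed.append("hide")
        else:
            observed.append("other")
    return observed


def flag_by_behaviour(program, threshold=1):
    """Return True if at least `threshold` virus-like actions were observed."""
    virtual_fs = {"report.doc": "data"}  # isolated stand-in for user files
    observed = emulate(program, virtual_fs)
    hits = [action for action in observed if action in SUSPICIOUS_ACTIONS]
    return len(hits) >= threshold
```

With these assumptions, a program that only creates a new file is not flagged, while one that copies itself or overwrites `report.doc` is.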
Another common method of heuristic analysis is for the antivirus
program to decompile the suspicious program and then analyze the
source code contained within. The source code of the suspicious file is
compared to the source code of known viruses and virus-like activities.
If a certain percentage of the source code matches the code of known
viruses or virus-like activities, the file is flagged, and the user is
alerted.
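The "certain percentage of the source code matches" test can be sketched as follows. This is one possible way to measure overlap, not the method of any particular product: the code is assumed to be already split into tokens, similarity is measured over short token sequences (n-grams), and the function names and the 50% threshold are hypothetical.

```python
def ngrams(tokens, n=3):
    """Return the set of all runs of n consecutive tokens."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def match_ratio(sample, known, n=3):
    """Fraction of the sample's n-grams that also occur in the known code."""
    sample_grams = ngrams(sample, n)
    if not sample_grams:
        return 0.0  # sample too short to compare
    return len(sample_grams & ngrams(known, n)) / len(sample_grams)


def flag_by_similarity(sample, known_viruses, threshold=0.5, n=3):
    """Flag the sample if it overlaps enough with any known virus."""
    return any(match_ratio(sample, known, n) >= threshold
               for known in known_viruses)
```

For example, the sample `["a", "b", "c", "d"]` shares one of its two 3-grams with `["a", "b", "c", "x"]`, giving a ratio of 0.5, which meets the assumed threshold.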
Effectiveness
Although heuristic analysis is capable of detecting many previously
unknown viruses and new variants of current viruses, its accuracy is
limited and it produces a notable number of false negatives. This is
because computer viruses, just like biological viruses,
are constantly changing and evolving. Since heuristic analysis mostly
operates on the basis of past experience (by comparing the suspicious
file to the code and functions of known viruses), it is likely to miss
new viruses that contain previously unknown code or methods of
operation not found in any known viruses. Fortunately, heuristic
analysis is also evolving along with the viruses. As new viruses are
discovered using alternative methods of detection, information about
them is added to the heuristic analysis engine, thereby providing it
with the means to detect new viruses based on the previously unknown
code.
How Heuristic Analysis in AVG works
Heuristic analysis uses program analysis to determine whether a program
contains constructs typical of a computer virus. The core of the AVG
heuristic analysis is an emulator of Intel processor instructions. It
is a kind of "virtual computer" that makes it possible to "run" a
program or a system operation, such as booting the operating system
from the boot sector or from the hard disk MBR.
The program emulated in the virtual computer is not directly run in any
sense of the word. The code emulator receives its individual
instructions and, in a safe way, imitates their activity so that
everything happens in the "virtual computer" and the instructions can
in no way affect the contents of the real memory shared by other
programs or by AVG itself.
The code emulator makes complex encryption or obfuscation of the
examined program's code entirely irrelevant. Whatever meaningful
activity the program would perform on the real computer, the emulator
performs as well. With a little mischievousness, let us observe a
minute's silence for virus writers who have been toiling for months
with the aim of creating "the most perfect generator of polymorphic
decryption loops", with the only result that AVG effortlessly passes
through this section of code and presents the deciphered, constant part
of the virus for subsequent checking.
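Why emulation defeats a polymorphic decryptor can be shown with a toy sketch. The instruction set (`NOP`, `LOAD_KEY`, `XOR_BODY`), the XOR cipher, and the `SIGNATURE` constant are all invented for illustration and have nothing to do with real Intel instructions or with how AVG is actually implemented. The point is only that two variants that look different on disk yield the same constant body once their decryption loops have been imitated in a private buffer.

```python
SIGNATURE = b"PAYLOAD"           # hypothetical constant part of a virus
PLAIN_BODY = b"..PAYLOAD.."      # hypothetical decrypted virus body


def make_variant(key, junk_count):
    """Build a toy polymorphic variant: junk + decryptor code, encrypted body."""
    body = bytes(b ^ key for b in PLAIN_BODY)
    code = [("NOP",)] * junk_count + [("LOAD_KEY", key), ("XOR_BODY",)]
    return code, body


def emulate(code, body):
    """Imitate the decryptor on a private copy of the body and return it."""
    memory = bytearray(body)     # virtual memory, separate from the input
    key = 0
    for instruction in code:
        if instruction[0] == "LOAD_KEY":
            key = instruction[1]
        elif instruction[0] == "XOR_BODY":
            for i in range(len(memory)):
                memory[i] ^= key
        # NOP and unknown instructions have no effect in this toy model.
    return bytes(memory)


def contains_signature(code, body):
    """Scan the emulated (decrypted) body rather than the raw file content."""
    return SIGNATURE in emulate(code, body)
```

Calling `make_variant(0x5A, 0)` and `make_variant(0x21, 3)` produces two encrypted bodies that differ from each other and contain no `SIGNATURE`, yet `contains_signature` returns `True` for both.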
The code emulation process is accompanied by the collection of
information about the "meaning" of the emulated code. By evaluating
this information, AVG tries to assess whether the activity is typical
of a harmless program or, on the contrary, of a computer virus.
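One simple way to turn collected observations into a verdict is a weighted score compared against a threshold. The sketch below is an assumption about how such an evaluation could look, not a description of AVG's internal rules: the behaviour names, the integer weights, and the threshold are all hypothetical.

```python
# Hypothetical weights: a higher number means more typical of a virus.
WEIGHTS = {
    "writes_to_executable": 5,
    "hooks_interrupt": 4,
    "decrypts_own_code": 3,
    "opens_document": 0,
}


def suspicion_score(observed):
    """Sum the weights of the distinct behaviours seen during emulation."""
    return sum(WEIGHTS.get(behaviour, 0) for behaviour in set(observed))


def verdict(observed, threshold=6):
    """Classify the emulated code by comparing its score to a threshold."""
    return "suspicious" if suspicion_score(observed) >= threshold else "harmless"
```

Under these assumed weights, a program that only decrypts its own code stays below the threshold, while one that also writes to an executable file is reported as suspicious. Raising the threshold would reduce false alarms at the cost of missing more viruses.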
Heuristic analysis benefits and drawbacks
The main benefit of the heuristic analysis is the ability to detect new
viruses even before Grisoft can get hold of them and update AVG with
information for their detection. Another way to capture new viruses is
the Integrity Check. However, this method, based on monitoring and
evaluating the changes of programs and computer system areas, can
capture the virus only after its intrusion into the protected computer,
and this can often be far too late. It is only the heuristic analysis
that is able to detect a new virus before it gets a chance to do any
harm.
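For contrast, the integrity check mentioned above can be sketched as a comparison of file hashes against a stored baseline. This is a generic illustration using SHA-256 from the Python standard library, not the actual AVG Integrity Check; files are modelled as a dictionary of names to byte strings, and the function names are hypothetical.

```python
import hashlib


def snapshot(files):
    """Record a SHA-256 digest for every file (name -> bytes content)."""
    return {name: hashlib.sha256(content).hexdigest()
            for name, content in files.items()}


def changed_files(baseline, files):
    """List files whose digest differs from the baseline, including new files."""
    current = snapshot(files)
    return sorted(name for name, digest in current.items()
                  if baseline.get(name) != digest)
```

A change shows up only once the file has already been modified, which is why this method can report an infection only after the fact.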
In our world, there is a price to be paid for every benefit, and the
heuristic analysis is no exception. Its major drawback is the potential
for false alarms, in which the heuristic analysis labels as
suspicious a program that contains no virus at all. However small the
likelihood of such an event may be, it cannot be completely ruled out.
However, we believe that this drawback is richly compensated by the
heuristic analysis benefits.
Limits
Understanding the principles of heuristic analysis also means
understanding its limits. Primarily, the heuristic analysis is unable
to detect viruses programmed in high-level programming languages (C,
Pascal, Basic, etc.). The start-up code and the libraries used by these
languages are extensive, and even if there were no technical obstacle
impeding an in-depth emulation of such a program, processing a single
file would take hours longer than would be tolerable.
The other limitation of the heuristic analysis is that it can process
only specific types of objects prone to infection. The heuristic
analysis requires very detailed information about the examined object.
For programs, this includes knowledge of all the microprocessor
instructions (including undocumented instructions and those not
officially supported by the manufacturer), knowledge of the way
the program is loaded into memory at its startup, outline of services
provided by the operating system (naturally, including undocumented
services or the services dedicated for the internal needs of the
operating system authors) and information about the exact meaning of
some reserved computer memory areas used by the BIOS or operating
system. When a new kind of object prone to infection appears, all the
information specified above must be collected, and the heuristic
analysis must be modified so that it is able to analyze it. At worst
(for example, for macro viruses), a completely new analysis must be
written.
Heuristic Analysis Results
Note that the heuristic analysis is not a method for detecting 100% of
all known or unknown viruses. It is a complementary method that
significantly increases your chances of capturing a new virus. No more,
no less. Our tests show that the heuristic analysis is able to detect
over 70% of the existing file viruses and boot viruses, while the
number of false alarms is insignificant.