Data Security: Resource



 By Chris Gudy
Use MD5 to Identify Spyware File
In pure random mode, there is no relationship among the names of files that belong to the same spyware and run on the same computer. If one file wants to access another, it cannot identify its target by name and must identify it by content instead. Identifying a file by its content is certainly more complicated than identifying it by name. Spyware usually relies on MD5, a well-known file security technology, to check by content whether a file is the one it wants.
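As a rough illustration of looking a file up by content rather than by name, the sketch below scans a directory for files whose MD5 digest equals a known value. The function name `find_by_content` and its parameters are hypothetical, chosen only for this example, and the sketch reads each file fully into memory for brevity.

```python
import hashlib
from pathlib import Path


def find_by_content(directory, wanted_md5):
    """Return the files in `directory` whose MD5 hex digest equals `wanted_md5`.

    Hypothetical helper for illustration: file names are ignored entirely,
    only the content decides whether a file matches.
    """
    wanted = wanted_md5.lower()
    matches = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        # Reads the whole file at once; acceptable for a small sketch.
        digest = hashlib.md5(path.read_bytes()).hexdigest()
        if digest == wanted:
            matches.append(path)
    return matches
```

A caller would pass the directory to scan and the 32-character hexadecimal digest it is looking for; any file with that content is returned, whatever it is called.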

MD5 was invented by Professor Ronald L. Rivest of MIT, and RFC 1321 documents the algorithm in detail. Simply stated, the MD5 algorithm takes a message of arbitrary length as input and produces a 128-bit “fingerprint” or “message digest” of that message as output. It was conjectured that it is computationally infeasible to produce two messages having the same message digest, or to produce any message having a given pre-specified target message digest. In other words, MD5 is intended to give different files different fingerprints, regardless of their names or other properties.
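To make the idea of a 128-bit fingerprint concrete, here is a minimal sketch using `hashlib` from the Python standard library. The helper names `md5_of_bytes` and `md5_of_file` are assumptions made for this example. The file version reads in chunks so that large files need not fit in memory.

```python
import hashlib


def md5_of_bytes(data: bytes) -> str:
    """Return the MD5 digest of `data` as 32 hexadecimal characters (128 bits)."""
    return hashlib.md5(data).hexdigest()


def md5_of_file(path, chunk_size: int = 65536) -> str:
    """Return the MD5 hex digest of a file's content, read chunk by chunk."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # "abc" is one of the test messages listed in RFC 1321.
    print(md5_of_bytes(b"abc"))  # 900150983cd24fb0d6963f7d28e17f72
```

The digest depends only on the bytes that are hashed, so renaming a file leaves its fingerprint unchanged.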

Originally, the MD5 algorithm was intended for digital signature applications, where a large file must be “compressed” in a secure manner before being encrypted with a private (secret) key under a public-key cryptosystem such as RSA. In essence, MD5 is a way to verify data integrity, and it is much more reliable than a checksum and many other commonly used methods.

Nevertheless, MD5 is not perfect in practice, even though it is much better than a checksum. Different contents can produce the same message digest, and several collision weaknesses have been published in recent years. An ideal message digest algorithm would never generate the same digest for two different inputs, but achieving such theoretical perfection would require a message digest as long as the input file. A practical MD5 digest has only 128 bits.
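The link between digest length and collisions can be seen on a small scale. The sketch below is not an attack on MD5. It deliberately keeps only the first two bytes (16 bits) of each digest, an assumption made purely for demonstration, and searches for two different inputs that share that shortened fingerprint. With only 65,536 possible values, a match is guaranteed within 65,537 distinct inputs. The function names are hypothetical.

```python
import hashlib


def truncated_md5(data: bytes, n_bytes: int = 2) -> bytes:
    """Return only the first `n_bytes` bytes of the MD5 digest of `data`."""
    return hashlib.md5(data).digest()[:n_bytes]


def find_truncated_collision(n_bytes: int = 2):
    """Find two different messages with the same truncated digest.

    Messages are the decimal strings "0", "1", "2", ... so every message
    is distinct; the loop stops at the first repeated truncated digest.
    """
    seen = {}
    counter = 0
    while True:
        message = str(counter).encode("ascii")
        tag = truncated_md5(message, n_bytes)
        if tag in seen:
            return seen[tag], message
        seen[tag] = message
        counter += 1


if __name__ == "__main__":
    first, second = find_truncated_collision()
    print(first, second, truncated_md5(first).hex())
```

The same counting argument applies to the full 128-bit digest: there are more possible files than possible digests, so collisions must exist, although finding one by this kind of blind search is far out of reach.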

It sounds a little ironic for spyware to use MD5, an information security technology, but it is true. When spyware wants to identify a file with a purely randomized name by its content, MD5 is usually the first choice, for the following reasons.
  • Basically, MD5 is a reliable method that has been tested by many excellent experts.
  • There is plenty of executable code and documentation for MD5 on the Internet. A qualified programmer faces no real challenge in using it in software.
  • Although MD5 occasionally fails, spyware is rarely provably perfect either. In the world of spyware, as in the world of spies, danger always exists. As a matter of fact, the probability that MD5 fails is far lower than that of the alternatives.
  • Finally, MD5 is the most efficient of the algorithms with comparable reliability. Although spyware authors do not care much about efficiency, a high-quality algorithm always helps the whole system.

Therefore, if you find two files that have the same MD5 digest but different names, they are 100 percent spyware or other malware. If not, why would their creator want to hide them?
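One way to look for such files is to group everything under a directory by MD5 digest and report the digests shared by more than one file. The following is a minimal sketch under that assumption. The name `same_content_groups` is hypothetical, and each file is read fully into memory for brevity.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def same_content_groups(directory):
    """Map each MD5 hex digest to the files sharing it, for digests seen more than once.

    Walks `directory` recursively and ignores file names when comparing.
    """
    groups = defaultdict(list)
    for path in sorted(Path(directory).rglob("*")):
        if path.is_file():
            digest = hashlib.md5(path.read_bytes()).hexdigest()
            groups[digest].append(path)
    # Keep only digests that occur under at least two paths.
    return {digest: paths for digest, paths in groups.items() if len(paths) > 1}


if __name__ == "__main__":
    for digest, paths in same_content_groups(".").items():
        print(digest, [str(p) for p in paths])
```

Each entry in the result lists files with identical digests, which are the candidates this article suggests examining.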