OpenDNS, a cloud security company, recently announced NLPRank, a prototype tool that uses natural language processing to automatically detect malicious domains (phishing sites) and attacks on high-value targets in real time.
So-called malicious (cybersquatting) domain names are typically used for phishing sites: the domain is spelled to resemble a well-known website that users are familiar with. After registering such a domain, cybercriminals build a site that closely imitates the well-known one. A user who mistypes a few letters of the real address (for example, G00gle.com) lands on the phishing site and, because the interface looks almost identical, does not notice the difference and goes on to enter account passwords and other sensitive information, which leads to a privacy leak. Some attackers also exploit users' concern for security by sending prompts for security updates through various channels, with links that use domain names closely associated with well-known websites (such as adobeupdates[.]com), to trick users into following them.
Traditional security software works after the fact. There are too many domains to collect the malicious ones in advance, so a domain can usually be identified as a threat only after a victim reports it. OpenDNS engineers instead exploited the fact that these malicious domains are deliberately made to resemble well-known websites. They applied natural language processing techniques previously used in bioinformatics and data mining, combined them with ASN mapping and weighting, WHOIS data patterns, and HTML tag analysis, and added data from OpenDNS's global network. The result is NLPRank, a prototype tool that identifies malicious domains in real time.
Jeremiah O'Connor, an OpenDNS researcher, began by analyzing the attack methods and data of two cybercrime groups: DarkHotel and APT1 (the group documented by Mandiant). He found that both relied on similar techniques, namely phishing attacks. After examining the data from these groups, he also found that their phishing sites used domain names that followed some of the same patterns, which led to the idea for NLPRank.
The real-time detection model includes a dictionary of popular legitimate names (such as "java", "gmail", and "adobe") that phishing sites often imitate, along with a list of English words commonly seen in phishing campaigns (such as "install", "update", and "download"). It then applies sequence alignment techniques from bioinformatics to rank domain names such as "install-ad0be" and estimate how likely they are to be used in a phishing campaign. For example, if a domain name resembles that of a well-known website, NLPRank compares the domain's IP address with the IP ranges associated with the well-known website's domain. If the address falls outside those ranges, the domain is more likely to be a phishing site.
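The scoring steps described above can be illustrated with a minimal Python sketch. It is not OpenDNS's implementation: Levenshtein edit distance stands in for the sequence alignment technique, and the dictionaries, the `BRAND_NETWORKS` table (which uses documentation-only addresses), the `max_distance` threshold, and all function names are assumptions made for illustration.

```python
import ipaddress
import re

# Hypothetical dictionaries; the article lists only a few sample entries.
BRANDS = ["java", "gmail", "adobe", "google"]
PHISH_WORDS = ["install", "update", "download"]

# Hypothetical IP ranges per brand (documentation-only addresses, not real data).
BRAND_NETWORKS = {
    "adobe": [ipaddress.ip_network("192.0.2.0/24")],
}


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, used here as a simple stand-in for sequence alignment."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def closest_brand(token: str):
    """Return (brand, distance) for the dictionary brand nearest to the token."""
    return min(((b, edit_distance(token, b)) for b in BRANDS), key=lambda p: p[1])


def score_domain(domain: str, ip: str, max_distance: int = 1) -> dict:
    """Flag a domain that resembles a brand but resolves outside the brand's ranges."""
    label = domain.lower().split(".")[0]
    # Record the phishing-style words, then cut them out of the label.
    lure_words = [w for w in PHISH_WORDS if w in label]
    for w in lure_words:
        label = label.replace(w, "-")
    tokens = [t for t in re.split(r"[^a-z0-9]+", label) if t]
    if not tokens:
        return {"brand": None, "distance": None,
                "lure_words": lure_words, "suspicious": False}
    # Keep the token that is closest to any brand in the dictionary.
    brand, distance = min((closest_brand(t) for t in tokens), key=lambda p: p[1])
    looks_like_brand = distance <= max_distance
    address = ipaddress.ip_address(ip)
    in_brand_range = any(address in net for net in BRAND_NETWORKS.get(brand, []))
    return {"brand": brand, "distance": distance, "lure_words": lure_words,
            "suspicious": looks_like_brand and not in_brand_range}


if __name__ == "__main__":
    print(score_domain("install-ad0be.com", "203.0.113.7"))
    print(score_domain("adobe.com", "192.0.2.10"))
```

In this sketch, "install-ad0be.com" is one substitution away from "adobe" and resolves outside the assumed Adobe range, so it is flagged, while "adobe.com" on an in-range address is not. A real system would weight these signals and combine them with the ASN, WHOIS, and HTML features mentioned earlier, rather than apply a single threshold.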
Real-time detection based on natural language processing appears to be a relatively new approach. Beyond working in real time, it also makes life harder for phishers: if a phishing site picks a name that looks less like a well-known website in order to evade detection, users are also less likely to be deceived by that domain name.