Demystifying Domain Generation Algorithms 

February 6, 2024

One of the easiest ways for Security Operations Centers (SOCs) to detect and block malware, ransomware, and phishing is to block known Indicators of Compromise (IoCs). Some of these, such as file names and hash values, are used in endpoint protection software such as anti-virus. At the infrastructure layers, IoCs can be domains, Fully-Qualified Domain Names (FQDNs), IP addresses, and network blocks. 

Most SOCs block IoCs proactively based on threat intelligence or reactively as part of incident response when they have an endpoint that has been compromised. By looking at the network traffic coming from an infected endpoint, response teams can rapidly block any related activity for the entire enterprise. 

SOC operators and Cyber Threat Intelligence (CTI) teams began sharing these IoCs with each other informally, through written advisories, or via an Information Sharing and Analysis Center (ISAC) or equivalent group. Over time, this analysis, information sharing, and automated domain blocking evolved into a solution called Protective DNS. 

Malware writers realized they needed a way to add variation to their malware, delivery systems, and Command and Control (C2) to evade being identified and blocked via simple IoCs. Out of these efforts, Domain Generation Algorithms (DGAs) were born. 

Malware starts using domain generation algorithms. 

DGAs are methods used in malware to periodically generate many seemingly random domain names. By implementing a DGA in their malware and registering some of the domains that it creates, cybercriminals can create thousands of potential domains for their command-and-control servers, making it challenging for security systems to predict which domain the malware will communicate with next. 

The primary aim of a DGA is to ensure that a malware’s network communication can evade blocking and detection and maintain an open line to its C2. This is achieved by generating vast quantities of domain names that a botnet can communicate with, often daily. As a result, traditional approaches to blocking malicious domains, such as blacklisting, become less effective due to the sheer volume of potential domains to monitor. 

MITRE ATT&CK and DGAs. 

MITRE ATT&CK is a large repository of information about cybercriminals, their tools, and their typical attack processes. Technique T1568.002, Dynamic Resolution: Domain Generation Algorithms, describes the use of DGAs by malware for C2.

ATT&CK has led to a better understanding of attacker behavior and capabilities and serves as a core resource for building better detections and adversarial simulation, or purple teaming, to identify areas where SOC systems don’t have good visibility. 

Looking at DGAs. 

Let’s look at some DGA examples to understand more about them and what their domains look like. 

Most DGAs have some kind of variable that is used by the algorithm as input such as: 

  • Hard-coded seed variable  
  • Previous DGA domain 
  • Current or future date  
  • Initialization string (“magic” value) 
  • Text of a URL (Uniform Resource Locator) 
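
To make the role of these inputs concrete, here is a minimal sketch of a toy generator that combines a hard-coded seed with a date. It is not the algorithm of any real malware family: the function name `toy_dga`, the hashing scheme, and the reserved `.example` TLD are all assumptions chosen for illustration.

```python
import hashlib
from datetime import date


def toy_dga(seed: str, day: date, count: int = 5,
            length: int = 12, tld: str = ".example") -> list[str]:
    """Derive `count` pseudo-random domain names from a seed and a date.

    Illustrative only: real DGAs use their own, family-specific logic.
    """
    domains = []
    for i in range(count):
        # The same seed, date, and index always yield the same bytes.
        material = f"{seed}|{day.isoformat()}|{i}".encode()
        digest = hashlib.sha256(material).hexdigest()
        # Map each hex digit (0-15) onto the letters a-p.
        label = "".join(chr(ord("a") + int(c, 16)) for c in digest[:length])
        domains.append(label + tld)
    return domains


if __name__ == "__main__":
    for name in toy_dga("hypothetical-seed", date(2024, 2, 6)):
        print(name)
```

Because the output depends only on the seed and the date, the malware and its operator can compute the same list independently. So can a defender who has recovered the seed.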

Bumblebee 

Bumblebee generates random alphanumeric domains that are 11 characters long in the .life TLD, such as: 

cmid1s1zeiu.life  

itszko2ot5u.life  

3v1n35i5kwx.life  

newdnq1xnl9.life  

jkyj6awt1ao.life  

ddrjv6y42b8.life  
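
The shape of these samples (11 lowercase letters or digits followed by .life) can be checked with a simple pattern. The sketch below assumes that shape holds for every generated domain, which the samples suggest but do not prove. The name `looks_like_bumblebee` is hypothetical.

```python
import re

# Exactly 11 lowercase letters or digits, then the .life TLD.
BUMBLEBEE_LIKE = re.compile(r"[a-z0-9]{11}\.life")


def looks_like_bumblebee(domain: str) -> bool:
    """Return True if `domain` matches the shape of the samples above."""
    normalized = domain.strip().lower().rstrip(".")
    return BUMBLEBEE_LIKE.fullmatch(normalized) is not None
```

A legitimate 11-character .life domain would also match, so a hit is a reason to look closer and not a verdict on its own.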

Gozi 

Gozi uses a file, available via a URL, as a dictionary or word list. The dictionaries are lengthy documents like the US Constitution or the GNU General Public License. In the examples below, the algorithm uses Martin Luther’s 95 Theses, which is in Latin and German and has an English introduction and copyright. The algorithm selects words from the dictionary and concatenates them up to a maximum length. For example, here are some domains generated by Gozi: 

Quodpresidentemaxsagit.com <= Quod presidente max sagit  

Pertantumfitusu.com <= Per tantum fit usu  

Indulgentiarumlicet.com <= indulgentiarum licet  

Moriblasphemianegocii.com <= mori blasphemia negocii  

Tribueretnossetmortes.com <= Tribueret nosset mortes  

Nonsicordinario.com <= non sic ordinario  
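
The general idea of word-list concatenation can be sketched as follows. This is not Gozi's actual selection logic: the function `toy_wordlist_domain`, the use of Python's `random` module, the 24-character limit, and the reserved `.example` TLD are assumptions for illustration.

```python
import random


def toy_wordlist_domain(words: list[str], seed: int,
                        max_len: int = 24, tld: str = ".example") -> str:
    """Concatenate pseudo-randomly chosen words until `max_len` would be exceeded."""
    # Keep only purely alphabetic words that can fit in the label at all.
    usable = [w.lower() for w in words if w.isalpha() and len(w) <= max_len]
    if not usable:
        raise ValueError("word list has no usable words")
    rng = random.Random(seed)  # same seed, same sequence of choices
    label = ""
    while True:
        word = rng.choice(usable)
        if len(label) + len(word) > max_len:
            break
        label += word
    return label + tld


if __name__ == "__main__":
    latin = ["quod", "per", "tantum", "fit", "usu", "licet", "mori"]
    print(toy_wordlist_domain(latin, seed=1))
```

Domains built this way are made of real words, so they tend to look less random than the Bumblebee samples and are harder to flag on letter statistics alone.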

Qakbot 

Qakbot generates domains that have a higher frequency of the letters q, x, and z, such as: 

bqkrtxgkmriwsiwcngtivpx.info  

jdtmfupdyueqeldvhsjzdvzob.net  

guhmpoxzivhba.com  

nqqxqhuacaqhzurde.org  

lgqsqgpqzijwid.info  

ykolyecdcyk.biz  
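
One way to put a number on this tendency is to measure what share of a domain's letters are q, x, or z. The helper below is a hypothetical sketch. It looks only at the leftmost label, which assumes the input is a bare registered domain like the samples above.

```python
def rare_letter_share(domain: str, rare: str = "qxz") -> float:
    """Return the fraction of letters in the leftmost label that are in `rare`."""
    label = domain.lower().rstrip(".").split(".")[0]
    letters = [c for c in label if c.isalpha()]
    if not letters:
        return 0.0
    return sum(c in rare for c in letters) / len(letters)


if __name__ == "__main__":
    for name in ("nqqxqhuacaqhzurde.org", "lgqsqgpqzijwid.info", "wikipedia.org"):
        print(f"{name}: {rare_letter_share(name):.2f}")
```

Any threshold applied to this score would need tuning against known-good traffic, since some legitimate names also contain these letters.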

Detecting DGAs. 

Despite the potential for harm, DGAs have inherent weaknesses that can be exploited by SOC operators and CTI teams. Many DGA-based malware families generate nonsensical, random-looking domain names. A SOC operator can often spot these by eye, but manual review does not work at line speed or at scale. 

Because the domains are generated algorithmically, they often exhibit predictable patterns. These patterns can be analyzed automatically using features such as letter frequency, domain length, and choice of top-level domain (TLD). Such features are well suited to Artificial Intelligence (AI) and Machine Learning (ML) approaches. 
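
As a minimal sketch of feature extraction, the function below computes a domain's length, its TLD, and the Shannon entropy of its characters. Entropy is used here as one possible summary of letter frequency, which is an assumption and not a method named in this post. The function name `domain_features` is also hypothetical.

```python
import math
from collections import Counter


def domain_features(domain: str) -> dict:
    """Return simple features for the part of `domain` left of the TLD."""
    name = domain.lower().rstrip(".")
    # Everything before the last dot is treated as the label.
    label, _, tld = name.rpartition(".")
    total = len(label)
    counts = Counter(label)
    # Shannon entropy in bits per character; 0.0 for an empty label.
    entropy = 0.0
    if total:
        entropy = -sum((n / total) * math.log2(n / total)
                       for n in counts.values())
    return {"length": total, "entropy": entropy, "tld": tld}


if __name__ == "__main__":
    for name in ("bqkrtxgkmriwsiwcngtivpx.info", "example.com"):
        print(name, domain_features(name))
```

Features like these would normally feed a trained classifier. None of them is a reliable signal on its own.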

Researchers have reverse engineered numerous DGAs and reimplemented them outside of the malware. This allows CTI teams to generate lists of DGA candidate domains, block suspect domains, and perform threat hunting in their logs. 
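
Threat hunting with a candidate list can be as simple as a set lookup over DNS logs. The sketch below assumes the logs have already been parsed into (client, queried name) pairs. That record layout and the function name `hunt` are hypothetical.

```python
def hunt(candidates, dns_log):
    """Return the (client, qname) records whose name is in `candidates`."""
    # Normalize case and any trailing dot so lookups are exact.
    wanted = {d.lower().rstrip(".") for d in candidates}
    hits = []
    for client, qname in dns_log:
        if qname.lower().rstrip(".") in wanted:
            hits.append((client, qname))
    return hits


if __name__ == "__main__":
    candidates = ["cmid1s1zeiu.life", "itszko2ot5u.life"]
    log = [
        ("192.0.2.10", "www.example.com"),
        ("192.0.2.23", "CMID1S1ZEIU.life."),
    ]
    print(hunt(candidates, log))
```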

It is important to note that despite the use of a DGA by malware, it still requires infrastructure for its C2 operations. This implies that CTI teams have access to various other data points such as IP reputation, hosting details, registrar information, and more. 

MITRE ATT&CK lists two mitigations for DGAs: 

  • Network intrusion prevention: Network intrusion detection and prevention systems can help mitigate malicious activity at the network level by using network signatures to identify adversary malware. However, reversing malware variants that use DGAs and extracting seed values to determine future-generated domains can be a time-consuming and resource-intensive process. Additionally, the sheer volume of domains generated per day makes it impractical for defenders to preemptively register all possible C2 domains. 
  • Restrict web-based content: A local DNS sinkhole, such as provided by a Protective DNS solution, can be employed to prevent DGA-based command and control, offering cost-effective protection in certain scenarios. 

MITRE ATT&CK also lists network traffic flow as a data source for detecting DGAs. Detecting dynamically generated domains can be challenging due to the evolving nature of malware and the complexity of algorithms. Various approaches, such as frequency analysis and machine learning techniques, have been used to detect these domains. Additionally, checking for recently registered or rarely visited domains can help identify suspicious activity. Overall, a combination of methods is necessary to effectively detect and classify DGA-generated domains. 

Protective DNS to combat DGAs. 

Protective DNS solutions are an effective line of defense against DGAs. By analyzing DNS request data at large scale, they can identify DGA domains from patterns in DNS queries, such as a high volume of non-existent domain (NXDOMAIN) responses. Once a potentially malicious pattern is detected, the Protective DNS solution can block further requests to the associated domains. This proactive approach can interrupt the malware’s command and control communications, thereby stopping its harmful operations, and it makes Protective DNS an integral part of a robust cybersecurity strategy. 
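
The NXDOMAIN signal mentioned above can be illustrated with a per-client ratio. This is a simplified sketch, not how any particular Protective DNS product works. The (client, response code) record layout, the function name, and both thresholds are assumptions.

```python
from collections import defaultdict


def high_nxdomain_clients(records, min_queries: int = 20,
                          ratio: float = 0.5) -> list[str]:
    """Return clients whose share of NXDOMAIN responses is at least `ratio`."""
    totals = defaultdict(int)
    nxdomain = defaultdict(int)
    for client, rcode in records:
        totals[client] += 1
        if rcode == "NXDOMAIN":
            nxdomain[client] += 1
    # Skip clients with too few queries to judge.
    return sorted(
        client for client, total in totals.items()
        if total >= min_queries and nxdomain[client] / total >= ratio
    )


if __name__ == "__main__":
    sample = ([("192.0.2.5", "NXDOMAIN")] * 30
              + [("192.0.2.5", "NOERROR")] * 10
              + [("192.0.2.9", "NOERROR")] * 40)
    print(high_nxdomain_clients(sample))
```

A client running DGA malware tends to stand out because most generated domains are never registered, so most of its lookups fail.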

Protective DNS offers many advantages when it comes to detecting and blocking DGAs: 

  • Early Blocking is Cost-Effective: Nearly every online interaction begins with a DNS query, and that query happens before the malware can connect to its delivery or C2 infrastructure. This makes Protective DNS one of the most cost-effective ways to detect and block malware and ransomware that use DGAs. 
  • Integration with Cyber Threat Intelligence: Protective DNS solutions ingest a large amount of CTI feeds as seed data for AI and ML analysis. 
  • Ease of implementation: Since the Protective DNS infrastructure (feeds, AI, etc.) is already built, an organization can easily forward DNS queries from its on-network resolvers to be protected. 
  • Economy of Scale: By doing computationally intensive analysis once and then blocking any number of times, Protective DNS overcomes some of the problems associated with the volume of domains generated by DGAs. 
  • Law of Large Numbers: By receiving more DNS queries across various organizations, Protective DNS has more data to analyze. This leads to increased effectiveness in detection and blocking of malware using DGAs. 

The importance of Protective DNS 

As we have discussed in this blog post, Domain Generation Algorithms present an ongoing challenge for Security Operations Centers and other Blue Team operators. Their complexity and adaptability require an ever-evolving approach to defense, informed by Cyber Threat Intelligence. It is paramount for professionals in this field to stay abreast of the latest developments and strategies to counter these threats, and one such strategy is the use of a Protective DNS solution. Remember, every bit of knowledge gained is another step forward in the fight against cybercrime. 

To learn more about our Protective DNS solution, visit our product page.  

Published On: February 6, 2024
Last Updated: March 14, 2024
