Stop the Scammers. Detection of Homoglyph Attack Attempt using KQL (Kusto Query Language)!

Phishing attempts are getting sneakier, often leveraging homoglyph attacks or unusual characters to trick employees.

I put together a simple but effective query that scans for newly created users with "weird" characters in the email domain, which can be a sign of a spoofed or malicious account creation attempt.

KQL Breakdown:

This query scans 7 days of CloudAppEvents for the `Create user.` action, then checks the new user's email domain for any non-ASCII characters (anything outside the ASCII range U+0000 to U+007F).
This is a great starting point for spotting internationalized domain name (IDN) abuse or other sophisticated L3 attacks.

CloudAppEvents
| where TimeGenerated > ago(7d)
| where ActionType == "Create user."
| extend Email = tostring(parse_json(RawEventData).EmailAddress)
| extend Domain = tostring(split(Email,"@")[1])
| where Domain matches regex @"[^\u0000-\u007F]"
| project TimeGenerated, AccountDisplayName, Email, Domain

Here is a sample response you might get for this query:

[
  {
    "TimeGenerated": "2025-11-28T10:15:30Z",
    "AccountDisplayName": "System Administrator",
    "Email": "security_alert@microsøft.com",
    "Domain": "microsøft.com"
  },
  {
    "TimeGenerated": "2025-11-27T08:20:15Z",
    "AccountDisplayName": "IT Helpdesk",
    "Email": "support_id@citißank.com",
    "Domain": "citißank.com"
  }
]


Why these are flagged:
Row 1: The 'o' in 'microsoft' has been replaced with 'ø' (U+00F8, Latin small letter o with stroke).
Row 2: The 'b' in 'citibank' has been replaced with the German sharp s, or eszett, 'ß' (U+00DF), which resembles a capital 'B'.

Both of these non-ASCII characters satisfy the regular expression condition I added to the query:
Domain matches regex @"[^\u0000-\u007F]"
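
To check the pattern before pointing it at real data, a small test can be run against made-up values. The sketch below is only an illustration: the rows are sample strings, `xn--hypothetical.com` is a placeholder rather than a real encoded name, and `[^\x00-\x7F]` is simply another spelling of the same ASCII range used in the query. It also flags the `xn--` prefix, because an IDN that arrives in its Punycode form is pure ASCII and would slip past the non-ASCII check. Whether your logs store domains in Unicode or Punycode form is something to verify in your own tenant.

```kql
// Made-up sample domains for testing the pattern; this is not real telemetry.
datatable(Domain: string)
[
    "microsoft.com",         // plain ASCII
    "microsøft.com",         // contains U+00F8
    "citißank.com",          // contains U+00DF
    "xn--hypothetical.com"   // placeholder for a Punycode-encoded label (pure ASCII)
]
// Same range as the article's pattern, written with \x escapes.
| extend HasNonAscii = Domain matches regex @"[^\x00-\x7F]"
// Punycode labels start with "xn--", either at the start or after a dot.
| extend LooksLikePunycode = Domain startswith "xn--" or Domain contains ".xn--"
```

Rows where either flag is true are worth a closer look; a row where both are false is plain ASCII with no encoded label.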

Whatever query language your company uses, you can run similar queries and build detection rules on top of them.
Now, let’s run another query for a more realistic scenario…
Account creation is something you can validate and block at the moment the user is created.
But emails…
Homoglyph attacks arrive primarily by email.
Assuming the whole company uses Office 365,
we can run the following query, which scans the email data of the entire company.


EmailEvents
| where TimeGenerated > ago(7d)
| extend SenderDomain = tostring(split(SenderMailFromAddress, "@")[1])
| where SenderDomain matches regex @"[^\u0000-\u007F]"
| project TimeGenerated, RecipientEmailAddress, SenderMailFromAddress, Subject, DeliveryAction

The EmailEvents table is a core component of Microsoft Defender for Office 365 (MDO) Advanced Hunting (part of Microsoft Defender XDR).
The table is designed to ingest and store all email processing events for all mailboxes
protected by MDO within the Microsoft 365 tenant.
This includes Inbound, Outbound, and Intra-org (internal) email traffic.
The queries here use TimeGenerated, as the table appears in Microsoft Sentinel; in Defender XDR advanced hunting the same column is named Timestamp.
Since we didn’t include a filter on RecipientEmailAddress
or the EmailDirection column, the query searches the entire available dataset
(the last 7 days of email events in this sample).

[
  {
    "TimeGenerated": "2025-11-29T11:30:00Z",
    "RecipientEmailAddress": "alice.user@contoso.com",
    "SenderMailFromAddress": "security_update@microsøft.com",
    "Subject": "Action Required: Your password has expired",
    "DeliveryAction": "Delivered"
  },
  {
    "TimeGenerated": "2025-11-28T09:10:00Z",
    "RecipientEmailAddress": "ceo@contoso.com",
    "SenderMailFromAddress": "invoice_team@citißank.com",
    "Subject": "Overdue Invoice #90210",
    "DeliveryAction": "Blocked"
  }
]
This query catches suspicious emails regardless of direction,
but most homoglyph attacks are inbound.
We can make the query stricter and more efficient
by adding a filter for inbound emails.

EmailEvents
| where TimeGenerated > ago(7d)
| where EmailDirection == "Inbound" // <--- Explicitly filter for emails coming INTO your org
| extend SenderDomain = tostring(split(SenderMailFromAddress, "@")[1])
| where SenderDomain matches regex @"[^\u0000-\u007F]"
| project TimeGenerated, RecipientEmailAddress, SenderMailFromAddress, Subject, DeliveryAction
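
The split on "@" may not be necessary, assuming the EmailEvents schema in your workspace exposes the documented SenderMailFromDomain and SenderFromDomain columns. A possible variation, offered as a sketch rather than a drop-in rule, checks both the envelope sender and the From header, since the From header is the address a recipient usually sees:

```kql
// Sketch only: assumes EmailEvents exposes SenderMailFromDomain (envelope sender)
// and SenderFromDomain / SenderFromAddress (From header shown to the recipient).
EmailEvents
| where TimeGenerated > ago(7d)
| where EmailDirection == "Inbound"
// Flag the message if either domain contains a character outside the ASCII range.
| where SenderMailFromDomain matches regex @"[^\x00-\x7F]"
    or SenderFromDomain matches regex @"[^\x00-\x7F]"
| project TimeGenerated, RecipientEmailAddress, SenderMailFromAddress, SenderFromAddress, Subject, DeliveryAction
```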

But hold on… Wait a minute…
You can’t just run big, costly queries in real life.
The priority should always be the most optimized query possible.
We still need to filter the massive dataset before applying costly string operations
like matches regex.
The order of operations in KQL is critical for performance.
If the organization is very big, and in most cases it is,
you shouldn’t run the previous queries as they are.
There are different strategies for running queries in a more performant way.
One of them, in my opinion, is splitting by user groups.
It is a good way to balance the need for comprehensive threat detection
against the performance and cost of running a massive query across
thousands of employees.
Running the query in smaller batches by user group (or role)
is an effective optimization strategy.

We can also reduce the time window from 7 days to 1 or 2 days, or even down to seconds.
We can identify the high-risk groups first
and split the users into different tiers:
Tier 1: High-Value Targets (HVT): Executives (CEO, CFO, CTO), Legal Team, Finance Team, System Administrators.
Tier 2: Elevated Access: IT Support, HR, Department Managers.
Tier 3: General Population: All other employees. (You can still split them by country, department, length of employment, privileges, etc.)
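
One way to express the tiers, sketched under assumptions: the HighRiskUsers watchlist is given an extra Tier column (hypothetical, with values such as "Tier1"), and each tier gets its own lookback window. TargetTier and Lookback are illustrative names, not part of any built-in schema.

```kql
// Hypothetical parameters: change them per scheduled rule or per manual run.
let TargetTier = "Tier1";
let Lookback = 1h;
// Assumes the 'HighRiskUsers' watchlist has UserEmail and Tier columns (Tier is an assumption).
let TierRecipients = _GetWatchlist('HighRiskUsers')
    | where tostring(Tier) == TargetTier
    | project UserEmail = tostring(UserEmail);
EmailEvents
| where TimeGenerated > ago(Lookback)
| where EmailDirection == "Inbound"
// Cheap, case-insensitive membership filter that runs before the regex.
| where RecipientEmailAddress in~ (TierRecipients)
| extend SenderDomain = tostring(split(SenderMailFromAddress, "@")[1])
| where SenderDomain matches regex @"[^\x00-\x7F]"
| project TimeGenerated, RecipientEmailAddress, SenderMailFromAddress, Subject, DeliveryAction
```

Tier 1 could then run frequently with a short window, while Tier 3 runs less often with a longer one.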

And last but not least,
we can use watchlists to exclude false positives,
skipping the legitimate non-ASCII domains if we have any…
Here is how we can use a Microsoft Sentinel watchlist
of safe international domains…

let SafeList = _GetWatchlist('SafeInternationalDomains') | project DomainName;
EmailEvents
| where TimeGenerated > ago(7d) 
| where EmailDirection == "Inbound"
| extend SenderDomain = tostring(split(SenderMailFromAddress, "@")[1])
| where SenderDomain !in (SafeList)
| where SenderDomain matches regex @"[^\u0000-\u007F]"
| project TimeGenerated, RecipientEmailAddress, SenderMailFromAddress, Subject, DeliveryAction

The Optimized Query:
This query joins the HighRiskUsers watchlist with the EmailEvents table before applying the expensive regex operation,
which drastically reduces the amount of data the regex has to process.

let HighRiskRecipients = _GetWatchlist('HighRiskUsers') | project UserEmail;
EmailEvents
| where TimeGenerated > ago(1d) 
| where EmailDirection == "Inbound"
// 1. Filter the entire dataset to only include High-Risk Recipients
| join kind=inner (HighRiskRecipients) on $left.RecipientEmailAddress == $right.UserEmail
// 2. Now that the set is small, apply the slow regex operation
| extend SenderDomain = tostring(split(SenderMailFromAddress, "@")[1])
| where SenderDomain matches regex @"[^\u0000-\u007F]"
| project TimeGenerated, RecipientEmailAddress, SenderMailFromAddress, Subject, DeliveryAction

This approach allows us to maintain high coverage and speed for the highest-risk users.
Running the lower tiers on their own schedules still provides broad coverage for the whole organization
while keeping cost and performance under control.
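
The optimized query above no longer excludes the safe international domains. If both watchlists exist, the two ideas can be combined. This is a sketch that reuses the watchlist aliases and column names from the earlier queries, which remain assumptions about your workspace:

```kql
// Both watchlist aliases and their column names are carried over from the queries above.
let SafeList = _GetWatchlist('SafeInternationalDomains') | project DomainName = tostring(DomainName);
let HighRiskRecipients = _GetWatchlist('HighRiskUsers') | project UserEmail = tostring(UserEmail);
EmailEvents
| where TimeGenerated > ago(1d)
| where EmailDirection == "Inbound"
// 1. Narrow to high-risk recipients first (string equality in a join is case-sensitive).
| join kind=inner (HighRiskRecipients) on $left.RecipientEmailAddress == $right.UserEmail
| extend SenderDomain = tostring(split(SenderMailFromAddress, "@")[1])
// 2. Drop known-good international domains, ignoring case.
| where SenderDomain !in~ (SafeList)
// 3. Apply the regex last, on the smallest possible set.
| where SenderDomain matches regex @"[^\x00-\x7F]"
| project TimeGenerated, RecipientEmailAddress, SenderMailFromAddress, Subject, DeliveryAction
```

Because the join compares addresses exactly, the watchlist entries would need to match the casing used in RecipientEmailAddress.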

Phoenix E.

Gl1tch | Risk - Articles
