A while back, a customer asked me about a Kusto Query Language (KQL) statement that was not working. Their query searched for specific strings in a PowerShell script that had been Base64 encoded. For some reason unknown to them, it was failing at the where operator. Looking at it visually, it should have worked.

Here is a much simplified version of the query.

print Base64String = "SABlAGwAbABvACAAVwBvAHIAbABkACEA"
| extend DecodedString = base64_decode_tostring(Base64String)
| where DecodedString == "Hello World!"

Here you can see a Base64-encoded string, which is decoded and then compared by the where operator with the string “Hello World!”. Go ahead and run it, and see if you can work out why it produces no results. If you want somewhere to test it, use https://aka.ms/lademo. You will need to sign in with a Microsoft account, but you need no permissions to play with the data.

Chances are, you have hit this message: “No results found from the last 24 hours”. It is not an error; the query ran, but the filter matched nothing. If you run only the first two lines of the query, you can see that the decoded string does indeed look like “Hello World!”, but is it actually?

My first thought was that there is probably some white space, so we can just use the has operator, as we are looking for complete terms within a string.

print Base64String = "SABlAGwAbABvACAAVwBvAHIAbABkACEA"
| extend DecodedString = base64_decode_tostring(Base64String)
| where DecodedString has "Hello World!"

No results found. What about just the word “Hello”?

print Base64String = "SABlAGwAbABvACAAVwBvAHIAbABkACEA"
| extend DecodedString = base64_decode_tostring(Base64String)
| where DecodedString has "Hello"

Still no results found. There is clearly something inside this string that we cannot see. The easiest way to find out what this string is made up of is the unicode_codepoints_from_string() function (deprecated alias: to_utf8()).

print Base64String = "SABlAGwAbABvACAAVwBvAHIAbABkACEA"
| extend DecodedString = base64_decode_tostring(Base64String)
| extend DecodedStringValues = unicode_codepoints_from_string(DecodedString)

Here you can see each character with its Unicode decimal value.

[72,0,101,0,108,0,108,0,111,0,32,0,87,0,111,0,114,0,108,0,100,0,33,0]
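
To make the mismatch concrete, the decoded value can be placed next to the literal it is being compared with. This is a minimal sketch; the column names DecodedValues, ExpectedValues, DecodedLength, and ExpectedLength are illustrative choices and not part of the customer's query.

```kusto
// Put the code points of the decoded value next to those of the literal we expected.
print Base64String = "SABlAGwAbABvACAAVwBvAHIAbABkACEA"
| extend DecodedString = base64_decode_tostring(Base64String)
| extend DecodedValues = unicode_codepoints_from_string(DecodedString)
| extend ExpectedValues = unicode_codepoints_from_string("Hello World!")
// Count the code points on each side of the comparison.
| extend DecodedLength = array_length(DecodedValues),
         ExpectedLength = array_length(ExpectedValues)
```

The decoded string carries 24 code points while the literal carries only 12, so an equality check between the two can never succeed.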

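A quick search online will show that 72, 101, 108, 108, 111 spells Hello, but what are the 0 values in between? Once again, a quick search will show that 0 is the null character. This explains why the original where operator was failing: we didn’t account for these hidden values. A null after every single character is what UTF-16LE text looks like when it is read one byte at a time, and UTF-16LE is the encoding PowerShell expects for its -EncodedCommand parameter, so this regular pattern is normal for encoded PowerShell. Nulls that turn up in other positions could be a sign that the original command was subject to some sort of null byte injection, whether to exploit vulnerabilities or evade detection.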
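
One way to check where the nulls sit is to look at the raw bytes instead of the decoded string. The sketch below assumes the string was produced the way PowerShell's -EncodedCommand parameter expects (UTF-16LE, two bytes per character), which is an assumption about the origin of this sample. The column names Bytes, Index, Position, and Values are illustrative.

```kusto
// Decode the Base64 value to raw bytes and split them by position.
print Base64String = "SABlAGwAbABvACAAVwBvAHIAbABkACEA"
| extend Bytes = base64_decode_toarray(Base64String)
// One row per byte, keeping the byte's zero-based position in Index.
| mv-expand with_itemindex=Index Bytes to typeof(long)
// In UTF-16LE the even positions hold the low byte and the odd positions the high byte.
| summarize Values = make_set(Bytes)
    by Position = iff(Index % 2 == 0, "low byte", "high byte")
```

If the high byte set contains only 0, the content is ASCII-range text stored as UTF-16LE. Nulls among the low bytes, or an irregular pattern, would deserve a closer look.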

Now that we know this, we can strip out the null values to perform our detections. As we cannot easily type the null character on a keyboard, I will generate it from its Unicode code point with unicode_codepoints_to_string(0) and use the replace_string() function to strip it out.

print Base64String = "SABlAGwAbABvACAAVwBvAHIAbABkACEA"
| extend DecodedString = base64_decode_tostring(Base64String)
| extend DecodedString = replace_string(DecodedString, unicode_codepoints_to_string(0),"")
| where DecodedString == "Hello World!"
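
The same clean-up can be wrapped in a reusable function so that it applies to every row of a table. This is a minimal sketch: DecodeAndStripNulls is a hypothetical helper name, the datatable stands in for real log data, and the second row is a UTF-8 encoding of the same text added for comparison. Stripping nulls is only safe for ASCII-range content, where every second byte is zero.

```kusto
// Hypothetical helper: Base64-decode a value, then remove every U+0000 character.
let DecodeAndStripNulls = (Encoded: string) {
    replace_string(base64_decode_tostring(Encoded), unicode_codepoints_to_string(0), "")
};
// Sample input: row 1 is the UTF-16LE string from this post, row 2 is the same text as UTF-8.
datatable(Base64String: string) [
    "SABlAGwAbABvACAAVwBvAHIAbABkACEA",
    "SGVsbG8gV29ybGQh"
]
| extend DecodedString = DecodeAndStripNulls(Base64String)
| where DecodedString has "Hello"
```

Both rows should come back, because the UTF-8 row contains no nulls and passes through the replacement unchanged.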

Voila! It works. When you are evaluating strings, make sure you are aware of the characters that aren’t visible to the naked eye. They could be a sign of something much more malicious going on in your environment.