Image of a person handing over a bank card

A colleague was working on a problem with the Luhn algorithm, to check and validate credit card numbers with Kusto query language, but ran into a dead end, and I am never one to shy away from a challenge.

What is the Luhn Algorithm?

So first of all I had to learn this algorithm. It was interesting to learn this algorithm is a simple check digit formula that is used to validate a variety of identification numbers, such as credit card numbers, IMEI numbers, and others. It was named after its creator, IBM scientist Hans Peter Luhn, who described it in U.S. Patent No. 2,950,048. The algorithm is in the public domain and is widely used today.

The basic idea of the algorithm is to start from the rightmost digit of the number to be validated, and double the value of every second digit. If the result of doubling is greater than 9, then add the digits of the result. Then, sum all the digits together. The number is valid if the sum is a multiple of 10. For example, to validate the number 7992738:

  1. Moving from right to left, double every other digit, starting with the one immediately before the final digit.
  2. If the doubled value is greater than 9, subtract 9 from the value
  3. Add the sum of the digits together.
  4. Perform a modulo operation by 10 on the sum. If the value is equal to 0, woohoo You have a valid card number.
Digit7992732
Step 11846
Step 2946
Step 3+7+9+9+4+7+6+8

Step 4. Total = 50, which is a multiple of 10 or using modulo, 50 % 10 = 0 (no remainders)

Note the final number, in this case 8. This is called the check digit. If we performed all the steps but didn’t include the check digit in then the total sum would be 42. To find the check digit, perform the following calculation

10 – (42 % 10) = 8

Try it with your own card numbers. Just don’t share your real world examples!

Now lets do this in KQL

Now if we could perform iterators in KQL, like for and while loops, it might look something like this python code here:

#python
def luhn(n): 
    r = [int(ch) for ch in str(n)][::-1] return (sum(r[0::2]) + sum(sum(divmod(d*2,10)) for d in r[1::2])) % 10 == 0

https://www.creditcardvalidator.org/developer

But that would be too easy.

First of all, we need some card numbers. Use the fake card generator here (https://www.creditcardvalidator.org/generator) to make some valid (but unusable) card numbers and define a datatable().

let CreditCardNumbers = datatable(cardcompany: string, cardnumber: string)[
    "invalid1", "1234567890123456",
    "invalid2", "1334557896123756",
    "VISA", "4716677586750780",
    "Mastercard", "5505434629913583",
    "AMEX", "379753502925891",
    "Unionpay", "6222022722487934",
    "Diners", "3602311567018921",
    "Discover", "6011858121198779"
];

Now lets build out a basic user defined function which accepts a tabular input. This function accepts a table (which is continently named the same as our data table) and requires a column called cardnumber, which is a string type. Notice on the previous datatable, there are two columns. In this function, you won’t be able to use the cardcompany column, as we haven’t declared it.

let luhn_check = (CreditCardNumbers: (cardnumber: string)) {
    CreditCardNumbers
};

The reason we have used a tabular function is due to limitations with the toscalar() function, which prevents us performing on each row. this was the issue my colleague was facing.

Looking at the steps, I first to determine the index of each number, I need to create an array of numbers using the range() and strlen() functions. Note the arguments for the range function.

  • Start = strlen(cardnumber) -1: This is because the strlen function gives us the total number of characters in the string, but because indexes start with 0, we have subtracted one to accomadate this.
  • Stop: This is 0 as we are counting down
  • Step: This is minus one, as we are counting backwards
CreditCardNumbers 
| extend cardDigitIdx = range(strlen(cardnumber) - 1, 0, -1) 

Now we have to use mv-expand to perform actions on each value in the cardDigitIdx array. This is because KQL does not have iterators like “for” and “while” loops. I’ve used the with_itemindex argument so we can work out which values to double.

| mv-expand with_itemindex=doubleIndex cardDigitIdx to typeof(int)

Now using substring and the cardDigitIdx value, we can extract each number at the postion it is found in the card number.

| extend cardDigit = toint(substring(cardnumber, cardDigitIdx, 1))

Finally, we can perform step 1.

| extend sumDigit = iif(
    doubleIndex % 2 != 0,
    iif(
        (cardDigit * 2) > 9,
        (cardDigit * 2) - 9,
        (cardDigit * 2)
    ),
    cardDigit
)

If the doubleIndex is odd (using modulo “%” operator), then double the value of the cardDigit, otherwise leave it as it is. However if the doubled value is greater than 9, then subtract it by 9.

Now we need to get the total sum of the newly calculated card number, for this we will use summarize

| summarize sumOfDigits = sum(sumDigit) by cardnumber 

and then validate that it is divisible by 10 with no remainders using modulo. No need for any iif statements here as the evaluation will produce a Boolean output.

| extend validation = sumOfDigits % 10 == 0

Now if you join the code up together and run it, you will get a table that shows the cardnumber, validation and sumOfDigits column. We are still missing the cardcompany column (and any others we may want from the original dataset. To do this, all wee need to perform is a simple join. I paste the completed code here so you can see it all working together easily. I’ve put the first part of the query in a let statement as well, so the join is clean to read and added the relevant project and project-away functions.

Complete Code

The code can be found here in my GitHub repo https://github.com/TheAlistairRoss/The-Cloud-Brain-Dump/blob/main/Toolshed/KQL%20Toolbox/the%20Luhn%20Algorithm.txt or below here.

let luhn_check = (CreditCardNumbers: (cardnumber: string)) {
    let ValidationCheck = CreditCardNumbers 
        | extend cardDigitIdx = range(strlen(cardnumber) - 1, 0, -1) 
        | mv-expand with_itemindex=doubleIndex cardDigitIdx to typeof(int)
        | extend cardDigit = toint(substring(cardnumber, cardDigitIdx, 1))
        | extend sumDigit = iif(
                        doubleIndex % 2 != 0, 
                        iif(                   
    (cardDigit * 2) > 9, 
    (cardDigit * 2) - 9,
    (cardDigit * 2)
),
                        cardDigit
                    )
        | summarize sumOfDigits = sum(sumDigit) by cardnumber 
        | extend sumOfDigits = sumOfDigits * 9 
        | extend validation = sumOfDigits % 10 == 0
    | project cardnumber, validation
    ;
    CreditCardNumbers
    | join kind = inner (
        ValidationCheck
        )
        on cardnumber
    | project-away cardnumber1
};
let CreditCardNumbers = datatable(cardcompany: string, cardnumber: string)[
    "invalid1", "1234567890123456",
    "invalid2", "1334557896123756",
    "VISA", "4716677586750780",
    "Mastercard", "5505434629913583",
    "AMEX", "379753502925891",
    "Unionpay", "6222022722487934",
    "Diners", "3602311567018921",
    "Discover", "6011858121198779"
];
luhn_check(CreditCardNumbers)

Example results

cardcompanycardnumbervalidation
invalid11234567890123456false
invalid21334557896123756false
Visa4716677586750780true
Mastercard5505434629913583true
AMEX379753502925891true
Unionpay6222022722487934true
Diners3602311567018921true
Discover6011858121198779true

Conclusion

Now there are other checks you may wish to do, such as validating card number length (depending on the company) and the Issuer identification number as well, but I will leave that to you. Now you know how to use the Luhn Algorithim with KQL.