13 Dec 2023
Detecting Suspicious Login Activity with Stream Processing
Cybersecurity is challenging, but it’s one of the most important components of any digital business. Cyberattacks can disrupt your application and put your users in harm’s way. Successful cyberattacks can result in identity and credit card theft, which can have a very tangible effect on people’s lives and reputations. Under regulations such as the General Data Protection Regulation (GDPR), businesses can be fined for lackluster cybersecurity (e.g. the UK’s ICO fined Cathay Pacific £500k over a data breach disclosed in 2018).
One of the most popular tools for cybersecurity is stream processing. For most cyber threats, responsiveness is crucial to prevent or minimize the impact of an attack. In this post, we’ll show how DeltaStream can be used to quickly identify suspicious login activity from a stream of login events. By identifying suspicious activity quickly, follow-up actions such as freezing accounts, sending notifications to account owners, and involving security teams can happen right away.
Setting up your Data Stream
We’ll assume that a Kafka Store has already been set up with a topic called login_attempts. The records in this topic contain failed login events. Before we get to our use case, we need to set up a Stream that is backed by this topic. We’ll use this Stream later on as the source data for our use case.
CREATE STREAM DDL to create the login_attempts Stream:
Cybersecurity Use Case: Detecting Suspicious User Login Activity
For our use case, we want to determine if a user is attempting to gain access to accounts they are not authorized to use. One common way attackers will try to gain access to accounts is by writing scripts or having bots attempt to log in to different accounts using commonly used passphrases. We can use our stream of login data to detect these malicious users. Based on our source Stream, we have fields ip_address and user_agent which can identify a particular user. The account_id field represents the account that a user is trying to log in to. If a particular user attempts to log in to 3 unique accounts in the span of 1 minute, then we want to flag this user as being suspicious. The following query does this by utilizing OVER Aggregation and writing the results into a new Stream called suspicious_user_login_activity.
Create Stream As Select Query
CSAS to create suspicious_user_login_activity Stream:
In this query’s subquery, an OVER aggregation is used to count the number of unique accounts that a particular user has attempted to log in to. The outer query then filters for results where the projected field of the aggregation, num_distinct_accounts_user_login_attempted, is equal to 3. Thus, the output of the entire query contains the IP address and user agent information for suspicious users who have attempted to log in to 3 different accounts within a 1 minute window. The resulting event stream can then be ingested by downstream applications for further review or actions.
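To make the windowing logic concrete, here is a minimal Python sketch of what the OVER aggregation and the outer filter compute. It is an illustration of the semantics described above, not DeltaStream code. The event_time field, the LoginAttempt class, the flag_suspicious function, and the sample records are all hypothetical; only ip_address, user_agent, and account_id come from the Stream described in this post.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class LoginAttempt:
    event_time: datetime  # assumed field; the post does not name the timestamp column
    ip_address: str
    user_agent: str
    account_id: str


def flag_suspicious(attempts, window=timedelta(minutes=1), threshold=3):
    """Yield (ip_address, user_agent, distinct_count) for every attempt whose
    trailing window contains exactly `threshold` distinct accounts."""
    recent = defaultdict(deque)  # per-user attempts still inside the window
    for attempt in sorted(attempts, key=lambda a: a.event_time):
        user = (attempt.ip_address, attempt.user_agent)
        in_window = recent[user]
        in_window.append(attempt)
        # Evict attempts that fall outside the window ending at the current row.
        while in_window[0].event_time < attempt.event_time - window:
            in_window.popleft()
        distinct = len({a.account_id for a in in_window})
        if distinct == threshold:
            yield (attempt.ip_address, attempt.user_agent, distinct)


# Hypothetical sample: one user touches 2 accounts, another touches 3.
t0 = datetime(2023, 12, 13, 12, 0, 0)
sample = [
    LoginAttempt(t0, "10.0.0.1", "Android", "acct_a"),
    LoginAttempt(t0 + timedelta(seconds=5), "10.0.0.1", "Android", "acct_b"),
    LoginAttempt(t0 + timedelta(seconds=10), "10.0.0.2", "Windows", "acct_c"),
    LoginAttempt(t0 + timedelta(seconds=15), "10.0.0.2", "Windows", "acct_d"),
    LoginAttempt(t0 + timedelta(seconds=20), "10.0.0.2", "Windows", "acct_e"),
    LoginAttempt(t0 + timedelta(seconds=25), "10.0.0.1", "Android", "acct_a"),
]
flagged = list(flag_suspicious(sample))
```

In this sketch only the second user is flagged, at the moment their third distinct account appears inside the one-minute window. Attempts spaced far enough apart that earlier ones age out of the window are not counted together.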
By submitting this SQL statement, a long-lived continuous query will be launched in the background. This continuous query ingests records from the source Stream as they arrive, processes them, and writes the results to the sink Stream with minimal delay. Any downstream applications reading from this sink Stream can then act on these suspicious users right away.
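A downstream application may want to guard against acting on the same user repeatedly. If the sink Stream carries one record per qualifying login event (an assumption about the output, not something this post states), a user who keeps retrying could show up several times in quick succession. The sketch below shows one possible consumer-side guard; the AlertThrottle class, its method names, and the five-minute cooldown are hypothetical choices for illustration.

```python
from datetime import datetime, timedelta


class AlertThrottle:
    """Suppress repeat alerts for the same (ip_address, user_agent) pair."""

    def __init__(self, cooldown=timedelta(minutes=5)):
        self.cooldown = cooldown
        self._last_alert = {}  # (ip_address, user_agent) -> time of last alert

    def should_alert(self, ip_address, user_agent, seen_at):
        key = (ip_address, user_agent)
        last = self._last_alert.get(key)
        if last is not None and seen_at - last < self.cooldown:
            return False  # still inside the cooldown for this user
        self._last_alert[key] = seen_at
        return True
```

A consumer would call should_alert for each record read from the sink and only freeze the account or notify the security team when it returns True.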
Create Stream As Select Query Results
To get a better understanding of how the CSAS query behaves, we can inspect some records from our source login_attempts Stream and our results suspicious_user_login_activity Stream.
Records in the source Stream login_attempts:
Records in the sink Stream suspicious_user_login_activity:
In the results Stream, there is a record for a Windows user who tried to log in to 3 different accounts. Inspecting the source Stream, we can see that records 3 through 5 are associated with that output. Records 1, 2, and 6 are all from the same Android user, but this user only attempted to log in to 2 unique accounts, so there is no output record for them: we don’t deem this activity suspicious.
The Power of Stream Processing and Cybersecurity
Streaming and stream processing capabilities are incredibly helpful for tackling cybersecurity challenges. Having systems and processes that act on events with minimal latency can be the difference between a successful and an unsuccessful cyberattack. In this post, we showcased one example of how DeltaStream users can set up and deploy a streaming analytics pipeline to detect cyber threats as they’re happening. While this example is relatively simple, DeltaStream’s rich SQL feature set is capable of handling much more complex queries to support all kinds of cybersecurity use cases.
DeltaStream is the platform to unify, process, and govern all of your streaming data. If you want to learn more about DeltaStream, sign up for a free trial or schedule a demo with us.