How SOC Analysts Use AI for Threat Triage: A Step-by-Step Workflow
A real-world workflow for using AI tools in SOC alert triage — from ingestion to escalation, with specific tool recommendations at each step.
Alert triage is where most SOC teams lose the battle before it starts. The average enterprise SIEM generates thousands of alerts per day. Analysts burn out. True positives get buried under false ones. Attackers use dwell time to their advantage.
AI doesn’t solve this entirely; anyone who tells you it does is selling something. But applied correctly, AI triage tooling can cut Tier 1 workload significantly, get the right alerts to the right humans faster, and reduce mean time to respond on the threats that matter. This is how we do it in practice.
Step 1: Alert Ingestion and Normalization
Before any AI model touches your data, that data needs to exist in a consistent format. This is the unsexy part that determines whether everything downstream works.
Your SIEM is doing most of the heavy lifting here. Splunk AI collects raw log data from endpoints, firewalls, cloud infrastructure, identity providers, and network devices and normalizes it into a common schema. Cortex XSIAM takes this further — its unified data lake ingests and normalizes across sources in a way that eliminates the schema translation work that traditional SIEMs require for each new data source.
Where AI enters the picture at the ingestion stage is context enrichment. Rather than storing a raw event like user jdoe authenticated from 203.0.113.45, modern platforms automatically append:
- Threat intel lookups against known malicious IPs (Recorded Future, VirusTotal, internal threat feeds)
- Asset inventory context: is this device managed? What classification does it hold?
- User context: what’s the normal work pattern for this account? What’s their role?
- Geolocation: is this IP in a country this user has ever authenticated from?
Cortex XSIAM handles this inline during ingestion through its built-in Cortex XDR and threat intelligence integrations. If you’re running Splunk, this enrichment happens via the Threat Intelligence Framework and requires integration with your intel feeds — it works, but it’s a configuration effort.
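The enrichment step can be sketched in a few lines. This is a minimal illustration, not any platform's actual pipeline: the intel, asset, and user tables here are hypothetical stand-ins for whatever feeds and identity integrations your deployment exposes.

```python
from dataclasses import dataclass, field

@dataclass
class EnrichedEvent:
    raw: dict
    context: dict = field(default_factory=dict)

# Hypothetical stand-ins for a threat feed, asset inventory, and identity data
KNOWN_BAD_IPS = {"198.51.100.22"}
MANAGED_ASSETS = {"LAPTOP-7823": "managed"}
USER_HOME_COUNTRIES = {"jdoe": {"US"}}

def enrich(event: dict) -> EnrichedEvent:
    """Append context to a raw auth event at ingestion time."""
    enriched = EnrichedEvent(raw=event)
    enriched.context["intel_hit"] = event.get("src_ip") in KNOWN_BAD_IPS
    enriched.context["asset_state"] = MANAGED_ASSETS.get(event.get("host"), "unmanaged")
    home = USER_HOME_COUNTRIES.get(event.get("user"), set())
    enriched.context["new_geo"] = event.get("geo_country") not in home
    return enriched

evt = enrich({"user": "jdoe", "src_ip": "203.0.113.45",
              "host": "LAPTOP-7823", "geo_country": "NL"})
```

In production the lookups would hit live services with caching; the point is that the event leaves ingestion already carrying the context an analyst would otherwise fetch by hand.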
What the AI can’t fix here: Garbage in, garbage out. If your logging coverage has gaps — cloud workloads without agent coverage, network segments without SPAN traffic, SaaS applications not feeding identity events — no ML model compensates for missing telemetry. Do a logging coverage audit before you tune anything else.
Step 2: AI Severity Scoring
This is where the approaches differ most sharply between platforms, and where teams new to AI triage often get their expectations calibrated the hard way.
Traditional SIEM correlation rules work on static logic: if event X matches condition Y, generate an alert at severity Z. These rules require constant maintenance and are trivially bypassed by attackers who stay just below your thresholds.
AI severity scoring works differently. ML models establish behavioral baselines for users, devices, and network segments, then score deviations against those baselines rather than matching them against fixed rules.
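As a toy illustration of deviation-based scoring (not any vendor's actual model), consider scoring a login hour against a user's historical pattern as a z-score; the history values here are invented:

```python
from statistics import mean, stdev

def deviation_score(history_hours, observed_hour):
    """Z-score of an observed login hour against the user's own history."""
    mu, sigma = mean(history_hours), stdev(history_hours)
    return abs(observed_hour - mu) / sigma if sigma else 0.0

history = [9, 10, 9, 11, 10, 9, 10, 8]   # invented usual login hours
night = deviation_score(history, 2)       # far from baseline: high score
usual = deviation_score(history, 10)      # near baseline: low score
```

Real models baseline many features at once (geolocation, device, process lineage, peer group), but the principle is the same: score distance from observed normal rather than matching a fixed rule.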
How Cortex XSIAM Does It
XSIAM’s scoring uses a combination of behavioral analytics and threat intelligence to generate an “incident score” — not just a severity label. When an alert fires, XSIAM has already correlated it against other recent events involving the same user or asset, factored in the asset’s criticality, and compared the behavior against historical patterns. A login at 2am from a new location scores higher than the same login at 9am from the user’s usual office location.
The AI also groups related alerts into incidents automatically. What Splunk might surface as three separate alerts (anomalous login, new process execution, outbound connection to rare IP) becomes one incident with a coherent narrative in XSIAM.
How Splunk AI Does It
Splunk AI uses risk-based alerting (RBA) to aggregate risk scores across entities rather than generating individual event alerts. Over a 24-hour window, a user account accumulates risk: a failed MFA push adds 10 points, an off-hours login adds 15 points, an executable run from an unusual path adds 25 points. Only when the aggregate risk crosses your configured threshold does Splunk generate an alert. This dramatically reduces alert volume and surfaces coherent threat narratives rather than individual low-signal events.
The meaningful difference: XSIAM’s scoring is more autonomous and opinionated — the AI makes decisions for you. Splunk’s RBA is more configurable and transparent — you define risk scores per event type and thresholds per alert. XSIAM is faster to value; Splunk gives you more control once your team has the SPL expertise to use it.
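Splunk's RBA accumulation can be sketched as follows. The per-event point values mirror the examples above, while the 40-point threshold and the event-type names are assumptions you would tune per deployment:

```python
from collections import defaultdict

RISK_SCORES = {              # per-event risk, mirroring the examples above
    "failed_mfa_push": 10,
    "off_hours_login": 15,
    "unusual_path_exec": 25,
}
ALERT_THRESHOLD = 40         # hypothetical tuned threshold

def aggregate_risk(window_events):
    """Sum risk per entity over the window; return entities over threshold."""
    totals = defaultdict(int)
    for entity, event_type in window_events:
        totals[entity] += RISK_SCORES.get(event_type, 0)
    return {e: s for e, s in totals.items() if s >= ALERT_THRESHOLD}

window = [("jdoe", "failed_mfa_push"),
          ("jdoe", "off_hours_login"),
          ("jdoe", "unusual_path_exec"),
          ("asmith", "off_hours_login")]
alerts = aggregate_risk(window)   # only jdoe crosses the threshold
```

No single event here would justify an alert on its own; the aggregate across the window is what surfaces the narrative.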
Honest limitation: Both approaches require a learning period to establish behavioral baselines. For the first 2-4 weeks after deployment, expect elevated false positives as the models learn what “normal” looks like in your environment. Suppress alerts from volatile environments (build servers, CI/CD pipelines, dev workstations) during this period to avoid drowning the models in noise.
Step 3: Automated Context Gathering
By the time an alert reaches a Tier 1 analyst, the AI should have already answered the questions that analyst would spend the first 10 minutes of triage trying to answer manually.
The questions the AI should answer:
- Who is this user? Role, department, manager, location, authentication history, recent access pattern changes, HR data (is this an employee under a PIP? recently terminated? traveling?)
- What is this device? Is it managed? Patched? What software is installed? Has it exhibited unusual behavior in the last 30 days?
- What does this IP/domain mean? Is it known malicious? New? Associated with a threat actor or campaign? Has it appeared in other alerts in the last 7 days?
- Is this a pattern? Is this the only device showing this behavior, or is it one of several? Does this look like the early stages of a lateral movement campaign?
CrowdStrike Falcon’s Threat Graph handles device and threat context automatically when Falcon is deployed as your EDR. When a SIEM alert fires on a suspicious process, Threat Graph pulls the full execution chain, parent process, loaded DLLs, network connections spawned, and historical context for that file hash across all Falcon-protected endpoints globally. This context appears in the alert automatically when CrowdStrike integrates with your SIEM.
Microsoft Security Copilot adds a conversational interface on top of this context. An analyst can ask “What is the risk profile of user jdoe based on the last 30 days of Defender and Entra activity?” and receive a structured summary without writing a single KQL query. For Microsoft-stack environments (Sentinel + Defender + Entra), this significantly reduces the pivot time between data sources during triage.
What this looks like in practice
A Tier 1 analyst opens an alert. Before they’ve read the alert details, the platform has already surfaced:
Alert: Suspicious PowerShell execution
User: [email protected]
- Risk score: HIGH (elevated from baseline in last 72h)
- Recent auth anomaly: VPN login from NL (user normally US-based)
- HR status: Active, no travel logged
Device: LAPTOP-7823
- Last patched: 47 days ago (3 critical CVEs outstanding)
- Managed: Yes (Falcon agent active)
- Process ancestry: explorer.exe -> cmd.exe -> powershell.exe (unusual)
Threat Intel:
- Execution hash: seen in 4 other alerts this week (2 open, 2 closed as FP)
- C2 IOC match: medium confidence (Recorded Future score: 76/100)
The analyst didn’t produce this output. The AI assembled it from your EDR, SIEM, identity provider, asset inventory, and threat intel feeds. The analyst’s job is to make a decision with it.
Step 4: Triage Decision
The triage decision point is where AI assists but human judgment remains required. Three outcomes:
Auto-close (benign anomaly)
The AI scores the alert low, context confirms it’s expected behavior, and the alert closes automatically with a documented reason. Common scenarios:
- Scheduled task firing on a new schedule (admin changed it; no IOC match; consistent with approved change ticket)
- Off-hours login from an executive who travels frequently (historical pattern confirms this; no other anomalies)
- Dev workstation connecting to an external IP that’s a known CDN node used by internal apps
Cortex XSIAM reports autonomous resolution rates exceeding 80% for large enterprise customers. That number is real but context-dependent — it’s higher for mature deployments with clean baselines and lower for environments with diverse, dynamic infrastructure. Expect 40-60% auto-resolution in the first six months post-deployment, rising as the model improves with feedback.
Escalate (confirmed IOC or clear threat signal)
The alert gets promoted to Tier 2 with a full AI-generated investigation summary. Escalation criteria:
- Hash or IP matches a high-confidence threat intel IOC
- Behavior maps to a known MITRE ATT&CK technique with no legitimate business justification
- Multiple correlated alerts firing in sequence across the same user or device
- The alert matches a recently issued threat advisory from your intel provider
Investigate (ambiguous — analyst required)
The honest category most platforms undervalue. The alert doesn’t match a known bad pattern, doesn’t have a clean benign explanation, and the risk score is moderate. Examples:
- PowerShell downloading a file from a legitimate CDN — not unusual on its own, suspicious in context of the VPN anomaly
- DNS queries to a recently registered domain — rare but not inherently malicious
- User accessing file shares they’ve never touched — unusual, but could be a legitimate project
This category is where skilled Tier 1 analysts earn their pay. AI gets you to the door faster with better context. The analyst has to decide what to do next.
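The three-way decision can be expressed as a simple routing function. The score cutoffs and field names here are illustrative assumptions, not any vendor's logic; the key property is that a confirmed IOC escalates regardless of score, and auto-close requires both a low score and a documented benign explanation:

```python
AUTO_CLOSE_MAX = 0.2   # illustrative cutoffs, tuned per environment
ESCALATE_MIN = 0.8

def triage(alert: dict) -> str:
    """Route an alert to one of the three outcomes described above."""
    if alert.get("ioc_confidence") == "high":
        return "escalate"            # confirmed IOC overrides the score
    score = alert["risk_score"]      # normalized 0-1 from the scoring model
    if score >= ESCALATE_MIN:
        return "escalate"
    if score <= AUTO_CLOSE_MAX and alert.get("benign_explanation"):
        return "auto_close"          # low score AND a documented benign reason
    return "investigate"             # ambiguous: analyst required
```

Note the asymmetry: nothing auto-closes on score alone. That keeps the ambiguous middle flowing to humans instead of silently disappearing.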
Step 5: Escalation with AI-Generated Summary
When an alert escalates to Tier 2, the analyst receiving it should be able to act immediately without starting an investigation from scratch. This is where Microsoft Security Copilot and XSIAM’s built-in summarization earn their keep.
Security Copilot generates incident summaries that include:
- Timeline of events leading to the alert
- Affected users and assets
- IOCs observed (hashes, IPs, domains)
- MITRE ATT&CK techniques identified
- Recommended investigation steps
- Draft containment actions
The output looks something like this:
Incident Summary: Suspected credential theft — [email protected]
Timeline:
2026-02-21 01:47 UTC — VPN authentication from 203.0.113.45 (NL, no travel history)
2026-02-21 01:52 UTC — Remote PowerShell session initiated from jdoe to FILESERVER-02
2026-02-21 01:58 UTC — LSASS memory access detected on FILESERVER-02 (credential dump behavior)
2026-02-21 02:03 UTC — Outbound connection to 198.51.100.22 (Recorded Future: C2 associated
with TA505 infrastructure, confidence: HIGH)
Affected assets: [email protected], LAPTOP-7823, FILESERVER-02
ATT&CK techniques: T1078 (Valid Accounts), T1003.001 (LSASS Memory), T1041 (Exfiltration Over C2 Channel)
Recommended actions:
1. Disable jdoe account immediately via Entra ID
2. Isolate LAPTOP-7823 and FILESERVER-02 via CrowdStrike Real-Time Response
3. Reset credentials for all accounts with sessions on FILESERVER-02 in last 4 hours
4. Block 198.51.100.22 at perimeter firewall
A Tier 2 analyst receiving this can validate and execute a response in minutes rather than spending an hour reconstructing the incident timeline. The summary is wrong sometimes — hallucination risk is real — so verify the critical facts before acting on containment recommendations.
CrowdStrike Falcon’s Real-Time Response module is worth calling out specifically here. When a Tier 2 analyst needs to act on a confirmed incident, RTR gives them a live shell on the affected endpoint without requiring RDP access or physical presence. You can pull running processes, examine registry keys, kill processes, and collect forensic artifacts remotely:
# Pull running processes on the compromised endpoint
$ runscript -CloudFile="RunningProcesses.ps1" -CommandLine=""
# Collect memory artifacts for forensic analysis
$ memdump
# Isolate the endpoint from the network (keeps Falcon connectivity)
$ containhost
Step 6: The Feedback Loop
AI triage gets better with use — but only if analysts close the loop. When an alert auto-closes and it was actually a true positive, that’s a model failure that will repeat. When an analyst marks something as a false positive, the model needs to learn why.
Every triage decision should feed back into the platform:
- True positive confirmations teach the model which patterns correlate with real threats in your environment
- False positive markings with documented reasons help the model lower sensitivity on specific patterns that are benign in your context
- Near-miss escalations — cases where the AI under-scored a genuine threat — are the most valuable training data
Cortex XSIAM handles this through its feedback mechanism built into the incident workflow. Analysts mark disposition directly in the console; the data feeds back into the AI pipeline automatically. With Splunk, analyst feedback flows through the Mission Control interface, though the feedback loop to risk-based alerting rules requires more manual configuration.
The practical requirement: someone needs to own model hygiene. Designate an analyst — typically your most experienced Tier 1 or a Tier 2 lead — to review auto-closure accuracy weekly for the first six months. Track false positive rates by alert type, by data source, and by time of day. The patterns that emerge will tell you where your baselines are weak and where your logging has gaps.
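The weekly review is easier if dispositions are tracked programmatically. A minimal sketch, assuming the platform can export triage outcomes as (alert_type, verdict) pairs (the field names and export shape are hypothetical):

```python
from collections import Counter

def fp_rates(dispositions):
    """dispositions: iterable of (alert_type, verdict), verdict in {'tp', 'fp'}.
    Returns the false positive rate per alert type."""
    totals, fps = Counter(), Counter()
    for alert_type, verdict in dispositions:
        totals[alert_type] += 1
        if verdict == "fp":
            fps[alert_type] += 1
    return {t: fps[t] / totals[t] for t in totals}

week = [("ps_exec", "fp"), ("ps_exec", "tp"),
        ("dns_newdomain", "fp"), ("dns_newdomain", "fp")]
rates = fp_rates(week)
```

The same breakdown by data source and time of day is a one-line change to the grouping key, and those slices are what expose weak baselines and logging gaps.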
What AI Triage Gets Wrong
No honest overview of AI-assisted triage skips this part.
It struggles with context it can’t see. If an employee is responding to a genuine incident at 2am from their personal home network, the AI has no way to know that’s authorized. Likewise, a zero-day technique hitting your environment for the first time registers as a baseline deviation, not as confirmed malicious activity; a human still has to make that call.
It inherits the biases of your baseline. If your “normal” includes a misconfigured admin account that’s been doing something wrong for two years, the AI treats that as baseline behavior. Security debt doesn’t disappear — it becomes invisible.
Over-automation creates blind spots. Organizations that let auto-closure run too aggressively without analyst review end up missing true positives that fall just below escalation thresholds. Attackers who understand your detection tooling will stay below those thresholds deliberately.
It doesn’t replace threat hunting. AI triage handles reactive detection well. Proactive hunting — querying your data to find attackers who haven’t triggered an alert — still requires human analysts who understand attacker behavior and can build targeted hunts. Falcon OverWatch and SentinelOne Vigilance exist precisely because reactive AI triage has a ceiling.
The workflow above reduces analyst workload and improves response times on the alerts that matter. It doesn’t remove the need for skilled humans at the top of the stack. Use it accordingly.