DataStun — Glossary

Jump to a feature

Threat detection & enforcement
Real-time blocklist enforcement
Public threat intel
Commercial-derived threat data
Executable reputation
— SIG (code-signing)
— NSRL (NIST known-good)
— MBZ (MalwareBazaar)
— VT (VirusTotal)
Exposed infrastructure services
Tenant infrastructure
Per-tenant dashboard URL
Agent rules + custom blocklists
AI Governance
AI Governance dashboard
Fleet analytics — Business+
Fleet SBOM
First-seen radar
Vendor concentration map
Per-machine deviation score
Patch-lag scoreboard
SaaS license reconciliation
Location-aware network health
Fleet analytics — Enterprise only
Org-wide executable analysis
Beaconing detector
Data-sovereignty rollup
Identity, compliance, support
Hash-only privacy boundary
AI + human-backed in-app support
SAML / OIDC SSO
Compliance reports
Dedicated account manager
Diagnostic add-ons
Hop Starvation
Speed Test
Advanced Packet Diagnostics (APD)

Threat detection & enforcement

Real-time blocklist enforcement

Available on every tier, including Individual

What it is

A 20,000+ entry global blocklist of known-bad IP ranges and domains, enforced by the DataStun agent at the operating-system firewall layer. Windows agents push New-NetFirewallRule entries; Linux agents use ipset + iptables; macOS agents use pfctl tables. Updates propagate from the reputation service (rep) through the tenant platform (ten) to every agent within 60 seconds.

Why it matters

Most network-security tools either rely on a separate appliance (firewall, secure web gateway) that you have to size, license, and route traffic through, or they enforce in-app at the browser layer where they miss everything that isn’t a browser. DataStun pushes enforcement onto the endpoint itself — every TCP / UDP socket, regardless of which application opened it, regardless of network path. A laptop on a coffee-shop Wi-Fi gets the same protection as a laptop on the corporate VLAN.

How it works

The blocklist is the union of three streams: D / F-graded IPs from our reputation pipeline (rep), curated public threat-intelligence feeds, and tenant-scoped overrides. Each entry carries source attribution — who flagged this — so the dashboard can show you exactly why a destination is blocked and link to its dispute path. Agents poll the attribution-aware /api/v1/agent/blocklist endpoint and reconcile their local firewall state against the new payload.
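The reconcile step can be pictured as a set difference between what the local firewall currently enforces and what the latest payload asks for. This is a minimal sketch, not the agent's actual code: the Entry shape, the reconcile function, and the source labels are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    target: str  # IP, CIDR range, or domain
    source: str  # attribution: which stream flagged it (hypothetical labels)


def reconcile(local_rules: set[str], payload: list[Entry]) -> tuple[set[str], set[str]]:
    """Return (to_add, to_remove) so local firewall state matches the payload."""
    wanted = {entry.target for entry in payload}
    return wanted - local_rules, local_rules - wanted


payload = [
    Entry("203.0.113.0/24", "rep-grade-F"),
    Entry("198.51.100.7", "public-feed"),
]
to_add, to_remove = reconcile({"198.51.100.7", "192.0.2.9"}, payload)
print(sorted(to_add), sorted(to_remove))
```

Computing a diff rather than flushing and reloading every rule keeps existing blocks in force while an update is applied.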

Why it’s included on every tier

It’s a safety feature, not a premium one. Charging extra for blocking known malicious destinations would be a hostile pricing decision; the value of the platform compounds when every DataStun agent on the planet is closing the same doors.

Public threat intel

Available on every tier, including Individual

What it is

Curated public threat-intelligence feeds — reputable open IP-reputation lists — refreshed every six hours and merged into the global blocklist. Each feed’s entries carry their upstream identifier in-product so disputed entries route directly to the source.

Why it matters

The cybersecurity community publishes a wealth of threat data for free; most security tools either ignore it or charge customers to relay it. We treat public feeds as table-stakes infrastructure and ship them on Individual.

How it works

The rep service’s feed orchestrator pulls each feed at its scheduled interval, inside a single transaction replaces the prior snapshot (so a probe never sees a half-loaded feed), and writes results into reputation.threat_ip_ranges / threat_ips. The unified enforcement_blocklist_v4 view projects feed data into the agent-facing blocklist alongside grading and tenant overrides.
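A minimal sketch of the single-transaction snapshot swap, using an in-memory SQLite database as a stand-in for the real store. The column names (feed_id, cidr) and the replace_feed_snapshot helper are assumptions for illustration; only the table name comes from the text above.

```python
import sqlite3


def replace_feed_snapshot(conn: sqlite3.Connection, feed_id: str, cidrs: list) -> None:
    """Swap one feed's rows atomically: the old set or the new set, never a mix."""
    with conn:  # commits on success, rolls back if anything raises
        conn.execute("DELETE FROM threat_ip_ranges WHERE feed_id = ?", (feed_id,))
        conn.executemany(
            "INSERT INTO threat_ip_ranges (feed_id, cidr) VALUES (?, ?)",
            [(feed_id, cidr) for cidr in cidrs],
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE threat_ip_ranges (feed_id TEXT NOT NULL, cidr TEXT NOT NULL)")
replace_feed_snapshot(conn, "feed-a", ["203.0.113.0/24"])

try:
    # A bad row (None violates NOT NULL) aborts the load part-way through.
    replace_feed_snapshot(conn, "feed-a", ["198.51.100.0/24", None])
except sqlite3.IntegrityError:
    pass

rows = conn.execute("SELECT cidr FROM threat_ip_ranges").fetchall()
print(rows)
```

Because the delete and the inserts share one transaction, the failed second load leaves the first snapshot untouched.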

What we don’t use

Some public feeds moved behind paywalls in late 2025 and are intentionally omitted from Individual. We don’t pretend an Individual tier has commercial-tier coverage; if you need the breadth of licensed feeds, that’s the next entry.

Commercial-derived threat data

Available on Business and Enterprise

What it is

Threat-intelligence signal sourced from licensed commercial feeds (VirusTotal Enterprise, Recorded Future, and similar). We pay for the licenses; you get the derived blocklist + grading data. The agent enforcement is identical to the public-feed path; the difference is the depth and freshness of the data being enforced.

Why it matters

Public feeds catch the obvious. Commercial feeds catch the targeted, the recent, and the in-the-wild. For a business buyer, where a single infected machine costs days of remediation and carries reputational risk, the marginal cost of broader coverage is dwarfed by the value of catching the attack 4–12 hours earlier.

How it works

Same pipeline as public feeds, different sources. Each commercial-derived entry carries provenance (which feed, with which freshness window), so an entry that’s downstream of an expensive license can be retired automatically when its source TTL elapses.
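TTL-based retirement can be sketched as a filter over provenance-tagged entries. The DerivedEntry fields and feed names below are hypothetical; the sketch only assumes that each entry records when it was fetched and how long its source considers it fresh.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class DerivedEntry:
    target: str
    feed: str            # provenance: which licensed feed produced it
    fetched_at: datetime
    ttl: timedelta       # freshness window granted by the source


def live_entries(entries: list[DerivedEntry], now: datetime) -> list[DerivedEntry]:
    """Keep only entries whose source TTL has not yet elapsed."""
    return [e for e in entries if now < e.fetched_at + e.ttl]


t0 = datetime(2026, 1, 1, tzinfo=timezone.utc)
entries = [
    DerivedEntry("198.51.100.7", "feed-x", t0, timedelta(hours=24)),
    DerivedEntry("203.0.113.9", "feed-y", t0, timedelta(hours=6)),
]
still_live = live_entries(entries, now=t0 + timedelta(hours=12))
print([e.target for e in still_live])
```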

Why it’s gated

Licensing economics. Commercial feeds bill per-tenant or per-seat at rates that don’t make sense for free or household tiers. The Business tier ($5/agent/mo) is the threshold where the unit economics work for both sides.

Executable reputation

Available on every tier, including Individual

Hash-only. We send a SHA-256 fingerprint, never the binary itself. Why this matters →

What it is

Every binary the agent observes opening a network connection gets hashed (SHA-256) and run through a four-stage evidence pipeline: SIG (code-signing verification) → NSRL (NIST known-good catalogue) → MBZ (MalwareBazaar known-bad) → VT (VirusTotal aggregated AV). The four chips render side-by-side on every binary in the dashboard, with hover detail and click-through to the relevant external source.

Why it matters

Hash-anchored identity is the cleanest answer to “is this binary safe?” — immune to renames, copies, and superficial obfuscation. Once a hash has a verdict, that verdict is the same for every customer; our cross-tenant reputation cache means once any customer’s agent first reports chrome.exe and we verify it, every other customer’s chrome.exe inherits the same verification.

How the four chips fit together

The chips are ordered left-to-right by cost-to-execute and answer-strength. Each one is cheap insurance against the previous being wrong. None is a verdict on its own — the cluster is the verdict. SIG asks who claims to have made this; NSRL asks has a federal lab catalogued it as known-good; MBZ asks have public researchers confirmed it as malware; VT asks what does the commercial AV ecosystem think. Different questions, different evidence types — cluster-reading is what gets you to a real answer rather than over-trusting any single source.

One sequencing rule worth knowing

If SIG matches our trusted-publisher allowlist (Microsoft, Apple, Google, Mozilla, Adobe, and a small number of other major OS / runtime vendors), the remaining chips are intentionally short-circuited and you’ll see MBZ — / VT —. The dashes mean “skipped, signer-shortcut decided the verdict first” — not “we forgot to check.” The shortcut exists because signed Microsoft binaries account for the bulk of executables in any Windows fleet, and a Microsoft-signed binary failing the rest of the cluster would imply a Microsoft signing-key compromise — the kind of event surfaced through channels other than a 70-engine AV scan. The savings let us afford full coverage on the long tail where the cluster actually changes the answer. The deep dive on this is in the SIG entry.
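The sequencing rule can be sketched as a short-circuit over the remaining lookups. This is an illustration under assumptions: the lookup callables are stubs, the publisher match is simplified to an exact name, and SKIPPED stands for the dash the dashboard renders.

```python
from typing import Callable, Optional

# Publishers named in the text; the real allowlist is longer.
TRUSTED_PUBLISHERS = {"Microsoft", "Apple", "Google", "Mozilla", "Adobe"}
SKIPPED = "skipped"  # rendered as a dash on the dashboard


def run_cluster(
    sha256: str,
    signer: Optional[str],
    lookups: dict[str, Callable[[str], str]],
) -> dict[str, str]:
    """Run SIG first; skip the remaining chips when the signer is trusted."""
    chips = {"SIG": signer or "unsigned"}
    shortcut = signer in TRUSTED_PUBLISHERS
    for name in ("NSRL", "MBZ", "VT"):  # same order as the chips
        chips[name] = SKIPPED if shortcut else lookups[name](sha256)
    return chips


calls: list[str] = []


def stub(name: str, answer: str) -> Callable[[str], str]:
    def lookup(sha256: str) -> str:
        calls.append(name)  # record which lookups actually ran
        return answer
    return lookup


lookups = {
    "NSRL": stub("NSRL", "no record"),
    "MBZ": stub("MBZ", "not listed"),
    "VT": stub("VT", "0/68"),
}
trusted = run_cluster("ab" * 32, "Microsoft", lookups)
unknown = run_cluster("cd" * 32, None, lookups)
print(trusted, unknown, calls)
```

Only the unsigned binary triggers the three lookups, which is where the cost saving described above comes from.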

SIG — code-signing verification

Available on every tier, including Individual

What it is

The first chip in the four-chip evidence cluster on every binary. SIG asks one question: does this file carry a valid code-signing signature, and from whom? The agent verifies the signature chain on disk before any network traffic involving the binary — Authenticode on Windows, codesign on macOS, dpkg/GPG package signatures on Debian-class Linux — extracts the publisher common name, and matches the publisher against a hardcoded trusted-publisher allowlist (Microsoft, Apple, Google, Mozilla, Adobe, and other major OS / runtime vendors).

Why it matters

A valid signature is identity, not safety. It tells you which publisher claims to have produced this binary; it does not tell you the binary is benign. Stuxnet was signed. CCleaner’s 2017 supply-chain compromise was signed. SolarWinds Orion was signed. But the absence of a signature on a Windows binary is itself a meaningful negative signal — almost every legitimate Windows application ships signed today, and unsigned Windows executables in your fleet deserve a closer look than signed ones.

How the trusted-publisher shortcut works

If SIG matches the trusted-publisher allowlist, NSRL / MBZ / VT are intentionally short-circuited. You’ll see MBZ — / VT — on those rows. This isn’t cost-cutting theatre: the trusted-publisher set is small enough (~20 names) that a Microsoft-signed binary failing the rest of the cluster would imply Microsoft’s signing infrastructure had been compromised — the kind of event discovered through forensic incident response, not through a 70-engine AV scan. The savings — signed Microsoft binaries account for ~80% of executables we observe across the fleet — let us afford the paid VirusTotal coverage on the long tail of unsigned and unfamiliar binaries where the cluster actually changes the answer.

What SIG can’t tell you

Self-signed certificates can verify successfully but mean nothing about the publisher’s identity. Stolen signing keys exist; revoked certificates aren’t always promptly enforced on the endpoint. And critically: a signed installer can drop unsigned payloads at runtime — the chip on setup.exe tells you nothing about helper.dll three directories deeper. The cluster reckons with this by hashing every binary the agent observes opening a network connection, not just the original installer.

See also: Executable reputation (the umbrella) · NSRL (the next chip).

NSRL — NIST known-good catalog

Available on every tier, including Individual

What it is

The second chip. NSRL asks: has a federal lab catalogued this exact binary as part of a shipped product? The National Software Reference Library is a US National Institute of Standards and Technology corpus — SHA-256 hashes of binaries pulled from commercial software, government-procured packages, and major open-source distributions. A hit returns vendor, product name, and product version: “this hash shipped as Microsoft Word 16.0.19929.20090.”

Why it matters

An NSRL hit is the strongest single piece of legitimacy evidence available for free. A federal lab put this byte sequence in their corpus because it shipped from a real vendor’s real distribution. It resists obfuscation that bypasses string-based identification: the binary’s filename can be renamed, its publisher field stripped, its install path obscured, but the hash either matches NIST’s catalogue or it doesn’t. Forging a different file that matches a catalogued SHA-256 hash is computationally infeasible.

What an NSRL miss does not mean

The corpus is large but not exhaustive, and its coverage is uneven. Heavily catalogued: stable Microsoft / Apple OS binaries, mainstream commercial software (Office, Adobe), government-procured packages. Sparsely catalogued or absent: Electron-based applications (Slack, Discord, VS Code, Notion, Claude Desktop, etc.), almost all macOS third-party software, almost all Linux package binaries, anything that updates frequently (browsers, dev tools, Steam games). The chip uses NSRL ø for “no record — not necessarily clean, just not catalogued” precisely so users don’t misread it as a flag. Absence of evidence is not evidence of absence.

How the lookup runs

The corpus is ~12 GB of hashes, mirrored locally as SQLite on the reputation provider. Lookups happen on-prem with no external network call. Your fleet’s hashes never leave your reputation provider for this check — one of the reasons NSRL doesn’t carry the hash-only privacy lozenge that MBZ and VT do.
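A local lookup of this kind needs nothing beyond a keyed query against the mirror. The sketch below uses an in-memory SQLite table with a hypothetical schema (nsrl with sha256, vendor, product, version columns) and made-up vendor data; the real mirror's layout is not described here.

```python
import hashlib
import sqlite3
from typing import Optional


def nsrl_lookup(conn: sqlite3.Connection, sha256: str) -> Optional[tuple]:
    """Return (vendor, product, version) for a catalogued hash, or None for 'no record'."""
    return conn.execute(
        "SELECT vendor, product, version FROM nsrl WHERE sha256 = ?",
        (sha256.lower(),),
    ).fetchone()


conn = sqlite3.connect(":memory:")  # stands in for the local mirror file
conn.execute(
    "CREATE TABLE nsrl (sha256 TEXT PRIMARY KEY, vendor TEXT, product TEXT, version TEXT)"
)
known = hashlib.sha256(b"pretend this is a shipped binary").hexdigest()
conn.execute("INSERT INTO nsrl VALUES (?, ?, ?, ?)", (known, "ExampleVendor", "ExampleApp", "1.0"))

hit = nsrl_lookup(conn, known.upper())  # case-insensitive on input
miss = nsrl_lookup(conn, hashlib.sha256(b"something else").hexdigest())
print(hit, miss)
```

A None result maps to the "no record" reading of the chip, not to a flag.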

See also: Executable reputation · MalwareBazaar (the next chip).

MBZ — MalwareBazaar known-bad catalog

Available on every tier, including Individual

Hash-only. We send a SHA-256 fingerprint to MalwareBazaar, never the binary itself. Why this matters →

What it is

The third chip. MBZ asks: has this exact binary been confirmed as malware by the public security-research community? abuse.ch’s MalwareBazaar is a curated, free-tier malware-sample repository where security analysts upload confirmed-malicious samples with family / signature labels (Emotet, Cobalt Strike, Lumma Stealer, AsyncRAT, etc.). A hit returns the malware-family signature; a miss reads as “not on the public known-bad list.”

Why it matters

High specificity. When MBZ confirms a hash, it’s almost always right — these are samples vetted by analysts, not heuristic guesses. False-positive rate is essentially zero. A hit is the fastest path to certainty in the entire cluster: we don’t need to wait for any other source if MalwareBazaar already names the malware family. And it’s free — abuse.ch’s public API requires only a free API key, with no commercial licensing tier needed for the volume our reputation provider runs. (Our internal docs previously called this paywalled; that was incorrect.)

What an MBZ miss does not mean

Coverage is the inverse of NSRL: only confirmed-bad samples, never “things that might be bad.” The corpus skews toward malware families that have been reverse-engineered and dissected publicly, which means freshly-deployed campaigns or narrowly-targeted threats won’t be there yet: sometimes for days, sometimes never. The chip’s MBZ ✓ reads as “we checked the public known-bad corpus and you’re not on it,” which is reassuring but not a clean bill of health. That’s why VirusTotal is the next chip and not the last word on its own.

External verification

Each MBZ chip on the dashboard links directly to bazaar.abuse.ch/sample/<hash>/ so you can verify our reading independently. We invite the scrutiny — if MalwareBazaar says something different about a binary than our chip does, we want to hear about it.

See also: Executable reputation · VirusTotal (the next chip) · Hash-only privacy boundary.

VT — VirusTotal aggregated AV verdict

Available on every tier, including Individual

Hash-only. We send a SHA-256 fingerprint to VirusTotal, never the binary itself. Why this matters →

What it is

The fourth chip. VT asks: of 60–70 commercial antivirus engines, how many flag this binary as malicious? VirusTotal aggregates daily scan results from the major AV vendors (Microsoft Defender, Kaspersky, Bitdefender, ESET, Symantec, Sophos, CrowdStrike, etc.) and exposes a per-hash summary plus a per-engine breakdown. A hit returns a ratio like 3/68 — three engines flagged it out of sixty-eight scans — with engine names available via the linked URL.

Why it matters

Where MBZ tells you what researchers have confirmed, VT tells you what the AV ecosystem currently thinks. Different question, different value. A binary that scores 0/68 on VT is materially better evidence of legitimacy than one with no VT data at all — even if every other chip in the cluster is positive. And the per-engine breakdown matters: clicking the chip reveals the difference between “1/68 from one engine known for false positives” (probably ignore) and “42/68 with consensus across major vendors” (pull this binary off the network now). One number, lots of context behind it.

What VT can’t tell you

AV engine consensus lags fresh threats — a malware family deployed yesterday may have zero VT detections today, even from engines that will detect it next week. Public-tier API quotas are tight; the highest-volume fleets need a VirusTotal Enterprise license to keep up with their lookup queue, which is part of why commercial-derived threat data is gated above the Individual tier. And like every other chip in the cluster, “not on VT” means unknown, not safe.

External verification

Each VT chip on the dashboard links to virustotal.com/gui/file/<hash> so you can see exactly which engines flagged a binary, when, and with what classification. The same scrutiny invitation as MalwareBazaar: if VT says something different than our chip, we want to hear about it.

See also: Executable reputation · Commercial-derived threat data (where VT Enterprise coverage lives) · Hash-only privacy boundary.

Exposed infrastructure services

Available on every tier, including Individual

What it is

The agent carries a catalog of 169 services across 8 categories (databases, file sharing, directory & auth, remote management, message queues, internal web, virtualization, backup) that should never answer on the public internet unless they’re inside a VPN. On every outbound flow the agent checks the destination port/protocol against the catalog. If the destination is a public IP, the flow is tagged.
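The per-flow check can be sketched as a catalog lookup followed by a public-address test. The five ports below are the ones named in this entry; the category assigned to each, and the tag_flow helper itself, are assumptions for illustration. The real catalog holds 169 services.

```python
import ipaddress
from typing import Optional

# Five entries using ports named in this entry; category labels are assumed.
CATALOG = {
    (1433, "tcp"): ("MSSQL", "databases"),
    (27017, "tcp"): ("MongoDB", "databases"),
    (6379, "tcp"): ("Redis", "databases"),
    (3389, "tcp"): ("RDP", "remote management"),
    (445, "tcp"): ("SMB", "file sharing"),
}


def tag_flow(dst_ip: str, dst_port: int, protocol: str) -> Optional[tuple]:
    """Return (service, category) when a catalogued port is reached on a public IP."""
    service = CATALOG.get((dst_port, protocol))
    if service is None:
        return None
    if not ipaddress.ip_address(dst_ip).is_global:
        return None  # private, loopback, link-local and similar ranges
    return service


public_rdp = tag_flow("8.8.8.8", 3389, "tcp")
internal_rdp = tag_flow("10.0.0.5", 3389, "tcp")
public_https = tag_flow("8.8.8.8", 443, "tcp")
print(public_rdp, internal_rdp, public_https)
```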

Why it matters

The single most common “how did we get breached” root cause is an internal service accidentally exposed to the internet. MSSQL on 1433, MongoDB on 27017, Redis on 6379, RDP on 3389, SMB on 445. Tagged flows surface as critical matches that fire instant tenant-admin alerts; admins get a one-click Block this destination action that pushes a firewall rule to every agent within 60 seconds.

Why it’s included on every tier

Same reasoning as the blocklist: it’s a safety feature, not a premium one. Tier-gating it would be hostile.

Tenant infrastructure

Per-tenant dashboard URL

Available on Business and Enterprise

What it is

Your tenant gets its own subdomain (e.g. yourcompany.tenant.datastun.com) with dedicated DNS, dedicated TLS termination, and an isolated dashboard surface. The shared tenant.datastun.com hostname is for Individual and Tribe tiers; paid tiers move up to per-tenant hosts.

Why it matters

Cleaner sharing with stakeholders ("here’s our security dashboard at yourcompany.tenant.datastun.com"), audit-ready URLs in compliance reports, less exposure if a screenshot leaks (there is no shared hostname to suggest a shared tenant), and the right plumbing for SSO bindings later (SAML AssertionConsumerServiceURLs are per-host).

How it works

Each tenant on Business+ gets its own DNS record under *.tenant.datastun.com backed by a wildcard TLS certificate (B3 of iteration 2). The hostname routes through the same Cloudflare tunnel, so onboarding is just a tenant-platform DB row + DNS issuance — no per-tenant infra to spin up.

Agent rules + custom blocklists

Available on Business and Enterprise

What it is

Tenant-scoped blocklist overrides distributed to every agent on your tenant within 60 seconds. Block a specific destination IP, CIDR range, domain, or executable hash — only on your fleet, without affecting other tenants. Paired with rules for soft signals: warn me when X happens, page me when Y happens, allow this destination that the global blocklist would otherwise refuse.

Why it matters

Every business has its own permitted-vendor list, its own internally approved apps, and its own list of "we know that’s a known-bad IP block but our partner Acme Corp lives there and we vouch for them." The override surface is what lets you align DataStun’s default policies with your specific operational reality without compromising the global enforcement primitive.

How it works

Per-tenant overrides are stored alongside the global blocklist in reputation.blocklist_entries with a tenant_id scope; the agent fetches the union on its regular blocklist poll and enforces both. Override audit history is preserved so a SOC analyst can answer "who added this rule, when, and why."
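One way to picture the tenant-scoped union, including the allow override mentioned earlier in this entry, is the sketch below. The BlocklistEntry fields, the action values, and the exact-match semantics for allows are all assumptions; a real implementation would also need to handle an allow that covers only part of a blocked CIDR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BlocklistEntry:
    target: str
    action: str               # "block" or "allow" (hypothetical values)
    tenant_id: Optional[str]  # None marks a global entry


def effective_blocklist(entries: list[BlocklistEntry], tenant_id: str) -> set[str]:
    """Global blocks plus this tenant's blocks, minus this tenant's allows."""
    in_scope = [e for e in entries if e.tenant_id in (None, tenant_id)]
    blocked = {e.target for e in in_scope if e.action == "block"}
    allowed = {
        e.target for e in in_scope
        if e.action == "allow" and e.tenant_id == tenant_id
    }
    return blocked - allowed


entries = [
    BlocklistEntry("198.51.100.0/24", "block", None),
    BlocklistEntry("203.0.113.5", "block", "acme"),
    BlocklistEntry("198.51.100.0/24", "allow", "acme"),
    BlocklistEntry("192.0.2.1", "block", "other"),
]
acme = effective_blocklist(entries, "acme")
other = effective_blocklist(entries, "other")
print(sorted(acme), sorted(other))
```

The allow applies only on the tenant that set it, so other tenants keep the global block.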

AI Governance

AI Governance dashboard

Available on Business and Enterprise

What it is

A cross-fleet dashboard surfacing bytes uploaded to each AI vendor (Anthropic, OpenAI, Microsoft Copilot, GitHub Copilot, Google Gemini, xAI Grok, Perplexity, Cursor, Mistral, Cohere, Hugging Face, Ollama, DeepSeek, Meta AI, and the long tail), sliced by application, by machine, and over time. 50+ providers catalogued. The volume-and-attribution view of corporate AI adoption.

Why it matters

AI adoption has outpaced corporate AI policy at every company we’ve talked to. Three things every organization needs to know that nobody can answer today without metadata visibility: which AI tools are actually in use, how much data is leaving for them, and from which department. AI Governance answers all three. Compliance teams get audit trails; risk officers get exposure quantification; engineering managers get the AI-adoption-denominator one click away.

How it works

Pure metadata. The agent already sees DNS resolutions, the executable behind every TCP session, and the byte counts each direction. The AI Governance pipeline curates the AI-vendor catalog (publisher signature on the desktop app, plus destination-domain matching on the wire) and reprojects existing telemetry into the AI-adoption view. TLS hides the prompts and responses. We never see content; we see how many bytes left, for which provider, from which app, on which machine. This is intentional and load-bearing for the privacy story.
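The reprojection can be sketched as a domain-suffix match followed by a byte-count rollup. The two catalog entries, the flow dictionary keys, and the helper names are illustrative assumptions; the real catalog also uses publisher signatures, which this sketch leaves out.

```python
from collections import defaultdict
from typing import Optional

# Two illustrative entries; the real catalog covers 50+ providers.
AI_VENDOR_DOMAINS = {"anthropic.com": "Anthropic", "openai.com": "OpenAI"}


def vendor_for(host: str) -> Optional[str]:
    """Match a hostname against the catalog by walking up its domain suffixes."""
    labels = host.lower().rstrip(".").split(".")
    for i in range(len(labels)):
        vendor = AI_VENDOR_DOMAINS.get(".".join(labels[i:]))
        if vendor is not None:
            return vendor
    return None


def bytes_by_vendor(flows: list[dict]) -> dict:
    """Sum bytes_out per (vendor, app, machine); flow content is never inspected."""
    totals = defaultdict(int)
    for flow in flows:
        vendor = vendor_for(flow["host"])
        if vendor is not None:
            totals[(vendor, flow["app"], flow["machine"])] += flow["bytes_out"]
    return dict(totals)


flows = [
    {"host": "api.anthropic.com", "app": "claude.exe", "machine": "m1", "bytes_out": 1000},
    {"host": "api.anthropic.com", "app": "claude.exe", "machine": "m1", "bytes_out": 500},
    {"host": "api.openai.com", "app": "cursor.exe", "machine": "m2", "bytes_out": 200},
    {"host": "notanthropic.com", "app": "curl.exe", "machine": "m2", "bytes_out": 999},
]
totals = bytes_by_vendor(flows)
print(totals)
```

Matching on whole labels rather than substrings keeps look-alike domains such as notanthropic.com out of the rollup.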

What this isn’t

This is not a Data Loss Prevention product. We don’t inspect content; we don’t MITM TLS; we don’t proxy traffic. If your buyer needs prompt-level inspection, AI Governance is the wrong tool. The honest framing matters: knowing what we don’t see is part of the trust story.

Fleet analytics — Business+

Fleet SBOM — inventory + usage + data-flow

Available on Business and Enterprise

What it is

Auto-generated, continuously-updated software bill of materials covering every distinct executable hash on your fleet. For each binary you get: identity (name, version, code-signer, hash, OS), saturation (count and % of agents running it, first-/last-seen timestamps), actual usage (sessions per day, bytes moved, time-of-day patterns, distinct hours active — how much the binary actually runs, not just whether it’s installed), and data-flow attribution (which external destinations the binary’s data goes to and comes from, byte volume per destination, country breakdown, vendor attribution). Searchable and filterable across all four dimensions.

Why it matters

The classic SBOM use case — "CVE-2026-1234 in libxml2 ≤ 2.12.4, are we exposed?" — is the wedge. Without a fleet SBOM organizations answer that question in days, by asking IT to RDP into a sample of machines. With a fleet SBOM the answer is in seconds: show me every machine running libxml2 ≤ 2.12.4. Auditors love it; SOC 2 wants it; vulnerability management gets a real input feed instead of guesses.

But the usage + data-flow layer turns SBOM from a security-only feature into a software-economics tool too:

Install vs use gap — every fleet has dozens of binaries that are installed everywhere but actually run on only a fraction of machines. "You have Adobe Acrobat on 200 agents; only 47 opened it in the last 30 days." Same data point that powers SaaS license reconciliation, applied to local software.
External phone-home detection — a tool that’s supposed to be internal-only is suddenly making external calls. "InternalCRM.exe is sending 300 MB/day to a vendor we don’t recognize." That’s a security signal that pure inventory misses.
Vendor concentration per binary — for any specific binary, see the top external destinations and their countries. Useful for compliance ("does our HR app talk to anywhere outside the EU?") and for vendor risk ("how dependent is this tool on this third-party API?").
Active hours — binaries that run only during work hours vs binaries that run continuously vs binaries that run at 3 AM. The 3 AM column is where attacker tooling tends to surface.

How it works

The agent already hashes every executable that opens a network connection and ships the hash + signer metadata up with telemetry. Crucially, every flow is also already attributed back to the executable that opened it — so the bytes-out / bytes-in / destination / country data we collect per session can be aggregated per binary at zero new collection cost. The SBOM rollup runs nightly and produces, for each (tenant, binary, day): saturation %, active-agents count, session count, bytes moved each direction, top external destinations with byte share, top destination countries with byte share, time-of-day histogram. Saturation percentages are computed against the agent count active in the same window.
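The saturation part of the rollup can be sketched as a distinct-agent count per hash divided by the number of agents active in the same window. The observation shape and function name are hypothetical.

```python
from collections import defaultdict


def saturation(observations: list[tuple], active_agents: set[str]) -> dict:
    """Map each hash to (agents running it, % of agents active in the window)."""
    agents_by_hash = defaultdict(set)
    for agent_id, sha256 in observations:
        if agent_id in active_agents:  # ignore agents outside the window
            agents_by_hash[sha256].add(agent_id)
    total = len(active_agents)
    return {
        sha256: (len(agents), 100.0 * len(agents) / total)
        for sha256, agents in agents_by_hash.items()
    }


active = {"a1", "a2", "a3", "a4"}
observations = [
    ("a1", "h1"), ("a2", "h1"), ("a1", "h1"),  # repeat sessions count once
    ("a3", "h2"), ("a9", "h2"),                # a9 was not active in this window
]
result = saturation(observations, active)
print(result)
```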

What "external" means here

Destinations outside the agent’s tenant-configured private networks (RFC1918 plus any tenant-supplied private CIDRs at /account/networks). Internal traffic stays internal in the report; the external column is what surfaces in the data-flow attribution.
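The definition above translates directly into a membership test. A minimal sketch using Python's standard ipaddress module; the is_external name and the tenant CIDR in the example are illustrative.

```python
import ipaddress

RFC1918 = [
    ipaddress.ip_network(n)
    for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_external(destination: str, tenant_cidrs=()) -> bool:
    """True when the destination falls outside RFC1918 and every tenant-supplied CIDR."""
    address = ipaddress.ip_address(destination)
    private = RFC1918 + [ipaddress.ip_network(c) for c in tenant_cidrs]
    return not any(address in network for network in private)


print(is_external("192.168.1.10"))
print(is_external("8.8.8.8"))
print(is_external("100.64.1.1"))
print(is_external("100.64.1.1", ["100.64.0.0/10"]))
```

The last two calls show why tenant-supplied CIDRs matter: a carrier-grade NAT or overlay range counts as external until the tenant declares it private.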

Buyer pitches

"We want a real-time vulnerability map of our fleet that doesn’t require deploying another agent." "We’re paying for software 30% of our fleet has installed but never opens." "We need to know if any of our internal tools is calling out to the public internet." "Auditors keep asking which versions of which binary on which machines, and we keep guessing."

First-seen radar

Available on Business and Enterprise

What it is

A daily-updated feed of every (executable, destination, code-signer) the fleet has never seen before, ranked by spread velocity — how many agents the new artifact is already on, divided by how long it’s been observed. Click any row to pivot into reputation, AI Governance, or destination details. Optional in-app or email alert when a row crosses a tenant-configurable spread-velocity threshold.

Why it matters

The earliest possible signal for two very different but equally important phenomena. For security: a credential stealer hitting machine #1 today and machine #50 tomorrow is the canonical lateral-movement signature; first-seen + spread velocity catches it before machine #50. For governance: a new SaaS tool spreading via Slack-link from two adopters today to forty next week is the canonical shadow-IT-viral-adoption pattern; first-seen catches it before procurement notices the surprise renewal invoice.

How it works

An hourly delta against a rolling 90-day "known set" of executables, destinations, and signers. New entries are inserted into analytics.first_seen with first-seen timestamp and initial agent count; spread velocity is recalculated on each pass.
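Spread velocity, as defined above, is agent count divided by observation time. The sketch below assumes hours as the time unit and a one-hour floor so that a brand-new artifact does not divide by a near-zero duration; both choices are assumptions, not documented behaviour.

```python
from datetime import datetime, timedelta


def spread_velocity(agent_count: int, first_seen: datetime, now: datetime) -> float:
    """Agents per hour since first sighting, with a one-hour floor."""
    hours = max((now - first_seen).total_seconds() / 3600.0, 1.0)
    return agent_count / hours


t0 = datetime(2026, 1, 1, 9, 0)
now = t0 + timedelta(hours=24)
rows = [
    ("new-tool.exe", 48, t0),                         # 48 agents over 24 hours
    ("updater.exe", 3, now - timedelta(minutes=10)),  # 3 agents, just appeared
]
ranked = sorted(rows, key=lambda r: spread_velocity(r[1], r[2], now), reverse=True)
print([name for name, _, _ in ranked])
```

A small artifact that appeared minutes ago can outrank a larger one that took a day to spread, which is the point of ranking by velocity rather than raw count.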

Honest limitation

Works at any size, but the signal is sharper above ~50 agents. At 10 agents the radar will surface every new arrival; at 500 it surfaces only the ones spreading meaningfully. A tunable threshold lets you dial down the false positives.

Vendor concentration map

Available on Business and Enterprise

What it is

A stacked-bar chart and Sankey diagram of outbound bytes per cloud / SaaS vendor (AWS, Azure, GCP, Cloudflare, Salesforce, Notion, Slack, Zoom, Anthropic, OpenAI, etc.) over time. Sliceable by department / location / cost-center tag. Includes a "what if" calculator: pick a vendor, see what fraction of your outbound traffic disappears.

Why a buyer cares

Three audiences want this answered, and none can answer it today without manual work. (1) Business continuity: "If AWS us-east-1 vanished, what stops?" (2) Procurement: "Are we paying for three CDNs when our traffic is 92% Cloudflare?" (3) Security: "Why is our marketing department uploading more bytes to Russia than our research department?" One pane, one chart, all three answers.

How it works

Vendor extraction is an enriched copy of the AI Governance vendor catalog generalized to all clouds + SaaS. Per-day rollup of bytes-out per vendor per tag; the UI handles the chart math and filter UX.

What you need to make it work well

Department / location tags on agents. Without tags you still get the fleet-wide vendor breakdown; with tags you get the slice-and-dice that makes the chart actionable. Tags are part of the F0-3 foundation work in the build plan.

Per-machine deviation score

Available on Business and Enterprise

What it is

Every agent gets a daily score representing how unlike the rest of the fleet its destinations and executables look. The top 1% of deviant agents surfaces as "review this machine" with a small ranked list of contributing factors ("3 unique destinations not seen by any other agent in the last 30 days", "running an unsigned binary nobody else runs", "uploaded 14× the fleet median bytes today").

Why it matters

Compromised machines, insider misuse, and accidentally-enrolled BYOD all manifest as outliers from a healthy fleet baseline. A SOC analyst can’t watch every machine; a deviation score is a cheap pre-filter that lets the analyst spend their time on the dozen or two machines that actually look weird, instead of staring at a global feed.

How it works

Score = weighted sum of (unique destinations, unique exes, unique signers, traffic-volume z-score) computed against the tenant’s rolling 30-day baseline. Weights are tunable per tenant. Contributing factors are stored alongside the score so the dashboard can show why a machine ranks high, not just that it does.
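The weighted sum can be sketched as follows. The feature names, the example weights, and the four-value baseline are illustrative assumptions; the real baseline spans 30 days and the real weights are tenant-tunable.

```python
from statistics import mean, pstdev


def z_score(value: float, baseline: list[float]) -> float:
    """Standard score of value against the baseline; 0.0 when the baseline is flat."""
    spread = pstdev(baseline)
    return 0.0 if spread == 0 else (value - mean(baseline)) / spread


def deviation_score(features: dict, weights: dict):
    """Return (score, contributing factors sorted by weighted contribution)."""
    contributions = {name: weights[name] * features[name] for name in weights}
    factors = sorted(contributions.items(), key=lambda kv: kv[1], reverse=True)
    return sum(contributions.values()), factors


features = {
    "unique_destinations": 3,
    "unique_exes": 1,
    "unique_signers": 0,
    "traffic_z": z_score(30.0, [10.0, 10.0, 20.0, 20.0]),
}
weights = {
    "unique_destinations": 2.0,
    "unique_exes": 1.0,
    "unique_signers": 1.0,
    "traffic_z": 0.5,
}
score, factors = deviation_score(features, weights)
print(score, factors[0])
```

Returning the sorted contributions alongside the total is what lets a dashboard explain why a machine ranks high.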

Honest limitation

Works at N≥10 but the ranking sharpens noticeably above 50. We show a "low-confidence" badge under 50 so you don’t over-trust the signal at small scale.

Patch-lag scoreboard

Available on Business and Enterprise

What it is

Per-product histogram of which versions your fleet runs vs current GA, for ~30 common products (browsers, OS components, runtimes, archive utilities, OpenSSH, Office, etc.). "Chrome 110 still on 12 of your 400 machines." Drill in to the list of agents per outdated version.

Why it matters

Vulnerability management without an MDM. Auditors get the version histogram; IT gets the work list; the executive dashboard gets a green-yellow-red bar that tracks over time. Every infosec framework asks for "demonstrate version-lag tracking on commonly-vulnerable software"; this is that demonstration.

How it works

A small JSON catalog ships with ten, listing each tracked product, a regex that extracts the version from the executable name + signer fields, and the current GA version. The catalog is curated weekly. The patch-lag rollup walks the SBOM rollup, attributes each row to a product via the catalog, and writes the version-share per day.
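The version-share step can be sketched once rows have been attributed to a product. This sketch assumes the version string has already been extracted and is purely numeric and dot-separated; the catalog regex itself is not reproduced here, and the function names are hypothetical.

```python
from collections import Counter


def parse_version(version: str) -> tuple:
    """Turn '110.0' into (110, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def version_histogram(rows: list[tuple], current_ga: str) -> list[tuple]:
    """rows: (version, agent_count) pairs for one product.

    Returns (version, agents, outdated) tuples sorted oldest first.
    """
    counts = Counter()
    for version, agents in rows:
        counts[version] += agents
    ga = parse_version(current_ga)
    return [
        (version, agents, parse_version(version) < ga)
        for version, agents in sorted(counts.items(), key=lambda kv: parse_version(kv[0]))
    ]


histogram = version_histogram(
    [("124.0", 380), ("110.0", 12), ("124.0", 8)],
    current_ga="124.0",
)
print(histogram)
```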

Why this isn’t a full patch-management tool

We tell you what’s running and what’s out of date. We don’t deploy patches; that’s your MDM’s job. The honest scope: a vulnerability-management input, not an MDM replacement.

SaaS license reconciliation

Available on Business and Enterprise

What it is

Per-application unique-machine-per-month counts derived from binary observations + destination-vendor data. "You licensed 50 Adobe Acrobat seats; 67 distinct machines opened acrobat.exe last month. Top 17 violators →." Same for Office, Slack, Cursor, Figma, Notion, GitHub Desktop, Zoom, and a curated catalog of ~50 common SaaS apps. Tenant can extend the catalog with their own definitions.

Why a buyer cares

Most companies are simultaneously over-licensed (paying for unused seats) and under-licensed (people using personal accounts or sharing logins). Reconciling to actual fleet usage typically recovers more money than the tier costs. Every Business+ buyer we’ve shown this to has had at least one "wait, we’re paying for HOW many of those?" reaction.

How it works

"App" is inferred from process name + destination vendor (acrobat.exe + adobe.com → Adobe Acrobat). Per-month rollup of unique-agent-per-app. A small admin form lets the tenant input licensed counts, after which the dashboard adds a variance column with a $-impact estimate per row.

Privacy note

We see processes opening network connections; we don’t see who’s logged in to which SaaS app. License reconciliation is per-machine, not per-user. For per-user attribution your IDP / SCIM / SSO tooling is the right surface.
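The per-machine reconciliation described in this entry can be sketched as a distinct-machine count per inferred app, compared against the licensed seat count. The catalog entry mirrors the example in the text; the observation shape and function names are assumptions, and the $-impact column is left out because it depends on seat prices the tenant supplies.

```python
from collections import defaultdict

# (process name, destination vendor domain) -> app; mirrors the example in the text.
APP_CATALOG = {("acrobat.exe", "adobe.com"): "Adobe Acrobat"}


def monthly_machines(observations: list[tuple]) -> dict:
    """Count distinct machines per app from (agent_id, process, vendor_domain) rows."""
    machines = defaultdict(set)
    for agent_id, process, vendor_domain in observations:
        app = APP_CATALOG.get((process.lower(), vendor_domain.lower()))
        if app is not None:
            machines[app].add(agent_id)
    return {app: len(agents) for app, agents in machines.items()}


def variance(used: dict, licensed: dict) -> dict:
    """Positive means more machines than seats; negative means unused seats."""
    return {app: used.get(app, 0) - seats for app, seats in licensed.items()}


observations = [(f"agent-{i}", "Acrobat.exe", "adobe.com") for i in range(67)]
observations.append(("agent-0", "acrobat.exe", "adobe.com"))  # repeat use, same machine
used = monthly_machines(observations)
gap = variance(used, {"Adobe Acrobat": 50})
print(used, gap)
```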

Location-aware network health

Available on Business and Enterprise

What it is

Per-location dashboard showing TCP retransmission rate (link quality) and TCP RST rate (middlebox misbehavior) ranked vs your fleet’s own baseline. Surfaces bad VPN gateways, captive portals killing long-lived sessions, MTU black holes on a specific tunnel, NAT port-exhaustion, and DPI appliances dropping flows they don’t recognize. Drill into per-location agent list, top affected processes, top affected destinations.

Why a buyer cares

"The Tampa office has been complaining about Teams for three weeks; nobody can find a cause" is a universal IT story. Location-aware network health turns it into "Tampa has 14× normal RST rate on outbound 443, all of it concentrated on the new firewall vendor; here’s the per-process breakdown." That’s actionable in an afternoon. The data was already there; we’re just rolling it up by location.

RST vs retransmit — what’s the difference

Retransmits expose link quality (lossy Wi-Fi, congested uplink, marginal cabling). RSTs expose middlebox misbehavior (firewall closing sessions early, captive portal not handling renewals, NAT exhausted, DPI blocking unfamiliar protocols). Different signal, different root causes; both useful.

How it works

Agent flow telemetry already includes RST and retransmit counts per session. The rollup groups by the agent’s location tag (defaulting to inferred ASN/Geo if no tag is set) and computes per-location daily aggregates against a 30-day fleet baseline. An optional alert fires when a location crosses N× baseline.
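
As a rough sketch of that rollup, the Python below computes a per-location RST rate for one day and flags locations that exceed a multiple of the fleet baseline. The session tuple layout, the choice of "RSTs per session" as the rate, and all function names are assumptions for illustration; the same shape would apply to retransmit counts.

```python
from collections import defaultdict
from statistics import mean


def location_rst_rates(sessions):
    """Return {location: RSTs per session} for one day of telemetry.

    Each session is (location_tag, rst_count, retransmit_count); the
    tuple layout and the per-session rate are assumptions of this sketch.
    """
    rsts = defaultdict(int)
    count = defaultdict(int)
    for location, rst_count, _retransmits in sessions:
        rsts[location] += rst_count
        count[location] += 1
    return {loc: rsts[loc] / count[loc] for loc in count}


def fleet_baseline(daily_fleet_rates, window=30):
    """Average the most recent `window` daily fleet-wide rates."""
    return mean(daily_fleet_rates[-window:])


def flag_locations(rates, baseline, n=3.0):
    """Return (location, multiple_of_baseline) pairs at or above n, worst first."""
    flagged = [(loc, rate / baseline) for loc, rate in rates.items()
               if baseline > 0 and rate / baseline >= n]
    return sorted(flagged, key=lambda item: item[1], reverse=True)
```

The threshold n is the tunable part: a low value surfaces marginal sites, a high value only surfaces cases like the Tampa example.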

Fleet analytics — Enterprise only

Org-wide executable analysis

Enterprise tier only

What it is

The advanced executable-analytics layer on top of the per-binary executable reputation that ships on every tier. Three detectors: saturation drift (a hash that was signed-Acme on 95 machines is now unsigned on 5), rename detection (a hash that’s always been chrome.exe in Program Files suddenly shows up as svchost.exe in %TEMP%), and low-saturation outliers (binaries that run on <1% of the fleet but aren’t on a known-internal allowlist).

Why a buyer cares

Living-off-the-land malware persists by impersonating known-good binaries. Most EDR tools detect known-bad signatures; cross-fleet hash anchoring detects weird signatures — the same hash appearing under a wrong name, the same name with a wrong hash, the binary that nobody else in your fleet runs. This catches a class of attacks that signature-based tools miss.

How it works

Reuses the SBOM rollup and adds analytics.exec_anomalies with kind ∈ {rename, unsigned_drift, low_saturation_outlier} and severity scoring. Anomaly-inbox UX with ignore / investigate / block actions; ignored binaries become part of your tenant’s known-internal allowlist so the same anomaly doesn’t resurface.
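
The three anomaly kinds can be illustrated with a small Python sketch over per-agent binary observations. Everything here is hypothetical: the observation layout, the "majority name is the expected name" rule, the 1% outlier share, and the function name are assumptions for the example, and severity scoring is left out.

```python
from collections import defaultdict


def find_anomalies(observations, fleet_size, allowlist=frozenset(),
                   outlier_share=0.01):
    """Return (kind, sha256, detail) tuples.

    Each observation is (agent_id, sha256, file_name, is_signed).
    """
    by_name = defaultdict(lambda: defaultdict(set))
    by_signed = defaultdict(lambda: defaultdict(set))
    agents = defaultdict(set)
    for agent, sha, name, signed in observations:
        by_name[sha][name.lower()].add(agent)
        by_signed[sha][bool(signed)].add(agent)
        agents[sha].add(agent)

    found = []
    for sha in sorted(agents):
        if sha in allowlist:  # tenant's known-internal allowlist
            continue
        # rename: any name other than the most widely seen one for this hash
        ranked = sorted(by_name[sha].items(),
                        key=lambda kv: (-len(kv[1]), kv[0]))
        for name, _seen_on in ranked[1:]:
            found.append(("rename", sha, name))
        # unsigned_drift: mostly signed, but unsigned on a minority of agents
        signed_n = len(by_signed[sha][True])
        unsigned_n = len(by_signed[sha][False])
        if signed_n > unsigned_n > 0:
            found.append(("unsigned_drift", sha,
                          f"{unsigned_n} unsigned vs {signed_n} signed"))
        # low_saturation_outlier: runs on a tiny share of the fleet
        if len(agents[sha]) / fleet_size < outlier_share:
            found.append(("low_saturation_outlier", sha,
                          f"{len(agents[sha])}/{fleet_size} agents"))
    return found
```

Passing ignored hashes back in through allowlist mirrors the inbox behavior described above, where an ignored binary stops resurfacing.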

Why it’s Enterprise-only

Signal floor. The detector needs ~50 agents of baseline to be confident in the "expected" set; below that, false positives become frequent enough that shipping it on smaller fleets would erode trust.

Beaconing detector

Enterprise tier only

What it is

Cross-machine search for consistent low-bandwidth periodic outbound connections — the network signature most C2 traffic prints. Output is a ranked list of (process, destination, period, bytes-per-beacon) candidates with the agents involved. Allowlist UI lets the tenant suppress known-good periodic traffic (Slack heartbeats, cloud-agent telemetry, AV phone-home).

Why a buyer cares

Catches dormant or staged compromises that single-machine EDR misses. C2 traffic typically beacons every N seconds with a few hundred bytes per beacon — below the noise floor of any per-machine alerting, but obvious as a pattern across a fleet. Signature of behavior, not signature of binary; works against novel implants where no static signature exists yet.

How it works

Nightly job over flow telemetry: group flows by (agent, process, destination, period_bin), surface candidates with low inter-arrival jitter and small bytes-per-beacon. Tunable thresholds; tenant allowlist for known-good traffic. Expect false positives on first run; the allowlist learning curve flattens them out within a couple of weeks.
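
A minimal Python sketch of that job follows. It simplifies the grouping: flows are grouped per (agent, process, destination), the period is measured from inter-arrival gaps, and the result is then binned and merged across agents. The flow tuple layout, the thresholds, the use of coefficient of variation as the jitter measure, and the function name are all assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev


def beacon_candidates(flows, min_events=6, max_jitter=0.1,
                      max_bytes=1024, period_bin=10):
    """Rank (process, destination, period) groups that look like beacons.

    Each flow is (agent, process, destination, start_seconds, byte_count).
    """
    per_agent = defaultdict(list)
    for agent, process, dest, start, nbytes in flows:
        per_agent[(agent, process, dest)].append((start, nbytes))

    fleet = defaultdict(lambda: {"agents": set(), "bytes": []})
    for (agent, process, dest), events in per_agent.items():
        if len(events) < min_events:
            continue
        events.sort()
        gaps = [b[0] - a[0] for a, b in zip(events, events[1:])]
        period = mean(gaps)
        if period <= 0:
            continue
        jitter = pstdev(gaps) / period  # coefficient of variation
        avg_bytes = mean(n for _, n in events)
        if jitter <= max_jitter and avg_bytes <= max_bytes:
            binned = round(period / period_bin) * period_bin
            entry = fleet[(process, dest, binned)]
            entry["agents"].add(agent)
            entry["bytes"].append(avg_bytes)

    ranked = [
        {"process": p, "destination": d, "period": period,
         "bytes_per_beacon": mean(e["bytes"]), "agents": sorted(e["agents"])}
        for (p, d, period), e in fleet.items()
    ]
    # More agents sharing the same pattern ranks higher.
    return sorted(ranked, key=lambda c: len(c["agents"]), reverse=True)
```

A tenant allowlist would be applied as a filter on (process, destination) before ranking; it is omitted here to keep the sketch short.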

Why it’s Enterprise-only

Needs at least 50–100 agents of baseline noise to separate real beacons from false positives cleanly. Below that scale you’d get either a flood of false positives or a detector that misses real C2 because it can’t distinguish "this binary always beacons" from "this binary suddenly started beaconing on machine N+1".

Data-sovereignty rollup

Enterprise tier only

What it is

Bytes uploaded (and downloaded) per destination country, sliced by department / location tag, over arbitrary time ranges. Filter to "non-EU destinations from EU-tagged agents" or "non-US destinations from agents tagged regulated-team." Stacked-bar chart over time + top-destinations table with vendor and country attribution. Export-ready PDF with formal compliance heading, tenant ID, time range, filter criteria, and a notarized timestamp.

Why a buyer cares

GDPR, Schrems II, sectoral data-residency rules. Today these are answered with surveys ("what services do you use? do they store data in the EU?") because the tools that could answer with measurement either don’t exist or require expensive DPI / proxy infrastructure. Data-sovereignty rollup answers the question with measurement, with no DPI, no MITM, and no content access. Compliance officers will pay for this feature alone.

How it works

Destination country comes from existing GeoIP enrichment. Source tag is the agent’s location/department tag. Per-day rollup of (destination_country, source_tag) byte totals. Filter UI is a small expression composer; PDF export uses the same auditor-ready template as compliance reports.
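
The per-day rollup and a filter such as "non-EU destinations from EU-tagged agents" might look like the Python sketch below. The flow tuple layout and function names are assumptions for illustration; country codes are ISO 3166-1 alpha-2, as a GeoIP lookup would typically return.

```python
from collections import defaultdict

# ISO 3166-1 alpha-2 codes of the 27 EU member states.
EU_COUNTRIES = frozenset({
    "AT", "BE", "BG", "HR", "CY", "CZ", "DK", "EE", "FI", "FR", "DE", "GR",
    "HU", "IE", "IT", "LV", "LT", "LU", "MT", "NL", "PL", "PT", "RO", "SK",
    "SI", "ES", "SE",
})


def daily_rollup(flows):
    """Sum bytes per (day, destination_country, source_tag).

    Each flow is (day, source_tag, destination_country, bytes_up, bytes_down).
    """
    totals = defaultdict(lambda: [0, 0])
    for day, tag, country, up, down in flows:
        bucket = totals[(day, country, tag)]
        bucket[0] += up
        bucket[1] += down
    return {key: tuple(value) for key, value in totals.items()}


def non_eu_from_tag(totals, source_tag):
    """Keep rows for one source tag whose destination is outside the EU."""
    return {key: value for key, value in totals.items()
            if key[2] == source_tag and key[1] not in EU_COUNTRIES}
```

Only byte counts and country attribution are involved, which is the same volume-only scope the limitations below describe.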

Honest limitations

This is volume + attribution, not content inspection. We can prove that bytes left for a destination in country X; we can’t prove what was in those bytes. For most data-residency questions ("did we send any traffic at all from EU agents to non-EU destinations") that’s exactly the right scope. For "did we leak this specific document" it’s the wrong tool — you want a DLP product, and we are deliberately not a DLP product.

Identity, compliance, support

Hash-only — the privacy boundary on file reputation

Applies on every tier, including Individual

What it is

For every external file-reputation lookup — MalwareBazaar, VirusTotal, any future third-party source — we send a SHA-256 fingerprint of the binary. We do not send the binary itself. Our code has no upload path to any third-party file-analysis service. The hash is a 32-byte fingerprint that uniquely identifies the binary if you already have the original; it is not the original. Internally-developed code, license-restricted software, and any bytes specific to your environment never leave the machine they run on.
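
To make the boundary concrete, here is a minimal Python sketch of computing a SHA-256 fingerprint locally with the standard hashlib module. The function name and chunk size are illustrative choices, not the agent's actual code; the point is that only the returned hex string would ever be placed in a lookup request.

```python
import hashlib


def sha256_fingerprint(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large binaries are never loaded whole.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The digest is 64 hex characters, which is the 32 bytes mentioned above. It cannot be reversed into the file contents.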

Why this is a deliberate choice

Many endpoint security products auto-upload unrecognized executables to their analysis cloud — sometimes their own, sometimes a third-party sandbox. That’s a defensible choice for some buyers. But it’s the wrong default for a tool that runs on every machine in a fleet. Internal-only tooling, custom-built executables, license-restricted binaries, and contractor-developed software all deserve to stay on the machine they’re running on. Hash-anchored identity gets us to a high-confidence verdict for the overwhelming majority of binaries with zero upload risk; the small remainder where the cluster genuinely can’t answer is a much smaller surface for an admin to make a deliberate, case-by-case decision about.

What about NSRL and SIG?

NSRL runs entirely on-prem — the corpus is mirrored as local SQLite on the reputation provider, so the hash never leaves your reputation infrastructure for that lookup. SIG runs on the agent itself, reading the embedded signature on the binary; nothing about it crosses the network. The hash-only lozenge appears on entries that touch a third-party service (MBZ and VT today) precisely because those are the entries where the privacy promise is load-bearing.

When deeper analysis is warranted

If an admin determines that uploading a specific binary to a specific service is the right call — for an actual investigation of an actual incident — they can drive that upload manually, on purpose, from their own desk, to a service of their choice. That’s a deliberate decision the human makes, not a default our code takes.

AI + human-backed in-app support

Available on every tier; depth varies by tier

What it is

Support lives inside the dashboard. An AI that knows your tenant (your devices, your destinations, your blocklist activity, your AI Governance numbers) answers first, in seconds. If the AI’s answer doesn’t land, one click forwards the full conversation to your tenant administrator, who can run additional AI assistance themselves or escalate to a real human at DataStun. We answer the administrator; the administrator closes the loop with the original user.

Why this model

Email-based support has a 24-hour SLA on a good day. AI answers in seconds, in context, with a privacy-respecting view of the actual problem. When it can’t solve a case, escalation is a click, not a re-explanation. The branded shorthand: AI handles the volume; humans handle the judgment calls.

How tiers differ

Individual: AI in-app support plus the community forum. No human escalation; the unit economics don’t support it. Home: AI + human-backed (escalates to the household sponsor; the sponsor can choose to forward to DataStun). Business: full AI → admin → DataStun chain. Enterprise & MSSP: priority routing on the human-escalation queue. No tier ships email or phone support.

What about audit trail

The full thread — AI replies, admin notes, our response — lives in the dashboard. Auditable, searchable, resumable later. No conversation history "lives in someone’s inbox."

SAML / OIDC SSO

Enterprise tier only

What it is

Single sign-on via your existing identity provider: Okta, Microsoft Entra (Azure AD), Google Workspace, Auth0, OneLogin, JumpCloud, Ping, generic SAML 2.0, or generic OIDC. Group claims map to DataStun tenant roles (owner / admin / member). Optional SCIM provisioning so deactivated identity-provider accounts auto-deactivate their DataStun session within minutes.

Why it matters

Every enterprise has an offboarding process. Without SSO, every SaaS tool is a manual offboarding step; with SSO + SCIM, the human leaves the IDP and access dies on every connected tool automatically. SOC 2 CC6.1 and CC6.2 (logical access controls) are essentially impossible to demonstrate without SSO; for an enterprise buyer, SSO is the table-stakes feature, not a differentiator.

How it works

Standard IDP-initiated and SP-initiated flows. Per-tenant configuration in Settings → Identity. Group claim mapping is a small admin UI; SCIM endpoint is exposed under the per-tenant dashboard URL. Multiple IDP integrations on a single tenant supported (e.g. employees on Entra, contractors on Auth0).
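
Group-claim mapping can be pictured as a small lookup, sketched here in Python. The group names are hypothetical, and the rule that the highest-privilege matching role wins is an assumption of this example rather than documented behavior.

```python
# Hypothetical mapping from identity-provider group names to tenant roles.
GROUP_TO_ROLE = {
    "datastun-owners": "owner",
    "datastun-admins": "admin",
    "datastun-users": "member",
}

# Higher number means more privilege.
ROLE_RANK = {"member": 1, "admin": 2, "owner": 3}


def resolve_role(group_claims, mapping=GROUP_TO_ROLE):
    """Return the highest-privilege role granted by the claims, or None."""
    roles = [mapping[group] for group in group_claims if group in mapping]
    return max(roles, key=ROLE_RANK.__getitem__, default=None)
```

A user whose claims match no mapped group gets None, which a real integration would presumably treat as "no access".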

Why it’s Enterprise-only

SSO with the breadth Enterprise buyers expect (multiple IDP types, SCIM, group-claim-to-role mapping, just-in-time provisioning) is real engineering and ongoing maintenance. Business tier ($5/agent/mo) doesn’t support that overhead at the price point.

Compliance reports (SOC 2, HIPAA, ISO 27001)

Enterprise tier only

What it is

Pre-built compliance evidence reports in the dated, signed format auditors expect. Evidence binders for SOC 2 CC6.6 (logical access transmission protections), CC6.7 (data restriction), CC7.2 (logical asset monitoring); HIPAA §164.308 (administrative safeguards) and §164.312 (technical safeguards); ISO 27001 A.12.4 (logging and monitoring) and A.13.1 (network security). Export as PDF with notarized timestamp + tenant ID + criteria.

Why it matters

Most security tools generate data; auditors need evidence. The gap between "we have logs" and "here is a binder demonstrating that logs are kept, monitored, and restricted to authorized parties" is typically two to four weeks of analyst work per audit cycle. Pre-built reports eliminate that work, and the dated/signed format is auditor-defensible without further interpretation.

How it works

Each compliance framework gets a curated report template that pulls from the tenant’s existing data (blocklist enforcement, executable reputation, exposed services, agent inventory, audit log, SSO + role assignments). Report generation runs on demand; output is a multi-page PDF with each control mapped to its evidence section.

What this isn’t

Not a GRC platform. We produce the evidence; we don’t track your control narratives, your risk register, or your auditor’s open items. For full GRC the right surface is Vanta, Drata, Secureframe, or similar; we feed evidence into them and into the auditor’s direct review.

Dedicated account manager

Enterprise tier only

What it is

A named human at DataStun who knows your tenant, your fleet, and your priorities. Quarterly business reviews, roadmap input, an escalation channel for cases that don’t fit standard-priority routing. Not a sales rep — a technical AM who can route a question to the right engineer, and who tracks your specific deployment over time.

Why it matters

At scale, the difference between "a vendor we use" and "a vendor we partner with" is whether someone on the vendor side knows you. Quarterly reviews catch drift in usage patterns, surface unrealized value, and let your team raise issues without filing tickets. Roadmap input means features your team needs land sooner.

How it works

One AM per Enterprise customer (or one shared AM for smaller Enterprise accounts, scoped during onboarding). Standing quarterly meeting; ad-hoc escalation channel via the AI + human-backed support flow with priority routing.

Diagnostic add-ons

Hop Starvation

Add-on, $10 / agent / month. Available on Business and Enterprise.

What it is

TTL-based packet lifetime enforcement. Cap the travel distance of packets to any destination so they die before they can reach the public internet — distributed across every endpoint and gateway, no middlebox, no single point of failure.
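
The underlying mechanism is the IP time-to-live field, which every router decrements and which causes a packet to be dropped when it reaches zero. The Python sketch below sets it per socket using the standard IP_TTL socket option. How the DataStun agent actually applies the cap (per socket, per destination, or at the firewall layer) is not described here, so treat this only as an illustration of the principle; the function name and hop count are hypothetical.

```python
import socket


def open_capped_socket(max_hops):
    """Create an IPv4 TCP socket whose packets expire after max_hops routers."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # IP_TTL sets the time-to-live on every outgoing packet from this socket.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, max_hops)
    return sock
```

With max_hops set to 1, for example, packets are discarded by the first router they reach, so only destinations on the local segment are reachable.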

Why it matters

Stops lateral movement, data exfiltration, and nation-state reach at the packet layer, not the application layer. Your Oracle database can talk to its local switch and nothing beyond. The blast radius of a compromised host is reduced to "what it can reach in N hops" instead of "the entire internet."

Speed Test

Add-on, $5 / agent / month. Available on Business and Enterprise.

What it is

N² network performance testing across your fleet — latency, throughput, jitter, packet loss, and traceroute path, in each direction, between every pair of agents. 500 agents = 124,750 measurement pairs. Scheduled baselines plus on-demand probes.
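
The pair count quoted above follows from the standard "n choose 2" formula. This short Python sketch reproduces it, and also shows the number of directed measurements if each pair is tested in both directions; the function names are illustrative.

```python
def measurement_pairs(agents):
    """Number of unordered agent pairs: n * (n - 1) / 2."""
    return agents * (agents - 1) // 2


def directed_measurements(agents):
    """Number of ordered (source, destination) pairs: n * (n - 1)."""
    return agents * (agents - 1)
```

For 500 agents this gives 124,750 pairs, or 249,500 measurements when both directions are counted separately.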

Why it matters

Answers "why is the network slow?" with real measurements instead of a shrug. Catches asymmetric routing, VPN under-performance, and ISP shortfalls with evidence you can forward to your vendor.

Advanced Packet Diagnostics (APD)

Add-on, $49 / incident (first incident $25). Available on Business and Enterprise.

What it is

On-demand packet capture from any agent against a specific destination. Up to 10,000 packets per run, with a correlation report that lines the capture up against reputation, route, and Hop Starvation activity (if active).

Why it matters

The evidence grade that used to require a network engineer with a tap, delivered in minutes from wherever your agent is. Pay per incident; no subscription.

Headers only by default, deliberately. Full-payload capture is an opt-in, audit-logged action; the privacy-sensitive default never sends content.

Missing an entry, or have a feature you want explained at this depth? Tell us and we’ll add it.
