AI Malware Analysis Needs Evidence Packets
Zane St. John bought a cheap Android projector, saw suspicious DNS traffic, and used Claude Code as a reverse-engineering assistant to inspect the device’s preinstalled apps.1
The interesting part is not that an AI agent helped with malware analysis. That claim will become boring quickly. The interesting part is the artifact shape: observed network behavior, package names, decompiled code paths, command output, notes, and indicators that a human could inspect. Malware analysis with agents only becomes trustworthy when the output looks less like an answer and more like a case file.
AI malware analysis needs evidence packets. Agents can accelerate unpacking, decompilation, search, summarization, and hypothesis generation. Analysts still need hashes, tool versions, commands, extracted indicators, source paths, uncertainty labels, and claim-to-evidence trails before trusting the conclusion.
TL;DR
Microsoft Research describes Project Ire as an autonomous malware-classification agent that reverse engineers software and produces a chain of evidence before a validator decides whether enough support exists for a malware verdict.2 St. John’s Android projector investigation shows the same pattern at a smaller scale: an agent can help an individual analyst move through APKs, logs, strings, and suspicious code paths.1
The safe product lesson is narrow. Treat an AI malware analyst as a workbench, not as an authority. The workbench can extract, organize, and connect evidence. It should not contact live infrastructure, write exploitation clients, execute unknown payloads on a normal workstation, or replace human judgment about impact. The useful output is an evidence packet a reviewer can reproduce.
Key Takeaways
For security teams:
- Ask agents for evidence packets, not verdicts.
- Keep sample identity, command logs, tool versions, extracted indicators, and claim support together.
- Require human approval before any dynamic execution, network contact, or credential-bearing analysis.
For agent builders:
- Default malware-analysis workflows to read-only static analysis.
- Separate extraction, hypothesis, verification, and reporting into distinct steps.
- Preserve raw artifacts and source locations so a human can audit the chain.
For product teams:
- Do not sell “autonomous malware analysis” as magic.
- Show what the agent inspected, what it inferred, what it did not verify, and what a human still has to decide.
- Build review packets before building dramatic dashboards.
What The Android Projector Case Proves
St. John’s investigation began with observed behavior: DNS requests from the projector before normal use.1 That matters because the source of suspicion came from the device, not from the model. The agent entered after the analyst already had a question worth investigating.
The workflow then moved through ordinary reverse-engineering surfaces:
| Surface | Why it matters |
|---|---|
| DNS observations | Showed the device talking before the user asked it to. |
| Android package names | Helped narrow which preinstalled apps deserved inspection. |
| APK decompilation | Turned bundled code into searchable source-like output. |
| Strings and endpoints | Revealed configuration, network destinations, and update behavior. |
| Notes and summaries | Kept the investigation from becoming a pile of raw files. |
The article names common Android reverse-engineering tools such as adb and jadx.1 Those tools do not make a conclusion true. They make the artifact inspectable. jadx describes itself as a command-line and GUI decompiler that converts Android Dex and APK files into Java source and can decode Android resources.3 Apktool describes itself as a tool for reverse engineering Android APK files, including manifest, resources, smali, and rebuild workflows.4
The agent’s advantage sits in the middle. It can search unfamiliar packages, summarize code, propose likely areas to inspect, and maintain a todo list. The analyst still needs to verify each claim against the original artifact.
AI Turns Reverse Engineering Into Case Management
Traditional malware analysis already produces a case file. The file may include hashes, sample origin, strings, domains, IP addresses, mutexes, registry keys, file paths, screenshots, disassembly notes, sandbox output, and a final verdict.
Agents change the speed and volume of that work. They can read more files, write more notes, and produce more hypotheses than a single analyst would manually type. Without a stronger output contract, that speed creates a trust problem. A confident summary can hide a bad inference, a missed branch, or a hallucinated API name.
Microsoft’s Project Ire points toward the better shape. Microsoft says the system autonomously analyzes and classifies software and builds a chain of evidence for its findings.2 The design includes tools for reverse engineering plus a validator that checks whether the evidence supports the verdict.2 That validator idea matters more than the brand name. Malware analysis needs a separate judge for the evidence, not only a fluent narrator of the conclusion.
Use the same split in smaller workflows:
| Step | Agent role | Human or policy gate |
|---|---|---|
| Acquire | Record sample source and hash. | Confirm authorization and containment. |
| Extract | Unpack static artifacts. | Approve toolchain and sample handling. |
| Inspect | Search code, manifests, strings, and resources. | Check source locations. |
| Hypothesize | Propose suspicious behavior and risk. | Demand supporting evidence. |
| Verify | Map each claim to an artifact. | Reject unsupported claims. |
| Report | Write indicators and impact notes. | Decide action and disclosure. |
The agent can do a lot. The gate decides what deserves belief.
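The step-and-gate split in the table can be made explicit in code. The following is a minimal sketch, assuming a simple in-memory record of approvals; the `GatedWorkflow` class, its method names, and the reviewer label are hypothetical and not taken from Project Ire or St. John's write-up.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Step names follow the table above; everything else here is illustrative.
STEPS = ["acquire", "extract", "inspect", "hypothesize", "verify", "report"]


@dataclass
class GatedWorkflow:
    """Tracks which steps a human or policy gate has signed off on."""
    approvals: List[dict] = field(default_factory=list)

    def next_step(self) -> Optional[str]:
        # The next step is the first one without a recorded approval.
        done = len(self.approvals)
        return STEPS[done] if done < len(STEPS) else None

    def approve(self, step: str, reviewer: str, note: str) -> None:
        # Steps cannot be skipped, and every approval names a reviewer.
        expected = self.next_step()
        if step != expected:
            raise ValueError(f"cannot approve {step!r}; next gate is {expected!r}")
        if not reviewer or not note:
            raise ValueError("an approval needs a named reviewer and a note")
        self.approvals.append({"step": step, "reviewer": reviewer, "note": note})


workflow = GatedWorkflow()
workflow.approve("acquire", reviewer="analyst-1", note="authorized, lab copy only")
print(workflow.next_step())  # extract
```

The point of the sketch is ordering: the agent may do the work inside a step, but the workflow does not advance until the gate for that step has a recorded decision.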
Android Has Useful Static Surfaces
Android malware analysis has a practical advantage: APKs expose several static surfaces before anyone executes the app.
The Android security documentation lists risk categories such as cleartext communications, dynamic code loading, insecure broadcast receivers, hardcoded secrets, and permissions-related mistakes.5 MITRE ATT&CK for Mobile includes techniques such as Broadcast Receivers and Download New Code at Runtime, which gives analysts a vocabulary for mapping observed behavior to attacker tradecraft.6
That makes a static-first evidence packet valuable:
| Android artifact | Evidence to capture |
|---|---|
| APK hash | SHA-256, source, collection date, and filename. |
| Manifest | Package name, permissions, services, receivers, providers, exported components, and SDK targets. |
| Decompiled code | File path, class, method, and line or symbol around the claim. |
| Resources | URLs, domains, API paths, configuration values, certificates, and assets. |
| Native libraries | Library names, architecture, exported symbols, and unpacking notes. |
| Network observations | Domains or IPs observed, timestamp, tool, and whether contact happened passively or actively. |
| Behavior mapping | ATT&CK Mobile technique only when evidence supports it. |
| Uncertainty | What the agent did not inspect or could not prove. |
The table avoids an important mistake: it does not ask the model to decide “malware or not” first. It asks the system to preserve the evidence that would make a verdict reviewable later.
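The manifest row is a good place to start, because a decoded manifest is plain XML. The sketch below assumes the manifest has already been decoded to text (for example by Apktool) and uses only the Python standard library; the `manifest_facts` function and the sample package name are hypothetical. It records the raw `android:exported` attribute instead of deciding whether a component is reachable, since that judgment depends on context the parser does not have.

```python
import xml.etree.ElementTree as ET

# Namespace used for android:* attributes in a decoded AndroidManifest.xml.
ANDROID_NS = "{http://schemas.android.com/apk/res/android}"
COMPONENT_TAGS = ("activity", "service", "receiver", "provider")


def manifest_facts(manifest_xml: str) -> dict:
    """Collect reviewable facts from a decoded manifest without judging them."""
    root = ET.fromstring(manifest_xml)
    facts = {
        "package": root.get("package"),
        "permissions": [p.get(ANDROID_NS + "name") for p in root.iter("uses-permission")],
        "components": [],
    }
    for tag in COMPONENT_TAGS:
        for node in root.iter(tag):
            facts["components"].append({
                "kind": tag,
                "name": node.get(ANDROID_NS + "name"),
                # Raw attribute value; None means the attribute was absent.
                "exported_attr": node.get(ANDROID_NS + "exported"),
                "has_intent_filter": node.find("intent-filter") is not None,
            })
    return facts


# Hypothetical manifest used only to exercise the function.
SAMPLE = """<manifest xmlns:android="http://schemas.android.com/apk/res/android"
    package="com.example.preinstalled">
  <uses-permission android:name="android.permission.INTERNET"/>
  <application>
    <receiver android:name=".BootReceiver" android:exported="true">
      <intent-filter>
        <action android:name="android.intent.action.BOOT_COMPLETED"/>
      </intent-filter>
    </receiver>
  </application>
</manifest>"""

facts = manifest_facts(SAMPLE)
print(facts["package"], facts["components"][0]["name"])
```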
The Evidence Packet
A useful AI malware-analysis packet should fit a predictable shape:
| Section | Required contents |
|---|---|
| Scope | Who authorized the analysis, what sample or device was examined, and what actions were forbidden. |
| Sample identity | Hashes, filenames, sizes, timestamps, source path, and chain-of-custody notes. |
| Toolchain | Tool names, versions, command lines, and environment boundaries. |
| Static findings | Manifest facts, package names, suspicious strings, endpoints, resources, and code locations. |
| Dynamic findings | Only if authorized: environment, network isolation, logs, screenshots, and observed behavior. |
| Indicators | Domains, IP addresses, package names, file paths, certificate data, and other observable artifacts. |
| Claim map | Each conclusion paired with the exact artifact that supports it. |
| Unverified work | Samples not unpacked, code paths not followed, network behavior not reproduced, and assumptions. |
| Recommended action | Block, monitor, remove, escalate, disclose, or continue analysis, with confidence level. |
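The static findings and indicators rows are only reviewable if each indicator keeps its source location. As a rough sketch, assuming the decompiled output sits in a local directory of text files, the hypothetical `extract_url_indicators` helper below records the file path and line number next to every URL it finds. The regular expression is deliberately simple and will miss obfuscated or concatenated strings.

```python
import re
from pathlib import Path
from typing import List

# Simple URL pattern; an assumption for illustration, not a complete grammar.
URL_RE = re.compile(r"https?://[^\s\"'<>]+")


def extract_url_indicators(root_dir) -> List[dict]:
    """Return every URL found under root_dir with its file path and line number."""
    root = Path(root_dir)
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        # errors="replace" keeps the scan going over binary or mis-encoded files.
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            for match in URL_RE.finditer(line):
                hits.append({
                    "indicator": match.group(0),
                    "path": str(path.relative_to(root)),
                    "line": lineno,
                })
    return hits
```

The function only reads files and never resolves or contacts what it finds, which keeps it inside the static-only boundary.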
The claim map is the heart of the packet:
| Claim | Evidence | Confidence |
|---|---|---|
| App uses dynamic code loading | Decompiled code path plus Android risk category citation. | Medium until dynamic behavior is reproduced. |
| App contacts suspicious domain | Passive DNS observation plus string or config reference. | High if both sources match. |
| App persists through receiver | Manifest receiver plus code path handling boot or system broadcast. | Medium unless observed in a lab. |
| App is malicious | Multiple supported behaviors, context, and human review. | Never from model summary alone. |
That last row protects the analyst. Malware verdicts carry consequences. A false positive can damage a vendor or confuse an incident response. A false negative can leave a user exposed. The model should not get a shortcut around evidence.
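The claim map translates into a small data structure. The sketch below is one possible encoding, not a standard: the `Claim` class, the `verdict_allowed` function, the threshold of two supported claims, and the evidence strings are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Claim:
    text: str
    # Artifact references such as a file path plus symbol, or a log line.
    evidence: List[str] = field(default_factory=list)
    human_reviewed: bool = False


def claim_status(claim: Claim) -> str:
    """A claim with no artifact behind it is unverified, however confident it sounds."""
    return "supported" if claim.evidence else "unverified"


def verdict_allowed(claims: List[Claim], min_supported: int = 2) -> bool:
    """Allow a verdict only with several supported claims, all reviewed by a human."""
    supported = [c for c in claims if c.evidence]
    return len(supported) >= min_supported and all(c.human_reviewed for c in supported)


# Hypothetical claims; the evidence strings are placeholders.
claims = [
    Claim("App contacts suspicious domain",
          ["dns.log line 14", "res/values/strings.xml: update_host"], True),
    Claim("App persists through receiver",
          ["AndroidManifest.xml: .BootReceiver"], False),
    Claim("App is malicious"),  # model summary only, no artifact
]
print([claim_status(c) for c in claims])  # ['supported', 'supported', 'unverified']
print(verdict_allowed(claims))            # False
```

The verdict stays blocked here because one supported claim has not been reviewed, which mirrors the last row of the table: a model summary alone never unlocks it.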
What The Agent Should Refuse
Malware work needs refusal boundaries even when the goal is defensive.
An agent should refuse or require explicit human authorization before:
- contacting live command-and-control infrastructure;
- writing a client for a suspected backdoor or updater;
- downloading second-stage payloads from attacker-controlled infrastructure;
- running an unknown sample outside an isolated lab;
- using real user credentials, personal accounts, or production networks during analysis;
- publishing live indicators that may identify a victim before responsible disclosure;
- turning a defensive investigation into exploitation instructions.
OpenAI’s local-shell documentation warns that allowing agents to run arbitrary shell commands can be dangerous and recommends sandboxing or strict allow and deny lists before forwarding commands to a shell.7 Anthropic’s Claude Code best-practices guide emphasizes verification criteria and context management for agent work.8 Malware analysis needs both: command limits before action and evidence limits before belief.
The refusal boundary should appear in the task itself:
Analyze this APK statically.
Do not execute it.
Do not contact remote infrastructure.
Do not write exploit or client code.
Return only evidence with file paths, commands, and confidence labels.
Mark every unsupported claim as unverified.
That kind of instruction does not make the workflow safe by itself. It gives hooks, sandboxes, and reviewers something concrete to enforce.
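One concrete enforcement point is a check that runs before any agent-proposed command reaches a shell. The following is a minimal allowlist sketch under stated assumptions: the set of permitted tools, the `check_command` name, and the metacharacter list are illustrative choices, and the check complements a sandbox without replacing it.

```python
import shlex
from typing import Tuple

# Hypothetical allowlist of read-oriented static-analysis tools.
ALLOWED_TOOLS = {"sha256sum", "file", "strings", "grep", "jadx", "apktool"}
# Characters that enable chaining, substitution, or redirection in a shell.
SHELL_METACHARS = set(";|&$`<>\n")


def check_command(command: str) -> Tuple[bool, str]:
    """Return (allowed, reason) for a command string proposed by the agent."""
    if any(ch in SHELL_METACHARS for ch in command):
        return False, "shell metacharacters are not allowed"
    try:
        tokens = shlex.split(command)
    except ValueError as exc:
        return False, f"unparseable command: {exc}"
    if not tokens:
        return False, "empty command"
    # Require a bare tool name so a path such as ./evil/grep cannot pass.
    if tokens[0] not in ALLOWED_TOOLS:
        return False, f"{tokens[0]!r} is not on the allowlist"
    return True, "ok"


print(check_command("strings sample.apk"))  # (True, 'ok')
print(check_command("curl http://updates.example.invalid/payload"))
```

The second call is rejected because `curl` is not on the list. Network tools and device bridges stay off the allowlist by default, so any contact with live infrastructure has to go through a human.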
The Human Still Owns The Verdict
An AI agent can save hours in a malware-analysis session. It can move from a pile of APKs to a short list of suspicious packages. It can summarize classes, unpack intent filters, identify config strings, and produce a report draft. Those gains matter.
The agent should not own the verdict.
The analyst owns:
- authorization to analyze the sample;
- decision to run anything dynamically;
- interpretation of intent and impact;
- communication with affected vendors, users, or platforms;
- remediation and disclosure decisions;
- final language in the report.
That split keeps the agent useful. The model does the tiring connective work. The human keeps the ethical, legal, and contextual responsibility.
How To Build The Workflow
Start with a small static-analysis loop:
- Hash the sample and record where it came from.
- Extract manifest, resources, strings, and decompiled code into a read-only work directory.
- Ask the agent to create a finding list with source locations.
- Ask a second pass to challenge each finding and mark unsupported claims.
- Build the evidence packet.
- Decide whether the packet justifies dynamic lab analysis.
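The first step of that loop is small enough to sketch directly. The example below uses only the Python standard library to hash a sample and record where it came from; the `sample_identity` function name and the field names are assumptions, chosen to line up with the sample identity section of the packet.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def sample_identity(path, source: str) -> dict:
    """Hash a sample in chunks and record its provenance without executing it."""
    sample = Path(path)
    digest = hashlib.sha256()
    with sample.open("rb") as handle:
        # Read in 1 MiB chunks so large APKs do not need to fit in memory.
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return {
        "filename": sample.name,
        "size_bytes": sample.stat().st_size,
        "sha256": digest.hexdigest(),
        "source": source,
        "collected_at": datetime.now(timezone.utc).isoformat(),
    }
```

Recording the hash before any extraction means every later artifact in the packet can be tied back to one exact file.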
The agent prompt should require structured output:
For every finding, include:
- claim
- artifact path
- command that produced the artifact
- source excerpt or symbol
- confidence
- what would falsify the claim
- whether dynamic analysis is required
That output looks less exciting than “the projector has malware.” It is much more useful.
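A structured output requirement only helps if something checks it. As a sketch, assuming the agent returns its findings as a JSON list, the hypothetical `validate_findings` function below reports which required fields are missing. The field names and the three confidence levels are one possible encoding of the prompt above, not a fixed schema.

```python
import json
from typing import List

# Field names mirror the prompt above; the exact spelling is an assumption.
REQUIRED_FIELDS = (
    "claim", "artifact_path", "command", "source_excerpt",
    "confidence", "falsified_by", "needs_dynamic_analysis",
)
CONFIDENCE_LEVELS = {"low", "medium", "high"}


def validate_findings(raw_json: str) -> List[str]:
    """Return a list of problems; an empty list means every finding is complete."""
    findings = json.loads(raw_json)
    if not isinstance(findings, list):
        return ["top-level value must be a list of findings"]
    problems = []
    for index, finding in enumerate(findings):
        if not isinstance(finding, dict):
            problems.append(f"finding {index}: not an object")
            continue
        for name in REQUIRED_FIELDS:
            # False is a valid value for needs_dynamic_analysis, so test for emptiness only.
            if finding.get(name) in (None, "", []):
                problems.append(f"finding {index}: missing {name}")
        confidence = finding.get("confidence")
        if confidence not in (None, "") and confidence not in CONFIDENCE_LEVELS:
            problems.append(f"finding {index}: unknown confidence {confidence!r}")
    return problems
```

Findings that fail the check go back to the agent or into the unverified work section instead of the final report.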
The principle from “The Evidence Gate” applies directly: a claim without evidence should not move into the final answer. “Review Packets Are the New Final Answer” applies too: the deliverable is not the prose summary, but the packet that lets another person verify the work.
FAQ
Can AI agents do malware analysis?
Yes, within limits. Agents can help with static analysis, summarization, decompilation navigation, indicator extraction, and report drafting. They should not become the final authority for malware verdicts without reproducible evidence and human review.
Is it safe to use Claude Code or Codex on malware?
Only inside a controlled defensive workflow. Do not run unknown samples on a normal workstation, do not contact live infrastructure, and do not give the agent credentials or unrestricted shell/network access. Static analysis in an isolated work directory is the safer starting point.
What should a malware-analysis evidence packet include?
At minimum: hashes, sample source, tool versions, commands, manifest facts, extracted indicators, code locations, a claim-to-evidence map, confidence labels, and a list of unverified work.
Does an AI verdict count as evidence?
No. The model’s statement is an interpretation. Evidence comes from artifacts: hashes, logs, commands, code paths, manifests, observed network behavior, and reproducible analysis steps.
Should agents map findings to MITRE ATT&CK?
Yes, when the evidence supports the mapping. A technique label without artifact support becomes decoration. Use ATT&CK Mobile as vocabulary, not as a substitute for proof.6
Close
AI does not remove the analyst from malware analysis. It changes what the analyst should demand.
The weak output is a confident verdict. The strong output is a reproducible packet: sample identity, commands, artifacts, indicators, claim support, uncertainty, and next action.
Agents can make reverse engineering faster. Evidence packets make it trustworthy.
References
1. Zane St. John, “Reverse Engineering Android Malware with Claude Code,” published February 5, 2026. Source for the Android projector case, suspicious DNS starting point, use of adb and jadx, Claude Code-assisted APK inspection, and defensive reverse-engineering workflow shape.
2. Microsoft Research, “Project Ire: Autonomously Identifying Malware at Scale,” published August 2025. Source for Project Ire’s autonomous reverse-engineering framing, chain-of-evidence design, tool use, and validator stage.
3. jadx project, “jadx README,” GitHub repository documentation, accessed May 18, 2026. Source for jadx’s purpose as a Dex-to-Java decompiler with command-line and GUI usage and Android APK/resource support.
4. Apktool, “Apktool,” official documentation, accessed May 18, 2026. Source for Apktool’s stated role as a tool for reverse engineering Android APK files and its manifest/resource/smali decoding workflow.
5. Android Developers, “Mitigate Security Risks in Your App,” accessed May 18, 2026. Source for Android risk categories including cleartext communications, dynamic code loading, hardcoded secrets, and insecure broadcast receivers.
6. MITRE ATT&CK, “Mobile Matrix,” accessed May 18, 2026. Source for ATT&CK Mobile technique vocabulary including Broadcast Receivers and Download New Code at Runtime.
7. OpenAI, “Local shell,” OpenAI API documentation, accessed May 18, 2026. Source for local-shell risk framing and sandbox or allow/deny-list guidance before agents run shell commands.
8. Anthropic, “Best Practices for Claude Code,” Claude Code documentation, accessed May 18, 2026. Source for context-window, verification-criteria, and CLI-tool workflow guidance used in the agent-analysis framing.