R-018 April 2026 · 8 min

Reasoning about an unknown sample.

How Mara approaches a piece of malware it has never seen before: what to look at, what to suspend judgement on, and where to begin.

Most malware is not novel. Most malware is a slight variation on something seen before, dressed up just enough to evade a signature and just little enough not to surprise anyone who has been doing this for a while. The interesting question is not how to catch the easy cases; those are caught. It is how to think calmly about a sample when you genuinely do not yet know what it is.

Mara's approach is unglamorous on purpose. It begins where a careful analyst begins: with the facts that survive any interpretation.

Begin with what is true regardless of intent.

—File type and structural integrity. What the bytes claim to be, and whether the structure agrees.
—Imports, sections, certificate, compile timestamp. Observable surface, not yet meaning.
—Strings and embedded resources. Promising leads, always suspect.
—Network indicators, where present, and only after deciding whether to ever resolve them.

Mara lists these out. It does not yet say what the sample is. It says what is known. The model is trained to resist the most common analyst mistake, which is narrating a hypothesis as if it were a finding. It does so by simply not having a finding yet.
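
As a rough illustration of what "listing the surface" can look like, here is a minimal sketch in Python. It is not Mara's implementation; the function names, the tiny magic-byte table and the six-character string threshold are all assumptions made for this example. It records only what the bytes claim to be, whether one structural pointer agrees, and which printable strings are present, and it deliberately draws no conclusion.

```python
import hashlib
import re
import struct

# Hypothetical, deliberately tiny magic-byte table (an assumption for this sketch).
MAGIC = {b"MZ": "pe", b"\x7fELF": "elf", b"%PDF": "pdf", b"PK\x03\x04": "zip"}


def claimed_type(data: bytes) -> str:
    """What the leading bytes claim the file is."""
    for magic, name in MAGIC.items():
        if data.startswith(magic):
            return name
    return "unknown"


def pe_structure_agrees(data: bytes) -> bool:
    """Does the 4-byte offset stored at 0x3C point at a PE signature?"""
    if len(data) < 0x40 or not data.startswith(b"MZ"):
        return False
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    return data[pe_offset:pe_offset + 4] == b"PE\x00\x00"


def ascii_strings(data: bytes, min_len: int = 6) -> list:
    """Runs of printable ASCII: promising leads, always suspect."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]


def surface_facts(data: bytes) -> dict:
    """Collect observations only; nothing here says what the sample is."""
    kind = claimed_type(data)
    return {
        "sha256": hashlib.sha256(data).hexdigest(),
        "size": len(data),
        "claimed_type": kind,
        # None means "not checked in this sketch", not "fine".
        "structure_agrees": pe_structure_agrees(data) if kind == "pe" else None,
        "strings": ascii_strings(data),
    }
```

Note that nothing in the sketch resolves or contacts a network indicator; a URL found among the strings stays a string until someone decides otherwise.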

Then form hypotheses, in the plural.

From the surface, Mara proposes a small set of plausible explanations and ranks them, each with the evidence for and the evidence against. Three or four is usually right. One is overconfident; ten is performative. The discipline is to keep more than one alive long enough for the analysis to actually discriminate between them.
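
One way to picture this bookkeeping is a small data structure that forces every hypothesis to carry both sides of its evidence. The sketch below is an assumption made for illustration, not a description of Mara's internals: the `Hypothesis` class, the `rank` helper and the naive "for minus against" score are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Hypothesis:
    name: str
    evidence_for: list = field(default_factory=list)
    evidence_against: list = field(default_factory=list)

    def score(self) -> int:
        # Naive illustrative ranking: supporting facts minus contradicting facts.
        return len(self.evidence_for) - len(self.evidence_against)


def rank(hypotheses: list) -> list:
    """Order hypotheses from best to worst supported; ties keep input order."""
    return sorted(hypotheses, key=lambda h: h.score(), reverse=True)


def plurality_warning(hypotheses: list) -> str:
    """Flag a set that is too small to discriminate or too large to be useful."""
    if len(hypotheses) < 2:
        return "overconfident: keep more than one hypothesis alive"
    if len(hypotheses) >= 10:
        return "performative: too many hypotheses to test seriously"
    return ""
```

A real ranking would weight evidence by quality rather than count it. The point of the sketch is only that a hypothesis with an empty `evidence_against` list has probably not been examined yet.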

“An analyst is paid to be uncertain in the right way, for the right amount of time.”

Test against behaviour, not against confidence.

Static analysis ends, eventually, where the sample wants you to think it ends. From there the question is what the sample actually does. Mara connects to a sandbox of your choosing, reads the trace, and asks whether what it sees is consistent with each hypothesis it still holds open, including the boring ones it would rather rule out. A theory that survives boredom is usually right.
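
The consistency check can be sketched as follows, under loud assumptions: the behaviour labels, the `Prediction` type and the three-way verdict are invented for this example, and no particular sandbox or trace format is implied. Each hypothesis states in advance what it expects to see and what would contradict it, and the trace is reduced to a set of observed behaviour labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    hypothesis: str
    expects: frozenset  # behaviours this hypothesis says should appear
    forbids: frozenset  # behaviours that would contradict it


def assess(pred: Prediction, observed: set) -> str:
    """Compare one hypothesis against the observed behaviours."""
    if pred.forbids & observed:
        return "contradicted"   # something it ruled out actually happened
    if pred.expects <= observed:
        return "consistent"     # everything it predicted was seen
    return "open"               # not contradicted, not yet fully supported


def assess_all(preds: list, observed: set) -> dict:
    """Check every hypothesis still held open, including the boring ones."""
    return {p.hypothesis: assess(p, observed) for p in preds}
```

For example, with hypothetical labels, a trace containing `file_encrypt` would contradict a hypothesis that forbids it, while leaving "open" any hypothesis whose expected behaviours have not all appeared. "Consistent" is weaker than "confirmed": it only means the trace gave no reason to drop the hypothesis.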

// transcript

you   Look at the dropped payload from yesterday's IR — is it the same family?

mara  Three things first. Same packer, yes. Same loader stub layout, mostly — one
      new resolver. Same C2 protocol header. So: same author or shared toolkit.
      Not enough to call it the same campaign.

      Two ways this comes apart:
        (a) shared crimeware kit, two unrelated operators
        (b) one operator iterating

      The differences I'd expect for (a): different infrastructure clusters,
      different lure documents, different timezone. Want me to check?

The transcript above is illustrative, not a quote. But the shape of the conversation is the shape Mara is trained to produce: facts first, hypotheses second, discriminators third, and a question back to the human when the next step is a decision that should not be made by a model.

Mara is a research preview from venode. Feedback, corrections and disagreements are welcome at hello@venode.ai.
