Threat Model

It is important to identify and understand that metadata is ubiquitous in communication protocols, it is indeed necessary for such protocols to function efficiently and at scale. However, information that is useful to facilitating peers and servers is also highly relevant to adversaries wishing to exploit such information.

For our problem definition, we will assume that the content of a communication is encrypted in such a way that an adversary is practically unable to break, and as such we will limit our scope to the context of a communication (i.e. the metadata).

We seek to protect the following communication contexts:

  • Who is involved in a communication? It may be possible to identify people or simply device or network identifiers. E.g., “this communication involves Alice, a journalist, and Bob a government employee.”.
  • Where are the participants of the conversation? E.g., “during this communication Alice was in France and Bob was in Canada.”
  • When did a conversation take place? The timing and length of communication can reveal a large amount about the nature of a call, e.g., “Bob a government employee, talked to Alice on the phone for an hour yesterday evening. This is the first time they have communicated.”
  • How was the conversation mediated? Whether a conversation took place over an encrypted or unencrypted email can provide useful intelligence. E.g., “Alice sent an encrypted email to Bob yesterday, whereas they usually only send plaintext emails to each other.”
  • What is the conversation about? Even if the content of the communication is encrypted it is sometimes possible to derive a probable context of a conversation without knowing exactly what is said, e.g. “a person called a pizza store at dinner time” or “someone called a known suicide hotline number at 3am.”

Beyond individual conversations, we also seek to defend against context correlation attacks, whereby multiple conversations are analyzed to derive higher level information:

  • Relationships: Discovering social relationships between a pair of entities by analyzing the frequency and length of their communications over a period of time. E.g. Carol and Eve call each other every single day for multiple hours at a time.
  • Cliques: Discovering social relationships between a group of entities that all interact with each other. E.g. Alice, Bob and Eve all communicate with each other.
  • Loosely Connected Cliques and Bridge Individuals: Discovering groups that communicate to each other through intermediaries by analyzing communication chains (e.g. everytime Alice talks to Bob she talks to Carol almost immediately after; Bob and Carol never communicate.)
  • Pattern of Life: Discovering which communications are cyclical and predictable. E.g. Alice calls Eve every Monday evening for around an hour.