
MITM unencrypted RTP → call eavesdropping

Many internal SIP deployments still carry media as plain RTP rather than SRTP. An attacker on the same VLAN can ARP-spoof the IP phone and the PBX, capture the RTP streams, and decode them to .wav audio in Wireshark.

Filed by AD Knowledge Base

§ Context

Assumed environment: the attacker has a foothold on the office LAN. SIP and RTP run unencrypted between the desk phones and the PBX, and no ARP or DHCP guard (for example, dynamic ARP inspection or DHCP snooping) is configured on the switch.
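Whether a given call matches the "unencrypted RTP" assumption can usually be read from the SDP body exchanged in SIP signalling: each `m=` line names a transport profile, where `RTP/AVP` is plain RTP and the `SAVP` variants are SRTP. The sketch below is a minimal audit helper under that assumption. The function name `classify_sdp_media` and the sample SDP are hypothetical, and the profile name alone is a heuristic rather than proof of what is on the wire.

```python
def classify_sdp_media(sdp: str) -> list[tuple[str, str, bool]]:
    """Return (media, profile, looks_encrypted) for each m= line in an SDP body."""
    results = []
    for raw in sdp.splitlines():
        line = raw.strip()
        if not line.startswith("m="):
            continue
        # SDP media line layout: m=<media> <port> <proto> <fmt> ...
        fields = line[2:].split()
        if len(fields) < 3:
            continue  # malformed media line, skip it
        media, profile = fields[0], fields[2]
        # Secure profiles (RTP/SAVP, RTP/SAVPF, UDP/TLS/RTP/SAVP, ...) all
        # contain "SAVP"; plain RTP/AVP does not.
        looks_encrypted = "SAVP" in profile.upper()
        results.append((media, profile, looks_encrypted))
    return results


if __name__ == "__main__":
    # Hypothetical SDP offer from a desk phone (addresses are documentation ranges).
    sample = (
        "v=0\r\n"
        "o=- 0 0 IN IP4 192.0.2.10\r\n"
        "s=call\r\n"
        "c=IN IP4 192.0.2.10\r\n"
        "t=0 0\r\n"
        "m=audio 49170 RTP/AVP 0 8\r\n"
    )
    for media, profile, ok in classify_sdp_media(sample):
        print(media, profile, "SRTP" if ok else "plain RTP")
```

Any media line reported as plain RTP is a stream that the rest of this path would be able to decode.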

§ Steps

  1. Exfil sensitive audio (Exfiltration)
     T1041: Exfiltration Over C2 Channel
  2. LAN foothold (Initial Access)
     T1078: Valid Accounts
  3. ARP-spoof phone + PBX (Credential Access)
     N-ARP-SPOOF: ARP Spoofing / Cache Poisoning
  4. Wireshark RTP → audio (Collection)
     T1056: Input Capture
  5. Capture RTP streams (Collection)
     VOIP-RTP-CAPTURE: RTP Stream Capture
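The capture and decode steps work because a plain RTP packet is self-describing: its 12-byte fixed header carries a payload type that names the codec, and the bytes after it are raw codec data. As an illustration, here is a minimal sketch of parsing that header as laid out in RFC 3550. The names `RtpHeader`, `parse_rtp_header`, and `STATIC_AUDIO_TYPES` are hypothetical, and the table covers only a few static audio payload types from RFC 3551. SRTP leaves this header readable but encrypts the payload, which is what stops the decode step.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    version: int
    payload_type: int
    sequence: int
    timestamp: int
    ssrc: int


# A few static audio payload types from RFC 3551 (not exhaustive).
STATIC_AUDIO_TYPES = {0: "PCMU", 8: "PCMA", 9: "G722", 18: "G729"}


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Parse the 12-byte fixed RTP header; CSRC entries and extensions are ignored."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the 12-byte RTP fixed header")
    # Network byte order: two flag bytes, 16-bit sequence, 32-bit timestamp, 32-bit SSRC.
    byte0, byte1, sequence, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    version = byte0 >> 6           # top two bits of the first byte
    if version != 2:
        raise ValueError(f"unexpected RTP version {version}")
    payload_type = byte1 & 0x7F    # low seven bits; the top bit is the marker
    return RtpHeader(version, payload_type, sequence, timestamp, ssrc)


if __name__ == "__main__":
    # Hypothetical packet: version 2, payload type 0, followed by 160 payload bytes.
    packet = bytes([0x80, 0x00]) + struct.pack("!HII", 1, 160, 0x1234ABCD) + b"\xff" * 160
    header = parse_rtp_header(packet)
    print(header, STATIC_AUDIO_TYPES.get(header.payload_type, "dynamic or unknown"))
```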

§ References

§ Frequently asked

What is the "MITM unencrypted RTP → call eavesdropping" attack path?
Many internal SIP deployments still carry media as plain RTP rather than SRTP. An attacker on the same VLAN can ARP-spoof the IP phone and the PBX, capture the RTP streams, and decode them to .wav audio in Wireshark. The path chains 5 steps drawn from real-world offensive-security techniques.
What starting position does this attack require?
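The attacker needs a foothold on the office LAN, on the same VLAN as the phones (the LAN foothold step, T1078 Valid Accounts). The assumed environment also has unencrypted SIP/RTP between the desk phones and the PBX, and no ARP or DHCP guard on the switch.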
What is the final impact of this kill-chain?
The impact is call eavesdropping. Captured RTP streams (VOIP-RTP-CAPTURE, Collection) are decoded to audio in Wireshark, and the sensitive audio is then exfiltrated (T1041, Exfiltration Over C2 Channel).
How can defenders detect or prevent this attack?
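Detection and prevention vary per step. Where a step maps to a MITRE ATT&CK technique, the entry linked under "References" lists defensive controls, detection telemetry, and known threat-actor usage. For this path, the assumptions in the context section point to the main controls: encrypting media with SRTP and enabling ARP/DHCP guard on the switch.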
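One detection signal for the ARP-spoofing step is an IP address whose MAC changes, or a single MAC that suddenly answers for both the phone and the PBX. The sketch below shows that check over a list of observed (IP, MAC) pairs. It is an illustration under assumptions, not a product feature: the function name `find_arp_anomalies` and all addresses are hypothetical, and legitimate events such as DHCP reassignment or failover will also trigger it.

```python
from collections import defaultdict


def find_arp_anomalies(observations):
    """Flag suspicious IP/MAC bindings.

    observations: iterable of (ip, mac) pairs in the order they were seen,
    for example taken from ARP replies or periodic ARP-table snapshots.
    Returns (changed, shared):
      changed: IPs seen with more than one MAC
      shared:  MACs seen claiming more than one IP
    """
    ip_to_macs = defaultdict(list)
    mac_to_ips = defaultdict(list)
    for ip, mac in observations:
        mac = mac.lower()  # normalise case so the same MAC compares equal
        if mac not in ip_to_macs[ip]:
            ip_to_macs[ip].append(mac)
        if ip not in mac_to_ips[mac]:
            mac_to_ips[mac].append(ip)
    changed = {ip: macs for ip, macs in ip_to_macs.items() if len(macs) > 1}
    shared = {mac: ips for mac, ips in mac_to_ips.items() if len(ips) > 1}
    return changed, shared


if __name__ == "__main__":
    # Hypothetical sequence: phone and PBX are seen normally, then a third
    # MAC starts answering for both of their IP addresses.
    seen = [
        ("10.0.0.10", "AA:BB:CC:00:00:01"),  # desk phone
        ("10.0.0.2", "AA:BB:CC:00:00:02"),   # PBX
        ("10.0.0.10", "DE:AD:BE:EF:00:01"),
        ("10.0.0.2", "DE:AD:BE:EF:00:01"),
    ]
    changed, shared = find_arp_anomalies(seen)
    print("IPs with changed MAC:", changed)
    print("MACs claiming several IPs:", shared)
```

A MAC that appears in `shared` for both the phone and the PBX addresses is the pattern this attack path would produce.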
