
VoIP Hacking Techniques

by Mirko Raimondi


The Public Switched Telephone Network (PSTN) is a global system of interconnected phone networks of various sizes, originally analog, that lets users carry voice conversations with each other. The most basic analog network service, called POTS (Plain Old Telephone Service), uses a pair of twisted copper wires to connect a residential phone to a central office, from where the residential customer can dial out into the PSTN.

What you will learn…

basics of VoIP network protocols

how to attack a VoIP network

how to defend a VoIP network

What you should know…

basics of networking

Initially, the PSTN was a simple one-to-one telephone line connecting phones from one room to another. As the telephone business grew, Private Branch eXchanges (PBX) were designed and deployed in office settings to increase the number of available telephone lines and to connect internal callers (over trunk lines) to the PSTN and, eventually, to the destination callers. When the PSTN became digital, a method called Time Division Multiplexing (TDM) was introduced. TDM transmits and receives independent signals over a common signal path by means of synchronized switches at each end of the transmission line, so that each signal appears on the line only for a fraction of the time, in an alternating pattern.
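As a rough illustration of the alternating pattern, the following Python sketch interleaves three hypothetical channels onto one line, one time slot per channel per frame, and then separates them again. The channel contents and the function names are assumptions made for this example and do not come from any telephony standard.

```python
from itertools import chain


def tdm_multiplex(channels):
    """Interleave equal-length channels: one sample per channel per frame."""
    # zip(*channels) yields one frame at a time: (ch0[i], ch1[i], ch2[i], ...)
    return list(chain.from_iterable(zip(*channels)))


def tdm_demultiplex(line, channel_count):
    """Recover each channel by taking every channel_count-th time slot."""
    return [line[i::channel_count] for i in range(channel_count)]


# Three hypothetical voice channels, each carrying four samples.
channels = [
    ["a0", "a1", "a2", "a3"],
    ["b0", "b1", "b2", "b3"],
    ["c0", "c1", "c2", "c3"],
]

line = tdm_multiplex(channels)
recovered = tdm_demultiplex(line, len(channels))
print(line[:6])  # ['a0', 'b0', 'c0', 'a1', 'b1', 'c1']
```

Both ends must agree on the slot order, which is the role played by the synchronized switches.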

Voice over Internet Protocol (VoIP) is a newer technology that allows phone conversations to be carried over computer networks by transforming analog and digital audio signals into data packets. VoIP usually refers to multimedia communication applications that are transported over a Packet-Switched Network (such as the Internet) instead of the PSTN. VoIP has seen rapid adoption over the past few years: many users choose VoIP to leave traditional telephone providers behind and pay cheaper bills, while for companies VoIP is an easy way to connect their branches and their teleworking employees.

An example of a simple VoIP network can be seen in Figure 1, where VoIP works as a private telephone network and is transparent to the PSTN. Software phones (also called Softphones), IP phones and analog phones (which must use a VoIP adapter) connect to a PBX, where internal telephones are connected to public lines or to other VoIP systems on the Internet. Using a VoIP Media Gateway, a VoIP phone can call a legacy phone on the PSTN and vice versa, since the Media Gateway translates between IP packets and TDM.

VoIP services are widely deployed, but their security threats are often analyzed only under specific aspects or not taken into consideration at all. This article analyzes the most common VoIP threats in order to identify existing weaknesses and suggests available countermeasures. For each threat an example attack is reported and explained since, in the author’s opinion, knowledge of the tools that attackers could use is important. In this way the current VoIP situation is analyzed from the attacker’s point of view to discover the most vulnerable parts of the system. The results of this article can be used by system administrators, network engineers and penetration testers to examine their own VoIP systems.

The author of this paper declines all responsibility for any inappropriate use of the information reported here and recommends trying these attack techniques only in controlled environments, such as test plants, and with the prior authorization of the owner.

VoIP Fundamental Protocols

VoIP telephony mainly uses two protocols: one to set up a call and one to transport the audio/video signal. They are described in the following subsections.

Real-time Transport Protocol (RTP)

The Real-time Transport Protocol (RTP) is a standardized packet format used by IP networks to deliver audio/video signals. RTP was developed by the Audio/Video Transport working group of the Internet Engineering Task Force (IETF) standards organization; it was initially described in IETF RFC 1889 and then superseded by IETF RFC 3550. It was designed for end-to-end, real-time transfer of stream data, it is regarded as the primary standard for audio/video transport in IP networks, and it is used with an associated profile and payload format.


Figure 1. Classic VoIP network scenario

RTP is used in conjunction with the RTP Control Protocol (RTCP), which monitors transmission statistics and Quality of Service and aids the synchronization of multiple streams. RTP is conventionally sent and received on even port numbers, while the associated RTCP packets use the next higher odd port number.

The protocol provides facilities for jitter compensation (jitter is rather common on a Packet-Switched Network, since packets are forwarded independently by network routers) and for detecting out-of-sequence arrival of data, and it allows data transfer to multiple destinations through IP multicast.

Real-time applications require timely delivery of information and can usually tolerate some packet loss better than an excessive delay. For this reason RTP normally does not run over the Transmission Control Protocol (TCP), which favors reliability over timeliness; RTP systems are instead usually built on the User Datagram Protocol (UDP).

The audio sampling rate is typically either 8000 Hz or 16000 Hz, and the rate at which RTP packets are transmitted is determined by the audio codec by means of its Packetization Period. Whether those packets actually arrive at a fixed rate at the receiving endpoint depends on network performance. RTP packets might be dropped by routers, might arrive at the receiving endpoint out of sequence, or could even be duplicated while they transit through the network.
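The relation between sampling rate, Packetization Period and packet rate can be worked out with a few lines of Python. The 20 ms period used below is only an assumption chosen for the example (it is a common value, but the article does not state which period the test plant used), and the function name is hypothetical.

```python
def rtp_packet_stats(sample_rate_hz, packetization_ms):
    """Return (samples carried per packet, packets sent per second)."""
    samples_per_packet = sample_rate_hz * packetization_ms // 1000
    packets_per_second = 1000 / packetization_ms
    return samples_per_packet, packets_per_second


# Assumed 20 ms Packetization Period for both sampling rates.
narrowband = rtp_packet_stats(8000, 20)
wideband = rtp_packet_stats(16000, 20)
print(narrowband)  # (160, 50.0)
print(wideband)    # (320, 50.0)
```

With these assumed values the sender emits 50 packets per second in both cases; only the number of samples carried by each packet changes.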

Hence receiving endpoints are designed with the assumption that RTP packets will not arrive at the precise rate at which they were transmitted. For this reason an endpoint incorporates a Jitter Buffer, whose parameters control how long packets are buffered in an attempt to produce the highest Quality of Service during playback. The Jitter Buffer uses RTP header information to accomplish its functions.
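The header information in question is mainly the sequence number and the timestamp carried in the 12-byte fixed RTP header defined by RFC 3550. The sketch below parses that header and puts a handful of packets back in playout order, dropping duplicates. It is a deliberately simplified model: the helper names are hypothetical, the packets are synthetic, and sequence-number wraparound and playout timing are ignored.

```python
import struct
from collections import namedtuple

RtpHeader = namedtuple("RtpHeader", "version payload_type sequence timestamp ssrc")


def parse_rtp_header(packet):
    """Decode the 12-byte fixed RTP header (RFC 3550)."""
    if len(packet) < 12:
        raise ValueError("the RTP fixed header is 12 bytes long")
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    # b0: version (2 bits), padding, extension, CSRC count
    # b1: marker (1 bit), payload type (7 bits)
    return RtpHeader(b0 >> 6, b1 & 0x7F, seq, ts, ssrc)


def playout_order(packets):
    """Drop duplicates and sort by sequence number (no wraparound handling)."""
    by_sequence = {}
    for packet in packets:
        by_sequence.setdefault(parse_rtp_header(packet).sequence, packet)
    return [by_sequence[seq] for seq in sorted(by_sequence)]


def make_packet(seq, ts, ssrc=0x1234, payload=b""):
    """Build a synthetic RTP packet (version 2, payload type 0) for the demo."""
    return struct.pack("!BBHII", 0x80, 0, seq, ts, ssrc) + payload


# Packets as they might arrive: out of order and with one duplicate.
arrived = [make_packet(2, 320), make_packet(1, 160),
           make_packet(2, 320), make_packet(3, 480)]
ordered = [parse_rtp_header(p).sequence for p in playout_order(arrived)]
print(ordered)  # [1, 2, 3]
```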

Session Initiation Protocol (SIP)

SIP was developed by the SIP Working Group within the IETF; the protocol was first published as IETF RFC 2543 and later superseded by IETF RFC 3261. SIP is a telephone signaling protocol used by VoIP for initiating, managing and terminating voice sessions in Packet-Switched Networks. SIP sessions involve one or more participants and can use either unicast or multicast communication. SIP is text-encoded and highly extensible: it can be extended to accommodate features and services such as call control services, mobility and interoperability with existing telephony systems.

There are 4 types of logical SIP entities. Each one participates in SIP communication as a client (the entity which initiates Requests), as a server (the entity which responds to Requests), or as both. One network device can have the functionality of more than one logical SIP entity. The 4 types of logical SIP entities are the following:

1. USER AGENT (UA): initiates and terminates sessions by exchanging Requests and Responses. A UA is an application which contains both a User Agent Client (UAC) and a User Agent Server (UAS). The UAC is a client application that initiates SIP Requests, while the UAS is a server application that contacts the user when a SIP Request is received and returns a Response on behalf of the user. Devices with UA functions are: workstations, IP phones, Media Gateways, call agents and automated answering services;

2. PROXY SERVER: intermediary entity that acts as both a server and a client for the purpose of making Requests on behalf of other clients. Requests are serviced either internally or by passing them on (possibly after translation) to other servers. A Proxy interprets and, when necessary, rewrites a Request message before forwarding it;

3. REDIRECT SERVER: server that accepts a SIP Request, maps the SIP address of the called party into zero (if no address is known) or more new addresses and returns them to the client. It does not pass the Request on to other servers;

4. REGISTRAR: accepts REGISTER Requests in order to update a location database with the contact information of the user specified in the Request.

There are two types of SIP messages:

1. Request Messages: they’re sent from the client to the server;

2. Response Messages: they’re sent from the server to the client.

The Request Message types are the following:

INVITE: initiates a call; it can also change the parameters of an existing call, in which case it is called re-INVITE;

ACK: confirms a final response for the INVITE message;

BYE: used in order to terminate a call;

CANCEL: cancels searches and ringing;

OPTIONS: queries the capabilities of the other side;

REGISTER: used to register with the Location Service;

INFO: sends mid-session information that does not modify the session state.

Response Messages contain numeric codes. There are 2 types of responses and 6 classes. The Response types are the following:

1. Provisional: class 1xx. These responses are used by the server to indicate progress, but they do not terminate SIP transactions;

2. Final: classes 2xx, 3xx, 4xx, 5xx and 6xx. These responses terminate SIP transactions.

The classes, identified by their leading digit, are the following:

1xx: provisional, searching, ringing and queuing. Two examples of these messages are ‘100 Trying’ and ‘180 Ringing’;

2xx: success. An example is the message ‘200 OK’;

3xx: redirection and forwarding. Examples are the messages ‘301 Moved Permanently’ and ‘302 Moved Temporarily’;

4xx: request failure due to client mistakes. The messages ‘400 Bad Request’ and ‘408 Request Timeout’ are two examples;

5xx: server failures;

6xx: global failure such as busy, refusal, not available. The messages ‘600 Busy Everywhere’ and ‘604 Does Not Exist Anywhere’ are two examples.
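The mapping from a status code to its class, and to the provisional/final distinction, depends only on the leading digit. A minimal Python sketch of that rule follows; the function and table names are hypothetical.

```python
SIP_CLASSES = {
    1: "provisional",
    2: "success",
    3: "redirection",
    4: "request failure",
    5: "server failure",
    6: "global failure",
}


def classify_sip_status(code):
    """Return (class name, is_final) for a SIP status code."""
    leading_digit = code // 100
    if leading_digit not in SIP_CLASSES:
        raise ValueError(f"not a SIP status code: {code}")
    # Only 1xx responses are provisional; every other class ends the transaction.
    return SIP_CLASSES[leading_digit], leading_digit != 1


ringing = classify_sip_status(180)
ok = classify_sip_status(200)
print(ringing)  # ('provisional', False)
print(ok)       # ('success', True)
```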

SIP messages are composed of 3 parts:

1. Start Line: each SIP message begins with this part. The Start Line conveys the message type (method type in Requests and Response code in Responses) and the protocol version. The Start Line may be either a Request-line (in a Request message; it includes a Request-URI, which indicates the user or service to which the request is being addressed and which, unlike the “To” field, may be rewritten by proxies) or a Status-line (in a Response message; it holds the numeric Status-code and its associated textual phrase);


Figure 2. Trivial SIP session

2. Header Fields: used to convey message attributes and to modify message meaning. Just as an HTTP request from a browser is made using a URL, SIP uses e-mail-like addresses whose typical format is user/phone@domain/ip. Header fields can span multiple lines. Some SIP headers such as Via, Contact, Route and Record-Route can appear multiple times in a message or, alternatively, can take multiple comma-separated values in a single header occurrence;

3. Body: this is the content of the message and is used to describe the session to be initiated; it may include audio and video codec types, sampling rates, etc. Alternatively, it may contain opaque textual or binary data of any type which relates in some way to the session. Message bodies can appear both in Request and in Response Messages. Possible body types include Session Description Protocol (SDP) and Multipurpose Internet Mail Extensions (MIME).
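The three parts can be separated with very little code, because the Start Line is the first line and an empty line divides the Header Fields from the Body. The following Python sketch does this for a synthetic INVITE; the addresses (example.com, 192.0.2.10), the tag and the Call-ID are invented for the example, and the parser is a simplification that does not split comma-separated header values.

```python
def parse_sip_message(raw):
    """Split a SIP message into (start line, list of (name, value), body)."""
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    start_line = lines[0]
    headers = []
    for line in lines[1:]:
        if line[:1] in (" ", "\t") and headers:
            # A line starting with whitespace continues the previous header.
            name, value = headers[-1]
            headers[-1] = (name, value + " " + line.strip())
        else:
            name, _, value = line.partition(":")
            headers.append((name.strip(), value.strip()))
    return start_line, headers, body


# Synthetic message with made-up addresses and identifiers.
raw = (
    "INVITE sip:1000@example.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP 192.0.2.10:5060\r\n"
    "From: <sip:1234@example.com>;tag=abc\r\n"
    "To: <sip:1000@example.com>\r\n"
    "Call-ID: demo-call-1\r\n"
    "CSeq: 1 INVITE\r\n"
    "Content-Type: application/sdp\r\n"
    "\r\n"
    "v=0\r\n"
)

start_line, headers, body = parse_sip_message(raw)
print(start_line)
print(dict(headers)["CSeq"])
```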

Figure 2 shows a trivial SIP session, recorded by means of a Network Analyzer called Wireshark, in which an interaction between a UAC and a UAS is established and then terminated. The UAC and the UAS are identified in the capture by their IP addresses. In particular, packet 421 is an INVITE Request Message sent to the user 1000. The Response Message packets 423 and 424, belonging to class 1xx, indicate respectively that the call is being processed and that the called phone is ringing (ring back tone). After about 10 seconds the called user answers, as shown by packet 647, which carries an OK Response Message belonging to class 2xx: the telephone call is now established. The call lasts about 40 seconds, then the caller hangs up, as shown by packet 4985, which carries a BYE Request Message that closes the call.

Figure 3 shows the details of packet 421, again recorded by means of Wireshark. It is an INVITE Request Message in which the Start Line, Header Fields and Body are clearly visible.

Overview of Common VoIP Attacks

In the following, an overview of common VoIP attacks is reported. Each attack is executed by means of a dedicated hacking tool on a Linux platform. Before developing and explaining the attacks, let’s have a look at the test plant built by the author in order to develop the VoIP exploitation examples.


Figure 3. SIP INVITE details

Test Plant Characteristics

A basic Local Area Network scenario was set up in order to execute and explain the VoIP attacks reported in this article. The network devices and platforms involved in this test plant are the following:

UAS – Ubuntu 12.04.3 Server with Asterisk PBX – IP address, UDP port 5060;

UAC #1- Ubuntu 12.04.3 Server with Zoiper softphone – IP address, UDP port 37268 – extension 1000 – password authentication: mypasswd1;

UAC #2 – Windows 7 OS with X-Lite softphone – IP address, UDP port 5060 – extension 1234 – password authentication: youpasswd;

UAC #3 – Linux Mint OS with ZoIPer softphone – IP address, UDP port 47723 – extension 2000 – password authentication: mypasswd2;

Attacker – Linux Black Ubuntu – IP address;

Network Device – DELL Switch 2748

Information Gathering

In the previous section the features of the network devices were reported in order to help the reader understand the following examples, but in reality a network administrator would hide that information in order to make any attack harder. An attacker, relying only on his own means, must therefore discover all the information about the network before starting any kind of attack. This is always the first phase of an attack and is called Information Gathering: the attacker gathers as much information about the network devices as he can. In particular the attacker could be interested in: network hosts, network servers, PBX types and versions, VoIP Media Gateways, and SIP client types and versions.

Several free tools could be used by an attacker to accomplish this action: SMAP, SIPSAK, SIPSCAN and SVMAP. The author will use SVMAP, which belongs to a suite of SIP tools called SIPVICIOUS (other tools of this suite will be treated in the following sections). Some SVMAP capabilities are reported in the following list:

scan, identify and fingerprint a single target IP, an IP range or even an entire subnetwork;

network interface and local port selection for outgoing packets;

identify SIP devices and PBX servers on default and non-default ports;

scan just one host on different ports, looking for a SIP service on that host or just multiple hosts on multiple ports;

take previous scan results as input, allowing you to only scan known hosts running SIP;

use different scanning methods (OPTIONS, REGISTER, INVITE, etc.);

get all the phones on a network to ring at the same time (using INVITE as method);

randomly scan internet ranges;

resume previous scans.

SVMAP allows specifying the Request method that will be used for scanning (OPTIONS by default); you can specify a different method, such as REGISTER or INVITE (attention: the INVITE method can be noisy and generate a “ring” at the other end). The list of usable methods is the following:

INVITE: a client is being invited to participate in a call session;

ACK: confirms that the client has received a final response to an INVITE request;

BYE: terminates a call and can be sent by either the caller or the callee;

CANCEL: deletes any pending request;

OPTIONS: queries the capabilities of servers;

REGISTER: registers the address listed in the To header field with a SIP server;

PRACK: provisional acknowledgement;

SUBSCRIBE: subscribes to notification of an Event from the Notifier;

NOTIFY: notifies the subscriber of a new Event;

PUBLISH: publishes an event to the Server;

INFO: sends mid-session information that does not modify the session state;

REFER: asks recipient to issue SIP request (call transfer);

MESSAGE: transports instant messages using SIP;

UPDATE: modifies the state of the session without changing the state of the dialog.
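To make the idea of a method-based probe concrete, the sketch below only builds the text of a minimal OPTIONS Request of the kind an administrator could send to their own PBX to check what it reveals. It is not the message that SVMAP actually emits: the header values, the “probe” user and the 192.0.2.x addresses are assumptions made for the example, and nothing is sent on the network.

```python
import uuid


def build_options_request(target_host, local_host, local_port=5060):
    """Return the text of a minimal SIP OPTIONS request (nothing is sent)."""
    # RFC 3261 requires the Via branch parameter to start with "z9hG4bK".
    branch = "z9hG4bK" + uuid.uuid4().hex[:16]
    lines = [
        f"OPTIONS sip:{target_host} SIP/2.0",
        f"Via: SIP/2.0/UDP {local_host}:{local_port};branch={branch}",
        "Max-Forwards: 70",
        f"From: <sip:probe@{local_host}>;tag={uuid.uuid4().hex[:8]}",
        f"To: <sip:{target_host}>",
        f"Call-ID: {uuid.uuid4().hex}@{local_host}",
        "CSeq: 1 OPTIONS",
        "Content-Length: 0",
        "",
        "",
    ]
    # The two trailing empty strings produce the blank line that ends the headers.
    return "\r\n".join(lines)


# Hypothetical addresses from the documentation range 192.0.2.0/24.
request = build_options_request("192.0.2.20", "192.0.2.99")
print(request)
```

A server that answers such a Request typically identifies itself in its Response headers, which is what makes fingerprinting possible.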



Figure 4. Network Scanning with SVMAP

Furthermore, SVMAP offers debug and verbosity options and allows scanning the SRV records for SIP on the destination domain. SRV records are a type of DNS entry that specifies information about a service available in a domain; typically they are used by clients who want to know the location of the service within a domain.
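SRV records are looked up under names of the form _service._protocol.domain. The short Python sketch below lists the names conventionally used for SIP; example.com is a placeholder domain, the function name is hypothetical, and the sketch only builds the names without performing any DNS query.

```python
def sip_srv_names(domain):
    """Return the SRV owner names conventionally used to locate SIP services."""
    return [
        f"_sip._udp.{domain}",   # SIP over UDP
        f"_sip._tcp.{domain}",   # SIP over TCP
        f"_sips._tcp.{domain}",  # SIP over TLS
    ]


names = sip_srv_names("example.com")
print(names)
```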

Figure 4 reports a scan of the entire network executed by means of SVMAP with fingerprinting enabled. As you can see in the picture, the scan has found three SIP client devices (two ZoIPer softphones and one X-Lite softphone, as reported in the previous section) and one SIP server (the Asterisk PBX, again as reported in the previous section). Notice that devices 105 and 108 are two UACs which listen on non-default SIP UDP ports. This scan does not use the default OPTIONS method but INVITE, which sends an INVITE SIP message to each scanned client; it is not a very silent method, since it causes one ring on each UAC.

Since the countermeasures against Information Gathering are the same as those against Extensions Enumeration, they will be reported in the next section.

Extensions Enumeration

Extensions Enumeration is an important VoIP attack used to identify the live SIP extensions. SVWAR is a free SIP extension line scanner and will be used by the author to carry out this kind of attack. SVWAR also belongs to the SIPVICIOUS suite and works similarly to traditional wardialers, by guessing a range of extensions or trying a given list of extensions. Some SVWAR capabilities are reported in the following list:

identify extensions on PBXs and through SIP proxies;

scan for large ranges of numeric extensions;

scan for extensions using a file containing a list of possible extension names;

use different SIP request methods for scanning since not all PBX servers behave the same;

resume previous scans.


Figure 5. Extensions Enumeration with SVWAR

Figure 5 shows a scan for user extensions from 1000 to 1500 obtained with the default Request method (REGISTER). As you can see in the picture, the result lists the user extensions registered on the PBX and shows that each UAC needs an authentication password in order to set up a call.

Avoiding Information Gathering and Extensions Enumeration is not an easy task: you cannot block SIP messages such as INVITE, OPTIONS, REGISTER, etc., since they are essential to set up a VoIP call; you can only stop these messages when they are received in rapid succession. Another countermeasure that a network administrator could take is to set up a firewall on the UAS by means of Access Control Lists (ACLs), so that the UAS accepts INVITEs only from devices with trusted IP addresses. Since ACLs do not prevent ARP spoofing and Caller ID spoofing attacks (they will be treated in the following sections), stronger network protection requires that switches be configured properly: all unused ports should be disabled and used ports must be configured with the port-security option in order to keep intruder devices out of the network.
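Stopping messages that arrive in rapid succession amounts to rate limiting per source. The following Python sketch shows one possible approach, a sliding window per source IP address. The class name, the threshold of 3 Requests per second and the address 192.0.2.66 are assumptions chosen for the example; a real deployment would more likely rely on the rate-limiting features of the PBX or of the firewall.

```python
from collections import defaultdict, deque


class SipRateLimiter:
    """Allow at most max_requests per source within a sliding time window."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window = window_seconds
        self.history = defaultdict(deque)  # source IP -> recent timestamps

    def allow(self, source_ip, now):
        recent = self.history[source_ip]
        # Forget requests that have fallen out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.max_requests:
            return False  # too many requests in rapid succession
        recent.append(now)
        return True


limiter = SipRateLimiter(max_requests=3, window_seconds=1.0)
arrival_times = (0.0, 0.1, 0.2, 0.3, 1.5)
decisions = [limiter.allow("192.0.2.66", t) for t in arrival_times]
print(decisions)  # [True, True, True, False, True]
```

The fourth Request is refused because it is the fourth within one second; the fifth is accepted again because the earlier ones have aged out of the window.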


Figure 6. Spoofing UAS with ARPSPOOF


Figure 7. Spoofing UAC#1 with ARPSPOOF


Eavesdropping

Eavesdropping is the act of secretly listening to a VoIP conversation of others without their consent. This can be done by means of “packet capture”, which is the process of intercepting and logging traffic by means of Network Analyzers.

As already reported in previous sections, a Network Analyzer is a computer program (such as Wireshark) or a piece of computer hardware that can intercept and log traffic passing over a particular type of network, such as Ethernet or wireless. As data streams flow across the network, the sniffer captures each packet and, if needed, decodes it, showing the values of the various fields according to the appropriate RFC or other specification. Packet capture can be used by attackers on VoIP networks in order to capture SIP Requests and RTP data sent from UAC to UAS and back. In this section call Eavesdropping is obtained by means of a Man In the Middle (MITM) attack: the attacker makes independent connections with the victims and relays messages between them, making them believe that they are talking directly to each other over a private connection, while the entire conversation is instead controlled by the attacker. In order to obtain a MITM position, the attacker can send fake (“spoofed”) Address Resolution Protocol (ARP) messages on the Local Area Network (LAN). Their aim is to associate the attacker’s Media Access Control (MAC) address with the IP address of the PBX, so that any traffic meant for that IP address is sent to the attacker instead. This technique is called ARP spoofing.

Figures 6 and 7 report the ARP spoofing technique executed by the author by means of the ARPSPOOF tool: the first figure reports the spoofing of the UAS (PBX) and the second the spoofing of UAC#1 (Linux Mint Box). With these two commands, the attacker sends forged, unsolicited (gratuitous) ARP messages to the UAS and to UAC#1, announcing the attacker’s own MAC address as the one belonging to the other party’s IP address. Once the commands are executed, the ARP caches of the UAS and UAC#1 are poisoned and all packets exchanged between the UAS and UAC#1 pass through the attacker’s Linux Box, so that the attacker can record an entire conversation.

Figure 8 reports a call trace between UAC#1 and the UAS captured by means of Wireshark on the attacker’s Linux Box; as you can see in the picture, a SIP handshake is followed by RTP traffic. Wireshark stores its call traces in .pcap files (it captures packets by means of a library called libpcap) and provides a feature which decodes and plays RTP voice packets. Figure 9 reports an example of this feature.


Figure 8. Man in the Middle Registration

One countermeasure against eavesdropping can again be obtained by configuring the network properly, using static ARP entries. Since static ARP is not always possible, another way to avoid this attack is to use UAs on platforms which refuse GARP messages, for example Linux or Solaris. Finally, the last countermeasure against Eavesdropping is voice encryption: if the audio signal is encrypted, an attacker who captures it cannot play it back without the key. Voice encryption can be obtained by means of Secure RTP (SRTP), a standard (RFC 3711) providing encryption and authentication of RTP.
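Where static ARP is not practical, a monitoring host can at least notice when the MAC address announced for a known IP address suddenly changes, which is the visible symptom of ARP cache poisoning. The Python sketch below applies this check to a list of (IP, MAC) pairs that would have to be collected elsewhere, for example from captured ARP replies. The function name and all the addresses are invented for the example, and legitimate changes (a replaced network card, a DHCP reassignment) would also raise an alert.

```python
def find_arp_conflicts(observations):
    """Report every IP address whose announced MAC address changes.

    observations: iterable of (ip, mac) pairs in the order they were seen.
    Returns a list of (ip, previous_mac, new_mac) tuples.
    """
    known = {}
    alerts = []
    for ip, mac in observations:
        mac = mac.lower()
        previous = known.get(ip)
        if previous is not None and previous != mac:
            alerts.append((ip, previous, mac))
        known[ip] = mac
    return alerts


# Hypothetical observations: the PBX address is suddenly claimed by another MAC.
seen = [
    ("192.0.2.1", "00:11:22:33:44:55"),   # PBX, genuine
    ("192.0.2.10", "00:11:22:33:44:66"),  # softphone
    ("192.0.2.1", "DE:AD:BE:EF:00:01"),   # PBX address, different MAC
]
alerts = find_arp_conflicts(seen)
print(alerts)
```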

Telephone Tampering

Another attack that can be performed by means of MITM is Telephone Tampering, a form of sabotage which consists of the intentional modification of the carried signals in a way that makes them harmful to the user. RTP is the media protocol which makes VoIP vulnerable to Tampering: RTP is often sent unencrypted and runs over UDP, a transport protocol without security features.

An attacker can capture an RTP packet (by means of a MITM attack) and create RTP packets similar to the original but with a greater timestamp and sequence number. In this way the attacker can trick the victim endpoint into rejecting RTP packets from the legitimate endpoint in favor of the injected ones, since the original packets appear old. As the injected packets carry a valid and unchanged SSRC (the synchronization source identifier that characterizes the current session), they are accepted as part of the original transmission. Telephone Tampering can have very serious consequences, because caller and called party consider each other trusted parties.
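The reason why the legitimate packets lose can be seen in a toy model of a receiver that accepts a packet only if it carries the expected SSRC and a sequence number newer than the last one played. This is a deliberate simplification written for this article's argument, not the behaviour of any specific softphone; the class name, the SSRC value and the sequence numbers are invented.

```python
class ToyRtpReceiver:
    """Accept packets with the expected SSRC and a newer sequence number."""

    def __init__(self, ssrc):
        self.ssrc = ssrc
        self.highest_seq = -1
        self.played = []

    def receive(self, ssrc, seq, label):
        if ssrc != self.ssrc or seq <= self.highest_seq:
            return False  # wrong stream, or the packet looks old
        self.highest_seq = seq
        self.played.append(label)
        return True


rx = ToyRtpReceiver(ssrc=0xCAFE)
rx.receive(0xCAFE, 100, "legit")
rx.receive(0xCAFE, 105, "injected")  # same SSRC, sequence number ahead
rx.receive(0xCAFE, 101, "legit")     # now appears old and is rejected
rx.receive(0xCAFE, 106, "injected")
print(rx.played)  # ['legit', 'injected', 'injected']
```

In this model nothing distinguishes a forged packet from a genuine one except its header fields, which is why authenticating each packet, as SRTP does, removes the weakness.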


Figure 9. Wireshark Player

Figure 10 shows an example of the Telephone Tampering attack obtained by means of the RTPINSERTSOUND tool, which can be used to inject a .wav file (selected by the attacker) into the RTP stream, replacing the voice signal from one side with the signal contained in the .wav audio file.


Figure 10. Telephone Tampering with RTPINSERTSOUND


v – stands for verbose output;

i eth0 – interface selected;

a – source IPv4 address;

A – source UDP port;

b – destination (victim) IPv4 address;

B – destination UDP port;

f – spoof factor;

j – jitter factor.

Figure 11 reports the help output of another tool used by attackers to perform Telephone Tampering, called RTPMIXSOUND. A countermeasure against tampering is, once again, voice encryption. Moreover, a VoIP/SIP firewall could be placed “in front” of all the VoIP phones to monitor incoming and outgoing RTP and detect audio insertion/mixing attacks.

Authentication Attacks

In the past SIP used weak authentication in which the password was sent in plain text, making it easy to obtain for anyone who could get access to SIP messages. Since this authentication was insecure it was deprecated and now, in SIP 2.0, the MD5 message-digest algorithm is used for hashing the UAC password. When a UAC wants to authenticate with a UAS, the UAS generates and sends a digest challenge to the UAC. The simplest authentication challenge that a UAS can send contains a Realm (used to identify credentials within a SIP message; usually it is the SIP domain) and a Nonce (a unique string generated by the UAS for each registration request; it is typically derived from a time stamp and a secret phrase to ensure a limited lifetime, and it cannot be used again), as reported in the following:

WWW-Authenticate: Digest algorithm=MD5, realm="asterisk", nonce="3cf75870"

Once the UAC receives the digest challenge and the user enters his credentials, the client uses the nonce to generate a digest response and sends it back to the server:

Authorization: Digest username="1234", realm="asterisk", nonce="3cf75870", uri="sip:[email protected]", response="cf89107228a444c1e8b761dfb6e669e4", algorithm=MD5

The UAS then performs the same computation to arrive at its own MD5 hash; if it matches the one supplied by the UAC, the UAS responds with a “200 OK” message and the UAC is authenticated.
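The computation performed on both sides is the basic HTTP Digest scheme (without the optional qop parameter): the response is the MD5 hash of HA1, the nonce and HA2, where HA1 hashes username, realm and password and HA2 hashes the method and the URI. The Python sketch below implements it. The username, realm, nonce and password are taken from the test plant, while the method and the URI (sip:192.0.2.1) are assumptions, so the result is not expected to reproduce the response value shown above.

```python
import hashlib


def md5_hex(text):
    """Return the MD5 digest of a string as lowercase hexadecimal."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def sip_digest_response(username, realm, password, method, uri, nonce):
    """Digest response without qop: MD5(HA1:nonce:HA2)."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")
    ha2 = md5_hex(f"{method}:{uri}")
    return md5_hex(f"{ha1}:{nonce}:{ha2}")


# The URI and the method are assumptions made for the example.
challenge = {"realm": "asterisk", "nonce": "3cf75870"}
client_response = sip_digest_response(
    "1234", challenge["realm"], "youpasswd", "REGISTER", "sip:192.0.2.1", challenge["nonce"])

# The server repeats the computation with the password it has stored.
server_response = sip_digest_response(
    "1234", challenge["realm"], "youpasswd", "REGISTER", "sip:192.0.2.1", challenge["nonce"])
print(client_response == server_response)  # True
```

The password itself never travels on the network, but everything else needed to recompute the response does, which is what the attacks described below exploit.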


Figure 11. RTPMIXSOUND help command


Figure 12. SIPDUMP use example


Figure 13. SIPDUMP Hash values

Even hashed passwords might not be safe enough to protect against Authentication Attacks, since it is possible to crack an MD5 hash, especially when short or overly simple passwords are used: an attacker could obtain the SIP authentication header with a Network Analyzer and perform a dictionary or brute-force attack.

In the following, two examples of Authentication Attacks are reported. The author’s choice of tool is SIPCRACK (but SIPVICIOUS could be used again, with a tool called SVCRACK). Before showing how to crack a SIP authentication password, the author must introduce another tool belonging to the SIPCRACK suite, called SIPDUMP. The purpose of SIPDUMP is to extract the MD5 authentication challenge values from a SIP session and write them into a separate file; to do this it can work either in batch mode (with a pre-recorded .pcap file) or in live mode (during a MITM attack in progress).

Figure 12 reports a usage example of SIPDUMP; in this case the MD5 values were extracted from a trace file (.pcap) recorded during a previous MITM attack. The trace was captured during a call between UAC#1 and UAC#2, in which UAC#2 is calling UAC#1 (which runs on the same network device as the UAS, but this does not imply a loss of generality). As you can see in the picture, the MD5 values are stored in a file called hash.txt, whose content is reported in Figure 13. Since UAC#2 is calling UAC#1, the victim of this Authentication Attack will be UAC#2. Now the file hash.txt can be used to crack the authentication password by means of the SIPCRACK tool.

Let’s start with a Dictionary Attack, which compares the MD5 value computed for each password in a password list with the MD5 value selected from the file hash.txt. Figure 14 shows an example of this attack executed with a password dictionary, called ps.txt, which contains more than two million alphanumeric passwords; as you can see in the picture, the author has selected the first MD5 value, belonging to the SIP INVITE Request Message, and the software has correctly cracked the password in just 2 seconds.
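The comparison at the heart of a Dictionary Attack is also a convenient way for an administrator to audit the passwords of their own extensions. The Python sketch below recomputes the digest for each word of a tiny in-memory list and stops at the first match. It does not reproduce how SIPCRACK works internally; the function names, the four-word list, the method and the URI are assumptions, and the observed response is generated inside the example rather than taken from Figure 13.

```python
import hashlib


def digest_response(username, realm, password, method, uri, nonce):
    """Digest response without qop: MD5(MD5(user:realm:pass):nonce:MD5(method:uri))."""
    def md5_hex(text):
        return hashlib.md5(text.encode("utf-8")).hexdigest()
    ha1 = md5_hex(f"{username}:{realm}:{password}")
    ha2 = md5_hex(f"{method}:{uri}")
    return md5_hex(f"{ha1}:{nonce}:{ha2}")


def audit_password(wordlist, username, realm, method, uri, nonce, observed):
    """Return the first word whose digest equals the observed response, else None."""
    for candidate in wordlist:
        if digest_response(username, realm, candidate, method, uri, nonce) == observed:
            return candidate
    return None


# Values sniffed together with the response (method and URI are assumed here).
params = ("1234", "asterisk", "INVITE", "sip:1000@192.0.2.1", "3cf75870")
observed = digest_response(params[0], params[1], "youpasswd", *params[2:])

wordlist = ["123456", "password", "youpasswd", "letmein"]
found = audit_password(wordlist, *params, observed)
print(found)  # youpasswd
```

A password that appears in a public word list falls at the first pass, which is why the two-second result reported above is not surprising.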


Figure 14. Dictionary Attack with SIPCRACK


Figure 15. Brute-Force Attack with SIPCRACK

Finally, let’s have a look at the second Authentication Attack, called Brute Force. The name derives from the fact that it tries every possible combination of alphanumeric characters in order to discover the correct password. An auxiliary tool called JOHN THE RIPPER is used to generate the candidate passwords. Figure 15 shows the attack: as you can see at the top of the picture, the author has first created a FIFO pipe called j2s and has used this pipe to pass the passwords generated by JOHN THE RIPPER (which is generating alphanumeric passwords with a maximum length of 8 characters) to SIPCRACK. Since SIPCRACK has already cracked the first MD5 value stored in hash.txt during the previous attack, the only remaining target is the second MD5 value, belonging to the BYE Request Message. The author has interrupted the attack for the sake of brevity, since this kind of attack can take a long time, hours or even days.
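The duration of a Brute Force attack follows directly from the size of the keyspace. The Python sketch below counts the candidates for alphanumeric passwords of up to 8 characters and converts the count into a worst-case duration. Two figures are assumptions: an alphabet of 62 symbols (lowercase, uppercase and digits; the exact character set used in Figure 15 is not stated) and a purely illustrative rate of one million guesses per second.

```python
def keyspace_size(alphabet_size, max_length):
    """Number of passwords of length 1..max_length over the given alphabet."""
    return sum(alphabet_size ** length for length in range(1, max_length + 1))


def worst_case_days(candidates, guesses_per_second):
    """Days needed to try every candidate at a constant guessing rate."""
    return candidates / guesses_per_second / 86400


total = keyspace_size(62, 8)                        # assumed 62-symbol alphabet
days = worst_case_days(total, 1_000_000)            # assumed guessing rate
print(total)
print(round(days, 1))
```

Under these assumptions the keyspace is many orders of magnitude larger than the two-million-word dictionary, which explains why each additional character of password length matters so much.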

One countermeasure that a network administrator could take into account is to use strong passwords, but the only countermeasure that really avoids this kind of threat is to employ a Public Key Infrastructure between UAS and UAC.

Denial of Service (DoS) Attack

A Denial of Service (DoS) attack on a VoIP network can render it useless by damaging the availability of its systems. It is one of the most dangerous attacks, since VoIP endpoints are often not equipped to protect themselves against it. Generally, a DoS attack floods the network with a large amount of data (invalid or broken packets) in order to consume device resources, which could be physical (CPU usage) or logical (exploitation of protocol features), and to overwhelm the device while it processes those packets. At the same time valid packets do not reach the system, resulting in interrupted conversations and halted call processing, because VoIP uses complex protocols for communication and even small delays in processing packets can seriously degrade conversations. There are several basic types of DoS attack that occur over an IP network:

1. Flood DoS: an attacker sends a very large number of packets to a victim device, which is kept busy processing malicious packets while dropping or delaying legitimate ones. This attack can be performed as a Distributed DoS (DDoS), where multiple systems are used to generate a massive flood of packets;

2. Implementation flaw DoS: an attacker creates malformed packets (they could be very long or syntactically incorrect) in order to cause the target to fail;

3. Application-level DoS: the attacker manipulates a feature of the VoIP service in order to create an attack (for example, hijacking the registration of an IP phone can cause the loss of any inbound calls to that phone);

4. Platform DoS: an attacker can create a DoS by targeting a critical underlying support service (for example a flaw in a network protocol implementation of the target OS).

The INVITEFLOOD tool can be used to flood a target with INVITE Request Messages; it can be used against both UAS and UAC. As long as the tool keeps flooding the PBX, it prevents users from making phone calls. Figure 16 reports an attack carried out by the author with this tool, in which the number of INVITE packets sent to flood the victim was set to 100. During the attack the victim device is practically unusable, since it needs a significantly longer time to establish a connection. Moreover, you can flood the PBX with a nonexistent extension, making it generate a “404 Not Found” just to keep it busy. Figure 17 reports a capture of the packets received by the victim, obtained again by means of Wireshark: you can see that a lot of INVITE Request Messages were sent to the victim.


Figure 16. DoS with INVITEFLOOD


Figure 17. DoS packets registration


Figure 18. RTPFLOOD help command


Figure 19. TEARDOWN help command

RTPFLOOD is another tool, used to flood a victim with UDP packets containing RTP data. For an attack with RTPFLOOD to succeed, you need to know the RTP listening port used by the victim device (for example, the default UDP port of the X-Lite softphone is 8000). Figure 18 shows the RTPFLOOD help command.

TEARDOWN is a tool used to terminate a call by sending a BYE Request Message. Before using TEARDOWN you must capture a valid SIP OK Response Message in order to reuse its “From” and “To” tags and a valid Call-ID value. Figure 19 shows the TEARDOWN help command.
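The reason these values are needed is that, according to RFC 3261, a SIP dialog is identified by the Call-ID header together with the tags of the From and To headers, so a BYE is accepted only if it carries the identifiers of an existing dialog. The sketch below shows how those three values can be read from a captured 200 OK. The sample message and the function names are hypothetical, and the parser is deliberately simplified: it ignores folded headers and the compact header forms.

```python
import re

# Hypothetical 200 OK response, as it could appear in a capture.
SAMPLE_OK = (
    "SIP/2.0 200 OK\r\n"
    "Via: SIP/2.0/UDP 192.168.1.10:5060;branch=z9hG4bK776asdhds\r\n"
    'From: "Alice" <sip:100@192.168.1.1>;tag=1928301774\r\n'
    "To: <sip:101@192.168.1.1>;tag=a6c85cf\r\n"
    "Call-ID: a84b4c76e66710@192.168.1.10\r\n"
    "CSeq: 314159 INVITE\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)

TAG_RE = re.compile(r";\s*tag=([^;\s>]+)")


def parse_headers(message):
    """Return a dict of lower-cased header names to values (last one wins)."""
    head = message.split("\r\n\r\n", 1)[0]
    headers = {}
    for line in head.split("\r\n")[1:]:  # skip the status line
        if ":" not in line:
            continue
        name, value = line.split(":", 1)
        headers[name.strip().lower()] = value.strip()
    return headers


def dialog_identifiers(message):
    """Extract the Call-ID and the From/To tags that identify the dialog."""
    headers = parse_headers(message)

    def tag_of(value):
        match = TAG_RE.search(value)
        return match.group(1) if match else None

    return {
        "call_id": headers.get("call-id"),
        "from_tag": tag_of(headers.get("from", "")),
        "to_tag": tag_of(headers.get("to", "")),
    }


print(dialog_identifiers(SAMPLE_OK))
```

Since these identifiers travel in clear text when SIP runs over plain UDP, anyone able to sniff the signalling can collect them, which is what makes this attack practical.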

To mitigate DoS attacks, a network administrator can introduce a logical network partition called Voice VLAN. The basic concept behind the Voice VLAN is that you dedicate a separate VLAN with a separate subnet to voice traffic; this keeps contention between data and voice to a minimum and is easier to manage. Another solution is a stateful firewall with application inspection capabilities, policy enforcement to limit flooded packets, and out-of-band management, which lets the network administrator respond to network events while the attack is under way by means of network monitoring.
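The idea of a policy that limits flooded packets can be illustrated with a sliding-window rate limiter applied per source address. This is a minimal sketch and not the configuration of any particular firewall: the class name, the default limits and the addresses are assumptions, and timestamps are passed in explicitly to keep the example deterministic.

```python
from collections import defaultdict, deque


class InviteRateLimiter:
    """Allow at most `max_invites` INVITEs per source within a time window."""

    def __init__(self, max_invites=20, window_seconds=1.0):
        self.max_invites = max_invites
        self.window_seconds = window_seconds
        self._seen = defaultdict(deque)  # source IP -> timestamps of accepted INVITEs

    def allow(self, source_ip, now):
        """Return True if an INVITE arriving at time `now` should be accepted."""
        timestamps = self._seen[source_ip]
        # Forget the INVITEs that have fallen out of the window.
        while timestamps and now - timestamps[0] >= self.window_seconds:
            timestamps.popleft()
        if len(timestamps) >= self.max_invites:
            return False  # over the limit: drop or delay this request
        timestamps.append(now)
        return True


limiter = InviteRateLimiter(max_invites=3, window_seconds=1.0)
for t in (0.0, 0.1, 0.2, 0.3):
    print(t, limiter.allow("192.168.1.66", t))
```

A limit per source does not stop a distributed flood, where each attacking host stays below the threshold, so in practice it would be combined with the partitioning and monitoring measures described above.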

Spoofing Caller ID

The caller ID is fairly easy to spoof in SIP: you just need to change the From header of the SIP INVITE Request Message. Several tools can be used to spoof the caller ID, for example SVWAR, a tool already used in a previous section and belonging to the SIPVICIOUS suite. The author’s choice for this attack is again INVITEFLOOD, although in this example it is used not to flood the VoIP phone but to fake the caller ID. Figure 20 shows this kind of attack: INVITEFLOOD sends a single INVITE Request Message to the victim in order to spoof the caller ID (-a “spoofed”) and make the victim phone ring. Figure 21 shows the caller ID “spoofed” displayed by X-Lite as the “Incoming Call”.


Figure 20. Caller ID spoofing with INVITEFLOOD

Finally, Figure 22 shows the Message Header of the packet captured by Wireshark. As you can see, “spoofed” is the fake caller ID reported in the Message Header; this false information hides the original caller information and might mislead the receiver.


Figure 21. X-Lite rings displaying a spoofed ID


Figure 22. Spoofed SIP INVITE

The only effective countermeasures involve authenticating the sender and/or the From: header. When coupled with the use of Public Key Infrastructures between UAS and UAC, digest authentication can be used securely to authenticate the UAC. This approach enhances authentication, but it provides only hop-by-hop security, and it breaks down if any participating proxy does not support Public Key Infrastructures and/or cannot be trusted.
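One way a server could apply this idea is to compare the user part of the From: header with the username that has just passed digest authentication, and to reject the request when they differ. The sketch below only illustrates that comparison: the function names and sample values are hypothetical, the digest verification itself is assumed to have already happened elsewhere, and the parser handles only the common form with the URI in angle brackets.

```python
import re

# Matches: optional display name (quoted or bare), then <sip:user@host>.
FROM_RE = re.compile(
    r'^\s*(?:"(?P<quoted>[^"]*)"|(?P<token>[^<"]*?))\s*'
    r"<sips?:(?P<user>[^@>;]+)@(?P<host>[^>;]+)>"
)


def parse_from(value):
    """Split a From: header value into display name, user and host."""
    match = FROM_RE.match(value)
    if match is None:
        return None
    if match.group("quoted") is not None:
        display = match.group("quoted")
    else:
        display = match.group("token").strip()
    return {"display": display, "user": match.group("user"), "host": match.group("host")}


def from_matches_authenticated_user(from_value, auth_username):
    """True only if the From: user equals the digest-authenticated username."""
    parsed = parse_from(from_value)
    return parsed is not None and parsed["user"] == auth_username


# Hypothetical values: extension 100 authenticated, but From: claims another identity.
print(from_matches_authenticated_user('"spoofed" <sip:spoofed@192.168.1.1>;tag=abc', "100"))
print(from_matches_authenticated_user('"Alice" <sip:100@192.168.1.1>;tag=abc', "100"))
```

Note that the display name is free text chosen by the sender, so even when the user part matches, a server may still want to overwrite or strip the display name before forwarding the call.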


The aim of this article was to provide an overview of a reliable VoIP hacking methodology that could be used against a VoIP network. Attack vectors including Information Gathering, Extensions Enumeration, Eavesdropping, Telephone Tampering, Authentication Attacks, Denial of Service and Identity Spoofing are reported and explained by means of real examples carried out with embedded tools. Moreover, the countermeasures reported in this article should be used by system administrators, penetration testers or network engineers to mitigate possible security threats.

On the Web

https://www.ietf.org/rfc/rfc3550.txt – RTP Request for Comments

https://www.ietf.org/rfc/rfc3261.txt – SIP Request for Comments

https://www.ietf.org/rfc/rfc3711.txt – SRTP Request for Comments

https://www.asterisk.org/ – Asterisk PBX website

https://www.x-lite.it/ – X-Lite SoftPhone website

https://www.zoiper.com/ – Zoiper website

https://www.wireshark.org/ – Wireshark website

https://blog.sipvicious.org/ – SIPvicious Blog

https://www.openwall.com/john/ – John The Ripper web page

About the Author


Mirko Raimondi obtained his Master’s degree in Computer Science from the Computer Science Department of the University of Milan. He worked as a Software Engineer at ITALTEL, a leading Italian company in the telecommunications industry, where he was the project leader of Netmatch-S Lite Edition, a VoIP Session Border Controller based on a virtual platform and running on commercial hardware. In the ITALTEL test plant he built testing scenarios by means of Cisco L2/L3 devices, and he is working towards a CCNA Security certification. Currently, he works in the automotive industry, where he has built an audio/video/meta-data multiplexer in order to hide GPS data in mov files. He is interested in VoIP telecommunications, network security, steganography methods and computer forensics. You can contact him through LinkedIn: https://it.linkedin.com/pub/mirko-raimondi/14/182/58a


May 22, 2014