VoIP Protocols
When shopping for VoIP services and equipment, you will often see
references to H.323 and SIP. These are the two most common protocols
used for handling VoIP calls, but there are also many others.
What Is a Protocol
When we speak of protocols, we are referring to a set of rules
that must be followed in order to allow two or more communication
devices to 'talk' to each other. In the Internet and computer worlds,
there are many different protocols which have been established.
The Internet has protocols for various purposes depending on the
type of data that is being transmitted and its relative importance.
Protocols can be layered -- used with each other to form a set of
protocols which must be recognized at every point along the Internet
pathways.
The basic protocol for the Internet is the Internet
Protocol (IP). This allows computers to send data back and forth,
but offers very little guarantee that the data will arrive intact. Other
layers are used on top of IP in order to guarantee data integrity or
speed of delivery. VoIP depends on rapid delivery of data
packets, but is not overly concerned if a few of the packets are dropped
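en route. When data integrity is important (for example when
transmitting program files) a protocol like TCP (Transmission Control
Protocol) is used on top of IP. However, TCP retransmits lost packets and
holds back later data until they arrive, and those delays make it a
poor fit for live voice.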
SIP
SIP stands for Session Initiation Protocol. It is becoming the standard
for VoIP, and most VoIP service providers and soft phones
use or at least offer this protocol.
SIP defines standards for a number of different services including
caller identification, conference calls, call forwarding, and user
mobility. SIP addresses are similar in form to email addresses (for
example sip:alice@example.com, a made-up address) and so can be used on
web sites for 'Call Me Now' links.
As well as being able to handle voice calls, SIP is also suitable for
setting up sessions that carry other media such as video or music.
When used for VoIP, SIP assigns each user a unique address. This
address is independent of actual physical location, so the same SIP
address can be used by one user anywhere in the world. To initiate a
SIP call, the caller sends an INVITE request to the person they wish
to speak to. The INVITE request is part of the SIP standard, and is
handled transparently by the software or hardware that the caller is
using.
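To give a rough idea of what such a request looks like on the wire, the sketch below assembles a bare-bones SIP INVITE as text. The `build_invite` helper, the addresses, and the branch, tag and Call-ID values are all made up for illustration. A real phone would generate those identifiers randomly and would normally attach a session description as the message body.

```python
def build_invite(caller: str, callee: str, local_host: str, call_id: str) -> str:
    """Assemble a minimal SIP INVITE with no body. Illustrative only."""
    user = caller.split("@")[0]
    lines = [
        f"INVITE sip:{callee} SIP/2.0",
        # The branch value is a placeholder; real phones generate a unique one.
        f"Via: SIP/2.0/UDP {local_host};branch=z9hG4bK-example",
        "Max-Forwards: 70",
        f"From: <sip:{caller}>;tag=example-tag",
        f"To: <sip:{callee}>",
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        f"Contact: <sip:{user}@{local_host}>",
        "Content-Length: 0",
    ]
    # SIP is a text protocol: each header line ends in CRLF,
    # and an empty line marks the end of the headers.
    return "\r\n".join(lines) + "\r\n\r\n"


request = build_invite(
    caller="alice@example.com",
    callee="bob@example.org",
    local_host="192.0.2.10",
    call_id="a84b4c76e66710",
)
print(request)
```

Because the request is plain text, it is easy to read in a packet capture, which helps when troubleshooting call setup.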
As the other party is being searched for, response codes are sent to the
call initiator. There are separate codes for searching, ringing, and
success, as well as codes to indicate server failures or that the other
party is not available.
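SIP response codes are three-digit numbers grouped by their first digit, much like HTTP status codes. The sketch below sorts a few well-known codes into those groups. The `classify_sip_status` helper is a hypothetical name used only for this example.

```python
def classify_sip_status(code: int) -> str:
    """Return the response class for a SIP status code."""
    classes = {
        1: "provisional",    # call is still being set up (searching, ringing)
        2: "success",
        3: "redirection",
        4: "client error",   # for example, the other party is busy or unavailable
        5: "server error",
        6: "global failure",
    }
    if not 100 <= code <= 699:
        raise ValueError(f"not a SIP status code: {code}")
    return classes[code // 100]


# A handful of standard codes and their reason phrases.
EXAMPLES = {
    100: "Trying",
    180: "Ringing",
    200: "OK",
    480: "Temporarily Unavailable",
    486: "Busy Here",
    500: "Server Internal Error",
}

for code, reason in EXAMPLES.items():
    print(code, reason, "->", classify_sip_status(code))
```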
Once the call has finished, a BYE request is sent to terminate the
session.
H.323
Like SIP, H.323 can be used for transmitting multimedia data. It was
developed with multimedia data transmission in mind, something that
makes it ideal for VoIP. It also has a number of features for
interacting with PSTN (Public Switched Telephone Network). For example,
included in its specifications is the ability to send and receive faxes
-- something that poses technical difficulties with SIP.
It was originally developed for multimedia streams over a LAN, and was
widely accepted in this role. The standards of H.323 have received wide
acceptance and the specification continues to evolve. It is related to a
suite of protocols which individually handle things like security, call
signalling, and determining the capabilities of each party.
Even though H.323 was developed before SIP, it seems to be losing ground
as a standard VoIP protocol. The main reason for this is the adoption of
SIP by the 3rd Generation Partnership Project (3GPP) – the organization
responsible for setting standards for 3rd generation mobile
communication devices. SIP is also much simpler than H.323.
Other VoIP Protocols
As explained in previous articles in this series, Internet data
transmissions are composed of several layers. The network layer consists
of the Internet Protocol (IP), which addresses packets and routes them
between two computers. The transport layer provides the rules for sending the data
-- for example, whether the data needs to arrive intact or whether data
can be missed. The application layer determines how the data will be
processed once it arrives at its destination.
Most data travelling over the Internet uses the Transmission Control
Protocol (TCP) for the transport layer because it guarantees data
delivery and integrity. VoIP does not need the kind of delivery
guarantee which TCP provides, so most VoIP transmissions use the
faster User Datagram Protocol (UDP) as the transport layer.
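The following sketch shows how little ceremony UDP involves: there is no connection to set up and no acknowledgement, just a datagram sent to an address. It sends one made-up "voice frame" between two sockets on the local machine. The payload and the use of the loopback address are assumptions chosen so the example can run on its own.

```python
import socket

# Receiver: bind to the loopback address and let the OS pick a free port.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(1.0)  # do not wait forever if the datagram is lost
address = receiver.getsockname()

# Sender: no handshake is needed, the datagram is simply sent.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"voice-frame-1", address)

# UDP itself never retransmits. If the datagram were dropped,
# recvfrom would raise a timeout error here.
data, _ = receiver.recvfrom(2048)
print(data)

sender.close()
receiver.close()
```

A TCP version of the same exchange would need a listening socket, a connection handshake and acknowledgements before any voice data could be delivered.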
Once VoIP data arrives at its destination, the application layer
interprets it and presents it to the user. The most commonly used
application-layer protocols for VoIP are SIP (see part 1) and RTP.
RTP
Real-time Transport Protocol (RTP) was originally designed for
delivering multimedia content over the Internet. It is often used for
streaming (delivering in real-time) audio and video content such as
music and movies.
RTP usually runs over UDP (User Datagram Protocol) as the transport layer. It
can be used in conjunction with both SIP and H.323 for delivering voice
data in a consistent manner. It provides services for
identifying the type of data, its sequence, and its timing, which lets the
receiver detect packets that were lost or arrived out of order.
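Those services live in a small fixed header at the front of every RTP packet. The sketch below packs and unpacks that 12-byte header with Python's `struct` module. The helper names and the sample values (sequence number, timestamp, source identifier) are made up for the example, and optional header fields are left out.

```python
import struct

# Fixed RTP header: two flag bytes, 16-bit sequence number,
# 32-bit timestamp, 32-bit source identifier, in network byte order.
RTP_HEADER = struct.Struct("!BBHII")


def pack_rtp_header(payload_type, sequence, timestamp, ssrc, marker=False):
    first_byte = 2 << 6  # version 2, no padding, no extension, no CSRC entries
    second_byte = (int(marker) << 7) | (payload_type & 0x7F)
    return RTP_HEADER.pack(
        first_byte,
        second_byte,
        sequence & 0xFFFF,
        timestamp & 0xFFFFFFFF,
        ssrc & 0xFFFFFFFF,
    )


def unpack_rtp_header(packet):
    first_byte, second_byte, sequence, timestamp, ssrc = RTP_HEADER.unpack(packet[:12])
    return {
        "version": first_byte >> 6,
        "marker": bool(second_byte >> 7),
        "payload_type": second_byte & 0x7F,  # identifies the type of data
        "sequence": sequence,                # lets the receiver spot gaps
        "timestamp": timestamp,              # used to play audio at the right time
        "ssrc": ssrc,                        # identifies the sending stream
    }


header = pack_rtp_header(payload_type=0, sequence=1, timestamp=160, ssrc=0x12345678)
fields = unpack_rtp_header(header + bytes(160))  # header plus 160 bytes of dummy audio
print(len(header), fields)
```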
QoS
Quality of Service (QoS) in VoIP refers to the likelihood that voice
data will be delivered quickly and up to a certain standard -- clear and
without gaps or distortion. QoS measures are used for VoIP, multimedia streaming, and
other applications which require a high degree of reliability.
Essentially, QoS is provided by ensuring that enough bandwidth has been
reserved for a particular application. There are two ways to do this --
providing more than enough bandwidth to meet all needs at all times, or
reserving bandwidth when it is needed. The second option is more practical
because there is no way to foresee exactly what network demands will be
at any given time.
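To get a feel for how much bandwidth a call might need, the sketch below works through the arithmetic for one direction of a single call. The figures are assumptions chosen for illustration: a 64 kbit/s voice codec, one packet every 20 ms, and 40 bytes of IP, UDP and RTP headers per packet. Link-layer overhead is ignored.

```python
def call_bandwidth_kbps(codec_kbps=64.0, packet_ms=20.0, header_bytes=40):
    """Estimate one-way bandwidth for a call, including packet headers."""
    packets_per_second = 1000.0 / packet_ms
    # Bytes of voice carried in each packet.
    payload_bytes = codec_kbps * 1000.0 / 8.0 / packets_per_second
    # Every packet also carries its headers, so add them before scaling up.
    bits_per_second = (payload_bytes + header_bytes) * 8.0 * packets_per_second
    return bits_per_second / 1000.0


per_call = call_bandwidth_kbps()
print(per_call)       # 80.0 kbit/s for one direction of one call
print(per_call * 10)  # 800.0 kbit/s for ten simultaneous calls
```

Under these assumptions the headers add a quarter on top of the codec's own rate, which is why reservations need to account for more than the raw voice data.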
VoIP can use RSVP (Resource ReSerVation Protocol) to
reserve bandwidth, although other solutions including VLAN (Virtual
Local Area Network) and VPN (Virtual Private Network) are being used by
some VoIP service providers.
RSVP
Resource ReSerVation Protocol (RSVP) is used to manage Quality of
Service (QoS). It requests a minimum bandwidth and a maximum latency
from every router between two endpoints. Those that comply will
reserve resources for the data stream.
The Internet has mechanisms in place to monitor the signal path between
any two points. When a reservation request is received, the routers
along the path examine the state of the signal paths and decide whether
they can accommodate it. Once the reservation is accepted, the routers
have to carry that data as specified. To do this they reserve the
resources necessary to guarantee bandwidth. After a reservation has been accepted, the
data path is monitored to make sure the data travels along the path as
expected. If it does not, the reservation will time out after a certain
period so that resources are not unnecessarily used up.
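The accept, reserve and time-out behaviour can be pictured with a toy model of a single router. This is not the RSVP wire protocol, only a sketch of the bookkeeping. The `ReservationTable` class, the capacity, the 30-second lifetime and the flow names are all hypothetical.

```python
class ReservationTable:
    """Toy model of one router's bandwidth reservations."""

    def __init__(self, capacity_kbps, lifetime_s):
        self.capacity_kbps = capacity_kbps
        self.lifetime_s = lifetime_s
        self.flows = {}  # flow name -> (reserved kbit/s, expiry time)

    def reserved(self, now):
        # Drop reservations that were not refreshed in time, then total the rest.
        self.flows = {
            flow: (kbps, expiry)
            for flow, (kbps, expiry) in self.flows.items()
            if expiry > now
        }
        return sum(kbps for kbps, _ in self.flows.values())

    def request(self, flow, kbps, now):
        """Admit or refresh a flow if capacity allows. Returns True if accepted."""
        in_use = self.reserved(now) - self.flows.get(flow, (0, 0))[0]
        if in_use + kbps > self.capacity_kbps:
            return False
        self.flows[flow] = (kbps, now + self.lifetime_s)
        return True


router = ReservationTable(capacity_kbps=200, lifetime_s=30)
results = [
    router.request("call-1", 80, now=0),   # accepted
    router.request("call-2", 80, now=1),   # accepted
    router.request("call-3", 80, now=2),   # rejected: only 40 kbit/s is free
    router.request("call-1", 80, now=20),  # refresh keeps call-1 alive
    router.request("call-3", 80, now=40),  # call-2 has timed out, so this fits
]
print(results)  # [True, True, False, True, True]
```

Letting unused reservations expire on their own means a router does not have to be told when a call ends abruptly.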
The Internet Protocol Suite as Applied to VoIP
All data travelling over the Internet is made up of packets that contain
a payload as well as extra information that determines where and how that
payload will be delivered. In VoIP the payload is the actual
voice data. Each packet also carries headers from several other 'layers' that aid
in the speedy delivery of the data, which allows real-time conversations
to take place over the Internet.
The Internet Protocol Suite (IPS) is described here as five layers which
encapsulate the actual payload. The layers contain information about how
the payload is to be delivered – for example, if all the data has to be
delivered or not – and how it will be treated on delivery. There are
three layers that make connections between computers and two physical
layers that data must pass through en route to its destination. These
physical layers are part of the computer.
The top layer of the IPS is the Application Layer. The VoIP
soft-phone controls the Application Layer; in VoIP a common
application-layer protocol is SIP (Session Initiation Protocol). It specifies the
type of connection the caller wants to make (voice, video or instant
messaging for example), and identifies the other party with a unique
address similar in form to an email address.
The Transport Layer in any Internet connection determines the format for
delivering data. Web pages usually use TCP (Transmission Control
Protocol) which guarantees data delivery – sometimes at the expense of
speed. VoIP depends more on speed than data integrity, so TCP is
not usually used. Instead RTP (Real-time Transport Protocol) is used in
conjunction with UDP (User Datagram Protocol) to control data
flow. RTP identifies the payload and provides sequencing information so
that the data can be reassembled correctly when it reaches its
destination. UDP provides a fast method of delivery but by itself cannot
determine data sequence or delivery information.
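A receiver can use those sequence numbers to put packets back in order and to notice which ones never arrived. The sketch below does this for a short, made-up burst of packets. The `reorder` helper is a hypothetical name, and for simplicity it ignores the fact that real RTP sequence numbers wrap around after 65535.

```python
def reorder(packets):
    """Sort (sequence, payload) pairs and list any missing sequence numbers."""
    ordered = sorted(packets, key=lambda packet: packet[0])
    present = {sequence for sequence, _ in ordered}
    first, last = ordered[0][0], ordered[-1][0]
    missing = [s for s in range(first, last + 1) if s not in present]
    return ordered, missing


# Packets as they might arrive over UDP: out of order, with one lost.
arrived = [(3, b"c"), (1, b"a"), (5, b"e"), (2, b"b")]
ordered, missing = reorder(arrived)
print([sequence for sequence, _ in ordered])  # [1, 2, 3, 5]
print(missing)                                # [4]
```

A VoIP client would typically play silence or repeat the previous audio for a missing packet rather than wait for it.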
Next is the Network Layer, which for the Internet (and other networks)
is the Internet Protocol (IP). IP addresses each packet and routes it toward its destination, but offers
no guarantees for data delivery or integrity. For dependable data
delivery the upper layers (transport and application) are needed.
The two physical layers of the Internet Protocol Suite are the Data Link
layer and the Physical layer. Ethernet is commonly used as the Data Link layer in
VoIP. It provides a means to move frames of data between devices on the
same local network. The physical layer provides the
pathway that carries the bits for the Data Link layer. In VoIP the
physical layer is often the twisted-pair cable that connects the network card,
routers, modems, Analog Telephone Adaptors (ATAs) and IP phones.
If we follow the data path of a voice packet, it originates in the sound
card which converts the voice into digital data. The audio stream is
compressed by the VoIP software and divided up into packets which
contain information about where the data is to be delivered. This data
is transmitted from the computer to the Internet through the
twisted-pair cable attached to the modem.
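The step of dividing the audio stream into packets can be sketched as follows. The example assumes uncompressed audio at 8000 samples per second with one byte per sample, so a 20 ms packet carries 160 bytes. Those figures, and the `packetize` helper, are assumptions for illustration. Real VoIP software would also compress the audio and add protocol headers.

```python
def packetize(audio: bytes, frame_bytes: int = 160):
    """Split an audio byte stream into numbered, fixed-size chunks."""
    return [
        (sequence, audio[start:start + frame_bytes])
        for sequence, start in enumerate(range(0, len(audio), frame_bytes))
    ]


one_second = bytes(8000)  # one second of silent dummy audio
packets = packetize(one_second)
print(len(packets), len(packets[0][1]))  # 50 160
```

Small packets keep delay low, since the sender only has to wait 20 ms to fill each one before it can be sent.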
The data packets may take several different routes to their destination
because of the ever-changing conditions of the Internet. On arrival, the
voice data has to be reassembled in the correct order and converted to
an analog signal which the receiving party can hear. All this should
happen in less than half a second no matter where in the world the two
parties are located.