Secure enterprise SIP communication
Enterprise communication systems are typically deployed within private networks, with Session Border Controllers (SBCs) or voice gateways installed at the network edge to facilitate external communication. Therefore, in most scenarios, enterprise communications remain highly secure. However, a growing number of businesses are now deploying SIP servers in the cloud, while an increasing volume of SIP terminals within enterprises are accessing these corporate SIP servers from external networks. This shift has exposed part (or all) of enterprise communication systems to public networks, making security concerns increasingly severe.
The security of enterprise SIP communication involves many aspects of the network system, such as firewalls. Focusing solely on the SIP communication itself, it must be encrypted to prevent the exposure of communication information to other network users. Encrypted SIP communication consists of two parts: (1) SIP message (signaling) encryption, and (2) voice stream (RTP) encryption, as illustrated in the figure below:

Certainly, enterprises can deploy VPNs to encrypt the entire network system — not just communication systems but also office systems and more. Encrypted SIP communication can also be established over a VPN. However, setting up an enterprise VPN involves relatively high costs and complex systems. This article focuses solely on encrypted SIP communication and does not cover other network security technologies such as VPNs.
SIP message encryption is achieved through “SIP over TLS.” Both cloud-based miniSIPServer, on-premises miniSIPServer, and miniSIPPhone support SIP over TLSv1.2 / TLSv1.3. Please refer to the online documentation for details, as this article will not elaborate further on this topic.
Voice streams are encrypted through SRTP transmission. The master key and master salt for SRTP are transmitted and negotiated via the SDP (RFC4568) in SIP messages. Therefore, only when SIP messages are encrypted can the critical information of SRTP be ensured not to be leaked. Simply encrypting voice streams with SRTP while transmitting SIP messages in plaintext cannot guarantee the overall security of SIP communication.
RFC4568 defines several cryptographic suites. Currently, we have chosen to support the default AES_CM_128_HMAC_SHA1_80 and do not yet support other encryption suites.
The SRTP protocol family includes numerous extensions. Currently, miniSIPServer and miniSIPPhone support the most fundamental RFC3711 protocol, which is also the basic SRTP protocol supported by the vast majority of SIP devices (including servers, PBXs, SBCs, and endpoints). DTLS-SRTP is not currently supported, primarily due to the following considerations: (1) SIP over TLS already ensures the security of the master key & salt, achieving an effect equivalent to that of DTLS; (2) although internet technologies like WebRTC widely adopt DTLS-SRTP, most SIP devices do not support it, which would lead to interoperability issues in the realm of enterprise SIP communication.
miniSIPServer and miniSIPPhone can enable SRTP by default without requiring additional configuration. Some SIP devices need explicit configuration to select SRTP. For example, when configuring an account in MicroSIP, the “Media Encryption” setting must be configured as follows:
