ESP encrypts the data it protects, encapsulating the entire inner TCP/UDP datagram behind an ESP header. ESP is carried directly over IP (OSI Network Layer 3) as its own protocol, IP protocol 50, just as TCP and UDP are, but unlike TCP and UDP (OSI Transport Layer 4) it carries no port information. This differs from ISAKMP, which uses UDP port 500 as its transport.
Why can’t an ESP packet pass through a PAT device?
ESP cannot pass through a PAT device precisely because it is a protocol without ports. Because there is no port to change in the ESP packet, the binding database can’t assign a unique port to the packet when it changes the RFC 1918 source address to the publicly routable address. If the packet can’t be assigned a unique port, the database binding can’t be completed and there is no way to tell which inside host sourced the packet. As a result, return traffic cannot be translated back to the correct inside host.
How does NAT-T work with ISAKMP/IPsec?
NAT Traversal performs two tasks:
Detects if both ends support NAT-T
Detects NAT devices along the transmission path (NAT-Discovery)
The first task takes place in ISAKMP Main Mode messages one and two. If both devices support NAT-T, then NAT-Discovery is performed in ISAKMP Main Mode messages (packets) three and four. Each NAT-D payload is a hash of an original IP address and port. The devices exchange two NAT-D payloads, one computed over the source IP address and port and another over the destination IP address and port. The receiving device recalculates each hash from the addresses and ports it actually sees and compares it with the hash it received; if they don’t match, a NAT device exists in the path.
If a NAT device is detected, NAT-T changes the ISAKMP transport during Main Mode messages five and six, at which point all ISAKMP packets move from UDP port 500 to UDP port 4500. NAT-T carries the Quick Mode (IPsec Phase 2) exchange over UDP 4500 as well. After Quick Mode completes, data encrypted on the IPsec Security Association is also encapsulated inside UDP port 4500, which gives the PAT device a port to use for translation.
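The NAT-Discovery comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the hash is taken over the two IKE cookies followed by the IP address and port, SHA-1 stands in for whatever hash the peers negotiated, and the function names, cookie values, and addresses are hypothetical.

```python
import hashlib
import ipaddress
import struct


def nat_d_hash(cookie_i, cookie_r, ip, port):
    """Hash the IKE cookies, an IP address, and a port (network byte order)."""
    addr = ipaddress.ip_address(ip).packed
    data = cookie_i + cookie_r + addr + struct.pack("!H", port)
    return hashlib.sha1(data).digest()  # SHA-1 is an assumption for this sketch


def nat_detected(received_hash, cookie_i, cookie_r, observed_ip, observed_port):
    """Recompute the hash from what was seen on the wire and compare."""
    return nat_d_hash(cookie_i, cookie_r, observed_ip, observed_port) != received_hash


# Hypothetical cookies and addresses, for illustration only.
CKY_I, CKY_R = b"\x11" * 8, b"\x22" * 8
sent = nat_d_hash(CKY_I, CKY_R, "192.168.1.10", 500)  # hashed by the inside host

# No NAT in the path: the peer sees the same source address and port.
no_nat = nat_detected(sent, CKY_I, CKY_R, "192.168.1.10", 500)
# PAT in the path: the peer sees the translated address and port instead.
with_nat = nat_detected(sent, CKY_I, CKY_R, "203.0.113.5", 31042)
```

Because the sender hashes its own view of the address and the receiver hashes what arrived, any translation in between shows up as a mismatch without either side having to know the other's real address in advance.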
To visualize how this works and how the IP packet is encapsulated:
The clear text packet is encrypted and encapsulated inside an ESP packet.
The ESP packet is then encapsulated inside a UDP/4500 packet.
NAT-T encapsulates ESP packets inside UDP and sets both the source and destination ports to 4500. After this encapsulation there is enough information for the PAT database binding to be built successfully, so ESP packets can now be translated through a PAT device.
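As a rough illustration of that encapsulation, the sketch below prefixes an opaque ESP packet with an 8-byte UDP header whose source and destination ports are both 4500. The helper names and the sample ESP bytes are hypothetical, the checksum is simply left at zero, and a real implementation would hand this to the IP stack rather than build headers by hand.

```python
import struct

NAT_T_PORT = 4500


def udp_encapsulate(esp_packet, src_port=NAT_T_PORT, dst_port=NAT_T_PORT):
    """Prefix an ESP packet with a UDP header (ports, length, checksum)."""
    length = 8 + len(esp_packet)   # UDP header is 8 bytes
    checksum = 0                   # left as zero in this sketch
    header = struct.pack("!HHHH", src_port, dst_port, length, checksum)
    return header + esp_packet


def udp_ports(datagram):
    """Read back the source and destination ports a PAT device would see."""
    return struct.unpack("!HH", datagram[:4])


# Hypothetical ESP packet: 4-byte SPI, 4-byte sequence number, opaque ciphertext.
esp = struct.pack("!II", 0x1000ABCD, 1) + b"\x00" * 16
datagram = udp_encapsulate(esp)
ports = udp_ports(datagram)
```

The ESP packet itself is unchanged; the only thing added is a header that exposes a port pair to devices along the path.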
When a packet with a source and destination port of 4500 is sent through a PAT device (from inside to outside), the PAT device changes the source port from 4500 to a random high port while keeping the destination port of 4500. When a different NAT-T session passes through the PAT device, it changes the source port from 4500 to a different random high port, and so on. This way each local host has a unique entry in the PAT device's database, mapping its RFC 1918 IP address and port 4500 to the public IP address and a high port.
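A toy model of the PAT binding database makes this concrete. The class and its method names are hypothetical, and it hands out sequential high ports instead of random ones so the behaviour is easy to follow; a real PAT device also tracks protocol, timeouts, and the remote endpoint.

```python
import itertools


class PatDevice:
    """Minimal PAT table: inside (ip, port) <-> public port."""

    def __init__(self, public_ip, first_port=20000):
        self.public_ip = public_ip
        self._next_port = itertools.count(first_port)  # sequential, not random
        self.outbound = {}  # (inside_ip, inside_port) -> public_port
        self.inbound = {}   # public_port -> (inside_ip, inside_port)

    def translate_out(self, src_ip, src_port, dst_ip, dst_port):
        """Rewrite the source address and port; leave the destination alone."""
        key = (src_ip, src_port)
        if key not in self.outbound:
            public_port = next(self._next_port)
            self.outbound[key] = public_port
            self.inbound[public_port] = key
        return (self.public_ip, self.outbound[key], dst_ip, dst_port)

    def translate_in(self, public_port):
        """Find the inside host for return traffic, or None if no binding."""
        return self.inbound.get(public_port)


pat = PatDevice("203.0.113.5")
# Two inside hosts, each sending NAT-T traffic from port 4500 to the same peer.
a = pat.translate_out("192.168.1.10", 4500, "198.51.100.1", 4500)
b = pat.translate_out("192.168.1.11", 4500, "198.51.100.1", 4500)
```

Both sessions leave with the same public address and destination port 4500, but with different source ports, so return traffic to each public port can be mapped back to exactly one inside host.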
What is the difference between NAT-T and IPsec over UDP?
Although the two mechanisms work similarly, there are two main differences.
When NAT-T is enabled, it encapsulates the ESP packet in UDP only when it detects a NAT device; otherwise, no UDP encapsulation is done. IPsec over UDP, by contrast, always encapsulates the packet in UDP.
NAT-T always uses the standard port, UDP 4500, and this is not configurable. IPsec over UDP normally uses UDP 10000, but this can be any other port, depending on the configuration of the VPN server.
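The two differences can be summarised as a small decision function. This is only a sketch of the behaviour described above; the function name, the mode strings, and the return format are hypothetical and do not correspond to any vendor's configuration syntax.

```python
def encapsulation(mode, nat_detected, udp_port=10000):
    """Return (wrapper, port) for protected data under each mechanism."""
    if mode == "nat-t":
        # UDP wrapping only when a NAT device was detected; port is fixed.
        return ("udp", 4500) if nat_detected else ("esp", None)
    if mode == "ipsec-over-udp":
        # Always wrapped in UDP; port comes from the VPN server configuration.
        return ("udp", udp_port)
    raise ValueError("unknown mode: " + mode)


nat_t_clear_path = encapsulation("nat-t", nat_detected=False)
nat_t_behind_pat = encapsulation("nat-t", nat_detected=True)
over_udp_default = encapsulation("ipsec-over-udp", nat_detected=False)
over_udp_custom = encapsulation("ipsec-over-udp", nat_detected=False, udp_port=12000)
```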