CDN with TCP anycast lite

This is an idea for how to construct a better CDN. I don’t know if anyone has thought of this before. It’s a boring technical discussion, so it goes under the fold.

A CDN is supposed to route client requests to the “nearest” node in the network. There are generally two common ways to identify the nearest server:

1. When the client makes the initial DNS request, have it be served from a DNS server that uses IP anycast. This means that there are many different BGP announcements for the DNS server’s IP address, generally one for each CDN node. The DNS server reached by the client then returns the unicast IP address for the CDN node right next to it. This method returns the CDN node nearest to the client’s DNS server, as determined by BGP.
2. Give everyone the same IP address, which points to a webserver that looks up the user in a GeoIP database. The client is directed to the geographically nearest CDN node with an HTTP redirect.
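To make the GeoIP method concrete, here is a minimal Python sketch of the node selection step. It assumes the GeoIP lookup has already produced a latitude and longitude for the client. The node table, the example.net hostnames, and the function names are hypothetical and only there for illustration.

```python
import math

# Hypothetical node table: hostname -> (latitude, longitude) in degrees.
NODES = {
    "nyc.cdn.example.net": (40.71, -74.01),
    "lon.cdn.example.net": (51.51, -0.13),
    "sin.cdn.example.net": (1.35, 103.82),
}

EARTH_RADIUS_KM = 6371.0


def great_circle_km(a, b):
    """Haversine distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def nearest_node(client_location, nodes=NODES):
    """Return the hostname of the geographically closest node."""
    return min(nodes, key=lambda host: great_circle_km(client_location, nodes[host]))


def redirect_location(client_location, path):
    """Build the Location header value for the HTTP redirect."""
    return f"http://{nearest_node(client_location)}{path}"
```

Note that this picks the node by geographic distance only, so it can choose a node that is far away in network terms.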

The HTTP redirect method works poorly when the GeoIP database has missing or inaccurate information about the client’s IP address, or when geographical distance corresponds poorly to network distance. Anycast DNS routing usually works well, but it performs badly when the client is far from its recursive DNS resolver. This can happen if a geographically dispersed ISP has a central DNS cache, or when users use OpenDNS or Google Public DNS. For example, there have been stories of large Apple update downloads being slow because many Google Public DNS users reach the same Akamai node.

TCP anycast is a third, much less common, way to run a CDN. As far as I know only CacheFly and MaxCDN use it. With anycast TCP, the geographically dispersed content servers are accessed through a single IP address within an anycast announcement. When you make a request via HTTP or RTMP, BGP will make sure your request reaches the nearest CDN node, and content servers in that node will serve your content back.

The problem with TCP anycast is that TCP is a stateful protocol and the servers will get confused if, through the vagaries of BGP, subsequent packets within the same TCP connection reach a different CDN node. The presentation I linked to above claims that TCP anycast works very well, even for long-lived connections. However, it is still not widely adopted. CacheFly probably employs some very talented network engineers. It just seems easy to get wrong.
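The failure mode can be illustrated with a toy model. This is a sketch and not a real TCP stack. Each node only remembers connections whose SYN it saw, and it answers a segment for an unknown connection with a reset, which is roughly how a real stack reacts. The Node class, the send_all helper, and the addresses are all made up for the example.

```python
class Node:
    """One CDN node; it only knows connections whose SYN it received."""

    def __init__(self, name):
        self.name = name
        self.connections = set()

    def receive(self, conn_id, flag):
        if flag == "SYN":
            self.connections.add(conn_id)
            return "SYN-ACK"
        if conn_id in self.connections:
            return "ACK"
        # No state for this connection: the node resets it.
        return "RST"


def send_all(route_per_packet, flags, conn_id):
    """Deliver each packet to whichever node BGP routed it to."""
    return [node.receive(conn_id, flag) for node, flag in zip(route_per_packet, flags)]


# Hypothetical 4-tuple: client IP, client port, anycast IP, server port.
CONN = ("198.51.100.7", 51000, "203.0.113.1", 80)
FLAGS = ["SYN", "ACK", "DATA"]
```

With a stable route every packet reaches the same node and the connection works. If the route changes mid-connection, the third packet lands on a node that has never heard of the connection:

```python
nyc, bos = Node("nyc"), Node("bos")
send_all([nyc, nyc, nyc], FLAGS, CONN)   # all packets accepted
nyc2, bos2 = Node("nyc"), Node("bos")
send_all([nyc2, nyc2, bos2], FLAGS, CONN)  # last packet is reset
```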

My proposal is a simpler variant of TCP anycast. It involves an HTTP redirect, which some may find unacceptable, but which could work for a lot of CDN users:

1. All users get the same IP address, in an anycast announcement.
2. That IP address points to a webserver in the nearest CDN node. It immediately redirects at the HTTP level to a different hostname/IP address that is node-specific. For example, if I request http://cdn.guan.dk/, I will reach the anycast web server at the New York node, which will redirect me to http://nyc.cdn.guan.dk/, which points to an IP address in a unicast announcement.
3. The content server at nyc.cdn.guan.dk serves up content without interruption.
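The anycast web server in the steps above can be very small. Here is a rough sketch using Python’s standard http.server module, as it might run on the New York node. The hostname comes from the example above; the choice of a 302 status, the port, and the names node_url, RedirectHandler and make_server are my assumptions, not a description of any real deployment.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Node-specific hostname; each node would be configured with its own.
NODE_HOSTNAME = "nyc.cdn.guan.dk"


def node_url(path, node_hostname=NODE_HOSTNAME):
    """Map a request path on the anycast address to the unicast node URL."""
    return f"http://{node_hostname}{path}"


class RedirectHandler(BaseHTTPRequestHandler):
    """Answers every GET or HEAD with a redirect to this node's hostname."""

    def do_GET(self):
        self.send_response(302)
        self.send_header("Location", node_url(self.path))
        self.send_header("Content-Length", "0")
        # Close right away so the anycast connection stays short-lived.
        self.send_header("Connection", "close")
        self.end_headers()

    do_HEAD = do_GET


def make_server(port=8080):
    """Create (but do not start) the redirect server on the given port."""
    return HTTPServer(("", port), RedirectHandler)
```

Calling make_server(8080).serve_forever() starts it. The only per-node difference is NODE_HOSTNAME, so the same code can be deployed everywhere behind the anycast address.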

This setup uses TCP anycast to determine the client’s location and which CDN node to use, but it does not rely on routes staying stable over a long-lived TCP session. Routes only need to be stable for the time it takes to redirect the user, hopefully less than 100 ms and about 8 to 10 packets.
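As a rough sanity check on those numbers, the sketch below enumerates the packets in one short redirect connection. The exact count depends on how the two TCP stacks piggyback ACKs and FINs, so the three flags are simplifying assumptions and the function names are hypothetical. The timing estimate assumes two round trips (handshake, then request and response) and ignores DNS and server processing time.

```python
def redirect_exchange(delayed_ack=True, fin_with_response=True, fin_ack_combined=False):
    """List the (sender, segment) pairs for one redirect connection."""
    packets = [
        ("client", "SYN"),
        ("server", "SYN-ACK"),
        ("client", "ACK"),
        ("client", "GET request"),
    ]
    if not delayed_ack:
        # Server ACKs the request separately instead of with the response.
        packets.append(("server", "ACK of request"))
    if fin_with_response:
        packets.append(("server", "302 response + FIN"))
    else:
        packets += [
            ("server", "302 response"),
            ("client", "ACK of response"),
            ("server", "FIN"),
        ]
    if fin_ack_combined:
        packets.append(("client", "ACK + FIN"))
    else:
        packets += [("client", "ACK"), ("client", "FIN")]
    packets.append(("server", "final ACK"))
    return packets


def time_to_redirect_ms(rtt_ms):
    """Two round trips until the client holds the redirect response."""
    return 2 * rtt_ms
```

Under the default flags this gives 8 packets, and with no piggybacking at all it gives 11, so the estimate of about 8 to 10 looks plausible. Staying under 100 ms requires a round-trip time below 50 ms to the nearest node.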

Guan’s blog
