IP and Overlay Multicast jinyang@cs.nyu.edu Why multicast? Many hosts want to receive the same stream of data packets. Internet radio/TV/conference Stock quote information Multiplayer network Games Multicast model: * communication oriented multiple receivers want to get the same packet. Granularity is at packet-level. In contrast, contention distribution is file/data oriented. Multiple receivers want to download the same file. Downloaders buffer the entire file for others to upload. Granularity is at file-level. Whether multicast or content distribution largely depends on how promptly application needs to receive data. * one source, multiple receivers vs. multiple source, multiple receivers What kind of applications need support of multiple sources? distributed gaming/conferencing (each node needs to notify all members of results) Naive approach has the sender send one copy of data to each of the n receivers. * sender's outgoing links will be congested. It needs nX b/w to push out those packets. 128Kbps mp3 streams x 10,000 listeners ==> 1.28Gbps * wasteful of overall network resource. n * d copies of data packets in total, where d is the average path length. * (good) Only the sender needs to keep (a reasonable amount) of state to keep track of live members. IP Multicast: * routers keep track of membership, construct multicast trees for packet replication and forwarding. * efficient (low latency, no duplicate packets on links) Overlay multicast: * hosts keep track of membership, construct multicast trees for packet replication and forwarding. * No changes to Internet infrastructure (routing, forwarding). Easy to deploy. * Sacrifice performance - more packets are duplicated (higher link stress) - higher latency An example A C \ / R1-R2 / \ B D Paper I: Multicast routing in datagram networks and extended LANs Addressing: Groups are addressed by IP addresses with prefix 224.0.0.0/4 (1110 0000 ...) Service interface: Hosts join multicast groups. All packets with destination IP address for a group are delivered to all members. Goal: Construct source specific multicast trees (In contrast to shared trees, source-specific trees are more efficient at routing, but requires much more router state) Ethernet is a broadcast medium; multicast on a single Ethernet segment requires only filtering out unwanted packets How to multicast over the Internet? * Augmenting distance vector - Basic DV: each router keeps and exchanges tuples for each destination. - How to do broadcast without loops? > Shortest paths from source S to all destinations form a tree! (let's call it shortest path broadcast tree) > RPF: A router duplicates/forwards a multicast packet if it arrives via the shorted path back to S Why does it work? Only forward packets belong to the shortest path broadcast tree. Requires symmetric path (why? An example of RPF fails when paths are not symmetric) Problem : Both X and Y forward multicast packets onto segment a S . . . . . . X Y | | ----------------- a | Z | ------- b > RPB: Identify a single parent router for each shared link and source S; the router with min. cost to S. Additional state: each router needs to have one bitmap identifying child links for each routing entry. - If the group is sparse, how to prune unnecessary broadcast packets to achieve multicast? > Each router figures out what multicast groups are being subscribed to Hosts on LAN periodically send IGMP packets with groups they belong to > Prune broadcast tree on demand (only for active sources and groups) > A router sends non-membership-report for (S, G) if 1. all its child links are leaves and non have members of the group G 2. it has received non-membership-reports from all its child routers Additional state: each router stores non-membership-reports for all active (S,G) pairs * Augmenting link state - Basic LS: recall each router floods network with changes in link state information such as up/down status. - Routers monitor subscribed groups on each LAN; changes in subscribed groups trigger link state flooding. - Routers use Dijkstra's algorithm to find shortest-path multicast tree from any source to any group Expensive calculation; why? one tree costs O(e) time ==> potentially O(e * n * g) to calculate one tree for each source, group pair. solution: calculate on demand for active source,group pairs at a router. Paper II: Overcast - End hosts, instead of routers, form the multicast trees. End hosts collaboratively keep information about the multicast group they are in. What is Overcast used for? - A data (high-res video recording, or possibly live stream) distribution system - Intended for businesses to distribute data from a single source to (hundreds?) geographically distributed offices. - Not designed for interactive applications and less suited for live streaming. Setup - A business designates (deploys) a number of PCs at each of its offices running Overcast software. > Overcast nodes tend to be stable w/ reasonably high bandwidth - Source "pushes out" contents along multicast trees to all nodes. - Clients access contents/streams using Web browsers with traditional HTTP protocol. - Groups are named using URLs: hostname in the URL identifies root of multicast tree (fits the single source distribution model) path in the URL identifies the particular group using the multicast tree. Goal: construct a source specific multicast tree that maximizes each node's bandwidth from the source. How to construct the distribution tree? - Overcast places a new node as far away from the root as possible without sacrificing b/w to the root. With this strategy, nodes tend to choose parents that are "closeby" in the underlying network topology. - Upon join, a node recursively relocates itself under one of its current sibling if b/w to that sibling is at least as high as to the current parent. (Nodes explicitly measure b/w by transferring 10K data) - A node periodically probes its parent, grandparent and siblings to re-evaluate its positions in the tree. - Nodes should rarely fail (because they are well-managed, dedicated servers), if they do A child node relocates under grandparent if his parent fails. (A node knows the entire overlay path to the root, hence it knows all its grandparents) How to redirect clients to one node in the tree? - root needs to keep track of membership in the multicast tree (Up/down protocol) Each Overcast node keeps track of all its descendants > Each node periodically checks in with its parent > A failure to check in causes the parent to generate a death certificate for that child > Each check-in reports changes in the membership (new "death certificates" and "birth certificates" for a node's descendants). > Each certificates has an associated sequence number to distinguish between new and old information. - The root is in charge of re-directing all clients' HTTP requests --> single point of failure > replicate the root > root replicas form a linear chain so all replicas know all members of the tree. Discussions: Why not using bittorrent for disseminating video files to all nodes? Why not using Overcast to replace bittorrent for file sharing? Paper III: Splitstream Motivation: * Internal nodes in the (binary) multicast tree send 2X bit rate; leaves send 0X bit rate. Not a problem for infrastructure-based system like Overcast, but less desirable if hosts are ordinary users' desktops because - not all peer nodes contribute upload bandwidth - internal nodes failures cause loss of streaming packets Splitstream: form a "forest" of multicast trees, splitting data E.g. 9 trees, each send 1 bit per byte, plus one tree for parity ==> any 8 bits suffice to reconstruct the byte Nodes join all 9 trees such that they are leaves in some trees, internal in others Failure of one internal node causes loss of only one stripe. Goal: reduce total upstream utilization for each node to 1x bit rate Basic multicast using a Pastry DHT (SCRIBE) * Addressing: A group is identified by a random DHT key k_g (e.g. SHA-1(URL)) * Tree construction: - All nodes in the group form a Pastry ring - Union of all DHT routes from each group member to the root node responsible for k_g forms the multicast tree - Use reverse path forwarding to forward packets along this multicast tree. How to ensure internal nodes in one tree are leaves in other trees? Property of one Pastry tree: Examples to review Pastry routing and show properties of Pastry multicast tree ... All interior nodes share the same first digit as the root If all 16 roots have different first digit, then each node is internal node for at most one multicast tree Other issues; * Each multicast tree is not necessarily balanced (some internal nodes have more children than others) * Different nodes have different upload b/w. Solution: * When a node reaches its upload capacity, it rejects the new join request or punt an existing child. * All nodes that have less children than what its upload b/w allows join a spare capacity group tree. It can be proven that the probability of join failure is small if there's some spare capacity in the system. Discussions: Average delay for receiving 8 stripes is the worse average delay incurred in all 8 trees. Incentives for peer nodes to contribute proper amount of upload b/w? (Nodes must contribute at least as much upload b/w as it downloads for the whole system to work) Conclusions: IP or Overlay multicast, which is better? Depends on the situation: * IP multicast (not widely deployed today) - scalability concerns routers need to keep state for every active multicast group! Why is it scary? > multicast group addresses are not easily aggregatable. > multicast sessions are dynamic than link-level toplogy. - providing congestion control, reliability, flow control and security on top of IP multicast is difficult - it requires changing the infrastructure! Do ISPs have incentives to do so? + efficient + more reliable * Overlay multicast - need to deploy additional infrastructure (Overcast style) - or use less reliable end hosts - less efficient; higher delay, link stress + more generally available: requires no change to existing IP infrastructure + easy to implemenet features like flow control, security, layered multicast (fast hosts get higher-resolution of videos)