Internet epidemic outbreaks and defenses
(These notes are partly based on Stefan Savage's slides.)

The "species" populating the current Internet is vulnerable:
* poor genetic diversity; heavily inbred
* poor hygiene practices
* instantaneous disease transmission
* slow immune response (10-1000 times slower than the infection spreads)

Worse yet, these diseases are "designed":
* trivial to create
* highly profitable

Modern threats:
* target all systems at once
* exploit software vulnerabilities and naive users

Epidemic enablers:
* unrestricted connectivity (IP everywhere)
* software homogeneity and user naivete
  - one bug = millions of vulnerable hosts
  - a few percent of naive users = millions of vulnerable hosts
* effective anonymity (little risk to the attacker)
* defenses??

Economic incentives behind epidemics:
* spam, credit card theft, DDoS extortion
* there is even a marketplace for compromised hosts!

Worms exploit software vulnerabilities.
Let's examine Russ Cox's vulnerable server and its exploit:
- What is the root cause of the vulnerability?
  A buffer overflow.
- How does the exploit abuse this vulnerability?
  It hijacks control flow by smashing the return address on the stack,
  then executes arbitrary code placed on the stack.
- What does the exploit achieve?
- Does this exploit have a recognizable signature? If so, what is it?
- Can you write different exploits for the same root cause? What about their signatures?
- Would writing all your servers in Java solve this problem?

Note: this is just one example exploit. More sophisticated ones exist:
- they do not have to smash the return address (e.g. they can clobber function pointers)
- they do not have to execute code on the stack (return-to-libc attacks)

In addition to gaining control of hosts, worms replicate themselves and propagate:
- Code Red, Nimda, Slammer, etc.
- just like an infectious epidemic

How fast?
- Let a(t) be the fraction of vulnerable hosts infected at time t. Then
    da/dt = K*a*(1-a)
  whose solution is the logistic (S-shaped) curve
    a(t) = e^(K*(t-T)) / (1 + e^(K*(t-T)))
  which approaches 1 extremely fast.

What determines the time scale (i.e. K)?
- How frequently are infections attempted?
  * link speed is the limit
- How likely is an infection attempt to reach a new victim?
  * hit-list scanning (quick bootstrap; the list can be precomputed)
  * permutation scanning (avoids duplicate attempts)
  * topological scanning

Bottom line: Staniford et al. estimate 30 seconds to 15 minutes.
Reality: Slammer (2003) infected 90% of vulnerable hosts in under 10 minutes (~100K infected).

What can be done?
- Prevention: reduce the population of vulnerable hosts.
- Containment: reduce the infection rate while the number of infected hosts is still small.

Prevention
- Better software quality to eliminate vulnerabilities, via testing, checking, code review, etc.
  Challenges: soundness, completeness, scalability, usability.
- Heterogeneous software?
- Better updates (most worms exploit known vulnerabilities):
  * automated patching
  * automatic patch creation?
  * network filtering as a quick temporary fix,
    e.g. drop TCP packets to port 33333 that are larger than 100 bytes
- Hygiene enforcement:
  only let hosts online if they are "well cared for"

Containment: reactive defense
* Which detection strategy? Online vs. offline.
  Offline detection methods:
  - Network telescopes: monitor a large range of unused IP addresses.
  - Honeynets: execute suspected traffic on "infectable" hosts in a controlled environment.
* Which blocking strategy?
  - Address filtering: block malicious source IPs.
    Easy to implement, but must react to EACH infected host.
  - Content filtering: block traffic based on a content signature.
    Only needs to react once per exploit.
* Who deploys the defense?
  - Hosts
  - Network: enterprise network, ISP

Goal of EarlyBird
- Network deployment (enterprise or small-ISP level)
- Content filtering based on signatures
- Online, automatic signature generation from live traffic (for a small reaction time)

EarlyBird intuition:
- Worm contents are invariant and prevalent.
- Worm contents target a specific service.
- Worm contents come from and go to many distinct hosts.

Pseudocode of a naive implementation:

  ProcessPkt(payload, srcIP, dstIP, dstPort) {
      p = hash(payload, dstPort);
      counter[p]++;
      src[p] = src[p] ∪ {srcIP};
      dst[p] = dst[p] ∪ {dstIP};
      if (counter[p] > countThresh &&
          size(src[p]) > hostThresh &&
          size(dst[p]) > hostThresh)
          alarm("worm detected; the signature is the payload");
  }

Limitations of the naive implementation?
* What if the invariant signature is shorter than the full packet?
  Index all substrings of a fixed length beta = 40.
  A Rabin fingerprint allows efficient incremental hash computation:
  one multiplication, one addition, and one mask per byte.
  Hence all S - beta + 1 substrings of length beta in an S-byte payload
  can be fingerprinted in O(S) time instead of O(S * beta).
  (A rolling-hash sketch follows this list.)
* Memory consumption is too high:
  250,000 pkts/sec (1 Gbps link, 500-byte packets);
  even a 1-byte counter per packet adds 250 KB of state every second.
* CPU consumption is too high:
  at that rate there are only 4 microseconds to process each packet.
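To make the incremental fingerprinting concrete, here is a minimal Python sketch of a rolling hash over all beta-byte substrings of a payload. It uses a Rabin-Karp style polynomial hash modulo a prime for readability; EarlyBird's actual Rabin fingerprints work over GF(2) (the "multiply, add, mask" per byte mentioned above), but the O(1)-per-byte rolling update is the same idea. BETA = 40 comes from the notes; BASE and MOD are illustrative choices, not values from the paper.

  BETA = 40               # substring length indexed by the detector
  BASE = 257              # polynomial base (assumption)
  MOD = (1 << 61) - 1     # large prime modulus (assumption)

  def substring_fingerprints(payload: bytes):
      """Yield (offset, fingerprint) for every BETA-byte substring.

      Total work is O(len(payload)): the first window costs O(BETA),
      and each later window is an O(1) update (drop the byte leaving
      the window, shift, add the byte entering it).
      """
      if len(payload) < BETA:
          return
      top = pow(BASE, BETA - 1, MOD)      # weight of the oldest byte
      f = 0
      for byte in payload[:BETA]:         # fingerprint of the first window
          f = (f * BASE + byte) % MOD
      yield 0, f
      for i in range(1, len(payload) - BETA + 1):
          old, new = payload[i - 1], payload[i + BETA - 1]
          f = ((f - old * top) * BASE + new) % MOD   # O(1) incremental update
          yield i, f

  if __name__ == "__main__":
      pkt = bytes(range(256)) * 4
      fps = dict(substring_fingerprints(pkt))
      # Sanity check: the rolling value at offset 100 matches a from-scratch hash.
      direct = 0
      for byte in pkt[100:100 + BETA]:
          direct = (direct * BASE + byte) % MOD
      assert fps[100] == direct
      print(len(fps), "substrings fingerprinted in one pass")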
Reducing the required state
* Break processing into two steps:
  1. estimate prevalence
  2. estimate address dispersion, but only for prevalent contents
- Only keep track of address dispersion for prevalent contents.
- Measure prevalent contents over a small interval.
  ==> result: very few contents are prevalent (about 0.6%)

Estimating prevalence with small state: multi-stage filtering
- Multiple hash functions index multiple counter arrays (e.g. 4).
- Update only the smallest of the 4 counters (conservative update).
- Why multiple hash functions?
(A small sketch of a multi-stage filter appears at the end of these notes.)

Estimating address dispersion: scalable bitmap counters
Basic idea: hash each address to a bit in a b-bit bitmap;
estimate the number of distinct addresses as b * ln(b / z),
where z is the number of bits still zero.
How to count arbitrarily large numbers with a small bitmap?
- Automatically rescale the bitmap to cover only a portion of the hash space
  once too many bits are set.
- Keep a sliding window of bitmaps to smooth the transition between scales.
(A sketch of the basic bitmap estimator also appears at the end of these notes.)

Reducing CPU
* Sampling: only process a fraction of the substrings.
  Random sampling? Sampling at certain offsets?
  Value sampling: sample based on the hash of the substring
  (e.g. process a substring only if the leading 6 bits of its fingerprint are zero).

How well does EarlyBird perform?
- No known false negatives.
- False positives:
  * common protocol headers (HTTP, SMTP, P2P protocols)
  * non-worm epidemic-like activity (spam, BitTorrent)

Limitations of EarlyBird?
- Variant (polymorphic) content
- Slow/stealthy worms
- Network evasion (IP fragmentation, TCP segmentation, etc.)
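The multi-stage filter mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the stage count (4) follows the example in the notes, while the array size, the prevalence threshold, and the salted-hash scheme are assumptions made for the sketch. The comments also record one answer to the "why multiple hash functions?" question: each stage only over-counts, so content is reported only when every stage agrees, and a single hash collision cannot trigger a false alarm.

  import hashlib

  NUM_STAGES = 4            # counter arrays, as in the example above
  STAGE_SIZE = 1 << 16      # counters per array (assumption)
  PREVALENCE_THRESHOLD = 3  # repetitions before content counts as prevalent (assumption)

  counters = [[0] * STAGE_SIZE for _ in range(NUM_STAGES)]

  def _stage_hash(stage: int, fingerprint: int, dst_port: int) -> int:
      # Independent-ish hash per stage via a stage-specific salt (illustrative).
      data = f"{stage}:{fingerprint}:{dst_port}".encode()
      digest = hashlib.blake2b(data, digest_size=8).digest()
      return int.from_bytes(digest, "big") % STAGE_SIZE

  def record(fingerprint: int, dst_port: int) -> bool:
      """Count one occurrence of (content fingerprint, destination port).

      Conservative update: only the smallest matching counter(s) are
      incremented.  The minimum over all stages never underestimates the
      true count, so content is declared prevalent only when every stage
      agrees; this is why several hash functions are used.
      """
      idxs = [_stage_hash(s, fingerprint, dst_port) for s in range(NUM_STAGES)]
      vals = [counters[s][idxs[s]] for s in range(NUM_STAGES)]
      m = min(vals)
      for s in range(NUM_STAGES):
          if vals[s] == m:                      # conservative update
              counters[s][idxs[s]] = m + 1
      return m + 1 > PREVALENCE_THRESHOLD       # prevalent?

  if __name__ == "__main__":
      # The same fingerprint seen enough times becomes prevalent.
      for _ in range(5):
          prevalent = record(fingerprint=0xDEADBEEF, dst_port=1434)
      print("prevalent:", prevalent)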
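The address-dispersion estimate can be sketched compactly as well. The code below implements only the basic bitmap estimator described above (hash each address to one of b bits; estimate distinct addresses as b * ln(b / z), where z is the number of zero bits). The bitmap size is an illustrative assumption, and the rescaling plus sliding-window-of-bitmaps extension that lets EarlyBird count arbitrarily large address sets with the same small state is only noted in a comment, not implemented.

  import hashlib
  import math

  class BitmapCounter:
      """Basic bitmap (direct bitmap) estimator of distinct addresses."""

      def __init__(self, bits: int = 64):      # bitmap size is an assumption
          self.b = bits
          self.bitmap = 0

      def add(self, addr: str) -> None:
          # Hash the address to one of b bit positions and set that bit.
          h = hashlib.blake2b(addr.encode(), digest_size=8).digest()
          bit = int.from_bytes(h, "big") % self.b
          self.bitmap |= 1 << bit

      def estimate(self) -> float:
          zeros = self.b - bin(self.bitmap).count("1")
          if zeros == 0:
              # Bitmap saturated: the scalable scheme would rescale to cover
              # a smaller portion of the hash space at this point.
              return float("inf")
          return self.b * math.log(self.b / zeros)

  if __name__ == "__main__":
      # Example: estimate the number of distinct source addresses with 64 bits of state.
      bm = BitmapCounter()
      for i in range(30):
          bm.add(f"10.0.0.{i}")
      print(round(bm.estimate()), "distinct addresses (approx.)")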