What we've learnt so far - implicit list Limitations of implicit list? * Hint: what's the most expensive operation? How expensive? * the problem is that the list contains both allocated and free blocks Idea: How about making a list of just free blocks? // hdr_t is larger now, but the extra fields only exists for free blocks typedef struct { size_t size; char allocated; char *next; char *prev; } hdr_t; #define HSZ sizeof(hdr_t) #define AHSZ (sizeof(hdr_t)/2) void *heap_start; void *heap_end; static hdr_t *flist; void init() { hdr_t h; assert((&h.next-&h)== AHSZ); h->size = (heap_end - heap_start); //assuming size is sufficiently large h->allocated = 0; list_insert(&flist, h); } void malloc(size_t size) { header_t *h= flist; while (h && h->size < size + AHSZ ) { h = h->next; } if (!h) { h = ... ask for more from OS list_insert(&flist, h); } return allocate(h, size); } void * allocate(hdr_t *h, size_t size) { list_remove(&flist, h); size_t remaining = h->size - size - AHSZ; if (remaining > HSZ) { h->size = size + AHSZ; h->allocated = 1; header_t *fchunk = (header_t *)((void *)h + h->size); fchunk->size = remaining; fchunk->allocated = 0; list_insert(&flist, fchunk); } else { h->allocated = 1; } return (void *)h + AHSZ; } void free(void *p) { header_t *h = (header_t *)((void *)p - AHSZ); h->allocated = 0; coalesce_with_right(h); } void coalesce_with_right(header_t *h) { header_t *next = (header_t *)((void *)h + h->size); while ((void *)next < heap_end && !next->allocated) { list_remove(&flist, next); h->size += next->size; next = (header_t *)((void *)next + next->size); } list_insert(&flist, h); return n; } Question: how do we colaesce with both previous and next block? * Add a footer Draw picture and explain how do I get the pointer to the previous block Other limitations of our simple implementation: * we did first-fit to find the block - other choices include next-fit (start where the previous search left off) and best-fit (find the block with the fewest bytes left) * we did not obey alignment rules * we did not pack header information to save space What's the slowest event? - searching for the free block Our third design addresses this limitation Segregated list * Different size classes, each size class has its own free list * Usually have separate small size classes and large size classes - large classes have power of two sizes (not always) An example segregated list design: TCMalloc >100 small size classes ( < 4KB) 256 large size classes, the largest one holds blocks >= 1MB in size malloc(n): if n < 4KB, - look in the appropriate freelist of a small size class, do not split the block. - if not found, ask OS for more if n >= 4KB - look in the appropriate freelist, if empty, go to the next larger-sized one - if still not found, ask OS for more - allocate, and put the remaining block in the appropriate freelist free(p): - figure out the size class of pointer p - if is a small block, insert into freelist of appropriate freelist - if is a large block, then try to coalesce TCMalloc has a fancier way of obtaining size of pointer p? * our implicit/explicit implementation encodes size in header attached to each block - wasteful for small blocks (many consecutive blocks have the same size), alignment is a pain * TCMalloc keep it in a separate data structure - array indexed by page number. * for small chunks, all chunks in a page belong to the same size class * for large chunks, one page belongs to one chunk - this gets too large on 64-bit, use a 3-level "page-table-like" structure. Common memory bugs: * Using Malloc-ed stuff without initialization * Overwriting due to allocation with wrong-size int *p = (int *)malloc(100); for (int i = 0; i < 100; i++) { p[i] = 0; } * Double free int *p = (int *)malloc(100*sizeof(int)) ... free(p) .... <-- potentially, int *q = (int *)malloc(10*sizeof(int)) free(p) * writing freed block free(p) ... <-- another malloc potentially happens here p[2] = 10 * memory leak - e.g. only free the head node of a linked list malloc/free bugs can be exploited! Example: void foo() { p = malloc(1024); q = malloc(1024); strcpy(p, ...); //buffer overflow free(p); } p +----------------+ | size | +----------------+ | allocated | +----------------+ | next ptr | +----------------+ | prev ptr | +----------------+ | user data | +----------------+ q +----------------+ | size | +----------------+ | allocated | +----------------+ | next ptr | +----------------+ | prev ptr | +----------------+ | user data | +----------------+ attacker overwrites hdr_q->allocated to 0 overwrites hdr_q->next = , hdr_q->prev = prev->next contains address of foo's return address> in our coalece code, we invoke list_remove(&flist, q); q->prev->next = q->next; attacker can perform an evil write to arbitary memory location stack canary won't be able to defend against this type of attack