What we've learnt so far: explicit free list draw a large block (header, next pointer, footer) draw an allocation event draw another allocation event draw a free event Compared to implicit list, it is faster in allocation Downside? * What's the search policy? best fit? (slow) first fit ? (fragmentation) Our third design addresses this limitation Segregated list * Different size classes (several hundred size classes) * Each size class has its own free list * Usually have separate small size classes and large size classes - large classes have power of two sizes (not always) An example segregated list design: TCMalloc >100 small size classes ( < 4KB) 256 large size classes, the largest one holds blocks >= 256 * 4KB in size malloc(n): if n < 4KB, - look in the appropriate freelist of a small size class, do not split the block. - if not found, ask OS for more if n >= 4KB - look in the appropriate freelist, if empty, go to the next larger-sized one - if still not found, ask OS for more - allocate, and put the remaining block in the appropriate freelist free(p): - figure out the size class of pointer p - if is a small block, insert into freelist of appropriate freelist - if is a large block, then try to coalesce How to figure out size class of pointer p and size/status of its adjacent blocks (for coalescing)? * one can keep it in block's header/footer (as we've discussed in the last lecture) - wasteful for small blocks (many consecutive blocks have the same size), alignment is a pain * TCMalloc keep it in a separate data structure - array indexed by page number. each element of the array contains size/status info - this gets too large on 64-bit, use a 3-level "page-table-like" structure. Modern languages support garbage collection: Java, Go, Python etc. automatic reclaimation of space (app never has to free) Requirement: * Can learn of all "pointers" in allocated block * Cannot hide "pointers" (example hiding, C cast pointer to int) How does GC know which block is free? View memory as a graph of root nodes and heap nodes - each node represents a block - a node X points to Y if X contains a pointer to Y A node is reachable if there's a path from some root node to it Non-reachable nodes are garbage Simple GC algorithm: Mark-n-sweep Starting from root nodes, mark every reachable node as "marked". Scan every node, free the non-marked nodes Common memory bugs: * Using Malloc-ed stuff without initialization * Overwriting due to allocation with wrong-size int *p = (int *)malloc(100) for (int i = 0; i < 100; i++) { p[i] = 0; } * Double free int *p = (int *)malloc(100*sizeof(int)) ... free(p) .... <-- potentially, int *q = (int *)malloc(10*sizeof(int)) free(p) * writing freed block free(p) ... <-- another malloc potentially happens here p[2] = 10 * memory leak - e.g. only free the head node of a linked list