Concurrency lecture II

Multi-threaded programs:
- achieve speedup (utilize multiple CPU cores; overlap CPU processing with I/O)

Challenges:
- the executions of different threads interleave in arbitrary ways (many interleavings lead to incorrect results)
- we need to synchronize among threads to control which interleavings are allowed.

Synchronization primitives:

* Mutual exclusion (locking)

pthread_mutex_init
pthread_mutex_lock
pthread_mutex_unlock

Locks protect a critical section (only one thread is allowed in a CS at a time):

    precondition: invariant holds
    pthread_mutex_lock
        violate invariants
        restore invariants
    pthread_mutex_unlock
    postcondition: invariant holds

In practice, you (mentally) associate a lock with a (set of) shared variables.

* Lock granularity

typedef struct {
    char *name;
    int val;
} account;

account *accounts[100];

void transfer(int x, int y, int amount) {
    accounts[x]->val -= amount;
    accounts[y]->val += amount;
}

If there is no synchronization, what happens?

T1: read account x = 100
T2: read account x = 100
T1: write account x = 90
T2: write account x = 90
T1: read account y = 100
T1: increment account y = 110
T2: read account y = 110
T2: increment account y = 120

Lack of proper synchronization has resulted in many disasters. Recently, Flexcoin and Poloniex, two bitcoin exchanges, were shut down because of a synchronization bug. In their own words:

    The attacker successfully exploited a flaw in the code which allows transfers between flexcoin users. By sending thousands of simultaneous requests, the attacker was able to "move" coins from one user account to another until the sending account was overdrawn, before balances were updated.

The easy solution:

pthread_mutex_t m;   // initialized elsewhere with pthread_mutex_init(&m, NULL)

void transfer(int x, int y, int amount) {
    pthread_mutex_lock(&m);
    accounts[x]->val -= amount;
    accounts[y]->val += amount;
    pthread_mutex_unlock(&m);
}

The lock is associated with the entire array of accounts ---> coarse-grained, no concurrency.

Fine-grained locking:

typedef struct {
    char *name;
    int val;
    pthread_mutex_t m;
} account;

void transfer(int x, int y, int amount) {
    pthread_mutex_lock(&accounts[x]->m);
    pthread_mutex_lock(&accounts[y]->m);
    accounts[x]->val -= amount;
    accounts[y]->val += amount;
    pthread_mutex_unlock(&accounts[x]->m);
    pthread_mutex_unlock(&accounts[y]->m);
}

Problem: deadlock

T1: transfer(X, Y, 10)
T2: transfer(Y, X, 20)

T1: acquired X's lock
T2: acquired Y's lock
T1: blocked waiting for Y's lock to be released
T2: blocked waiting for X's lock to be released

Neither can make progress!

Solution: ensure locks are always acquired in the same global order, e.g. in order of account number:

void transfer(int x, int y, int amount) {
    if (x < y) {
        pthread_mutex_lock(&accounts[x]->m);
        pthread_mutex_lock(&accounts[y]->m);
    } else {
        pthread_mutex_lock(&accounts[y]->m);
        pthread_mutex_lock(&accounts[x]->m);
    }
    accounts[x]->val -= amount;
    accounts[y]->val += amount;
    pthread_mutex_unlock(&accounts[x]->m);
    pthread_mutex_unlock(&accounts[y]->m);
}

In general, deadlocks are much easier to debug than races: the program simply halts at the moment the deadlock happens.
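To see the ordering rule at work, here is a minimal driver (a sketch, not part of the lecture: the worker functions, iteration counts, and main are assumptions added for illustration). Two threads transfer in opposite directions at the same time; with the ordered-locking transfer above the program always finishes and the total balance is conserved, while the unordered version can deadlock.

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    char *name;
    int val;
    pthread_mutex_t m;
} account;

account *accounts[100];

/* ordered-locking transfer from above */
void transfer(int x, int y, int amount) {
    if (x < y) {
        pthread_mutex_lock(&accounts[x]->m);
        pthread_mutex_lock(&accounts[y]->m);
    } else {
        pthread_mutex_lock(&accounts[y]->m);
        pthread_mutex_lock(&accounts[x]->m);
    }
    accounts[x]->val -= amount;
    accounts[y]->val += amount;
    pthread_mutex_unlock(&accounts[x]->m);
    pthread_mutex_unlock(&accounts[y]->m);
}

void *worker_xy(void *arg) {
    (void)arg;
    for (int i = 0; i < 100000; i++)
        transfer(0, 1, 1);      // many X -> Y transfers
    return NULL;
}

void *worker_yx(void *arg) {
    (void)arg;
    for (int i = 0; i < 100000; i++)
        transfer(1, 0, 1);      // many Y -> X transfers, concurrently
    return NULL;
}

int main(void) {
    for (int i = 0; i < 2; i++) {
        accounts[i] = malloc(sizeof(account));
        accounts[i]->name = NULL;
        accounts[i]->val = 100;
        pthread_mutex_init(&accounts[i]->m, NULL);
    }
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker_xy, NULL);
    pthread_create(&t2, NULL, worker_yx, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    /* total balance is conserved: still 200 */
    printf("x = %d, y = %d, total = %d\n",
           accounts[0]->val, accounts[1]->val,
           accounts[0]->val + accounts[1]->val);
    return 0;
}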
* Scheduling shared resources

Locking is a simple kind of resource scheduling -- one thread at a time. What about more complicated scheduling policies?
- we need a mechanism to block a thread until some condition is true.

Condition variables:

pthread_cond_init(pthread_cond_t *c, ...);

pthread_cond_wait(pthread_cond_t *c, pthread_mutex_t *m);
- atomically unlocks m and sleeps until the condition variable is signalled
- when this function returns, m is locked again.

pthread_cond_signal(pthread_cond_t *c);
- wakes up one waiting thread

pthread_cond_broadcast(pthread_cond_t *c);
- wakes up all waiting threads

Example: a channel that can buffer one value.

typedef struct {
    int *val;   // can buffer one value; NULL means empty
} channel;

channel c;

void chan_send(int *v) {
    if (c.val == NULL) {
        c.val = v;
    } else {
        // wait
    }
}

int *chan_recv() {
    if (c.val) {
        int *v = c.val;
        c.val = NULL;
        return v;
    } else {
        // wait
    }
}

It is inefficient to implement the waiting by polling (e.g. sleep(1) in a loop). Instead, add pthread_cond_wait and pthread_cond_signal, with a mutex m and condition variable cv protecting the channel:

pthread_mutex_t m;
pthread_cond_t cv;

void send(int *v) {
    pthread_mutex_lock(&m);
    while (c.val != NULL)               // slot full: wait for a receiver
        pthread_cond_wait(&cv, &m);
    c.val = v;
    pthread_cond_signal(&cv);           // wake a waiting receiver
    pthread_mutex_unlock(&m);
}

int *recv() {
    pthread_mutex_lock(&m);
    while (c.val == NULL)               // slot empty: wait for a sender
        pthread_cond_wait(&cv, &m);
    int *v = c.val;
    c.val = NULL;
    pthread_cond_signal(&cv);           // wake a waiting sender
    pthread_mutex_unlock(&m);
    return v;
}

Notes on the above code. The general pattern is:

T1:
    lock(&m);
    while (condition != true)
        cond_wait(&c, &m);
    ... do stuff ...
    unlock(&m);

T2:
    lock(&m);
    condition = true;
    cond_signal(&c);
    unlock(&m);

First, always use cond_wait in this pattern:

    while (condition is not true)
        pthread_cond_wait(&c, &m);

**Always** use "while" instead of "if":
- T1 waits
- T2 signals (but T1 does not run immediately)
- T3 acquires the mutex and makes the condition false again (this does not happen in the channel example, but can happen in more sophisticated programs)
- T1 starts running, even though the condition is now false

Another advantage of using "while": using cond_broadcast instead of cond_signal would also be correct here.

Second, why does cond_wait take a mutex argument and atomically unlock it? Suppose the code were:

    while (condition != true) {
        mutex_unlock(&m);
        cond_wait(&c);      // hypothetical cond_wait that does not take the mutex
        mutex_lock(&m);
    }

Problem:

T1: mutex_unlock()
T2: makes the condition true, cond_signal()   (the signal is dropped because no one is waiting yet)
T1: cond_wait()   (sleeps forever)

The signal is lost!

Third, the code doing cond_signal also needs to hold the lock. Suppose the code were:

    condition = true;   // making the condition true must be protected by the lock
    cond_signal(&c);

T1: lock
T1: while (condition != true)   // sees the condition is false, is about to wait
T2: condition = true
T2: cond_signal    // signal is dropped; T1 is not waiting yet
T1: waits

The signal is lost!
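Putting the three rules together, here is a complete one-slot channel assembled from the fragments above (a sketch: chan_init, the placement of the mutex and condition variable inside the struct, and the producer/consumer driver are additions for illustration, not from the lecture). A single condition variable is shared by the sender and receiver, which is fine here because there is exactly one of each; with many senders and receivers, pthread_cond_broadcast or two separate condition variables would be safer.

#include <pthread.h>
#include <stdio.h>

typedef struct {
    int *val;               // can buffer one value; NULL means empty
    pthread_mutex_t m;
    pthread_cond_t cv;
} channel;

void chan_init(channel *ch) {
    ch->val = NULL;
    pthread_mutex_init(&ch->m, NULL);
    pthread_cond_init(&ch->cv, NULL);
}

void chan_send(channel *ch, int *v) {
    pthread_mutex_lock(&ch->m);
    while (ch->val != NULL)             // slot full: wait for a receiver
        pthread_cond_wait(&ch->cv, &ch->m);
    ch->val = v;
    pthread_cond_signal(&ch->cv);       // wake a waiting receiver (signal under the lock)
    pthread_mutex_unlock(&ch->m);
}

int *chan_recv(channel *ch) {
    pthread_mutex_lock(&ch->m);
    while (ch->val == NULL)             // slot empty: wait for a sender
        pthread_cond_wait(&ch->cv, &ch->m);
    int *v = ch->val;
    ch->val = NULL;
    pthread_cond_signal(&ch->cv);       // wake a waiting sender
    pthread_mutex_unlock(&ch->m);
    return v;
}

static channel ch;

void *producer(void *arg) {
    (void)arg;
    static int data[5] = {1, 2, 3, 4, 5};
    for (int i = 0; i < 5; i++)
        chan_send(&ch, &data[i]);
    return NULL;
}

int main(void) {
    chan_init(&ch);
    pthread_t p;
    pthread_create(&p, NULL, producer, NULL);
    for (int i = 0; i < 5; i++)
        printf("received %d\n", *chan_recv(&ch));
    pthread_join(p, NULL);
    return 0;
}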