跳轉到

Transaction

The Meaning of ACID

1. Atomicity

The ability to abort a transaction on error and have all writes from that transaction discarded.

2. Consistency

You have certain statements about your data (invariants) that must always be true. For example, in an accounting system, credits and debits across all accounts must always be balanced. It's the application's responsibility to define its transactions correctly so that they preserve consistency.

3. Isolation

A mechanism to prevent race condition.

Concurrently executing transactions are isolated from each other: they cannot step on each other's toes. The database ensures that when the transactions have committed, the result is the same as if they had run serially (one after another), even though in reality they may have run concurrently.

4. Durability

The promise that once a transaction has committed successfully, any data it has written will not be forgotten, even if there is a hardware fault or the database crashes. Perfect durability does not exist: if all your hard disks and all your backups are destroyed at the same time, there's obviously nothing your database can do to save you.

Read Phenomena of RDBMS

1. Dirty Read (Read Uncommitted)

It happens when a transaction reads data written by other concurrent transaction that has not been committed yet. This is terribly bad, because we don’t know if that other transaction will eventually be committed or rolled back. So we might end up using incorrect data in case rollback occurs.

img

2. Non-repeatable Read

When a transaction reads the same record twice and see different values, because the row has been modified by other transaction that was committed after the first read.

img img

3. Phantom Read

The same query is re-executed, but a different set of rows is returned, due to some changes made by other recently-committed transactions, such as inserting new rows or deleting existing rows which happen to satisfy the search condition of current transaction’s query.

img img

Strategies to Achieve Isolation

  1. SX Lock
  2. MVCC (Multi-version Concurrency Control)

img

Isolation Level

1. Read Committed

img img

2. Repeatable Read

img img img

3. Serializable

img img

實務建議

  1. 在 Read Committed 中使用 Conflict Promotion 來預防 Non Repeatable Read

  2. Conflict promotion 就是檢查時,在 select 指令尾部加入 for share,讓所有讀取過的 record 被加上 S lock,直到 TX 結束

  3. 這跟 Repeatable Read 效果相同,只是我們人手操作,讓 RDBMS 不會把非關鍵的資料也加上 S lock
  4. MVCC 的資料庫不一定有 for share,改用 for update 去拿 X lock 也有相同效果

  5. 在 Read Committed 中使用 Conflict Promotion + Conflict Materialization 來預防 Phantom Read

  6. Conflict Materialization:

  7. 如果兩個 table 存在 parent-child 關係(例子:flight 和 flight_misc_cost)
  8. 所有對 child record 的 Read 都要為其 parent record 加上 S lock
  9. 所有對 child record 的 Write 都要為其 parent record 加上 X lock
  10. 所以,對 child table 加入新 record 時,便會跟其他正在讀取該範圍(指同一 parent)的 TX 的 S lock 發生碰撞,讓其 insert 被阻擋
  11. 極少數的 Phantom Read 發生於沒有 parent table,或是 select predicate 不包含 parent id,這時便需要故意創造一個人工的 parent table 去防範 Phantom Read
  12. 大部份系統,只用上 Read Committed 中使用 Conflict promotion + Conflict materialization 便足夠了,不建議用更高階的 isolation

img

  1. 盡可能把 checking 和 data processing 放在同一句 statement 內

  2. 例子: update flight set vacancy = vacancy – 1 where vacancy > 0 and flight_no = @flight_no

  3. 在做超高流量的搶票系統時,別用 insert 做 conflict collision

  4. Bad idea:

  5. Insert into ticket......然後依靠 ticket 的 PK / UK 來預防相同座位被 2 人買到
  6. Good idea:
  7. Update seat set user_id = ?, sold = true where position = ? and user_id = ? and sold = false
  8. 在搶票時,跑 insert 時的 page split 會拖低系統 throughout,update 不會
  9. 沒賣出的票(sold = false)在售票期結束後刪掉便好

Reference