At commit time, a process creates a redo record (containing commit opcodes) and copies that redo record into the log buffer. Then, that process signals LGWR to write the contents of log buffer. LGWR writes from the log buffer to the log file and signals user process back completing a commit. A commit is considered successful after the LGWR write is successful. In a nutshell, after posting LGWR to write, user or background processes wait for LGWR to signal back with a 1-second timeout. The user process charges this wait time as a 'log file sync' event.
The Root Causes of 'log file sync' Waits
1.LGWR is unable to complete writes fast enough for one of the following reasons:
a.Disk I/O performance to log files is not good enough.
b.LGWR is starving for CPU resource.
c.Due to memory starvation issues, LGWR can be paged out.
d.LGWR is unable to complete writes fast enough due to file system or Unix buffer cache limitations.
2.LGWR is unable to post the processes fast enough, due to excessive commits.
3.IMU undo/redo threads. With private strands, a process can generate few Megabytes of redo before commiting. LGWR must write the generated redo so far, and processes must wait for 'log file sync' waits, even if the redo generated from other processes is small enough.
4.LGWR is suffering from other database contention such as enqueue waits or latch contention.
5.Various bugs.
Identify the Root Cause
1.First make sure the 'log file sync' event is indeed a major wait event. Compare 'log file sync' wait time with CPU time.
2.Identify and break down LGWR wait events, and query wait events for LGWR.
Find sid for LGWR process.
SQL> SELECT sid 
SQL> FROM v$session
SQL> WHERE type='BACKGROUND' and program like '%LGWR%';
SQL> SELECT event, time_waited, time_waited_micro
SQL> FROM v$session_event
SQL> WHERE sid=
SQL> ORDER BY time_waited;
 It is worth noting that v$session_wait is a cummulative counter from the instance startup, and hence, can be misleading.
 
 
No comments:
Post a Comment