When you’re looking at V$ACTIVE_SESSION_HISTORY, there are two columns that are capable of confusing the heck out of you if you’re not careful because they are named so similarly.  I decided to delve into precisely what these columns mean.

TIME_WAITED

TIME_WAITED is the most commonly used one, and seems to offer the most valuable information.

What happens for V$ACTIVE_SESSION_HISTORY is, by default each of the “Active” sessions are snapshotted every second.  Note that “Active” doesn’t necessarily have anything to do with the V$SESSION.STATE column.  Oracle considers a session to be active if the user call to the RDBMS kernel falls into any of the following categories:

  • Parse/Execute/Fetch Operations
  • Waiting for I/O to complete
  • Waiting on either a message or a buffer from a remote instance in a RAC cluster
  • On CPU
  • Not waiting for recursive session
  • If we’re dealing with a parallel slave, not waiting on the PX_IDLE event
  • Any other wait that does not fall in the Idle wait class

(above list summarized from Oracle Wait Interface: A Practical Guide to Performance Diagnostics & Tuning, page 253)

As mentioned before, a record is inserted into the ash buffer every second for each active session, including information such as…”is this session on CPU?” or “is this session on a wait event?” (waiting to consume CPU).  If the session is on a wait event, Oracle inserts a record into V$ACTIVE_SESSION_HISTORY (specifically, into the underlying X$ASH table) with a value of “0” for the TIME_WAITED column. Oracle will continue to insert records into V$ACTIVE_SESSION_HISTORY as long as the process remains on that same wait event.  When the wait event ends, Oracle goes back and updates the last ASH sample’s TIME_WAITED column that it inserted to reflect the full amount of time waited for by the wait event.

As an example, suppose we’re updating a table, and we run into some row-lock contention that lasts 5.5 seconds.

The first second, we would see a record inserted like this:

SAMPLE_TIME EVENT TIME_WAITED
2017-05-22 11:05:01.114000 enq: TX – row lock contention 0

Second second (is there an echo in here?) would look like this:

SAMPLE_TIME EVENT TIME_WAITED
2017-05-22 11:05:01.114000 enq: TX – row lock contention 0
2017-05-22 11:05:02.114000 enq: TX – row lock contention 0

Third second:

SAMPLE_TIME EVENT TIME_WAITED
2017-05-22 11:05:01.114000 enq: TX – row lock contention 0
2017-05-22 11:05:02.114000 enq: TX – row lock contention 0
2017-05-22 11:05:03.114000 enq: TX – row lock contention 0

Fourth second:

SAMPLE_TIME EVENT TIME_WAITED
2017-05-22 11:05:01.114000 enq: TX – row lock contention 0
2017-05-22 11:05:02.114000 enq: TX – row lock contention 0
2017-05-22 11:05:03.114000 enq: TX – row lock contention 0
2017-05-22 11:05:04.114000 enq: TX – row lock contention 0

Fifth second:

SAMPLE_TIME EVENT TIME_WAITED
2017-05-22 11:05:01.114000 enq: TX – row lock contention 0
2017-05-22 11:05:02.114000 enq: TX – row lock contention 0
2017-05-22 11:05:03.114000 enq: TX – row lock contention 0
2017-05-22 11:05:04.114000 enq: TX – row lock contention 0
2017-05-22 11:05:05.114000 enq: TX – row lock contention 0

Now, at 5.5 seconds (11:05:05.5) the lock releases, and Oracle remembers that it took 5.5 seconds on the wait event.  But it’s not yet time to take a sample yet.  So on the sixth second, the insert would be performed again, and the previous TIME_WAITED value would be updated.

SAMPLE_TIME EVENT TIME_WAITED
2017-05-22 11:05:01.114000 enq: TX – row lock contention 0
2017-05-22 11:05:02.114000 enq: TX – row lock contention 0
2017-05-22 11:05:03.114000 enq: TX – row lock contention 0
2017-05-22 11:05:04.114000 enq: TX – row lock contention 0
2017-05-22 11:05:05.114000 enq: TX – row lock contention 5548899
2017-05-22 11:05:06.114000 (null) 0

(Time waited is in microseconds…or millionths of a second).

Note that because this is handled in this way, you could use the SUM() functionality with the TIME_WAITED to get a rough estimate of how long a query spent on the various wait times.  Also, note that this allows for time to go missing…for example, if there was a quarter-second wait event after the enq:TX – row lock contention event, that wait event would not show up in ASH, and it’s time would not be accounted for.

But that’s basically what TIME_WAITED does.  Craig Shallahamer has wonderful content on his blog here about TIME_WAITED: http://blog.orapub.com/20150827/what-is-the-oracle-ash-time-waited-column.html

Plus I borrowed some content from a presentation called “Sifting Through the ASHes” by Graham Wood: http://www.oracle.com/technetwork/database/manageability/ppt-active-session-history-129612.pdf

 

WAIT_TIME

This column took some serious digging around.  What I’ve basically learned is this:

  • This column maps to V$SESSION.WAIT_TIME (which, apparently this column is deprecated in favor of the new column WAIT_EVENT_MICRO and the STATE columns)
  • If this column is 0, that means we’re waiting on a wait event of sorts
  • If the column is > 0, the value is the duration of the last wait (documentation says in hundredths of a second, it looks like it’s actually in microseconds/millionths of a second)
  • If the column is -1, the last wait was less than 1/100th of a second
  • If the column is -2, you likely have TIMED_STATISTICS set to false
  • If a process is on straight CPU for multiple seconds, the value of this column may not be updated…it doesn’t necessarily get reset between samples (which makes sense if you think about it)

So what does this really mean?  Well…if WAIT_TIME is 0, you’re waiting.  If it’s not 0, you’re on CPU (which, you can already get this information from V$ACTIVE_SESSION_HISTORY.SESSION_STATE).  So that seems pretty useless.  What else could you use this for?

To me, it makes sense that you can “prove” that no other wait events occured (at least, that lasted longer than a microsecond) between two samples if the WAIT_TIME for one row is equal to the TIME_WAITED of the previous sample and the value is greater than the interval between sample times (which is always 1 second, or 1 million microseconds).

For example, here are a group of samples:

SAMPLE_TIME SQL_ID EVENT WAIT_TIME SESSION_STATE TIME_WAITED
2017-05-22-11:11:10.1400 furptayc5j1hv (null) 101556 ON CPU 0
2017-05-22-11:11:11.1400 furptayc5j1hv log file switch completion 0 WAITING 0
2017-05-22-11:11:12.1400 furptayc5j1hv log file switch completion 0 WAITING 1511720
2017-05-22-11:11:13.1400 furptayc5j1hv log file switch (checkpoint incomplete) 0 WAITING 0
2017-05-22-11:11:14.1400 furptayc5j1hv log file switch (checkpoint incomplete) 0 WAITING 2409806
2017-05-22-11:11:15.1400 furptayc5j1hv (null) 2409806 ON CPU 0

If you look at the last two rows, you’ll notice that WAIT_TIME of the very last row is the same as the TIME_WAITED on the second to last row.  If WAIT_TIME is the amount of time waited on the last wait event, and the WAIT_TIME value is bigger than 1 million (since 1 million microseconds is the snapshot interval), then to me it seems safe to say that log file switch was the last wait event that occurred before the session went back on CPU, and that there were no wait events between the log file switch (checkpoin incomplete) wait event at 11:11:14.14 and being on CPU at 11:15:14.

Is that all that WAIT_TIME tells us?

Eh. Pretty much.  Most of the documentation seems to iterate over and over again that the “main” piece of information you can acquire from WAIT_TIME is whether or not a session is “ON CPU” (depending whether it’s 0 or > 0).  Which, you can get that information from V$ACTIVE_SESSION_HISTORY.SESSION_STATE.  Most of the time, when you’re dealing with ASH samples, though, you’re using rough estimations, and you don’t need to prove precision as far as “all the wait events that occurred between two samples.”

So in the grand scheme of things from a pragmatic perspective, this column seems particularly useless, although it is interesting to know what the column’s values represent.