ASH Visualizations: R, ggplot2, Gephi, Jit, HighCharts, Excel ,SVG
DB Optimizer 22 Jan 2012, 9:23 am CET
There is more and more happening in the world of visualization and visualizing Oracle performance specifically with v$active_session_history.
Of these visualizations, the one pushing the envelope the most is Marcin Przepiorowski. Marcin is responsible for writing S-ASH , ie Simulated ASH versions 2.1,2.2 and 2.3. See
- http://oracleprof.blogspot.com/2011/11/new-release-of-s-ash-v23.html
- https://github.com/pioro/orasash
Here are some examples of what I have seen happening out there in the web with these visualizations grouped by the visualization tool.
Gephi
The first example is using Gephi. The coolest example of Gephi I’ve
seen is Greg Rahn’s
analysis of the voting for Oracle World mix sessions.

Here is Marcin’s example using Gephi with ASH data:
JIT or Javascript InfoVis Toolkit
TOP 5 SQL_ID, SESSIONS and PROGRAMS joined together with additional joins to points not included in top 5. ex. TOP 5 SQL ID with list of sessions and programs TOP 5 SESSIONS with list of sql_id’s and programs TOP 5 PROGAMS with list of sessions and sql id’s
R
Frits Hoogland gives a great blog entry on getting started with R and then using R to analyze 10046 tracefiles (sort of ASH on steroids)
Here from Greg Rahn again is one of the cooler examples of R. In this case it’s the ASH data of a parallel query execution showing the activity of the different processes:
HighCharts
Here is an example basically reproducing the Top Activity screen with highcharts, a sqlscript on v$session and a cgi script to feed the data to the web page:
SVG
OEM 12c shows the new face of Enterprise Manager with the load maps
APEX / EMlite
(quick apex example: http://ba6.us/node/132)
Excel
Karl Arao has been doing a good bit of ASH visualization using his own Excel work as well as Tanel’s Excel Perf Sheet ( see a video here)
http://karlarao.wordpress.com/2009/06/07/diagnosing-and-resolving-gc-block-lost/
Buffer Busy Waits and Disk file operations I/O
DB Optimizer 19 Jan 2012, 1:25 am CET
Database is getting high waits on buffer busy waits. Here is an example period where 5 sessions are completely blocked on buffer busy waits for 4 minutes:
MAXST MINST COUNT(*) SESSION_ID P1 P2 SQL_ID BSID ----- ----- ---------- ---------- ---------- ---------- ------------- ------- 09:54 09:58 26 1526 1812 278732 3gbsbw6w8jdb3 11 09:54 09:58 25 528 1812 278732 3gbsbw6w8jdb3 11 09:54 09:58 25 1514 1812 278732 3gbsbw6w8jdb3 11 09:54 09:58 26 777 1812 278732 3gbsbw6w8jdb3 11 09:54 09:58 25 33 1812 278732 3gbsbw6w8jdb3 11
All are waiting on the same file and block held by one session and all are executing the same statement which is a select for update
The blocker is executing the same SQL statement and spends those 4 minutes waiting for “Disk file operations I/O”:
MINST MAXST EVENT COUNT(*) SQL_ID BSID ----- ----- ------------------------------ ---------- ------------- ------- 09:54 09:58 Disk file operations I/O 26 3gbsbw6w8jdb3
What are “Disk file operations I/O”?
From the docs , http://docs.oracle.com/cd/E18283_01/server.112/e17110/waitevents003.htm#insertedID40
Disk file operations I/O
This event is used to wait for disk file operations (for example, open, close, seek, and resize). It is also used for miscellaneous I/O operations such as block dumps and password file accesses.
| FileOperation | Type of file operation |
| fileno | File identification number |
| filetype | Type of file (for example, log file, data file, and so on) |
select p1,p3, count(*) fromdba_hist_active_sess_historywhere event ='Disk file operations I/O'group by p1,p3/ P1
P3 COUNT(*)---------- ---------- ---------- 2
1 193 2
3 14 4
4 1 2
2 4459 3
4 160 1
18 103select distinct filetype_id, filetype_name from DBA_HIST_IOSTAT_FILETYPE order by 1; FILETYPE_ID FILETYPE_NAME ----------- ------------------------------ 0 Other 1 Control File 2 Data File 3 Log File 4 Archive Log 6 Temp File 9 Data File Backup 10 Data File Incremental Backup 11 Archive Log Backup 12 Data File Copy 17 Flashback Log 18 Data Pump Dump File
What about FileOperation=2? Again, not sure but guesses on Oralce-L point to file open.
If these guess are correct then the main issue is file opens on datafiles.
For the datafiles what are the I/O latencies looking like compared to the Disk file operations I/O ?
BHOU EVENT_NAME AVG_MS CT ---- ------------------------------ --------- ------------ 1054 Disk file operations I/O 2.00 13,547 1130 Disk file operations I/O 1.52 10,658 1200 Disk file operations I/O 1.57 9,846 1230 Disk file operations I/O 2.45 8,704 1300 Disk file operations I/O 3.84 9,526 1330 Disk file operations I/O 2.39 11,989 1400 Disk file operations I/O 1.68 14,698 1430 Disk file operations I/O 2.89 14,863 1500 Disk file operations I/O 860.85 10,577 1530 Disk file operations I/O 12.97 11,783 1600 Disk file operations I/O 623.88 10,902 1630 Disk file operations I/O 357.75 12,428 1700 Disk file operations I/O 294.84 10,543 1730 Disk file operations I/O 12.97 10,623 1800 Disk file operations I/O 461.91 14,443 1830 Disk file operations I/O 12.83 18,504 1900 Disk file operations I/O 443.37 9,563 1930 Disk file operations I/O 237.39 11,737 2000 Disk file operations I/O 542.44 13,027 2033 Disk file operations I/O 6.11 8,389 2100 Disk file operations I/O 16.85 10,561 2130 Disk file operations I/O 306.17 9,873 2200 Disk file operations I/O 20.83 11,335 2230 Disk file operations I/O 12.92 10,158 2300 Disk file operations I/O 13.42 11,025 2330 Disk file operations I/O 15.01 10,883 0000 Disk file operations I/O 5.33 8,533 1054 db file scattered read 1.50 92,394 1130 db file scattered read 1.33 73,243 1200 db file scattered read 1.82 122,988 1230 db file scattered read 2.53 255,474 1300 db file scattered read 4.26 288,144 1330 db file scattered read 2.47 308,045 1400 db file scattered read 2.60 91,684 1430 db file scattered read 3.56 176,324 1500 db file scattered read 4.95 621,658 1530 db file scattered read 5.11 227,565 1600 db file scattered read 5.86 472,804 1630 db file scattered read 9.44 224,984 1700 db file scattered read 9.40 165,238 1730 db file scattered read 7.78 349,003 1800 db file scattered read 6.93 252,761 1830 db file scattered read 7.79 151,760 1900 db file scattered read 5.48 165,369 1930 db file scattered read 3.09 200,868 2000 db file scattered read 3.45 136,647 2033 db file scattered read 5.17 136,330 2100 db file scattered read 11.16 103,799 2130 db file scattered read 10.44 118,025 2200 db file scattered read 20.02 127,638 2230 db file scattered read 13.66 157,210 2300 db file scattered read 10.95 98,493 2330 db file scattered read 8.39 149,606 0000 db file scattered read 4.16 230,075 1054 db file sequential read 3.04 1,152,102 1130 db file sequential read 7.75 165,262 1200 db file sequential read 6.74 23,876 1230 db file sequential read 5.30 10,026 1300 db file sequential read 3.34 496,681 1330 db file sequential read 1.58 1,253,208 1400 db file sequential read 8.86 239,247 1430 db file sequential read 12.91 191,376 1500 db file sequential read 19.97 73,061 1530 db file sequential read 17.80 43,662 1600 db file sequential read 12.41 144,741 1630 db file sequential read 8.99 411,254 1700 db file sequential read 8.03 540,138 1730 db file sequential read 9.26 422,317 1800 db file sequential read 19.16 155,787 1830 db file sequential read 6.01 641,517 1900 db file sequential read 4.79 573,674 1930 db file sequential read 2.72 824,991 2000 db file sequential read 1.59 504,650 2033 db file sequential read 1.88 324,741 2100 db file sequential read 24.32 74,026 2130 db file sequential read 16.05 67,545 2200 db file sequential read 15.52 219,928 2230 db file sequential read 9.87 259,956 2300 db file sequential read 15.18 122,362 2330 db file sequential read 9.97 94,124 0000 db file sequential read 14.19 50,264
Tough to see what is going on
Let’s lay it out in a bubble chart. Latency in ms is on the Y-Axis,
time on the X-Axis (AWR snapshots every half hour) and size of the
ball is the amount of requests:

The Disk file operations I/O doesn’t seem to correlate with disk activity nor latency. When disk activity is high, on the left, with much data coming off spindle (ie 8ms average) Disk file operations I/O is fast, ie 2ms. But at 20:00, there is medium I/O activity but much of it is coming from cache, ie 2ms average and the Disk file operations are slow.
The database is not using ASM. The number of datafiles is 900. Open file descriptors is set at 8192.
Might be time to run some DTrace on the file operations to get a better idea of what is going on.
Looking at AWR repositories for performance issues
DB Optimizer 17 Jan 2012, 2:54 am CET
Def v_secs=3600 -- bucket size
Def v_days=1 -- total time analyze
Def v_bars=5 -- size of one AAS in characters
Def v_graph=80
col aveact format 999.99
col graph format a80
col fpct format 9.99
col spct format 9.99
col tpct format 9.99
col aas format 9.99
col pct1 format 999
col pct2 format 999
col first format a15
col second format a15
Def p_value=4
select to_char(start_time,'DD HH24:MI'),
--samples,
--total,
--waits,
--cpu,
(total/&v_secs) aas,
--round(fpct * (total/&v_secs),2) aas1,
fpct*100 pct1,
decode(fpct,null,null,first) first,
--round(spct * (total/&v_secs),2) aas2,
spct*100 pct2,
decode(spct,null,null,second) second,
-- substr, ie trunc, the whole graph to make sure it doesn't overflow
substr(
-- substr, ie trunc, the graph below the # of CPU cores line
-- draw the whole graph and trunc at # of cores line
substr(
rpad('+',round((cpu*&v_bars)/&v_secs),'+') ||
rpad('o',round((io*&v_bars)/&v_secs),'o') ||
rpad('-',round((waits*&v_bars)/&v_secs),'-') ||
rpad(' ',&p_value * &v_bars,' '),0,(&p_value * &v_bars)) ||
&p_value ||
-- draw the whole graph, then cut off the amount we drew before the # of cores
substr(
rpad('+',round((cpu*&v_bars)/&v_secs),'+') ||
rpad('o',round((io*&v_bars)/&v_secs),'o') ||
rpad('-',round((waits*&v_bars)/&v_secs),'-') ||
rpad(' ',&p_value * &v_bars,' '),(&p_value * &v_bars),( &v_graph-&v_bars*&p_value) )
,0,&v_graph)
graph
-- spct,
-- decode(spct,null,null,second) second,
-- tpct,
-- decode(tpct,null,null,third) third
from (
select start_time
, max(samples) samples
, sum(top.total) total
, round(max(decode(top.seq,1,pct,null)),2) fpct
, substr(max(decode(top.seq,1,decode(top.event,'ON CPU','CPU',event),null)),0,15) first
, round(max(decode(top.seq,2,pct,null)),2) spct
, substr(max(decode(top.seq,2,decode(top.event,'ON CPU','CPU',event),null)),0,15) second
, round(max(decode(top.seq,3,pct,null)),2) tpct
, substr(max(decode(top.seq,3,decode(top.event,'ON CPU','CPU',event),null)),0,10) third
, sum(waits) waits
, sum(io) io
, sum(cpu) cpu
from (
select
to_date(tday||' '||tmod*&v_secs,'YYMMDD SSSSS') start_time
, event
, total
, row_number() over ( partition by id order by total desc ) seq
, ratio_to_report( sum(total)) over ( partition by id ) pct
, max(samples) samples
, sum(decode(event,'ON CPU',total,0)) cpu
, sum(decode(event,'ON CPU',0,
'db file sequential read',0,
'db file scattered read',0,
'db file parallel read',0,
'direct path read',0,
'direct path read temp',0,
'direct path write',0,
'direct path write temp',0, total)) waits
, sum(decode(event,'db file sequential read',total,
'db file scattered read',total,
'db file parallel read',total,
'direct path read',total,
'direct path read temp',total,
'direct path write',total,
'direct path write temp',total, 0)) io
from (
select
to_char(sample_time,'YYMMDD') tday
, trunc(to_char(sample_time,'SSSSS')/&v_secs) tmod
, to_char(sample_time,'YYMMDD')||trunc(to_char(sample_time,'SSSSS')/&v_secs) id
, decode(ash.session_state,'ON CPU','ON CPU',ash.event) event
, sum(decode(session_state,'ON CPU',10,decode(session_type,'BACKGROUND',0,10))) total
, (max(sample_id)-min(sample_id)+1) samples
from
dba_hist_active_sess_history ash
where
-- sample_time > sysdate - &v_days
-- and sample_time < ( select min(sample_time) from v$active_session_history)
dbid=&DBID
group by trunc(to_char(sample_time,'SSSSS')/&v_secs)
, to_char(sample_time,'YYMMDD')
, decode(ash.session_state,'ON CPU','ON CPU',ash.event)
) chunks
group by id, tday, tmod, event, total
) top
group by start_time
) aveact
order by start_time
/
The output of the above query should look like
TO_CHAR( AAS PCT1 FIRST PCT2 SECOND GRAPH -------- ----- ---- --------------- ---- --------------- ------------------------------------------- 16 00:00 7.02 52 db file sequent 27 CPU ++++++++++oooooooooo4ooooooooooooooo-- 16 01:00 3.80 57 CPU 36 db file sequent +++++++++++oooooooo 4 16 02:00 5.51 38 db file sequent 21 CPU ++++++oooooooooooo--4--------- 16 03:00 2.89 69 db file sequent 20 CPU +++ooooooooooo 4 16 04:00 .34 45 db file sequent 28 CPU o 4 16 05:00 .78 58 db file sequent 24 CPU +ooo 4 16 06:00 .59 41 db file sequent 39 CPU +oo 4 16 07:00 .19 57 CPU 21 db file sequent + 4 16 08:00 .18 57 CPU 28 db file scatter + 4 16 09:00 .50 69 db file sequent 24 CPU +oo 4 16 10:00 1.79 79 db file sequent 13 CPU +ooooooo 4 16 11:00 2.51 76 db file sequent 12 CPU +ooooooooooo 4
The “graph” on the right shows the load over time each line is an hour by default. The “+” represent CPU, “o” represent I/O and “-” represent a wait. The columns “FIRST” and “SECOND” represent the top two things happening on the database.
set pagesize 100
col event_name format a30
col avg_ms format 99999.99
col ct format 999,999,999
select
btime, event_name,
(time_ms_end-time_ms_beg)/nullif(count_end-count_beg,0) avg_ms,
(count_end-count_beg) ct
from (
select
e.event_name,
to_char(s.BEGIN_INTERVAL_TIME,'DD-MON-YY HH24:MI') btime,
total_waits count_end,
time_waited_micro/1000 time_ms_end,
Lag (e.time_waited_micro/1000)
OVER( PARTITION BY e.event_name ORDER BY s.snap_id) time_ms_beg,
Lag (e.total_waits)
OVER( PARTITION BY e.event_name ORDER BY s.snap_id) count_beg
from
DBA_HIST_SYSTEM_EVENT e,
DBA_HIST_SNAPSHOT s
where
s.snap_id=e.snap_id
--and e.wait_class in ( 'User I/O', 'System I/O')
-- and e.event_name in ( 'db file sequential read',
-- 'db file scattered read',
-- 'db file parallel read',
-- 'direct path read',
-- 'direct path read temp',
-- 'direct path write',
-- 'direct path write temp')
and e.event_name in ( 'db file sequential read')
and e.dbid=&DBID
and e.dbid=s.dbid
order by e.event_name, begin_interval_time
)
where (count_end-count_beg) > 0
order by event_name,btime
/
The above query has commented out all but ‘db file sequential read’ (ie single block reads) to make the output easier to read, but if there are other I/O waits then the query should be changed to get the latencies of those other types of I/O waits. The output of the above query should look something like
BTIME EVENT_NAME AVG_MS CT --------------- ------------------------------ --------- ------------ 15-JAN-12 15:04 db file sequential read 12.66 854,744 15-JAN-12 16:00 db file sequential read 9.21 1,242,296 15-JAN-12 17:00 db file sequential read 13.47 889,314 15-JAN-12 18:00 db file sequential read 6.16 1,109,994 15-JAN-12 19:00 db file sequential read 1.86 215,607 15-JAN-12 20:00 db file sequential read 2.12 185,727 15-JAN-12 21:00 db file sequential read 1.60 1,281,880 15-JAN-12 22:00 db file sequential read 7.16 327,645 15-JAN-12 23:00 db file sequential read 7.45 778,819 16-JAN-12 00:00 db file sequential read 19.02 674,674 16-JAN-12 01:00 db file sequential read 7.38 651,391 16-JAN-12 02:01 db file sequential read 6.21 1,194,226 16-JAN-12 03:00 db file sequential read 7.33 959,630 16-JAN-12 04:00 db file sequential read 2.59 201,810 16-JAN-12 05:00 db file sequential read 4.05 419,692 16-JAN-12 06:00 db file sequential read 2.71 300,792 16-JAN-12 07:00 db file sequential read .83 141,872 16-JAN-12 08:00 db file sequential read .84 157,132 16-JAN-12 09:00 db file sequential read 2.70 456,984 16-JAN-12 10:00 db file sequential read 4.07 1,234,105 16-JAN-12 11:00 db file sequential read 6.19 1,089,680
The above latencies look ok, ie under 10ms which is fine for I/O from physical spindle but the customer might be use to latencies from UNIX file cache reads or form SAN cache reads.
Where is the I/O coming from?
col type for a16
col "CPU" for 999999.9
col "IO" for 999999.9
select * from (
select
ash.SQL_ID , ash.SQL_PLAN_HASH_VALUE Plan_hash, aud.name type,
sum(decode(ash.session_state,'ON CPU',1,0)) "CPU",
sum(decode(ash.session_state,'WAITING',1,0)) -
sum(decode(ash.session_state,'WAITING', decode(wait_class, 'User I/O',1,0),0)) "WAIT" ,
sum(decode(ash.session_state,'WAITING', decode(wait_class, 'User I/O',1,0),0)) "IO" ,
sum(decode(ash.session_state,'ON CPU',1,1)) "TOTAL"
from dba_hist_active_sess_history ash,
audit_actions aud
where SQL_ID is not NULL
and ash.dbid=&DBID
and ash.sql_opcode=aud.action
group by sql_id, SQL_PLAN_HASH_VALUE , aud.name
order by sum(decode(session_state,'ON CPU',1,1)) desc
) where rownum < 20
/
Output looks like
SQL_ID PLAN_HASH TYPE CPU WAIT IO TOTAL ------------- ---------- ---------------- --------- ---------- --------- ---------- fgzp9yqqjcjvm 707845071 UPDATE 25.0 95 4081.0 4201 8u8y8mc1qxd98 131695425 SELECT 18.0 57 3754.0 3829 cfk8gy594h42s 3743737989 SELECT 2021.0 17 82.0 2120 cnx6ht8bdmf4c 0 PL/SQL EXECUTE 546.0 367 868.0 1781 gyj8wh7vx960y 1736948211 SELECT 197.0 11 1227.0 1435 b6jarjrw5d6tq 2817015872 INSERT 548.0 131 101.0 780
Look at the stats for the top SQL:
col avg_elapsed_sec for 9,999,999,999.999
col disk for 9,999,999.9
col lio for 9,999,999.9
col cpu_sec for 99,999,999.999
col io_time_sec for 9,999,999.999
col ap_time_sec for 9,999,999.999
col cc_time_sec for 9,999,999.999
select
to_char(snap.begin_interval_time,'DD-MON-YYYY HH24') ,
(sum(ELAPSED_TIME_DELTA)/nullif(sum(EXECUTIONS_DELTA),0))/1000000 avg_elapsed_sec,
sum(EXECUTIONS_DELTA) execs,
sum(DISK_READS_DELTA)/nullif(sum(EXECUTIONS_DELTA),0) disk,
sum(BUFFER_GETS_DELTA)/nullif(sum(EXECUTIONS_DELTA),0) lio,
sum(ROWS_PROCESSED_DELTA)/nullif(sum(EXECUTIONS_DELTA),0) rws,
(sum(CPU_TIME_DELTA)/nullif(sum(EXECUTIONS_DELTA),0))/1000000 cpu_sec,
(sum(IOWAIT_DELTA)/nullif(sum(EXECUTIONS_DELTA),0))/1000000 io_time_sec,
(sum(APWAIT_DELTA)/nullif(sum(EXECUTIONS_DELTA),0))/1000000 ap_time_sec,
(sum(CCWAIT_DELTA)/nullif(sum(EXECUTIONS_DELTA),0))/1000000 cc_time_sec,
sum(DIRECT_WRITES_DELTA)/nullif(sum(EXECUTIONS_DELTA),0) dio,
sum(PHYSICAL_READ_REQUESTS_DELTA)/nullif(sum(EXECUTIONS_DELTA),0) reads,
sum(PHYSICAL_WRITE_REQUESTS_DELTA)/nullif(sum(EXECUTIONS_DELTA),0) writes
from dba_hist_sqlstat sql,
dba_hist_snapshot snap
where sql.sql_id='&sql_id'
and sql.dbid=&dbid
and sql.dbid=snap.dbid
and snap.snap_id=sql.snap_id
group by to_char(snap.begin_interval_time,'DD-MON-YYYY HH24')
order by 1
/
Output looks like
TO_CHAR(SNAP.B AVG_ELAPSED EXECS DISK LIO RWS CPU_SEC IO_TIME_SEC AP_TIME_SEC CC_TIME_SEC DIO -------------- ------------------ ---------- ------------ ------------ ---------- --------------- -------------- -------------- -------------- ---------- --- 15-JAN-2012 23 3,012.128 1 315,363.0 2,394,156.0 0 37.624 2,889.440 .000 .000 0 16-JAN-2012 05 1,111.465 1 252,377.0 2,625,538.0 0 28.885 1,090.000 .000 .000 0
Get the SQL Text
set long 10000 select sql_text from dba_hist_sqltext where sql_id = '&1' and rownum < 2 /
Look at I/O sizes for multiblock reads. For multiblock reads, it’s good to have an idea of what the I/O sizes are. The following query gives an upper bound. NOTE the averages and even mins can be highly misleading but the max should be a good indicator.
col event for a25 select event,round(min(p3)) mn, round(avg(p3)) av, round(max(p3)) mx, count(*) cnt from dba_hist_active_sess_history --from v$active_session_history where (event like 'db file%' or event like 'direct %') and event not like '%parallel%' and dbid=&DBID group by event order by event /
looks like
EVENT MN AV MX CNT ------------------------- ---------- ---------- ---------- ---------- db file scattered read 2 16 16 892 db file sequential read 1 1 1 105 direct path read 1 1 1 1 direct path write 1 1 1 2 direct path write temp 4 29 31 17
WordPress hacked – reinstall time
DB Optimizer 13 Jan 2012, 9:54 pm CET
OK, so wordpress go hacked. I’ve had problems with this in the past and tried tactical surgery, but this time decided to to a full re-install. My first attempted left me with the wordpress blank screen of death, so here I’m outlining the steps I took that finally got the new version working:
back up wordpress database
http://codex.wordpress.org/WordPress_Backups#Simple_Backup
# go onto hosting box and downloaded new workdpress :
# there are better ways to download wordpress than zip, but this worked for me
lynx -source -dump http://wordpress.org/latest.zip > wordpress.zip
unzip wordpress.zip
mv myblog.com oldmyblog.com
mv wordpress myblog.com
cd myblog.com
cp wp-config-sample.php wp-config.php
vi wp-config.php
# change mysql info
/** The name of the database for WordPress */
define('DB_NAME', 'mydbname');
/** MySQL database username */
define('DB_USER', 'mydbuser');
/** MySQL database password */
define('DB_PASSWORD', 'mydbpassowrd');
/** MySQL hostname */
define('DB_HOST', 'mysql.myhost.com');
# added new keys
# get keys at
# https://api.wordpress.org/secret-key/1.1/salt/
# replacing these lines
# define('AUTH_KEY', 'put your unique phrase here');
# define('SECURE_AUTH_KEY', 'put your unique phrase here');
# define('LOGGED_IN_KEY', 'put your unique phrase here');
# define('NONCE_KEY', 'put your unique phrase here');
# define('AUTH_SALT', 'put your unique phrase here');
# define('SECURE_AUTH_SALT', 'put your unique phrase here');
# define('LOGGED_IN_SALT', 'put your unique phrase here');
# define('NONCE_SALT', 'put your unique phrase here');
# save file
cp .htaccess .htaccess.orig
# the downloaded .htaccess was giving me 404 errors
# so I used the one from the previous blog. Not sure
# if it's fully up to date but at least it got me going
vi .htaccess
# include these lines
# BEGIN WordPress
RewriteEngine On
RewriteBase /
RewriteRule ^index.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
# END WordPress
cp -R ../oldmyblog.com/wp-content/themes wp-content
cp -R ../oldmyblog.com/wp-content/uploads wp-content
cp -R ../oldmyblog.com/wp-content/profile-pics wp-content
reinstall plugins, in my case I use these:
amr-shortcode-any-widget/amr_shortcode_any_widget.php
google-analytics-for-wordpress/googleanalytics.php
profile-pic/profile-pic.php
sbs-blogroll/sbs-blogroll.php
syntax-highlighter-compress/syntax-highlighter-compress.php
w3-total-cache/w3-total-cache.php
wordpress-popular-posts/wordpress-popular-posts.php
wp-widget-cache/widget-cache.ph
For more info see
http://codex.wordpress.org/FAQ_My_site_was_hacked
http://smackdown.blogsblogsblogs.com/2008/06/24/how-to-completely-clean-your-hacked-wordpress-installation/
suggest reinstall and checking wordpress database for these code usages for possible hacks
SELECT * FROM wp_posts WHERE post_content LIKE '%<iframe%'
UNION
SELECT * FROM wp_posts WHERE post_content LIKE '%<noscript%'
UNION
SELECT * FROM wp_posts WHERE post_content LIKE '%display:%'
http://ocaoimh.ie/2008/06/08/did-your-wordpress-site-get-hacked/
check for usage of base64_decode()
Installing Oracle on RHEL
DB Optimizer 13 Jan 2012, 6:52 am CET
A small but time consuming LINUX issue:
When installing on RHEL for both 10 and 11, I get errors that the following libraries are missing
The message is a little misleading as both the i386 and x86_64 versions of libaio have to be intalled
libaio-0.3.106-3.2.i386 libaio-0.3.106-3.2.x86_64 libaio-devel-0.3.106-3.2.i386 libaio-devel-0.3.106-3.2.x86_64 sysstat-7.0.2-1.SEL5_2.i386
The last time I installed Oracle on LINUX was almost a year ago and it took me several hours to track this down. Since then that test machine has been blown away and a long with it my copies of these libraries. Today I tried to install Oracle 11 on LINUX and again I spent another hour or two trying to track these down.
You should just run something like:
http://www.icewalkers.com/rpm/libaio/fedora-core-6/download/libaio-2305.html
http://www.icewalkers.com/rpm/libaio-devel/fedora-core-6/download/libaio-devel-2306.html
This is fedora but seems to work on RHEL. I’m doing some quick tests and hacking around, so I just want to getting up and running asap. For production use, it would be wise to get yum registered and working correctly.
I tar’ed up and gziped the rpms here http://oraclemonitor.com/ftp/
in the file oracle_fed_rpms.tar.gz
10053 Trace file viewer
DB Optimizer 10 Jan 2012, 12:29 am CET
Back in http://dboptimizer.com/2010/07/26/oracle-10053-sql-trace-viewer/
I mentioned an Oracle 10053 trace file viewer. Well now there is
another one with even more options and it’s free. Check out
http://www.lab128.com/free_downloads.html
from the same person who created lab128 the fastest instance Oracle
database monitor in the West (of in the world for that matter).
Here is an example screen shot:
ASH Exploding Across Versions
DB Optimizer 17 Dec 2011, 1:41 am CET
I sometimes forget which versions of ASH have which fields, so here is a bit of a graphic and textual cheat sheet. (see https://sites.google.com/site/oraclemonitor/dba_hist_active_sess_history for some queries on ASH)
10.2.0 10.2.0.3 11.1.0 11.2.0 (http://download.oracle.com/docs/cd/E11882_01/server.112/e10820/dynviews_1007.htm#I1030299) SAMPLE_IDSAMPLE_IDSAMPLE_IDSAMPLE_ID SAMPLE_TIMESAMPLE_TIMESAMPLE_TIMESAMPLE_TIME SESSION_IDSESSION_IDSESSION_IDSESSION_ID SESSION_SERIAL#SESSION_SERIAL#SESSION_SERIAL#SESSION_SERIAL# SESSION_TYPESESSION_TYPESESSION_TYPESESSION_TYPE USER_IDUSER_IDUSER_IDUSER_ID QC_INSTANCE_IDQC_INSTANCE_IDQC_INSTANCE_IDQC_INSTANCE_ID QC_SESSION_IDQC_SESSION_IDQC_SESSION_IDQC_SESSION_ID SERVICE_HASHSERVICE_HASHSERVICE_HASHSERVICE_HASH PROGRAMPROGRAMPROGRAMPROGRAM MODULEMODULEMODULEMODULE ACTIONACTIONACTIONACTION CLIENT_IDCLIENT_IDCLIENT_IDCLIENT_ID QC_SESSION_SERIAL#QC_SESSION_SERIAL# CONSUMER_GROUP_IDCONSUMER_GROUP_ID
MACHINE
PORT SQL_IDSQL_IDSQL_IDSQL_ID SQL_CHILD_NUMBERSQL_CHILD_NUMBERSQL_CHILD_NUMBERSQL_CHILD_NUMBER SQL_OPCODESQL_OPCODESQL_OPCODESQL_OPCODE SQL_PLAN_HASH_VALUESQL_PLAN_HASH_VALUESQL_PLAN_HASH_VALUESQL_PLAN_HASH_VALUE XIDXIDXIDXID FORCE_MATCHING_SIGNATUREFORCE_MATCHING_SIGNATUREFORCE_MATCHING_SIGNATUREFORCE_MATCHING_SIGNATURE
PLSQL_ENTRY_OBJECT_IDPLSQL_ENTRY_OBJECT_IDPLSQL_ENTRY_OBJECT_ID
PLSQL_ENTRY_SUBPROGRAM_IDPLSQL_ENTRY_SUBPROGRAM_IDPLSQL_ENTRY_SUBPROGRAM_ID PLSQL_OBJECT_IDPLSQL_OBJECT_IDPLSQL_OBJECT_ID
PLSQL_SUBPROGRAM_IDPLSQL_SUBPROGRAM_IDPLSQL_SUBPROGRAM_ID
TOP_LEVEL_SQL_IDTOP_LEVEL_SQL_ID
TOP_LEVEL_SQL_OPCODETOP_LEVEL_SQL_OPCODE
SQL_PLAN_LINE_IDSQL_PLAN_LINE_ID
SQL_PLAN_OPERATIONSQL_PLAN_OPERATION
SQL_PLAN_OPTIONSSQL_PLAN_OPTIONS
SQL_EXEC_IDSQL_EXEC_ID
SQL_EXEC_STARTSQL_EXEC_START
IN_CONNECTION_MGMTIN_CONNECTION_MGMT IN_PARSEIN_PARSE IN_HARD_PARSEIN_HARD_PARSE IN_SQL_EXECUTIONIN_SQL_EXECUTION
IN_PLSQL_EXECUTIONIN_PLSQL_EXECUTION
IN_PLSQL_RPCIN_PLSQL_RPC
IN_PLSQL_COMPILATIONIN_PLSQL_COMPILATION
IN_JAVA_EXECUTIONIN_JAVA_EXECUTION
IN_BINDIN_BIND
IN_CURSOR_CLOSEIN_CURSOR_CLOSE
IN_SEQUENCE_LOAD
TOP_LEVEL_CALL#
TOP_LEVEL_CALL_NAME IS_SQLID_CURRENT SQL_OPNAME EVENTEVENTEVENTEVENT EVENT_IDEVENT_IDEVENT_IDEVENT_ID EVENT#EVENT#EVENT#EVENT# SEQ#SEQ#SEQ#SEQ# P1P1P1P1 P2P2P2P2 P3P3P3P3 WAIT_TIMEWAIT_TIMEWAIT_TIMEWAIT_TIME SESSION_STATESESSION_STATESESSION_STATESESSION_STATE TIME_WAITEDTIME_WAITEDTIME_WAITEDTIME_WAITED CURRENT_OBJ#CURRENT_OBJ#CURRENT_OBJ#CURRENT_OBJ# CURRENT_FILE#CURRENT_FILE#CURRENT_FILE#CURRENT_FILE# CURRENT_BLOCK#CURRENT_BLOCK#CURRENT_BLOCK#CURRENT_BLOCK# P1TEXTP1TEXTP1TEXTP1TEXT P2TEXTP2TEXTP2TEXTP2TEXT P3TEXTP3TEXTP3TEXTP3TEXT WAIT_CLASSWAIT_CLASSWAIT_CLASSWAIT_CLASS WAIT_CLASS_IDWAIT_CLASS_IDWAIT_CLASS_IDWAIT_CLASS_ID
CURRENT_ROW#CURRENT_ROW# TM_DELTA_TIME TM_DELTA_CPU_TIME TM_DELTA_DB_TIME DELTA_TIME DELTA_READ_IO_REQUESTS
DELTA_WRITE_IO_REQUESTS DELTA_READ_IO_BYTES DELTA_WRITE_IO_BYTES DELTA_INTERCONNECT_IO_BYTES PGA_ALLOCATED TEMP_SPACE_ALLOCATE BLOCKING_SESSIONBLOCKING_SESSIONBLOCKING_SESSION_STATUSBLOCKING_SESSION_STATUS BLOCKING_SESSION_STATUSBLOCKING_SESSION_STATUSBLOCKING_SESSIONBLOCKING_SESSION BLOCKING_SESSION_SERIAL#BLOCKING_SESSION_SERIAL#BLOCKING_SESSION_SERIAL#BLOCKING_SESSION_SERIAL#
BLOCKING_INST_ID
BLOCKING_HANGCHAIN_INFO FLAGSFLAGS REMOTE_INSTANCE#REMOTE_INSTANCE#
TIME_MODEL IS_AWR_SAMPLE
ECID CAPTURE_OVERHEAD REPLAY_OVERHEAD IS_CAPTURED IS_REPLAYED
| 10.1 | ||||
| SAMPLE_ID | SAMPLE_TIME | SESSION_ID | SESSION_SERIAL# | SESSION_TYPE |
Right Deep, Left Deep and Bushy Joins
DB Optimizer 10 Dec 2011, 12:22 am CET
At UKOUG someone asked me if DB Optimizer’s VST diagrams could deal with left deep verses right deep execution plans. What is right deep verses left deep? Good question. In join trees (not VST) the object on the left is acted upon first then the object on the right. Below are left deep and right deep examples of the same query, showing
- query text
- join tree
- join tree modified to more clearly show actions
- VST showing the same actions
Confio Ignite
DB Optimizer 1 Dec 2011, 2:08 pm CET
People have been asking me recently, “what is the best enterprise database monitoring software?”
Of course for Oracle there is OEM but what if one wants to monitor multiple database platforms in the same interface?
In enterprise database monitors I look for a dashboard based on wait time and CPU statistics. I want to see CPU and wat statistics correlated in a way that I can easily see the load on the database and the break down of the time by wait class and CPU. The main databases are instrumented with wait events such as Oracle, Sybase, SQL Server and DB2 (DB2 wait interface support is in 9.7, before 9.7 the statistics were less useful).
I only know of 3 cross platform enterprise database monitoring products that follow such a methodology: Confio Ignite, Quest Performance Analyzer and Veritas Indepth/I3.
Veritas I3 product has fallen off my radar. I haven’t heard much from them and the last time I talked to them they said that installing the product required a consultant. That’s a show stopper for me. Quest’s as well is falling off my radar, not for lack of technology but more a lack of focus by the company. The product “Performance Analyzer” doesn’t even show up in the top 10 hits on Google. My guess is that they have rolled the product under the hood of Foglight and sell it as a Foglight option, which means more money and more complexity which are all drawbacks. As far as technology goes, “Performance Analyzer” was pretty cool and had nice dash boards but probably the biggest drawback was the product required binaries to be installed on each and every target (at least for Oracle) which can turn into a maintenance nightmare. Well, who else is out there? The other contender is Confio. One of Confio’s great advantages is that they only do (at least for most of their history) monitoring. There is clear focus and enthusiasm. It’s such a refreshing change from Quest and Veritas (and if you are Oracle only, then OEM).
So let’s look at Confio. One of the newest and most exciting things at Confio is the feature of monitoring VMware statistics. VMware has a hundreds of statistics ( I once ran esxtop in batch mode and got 24,000 columns, yes 24,000). The statistics are show in nice graphs in ESX vSphere and vCenter, but the statistics in the graphs have to be chose from lists and the number of lines in the graphs can become overwhelming, but the worst part is the lack of correlation between statistics of different types such as CPU, I/O and network which are on different graphs. Finally there is no way to correlate the VMware statistics with the Oracle database, until now with Confio.
Here is a Confio enterprise dashboard where I can see my databases. The databases are grouped in this image by those on VMware and those not:
If I drill down onto a database running on VMware I get not only the classic load chart on the database but also correlated graphs from the OS and VMware:
The top graph is the load on the Oracle database grouped by top SQL based on active time in the database (which includes wait time and CPU. Below this graph are 3 other graphs:
- DB Instance
- Signal Waits Percent
- Instance CPU Utilization
- VM/OS
- VM CPU Usage
- VM CPU Ready Time
- O/S CPU Queue Length
- Physical Host
- Host CPU Usage
Now with these graphs on the same page I can easily make correlations. I can see a spike in my SQL load at noon.
I can correlate this spike in database load with the three graphs.
1. The “DB Instance” graph shows a spike in the CPU used by Oracle as seen in the “DB Instance” graph. The “DB Instance” graph also shows “Signal Waits Percent” which is a fancy way of saying Oracle is waiting for CPU – ie there is CPU contention.
2. The “VM/OS” graph shows CPU usage going up and “CPU Ready Time” going up. The statisitics “CPU Ready Time” is an important statistics for VMware, yet it’s not well documented. The statisitc “CPU Ready Time” is shows how much time the VM wanted CPU from the host but couldn’t get it.
3. The “Physical Host” graph shows that there was a spike in CPU used at the HOST level across all the VMs on that host.
Additionally there are event notifications of changes on the system such as adding a new VM to the host. Note as well the grey circles with arrows. Pass your mouse over the event icon to get information about the event.
I find the ability to see all the layers in one view, see the important statistics only and be able to correlate all these statistics invaluable.
On top of the additional VMware monitoring option as seen anbove, Confio offers the classic view of the database load view through different aggregation groupings :
- SQL
- Waits
- Programs
- Machines
- DB Users
- O/S Users
- Files
- Plans
- Objects
- Modules
- Actions
Clicking the Object tabs gives a different perspective
The above charts are large granularity but one can zoom down to as small as 10 second intervals:
Ignite also notifies when it finds performance issues:
Drilling down on alerts will point out such useful things as a SQL statement that has had an execution plan change for the worse:
Summary
The above are a few of my first impressions of Confio’s Ignite. Ignite seems to fill a clear need in the industry for enterprise cross database platform monitoring including the ability to monitor VMware.
If you are on Oracle only, then it is a cheaper alternative to OEM and if you have OEM already the Ignite is a good complement. One attractive feature of Ignite is that all the data is collected into a centralized database allowing one to easily run custom queries and query across multiple databases.
S-ASH 2.3 available
DB Optimizer 27 Nov 2011, 6:16 pm CET
see:
http://oracleprof.blogspot.com/2011/11/new-release-of-s-ash-v23.html
“New features: - new easier installation process - new metrics history (IO and system) - improved support for RAC / multi database in one repository - Oracle scheduler is used to manage jobs
This version of OraSASH has been tested with Oracle 10g and 11g R1 and R2 as a target database and Oracle 11gR2 Express Edition as repository.
regards, Marcin”
Solaris IPS documentation
DB Optimizer 21 Nov 2011, 11:39 pm CET
Oracle’s Image Packaging Service (IPS) is documented here: http://www.oracle.com/technetwork/server-storage/solaris11/technologies/ips-323421.html
From here there is a step by step example of creating a package here: http://wikis.sun.com/download/attachments/240520636/oracle-solaris-11-ips-dev-guide.pdf
Here are some of the steps
mkdir proto
mkdir proto/opt
# add the package files in opt
pkgsend generate proto | pkgfmt > mypkg.p5m.1
cat << EOF > mypkg.mog
set name=pkg.fmri value=mypkg@1.0,5.11-0
set name=pkg.summary value="This is our example package"
set name=pkg.description value="This is a full description of
all the interesting attributes of this example package."
set name=variant.arch value=$(ARCH)
set name=info.classification
value=org.opensolaris.category.2008:Applications/Accessories
link path=usr/share/man/index.d/mysoftware target=opt/mysoftware/man
drop>
EOF
pkgmogrify -DARCH=`uname -p` mypkg.p5m.1 mypkg.mog | pkgfmt > mypkg.p5m.2
pkgdepend generate -md proto mypkg.p5m.2 | pkgfmt > mypkg.p5m.3
pkgdepend resolve -m mypkg.p5m.3
pkglint mypkg.p5m.3.res
pkgrepo create /scratch/my-repository
pkgrepo -s /scratch/my-repository set publisher/prefix=mypublisher
Now with these steps done, the package is ready to be published. Here is the command as stated in the documentation.
pkgsend -s /scratch/my-repository/ publish -d proto mypkg.p5m.4.res
What happens? The pkgsend command just hangs. After using “truss” it’s easy to see that pkgsend is waiting for input from stdin.
24440: read(0, 0xFEF79C20, 1024) (sleeping...)
Hitting control D, “^D”, gives:
pkgsend: The URI '/scratch/my-repository/' contains an unsupported scheme ''.
which leads to https://defect.opensolaris.org/bz/show_bug.cgi?id=17143 point to a slightly different syntax:
pfexec pkgsend -s file:///scratch/my-repository/ publish -d proto mypkg.p5m.4.res
hitting ^D gives:
'The specified FMRI, 'pkg:/mypkg.p5m.4.res', has an invalid version.
There is not much information on this error, but some here https://defect.opensolaris.org/bz/show_bug.cgi?id=1348
In every case, pkgsend is trying to read from standard in. The documentation does say that pkgsend will read from standard in when the manifest is not on the command line: http://download.oracle.com/docs/cd/E23824_01/html/E21796/pkgsend-1.html
publish [-b bundle ...] [-d source ...] [-s repo_uri_or_path] [-T pattern] [--no-catalog] [manifest ...]
Publishes a package using the specified package manifests to the target package repository, retrieving files for the package from the provided sources. If multiple manifests are specified, they are joined in the order provided. If a manifest is not specified, the manifest is read from stdin.
But on the command line used, the manifest is not specified on the command line.
Scanning the command like options turns up:
Usage:
pkgsend [options] command [cmd_options] [operands]
Packager subcommands:
pkgsend open [-en] pkg_fmri
pkgsend add action arguments
pkgsend import [-T pattern] [--target file] bundlefile ...
pkgsend include [-d basedir] ... [-T pattern] [manifest] ...
pkgsend close [-A | [--no-index] [--no-catalog]]
pkgsend publish [-d basedir] ... [-T pattern] [--no-index]
[--fmri-in-manifest | pkg_fmri] [--no-catalog] [manifest] ...
pkgsend generate [-T pattern] [--target file] bundlefile ...
pkgsend refresh-index
Options:
-s repo_uri target repository URI
--help or -? display usage message
Environment:
PKG_REPO
and there is an interesting command line option that looks a bit reminiscent of the last error “–fmri-in-manifest“. Giving it a shot:
pfexec pkgsend -s file:///scratch/my-repository/ publish -d proto --fmri-in-manifest mypkg.p5m.4.res pkg://mypublisher/mypkg@1.0,5.11-0:20111121T141936Z PUBLISHED
And it works!
Procedural SQL*Plus and Password Encryption
DB Optimizer 18 Nov 2011, 1:33 am CET
One small but bothersome issue I’ve had for 20 years is how to drive a program like SQL*Plus with a shell script to make it procedural. One approach is to just run single commands to SQL*Plus causing a connection and exit for every SQL statement. Connecting and disconnecting is costly. Ideally, I just want to open up a connection and send commands to the connection and get the response. Of course languages like java, perl, python all have Oracle connection APIs but what if I just want a simple shell script and don’t have access to perl or python with the Oracle APIs or access to java? Can’t I just do it in shell? Can’t I just connect SQL*Plus to a named pipe and echo my commands into the pipe? Well yes but there is an annoying obstacle. After echoing the first command the pipe, which SQL*Plus dutifully executes, SQL*Plus then exits. I’ve never seen anyone actually succeed with this approach out in the field though once I figured it out. I first ran into this problem about 20 years ago and didn’t solve it at the time. A few years later, I found out how and have been using it ever since. The trick is to have a process run a “tail -f” of an empty file into the pipe. With this second process tailing, SQL*Plus doesn’t exit. Since I first started using this I’ve never looked into exactly why. It think that when SQL*Plus tries to read from the pipe alone after the first command has been sent, the OS says, no more data , and SQL*Plus exits. With the second process doing the “tail -f”, then the OS tells SQL*Plus, waiting for more data to send you, and SQL*Plus waits. Would love a more detailed explanation.
#!/bin/ksh
SID=orcl
PORT=1521
function usage
{
echo "Usage: $(basename $0) [sid] [port]"
echo " username database username"
echo " username database password"
echo " host hostname or IP address"
echo " sid optional database sid (default: orcl)"
echo " port optional database port (default: 1521)"
exit 2
}
[[ $# -lt 3 ]] && usage
[[ $# -gt 5 ]] && usage
[[ $# -gt 0 ]] && UN=$1
[[ $# -gt 1 ]] && PW=$2
[[ $# -gt 2 ]] && HOST=$3
[[ $# -gt 3 ]] && SID=$4
[[ $# -gt 4 ]] && PORT=$5
MKNOD=/etc/mknod
DEBUG=0
OPEN=sqlplus.open
PIPE=sqlplus.pipe
CLEAN=sqlplus.clean
sh ./$CLEAN > /dev/null 2>&1
echo "" > $CLEAN
echo "rm $OPEN" >> $CLEAN
echo "rm $PIPE" >> $CLEAN
rm $OPEN $PIPE > /dev/null 2>&1
touch $OPEN
cmd="$MKNOD $PIPE p"
eval $cmd
tail -f $OPEN >> $PIPE &
OPENID="$!"
echo "kill -9 $OPENID" >> $CLEAN
# run SQLPLUS silent unless DEBUG is 2 or higher
SILENT=""
if [ $DEBUG -lt 2 ]; then
SILENT="-s"
fi
CONNECT="$UN/$PW@(DESCRIPTION=
(ADDRESS_LIST=
(ADDRESS=
(PROTOCOL=TCP)
(HOST=$HOST)
(PORT=$PORT)))
(CONNECT_DATA=
(SERVER=DEDICATED)
(SID=$SID)))"
cmd="sqlplus $SILENT "$CONNECT" < $PIPE > /dev/null &"
cmd="sqlplus $SILENT "$CONNECT" < $PIPE &"
echo "$cmd"
echo "PIPE $PIPE"
eval $cmd
SQLID="$!"
echo "kill -9 $SQLID" >> $CLEAN
Example execution
$ ./sqlplus_pipe.sh scott tiger 192.168.1.2 PIPE sqlplus.pipe $ echo 'select * from dual;' > sqlplus.pipe D - X $ echo 'exit' > sqlplus.pipe $ sh sqlplus.clean
The above code will create a SQL*Plus connection reading it’s input from a pipe. It also creates a cleanup file to remove the pipe file and kill the tail process. If creating multiple connections then the file names will have to be pre/post-pended with some string to keep them separate. could be a timestamp. Could info about which target.
Now what does this have to do with password encryption? Well there are packages that handle password encryption. Oracle has as cool thing called wallets, that can be setup so that SQL*Plus can connect without a password. Oracle’s Doc http://download.oracle.com/docs/cd/B19306_01/network.102/b14266/cnctslsh.htm Here is a quick setup: http://www.oracle-base.com/articles/10g/SecureExternalPasswordStore_10gR2.php Unfortunately Oracle’s wallet method requires files that can’t be redistributed. These file dependencies legally have to be installed by the end user, where as Oracle instant client can be redistributed. So what if I’m redistributing a tool that uses instant client? Then the wallet solution is out of the question, at least for easy installs. Now what if I create my own binary to handle password encryption, like “Oracle Password Repository” http://opr.sourceforge.net/ This doesn’t help that much for protection, because a user can actually get the text password. What I want is a way to encrypt passwords and hide the unencrypted passwords from the user. Sure I want them to connect, then they have access to the database, but I want to prevent them from walking off with a file full of clear text passwords. One solution, like the Oracle wallet with the “cwallet.sso” files, is to have a binary that creates the connections for SQL*Plus over a pipe and then pass the pipe to the users. Bit of a hack, especially for an interactive users, but for scripts that run SQL*Plus it not only centralizes the passwords, but it encrypts and helps prevent the user from getting access to and walking away with a set of clear text passwords. Just for kicks here is what truss looks like echoing ‘select * from dual’ into the pipe with and without having a second process tailing (nothing) into the pipe:
fstat64(1, 0x08044440) = 0 write(1, "n", 1) = 1 write(1, " Dn", 2) = 2 write(1, " -n", 2) = 2 write(1, " Xn", 2) = 2 write(1, "n", 1) = 1 read(0, 0x0813FAD4, 5120) (sleeping...)
without
write(1, "n", 1) = 1 write(1, " Dn", 2) = 2 write(1, " -n", 2) = 2 write(1, " Xn", 2) = 2 write(1, "n", 1) = 1 write(1, " S Q L > ", 5) = 5 read(0, 0x0813F714, 5120) = 0 write(4, " r 06 03t".., 13) = 13 read(4, " 11 06 t01".., 2064) = 17 write(1, " D i s c o n n e c t e d".., 95) = 95 write(1, " W i t h t h e P a r".., 78) = 78 write(4, " n 06 @", 10) = 10 close(4)
Why does the same job run slower on an exact clone?
DB Optimizer 15 Nov 2011, 2:59 am CET
A customer was working with a clone copy of their production database for reports, but the cloned database reports were running slower than the same reports on production and they wanted to know why.
Ostensibly the machines were similar and the memory was the same, though of it’s always worth verifying. The first thing I checked was the load profile. If you have OEM you can look at the performance page and/or top activity page to see the load profile. The load profile of production was much more CPU bound and much less I/O bound than the clone and their was clearly a difference.
The customer knew when they had run the report on the clone but was unsure exactly when it was run on production though knew it had been run in the past day.
On the clone box, during the report run the load profile using SQL script to reproduce the performance graph looked like :
07-NOV CLONE TM AAS GRAPH ----- ---- ---------------------------------------------------- 11:00 .2 + 6 12:00 .2 - 6 13:00 .1 6 14:00 .1 6 15:00 .1 6 16:00 14.9 ++++++ooooooooo6ooooooooooooooooooooooooooooooooooo- 17:00 .0 6
The reports were run between 4pm and 5pm (ie 16:00-17:00) and that’s easy to see from the load chart. The “+” represent CPU, “o” represent I/O, and “-” other wait time. Now the reports on production were suppose to be run around 7am but the client wasn’t sure. Here is what the load profile looked like on production looked like
07-NOV PRODUCTION time AAS GRAPH ----- --- --------------------------------------------------------------------- 00:00 1.5 ++++o 01:00 1.1 +++ 02:00 .4 + 03:00 .2 + 04:00 .4 ++ 05:00 1.5 ++ooo 06:00 1.6 +++oo 07:00 3.2 ++++++++ooo- 08:00 3.6 ++++++++++ooo-- 09:00 6.1 +++++++++++++++++oooooo---- 10:00 4.7 +++++++++++++++++ooo 11:00 5.3 +++++++++++++++++++ooo- 12:00 0.0 +++++++++++++++++++++++++++++++++oooooooo--------------------------- 13:00 0.5 ++++++++++++++++++++++++oooooooooooooooooooo 14:00 19.2++++++++++++++++++++++++++++++++oooooooooooooooooooooooooooooooooooooo 15:00 6.5 +++++++++++++++oooooo
The customer thought the report had been run at 8am on production and at 8am there is little I/O wait and some CPU. Looking at the top SQL from the clone during the report verses the production database at 8am didn’t show any of the same top SQL. At this point I ask the customer to send me AWR exports of the two databases. I imported the AWR exports giving the clone database DBID=1 and production DBID=2
The first thing I check was the I/O latency in the AWR repository data. Both production and the clone showed I/O averaging 6ms, so latency was unlikely to be the problem. Clone Latency during the report:
BTIME EVENT_NAME AVG_MS CT --------------- ------------------------------ --------- ---------- 07-NOV-11 16:00 db file scattered read 6.48 4,246 07-NOV-11 16:00 db file sequential read 8.15 4,760,454 07-NOV-11 16:00 direct path read 3.50 680,192 07-NOV-11 16:00 direct path write 10.22 14 07-NOV-11 16:00 direct path write temp .62 2
Production looked like (for example)
BTIME EVENT_NAME AVG_MS CT --------------- ------------------------------ --------- ---------- 07-NOV-11 14:00 db file scattered read 2.93 148,226 07-NOV-11 14:00 db file sequential read 6.35 4,961,761 07-NOV-11 14:00 direct path read 2.32 2,706,322 07-NOV-11 14:00 direct path write .63 204 07-NOV-11 14:00 direct path write temp .86 820
Thus the clone looks slightly slower, but not enough to explain I/O load seen on the clone.
Then I check for the top SQL during the report which gave:
select
SQL_ID ,
sum(decode(session_state,'ON CPU',1,0)) "CPU",
sum(decode(session_state,'WAITING',1,0)) -
sum(decode(session_state,'WAITING', decode(wait_class, 'User I/O',1,0),0)) "WAIT" ,
sum(decode(session_state,'WAITING', decode(wait_class, 'User I/O',1,0),0)) "IO" ,
sum(decode(session_state,'ON CPU',1,1)) "TOTAL"
from dba_hist_active_sess_history
where SQL_ID is not NULL
and dbid=1
group by sql_id
order by sum(decode(session_state,'ON CPU',1,1)) desc
SQL_ID CPU WAIT IO TOTAL ------------- ------ ------ ------ ------ 4sfx95sjafc03 31.0 .0 736.0 767
I wasn’t confident that the customer knew when the report was run on production, so I just searched for these SQL_ID values over the last 12 hours. The top query looks like a radically different I/O profile
select
SQL_ID ,
sum(decode(session_state,'ON CPU',1,0)) "CPU",
sum(decode(session_state,'WAITING',1,0)) -
sum(decode(session_state,'WAITING', decode(wait_class, 'User I/O',1,0),0)) "WAIT" ,
sum(decode(session_state,'WAITING', decode(wait_class, 'User I/O',1,0),0)) "IO" ,
sum(decode(session_state,'ON CPU',1,1)) "TOTAL"
from dba_hist_active_sess_history
where SQL_ID is not NULL
and dbid=2
and sql_id = '4sfx95sjafc03',
group by sql_id
order by sum(decode(session_state,'ON CPU',1,1)) desc
/
SQL_ID CPU WAIT IO TOTAL ------------- ------ ------ --------- ------ 4sfx95sjafc03 12.0 .0 39.00 51
Now I wanted verify the stats for each SQL statement to make sure the query was doing the same work:
VDB
SNAP_ID EXECS DISK LIO RWS CPU ELAPSE IO_TIME READS
---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
18798 25648 2746798 21669637 12554 1768040000 7867477859 7430523627 2746594
Prod
18887 116449 52123 3026366 77023 67260000 377033758 313874605 52122
The query is being executed more on Prod and doing less disk and less CPU. Now let’s verify it’s execution plan:
DB SQL_ID PLAN_HASH_VALUE
--------- ------------- ---------------
clone 4sfx95sjafc03 1805550727
Prod 4sfx95sjafc03 3038004817
different execution plans lets look at the two execution plans
(I used a direct query on DBA_HIST_SQL_PLAN but you can also used the package provided by Oracle, for example:
select * from table(dbms_xplan.display_awr('4sfx95sjafc03',1805550727,1)); -- sql_id, plan_hash, DBID
)
Plan Hash
VDB Production
----------- ------------
1805550727 3038004817
OPERATION
---------------------------------- --------------------
SELECT STATEMENT_ SELECT STATEMENT_
SORT_ORDER BY SORT_ORDER BY FILTER_
FILTER_ FILTER_
NESTED LOOPS_ NESTED LOOPS_
NESTED LOOPS_ NESTED LOOPS_
-> TABLE ACCESS_BY INDEX ROWID PS_PAY_CHECK TABLE ACCESS_BY INDEX ROWID PS_PAY_EARNIN
-> INDEX_RANGE SCAN PS1PAY_CHECK INDEX_RANGE SCAN PSEPAY_EARNINGS
-> INDEX_RANGE SCAN PS_PAY_EARNINGS INDEX_UNIQUE SCAN PS_PAY_CHECK
-> TABLE ACCESS_BY INDEX ROWID PS_PAY_EARNINGS TABLE ACCESS_BY INDEX ROWID PS_PAY_CHECK
TABLE ACCESS_BY INDEX ROWID PS_PAY_CHECK TABLE ACCESS_BY INDEX ROWID PS_PAY_CHECK
INDEX_RANGE SCAN PS0PAY_CHECK INDEX_RANGE SCAN PS0PAY_CHECK
So the execution plans have changes. Now why have they changed? That’s a story for another day, but the plan would only change if the data and/or the statistics on the tables had changed.
Why OEM drives me crazy sometimes
DB Optimizer 10 Nov 2011, 10:14 pm CET
OEM just seems too have to many brittle esoteric configuration files and process dependencies. Ideally I just want to connect with system/password and go. Is that too simple to ask for?
Today I tried out OEM and got the general broken page:
And my first reaction was just to give up and move on, but then I noticed the error message sounded some what simple:
ORA-28001: the password has expired (DBD ERROR: OCISessionBegin)
Hmm, maybe this *is* easily fixable. Well guess again. Luckily someone has well documented the fix
After the “fix” EM still wouldn’t work because DBSNMP’s password had expired as well, but at least it gave a password screen to change the password. So to simply launch OEM to get to the performance page after my system password had expired I had to change the passwords for not one but 3 users: SYSTEM, SYSMAN and DBSMP which is crazy enough, but the worst part is that SYSMAN’s password is hardcoded, though encrypted,in a flat file that has to found (good luck) and edited.
PS, to turn off password expiration which is on by default in 11 but was off by default in 10:
alter profile default limit password_life_time unlimited;
EMlight
DB Optimizer 10 Nov 2011, 7:20 am CET
I’ve been meaning to blog about EMlight (http://obzora.com) for a while. I first tried it about 6 months ago, and was impressed. EMlight is a light weight re-write of Enterprise Manager that is dependable and improved. How is it improved? Let’s look at my favorite part, the “Top Activity” screen. The “Top Activity” screen has added a section on I/O latencies which is sorely missing from OEM’s performance page and Top Activity page. The OEM light version also includes a list of the top events (as I did in DB Optimizer). Very cool.
How is EMlight more dependable? With EMlight you just connect to the database and voila it works. It doesn’t require any of the brittle configuration files that OEM depends on nor the agents and processes that OEM depends on.
Here is an example of the To Activity screen in EMlight:
I tested version 1.2. Since then versions 1.3 and 1.4 have been released. I’ve meant to test these but haven’t yet and didn’t want to wait any longer before posting about this promising tool.
Just gave a quick look at 1.4 and looks pretty much the same but obvious improvements such as the “top activity” screen above now has a super responsive slider window to choose the aggregation time range, and you might notice in the above screen shot that CPU is missing in two of the 3 aggregates which is now fixed in 1.4 (I think the fix actually made it into 1.3 originally)
Check out more screen shots here http://obzora.com/screenshots.html
Here are a few other screen shots fro my testing:
Datafiles along with single block I/O latency which is a nice addition
I/O along with I/O latency (I don’t think OEM has every added I/O latency )
Buffer cache contents and advisory:
Archive file date times sizes and rates
Tables space usage and attributes
Top Session, PL/SQL, SQL, Events (kind of like the top
activity page without the timeline)
Importing AWR repositories from cloned databases
DB Optimizer 9 Nov 2011, 12:22 am CET
I sometimes want to import AWR repositories from cloned databases to compare performance differences between the databases which should have the same performance characteristics. Unfortunately the clones have the same DBID this importing their AWR repositories into a central AWR repository would be probelematic as I wouldn’t have a way to tell which database the data came from as the DBID is the same. So here is a procedure to change the DBID. THe procedure requires importing the data into the awr_stage user, changing the DBID, then moving the data into the AWR repository: (see http://gavinsoorma.com/2009/07/exporting-and-importing-awr-snapshot-data/ for info on normal exporting and importing of AWR repositories)
The script uses a awr_change_dbid.sql to defer the constraint on DBID, change the DBID and then put the constraint back
create tablespace AWR datafile '/home/oracle/oradata/AWR_01.dbf' size 200M;
Drop Directory AWR_DMP;
Create Directory AWR_DMP AS '/home/oracle/awr_dumps';
-- create staging user user
drop user awr_stage cascade;
create user awr_stage
identified by awr_stage
default tablespace awr
temporary tablespace temp;
grant connect to awr_stage;
alter user awr_stage quota unlimited on awr;
alter user awr_stage temporary tablespace temp;
-- load data
begin
dbms_swrf_internal.awr_load(schname => 'AWR_STAGE',
dmpfile => 'my_awr_dump', -- file w/o .dmp extension
dmpdir => 'AWR_DMP');
end;
/
-- change dbid
@awr_change_dbid
commit;
-- move data
def schema_name='AWR_STAGE'
select '&schema_name' from dual;
variable schname varchar2(30);
begin
:schname := '&schema_name';
dbms_swrf_internal.move_to_awr(schname => :schname);
end;
/
-- verify the new DBID in repository
col host_name for a30
select distinct dbid, db_name, instance_name, host_name from
dba_hist_database_instance;
W-ASH : Web Enabled ASH
DB Optimizer 31 Oct 2011, 11:04 pm CET
I’m excited about the ease of creating web enabled rich user applications given the state of technology now. JavaScript and JQuery have gone from being disdained as “not a very serious” language to moving towards the lime light of front and center.
Here is a small example.
Download the following file: W-ASH (web enabled ASH)
Go to your apache web server root, in my case on redhat Linux is
# cd /usr/local/apache2 # gzip -d wash.tar.gz # tar xvf wash.tar -rwxr-xr-x 21956 14:08:21 cash.sh -rw-r--r-- 30881 11:52:10 htdocs/ash.html drwxr-xr-x 0 15:40:52 htdocs/js/ -rwxr-xr-x 10958 14:04:42 cgi-bin/json_ash.sh
Now you are almost ready to go. You just need to start the data collection with “cash.sh” (collect ASH)
./cash.sh Usage: usage <username> <password> <host> [sid] [port]
The script “cash.sh” requires “sqlplus” be in the path and that is all. It’s probably easiest to move/copy cash.sh to an ORACLE_HOME/bin, su – oracle, and then kick it off as in:
nohup cash.sh system change_on_install 172.16.100.250 orcl &
The script “cash.sh” will create a directory in /tmp/MONITOR/day_of_the_week for each day of the week, clearing out any old files, so there are only maximum 7 days of data.
To view the data go to your web server address and add “ash.html?q=machine:sid” For example my web server is on 172.16.100.250 The database I am monitoring is on host 172.16.100.250 with Oracle SID “orcl”
http://172.16.100.250/ash.html?q=172.16.100.250:orcl
See video at : http://screencast.com/t/sZrFxZkTrmn
NoCOUG’s 100th Conference Nov 9th!
DB Optimizer 25 Oct 2011, 7:55 am CEST
Download the 100th issue of the NoCOUG Journal at http://bit.ly/rC2gRA.
Can you believe it? 100 conferences! The Northern California Oracle Users Group
(NoCOUG) is celebrating its 25th anniversary in 2011 with its
100th conference. The upcoming conference at the Computer History Museum on
November 9 is NoCOUG’s 100th quarterly conference and the upcoming
issue of the NoCOUG Journal will be the 100th issue.
NoCOUG Conference 100 will feature some of the world’s top Oracle
experts and speakers including Steven Feuerstein
, Craig Shallahamer
, Alex Gorbachev
, and Kyle Hailey
. It will be held at the Computer History Museum in Mountain
View—a fitting location for such an occasion. The museum features
marvelous computing artifacts such as a Hollerith Tabulating
Machine and an actual operational Babbage Difference
Engine—one of two that have been constructed in the past
decade.
The 100th issue of the NoCOUG Journal will feature an interview
with Michael
Stonebraker—the high priest of relational databases, the
research paper by Dr. Edgar Codd
that started the relational revolution in 1970—A Shared Model of
Data for Large Shared Data Banks, and the results of the Second
International NoCOUG SQL Challenge.
Because of the significance of the occasion and the venue NoCOUG is
expecting serveral hundred people that this conference will quickly
reach maximum capacity, so RSVP now here: http://www.nocoug.org/rsvp.html
RSVP by October 31 at http://www.nocoug.org/next.
From
to

Tuning Blog Entries
DB Optimizer 20 Oct 2011, 7:11 pm CEST
Organizing some of my past blog entries on Oracle database performance tuning and analysis. I’ll add more notes as well as a separate section on SQL tuning in particular.
ASH
- SQL execution times from ASH – using ASH to see SQL execution times and execution time variations
- AAS on AWR – my favorite ASH query that shows AAS wait classes as an ascii graph
- CPU Wait vs CPU Usage
AWR
- Wait Metrics vs v$system_event
- Statistic Metrics verses v$sysstat
- I/O latency fluctuations
- I/O wait histograms
- AWR mining
- Diff’ing AWR reports
- Importing AWR repositories
Redo
Etc
SQL
| More |
























