【摘要】本文是一篇很实用的技术文章。介绍了逐步实现自动化的基本步骤、需要获取的数据库诊断信息,对实现自动化脚本集成进行了详细说明,并提供了参考脚本。
【作者】赵海
一、逐步实现自动化的基本步骤
1. 手动巡检
最初阶段,多数DBA习惯于将常用的SQL语句总结记录下来,在日常巡检的时候,会在数据库中执行常用的SQL语句,然后以抓屏方式将结果复制出来,然后逐条检查分析。这种方法只适合于管理少量数据库,一旦数据库数量增加,工作量会非常大。
2. 脚本巡检
升级阶段,具备脚本语言开发功底的DBA会逐步将常用的SQL语句编制为固定SQL脚本,然后通过操作系统可执行的脚本语言(例如:KSH、BASH),通过Shell脚本去调用SQL脚本,并把执行结果写入日志文件,后期通过日志文件去进行检查分析。这种方法可管理较多数据库,但是需要频繁切换到不同的数据库服务器,容易误操作,尤其是操作系统平台不统一的场景。
3. 自动化巡检
二、需要获取的数据库诊断信息
1. 基本信息
基本信息根据不同的数据库环境会有所差异,但是有些最基本的信息是必须的。如数据库名称、实例名称、唯一标识信息、系统版本、地址信息、数据库创建时间、数据库启动时间、数据库版本信息、数据库补丁信息、字符集、实例状态、数据库归档模式等。以下脚本(for Oracle)供参考:
SELECT name AS db_name,
dbid,
created,
platform_name,
db_unique_name,
log_mode,
guard_status,
force_logging,
flashback_on
FROM v$database;
SELECT instance_name,
inst_id,
host_name,
TO_CHAR(startup_time, 'YYYY/MM/DD HH24:MI:SS') AS startup_time,
status,
SYSDATE - STARTUP_TIME AS Running_Time
FROM gv$instance
ORDER BY inst_id;
2. 存储信息
存储信息包括逻辑存储信息和物理存储信息,具体包含数据库的表空间、数据文件、磁盘三个维度。表空间需要收集表空间名称,类型,区管理方式,区初始大小,段的管理方式,总大小,剩余大小,剩余百分比;数据文件需要收集对应的表空间,是否自动扩展,当前值,最大值等;磁盘信息需要收集磁盘组的序号、名称、冗余类型、总大小、剩余大小、使用频繁的数据大小、镜像需要的空间、可用于镜像的大小、offline的磁盘数量等。以下脚本(for Oracle)供参考:
SELECT c.tablespace_name,
c.contents,
c.extent_management ext_mgmt,
c.allocation_type alloc_type,
c.initial_extent / 1024 ext_kb,
c.segment_space_management SSM,
nvl(a.total_gb, 0) total_gb,
decode(c.contents, 'TEMPORARY', nvl(a.total_gb, 0) - c.block_size * b.used_blocks / 1073741824, nvl(b.free_gb, 0)) free_gb,
100 * decode(c.contents, 'TEMPORARY', nvl(a.total_gb, 0) - c.block_size * b.used_blocks / 1073741824, nvl(b.free_gb, 0)) / nvl(a.total_gb, 1) free_pct
FROM (SELECT tablespace_name, nvl(sum(bytes) / 1073741824, 0) total_gb
FROM dba_data_files
GROUP BY tablespace_name
UNION all
SELECT tablespace_name, nvl(sum(bytes) / 1073741824, 0) total_gb
FROM dba_temp_files
GROUP BY tablespace_name) a,
(SELECT tablespace_name,
nvl(sum(bytes) / 1073741824, 0) free_gb,
0 used_blocks
FROM dba_free_space
GROUP BY tablespace_name
UNION all
SELECT tablespace_name,
0 free_gb,
nvl(sum(used_blocks), 0) used_blocks
FROM gv$sort_segment
GROUP BY tablespace_name) b,
dba_tablespaces c
WHERE c.tablespace_name = b.tablespace_name(+)
AND c.tablespace_name = a.tablespace_name(+)
ORDER BY c.contents, free_pct, c.tablespace_name;
SELECT file_id,
file_name,
tablespace_name,
autoextensible,
bytes / 1073741824 as current_gb,
maxbytes / 1073741824 as max_gb
FROM dba_data_files
UNION
SELECT file_id,
file_name,
tablespace_name as ts_name,
autoextensible,
bytes / 1073741824 as cur_gb,
maxbytes / 1073741824 as max_gb
FROM dba_temp_files
ORDER BY tablespace_name, file_id, file_name;
SELECT group_number,
name,
type,
total_mb,
free_mb,
hot_used_mb,
required_mirror_free_mb,
usable_file_mb,
offline_disks
FROM v$asm_diskgroup_stat;
3. 日志信息
日志信息主要是对联机日志、快速恢复日志相关动态信息的统计分析。具体包括联机日志的组号、线程号、序列号、文件路径、大小、状态、归档状态、组内成员数量;归档日志产生日期,产生的大小,和文件数等;每天日志切换的量以及切换的频率。以下脚本(for Oracle)供参考:
SELECT t2.member,
t1.group#,
t1.thread#,
t1.sequence#,
t1.bytes / 1024 / 1024 AS SIZE_MB,
t1.status,
t1.archived,
t1.members
FROM v$log t1, v$logfile t2
WHERE t1.group# = t2.group#
ORDER BY thread#, group#;
SELECT lpad(to_char(first_time, 'yyyymmdd'), 12) "DATE",
trunc(sum(blocks * block_size) / 1024 / 1024) "SIZE(MB)",
count(*)
FROM v$archived_log
WHERE first_time > sysdate - 6
AND creator = 'ARCH'
GROUP BY lpad(to_char(first_time, 'yyyymmdd'), 12)
ORDER BY 1;
SELECT SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH:MI:SS'),1,5) Day,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'00',1,0)) H00,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'01',1,0)) H01,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'02',1,0)) H02,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'03',1,0)) H03,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'04',1,0)) H04,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'05',1,0)) H05,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'06',1,0)) H06,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'07',1,0)) H07,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'08',1,0)) H08,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'09',1,0)) H09,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'10',1,0)) H10,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'11',1,0)) H11,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'12',1,0)) H12,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'13',1,0)) H13,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'14',1,0)) H14,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'15',1,0)) H15,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'16',1,0)) H16,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'17',1,0)) H17,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'18',1,0)) H18,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'19',1,0)) H19,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'20',1,0)) H20,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'21',1,0)) H21,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'22',1,0)) H22 ,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'23',1,0)) H23,
COUNT(*) TOTAL
FROM v$log_history a
WHERE first_time>=to_char(sysdate-7)
GROUP BY SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH:MI:SS'),1,5)
ORDER BY SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH:MI:SS'),1,5) DESC;
SELECT thread#,
sequence#,
to_char(first_time, 'MM/DD/RR HH24:MI:SS')
FROM v$log_history
WHERE thread# = 1
AND first_time > sysdate - 1
ORDER BY first_time DESC;
SELECT thread#,
sequence#,
to_char(first_time, 'MM/DD/RR HH24:MI:SS')
FROM v$log_history
WHERE thread# = 2
AND first_time > sysdate - 1
ORDER BY first_time DESC;
4. 备份信息
备份信息主要是查看数据库备份任务的执行情况。主要包括备份任务的会话信息标识,备份内容,开始时间,结束时间,消耗时间。以下脚本(for Oracle)供参考:
SELECT session_key,
start_time,
end_time,
status,
time_taken_display tt
FROM v$rman_backup_job_details
WHERE start_time > sysdate -1
ORDER BY session_key;
5. 性能信息
数据库性能诊断分析的基本目标分两个方面,一方面要看数据库本身跟性能有关的资源使用情况是否正常,另外一方面就是要抓取异常的会话、事件、SQL等。因此,这部分内容首先要收集数据库缓存使用情况,然后要收集异常会话以及相关进程的资源参数,24小时内CPU等待最长的事件,执行解析最多的SQL语句等。以下脚本(for Oracle)供参考:
SELECT free_space,
avg_free_size,
used_space,
avg_used_size,
request_failures,
last_failure_size
FROM v$shared_pool_reserved;
SELECT *
FROM (SELECT t.sid,
t.serial#,
trunc(sysdate - logon_time) AS online_time,
t.PROGRAM,
t.status,
t.LOGON_TIME,
t.sql_id,
t.prev_sql_id,
t.event
FROM gv$session t
WHERE t.type <> 'BACKGROUND' AND program is not null
ORDER BY logon_time)
WHERE rownum <= 30;
SELECT *
FROM gv$resource_limit
WHERE trim(limit_value) != 'UNLIMITED';
SELECT t.*, s.sid, s.serial#, s.machine, s.program, s.osuser
FROM (SELECT c.USERNAME, a.event, to_char(a.cnt) as seconds, a.sql_id, dbms_lob.substr(b.sql_fulltext, 100, 1) sqltext
FROM (SELECT rownum rn, t.*
FROM (SELECT decode(s.session_state, 'WAITING', s.event, 'Cpu + Wait For Cpu') Event, s.sql_id, s.user_id, count(*) cnt
FROM v$active_session_history s
WHERE sample_time > sysdate - 1
GROUP BY s.user_id, decode(s.session_state, 'WAITING', s.event, 'Cpu + Wait For Cpu'), s.sql_id
ORDER BY cnt DESC) t
WHERE rownum < 20) a, v$sqlarea b, dba_users c
WHERE a.sql_id = b.sql_id AND a.user_id = c.user_id
ORDER BY cnt DESC) t, v$session s
WHERE t.sql_id = s.sql_id(+);
SELECT *
FROM (SELECT sql_id, sql_text, s.executions, s.last_load_time, s.first_load_time, s.disk_reads, s.buffer_gets, s.parse_calls
FROM v$sql s
ORDER BY s.executions DESC)
WHERE rownum <= 10;
SELECT *
FROM (SELECT sql_id, sql_text, s.EXECUTIONS, s.LAST_LOAD_TIME, s.FIRST_LOAD_TIME, s.DISK_READS, s.BUFFER_GETS, s.PARSE_CALLS
FROM v$sql s
ORDER BY s.PARSE_CALLS DESC)
WHERE rownum <= 10;
6. 容灾信息
所谓容灾信息就是指数据库有容灾配置模式的场景,比如Oracle的Data Guard。如果有相关的配置,则需要检查主备库的同步是否异常,主要通过归档日志的同步信息来判断分析。这个时候需要获取归档目标名称、状态、数据库当前模式、目的地路径等静态信息,需要获取每个节点应用日志和归档日志的最大号,SCN最大值和最小值等相关信息。以下脚本(for Oracle)供参考:
SELECT dest_name,
status,
database_mode,
destination
FROM v$archive_dest_status
WHERE dest_id in ('1','2');
SELECT m.thread#,
m.sequence#,
first_change#,
next_change#
FROM v$log_history m,
(SELECT thread#, max(sequence#) as sequence#
FROM v$log_history
GROUP BY thread#) t
WHERE m.thread# = t.thread#
AND m.sequence# = t.sequence#;
SELECT UNIQUE thread# AS thread,
MAX(sequence#) OVER (PARTITION BY thread#) AS last
FROM v$archived_log;
三、实现自动化脚本集成
1. 将基本SQL文转换成可用SQL脚本
通过手动执行SQL命令的方式,查询结果可以展现在屏幕上。但是以脚本后台模式执行的时候,就需要将查询结果格式化之后输入到结果文件当中。可以采用文本文件或者HTML文件。采用文件文件的时候,需要对输出结果进行美观易读方面的格式化,想再升级为HTML文件时,SQL脚本中要加HTML头。例如以下是格式化之前和之后的对比:
PROMPT <p><h1 class="awr">XX公司数据库巡检报告</h1>
PROMPT <p><a class="awr">数据库基本信息 </a>
COLUMN log_mode FOR a12
COLUMN guard_status FOR a10
COLUMN force_logging FOR a15
COLUMN flashback_on FOR a15
COLUMN db_unique_name FOR a10
COLUMN platform_name FOR a20
SELECT name AS db_name,
dbid,
created,
platform_name,
db_unique_name,
log_mode,
guard_status,
force_logging,
flashback_on
FROM v$database;
2. 通过Shell脚本调用SQL脚本
如果单纯通过Shell脚本完成对SQL脚本的调用还是有些单薄。因此Shell脚本的设计需要完成两方面的功能,一方面需要收集操作系统相关的诊断信息,另外一方面要定义SQL脚本执行的初始化参数,完成对SQL脚本的调用执行。对于操作系统级别的诊断信息收集,无非是在Shell当中调用操作系统命令,如:uptime、vmstat、free、df、sar、iostat;对于SQL脚本的调用,需要完善相关初始化信息,如位置信息(脚本位置、输入文件位置、数据库安装目录等相关信息),时间信息(执行日期、时间),连接必要信息(用户、数据库名称、标识等)。以下调用过程脚本片段供参考:
function _dbinfo(){
cd $PWDDIRECTORY
su - oracle -c "cd $LOGDIR/db;export ORACLE_SID=$1;sqlplus -s /nolog <<EOF
conn / as sysdba;
@$LOGDIR/../dailyhealthycheck_withadg.sql
exit;
EOF"
}
3. 通过Python脚本实现集中自动化调用
对于这个步骤实现的功能有两个重点,一个是完成集中化的优势,一个是完成自动化交互的功能。集中化的优势需要在循环语句当中完成。自动化交互的功能本来就是Python之类脚本语言的优势。只是我们在调用的时候需要考虑的初始化参数的定义和确认。以下调用过程脚本片段供参考:
def ssh_cmd(ip, user, passwd, cmd):
ret = -1
ssh = pexpect.spawn('ssh %s@%s "%s"' % (user,ip, cmd),timeout=120)
try:
i = ssh.expect(['password:', 'continue connecting (yes/no)?'], timeout=5)
if i == 0 :
ssh.sendline(passwd)
elif i == 1:
ssh.sendline('yes\n')
ssh.expect('password: ')
ssh.sendline(passwd)
ssh.sendline(cmd)
r = ssh.read()
print (r)
ret = r
except pexpect.EOF:
print ("EOF")
ssh.close()
ret = -1
except pexpect.TIMEOUT:
print ("TIMEOUT")
ssh.close()
ret = -2
return ret
4. 工具化产品化打造过程
大部分数据库管理员设计脚本工具都是为了便于自己的日常管理工作,基本上不会考虑到工具的可维护性、健壮性、规范化以及未来的扩展性。如果想让自己辛苦设计出来的脚本最终可以转化为自动化运维工具,那么还要考虑到以下三个方面的问题:
(1). 遵循开发者规范,将脚本的设计趋向于标准化、模块化。例如脚本注释标准化,利用函数模块化设计,变量定义规范化。
(2). 每一个执行步骤要考虑到执行前的确认步骤,执行后的反馈步骤,执行中的异常处理场景。
四、参考脚本
-- -----------------------------------------------------------------------------------
-- File Name : dailyhealthycheck_withadg_pdb.sql
-- Author : haizdl@126.com
-- Description : Daily Healthy Checking for Oracle Rac Database.
-- Requirements : Access to the V$ views.
-- Call Syntax : by script "healthyCheck_forLinux.sh" or "healthyCheck1_forAIX.sh"
-- Last Modified: 23/05/2017
-- -----------------------------------------------------------------------------------
SET MARKUP HTML ON SPOOL ON ENTMAP OFF PREFORMAT OFF
SET TERM OFF
SET HEADING ON
SET VERIFY OFF
SET FEEDBACK OFF
SET LINE 55555
SET PAGES 999999
SET LONG 999999999
SET LONGCHUNKSIZE 999999
-- -----------------------------------------------------------------------------------
-- SECTION: 巡检脚本初始化
-- -----------------------------------------------------------------------------------
COLUMN dbid new_value spool_dbid
COLUMN inst_num new_value spool_inst_num
SELECT dbid
FROM v$database
WHERE rownum = 1;
SELECT instance_number AS inst_num
FROM v$instance
WHERE rownum = 1;
COLUMN spoolfile_name new_value spoolfile
SELECT 'spool_'||(SELECT instance_name FROM v$instance WHERE rownum=1)||'_'||TO_CHAR(SYSDATE,'yy-mm-dd_hh24.mi')||'_daily' AS spoolfile_name
FROM dual;
spool &&spoolfile..html
PROMPT <p><h1 class="awr">XXX数据库日巡检报告</h1>
-- -----------------------------------------------------------------------------------
-- SECTION: 数据库基本信息
-- -----------------------------------------------------------------------------------
PROMPT <p><h3 class="awr">数据库基本信息汇总</h3>
/*
数据库标识、数据库名、创建日期、平台名称、唯一名称、归档模式等信息。
*/
PROMPT <p><a class="awr">数据库状态 </a>
COLUMN log_mode FOR a12
COLUMN guard_status FOR a10
COLUMN force_logging FOR a15
COLUMN flashback_on FOR a15
COLUMN db_unique_name FOR a10
COLUMN platform_name FOR a20
SELECT name AS db_name,
dbid,
created,
platform_name,
db_unique_name,
log_mode,
guard_status,
force_logging,
flashback_on
FROM v$database;
/*
实例的序号、名称、主机名、启动时间、状态、运行时间等。
*/
PROMPT <p><a class="awr">数据库实例基本信息</a>
CLEAR COLUMNS
SET LINE 200
COLUMN host_name FOR A50
SELECT instance_name,
inst_id,
host_name,
TO_CHAR(startup_time, 'YYYY/MM/DD HH24:MI:SS') AS startup_time,
status,
SYSDATE - STARTUP_TIME AS Running_Time
FROM gv$instance
ORDER BY inst_id;
-- -----------------------------------------------------------------------------------
-- SECTION: 联机重做日志信息
-- -----------------------------------------------------------------------------------
PROMPT <h3 class="awr">redo信息</h3>
/*
联机日志的组号、线程号、序列号、文件路径、大小、状态、归档状态、组内成员数量。
*/
PROMPT <p><a class="awr">数据库联机日志</a>
CLEAR COLUMNS
SET LINE 200
SET PAGES 1000
COL status FOR a30
COL member FOR a45
SELECT t2.member,
t1.group#,
t1.thread#,
t1.sequence#,
t1.bytes / 1024 / 1024 AS SIZE_MB,
t1.status,
t1.archived,
t1.members
FROM v$log t1, v$logfile t2
WHERE t1.group# = t2.group#
ORDER BY thread#, group#;
/*
(即可分析6天的波度,又可分析24小时内,可很容易看出异常情况)。
*/
PROMPT <p><a class="awr">最近7天中每天日志切换的量</a>
SELECT SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH:MI:SS'),1,5) Day,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'00',1,0)) H00,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'01',1,0)) H01,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'02',1,0)) H02,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'03',1,0)) H03,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'04',1,0)) H04,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'05',1,0)) H05,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'06',1,0)) H06,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'07',1,0)) H07,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'08',1,0)) H08,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'09',1,0)) H09,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'10',1,0)) H10,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'11',1,0)) H11,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'12',1,0)) H12,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'13',1,0)) H13,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'14',1,0)) H14,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'15',1,0)) H15,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'16',1,0)) H16,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'17',1,0)) H17,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'18',1,0)) H18,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'19',1,0)) H19,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'20',1,0)) H20,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'21',1,0)) H21,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'22',1,0)) H22 ,
SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'23',1,0)) H23,
COUNT(*) TOTAL
FROM v$log_history a
WHERE first_time>=to_char(sysdate-7)
GROUP BY SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH:MI:SS'),1,5)
ORDER BY SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH:MI:SS'),1,5) DESC;
/*
日志切换频率分析(注意观察各行里first_time之间的时间差异会不会很短,很短就是切换过频繁)。
*/
PROMPT <p><a class="awr">日志切换频率分析</a>
SET LINE 200
SET PAGES 1000
SELECT thread#,
sequence#,
to_char(first_time, 'MM/DD/RR HH24:MI:SS')
FROM v$log_history
WHERE thread# = 1
AND first_time > sysdate - 1
ORDER BY first_time DESC;
SELECT thread#,
sequence#,
to_char(first_time, 'MM/DD/RR HH24:MI:SS')
FROM v$log_history
WHERE thread# = 2
AND first_time > sysdate - 1
ORDER BY first_time DESC;
-- -----------------------------------------------------------------------------------
-- SECTION: 归档日志信息
-- -----------------------------------------------------------------------------------
PROMPT <h3 class="awr">归档日志信息</h3>
/*
归档日志产生日期,产生的大小,和文件数等。
*/
PROMPT <p><a class="awr">查询归档产生状况</a>
SELECT lpad(to_char(first_time, 'yyyymmdd'), 12) "DATE",
trunc(sum(blocks * block_size) / 1024 / 1024) "SIZE(MB)",
count(*)
FROM v$archived_log
WHERE first_time > sysdate - 6
AND creator = 'ARCH'
GROUP BY lpad(to_char(first_time, 'yyyymmdd'), 12)
ORDER BY 1;
-- -----------------------------------------------------------------------------------
-- SECTION: 空间使用
-- -----------------------------------------------------------------------------------
PROMPT <h3 class="awr">空间信息</h3>
/*
表空间名称,类型,区管理方式,区初始大小,段的管理方式,总大小,剩余大小,剩余百分比。
*/
PROMPT <p><a class="awr">数据库表空间信息</a>
CLEAR COLUMNS
COLUMN tablespace_name FOR a20
COLUMN contents FOR a9
COLUMN ext_mgmt FOR a12
COLUMN alloc_type FOR a9
COLUMN ext_kb FOR 9999999
COLUMN ssm FOR a10
COLUMN total_gb FOR 99999999.99
COLUMN free_gb FOR 99999999.99
COLUMN free_pct FOR 999.99
SET PAGES 100
SELECT c.tablespace_name,
c.contents,
c.extent_management ext_mgmt,
c.allocation_type alloc_type,
c.initial_extent / 1024 ext_kb,
c.segment_space_management SSM,
nvl(a.total_gb, 0) total_gb,
decode(c.contents, 'TEMPORARY', nvl(a.total_gb, 0) - c.block_size * b.used_blocks / 1073741824, nvl(b.free_gb, 0)) free_gb,
100 * decode(c.contents, 'TEMPORARY', nvl(a.total_gb, 0) - c.block_size * b.used_blocks / 1073741824, nvl(b.free_gb, 0)) / nvl(a.total_gb, 1) free_pct
FROM (SELECT tablespace_name, nvl(sum(bytes) / 1073741824, 0) total_gb
FROM dba_data_files
GROUP BY tablespace_name
UNION all
SELECT tablespace_name, nvl(sum(bytes) / 1073741824, 0) total_gb
FROM dba_temp_files
GROUP BY tablespace_name) a,
(SELECT tablespace_name,
nvl(sum(bytes) / 1073741824, 0) free_gb,
0 used_blocks
FROM dba_free_space
GROUP BY tablespace_name
UNION all
SELECT tablespace_name,
0 free_gb,
nvl(sum(used_blocks), 0) used_blocks
FROM gv$sort_segment
GROUP BY tablespace_name) b,
dba_tablespaces c
WHERE c.tablespace_name = b.tablespace_name(+)
AND c.tablespace_name = a.tablespace_name(+)
ORDER BY c.contents, free_pct, c.tablespace_name;
PROMPT <p><a class="awr">数据库表空间与数据文件</a>
/*
表空间文件对应的表空间,是否自动扩展,当前值,最大值。
*/
SET LINE 200
SET PAGES 100
COLUMN file_name FOR A65
COLUMN tablespace_name FOR A30
SELECT file_id,
file_name,
tablespace_name,
autoextensible,
bytes / 1073741824 as current_gb,
maxbytes / 1073741824 as max_gb
FROM dba_data_files
UNION
SELECT file_id,
file_name,
tablespace_name as ts_name,
autoextensible,
bytes / 1073741824 as cur_gb,
maxbytes / 1073741824 as max_gb
FROM dba_temp_files
ORDER BY tablespace_name, file_id, file_name;
-- -----------------------------------------------------------------------------------
-- SECTION: 存储信息
-- -----------------------------------------------------------------------------------
PROMPT <h3 class="awr">ASM信息</h3>
/*
磁盘组的序号、名称、冗余类型、总大小、剩余大小、使用频繁的数据大小、镜像需要的空间、可用于镜像的大小、offline的磁盘数量。
*/
PROMPT <p><a class="awr">查询ASM磁盘组信息</a>
CLEAR COLUMNS
COLUMN group_number FOR 9999999999
COLUMN name FOR A20
COLUMN type FOR A20
COLUMN total_mb FOR 9999999999
COLUMN free_mb FOR 9999999999
COLUMN hot_used_mb FOR 9999999999
COLUMN required_mirror_free_mb FOR 9999999999
COLUMN usable_file_mb FOR 9999999999
COLUMN offline_disks FOR 9999999999
SELECT group_number,
name,
type,
total_mb,
free_mb,
hot_used_mb,
required_mirror_free_mb,
usable_file_mb,
offline_disks
FROM v$asm_diskgroup_stat;
/*
磁盘组对应磁盘的信息。
*/
PROMPT <p><a class="awr">查询ASM磁盘组存储信息</a>
CLEAR COLUMNS
COLUMN g_number FOR 9999
COLUMN d_number FOR 9999
COLUMN g_name FOR A8
COLUMN d_name FOR A15
COLUMN g_total_mb FOR 99999999
COLUMN d_total_mb FOR 99999999
COLUMN path FOR A30
COLUMN h_status FOR A8
COLUMN g_free_mb FOR 99999999
COLUMN d_free_mb FOR 99999999
SELECT g.group_number g_number,
g.name g_name,
g.total_mb g_total_mb,
g.free_mb g_free_mb,
g.state g_state,
d.disk_number d_number,
d.name d_name,
d.path,
d.total_mb d_total_mb,
d.free_mb d_free_mb,
d.header_status h_status,
d.mode_status m_status
FROM v$asm_diskgroup_stat g, v$asm_disk_stat d
WHERE g.group_number = d.group_number
ORDER BY g.group_number, d.disk_number;
-- -----------------------------------------------------------------------------------
-- SECTION: 闪回区信息
-- -----------------------------------------------------------------------------------
PROMPT <h3 class="awr">闪回信息</h3>
/*
文件类型、空间使用百分比、可回收百分比、文件的数量。
*/
PROMPT <p><a class="awr">闪回区空间使用状况</a>
CLEAR COLUMNS
SET LINE 100
SELECT *
FROM v$flash_recovery_area_usage;
-- -----------------------------------------------------------------------------------
-- SECTION: 备份信息
-- -----------------------------------------------------------------------------------
PROMPT <h3 class="awr">数据库备份信息</h3>
/*
会话信息标识,备份内容,开始时间,结束时间,消耗时间。
*/
PROMPT <p><a class="awr">查询备份信息</a>
CLEAR COLUMNS
SET PAGES 1000
COLUMN time_taken_display FOR a10
COLUMN start_time FOR a20
COLUMN end_time FOR a20
COLUMN status FOR a15
COLUMN tt FOR a10
SELECT session_key,
start_time,
end_time,
status,
time_taken_display tt
FROM v$rman_backup_job_details
WHERE start_time > sysdate -1
ORDER BY session_key;
-- -----------------------------------------------------------------------------------
-- SECTION: 会话信息
-- -----------------------------------------------------------------------------------
PROMPT <p><h3 class="awr">session情况</h3>
/*
查询登录异常的会话。
*/
PROMPT <p><a class="awr">登录时间最长的会话</a>
SET LINE 500
CLEAR COLUMNS
COLUMN spid for a15
COLUMN sql_id for a20
COLUMN program for a40
COLUMN event for a35
SELECT *
FROM (SELECT t.sid,
t.serial#,
trunc(sysdate - logon_time) AS online_time,
t.PROGRAM,
t.status,
t.LOGON_TIME,
t.sql_id,
t.prev_sql_id,
t.event
FROM gv$session t
WHERE t.type <> 'BACKGROUND' AND program is not null
ORDER BY logon_time)
WHERE rownum <= 30;
PROMPT <p><a class="awr">查看进程资源限制</a>
/*
资源所在的实例,资源的名称,当前大小,上次启动以来的最大值,初始大小
*/
CLEAR COLUMNS
SET PAGES 100
SET LINE 200
COLUMN resource_name FOR a30
COLUMN current_utilization FOR 9999999
COLUMN max_utilization FOR 9999999
COLUMN initial_allocation FOR a18
COLUMN limit_value FOR a15
SELECT *
FROM gv$resource_limit
WHERE trim(limit_value) != 'UNLIMITED';
-- -----------------------------------------------------------------------------------
-- SECTION: 等待事件
-- -----------------------------------------------------------------------------------
prompt <a class="awr">24小时内CPU等待最长的事件</a>
CLEAR COLUMNS
SET PAGES 100
SET LINE 800
COLUMN user_id for a45
COLUMN SQL_ID for a30
COLUMN Event for a45
COLUMN sql_id for a45
COLUMN user_id for a45
COLUMN cnt for a45
COLUMN machine for a45
COLUMN program for a45
SELECT t.*, s.sid, s.serial#, s.machine, s.program, s.osuser
FROM (SELECT c.USERNAME, a.event, to_char(a.cnt) as seconds, a.sql_id, dbms_lob.substr(b.sql_fulltext, 100, 1) sqltext
FROM (SELECT rownum rn, t.*
FROM (SELECT decode(s.session_state, 'WAITING', s.event, 'Cpu + Wait For Cpu') Event, s.sql_id, s.user_id, count(*) cnt
FROM v$active_session_history s
WHERE sample_time > sysdate - 1
GROUP BY s.user_id, decode(s.session_state, 'WAITING', s.event, 'Cpu + Wait For Cpu'), s.sql_id
ORDER BY cnt DESC) t
WHERE rownum < 20) a, v$sqlarea b, dba_users c
WHERE a.sql_id = b.sql_id AND a.user_id = c.user_id
ORDER BY cnt DESC) t, v$session s
WHERE t.sql_id = s.sql_id(+);
-- -----------------------------------------------------------------------------------
-- SECTION: 内存信息
-- -----------------------------------------------------------------------------------
PROMPT <h3 class="awr">04031报错</h3>
PROMPT <p><a class="awr">04031报错(出错时,检查shared_pool_reserved池是否有报错)</a>
SELECT free_space,
avg_free_size,
used_space,
avg_used_size,
request_failures,
last_failure_size
FROM v$shared_pool_reserved;
-- -----------------------------------------------------------------------------------
-- SECTION: SQL信息
-- -----------------------------------------------------------------------------------
PROMPT <h3 class="awr">当前SQL</h3>
PROMPT <p><a class="awr">执行次数最多</a>
SET LINE 300
SELECT *
FROM (SELECT sql_id, sql_text, s.executions, s.last_load_time, s.first_load_time, s.disk_reads, s.buffer_gets, s.parse_calls
FROM v$sql s
ORDER BY s.executions DESC)
WHERE rownum <= 10;
PROMPT <p><a class="awr">解析次数最多</a>
SET LINE 300
SELECT *
FROM (SELECT sql_id, sql_text, s.EXECUTIONS, s.LAST_LOAD_TIME, s.FIRST_LOAD_TIME, s.DISK_READS, s.BUFFER_GETS, s.PARSE_CALLS
FROM v$sql s
ORDER BY s.PARSE_CALLS DESC)
WHERE rownum <= 10;
-- -----------------------------------------------------------------------------------
-- SECTION: ADG
-- -----------------------------------------------------------------------------------
PROMPT <h3 class="awr">ADG信息</h3>
/*
归档目标名称、状态、数据库当前模式、目的地路径
*/
PROMPT <p><a class="awr">归档目的地状态</a>
column dest_name FOR a20
column status FOR a10
column database_mode FOR a20
column destination FOR a20
SELECT dest_name,
status,
database_mode,
destination
FROM v$archive_dest_status
WHERE dest_id in ('1','2');
/*
每个节点,应用日志的最大号,和scn最大值和最小值
*/
PROMPT <p><a class="awr">查询最大的应用日志号</a>
CLEAR COLUMNS
SET LINE 150
SELECT m.thread#,
m.sequence#,
first_change#,
next_change#
FROM v$log_history m,
(SELECT thread#, max(sequence#) as sequence#
FROM v$log_history
GROUP BY thread#) t
WHERE m.thread# = t.thread#
AND m.sequence# = t.sequence#;
/*
查询每个节点最大的归档日志号
*/
PROMPT <p><a class="awr">查询最大的归档日志号</a>
SET LINE 100
SELECT UNIQUE thread# AS thread,
MAX(sequence#) OVER (PARTITION BY thread#) AS last
FROM v$archived_log;
/*
PROMPT <p><a class="awr">查询GAP</a>
备库缺少日志的起始号码
clear columns
set line 200
SELECT THREAD#,LOW_SEQUENCE#,HIGH_SEQUENCE# FROM V$ARCHIVE_GAP;
*/
PROMPT <br><a class="awr" href="#top">Back to Top</a>
PROMPT <p>End of Report</p>
PROMPT <p></body></html>
SPOOL OFF
exit;
#!/bin/bash
#----------------------------------------------------------------------------------------------------+
# HAIZDL@126.COM |
#----------------------------------------------------------------------------------------------------|
# COPYRIGHT (C) 1998-2015 HAIZDL ALL RIGHTS RESERVED. |
#----------------------------------------------------------------------------------------------------|
# DATABASE : ORACLE |
# FILE : healthyCheck1_forLinux.sh |
# CLASS : DATABASE ADMINISTRATION |
# PURPOSE : COLLECTING LOGS FOR ORACLE DATABASE DAILY CHECKING. |
# NOTE : AS WITH ANY CODE, ENSURE TO TEST THIS SCRIPT IN A DEVELOPMENT |
# ENVIRONMENT BEFORE ATTEMPTING TO RUN IT IN PRODUCTION. |
#----------------------------------------------------------------------------------------------------+
#----------------------------------------------
#CHECK JOB OPERATOR USER INFO.
#----------------------------------------------
function _userVerify(){
USER=`whoami`
if [ "$USER" != "root" ]; then
echo "PLS USE ROOT USER FOR THIS JOB."
echo "USAGE: $0"
exit 1
fi
}
#----------------------------------------------
#BASIC VARIABLE.
#----------------------------------------------
HOST=`hostname`
PWDDIRECTORY=/tmp/ora
DATESTRING=`date '+%Y%m%d'`
LOGDIR=$PWDDIRECTORY/$DATESTRING$HOST
ASMSID=+ASM1
#LOG DIRECTORY PARAMETER
OINSTALLDIRECTORY=/home/u01/app
RDBMSDIRECTORY=$OINSTALLDIRECTORY/oracle/diag/rdbms
GRIDLOGDIRECTORY=$OINSTALLDIRECTORY/11.2.0.4/grid/log
GRIDDIRECTORY=$OINSTALLDIRECTORY/grid
ALERTOFCLUSTER=$GRIDLOGDIRECTORY/$HOST/alert$HOST.log
ALERTOFLISTENER=$GRIDDIRECTORY/diag/tnslsnr/$HOST/listener/trace/listener.log
ALERTOFLISTENERSCAN=$GRIDLOGDIRECTORY/diag/tnslsnr/$HOST/listener_scan1/trace/listener_scan1.log
ALERTOFASM=$GRIDDIRECTORY/diag/asm/+asm/$ASMSID/trace/alert_$ASMSID.log
OPATCHORACLEDIR=$OINSTALLDIRECTORY/oracle/product/11.2.0.4/dbhome_1/OPatch
OPATCHGRIDDIR=$OINSTALLDIRECTORY/11.2.0.4/grid/OPatch
#DIAGNOSTIC FILES
LOGOFOSINFO=$LOGDIR/os/info_os_`date '+%y%m%d%H%M'`
LOGOFCLUSTER=$LOGDIR/cluster/info_cluster_`date '+%y%m%d%H%M'`
DBFILE=$LOGDIR/dbs.tmp
#----------------------------------------------
#CREATE LOG INFO DIRECTORY.
#----------------------------------------------
function _initialization(){
if [ -d $LOGDIR ]; then
echo "INIT: LOG DIRECTORY EXISTS, DELETING."
rm -rf $LOGDIR
fi
echo "INIT: LOG DIRECTORY CREATING."
mkdir $LOGDIR
mkdir $LOGDIR/os
mkdir $LOGDIR/cluster
mkdir $LOGDIR/db
chmod -R 777 $LOGDIR
if [ $? -ne 0 ]; then
echo "INIT: LOG DIRECTORY CREATING ERROR!"
exit 1
fi
}
#----------------------------------------------------------------------------------------------------|
# 1. CLUSTER LOG. |
#----------------------------------------------------------------------------------------------------|
function _clusterlog(){
cd $PWDDIRECTORY
if [ -e $ALERTOFCLUSTER ]; then
ALERTOFCLUSTERNUM=$(($(awk '{print NR}' $ALERTOFCLUSTER |tail -n1) - $(cat -n $ALERTOFCLUSTER| grep "`date -d last-day +%Y-%m-%d|cut -c 1-10`" | awk 'BEGIN {min=99999999; max=-1;} {if(min>$1) min=$1; if(max<$1) max=$1;} END {printf("%d", min);}') +1))
if [ $ALERTOFCLUSTERNUM -gt 0 ]; then
tail -$ALERTOFCLUSTERNUM $ALERTOFCLUSTER >>$LOGDIR/cluster/alert$HOST_`date '+%y%m%d'`.log
if [ $? -ne 0 ]; then
echo " JOB-1-1: CLUSTER ALERT LOG COPY ERROR!"
exit 1
fi
fi
fi
if [ -e $ALERTOFLISTENER ]; then
ALERTOFLISTENERNUM=$(($(awk '{print NR}' $ALERTOFLISTENER |tail -n1) - $(cat -n $ALERTOFLISTENER| grep "`date -d last-day "+%d-%^b-%Y"`" | awk 'BEGIN {min=9999999; max=-1;} {if(min>$1) min=$1; if(max<$1) max=$1;} END {printf("%d", min);}') +1))
if [ $ALERTOFLISTENERNUM -gt 0 ]; then
tail -$ALERTOFLISTENERNUM $ALERTOFLISTENER >>$LOGDIR/cluster/alert_listener_$HOST_`date '+%y%m%d'`.log
if [ $? -ne 0 ]; then
echo " JOB-1-2: CLUSTER ALERT LOG COPY ERROR!"
exit 1
fi
fi
fi
if [ -e $ALERTOFLISTENERSCAN ]; then
ALERTOFLISTENERSCANNUM=$(($(awk '{print NR}' $ALERTOFLISTENERSCAN |tail -n1) - $(cat -n $ALERTOFLISTENERSCAN| grep "`date -d last-day "+%d-%^b-%Y"`" | awk 'BEGIN {min=9999999; max=-1;} {if(min>$1) min=$1; if(max<$1) max=$1;} END {printf("%d", min);}') +1))
if [ $ALERTOFLISTENERSCANNUM -gt 0 ]; then
tail -$ALERTOFLISTENERSCANNUM $ALERTOFLISTENERSCAN >>$LOGDIR/cluster/alert_listener_scan1_`date '+%y%m%d'`.log
if [ $? -ne 0 ]; then
echo " JOB-1-3: CLUSTER ALERT LOG COPY ERROR!"
exit 1
fi
fi
fi
if [ -e $ALERTOFASM ]; then
ALERTOFASMNUM=$(($(awk '{print NR}' $ALERTOFASM |tail -n1) - $(cat -n $ALERTOFASM| grep "`date -d last-day|cut -c 1-8``date -d last-day +%d`" | awk 'BEGIN {min=9999999; max=-1;} {if(min>$1) min=$1; if(max<$1) max=$1;} END {printf("%d", min);}') +1))
if [ $ALERTOFASMNUM -gt 0 ]; then
tail -$ALERTOFASMNUM $ALERTOFASM >>$LOGDIR/cluster/alert_$ASMSID_`date '+%y%m%d'`.log
if [ $? -ne 0 ]; then
echo " JOB-1-4: CLUSTER ALERT LOG COPY ERROR!"
exit 1
fi
fi
fi
echo " " >>$LOGOFCLUSTER
echo " " >>$LOGOFCLUSTER
}
#----------------------------------------------------------------------------------------------------|
# 2. OS INFO. |
#----------------------------------------------------------------------------------------------------|
function _osinfo(){
echo "---------------------------------------------------------------" >>$LOGOFOSINFO
echo "- UPTIME:" >>$LOGOFOSINFO
echo "---------------------------------------------------------------" >>$LOGOFOSINFO
uptime >>$LOGOFOSINFO
echo " " >>$LOGOFOSINFO
echo "---------------------------------------------------------------" >>$LOGOFOSINFO
echo "- VMSTAT:" >>$LOGOFOSINFO
echo "---------------------------------------------------------------" >>$LOGOFOSINFO
vmstat 1 5 >>$LOGOFOSINFO
echo " " >>$LOGOFOSINFO
echo "---------------------------------------------------------------" >>$LOGOFOSINFO
echo "- POWERMT:" >>$LOGOFOSINFO
echo "---------------------------------------------------------------" >>$LOGOFOSINFO
powermt display dev=all >>$LOGOFOSINFO
echo " " >>$LOGOFOSINFO
echo "---------------------------------------------------------------" >>$LOGOFOSINFO
echo "- FREE:" >>$LOGOFOSINFO
echo "---------------------------------------------------------------" >>$LOGOFOSINFO
free -m >>$LOGOFOSINFO
echo " " >>$LOGOFOSINFO
echo "---------------------------------------------------------------" >>$LOGOFOSINFO
echo "- DF:" >>$LOGOFOSINFO
echo "---------------------------------------------------------------" >>$LOGOFOSINFO
df -h >>$LOGOFOSINFO
echo " " >>$LOGOFOSINFO
echo "---------------------------------------------------------------" >>$LOGOFOSINFO
echo "- SAR-PAGE:" >>$LOGOFOSINFO
echo "---------------------------------------------------------------" >>$LOGOFOSINFO
sar -B 1 5 >>$LOGOFOSINFO
echo " " >>$LOGOFOSINFO
echo "---------------------------------------------------------------" >>$LOGOFOSINFO
echo "- SAR-CPU:" >>$LOGOFOSINFO
echo "---------------------------------------------------------------" >>$LOGOFOSINFO
sar -u 1 5 >>$LOGOFOSINFO
echo " " >>$LOGOFOSINFO
echo "---------------------------------------------------------------" >>$LOGOFOSINFO
echo "- IOSTAT:" >>$LOGOFOSINFO
echo "---------------------------------------------------------------" >>$LOGOFOSINFO
iostat -mx 1 5 | grep power >>$LOGOFOSINFO
echo " " >>$LOGOFOSINFO
echo "---------------------------------------------------------------" >>$LOGOFOSINFO
echo "- SAR-NETWORK:" >>$LOGOFOSINFO
echo "---------------------------------------------------------------" >>$LOGOFOSINFO
sar -n DEV 1 5 | grep bond >>$LOGOFOSINFO
echo " " >>$LOGOFOSINFO
if [ $? -ne 0 ]; then
echo " JOB-2-1: OS INFO ERROR!"
exit 1
fi
LOGOFSYS=/var/log/messages
if [ -e $LOGOFSYS ]; then
cp $LOGOFSYS $LOGDIR/os
if [ $? -ne 0 ]; then
echo " JOB-2-2: OS INFO ERROR!"
exit 1
fi
fi
}
#----------------------------------------------------------------------------------------------------|
# 3. CLUSTER INFO. |
#----------------------------------------------------------------------------------------------------|
function _clusterinfo(){
echo "---------------------------------------------------------------" >>$LOGOFCLUSTER
echo "- 查看集群资源状态:" >>$LOGOFCLUSTER
echo "---------------------------------------------------------------" >>$LOGOFCLUSTER
echo " " >>$LOGOFCLUSTER
su - grid -c "crsctl status res -t " >>$LOGOFCLUSTER
echo " " >>$LOGOFCLUSTER
echo "---------------------------------------------------------------" >>$LOGOFCLUSTER
echo "- 查看集群服务状态:" >>$LOGOFCLUSTER
echo "---------------------------------------------------------------" >>$LOGOFCLUSTER
echo " " >>$LOGOFCLUSTER
su - grid -c "crsctl check cluster -all" >>$LOGOFCLUSTER
echo " " >>$LOGOFCLUSTER
echo "---------------------------------------------------------------" >>$LOGOFCLUSTER
echo "- 查看集群监听状态:" >>$LOGOFCLUSTER
echo "---------------------------------------------------------------" >>$LOGOFCLUSTER
echo " " >>$LOGOFCLUSTER
echo "#1 本地监听 ">>$LOGOFCLUSTER
su - grid -c "lsnrctl status" >>$LOGOFCLUSTER
echo "#2 SCAN监听 " >>$LOGOFCLUSTER
su - grid -c "lsnrctl status LISTENER_SCAN1" >>$LOGOFCLUSTER
echo " " >>$LOGOFCLUSTER
echo "---------------------------------------------------------------" >>$LOGOFCLUSTER
echo "- 查看磁盘组信息及状态 ">>$LOGOFCLUSTER
echo "---------------------------------------------------------------" >>$LOGOFCLUSTER
echo "- #1 ">>$LOGOFCLUSTER
su - grid -c "asmcmd lsdg" >>$LOGOFCLUSTER
echo "---------------------------------------------------------------" >>$LOGOFCLUSTER
echo "- 查看OCR信息 ">>$LOGOFCLUSTER
echo "---------------------------------------------------------------" >>$LOGOFCLUSTER
echo "- #1 心跳盘 ">>$LOGOFCLUSTER
su - grid -c "crsctl query css votedisk" >>$LOGOFCLUSTER
echo "- #2 一致性检查 " >>$LOGOFCLUSTER
su - grid -c "ocrcheck" >>$LOGOFCLUSTER
echo " " >>$LOGOFCLUSTER
}
#----------------------------------------------------------------------------------------------------|
# 4. DB INFO. |
#----------------------------------------------------------------------------------------------------|
function _dbinfo(){
cd $PWDDIRECTORY
su - oracle -c "cd $LOGDIR/db;export ORACLE_SID=$1;sqlplus -s /nolog <<EOF
conn / as sysdba;
@$LOGDIR/../dailyhealthycheck_withadg.sql
exit;
EOF"
}
#----------------------------------------------------------------------------------------------------|
# 5. DB LOG. |
#----------------------------------------------------------------------------------------------------|
function _dblog(){
cd $PWDDIRECTORY
DBNAME2=$1
DBNAME2=${DBNAME2%?}
ALERTOFDB=$RDBMSDIRECTORY/$DBNAME2/$1/trace/alert_$1.log
if [ -e $ALERTOFDB ]; then
ALERTOFDBNUM=$(($(awk '{print NR}' $ALERTOFDB |tail -n1) - $(cat -n $ALERTOFDB| grep "`date -d last-day|cut -c 1-8``date -d last-day +%d`" | awk 'BEGIN {min=9999999; max=-1;} {if(min>$1) min=$1; if(max<$1) max=$1;} END {printf("%d", min);}') +1))
if [ $ALERTOFDBNUM -gt 0 ]; then
tail -$ALERTOFDBNUM $ALERTOFDB >>$LOGDIR/db/alert_$1_`date '+%y%m%d_%H:%M:%S'`.log
if [ $? -ne 0 ]; then
echo " JOB-5: DB LOG ERROR!"
exit 1
fi
fi
fi
}
#----------------------------------------------------------------------------------------------------|
# 6. RMAN INFO. |
#----------------------------------------------------------------------------------------------------|
function _rmaninfo(){
cd $PWDDIRECTORY
LOGOFRMAN=$LOGDIR/rman_$1-`date '+%y%m%d_%H:%M:%S'`.log
su - oracle -c "export ORACLE_SID=$1;rman target / <<EOF >$LOGOFRMAN
show all;
list backup of database summary;
exit;
EOF"
if [ $? -ne 0 ]; then
echo "JOB-6: RMAN INFO ERROR!"
exit 1
fi
}
#----------------------------------------------------------------------------------------------------|
# 7. AWR. |
#----------------------------------------------------------------------------------------------------|
function _awrinfo(){
cd $PWDDIRECTORY
su - oracle -c "cd $LOGDIR/db;mkdir ./$1AWR;cd $1AWR;export ORACLE_SID=$1;sqlplus -silent /nolog <<EOF
conn /as sysdba
@$LOGDIR/../makeawr.sql;
exit;
EOF"
sed -i "s/PL\/SQL procedure successfully completed.//g" $LOGDIR/$1AWR/awrsql.sql
su - oracle -c "cd $LOGDIR;cd ./$1AWR;export ORACLE_SID=$1;sqlplus -silent /nolog <<EOF
conn /as sysdba
set echo off
@awrsql.sql;
exit;
EOF"
if [ $? -ne 0 ]; then
echo "JOB7-5: AWR ERROR!"
exit 1
fi
}
#----------------------------------------------------------------------------------------------------|
# 8. TAR. |
#----------------------------------------------------------------------------------------------------|
function _logtar(){
cd $PWDDIRECTORY
ls $LOGDIR/db/*.html > $LOGDIR/html.tmp
cat $LOGDIR/html.tmp | while read FNAME
do
sed -i 's/GBK/UTF8/g' $FNAME
done
rm -rf $LOGDIR/html.tmp
rm -f *$HOST.tar.gz
if [ -d $DATESTRING$HOST ]; then
tar -czvf $DATESTRING-$HOST.tar.gz ./$DATESTRING$HOST
rm -rf ./$DATESTRING$HOST
else
echo " JOB-8: TAR ERROR."
fi
}
#----------------------------------------------------------------------------------------------------|
# MAIN(). |
#----------------------------------------------------------------------------------------------------|
echo "========================================================="
echo "= JOB-NAME: $0"
echo "========================================================="
_userVerify
_initialization
echo "----------------------------------------------------"
echo "- JOB-1: CLUSTER LOG."
echo "----------------------------------------------------"
_clusterlog
echo "----------------------------------------------------"
echo "- JOB-2: OS INFO."
echo "----------------------------------------------------"
_osinfo
echo "----------------------------------------------------"
echo "- JOB-3: CLUSTER INFO."
echo "----------------------------------------------------"
_clusterinfo
echo "----------------------------------------------------"
echo "- JOB-(4-7): DB JOBs."
echo "----------------------------------------------------"
ps -ef | grep ora_pmon | grep -v grep | awk '{print $8}' | awk -F "_" '{print $3}' >$DBFILE
if [ -e $DBFILE ]; then
cat $DBFILE | while read DBSID
do
echo "- # JOB-4: DB INFO."
_dbinfo $DBSID
echo "- # JOB-5: DB LOG."
_dblog $DBSID
done
else
echo " JOB-4-7: NO DBSID FILES ERROR."
fi
rm -rf $DBFILE
echo "----------------------------------------------------"
echo "- JOB-8: LOG TAR."
echo "----------------------------------------------------"
_logtar
原题:DBA如何定制自动化巡检工具 如有任何问题,可点击文末阅读原文,到社区提问 觉得本文有用,请转发或点击“在看”,让更多同行看到
资料/文章推荐:
欢迎关注社区 “数据库”技术主题 ,将会不断更新优质资料、文章。地址:https://www.talkwithtrend.com/Channel/597
*本公众号所发布内容仅代表作者观点,不代表社区立场