今天本来想再验证并学习一下崖山数据库,结果我重启集群之后,发现无法启动了。
[yashan@yashandb1 ~]$ yasboot cluster stop -c yashantest
type | uuid | name | hostid | index | status | return_code | progress | cost
------------------------------------------------------------------------------------------------------------
task | 0c25ec3126e3174f | StopYasdbCluster | - | yashantest | SUCCESS | 0 | 100 | 1
------+------------------+------------------+--------+------------+---------+-------------+----------+------
task completed, status: SUCCESS
[yashan@yashandb1 ~]$
[yashan@yashandb1 ~]$
[yashan@yashandb1 ~]$
[yashan@yashandb1 ~]$ yasboot cluster start -c yashantest
type | uuid | name | hostid | index | status | return_code | progress | cost
------------------------------------------------------------------------------------------------------------
task | 666129e032c94bb9 | StartYasdbCluster | - | yashantest | FAILED | 1 | 100 | 14
------+------------------+-------------------+--------+------------+--------+-------------+----------+------
task completed, status: FAILED
retcode: 1
stdout: node:1-1 start with ping failed
stderr: wait node 1-1 process start failed: Starting instance open
failed to open file /home/yashan/yasdb_home/yashandb/data/yashantest/db-1-1/archive/arch_0_1.ARC, errno 2, error message "No such file or directory"
Failed to start instance
[yashan@yashandb1 ~]$
为什么我在启动数据库的时候,会去找归档呢?难道是需要恢复,我想应该是这样理解。那么我先看下归档是否存在:
[root@yashandb1 archive]# pwd
/home/yashan/yasdb_home/yashandb/data/yashantest/db-1-1/archive
[root@yashandb1 archive]# ls -ltr
total 18287600
-rw-r----- 1 yashan yashan 535392256 Nov 21 22:23 arch_0_76.ARC
-rw-r----- 1 yashan yashan 536846336 Nov 21 22:23 arch_0_77.ARC
-rw-r----- 1 yashan yashan 536289280 Nov 21 22:23 arch_0_78.ARC
-rw-r----- 1 yashan yashan 533561344 Nov 21 22:24 arch_0_79.ARC
-rw-r----- 1 yashan yashan 533712896 Nov 21 22:24 arch_0_80.ARC
-rw-r----- 1 yashan yashan 534581248 Nov 21 22:24 arch_0_81.ARC
-rw-r----- 1 yashan yashan 535502848 Nov 21 22:24 arch_0_82.ARC
-rw-r----- 1 yashan yashan 536870912 Nov 22 02:01 arch_0_83.ARC
-rw-r----- 1 yashan yashan 532779008 Nov 22 08:47 arch_0_84.ARC
。。。。。。
-rw-r----- 1 yashan yashan 534536192 Nov 22 08:58 arch_0_104.ARC
-rw-r----- 1 yashan yashan 532910080 Nov 22 08:59 arch_0_105.ARC
-rw-r----- 1 yashan yashan 535052288 Nov 22 08:59 arch_0_106.ARC
-rw-r----- 1 yashan yashan 535629824 Nov 22 08:59 arch_0_107.ARC
-rw-r----- 1 yashan yashan 536842240 Nov 24 02:00 arch_0_108.ARC
-rw-r----- 1 yashan yashan 536870912 Nov 27 02:00 arch_0_109.ARC
-rw-r----- 1 yashan yashan 536870912 Nov 30 02:00 arch_0_110.ARC
[root@yashandb1 archive]#
可以看到76号之前的归档都已经不在了,可能是之前空间不足,我删掉了.我们接下来看下数据库启动的情况,为什么报错,在哪里报错了?
[yashan@yashandb1 run]$ pwd
/home/yashan/yasdb_home/yashandb/23.1.1.100/log/yashantest/db-1-1/run
[yashan@yashandb1 run]$ tail -100 run.log
024-02-02 01:34:13.064 32066 [INFO] [DATABASE] start database shutdown phase1
2024-02-02 01:34:13.159 32066 [INFO] [ARCH] arch data manager destroy success
2024-02-02 01:34:13.165 32066 [INFO] [DATABASE] database shutdown phase1 completed
2024-02-02 01:34:13.169 32066 [INFO] [ELECTION] stop election
2024-02-02 01:34:13.174 32066 [INFO] [DATABASE] replication module shutdown
2024-02-02 01:34:13.277 32066 [INFO] [HA] stop sender threads
2024-02-02 01:34:13.284 32066 [INFO] [HA] stop redo receiver 0
2024-02-02 01:34:13.289 32077 [INFO] [HA] redo receiver thread exited
2024-02-02 01:34:13.299 32066 [INFO] [SCF] scfManager destroy succeed
2024-02-02 01:34:13.333 32082 [INFO] [ARCH] archive process exited
2024-02-02 01:34:13.340 32066 [INFO] [ARCH] archive manager destroy success
2024-02-02 01:34:13.347 32066 [INFO] [REDO] redo manager destroy success
2024-02-02 01:34:13.353 32066 [INFO] [SAVEPOINT] savepoint pool destroy success
2024-02-02 01:34:13.360 32066 [INFO] [XA] XA manager destroy success
2024-02-02 01:34:13.367 32066 [INFO] [XACT] XACT manager destroy success
2024-02-02 01:34:13.381 32066 [INFO] [TABLESPACE] tablespace manager destroy success
2024-02-02 01:34:13.388 32066 [INFO] [CTRL] ctrl manager destroy success
2024-02-02 01:34:13.396 32066 [INFO] [COAST] coast manager destroy success
2024-02-02 01:34:13.404 32066 [INFO] [ALLOCATOR] memory allocator destroy success
2024-02-02 01:34:13.410 32066 [INFO] [DB] kernel shutdown successfully
2024-02-02 01:34:13.418 32066 [INFO] [TIMER] timer stop success
2024-02-02 01:34:13.418 32066 [INFO] [PARAM] param context destroy success
2024-02-02 01:34:13.418 32066 [INFO] [SESSION_POOL] session pool destroy success
2024-02-02 01:34:40.493 32109 [INFO] [DB] start kernel
2024-02-02 01:34:40.721 32109 [INFO] [PARAM] the number of synchronous standby in QUORUM_SYNC_STANDBYS and REQUIRED_SYNC_STANDBYS is 0 and 0
2024-02-02 01:34:40.725 32109 [INFO] [HA] start redo receiver
2024-02-02 01:34:40.732 32120 [INFO] [HA] redo receiver thread started
2024-02-02 01:34:40.967 32109 [INFO] [CTRL] start mount type: normal
2024-02-02 01:34:41.073 32109 [INFO] [DB] the database protection mode is MAXIMUM PERFORMANCE
2024-02-02 01:34:41.075 32109 [INFO] [REDO] load the log file for instance 0
2024-02-02 01:34:41.076 32109 [INFO] [REDO] database scn 506464592027127808, lsn 1449649, flush point 0-111-117425-597041, reset point 0-0-0-0, space change lsn 837038
2024-02-02 01:34:41.079 32124 [INFO] [ARCH] archive process start
2024-02-02 01:34:41.089 32109 [ERROR][errno=00313]: failed to open file /home/yashan/yasdb_home/yashandb/data/yashantest/db-1-1/archive/arch_0_1.ARC, errno 2, error message "No such file or directory"
2024-02-02 01:34:41.093 32109 [INFO] [DATABASE] shutdown database, mode: shutdown immediate
我本以为崖山很像Oracle,那么是否可以尝试recover一把,然后打开数据库呢?然而我发现我错了。由于崖山数据库启动也分为:nomount、mount、open 3个阶段。我发现居然不能mount?这是什么鬼?
[yashan@yashandb1 ~]$ yasql sys/Enmo#123
YashanDB SQL Personal Edition Release 23.1.1.100 x86_64
Connected to:
YashanDB Server Personal Edition Release 23.1.1.100 x86_64 - X86 64bit Linux
SQL>
[yashan@yashandb1 ~]$ yasql sys/Enmo#123
YashanDB SQL Personal Edition Release 23.1.1.100 x86_64
Connected to:
YashanDB Server Personal Edition Release 23.1.1.100 x86_64 - X86 64bit Linux
SQL> recover database;
YAS-02037 the database must be mounted and not open
SQL> alter database mount;
YAS-00313 failed to open file /home/yashan/yasdb_home/yashandb/data/yashantest/db-1-1/archive/arch_0_1.ARC, errno 2, error message "No such file or directory"
SQL> Starting instance nomount
Instance started
大家可以清楚的看,数据库此时只能nomount,无法mount。这与Oracle完全是不同的。Oracle数据库在mount阶段只要是spfile、controlfile是完整的就行,并不会去读归档或者redo文件,也就是说在mount阶段,你归档或者redo损坏也不会影响mount过程。然而,崖山并不是这样。
本来我想尝试将不存在的arch log从控制文件中删除,通过dd覆盖掉,发现启动报错controlfile无效。看来是我操作的姿势不太对。
[yashan@yashandb1 config]$ dd if=/tmp/ctrl.dd01 of=/home/yashan/yasdb_home/yashandb/data/yashantest/db-1-1/dbfiles/ctrl1 bs=1 seek=2228224 count=8160 conv=notrunc
8160+0 records in
8160+0 records out
8160 bytes (8.2 kB) copied, 0.0590542 s, 138 kB/s
[yashan@yashandb1 config]$
[yashan@yashandb1 config]$
[yashan@yashandb1 config]$ dd if=/tmp/ctrl.dd01 of=/home/yashan/yasdb_home/yashandb/data/yashantest/db-1-1/dbfiles/ctrl1 bs=1 seek=2236384 count=8160 conv=notrunc
8160+0 records in
8160+0 records out
8160 bytes (8.2 kB) copied, 0.0622431 s, 131 kB/s
[yashan@yashandb1 config]$ dd if=/tmp/ctrl.dd01 of=/home/yashan/yasdb_home/yashandb/data/yashantest/db-1-1/dbfiles/ctrl1 bs=1 seek=2244544 count=8160 conv=notrunc
8160+0 records in
8160+0 records out
8160 bytes (8.2 kB) copied, 0.0546975 s, 149 kB/s
[yashan@yashandb1 config]$
[yashan@yashandb1 data]$ dd if=/tmp/ctrl.dd02 of=/home/yashan/yasdb_home/yashandb/data/yashantest/db-1-1/dbfiles/ctrl1 bs=1 seek=2252704 count=880 conv=notrunc
880+0 records in
880+0 records out
880 bytes (880 B) copied, 0.00631975 s, 139 kB/s
[yashan@yashandb1 data]$ cp /home/yashan/yasdb_home/yashandb/data/yashantest/db-1-1/dbfiles/ctrl1 /home/yashan/yasdb_home/yashandb/data/yashantest/db-1-1/dbfiles/ctrl2
[yashan@yashandb1 data]$ cp /home/yashan/yasdb_home/yashandb/data/yashantest/db-1-1/dbfiles/ctrl1 /home/yashan/yasdb_home/yashandb/data/yashantest/db-1-1/dbfiles/ctrl3
[yashan@yashandb1 data]$
[yashan@yashandb1 config]$ Starting instance mount
the control files are invalid
Failed to start instance
[1]+ Exit 255 yasdb mount
在测试过程中,发现崖山对于控制文件数量是有要求的,最低要求2个,否则无法启动,这还是蛮有意思的。
[yashan@yashandb1 config]$ yasdb mount &
[1] 18269
[yashan@yashandb1 config]$ the value of parameter CONTROL_FILES is invalid, at least two control files are required
Failed to start instance
既然如此那我们就重新找一台环境重建一套环境来简单测一下。一探究竟!
通过alter system switch logfile我们先来观察一下run.log(类似Oracle alert log).
2024-02-02 16:25:10.674 28369 [INFO] [HA] stop redo receiver 0
2024-02-03 02:00:10.054 24342 [INFO] [TABLESPACE] try to update instance 0 's space change lsn from 3330 to 13734
2024-02-03 02:00:19.112 24342 [INFO] [TABLESPACE] try to update instance 0 's space change lsn from 13734 to 22160
2024-02-03 11:24:02.288 19253 [INFO] [REDO] instance: 0, switch redo target asn: 4294967295, current asn: 2
2024-02-03 11:24:02.289 19253 [INFO] [REDO] instance 0, switch redo file, type 1, point 0-2-30628-25408
2024-02-03 11:24:02.319 19253 [INFO] [REDO] switch redo file, new asn 3
2024-02-03 11:24:03.481 28371 [INFO] [ARCH] instance: 0, itemCounts: 2, add archive file filename /data/yashan/yasdb_data/db-1-1/archive/arch_0_2.ARC, asn: 2, ctrlId: 1, used: 1
2024-02-03 11:24:03.494 28371 [INFO] [ARCH] add new archive file /data/yashan/yasdb_data/db-1-1/archive/arch_0_2.ARC
2024-02-03 11:24:12.107 19253 [INFO] [REDO] instance: 0, switch redo target asn: 4294967295, current asn: 3
2024-02-03 11:24:12.108 19253 [INFO] [REDO] instance 0, switch redo file, type 1, point 0-3-1-25409
2024-02-03 11:24:12.144 19253 [INFO] [REDO] switch redo file, new asn 4
2024-02-03 11:24:12.516 28371 [INFO] [ARCH] instance: 0, itemCounts: 3, add archive file filename /data/yashan/yasdb_data/db-1-1/archive/arch_0_3.ARC, asn: 3, ctrlId: 2, used: 1
2024-02-03 11:24:12.532 28371 [INFO] [ARCH] add new archive file /data/yashan/yasdb_data/db-1-1/archive/arch_0_3.ARC
这里用一个新库,本想尝试去dump一下控制文件,发现崖山数据库目前根本不支持,只支持dump datafile以及logfile等。从上面内容可以看到有几个关键的结构,比如scn,asn,lsn等,下面进一步观察一下。
SQL> select CURRENT_SCN,BLOCK_SIZE,OPEN_MODE,RCY_POINT,FLUSH_POINT,RESET_POINT from v$database;
CURRENT_SCN BLOCK_SIZE OPEN_MODE RCY_POINT FLUSH_POINT RESET_POINT
--------------------- ------------ ----------------- --------------------------------- --------------------------------- ---------------------------------
528770185152032768 8192 READ_WRITE 0-6-1-25418 0-6-1-25418 0-0-0-0
1 row fetched.
SQL> select * from v$Logfile;
THREAD# ID NAME BLOCK_SIZE BLOCK_COUNT USED_BLOCKS SEQUENCE# STATUS
------- ------------ ---------------------------------------------------------------- ------------ ------------ ------------ ------------ ---------
1 0 /data/yashan/yasdb_data/db-1-1/dbfiles/redo1 4096 32768 2 5 INACTIVE
1 1 /data/yashan/yasdb_data/db-1-1/dbfiles/redo2 4096 32768 1 6 CURRENT
1 2 /data/yashan/yasdb_data/db-1-1/dbfiles/redo3 4096 32768 2 3 INACTIVE
1 3 /data/yashan/yasdb_data/db-1-1/dbfiles/redo4 4096 32768 67 4 INACTIVE
4 rows fetched.
SQL> alter system switch logfile;
Succeed.
SQL> select * from v$Logfile;
THREAD# ID NAME BLOCK_SIZE BLOCK_COUNT USED_BLOCKS SEQUENCE# STATUS
------- ------------ ---------------------------------------------------------------- ------------ ------------ ------------ ------------ ---------
1 0 /data/yashan/yasdb_data/db-1-1/dbfiles/redo1 4096 32768 2 5 INACTIVE
1 1 /data/yashan/yasdb_data/db-1-1/dbfiles/redo2 4096 32768 2 6 ACTIVE
1 2 /data/yashan/yasdb_data/db-1-1/dbfiles/redo3 4096 32768 1 7 CURRENT
1 3 /data/yashan/yasdb_data/db-1-1/dbfiles/redo4 4096 32768 67 4 INACTIVE
4 rows fetched.
SQL> select CURRENT_SCN,BLOCK_SIZE,OPEN_MODE,RCY_POINT,FLUSH_POINT,RESET_POINT from v$database;
CURRENT_SCN BLOCK_SIZE OPEN_MODE RCY_POINT FLUSH_POINT RESET_POINT
--------------------- ------------ ----------------- --------------------------------- --------------------------------- ---------------------------------
528770185152032768 8192 READ_WRITE 0-6-1-25418 0-7-1-25419 0-0-0-0
1 row fetched.
SQL> alter system switch logfile;
Succeed.
SQL> select CURRENT_SCN,BLOCK_SIZE,OPEN_MODE,RCY_POINT,FLUSH_POINT,RESET_POINT from v$database;
CURRENT_SCN BLOCK_SIZE OPEN_MODE RCY_POINT FLUSH_POINT RESET_POINT
--------------------- ------------ ----------------- --------------------------------- --------------------------------- ---------------------------------
528770185152032768 8192 READ_WRITE 0-6-1-25418 0-8-1-25420 0-0-0-0
1 row fetched.
SQL> select CURRENT_SCN,BLOCK_SIZE,OPEN_MODE,RCY_POINT,FLUSH_POINT,RESET_POINT from v$database;
CURRENT_SCN BLOCK_SIZE OPEN_MODE RCY_POINT FLUSH_POINT RESET_POINT
--------------------- ------------ ----------------- --------------------------------- --------------------------------- ---------------------------------
528770185152032768 8192 READ_WRITE 0-6-1-25418 0-8-1-25420 0-0-0-0
1 row fetched.
SQL> alter system switch logfile;
Succeed.
SQL> select * from v$Logfile;
THREAD# ID NAME BLOCK_SIZE BLOCK_COUNT USED_BLOCKS SEQUENCE# STATUS
------- ------------ ---------------------------------------------------------------- ------------ ------------ ------------ ------------ ---------
1 0 /data/yashan/yasdb_data/db-1-1/dbfiles/redo1 4096 32768 1 9 CURRENT
1 1 /data/yashan/yasdb_data/db-1-1/dbfiles/redo2 4096 32768 2 6 ACTIVE
1 2 /data/yashan/yasdb_data/db-1-1/dbfiles/redo3 4096 32768 2 7 ACTIVE
1 3 /data/yashan/yasdb_data/db-1-1/dbfiles/redo4 4096 32768 2 8 ACTIVE
4 rows fetched.
SQL> select CURRENT_SCN,BLOCK_SIZE,OPEN_MODE,RCY_POINT,FLUSH_POINT,RESET_POINT from v$database;
CURRENT_SCN BLOCK_SIZE OPEN_MODE RCY_POINT FLUSH_POINT RESET_POINT
--------------------- ------------ ----------------- --------------------------------- --------------------------------- ---------------------------------
528770185152032768 8192 READ_WRITE 0-6-1-25418 0-9-1-25421 0-0-0-0
1 row fetched.
SQL> select * from v$Logfile;
THREAD# ID NAME BLOCK_SIZE BLOCK_COUNT USED_BLOCKS SEQUENCE# STATUS
------- ------------ ---------------------------------------------------------------- ------------ ------------ ------------ ------------ ---------
1 0 /data/yashan/yasdb_data/db-1-1/dbfiles/redo1 4096 32768 1 9 CURRENT
1 1 /data/yashan/yasdb_data/db-1-1/dbfiles/redo2 4096 32768 2 6 ACTIVE
1 2 /data/yashan/yasdb_data/db-1-1/dbfiles/redo3 4096 32768 2 7 ACTIVE
1 3 /data/yashan/yasdb_data/db-1-1/dbfiles/redo4 4096 32768 2 8 ACTIVE
4 rows fetched.
SQL> alter system checkpoint;
Succeed.
SQL> select * from v$Logfile;
THREAD# ID NAME BLOCK_SIZE BLOCK_COUNT USED_BLOCKS SEQUENCE# STATUS
------- ------------ ---------------------------------------------------------------- ------------ ------------ ------------ ------------ ---------
1 0 /data/yashan/yasdb_data/db-1-1/dbfiles/redo1 4096 32768 1 9 CURRENT
1 1 /data/yashan/yasdb_data/db-1-1/dbfiles/redo2 4096 32768 2 6 INACTIVE
1 2 /data/yashan/yasdb_data/db-1-1/dbfiles/redo3 4096 32768 2 7 INACTIVE
1 3 /data/yashan/yasdb_data/db-1-1/dbfiles/redo4 4096 32768 2 8 INACTIVE
4 rows fetched.
SQL> select CURRENT_SCN,BLOCK_SIZE,OPEN_MODE,RCY_POINT,FLUSH_POINT,RESET_POINT from v$database;
CURRENT_SCN BLOCK_SIZE OPEN_MODE RCY_POINT FLUSH_POINT RESET_POINT
--------------------- ------------ ----------------- --------------------------------- --------------------------------- ---------------------------------
528770185152032768 8192 READ_WRITE 0-9-1-25421 0-9-1-25421 0-0-0-0
1 row fetched.
SQL> show parameter checkpoint
NAME VALUE
---------------------------------------------------------------- ----------------------------------------------------------------
CHECKPOINT_INTERVAL 100000
CHECKPOINT_TIMEOUT 300
2 rows fetched.
上述有几个非常关键的点,rcy_point,flush_point。其中rcy_point表示数据库重启日志恢复起点,flush_point,表示日志刷盘的点(可以理解为Oracle的checkpoint rba)。我们先来看下v$database中几个关键字段的官方解释:
RCY_POINT VARCHAR(32) 数据库启动时,日志恢复点。 格式{rst}{asn}{blockid}_{lfn}
FLUSH_POINT VARCHAR(32) 数据库当前日志刷盘点。 格式{rst}{asn}{blockid}_{lfn}
rst:为reset id,每次failover后,数据库新产生的redo文件的reset id会加1。
asn: 归档序列号,archive sequence number,每产生一个redo,ASN会加1,每个redo的ASN不相同。
blockid:redo文件内页面所在ID,页面的偏移量为 block id*block size。
lfn:log flush number,日志序列号,每次redo刷盘,LFN加1。
同时我们看到也有崖山也有checkpoint的相关参数,这里我来设置下参数,观察一下。
set CHECKPOINT_TIMEOUT=5; alter system
Succeed.
$database; select CURRENT_SCN,BLOCK_SIZE,OPEN_MODE,RCY_POINT,FLUSH_POINT,RESET_POINT from v
CURRENT_SCN BLOCK_SIZE OPEN_MODE RCY_POINT FLUSH_POINT RESET_POINT
--------------------- ------------ ----------------- --------------------------------- --------------------------------- ---------------------------------
528770185152032768 8192 READ_WRITE 0-9-1-25421 0-9-1-25421 0-0-0-0
1 row fetched.
SQL> select * from v$Logfile;
THREAD# ID NAME BLOCK_SIZE BLOCK_COUNT USED_BLOCKS SEQUENCE# STATUS
------- ------------ ---------------------------------------------------------------- ------------ ------------ ------------ ------------ ---------
1 0 /data/yashan/yasdb_data/db-1-1/dbfiles/redo1 4096 32768 1 9 CURRENT
1 1 /data/yashan/yasdb_data/db-1-1/dbfiles/redo2 4096 32768 2 6 INACTIVE
1 2 /data/yashan/yasdb_data/db-1-1/dbfiles/redo3 4096 32768 2 7 INACTIVE
1 3 /data/yashan/yasdb_data/db-1-1/dbfiles/redo4 4096 32768 2 8 INACTIVE
4 rows fetched.
SQL> !date
Sat Feb 3 12:06:49 CST 2024
!date
Sat Feb 3 12:07:03 CST 2024
$database; select CURRENT_SCN,BLOCK_SIZE,OPEN_MODE,RCY_POINT,FLUSH_POINT,RESET_POINT from v
CURRENT_SCN BLOCK_SIZE OPEN_MODE RCY_POINT FLUSH_POINT RESET_POINT
--------------------- ------------ ----------------- --------------------------------- --------------------------------- ---------------------------------
528770185152032768 8192 READ_WRITE 0-9-1-25421 0-9-1-25421 0-0-0-0
1 row fetched.
SQL>
create table test20240203 as select * from dba_objects;
Succeed.
SQL> insert into test20240203 select * from test20240203;
1356 rows affected.
SQL> commit;
Succeed.
SQL> insert into test20240203 select * from test20240203;
2712 rows affected.
SQL> commit;
Succeed.
SQL> !date
Sat Feb 3 12:08:42 CST 2024
$database; select CURRENT_SCN,BLOCK_SIZE,OPEN_MODE,RCY_POINT,FLUSH_POINT,RESET_POINT from v
CURRENT_SCN BLOCK_SIZE OPEN_MODE RCY_POINT FLUSH_POINT RESET_POINT
--------------------- ------------ ----------------- --------------------------------- --------------------------------- ---------------------------------
528779327979118592 8192 READ_WRITE 0-9-176-25431 0-9-176-25431 0-0-0-0
1 row fetched.
SQL>
看上去崖山的检查点根本没有增量,似乎跟官方文档描述不符(不知道是不是我测试不对)。
这里我本想dump一下controlfile,看看具体结构,发现根本不支持,只支持dump datafile,logfile等,既然如此,那就先dump logfile看看内容吧。从redo logfile等dump 来看,其结构信息也是及其简单。
2024-02-03 12:26:15.408
dump type : logfile blocks
size : 741376bytes
redo head: rstid 0, asn 9, first lfn 25421, next lfn 25421, block size 4096, total blocks 32768, used blocks 1, first scn 528770185152032768, last scn 528770185152032768
checksum 4041231152, initTime 1706932509021519, complete time 1970-01-01 00:00:00, create time 2024-02-02 08:24:59, database id 3980299525, ident 4229072327, slot 0
batch: lfn 25421, block id 1, size 281128, space size 282624, part count 1, scn 528779254599004160
reset id 0, asn 9, last lsn 27902
group: lsn 27898, size 61565, handler's id 22, encrypt: 0
atomic 0, offset 16
ATTACH_BLOCK 5-0-30041 block id: spcBid(0, 30041), space 5, flags 1
UNDO_INIT_BLOCK 5-0-30041 block: 0-30041, prev block: 63-67108863
UNDO_WRITE 5-0-30041 type: XACT XNODE, isApplied: 0, isXslot: 0, isXfirst: 0, isOwscn: 0, dataSize: 24 undo block: 0-0, undo version: 0, undo dir: 0, xsn: 0, xmap: 46-2625, scn: 0
DETACH_BLOCK 5-0-30041 changeNum: 1, changed
ATTACH_BLOCK 5-0-4600 block id: spcBid(0, 4600), space 5, flags 4
XACT_BEGIN 5-0-4600 xmap: 46-2625, undo block: 0-30041, undo version: 0, undo dir: 0
DETACH_BLOCK 5-0-4600 changeNum: 14, changed
ATTACH_BLOCK 5-0-4590 block id: spcBid(0, 4590), space 5, flags 4
UNDO_MODIFY_SEGMENT 5-0-4590 block list: count 1 head 0-30040 tail 0-30040, ufbCount: 6, ufbFirst: 0-30042
DETACH_BLOCK 5-0-4590 changeNum: 955, changed
atomic 1, offset: 154
ATTACH_BLOCK 0-0-812 block id: spcBid(0, 812), space 0, flags 257
HEAP_CLEAN_XSLOT 0-0-812 xslotId: 2, scn: 528488491119570944, fastClean: 0, isOwscn: 0
HEAP_REUSE_XSLOT 0-0-812 xslotId: 3, xid: 46-2625-1, isolevel: 0, ssn: 0, undoRow: block: 0-30041, ver: 0, id: 1
HEAP_INSERT 0-0-812 insert row: size: 80, cols: 11, format: 1, isLink: 0, isMigr: 0
ATTACH_BLOCK 5-0-30041 block id: spcBid(0, 30041), space 5, flags 1
UNDO_WRITE 5-0-30041 type: HEAP INSERT, isApplied: 0, isXslot: 1, isXfirst: 1, isOwscn: 0, dataSize: 28 xslot xid: 0-0-0, scn: 0 dataObjectId: 5, rowId: space: 0, block: 0-812, dir: 3, ssn: 0, u
ndoRow: block: 0-0, ver: 0, id: 0
DETACH_BLOCK 5-0-30041 changeNum: 2, changed
DETACH_BLOCK 0-0-812 changeNum: 9, changed
atomic 2, offset: 370
看到崖山logfile等的结构并不复杂,就是logfile header,然后batch、group以及atomic结构。group 有点像oracle的redo record,而atomic 有点类似oracle redo record change vector。
到最后,我们再来回过头看之前一号测试机的崖山数据库为啥重启就起不来了 呢?
看上去就是检查点长期未更新,也就是控制文件未更新,可能是之前系统夯死了。所以导致启动的时候需要招很早之前的日志。 实际上我发现崖山默认会清理归档,只会保留最近100个(有个参数控制)。
其次我查询v$datafile发现类似跟oracle类似,但是没有run log相关的检查点信息,这跟Oracle不一样。Oracle的v$datafile等记录是来自控制文件的。所以要去了解崖山controlfile结构还不容易。说到这里,似乎达梦这点一点不错,起码提供了一个工具可以去查看和修改controlfile。
根据前面的一些测试,简单总结一下:
1、崖山数据库文档说有增量检查点,但是我测试似乎发现不太对。
2、崖山默认会自动清理归档,保留100个,有一个参数进行控制。
3、目前暂时不支持dump controlfile,支持dump data file,redo log。
4、从崖山redo log dump来看,其结构与oracle完全不同的。
5、崖山的检查点结构跟oracle也不同,rcy_point和flush_point的结构分一样,分为4个部分,有点类似oracle的checkpoint rba。
6、崖山数据库启动也分为nomount、mount和open 3个阶段;与Oracle一样。但是其中mount阶段的行为与oracle不同,是需要做相关对象检查的。