合 OGG抽取进程与数据字典的关系
Tags: OGG
ogg抽取进程与数据字典的关系
从Oracle Goldengate 11.2开始,oracle引入了一种新的capture mode,称为Integrated Capture Mode。传统的capture mode被称为classic capture mode。
在classic capture mode中,goldengate extract进程直接读取oracle redo log,捕获数据变化,存为Goldengate的trail file格式,然后利用pump进程将这些trail file传输到目标数据库,目标数据库上的replicat进程读取这些trail file,再利用sql将这些变化apply到目标数据库里。
在新的integrated capture mode中,goldengate extract进程不再直接读取oracle redo log,而是通过与数据库log mining server整合来捕获数据变化。log mining server负责以LCR的格式从数据库日志中捕获数据变化,放在stream_pool中;然后extract进程再将这些抓取的数据存成trail file的格式。
与classic capture mode相比,这种integrated capture mode的主要差别就是extract不再直接读取oracle redo log,而交由数据库内部的log mining工具来完成。
由于extract进程是由操作系统来管理的,身处数据库系统之外,integrated capture mode的这种改变所带来的主要好处体现在兼容性方面:支持更多的数据类型和存储类型,以及由于与数据库更紧密的整合,不再需要为Oracle RAC,ASM和TDE作更多额外的配置工作。
以上看起来integrated capture优点多多(不列举),但是缺点呢?
以前配置classic capture,只要归档都在,可以随意指定抽取进程开始的时间或者跳过某些归档(特别是在归档文件不小心丢失的情况下),非常简单方便。比如:
1 2 | alter extract ESB_ZJ4 ,begin 2018-01-18 17:26:00 ALTER EXTRACT finance, EXTSEQNO 26, EXTRBA 0 |
直接到某次integrated capture进程重启后突然提示要一个非常陈旧的归档文件。抽取进程只是暂时了2个小时,这两天的归档全都在啊!尝试指定抽取进程的启动的时间点或者直接begin now都不成功,这意思很可能要重配抽取进程然后begin now,这套ogg还是非常重要的业务同步,丢失2个小时的同步数据,而且不知道是哪些表哪些数据丢了,惨了惨了脑袋缺氧。
后面还是通过重新register/unregister + begin now拉起来了抽取,惨痛的对比和补数。
integrated capture的缺点
在经典抽取中我们经常指定抽取进程启动的时间点或归档位置,只要归档文件都在就可以反复横跳。但这招在integrated capture就不灵了!
第一个缺点:直接alert extract XXXX begin now难了
MOS上(Doc ID 1610114.1)有说明:
If we are to skip redo, then the logminer dictionary will not be in sync. This is even more of the case now (in 12.1.2.0.0) with integrated dictionary.
Even if the IE is altered to BEGIN NOW the capture process still has to go through the intermediate logs before reaching the altered position and the logs has to be physically present. This is how the IE behaves currently if ALTER'ed. To skip the logs/ alter the Extract to "begin now" the capture process has to be unregistered/registered following by alter of the extract with "begin now".
翻译下就是即使alert IE抽取进程 begin now,抽取进程仍然需要这中间的所有日志文件来同步logminer dictionary。如果缺了中间的归档,唯一的办法是unregistered/registered + begin now,也就是重配抽取进程了。
想想begin now的场景往往就是抽取启动时报缺归档时应急处理的下策,现在这条路也走不通了。当然重建抽取进程当然效果也一样,但可麻烦很多。
再翻译下就是:以后IE抽取进程所需的归档丢了找不回,中间丢的数据真没了!IE抽取进程也绝对起不来了!重建抽取进程吧少年!
注意:虽然配置了integrated capture后ogg会自动管理归档文件,避免它被rman误删。但它可挡不住手动rm 或者“DELETE FORCE ARCHIVELOG...”。
第二个缺点:指定integrated capture从旧时间点开始抽取难了
像我前面遇到的问题,归档文件都在啊,在classic capture中我们完全可以重建抽取进程然后指定它从某个时间点开始抽取,数据完全不丢失。
对于integrated capture不是绝对不行,是非常难:
MOS上(Doc ID 2046831.1)有说明:
In order for a new IE to capture transactions earlier than that time, you need to search for a dictionary build that was created earlier than the historical time and register the IE to begin capture at a SCN of a dictionary build.
To find SCN for all previous dictionary builds issue:
SELECT first_change# FROM v$archived_log WHERE dictionary_begin = 'YES' AND standby_dest = 'NO' AND name IS NOT NULL AND status = 'A';
Convert the SCNs displayed to a timestamp that is closest to what you need but earlier, via
SELECT SCN_TO_TIMESTAMP(
Choose the timestamp that is earlier and closest to the time you want to start the new IE, and register the IE as follows:
dblogin userid
register extract myinteg, database scn
Then add IE as usual with a historical time as determined above.
add ext myinteg, integrated tranlog, begin
翻译下:要使用新加的IE进程抽取从一个旧的时间点A开始抽取,必须找到包含dictionary的日志文件(假定它的first_change#是时间点B),则A要晚于B,同时从这个日志文件开始所有日志文件必须都在。比如:
SQL> SELECT first_change# , scn_to_timestamp(first_change#) FROM v$archived_log WHERE dictionary_begin = 'YES' AND standby_dest = 'NO' AND name IS NOT NULL AND status = 'A';
FIRST_CHANGE#
-------------
18975095
19465792
19510832
SQL> SELECT SCN_TO_TIMESTAMP(18975095) FROM DUAL;
SCN_TO_TIMESTAMP(18975095)
--------------------------------------------------
04-AUG-15 02.15.57.000000000 PM --**包含dictionary的日志文件最早时间点2020-08-04 02:15:57**
GGSCI> DBLOGIN USERID c##[email protected]_dbms1 PASSWORD goldengate
GGSCI> MININGDBLOGIN USERID c##ogg PASSWORD goldengate
GGSCI> register extract myinteg, database scn 18975095
GGSCI> add ext myinteg, integrated tranlog, begin 2015-08-04 03:00
--**开始抽取的时间点不能早于2020-08-04 02:15:57。并且从FIRST_CHANGE#=18975095开始的归档文件都要在,不然遇到第一个问题。**
GGSCI> add exttrail dirdat/ie, ext myinteg , megabytes 1024
那包含有dictionary的日志文件哪来的?
A new dictionary is built when you create a new database and generally for normal database operations you do not need to create anymore new dictionaries.
A ggsci> register extract database command creates a new dictionary and can thus be shared if there are existing IEs.
If you regularly creates a dictionary you enable new or existing IEs the ability to capture from a historical timestamp by
sql>execute DBMS_LOGMNR_D.BUILD( options => DBMS_LOGMNR_D.STORE_IN_REDO_LOGS);
翻译下:在做register extract时会将当时的dictionary刷到redo文件中,也可以手动刷新。
总结:如果平时不手动刷新dictionary到redo文件(一般人哪会干这事,估计也没人知道),要指定新建的或现有IE进程从一个过去的时间点开始,不能超过IE进程“register extract的时间点”,同时从“register extract的时间点”所有归档必须都在!
这两个MOS的解释让我产生了更多的疑问:
问题一:classic capture为什么不需要dictionary?
问题二:integrated capture平时启停为什么不需要dictionary(register extract时的归档文件早就删了)?
问题三:为什么指定integrated capture的位置/时间时就需要在这之前的dictionary?
classic capture与数据字典
在classic capture mode中,goldengate extract进程直接读取oracle redo log,捕获数据变化。默认redo条目是不记录表的数据字典信息的。
1 | INSERT INTO HR.JOBS(JOB_ID, JOB_TITLE, MIN_SALARY, MAX_SALARY) VALUES('IT_WT','Technical Writer', 4000, 11000); |
单独从redo或归档中我们只能得到以下信息(明显不够啊):
1 | insert into "UNKNOWN"."OBJ# 45522"("COL 1","COL 2","COL 3","COL 4") values (HEXTORAW('45465f4748'),HEXTORAW('546563686e6963616c20577269746572'),HEXTORAW('c229'),HEXTORAW('c3020b')); |
Extract进程检测到redo条目中存在受支持的操作(是否为要抽取的表名),由于redo中包含对象ID,而不是对象名称。首次遇到object_id时,Extract进程将针对数据字典发出一系列语句以找出其含义,将确定相应的对象名称和类型:
1 2 3 4 | SELECT u.name, o.name, o.dataobj#, o.type#, (SELECT bitand(t.property, 1) FROM sys.tab$ t WHERE t.obj# = &ora_object_id) FROM sys.obj$ o, sys.user$ u WHERE o.obj# = &ora_object_id AND decode(bitand(o.flags, 128), 128, 'YES', 'NO') = 'NO' AND o.owner# = u.user# AND decode(bitand(o.flags, 16), 0, 'N', 16, 'Y', 'N') = 'N' AND (o.type# in (1, 19, 20, 34) OR EXISTS (SELECT 'x' FROM sys.tab$ t WHERE t.obj# = &ora_object_id)); |
当然还有不仅以上一条sql。但对抽取进程来说,表名信息最重要,它不需要列的信息。列信息在复制进程解析trail文件时才需要,将通过使用目标数据库的数据字典(ASSUMETARGETDEFS)或指定定义文件(SOURCEDEFS)来获取。
抽取出来的trail文件中相当于记录了:
1 | insert into HR.JOBS ("COL 1","COL 2","COL 3","COL 4") values (HEXTORAW('45465f4748'),HEXTORAW('546563686e6963616c20577269746572'),HEXTORAW('c229'),HEXTORAW('c3020b')); |
classic capture与dictionary的关系:
- Without DDL support configured, GoldenGate will always query online data dictionary and that doesn’t play well with object changes.
经在OGG 11.2上实验测试,ogg抽取时只查询online data dictionary。