原 Oracle DG中日志不同步的ORA-16191错误解决整理
现象
告警日志报错:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 | SYS@LHR11G> select dest_id,error,status,log_sequence,applied_scn from v$archive_dest a where a.status<>'INACTIVE'; DEST_ID ERROR STATUS LOG_SEQUENCE APPLIED_SCN ---------- ----------------------------------------------------------------- ------------------ ------------ ----------- 1 VALID 147 0 2 ORA-16191: Primary log shipping client not logged on standby ERROR 148 4951267 Error 1017 received logging on to the standby ------------------------------------------------------------------------------------------------------------------------ Check that the primary and standby are using a password file and remote_login_passwordfile is set to SHARED or EXCLUSIVE, and that the SYS password is same in the password files. returning error ORA-16191 ------------------------------------------------------------ PING[ARC2]: Heartbeat failed to connect to standby 'lhr11gdg'. Error is 16191. FAL Redo Shipping Client Established Network Login Failed to queue the whole gap GAP - thread 1 sequence 1643693-1643693 DBID 1458799814 branch 931019082 -- 备库报错 Errors in file /u01/app/oracle/diag/rdbms/dgorcl/dgorcl/trace/dgorcl_ora_7229.trc: ORA-00604: error occurred at recursive SQL level 1 ORA-16000: database open for read-only access ORA-06512: at line 2 -- 查看trace发现是触发器的问题 Error in executing triggers on connect internal *** 2024-08-17 21:50:27.744 dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x0, level=0, mask=0x0) ----- Error Stack Dump ----- ORA-00604: error occurred at recursive SQL level 1 ORA-16000: database o |
解释:
1 2 3 4 5 6 7 8 9 10 | [oracle@lhr11g trace]$ oerr ora 16191 16191, 0000, "Primary log shipping client not logged on standby" // *Cause: An attempt to ship redo to standby without logging on // to standby or with invalid user credentials. // *Action: Check that primary and standby are using password files and that // both primary and standby have the same SYS password. // Restart primary and/or standby after ensuring that // password file is accessible and REMOTE_LOGIN_PASSWORDFILE // initialization parameter is set to SHARED or EXCLUSIVE. [oracle@lhr11g trace]$ |
查询报错:
1 2 3 4 5 6 7 | SQL> SELECT thread#, dest_id, gvad.status, error, fail_sequence FROM gv$archive_dest gvad, gv$instance gvi WHERE gvad.inst_id = gvi.inst_id AND destination is NOT NULL ORDER BY thread#, dest_id; THREAD# DEST_ID STATUS ERROR FAIL_SEQUENCE --------------------------------------------------------------------------------- 1 1 VALID 0 1 2 ERROR ORA-16191: Primary log shipping client not logged on standby 0 |
可能的原因
密码文件问题
ORA-16191: Primary Log Shipping Client Not Logged On Standby Error in Dataguard Environment after SYS Password Change (Doc ID 2420498.1)
这个容易解决,可以重建如下密码文件,或建议在一个节点生成密码文件,然后将该文件拷贝到其它节点。
1 2 3 4 5 6 7 8 9 10 11 12 | orapwd file=$ORACLE_HOME/dbs/orapworcl1 password=oracle force=y ignorecase=Y orapwd file=$ORACLE_HOME/dbs/orapworcl2 password=oracle force=y ignorecase=Y orapwd file=$ORACLE_HOME/dbs/orapwdgorcl password=oracle force=y ignorecase=Y If sec_case_sensitive_logon set to TRUE, then ensure that password file is created with option IGNORECASE=N If sec_case_sensitive_logon set to FALSE, then ensure that password file is created with option IGNORECASE=Y -- 校验密码文件是否一致 md5sum orapw<SID> |
从12c开始,DG可以自动同步密码文件。参考:https://my.dbaup.com/oracle-12cdedgzidongtongbumimawenjian-asm-xintexinggongxiangmimawenjian.html
分析步骤
- Check where the Password File is located :
a. If within '$ORACLE_HOME/dbs", then File naming should be as "orapw$ORACLE_SID" (example : orapwORCL)
For RAC, need to ensure that correct ORACLE_SID/INSTANCE_NAME is reflecting in the Password File Name of each instance (orapw$ORACLE_SID => example : orapwORCL1, orapwORCL2)
b. Password File within ASM Disk Group and registered in OCR
Check the output of "$ srvctl config database -d
Checksum of Password Files at Primary and Standby should match
Check the output of "md5sum orapwORCL" at each Node of Primary and Standby
remote_login_passwordfile=EXCLUSIVE
sec_case_sensitive_logon=TRUE
If set to TRUE, then ensure that password file is created with option IGNORECASE=N
If set to FALSE, then ensure that password file is created with option IGNORECASE=Y
系统触发器问题
参考:Error 1017 / ORA-16191 In Standby Alertlog (Doc ID 2225190.1)
ORA-16191 ORA-01017 and ORA-16000 shipping the logs to Standby database (Doc ID 2129339.1)
In this particular case, customer had a custom (non-Oracle supplied) login trigger on primary. This was preventing standby to connect to primary to query gap status.
在这种情况下,客户在主库上有一个自定义(非Oracle提供的)登录触发器。这导致备用库无法连接到主库以查询gap状态。
主库告警日志报错:
1 2 3 4 5 6 7 8 9 10 11 12 | Wed Dec 07 15:40:36 2016 Media Recovery Waiting for thread 1 sequence 123569 Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION Error 1017 received logging on to the standby ------------------------------------------------------------ Check that the primary and standby are using a password file and remote_login_passwordfile is set to SHARED or EXCLUSIVE, and that the SYS password is same in the password files. returning error ORA-16191 ------------------------------------------------------------ FAL[client, USER]: Error 16191 connecting to rmseprod for fetching gap sequence Wed Dec 07 15:41:12 2016 |
备库告警日志:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | -- 备库报错 Errors in file /u01/app/oracle/diag/rdbms/dgorcl/dgorcl/trace/dgorcl_ora_7229.trc: ORA-00604: error occurred at recursive SQL level 1 ORA-16000: database open for read-only access ORA-06512: at line 2 -- 查看trace发现是触发器的问题 Error in executing triggers on connect internal *** 2024-08-17 21:50:27.744 dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x0, level=0, mask=0x0) ----- Error Stack Dump ----- ORA-00604: error occurred at recursive SQL level 1 ORA-16000: database o |
In this case rmseprod is the FAL_SERVER and points to primary database. A manual sqlplus connection to primary is fine using: sqlplus sys/password@rmseprod as sysdba
Also, we verified that password file is copied from primary and it's checksum matches with the one on primary.
解决:主备库都需要运行,且在手工追加完gap后,重启数据库:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 | ALTER SYSTEM SET "_system_trig_enabled"=FALSE sid='*'; SET PAGESIZE 9999 SET LINE 9999 COL NAME FORMAT A40 COL KSPPDESC FORMAT A50 COL KSPPSTVL FORMAT A20 SELECT A.INDX, A.KSPPINM NAME, A.KSPPDESC, B.KSPPSTVL FROM X$KSPPI A, X$KSPPCV B WHERE A.INDX = B.INDX AND A.KSPPINM LIKE '/_%' ESCAPE '/' AND LOWER(A.KSPPINM) LIKE LOWER('_system_trig_enabled'); |