Wednesday, February 29, 2012

ORA-12514 When doing Switchover to STANDBY Database

Hi,

Today we faced issue when switching over the PRIMARY database to STANDBY database we faced the below issue.

Issue:

Welcome to DGMGRL, type "help" for information.
DGMGRL> connect sys/manager
Connected.
DGMGRL> switchover to 'CV_S';
Performing switchover NOW, please wait...
Operation requires shutdown of instance "CV" on database "CV_P"
Shutting down instance "CV"...
ORA-01109: database not open

Database dismounted.
ORACLE instance shut down.
Operation requires shutdown of instance "CV" on database "CV_S"
Shutting down instance "CV"...
ORA-01109: database not open

Database dismounted.
ORACLE instance shut down.
Operation requires startup of instance "CV" on database "CV_P"
Starting instance "CV"...
Unable to connect to database
ORA-12514: TNS:listener does not currently know of service requested in connect descriptor

Failed.
You are no longer connected to ORACLE
Please connect again.
Unable to start instance "CV"
You must start instance "CV" manually
Operation requires startup of instance "CV" on database "CV_S"
You must start instance "CV" manually
Switchover succeeded, new primary is "CV_S"
DGMGRL>

Troubleshooting:

Metalink and other notes pointed out that the issue will occur if we miss adding the parameter GLOBAL_DBNAME = db_unique_name_DGMGRL to the listener on the primary and standby database as the DMON uses service db_unique_name_DGMGRL when starting the databases during switchover.

But when we checked the listener configuration we found that desired parameter was added and also configured in the listener

Primary Database
------------------
LISTENER_DG_CV =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(Host = db-d6.abc.com)(Port = 1530)(ip = first))
)


SID_LIST_LISTENER_DG_CV_db-d6 =
(SID_LIST =
(SID_DESC =
(GLOBAL_DBNAME = CV)
(SDU = 32768)
(TDU = 32768)
(ORACLE_HOME = /oracle/10.2.0.4/CV)
(SID_NAME = CV)
)
(SID_DESC =
(GLOBAL_DBNAME = CV_P_DGMGRL)
(SDU = 32768)
(TDU = 32768)
(ORACLE_HOME = /oracle/10.2.0.4/CV)
(SID_NAME = CV)
)
)

Standby Database
------------------
LISTENER_DG_CV =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(Host = db2-d6.abc.com)(Port = 1530)(ip = first))
)


SID_LIST_LISTENER_DG_CV_db-d6 =
(SID_LIST =
(SID_DESC =
(GLOBAL_DBNAME = CV)
(SDU = 32768)
(TDU = 32768)
(ORACLE_HOME = /oracle/10.2.0.4/CV)
(SID_NAME = CV)
)
(SID_DESC =
(GLOBAL_DBNAME = CV_S_DGMGRL)
(SDU = 32768)
(TDU = 32768)
(ORACLE_HOME = /oracle/10.2.0.4/CV)
(SID_NAME = CV)
)
)

Also the listener had registered the service name

Primary Side
-------------
$ lsnrctl status LISTENER_DG_CV_db-d6

LSNRCTL for Linux: Version 10.2.0.4.0 - Production on 29-FEB-2012 09:53:13

Copyright (c) 1991, 2007, Oracle. All rights reserved.

Connecting to (ADDRESS=(PROTOCOL=TCP)(Host=db-d6.abc.com)(Port=1530)(ip=first))
STATUS of the LISTENER
------------------------
Alias LISTENER_DG_CV_db-d6
Version TNSLSNR for Linux: Version 10.2.0.4.0 - Production
Start Date 28-FEB-2012 12:08:48
Uptime 0 days 21 hr. 44 min. 24 sec
Trace Level off
Security ON: Local OS Authentication
SNMP ON
Listener Parameter File /etc/listener.ora
Listener Log File /oracle/10.2.0.4/CV/network/log/listener_dg_cv_db-d6.log
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=173.38.0.22)(PORT=1530)))
Services Summary...
Service "CV" has 1 instance(s).
Instance "CV", status UNKNOWN, has 1 handler(s) for this service...
Service "CV_P_DGMGRL" has 1 instance(s).
Instance "CV", status UNKNOWN, has 1 handler(s) for this service...
The command completed successfully

Standby Side
-------------
$ lsnrctl status LISTENER_DG_CV_db2-d6

LSNRCTL for Linux: Version 10.2.0.4.0 - Production on 29-FEB-2012 09:53:13

Copyright (c) 1991, 2007, Oracle. All rights reserved.

Connecting to (ADDRESS=(PROTOCOL=TCP)(Host=db2-d6.abc.com)(Port=1530)(ip=first))
STATUS of the LISTENER
------------------------
Alias LISTENER_DG_CV_db2-d6
Version TNSLSNR for Linux: Version 10.2.0.4.0 - Production
Start Date 28-FEB-2012 12:08:48
Uptime 0 days 21 hr. 44 min. 24 sec
Trace Level off
Security ON: Local OS Authentication
SNMP ON
Listener Parameter File /etc/listener.ora
Listener Log File /oracle/10.2.0.4/CV/network/log/listener_dg_cv_db-d6.log
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=173.38.0.22)(PORT=1530)))
Services Summary...
Service "CV" has 1 instance(s).
Instance "CV", status UNKNOWN, has 1 handler(s) for this service...
Service "CV_S_DGMGRL" has 1 instance(s).
Instance "CV", status UNKNOWN, has 1 handler(s) for this service...
The command completed successfully

Further when ever we tried making the sqlplus session (sqlplus system/abc@CV_P_DGMGRL or sqlplus system/abc@CV_S_DGMGRL) from standby to primary or vice-versa it went perfectly fine and had no issues.

Also we had other databases in our system which had the same configuration and the switchover was working fine. When we started to compare those databases with this database having the issue we found that the parameter db_domain was set here to abc.com

It seemed like the DMON was trying to connect to service db_unique_name_DGMGRL.abc.com (db_unique_name_DGMGRL.domain.com) in the connect descriptor of the tns names it was using to start the database. It made me think that DMON constructed its own tnentry to start the database and did not use the tns entry we had made in the tnsnames.ora


Solution:

In our case
db_unique_name = CV_P on primary database
db_unique_name = CV_S on standby database

To make the failover successful we re-registered the listener with service name db_unique_name_DGMGRL.abc.com by editing the listener file as below and reloading it.

Primary Database
------------------
LISTENER_DG_CV =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(Host = db-d6.abc.com)(Port = 1530)(ip = first))
)


SID_LIST_LISTENER_DG_CV_db-d6 =
(SID_LIST =
(SID_DESC =
(GLOBAL_DBNAME = CV.abc.com)
(SDU = 32768)
(TDU = 32768)
(ORACLE_HOME = /oracle/10.2.0.4/CV)
(SID_NAME = CV)
)
(SID_DESC =
(GLOBAL_DBNAME = CV_P_DGMGRL.abc.com)
(SDU = 32768)
(TDU = 32768)
(ORACLE_HOME = /oracle/10.2.0.4/CV)
(SID_NAME = CV)
)
)

$ lsnrctl reload LISTENER_DG_CV

$ lsnrctl status LISTENER_DG_CV_db-d6

LSNRCTL for Linux: Version 10.2.0.4.0 - Production on 29-FEB-2012 09:53:13

Copyright (c) 1991, 2007, Oracle. All rights reserved.

Connecting to (ADDRESS=(PROTOCOL=TCP)(Host=db-d6.abc.com)(Port=1530)(ip=first))
STATUS of the LISTENER
------------------------
Alias LISTENER_DG_CV_db-d6
Version TNSLSNR for Linux: Version 10.2.0.4.0 - Production
Start Date 28-FEB-2012 12:08:48
Uptime 0 days 21 hr. 44 min. 24 sec
Trace Level off
Security ON: Local OS Authentication
SNMP ON
Listener Parameter File /etc/listener.ora
Listener Log File /oracle/10.2.0.4/CV/network/log/listener_dg_cv_db-d6.log
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=173.38.0.22)(PORT=1530)))
Services Summary...
Service "CV.abc.com" has 1 instance(s).
Instance "CV", status UNKNOWN, has 1 handler(s) for this service...
Service "CV_P_DGMGRL.abc.com" has 1 instance(s).
Instance "CV", status UNKNOWN, has 1 handler(s) for this service...
The command completed successfully

Repeated same steps on the Standby database also so that same changes and take place.

Then we brought up the database manually and now due to the initial switchover Standby database role is primary and Primary database role is Physical_Standby.

Hence to check the fix we switched over from Standby to Primary and it went perfectly fine

DGMGRL> switchover to 'CV_P';
Performing switchover NOW, please wait...
Operation requires shutdown of instance "CV" on database "CV_S"
Shutting down instance "CV"...
ORA-01109: database not open

Database dismounted.
ORACLE instance shut down.
Operation requires shutdown of instance "CV" on database "CV_P"
Shutting down instance "CV"...
ORA-01109: database not open

Database dismounted.
ORACLE instance shut down.
Operation requires startup of instance "CV" on database "CV_S"
Starting instance "CV"...
ORACLE instance started.
Database mounted.
Operation requires startup of instance "CV" on database "CV_P"
Starting instance "CV"...
ORACLE instance started.
Database mounted.
Switchover succeeded, new primary is "CV_P"
DGMGRL>

Alternate Solution: Instead of reloading the listeners we can nullify the db_domain parameter on the Primary and Standby Databases. I have not tried incase you get hit by this issue you can try it.

Regards,
Tanveer




Saturday, February 18, 2012

ORA-01111 and ORA-01110 on the STANDBY DATABASE

Hi,
Recently we built a STANDBY DATABASE for one of our production databases.
The STANDBY DATABASE was 10g with raw volumes

Issue Description

The recovery was stuck with the below error on the STANDBY DATABASE.

Alert Log
Errors in file /oracle/admin/CPD/bdump/CPD_mrp0_19746.trc:
ORA-01111: name for data file 178 is unknown - rename to correct file
ORA-01110: data file 178: '/oracle/product/10.2.0.4/CPD/dbs/UNNAMED00178'
ORA-01157: cannot identify/lock data file 178 - see DBWR trace file
ORA-01111: name for data file 178 is unknown - rename to correct file
ORA-01110: data file 178: '/oracle/product/10.2.0.4/CPD/dbs/UNNAMED00178'

Trace File
MRP0: Background Media Recovery terminated with error 1111
ORA-01111: name for data file 178 is unknown - rename to correct file
ORA-01110: data file 178: '/oracle/product/10.2.0.4/CPD/dbs/UNNAMED00178'
ORA-01157: cannot identify/lock data file 178 - see DBWR trace file
ORA-01111: name for data file 178 is unknown - rename to correct file
ORA-01110: data file 178: '/oracle/product/10.2.0.4/CPD/dbs/UNNAMED00178'

Troubleshooting

The error codes pointed out that it is a issue with missing file which is present in the control file but not present physically on the STANDBY host.
As this issue occurred later after we had successfully configured the STANDBY DATABASE, we were able to make out that this could have occured due to new raw volume addition on primary database which has not been provisioned on the STANDBY DATABASE side.

Hence at first we needed to find the number of files added on primary after the STANDBY DATABASE build. The below query listed the time the raw volumes were added to the database and hence we could figure out the raw volumes added to the database

set lines 150
set pages 0
col name for a75
select name,to_char(CREATION_TIME,'HH24:MI:SS MON-DD-YYYY') from v$datafile order
by creation_time;

We found that there were 2 files added to the database after we had created the STANDBY DATABASE.

Nextly we needed to find the exact raw volume which was needed by the STANDBY DATABASE to continue recovery.
From the error we were able to know that the STANDBY DATABASE needed the datafile with file number 178
We ran the below query on the primary database, which yielded us the exact name of the missing raw volume and the same query on the STANDBY side gave the psuedo name the STANDBY DATABASE had given to the missing file on the STANDBY DATABASE side. Hence confirming the exact raw volume name.

Primary Side
------------
SQL> select name from v$datafile where file#=178;
NAME
----------------------------------------
/dev/mapper/oraCPD_data_017_4600Mp9

STANDBY Side
-------
SQL> select name from v$datafile where file#=178;
NAME
----------------------------------------
/oracle/product/10.2.0.4/CPD/dbs/UNNAMED00178


Note: Though there were 2 files added to the production that was missing on STANDBY DATABASE side, the recovery was stuck at the very first missing file.
Hence we needed to correct only that file but for the issue not to occur again due to the other missing raw volume you need to make sure even that
raw volume is made available on the STANDBY DATABASE side.
(We got the all the missing raw volumes present on the primary side added to the STANDBY to avoid such issues in the future when there is new raw volume addition)


Solution

1. Make sure that all the missing raw volumes are added to the STANDBY side at host level.

2. Stop Recovery on the Standby Server
recover managed standby database cancel;

3. Change the standby_file_management parameter to MANUAL on STANDBY DATABASE
alter system set standby_file_management=manual;

4. Rename the datafile on Standby side
alter database create datafile '/oracle/product/10.2.0.4/CPD/dbs/UNNAMED00178'
as '/dev/mapper/oraCPD_data_017_4600Mp9';

5. Change the standby_file_management parameter to AUTO on STANDBY DATABASE
alter system set standby_file_management=auto;

6. Restart the Recovery
alter database recover managed standby database disconnect;

Regards,
Tanveer