Seguindo uma ordem cronológica dos artigos sobre Oracle RAC e MAA vamos fazer um switchover usando o Broker. Em um ambiente de produção você não seguiria esta ordem de eventos, mas para fins didáticos é interessante observar os passos.
No último artigo “forçamos” um failover do ambiente, o banco “maa” que rodava como primary sofreu um failover e o banco “maastb” é agora o primary. Além disso, no mesmo artigo o banco “maa” sofre um reinstate e passou a atuar como physical standby. Diferentemente do primeiro artigo que falei sobre failover, o último foi através do Broker. Todos os comandos partiram dele, do failover ao reinsate. Como disse no último artigo, o ambiente ainda não está 100%, precisamos configurar o Fast-Start Failover para ter um ambiente completo com MAA.
Infelizmente algumas atividades impediram publicar este artigo de maneira mais rápida, isso deixou este arigo um pouco afastado dos demais. Recomendaria a releitura dos artigos anteriores para relembrar alguns pontos.
SÉTIMO ARTIGO
Neste artigo vamos ver como realizar o switchover do ambiente através do Broker. O banco “maastb” que atua como primary sofrerá uma troca de papeis com o seu banco “maa” (que opera como physical standby).
Vamos ver aqui que através do Broker os passos e comandos ficam mais simples (quando comparado com o switchover manual que foi realizado neste artigo), basicamente iremos acompanhar alguns logs. Claro que tudo isso só acontecerá pois já tomamos diversos cuidados no caminho até aqui, destaco como uma das principais a conexão do Broker a cada instância (StaticConnectIdentifier) por permitir enviar os comandos diretamente uma a uma.
AMBIENTE
Para este artigo temos o banco “maastb” operando como primary, enquanto o banco “maa” opera como physical standby. Ambos são bancos Oracle RAC e completamente sincronizados. Além disso, o Broker está habilitado e “monitorando” o ambiente.
Como estamos com dois Oracle RAC de dois nós cada temos 4 alertlog para demonstrar, bem como 4 logs do Broker para acompanhar. Desta forma, não se assuste com o tamanho do artigo, ele pode parecer extenso mas estará recheado de detalhes técnicos para observar.
SWITCHOVER
Como disse no primeiro artigo sobre switchover, a diferença básica entre ele e um failover é que no switchover ambos os bancos (primary e standby) estão ativos e sabem da troca de papeis. Você não precisará fazer o reinstate do seu antigo primary, o standby assume no ponto subsequente ao que o primary chaveou. Não há perda de archives ou de qualquer informação da base de dados.
Status do Ambiente
A primeira coisa a verificar antes de fazer o swictover é se ambos, primary e standby, estão sincronizados. Não pode existir qualquer gap nos archives entre eles, se existir eu recomendo a você corrigir para depois dar sequência aos passos.
Para verificar o ambiente você pode usar o sqlplus ou o Broker. No Broker eu recomendo verificar a configuração básica e o status de envio dos logs, observe:
DGMGRL> SHOW CONFIGURATION; Configuration - dgrac Protection Mode: MaxAvailability Databases: maastb - Primary database maa - Physical standby database Fast-Start Failover: DISABLED Configuration Status: SUCCESS DGMGRL> DGMGRL> SHOW DATABASE maastb LogXptStatus; LOG TRANSPORT STATUS PRIMARY_INSTANCE_NAME STANDBY_DATABASE_NAME STATUS maastb2 maa maastb1 maa DGMGRL>
No sqlplus basta fazer a consulta abaixo:
SQL> SELECT protection_mode, protection_level, database_role FROM v$database; PROTECTION_MODE PROTECTION_LEVEL DATABASE_ROLE -------------------- -------------------- ---------------- MAXIMUM AVAILABILITY MAXIMUM AVAILABILITY PRIMARY SQL>
Para ilustrar que não iremos perder qualquer informação podemos usar a mesma tabela que criamos em artigos anteriores. Os dados serão removidos e no final vamos verificar se isso se mantêm após a troca de papeis:
SQL> SELECT * FROM testedg; C1 C2 ---------- --------- 1 11-MAY-14 2 11-MAY-14 SQL> DELETE FROM testedg; 2 rows deleted. SQL> commit; Commit complete. SQL>
Swicthover to
Com a garantia de que a sincronia entre os ambientes está correta, basta fazer o switchover através do Broker (a conexão pode ser em qualquer lado, mas recomendo ser no primary atual):
DGMGRL> CONNECT sys@maastb; Password: Connected. DGMGRL> DGMGRL> SWITCHOVER TO maa; Performing switchover NOW, please wait... New primary database "maa" is opening... Operation requires shutdown of instance "maastb2" on database "maastb" Shutting down instance "maastb2"... ORACLE instance shut down. Operation requires startup of instance "maastb2" on database "maastb" Starting instance "maastb2"... ORACLE instance started. Database mounted. Switchover succeeded, new primary is "maa" DGMGRL>
Viu como foi simples, mas antes de iniciarmos com a análise dos logs (alertlog e logs do broker) vamos ver o comando acima. Observe que a primeira ação do Broker foi conectar-se e abrir o novo primary (maa) e depois realizar o shutdown do antigo primary (maastb). Estas conexões utilizam o StaticConnectIdentifier para garantir uma conexão direta e exclusiva. Observe também que todas as instâncias do antigo primary foram desligadas e posteriormente o banco foi montado.
Um detalhe que cabe ressaltar antes de mostrar os logs abaixo, por um erro no meu ambiente as datas e horas dos Oracle RAC primary e standby estão fora de sincronia, com diferença aproximada de 3 horas e 30 minutos. De qualquer forma, entre os nós do mesmo cluster elas estão sincronizadas.
Tentarei mostrar de forma cronológica a ordem dos eventos que ocorreram. Para iniciar a análise temos o log do Broker da instância “maastb2” (instância que recebeu a conexão que fizemos no Broker no passo anterior):
06/10/2014 03:02:55 SWITCHOVER TO maa Notifying Oracle Clusterware to teardown primary database for SWITCHOVER Database Switchover needs instance count reduced to 1 06/10/2014 03:03:26 Command SWITCHOVER TO maa completed 06/10/2014 03:03:47 >> Starting Data Guard Broker bootstrap << Broker Configuration File Locations: dg_broker_config_file1 = "+DG01/maastb/dr1maastb.dat" dg_broker_config_file2 = "+FRA/maastb/dr2maastb.dat" 06/10/2014 03:03:52 DMON Registering service maastb_DGB with listener(s) Broker Configuration: "dgrac" Protection Mode: Maximum Availability Fast-Start Failover (FSFO): Disabled, flags=0x0, version=0 Primary Database: maa (0x01010000) Standby Database: maastb, Enabled Physical Standby (0x02010000) 06/10/2014 03:04:03 Creating process RSM0 06/10/2014 03:04:07 SQL Execution error=604, sql=[alter system set log_archive_dest_1='']. See error stack below. ORA-00604: error occurred at recursive SQL level 1 ORA-02097: parameter cannot be modified because specified value is invalid ORA-16028: new LOG_ARCHIVE_DEST_1 causes less destinations than LOG_ARCHIVE_MIN_SUCCEED_DEST requires 06/10/2014 03:04:26 Notifying Oracle Clusterware to buildup standby database after SWITCHOVER Data Guard notifying Oracle Clusterware to start services and other instances change 06/10/2014 03:04:29 Command SWITCHOVER TO maastb completed
Observe acima que após receber o comando de switchover o Broker requisitou que somente uma instância do atual primary (maastb2) estive online. A solução da Oracle não é muito elegante (como vocês irão ver abaixo), mas resolve o problema. O log acima tem mais informações, mas vamos pular para o alertlog da instância para ver a solução da Oracle.
Como vimos acima, quem está orquestrando o switchover é a instância maastb2, desta forma (provavelmente) a instância “maastb1” terá que ser desligada para que o switchover tenha sequência. Assim temos o alerlog da instância maastb1:
Tue Jun 10 03:02:59 2014 Shutting down instance (abort) License high water mark = 6 USER (ospid: 29396): terminating the instance Instance terminated by USER, pid = 29396 Tue Jun 10 03:03:00 2014 Instance shutdown complete Tue Jun 10 03:04:35 2014 Starting ORACLE instance (normal) ... ...
Observe acima a solução “elegante” escolhida pela Oracle para reduzir o número de instâncias do primary ativas, um belo abort sem dó nem piedade. Depois disso, vamos ao alertlog da instância maastb2 (única do primary ativa e responsável pelo switchover):
Tue Jun 10 03:03:01 2014 Reconfiguration started (old inc 4, new inc 6) List of instances: 2 (myinst: 2) Global Resource Directory frozen * dead instance detected - domain 0 invalid = TRUE Communication channels reestablished Master broadcasted resource hash value bitmaps Non-local Process blocks cleaned out Tue Jun 10 03:03:01 2014 LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived Set master node info Submitted all remote-enqueue requests Dwn-cvts replayed, VALBLKs dubious All grantable enqueues granted Tue Jun 10 03:03:01 2014 minact-scn: master found reconf/inst-rec before recscn scan old-inc#:4 new-inc#:4 Post SMON to start 1st pass IR Tue Jun 10 03:03:01 2014 Starting background process CJQ0 Submitted all GCS remote-cache requests Post SMON to start 1st pass IR Fix write in gcs resources Tue Jun 10 03:03:01 2014 Instance recovery: looking for dead threads Reconfiguration complete Beginning instance recovery of 1 threads Tue Jun 10 03:03:02 2014 CJQ0 started with pid=60, OS id=14435 parallel recovery started with 2 processes Archivelog destination LOG_ARCHIVE_DEST_2 validated for no-data-loss recovery ARCH: Standby redo logfile selected for thread 1 sequence 185 for destination LOG_ARCHIVE_DEST_2 Started redo scan Completed redo scan read 1916 KB redo, 735 data blocks need recovery Started redo application at Thread 1: logseq 185, block 20041 Recovery of Online Redo Log: Thread 1 Group 1 Seq 185 Reading mem 0 Mem# 0: +DG01/maastb/onlinelog/group_1.257.844716051 Mem# 1: +FRA/maastb/onlinelog/group_1.568.844716053 Completed redo application of 1.67MB Completed instance recovery at Thread 1: logseq 185, block 23874, scn 11510215 676 data blocks read, 778 data blocks written, 1916 redo k-bytes read Tue Jun 10 03:03:08 2014 Setting recovery pair for thread 1: nab 23874 seq 185 SMON: Completed archiving log 1 thread 1 sequence 185 (maastb2) Thread 1 advanced to log sequence 186 (thread recovery) Redo thread 1 internally disabled at seq 186 (SMON) minact-scn: master continuing after IR minact-scn: Master considers inst:1 dead Tue Jun 10 03:03:12 2014 ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY WITH SESSION SHUTDOWN ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY [Process Id: 11540] (maastb2) Tue Jun 10 03:03:13 2014 ****************************************************************** LGWR: Setting 'active' archival for destination LOG_ARCHIVE_DEST_2 ****************************************************************** LGWR: Standby redo logfile selected to archive thread 2 sequence 163 LGWR: Standby redo logfile selected for thread 2 sequence 163 for destination LOG_ARCHIVE_DEST_2 Thread 2 advanced to log sequence 163 (LGWR switch) Current log# 3 seq# 163 mem# 0: +DG01/maastb/onlinelog/group_3.269.844716063 Current log# 3 seq# 163 mem# 1: +FRA/maastb/onlinelog/group_3.562.844716065 Tue Jun 10 03:03:13 2014 Archived Log entry 1019 added for thread 1 sequence 185 ID 0x2b1de48e dest 1: Thread 2 cannot allocate new log, sequence 164 Checkpoint not complete Current log# 3 seq# 163 mem# 0: +DG01/maastb/onlinelog/group_3.269.844716063 Current log# 3 seq# 163 mem# 1: +FRA/maastb/onlinelog/group_3.562.844716065 Stopping background process QMNC Tue Jun 10 03:03:14 2014 Stopping background process CJQ0 Tue Jun 10 03:03:14 2014 Archived Log entry 1020 added for thread 2 sequence 162 ID 0x2b1de48e dest 1: Tue Jun 10 03:03:14 2014 ARC3: LGWR is actively archiving destination LOG_ARCHIVE_DEST_2 CLOSE: killing server sessions. Active process 14386 user 'oracle' program 'oracle@rac11stb02.tjsc.jus.br (W000)' Active process 14386 user 'oracle' program 'oracle@rac11stb02.tjsc.jus.br (W000)' ... ... CLOSE: all sessions shutdown successfully. Waiting for all non-current ORLs to be archived... All non-current ORLs have been archived. Waiting for all FAL entries to be archived... All FAL entries have been archived. Waiting for dest_id 2 to become synchronized... Active, synchronized Physical Standby switchover target has been identified Switchover End-Of-Redo Log thread 2 sequence 163 has been fixed Switchover: Primary highest seen SCN set to 0x0.0xaff3c1 ARCH: Noswitch archival of thread 2, sequence 163 ARCH: End-Of-Redo Branch archival of thread 2 sequence 163 ARCH: LGWR is actively archiving destination LOG_ARCHIVE_DEST_2 ARCH: Standby redo logfile selected for thread 2 sequence 163 for destination LOG_ARCHIVE_DEST_2 Archived Log entry 1023 added for thread 2 sequence 163 ID 0x2b1de48e dest 1: ARCH: Archiving is disabled due to current logfile archival Primary will check for some target standby to have received alls redo Final check for a synchronized target standby. Check will be made once. LOG_ARCHIVE_DEST_2 is a potential Physical Standby switchover target Active, synchronized target has been identified Target has also received all redo Backup controlfile written to trace file /u01/app/oracle/diag/rdbms/maastb/maastb2/trace/maastb2_rsm0_11540.trc Clearing standby activation ID 723379342 (0x2b1de48e) The primary database controlfile was created using the 'MAXLOGFILES 192' clause. There is space for up to 188 standby redo logfiles Use the following SQL commands on the standby database to create standby redo logfiles that match the primary database: ALTER DATABASE ADD STANDBY LOGFILE 'srl1.f' SIZE 52428800; ALTER DATABASE ADD STANDBY LOGFILE 'srl2.f' SIZE 52428800; ALTER DATABASE ADD STANDBY LOGFILE 'srl3.f' SIZE 52428800; ALTER DATABASE ADD STANDBY LOGFILE 'srl4.f' SIZE 52428800; ALTER DATABASE ADD STANDBY LOGFILE 'srl5.f' SIZE 52428800; Archivelog for thread 2 sequence 163 required for standby recovery Switchover: Primary controlfile converted to standby controlfile succesfully. Switchover: Complete - Database shutdown required Completed: ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY WITH SESSION SHUTDOWN Tue Jun 10 03:03:19 2014 Process (ospid 11546) is suspended due to switchover to physical standby operation. Tue Jun 10 03:03:26 2014 Performing implicit shutdown abort due to switchover to physical standby Shutting down instance (abort) License high water mark = 11 USER (ospid: 14458): terminating the instance Instance terminated by USER, pid = 14458 Tue Jun 10 03:03:28 2014 Instance shutdown complete ORA-1092 : opitsk aborting process Tue Jun 10 03:03:29 2014 Starting ORACLE instance (normal) ... ...
Analisando atentamente o log acima temos algumas informações importantes (destacadas em negrito). Primeiro, a instância “maastb2” detectou que a “maastb1” não está mais ativa e iniciou o recovery de qualquer informação perdida, iniciou as 03:03:01 e conclui as 03:03:08. Após esse ajuste necessário, a maastb2 recebeu o comando de commit para swicthover (lembre-se que no artigo anterior isso foi manual). Na sequência tudo o que precisava ser arquivado foi e registrado em todos os FAL’s. Além disso o possível destino (dest_2) foi identificado e o Oracle detectou que o target estava sincronizado (no log acima: Active, synchronized target has been identified). Depois disso, a instância e o banco atual foram convertidos para physical standby (No log acima: Switchover: Primary controlfile converted to standby controlfile succesfully). Observe que o swicthover iniciou as 03:03:12 e as 03:03:19 estava tudo concluído, esperando que a nova primary estivesse disponível.
Dando sequência temos o log da instância “maa1”, responsável por aplicar os redo recebidos da primary. Por isso, ela será a instância que coordenará a transformação de standby em primary do banco “maa”. O alertlog abaixo foi “cortado”, as informações sobre falhas de TNS foram retiradas. Estas falhas estavam ocorrendo devido ao banco standby (o “maastb”) ainda não estar disponível para receber os redo do “maa”.
Tue Jun 10 06:22:05 2014 RFS[2]: Possible network disconnect with primary database RFS[5]: Possible network disconnect with primary database Tue Jun 10 06:22:06 2014 RFS[4]: Possible network disconnect with primary database Tue Jun 10 06:22:06 2014 RFS[6]: Possible network disconnect with primary database Tue Jun 10 06:22:09 2014 Primary database is in MAXIMUM AVAILABILITY mode Standby controlfile consistent with primary Identified standby redo log 6 for instance/crash recovery Standby controlfile consistent with primary RFS[11]: Assigned to RFS process 7457 RFS[11]: Selected log 6 for thread 1 sequence 185 dbid 722024964 branch 847284763 Tue Jun 10 06:22:15 2014 Archived Log entry 979 added for thread 1 sequence 185 ID 0x2b1de48e dest 1: Tue Jun 10 06:22:19 2014 Archived Log entry 980 added for thread 2 sequence 162 ID 0x2b1de48e dest 1: Tue Jun 10 06:22:19 2014 Primary database is in MAXIMUM AVAILABILITY mode Standby controlfile consistent with primary Standby controlfile consistent with primary RFS[12]: Assigned to RFS process 7459 RFS[12]: Selected log 8 for thread 2 sequence 163 dbid 722024964 branch 847284763 Tue Jun 10 06:22:21 2014 Media Recovery Waiting for thread 1 sequence 186 (in transit) Tue Jun 10 06:22:21 2014 RFS[13]: Assigned to RFS process 7461 RFS[13]: Opened log for thread 1 sequence 186 dbid 722024964 branch 847284763 Archived Log entry 981 added for thread 1 sequence 186 rlc 847284763 ID 0x2b1de48e dest 2: Media Recovery Log +FRA/maa/archivelog/2014_06_10/thread_1_seq_186.1103.849853341 Media Recovery Waiting for thread 2 sequence 163 (in transit) Recovery of Online Redo Log: Thread 2 Group 8 Seq 163 Reading mem 0 Mem# 0: +DATA/maa/onlinelog/group_8.259.843615383 Mem# 1: +FRA/maa/onlinelog/group_8.703.843615385 RFS[12]: Possible network disconnect with primary database Tue Jun 10 06:22:23 2014 RFS[14]: Assigned to RFS process 7465 RFS[14]: Selected log 8 for thread 2 sequence 163 dbid 722024964 branch 847284763 Tue Jun 10 06:22:24 2014 Archived Log entry 982 added for thread 2 sequence 163 ID 0x2b1de48e dest 1: Resetting standby activation ID 723379342 (0x2b1de48e) Media Recovery End-Of-Redo indicator encountered Media Recovery Continuing Media Recovery Waiting for thread 2 sequence 164 Tue Jun 10 06:22:25 2014 ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL MRP0: Background Media Recovery cancelled with status 16037 Errors in file /u01/app/oracle/diag/rdbms/maa/maa1/trace/maa1_pr00_3715.trc: ORA-16037: user requested cancel of managed recovery operation Tue Jun 10 06:22:25 2014 Managed Standby Recovery not using Real Time Apply Recovery interrupted! Tue Jun 10 06:22:25 2014 MRP0: Background Media Recovery process shutdown (maa1) Managed Standby Recovery Canceled (maa1) Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WAIT WITH SESSION SHUTDOWN ALTER DATABASE SWITCHOVER TO PRIMARY (maa1) Maximum wait for role transition is 15 minutes. Tue Jun 10 06:22:26 2014 krsv_proc_kill: Killing 3 processes (all RFS) Backup controlfile written to trace file /u01/app/oracle/diag/rdbms/maa/maa1/trace/maa1_rsm0_3682.trc SwitchOver after complete recovery through change 11531201 Online log +DATA/maa/onlinelog/group_1.272.843488553: Thread 1 Group 1 was previously cleared Online log +FRA/maa/onlinelog/group_1.286.843488555: Thread 1 Group 1 was previously cleared Online log +DATA/maa/onlinelog/group_2.271.843488555: Thread 1 Group 2 was previously cleared Online log +FRA/maa/onlinelog/group_2.285.843488555: Thread 1 Group 2 was previously cleared Online log +DATA/maa/onlinelog/group_3.257.843489101: Thread 2 Group 3 was previously cleared Online log +FRA/maa/onlinelog/group_3.284.843489101: Thread 2 Group 3 was previously cleared Online log +DATA/maa/onlinelog/group_4.262.843489103: Thread 2 Group 4 was previously cleared Online log +FRA/maa/onlinelog/group_4.283.843489103: Thread 2 Group 4 was previously cleared Standby became primary SCN: 11531199 Switchover: Complete - Database mounted as primary Completed: ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WAIT WITH SESSION SHUTDOWN ALTER DATABASE OPEN Data Guard Broker initializing... This instance was first to open Picked broadcast on commit scheme to generate SCNs Thread 2 advanced to log sequence 165 (before internal thread enable) Tue Jun 10 06:22:33 2014 ARC3: Becoming the 'no SRL' ARCH ARCH: LGWR is scheduled to archive destination LOG_ARCHIVE_DEST_2 after log switch ... ... ... Error 12514 received logging on to the standby ARCH: Error 12514 Creating archive log file to 'maastb' Archived Log entry 983 added for thread 2 sequence 164 ID 0x0 dest 1: Thread 2 advanced to log sequence 166 (after internal thread enable) ARC1: Becoming the 'no SRL' ARCH Archived Log entry 984 added for thread 2 sequence 165 ID 0x0 dest 1: Tue Jun 10 06:22:34 2014 Assigning activation ID 728027883 (0x2b64d2eb) LGWR: Primary database is in MAXIMUM AVAILABILITY mode Destination LOG_ARCHIVE_DEST_2 is UNSYNCHRONIZED LGWR: Destination LOG_ARCHIVE_DEST_1 is not serviced by LGWR Thread 1 advanced to log sequence 188 (thread open) Thread 1 opened at log sequence 188 Current log# 2 seq# 188 mem# 0: +DATA/maa/onlinelog/group_2.271.843488555 Current log# 2 seq# 188 mem# 1: +FRA/maa/onlinelog/group_2.285.843488555 Successful open of redo thread 1 MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set Tue Jun 10 06:22:34 2014 ... ... ... Error 12514 received logging on to the standby PING[ARC2]: Heartbeat failed to connect to standby 'maastb'. Error is 12514. Instance recovery: looking for dead threads Instance recovery: lock domain invalid but no dead threads Tue Jun 10 06:22:40 2014 minact-scn: Inst 1 is now the master inc#:3 mmon proc-id:3517 status:0x7 minact-scn status: grec-scn:0x0000.00000000 gmin-scn:0x0000.00000000 gcalc-scn:0x0000.00000000 minact-scn: Master returning as live inst:2 has inc# mismatch instinc:0 cur:3 errcnt:0 Tue Jun 10 06:22:40 2014 [3682] Successfully onlined Undo Tablespace 2. Undo initialization finished serial:0 start:385367814 end:385369304 diff:1490 (14 seconds) Dictionary check beginning Dictionary check complete Verifying file header compatibility for 11g tablespace encryption.. Verifying 11g file header compatibility for tablespace encryption completed SMON: enabling tx recovery Database Characterset is WE8MSWIN1252 No Resource Manager plan active Starting background process GTX0 Tue Jun 10 06:22:44 2014 GTX0 started with pid=27, OS id=7470 Starting background process RCBG Tue Jun 10 06:22:44 2014 RCBG started with pid=32, OS id=7472 replication_dependency_tracking turned off (no async multimaster replication found) Starting background process QMNC Tue Jun 10 06:22:45 2014 QMNC started with pid=40, OS id=7474 LOGSTDBY: Validating controlfile with logical metadata LOGSTDBY: Validation complete Completed: ALTER DATABASE OPEN ALTER SYSTEM SET log_archive_trace=0 SCOPE=BOTH SID='maa1'; ALTER SYSTEM SET log_archive_format='arch_%t_%s_%r.arc' SCOPE=SPFILE SID='maa1'; ALTER SYSTEM SET standby_file_management='AUTO' SCOPE=BOTH SID='*'; ALTER SYSTEM SET archive_lag_target=0 SCOPE=BOTH SID='*'; ALTER SYSTEM SET log_archive_max_processes=4 SCOPE=BOTH SID='*'; ALTER SYSTEM SET log_archive_min_succeed_dest=1 SCOPE=BOTH SID='*'; ALTER SYSTEM SET db_file_name_convert='+DG01/maastb','+DATA/maa','+FRA/maastb','+FRA/maa' SCOPE=SPFILE; ALTER SYSTEM SET log_file_name_convert='+DG01/maastb','+DATA/maa','+FRA/maastb','+FRA/maa' SCOPE=SPFILE; Tue Jun 10 06:22:48 2014 ARC2: STARTING ARCH PROCESSES Tue Jun 10 06:22:48 2014 ARC4 started with pid=49, OS id=7486 ALTER SYSTEM SET log_archive_dest_state_2='ENABLE' SCOPE=BOTH; ALTER SYSTEM ARCHIVE LOG Tue Jun 10 06:22:49 2014 ********************************************************************* WARNING: Detected remote instance standby database network disconnect WARNING: Current instance standby database disconnect may be required ********************************************************************* ARC4: Archival started ARC2: STARTING ARCH PROCESSES COMPLETE ... ... ... Error 12528 received logging on to the standby Error 12528 for archive log file 1 to 'maastb' LGWR: Failed to archive log 1 thread 1 sequence 189 (12528) Thread 1 advanced to log sequence 189 (LGWR switch) Current log# 1 seq# 189 mem# 0: +DATA/maa/onlinelog/group_1.272.843488553 Current log# 1 seq# 189 mem# 1: +FRA/maa/onlinelog/group_1.286.843488555 Tue Jun 10 06:22:52 2014 Starting background process CJQ0 Tue Jun 10 06:22:53 2014 CJQ0 started with pid=52, OS id=7510 Tue Jun 10 06:22:53 2014 ALTER SYSTEM SET log_archive_dest_state_2='ENABLE' SCOPE=MEMORY SID='*'; Tue Jun 10 06:22:53 2014 Archived Log entry 988 added for thread 1 sequence 188 ID 0x2b64d2eb dest 1: Tue Jun 10 06:22:54 2014 Starting background process SMCO Tue Jun 10 06:22:54 2014 SMCO started with pid=54, OS id=7516 Tue Jun 10 06:22:55 2014 Active Session History (ASH) performed an emergency flush. This may mean that ASH is undersized. If emergency flushes are a recurring issue, you may consider increasing ASH size by setting the value of _ASH_SIZE to a sufficiently large value. Currently, ASH size is 4194304 bytes. Both ASH size and the total number of emergency flushes since instance startup can be monitored by running the following query: select total_size,awr_flush_emergency_count from v$ash_info; Thread 1 cannot allocate new log, sequence 190 Checkpoint not complete Current log# 1 seq# 189 mem# 0: +DATA/maa/onlinelog/group_1.272.843488553 Current log# 1 seq# 189 mem# 1: +FRA/maa/onlinelog/group_1.286.843488555 Tue Jun 10 06:22:56 2014 ... ... ... Setting Resource Manager plan SCHEDULER[0x3189]:DEFAULT_MAINTENANCE_PLAN via scheduler window Setting Resource Manager plan DEFAULT_MAINTENANCE_PLAN via parameter Tue Jun 10 06:22:57 2014 Starting background process VKRM Tue Jun 10 06:22:57 2014 VKRM started with pid=41, OS id=7530 Tue Jun 10 06:23:01 2014 LGWR: Standby redo logfile selected to archive thread 1 sequence 190 LGWR: Standby redo logfile selected for thread 1 sequence 190 for destination LOG_ARCHIVE_DEST_2 Thread 1 advanced to log sequence 190 (LGWR switch) Current log# 2 seq# 190 mem# 0: +DATA/maa/onlinelog/group_2.271.843488555 Current log# 2 seq# 190 mem# 1: +FRA/maa/onlinelog/group_2.285.843488555 Tue Jun 10 06:23:03 2014 Archived Log entry 989 added for thread 1 sequence 189 ID 0x2b64d2eb dest 1: Tue Jun 10 06:23:06 2014 Begin automatic SQL Tuning Advisor run for special tuning task "SYS_AUTO_SQL_TUNING_TASK" Tue Jun 10 06:23:07 2014 Shutting down archive processes ARCH shutting down ARC4: Archival stopped End automatic SQL Tuning Advisor run for special tuning task "SYS_AUTO_SQL_TUNING_TASK" Thread 1 cannot allocate new log, sequence 191 Checkpoint not complete Current log# 2 seq# 190 mem# 0: +DATA/maa/onlinelog/group_2.271.843488555 Current log# 2 seq# 190 mem# 1: +FRA/maa/onlinelog/group_2.285.843488555 Tue Jun 10 06:23:13 2014 Destination LOG_ARCHIVE_DEST_2 is SYNCHRONIZED LGWR: Standby redo logfile selected to archive thread 1 sequence 191
Novamente analisando com calma o log acima (destaquei pontos importantes em negrito) observamos que o processo começa com a detecção da indisponibilidade do primary e a garantia de que tudo está sincronizado. O último redo recebido é aberto e detecta-se um End-Of-Redo. Após isso, as 06:22:25, o standby cancela a aplicação dos redo que estava aplicando e realiza o switchover para primary. Como tudo está sincronizado o banco inicia o processo de open. Ainda assim, o log_archive_dest_2 não está sincronizado, mas quando o novo standby subir, tudo ficará sincronizado. O processo de finaliza as 06:22:49 logo após o open, o que vem após isso é a espera pelo novo standby estar disponível para sincronização.
Analisando abaixo temos o alertlog da outra instância do novo primary, a “maa2”. As informações contidas são mais simples, um simples abort e sincronizações com a nova standby (deste log somente cortei os erros de TNS/indisponibilidade do standby):
Tue Jun 10 06:22:25 2014 Managed Standby Recovery not using Real Time Apply Switchover: Complete - Database mounted as primary Tue Jun 10 06:22:32 2014 ALTER DATABASE OPEN Data Guard Broker initializing... Picked broadcast on commit scheme to generate SCNs Tue Jun 10 06:22:35 2014 LGWR: Primary database is in MAXIMUM AVAILABILITY mode Destination LOG_ARCHIVE_DEST_2 is UNSYNCHRONIZED LGWR: Destination LOG_ARCHIVE_DEST_1 is not serviced by LGWR Thread 2 advanced to log sequence 167 (thread open) Tue Jun 10 06:22:35 2014 ARC2: Becoming the 'no SRL' ARCH Thread 2 opened at log sequence 167 Current log# 4 seq# 167 mem# 0: +DATA/maa/onlinelog/group_4.262.843489103 Current log# 4 seq# 167 mem# 1: +FRA/maa/onlinelog/group_4.283.843489103 Successful open of redo thread 2 MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set Tue Jun 10 06:22:35 2014 SMON: enabling cache recovery Tue Jun 10 06:22:36 2014 ... ... ... Error 12514 received logging on to the standby PING[ARC1]: Heartbeat failed to connect to standby 'maastb'. Error is 12514. Archived Log entry 986 added for thread 2 sequence 166 ID 0x2b64d2eb dest 1: [3843] Successfully onlined Undo Tablespace 4. Undo initialization finished serial:0 start:385365424 end:385366584 diff:1160 (11 seconds) Verifying file header compatibility for 11g tablespace encryption.. Verifying 11g file header compatibility for tablespace encryption completed SMON: enabling tx recovery Database Characterset is WE8MSWIN1252 Tue Jun 10 06:22:41 2014 minact-scn: Inst 2 is a slave inc#:3 mmon proc-id:3731 status:0x2 minact-scn status: grec-scn:0x0000.00000000 gmin-scn:0x0000.00000000 gcalc-scn:0x0000.00000000 No Resource Manager plan active Tue Jun 10 06:22:44 2014 Starting background process GTX0 Tue Jun 10 06:22:44 2014 GTX0 started with pid=47, OS id=12888 Starting background process RCBG Tue Jun 10 06:22:44 2014 RCBG started with pid=48, OS id=12890 replication_dependency_tracking turned off (no async multimaster replication found) Starting background process QMNC Tue Jun 10 06:22:45 2014 QMNC started with pid=49, OS id=12892 Completed: ALTER DATABASE OPEN ALTER SYSTEM SET log_archive_trace=0 SCOPE=BOTH SID='maa2'; ALTER SYSTEM SET log_archive_format='arch_%t_%s_%r.arc' SCOPE=SPFILE SID='maa2'; Tue Jun 10 06:22:48 2014 ARC1: STARTING ARCH PROCESSES Tue Jun 10 06:22:48 2014 ARC4 started with pid=52, OS id=12903 Tue Jun 10 06:22:49 2014 ****************************************************************** LGWR: Setting 'active' archival for destination LOG_ARCHIVE_DEST_2 ****************************************************************** Tue Jun 10 06:22:49 2014 ... ... ... Error 12528 received logging on to the standby Error 12528 for archive log file 3 to 'maastb' LGWR: Failed to archive log 3 thread 2 sequence 168 (12528) Thread 2 advanced to log sequence 168 (LGWR switch) Current log# 3 seq# 168 mem# 0: +DATA/maa/onlinelog/group_3.257.843489101 Current log# 3 seq# 168 mem# 1: +FRA/maa/onlinelog/group_3.284.843489101 ARC4: Archival started ARC1: STARTING ARCH PROCESSES COMPLETE ********************************************************************* WARNING: Detected remote instance standby database network disconnect WARNING: Current instance standby database disconnect may be required ********************************************************************* Tue Jun 10 06:22:51 2014 Archived Log entry 987 added for thread 2 sequence 167 ID 0x2b64d2eb dest 1: Tue Jun 10 06:22:54 2014 Starting background process CJQ0 Tue Jun 10 06:22:54 2014 CJQ0 started with pid=53, OS id=12920 Thread 2 cannot allocate new log, sequence 169 Checkpoint not complete Current log# 3 seq# 168 mem# 0: +DATA/maa/onlinelog/group_3.257.843489101 Current log# 3 seq# 168 mem# 1: +FRA/maa/onlinelog/group_3.284.843489101 Tue Jun 10 06:22:58 2014 Setting Resource Manager plan DEFAULT_MAINTENANCE_PLAN via parameter Tue Jun 10 06:22:58 2014 Starting background process VKRM Tue Jun 10 06:22:58 2014 VKRM started with pid=54, OS id=12930 Tue Jun 10 06:22:59 2014 LGWR: Standby redo logfile selected to archive thread 2 sequence 169 LGWR: Standby redo logfile selected for thread 2 sequence 169 for destination LOG_ARCHIVE_DEST_2 Thread 2 advanced to log sequence 169 (LGWR switch) Current log# 4 seq# 169 mem# 0: +DATA/maa/onlinelog/group_4.262.843489103 Current log# 4 seq# 169 mem# 1: +FRA/maa/onlinelog/group_4.283.843489103 Tue Jun 10 06:23:05 2014 Archived Log entry 990 added for thread 2 sequence 168 ID 0x2b64d2eb dest 1: Tue Jun 10 06:23:08 2014 Errors in file /u01/app/oracle/diag/rdbms/maa/maa2/trace/maa2_j001_12934.trc: ORA-12012: error on auto execute of job "ORACLE_OCM"."MGMT_CONFIG_JOB_2_2" ORA-29280: invalid directory path ORA-06512: at "ORACLE_OCM.MGMT_DB_LL_METRICS", line 2436 ORA-06512: at line 1 Tue Jun 10 06:23:12 2014 Thread 2 cannot allocate new log, sequence 170 Checkpoint not complete Current log# 4 seq# 169 mem# 0: +DATA/maa/onlinelog/group_4.262.843489103 Current log# 4 seq# 169 mem# 1: +FRA/maa/onlinelog/group_4.283.843489103 Destination LOG_ARCHIVE_DEST_2 is SYNCHRONIZED LGWR: Standby redo logfile selected to archive thread 2 sequence 170 LGWR: Standby redo logfile selected for thread 2 sequence 170 for destination LOG_ARCHIVE_DEST_2 Thread 2 advanced to log sequence 170 (LGWR switch) Current log# 3 seq# 170 mem# 0: +DATA/maa/onlinelog/group_3.257.843489101 Current log# 3 seq# 170 mem# 1: +FRA/maa/onlinelog/group_3.284.843489101 Tue Jun 10 06:23:16 2014 Archived Log entry 1002 added for thread 2 sequence 169 ID 0x2b64d2eb dest 1: Tue Jun 10 06:23:22 2014 Thread 2 cannot allocate new log, sequence 171 Checkpoint not complete Current log# 3 seq# 170 mem# 0: +DATA/maa/onlinelog/group_3.257.843489101 Current log# 3 seq# 170 mem# 1: +FRA/maa/onlinelog/group_3.284.843489101 LGWR: Standby redo logfile selected to archive thread 2 sequence 171 LGWR: Standby redo logfile selected for thread 2 sequence 171 for destination LOG_ARCHIVE_DEST_2 Thread 2 advanced to log sequence 171 (LGWR switch) Current log# 4 seq# 171 mem# 0: +DATA/maa/onlinelog/group_4.262.843489103 Current log# 4 seq# 171 mem# 1: +FRA/maa/onlinelog/group_4.283.843489103 Tue Jun 10 06:24:17 2014 ALTER SYSTEM SET log_archive_dest_state_2='ENABLE' SCOPE=BOTH; Tue Jun 10 06:27:49 2014 Starting background process SMCO Tue Jun 10 06:27:49 2014 SMCO started with pid=39, OS id=13041
Vamos voltar ao alertlog do banco “maastb” que parei a análise após o seu shutdown e que no ambiente atual este banco deve ser iniciado para atuar como standby a partir de agora. Abaixo a continuação do alertlog da instância “maastb2” que estava coordenando esta troca de papeis:
Tue Jun 10 03:03:28 2014 Instance shutdown complete ORA-1092 : opitsk aborting process Tue Jun 10 03:03:29 2014 Starting ORACLE instance (normal) ****************** Large Pages Information ***************** Total Shared Global Region in Large Pages = 0 KB (0%) Large Pages used by this instance: 0 (0 KB) Large Pages unused system wide = 0 (0 KB) (alloc incr 4096 KB) Large Pages configured system wide = 0 (0 KB) Large Page size = 2048 KB RECOMMENDATION: Total Shared Global Region size is 1026 MB. For optimal performance, prior to the next instance restart increase the number of unused Large Pages by atleast 513 2048 KB Large Pages (1026 MB) system wide to get 100% of the Shared Global Region allocated with Large pages *********************************************************** LICENSE_MAX_SESSION = 0 LICENSE_SESSIONS_WARNING = 0 Private Interface 'eth2:1' configured from GPnP for use as a private interconnect. [name='eth2:1', type=1, ip=169.254.212.125, mac=00-0c-29-0a-ae-e2, net=169.254.0.0/16, mask=255.255.0.0, use=haip:cluster_interconnect/62] Public Interface 'eth0' configured from GPnP for use as a public interface. [name='eth0', type=1, ip=10.17.42.50, mac=00-0c-29-0a-ae-ce, net=10.17.42.0/26, mask=255.255.255.192, use=public/1] Public Interface 'eth0:1' configured from GPnP for use as a public interface. [name='eth0:1', type=1, ip=10.17.42.34, mac=00-0c-29-0a-ae-ce, net=10.17.42.0/26, mask=255.255.255.192, use=public/1] Public Interface 'eth0:3' configured from GPnP for use as a public interface. [name='eth0:3', type=1, ip=10.17.42.33, mac=00-0c-29-0a-ae-ce, net=10.17.42.0/26, mask=255.255.255.192, use=public/1] Public Interface 'eth0:5' configured from GPnP for use as a public interface. [name='eth0:5', type=1, ip=10.17.42.51, mac=00-0c-29-0a-ae-ce, net=10.17.42.0/26, mask=255.255.255.192, use=public/1] Picked latch-free SCN scheme 3 Autotune of undo retention is turned on. LICENSE_MAX_USERS = 0 SYS auditing is disabled Starting up: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production With the Partitioning, Real Application Clusters and Real Application Testing options. ORACLE_HOME = /u01/app/oracle/product/11.2.0.3/db_1 System name: Linux Node name: rac11stb02.tjsc.jus.br Release: 2.6.39-400.17.1.el6uek.x86_64 Version: #1 SMP Fri Feb 22 18:16:18 PST 2013 Machine: x86_64 VM name: VMWare Version: 6 Using parameter settings in server-side pfile /u01/app/oracle/product/11.2.0.3/db_1/dbs/initmaastb2.ora System parameters with non-default values: processes = 150 spfile = "+DG01/maastb/spfilemaastb.ora" sga_target = 1G control_files = "+DG01/maastb/controlfile/current.273.844715953" control_files = "+FRA/maastb/controlfile/current.579.844715953" db_file_name_convert = "+DATA/maa" db_file_name_convert = "+DG01/maastb" db_file_name_convert = "+FRA/maa" db_file_name_convert = "+FRA/maastb" log_file_name_convert = "+DATA/maa" log_file_name_convert = "+DG01/maastb" log_file_name_convert = "+FRA/maa" log_file_name_convert = "+FRA/maastb" db_block_size = 8192 compatible = "11.2.0.0.0" log_archive_dest_1 = "LOCATION=USE_DB_RECOVERY_FILE_DEST VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=maastb" log_archive_dest_2 = "service="maa"" log_archive_dest_2 = "LGWR SYNC AFFIRM delay=0 optional compression=disable max_failure=0 max_connections=1 reopen=300 db_unique_name="maa" net_timeout=30" log_archive_dest_2 = "valid_for=(all_logfiles,primary_role)" log_archive_dest_state_2 = "ENABLE" log_archive_min_succeed_dest= 1 fal_server = "maa" log_archive_trace = 0 log_archive_config = "DG_CONFIG=(maa,maastb)" log_archive_format = "arch_%t_%s_%r.arc" log_archive_max_processes= 4 archive_lag_target = 0 cluster_database = TRUE db_create_file_dest = "+DG01" db_recovery_file_dest = "+FRA" db_recovery_file_dest_size= 10G standby_file_management = "AUTO" thread = 2 undo_tablespace = "UNDOTBS2" instance_number = 2 remote_login_passwordfile= "EXCLUSIVE" db_domain = "" remote_listener = "rac11stb-scan.tjsc.jus.br:1521" audit_file_dest = "/u01/app/oracle/admin/maastb/adump" audit_trail = "DB" db_name = "maa" db_unique_name = "maastb" open_cursors = 300 pga_aggregate_target = 256M dg_broker_start = TRUE dg_broker_config_file1 = "+DG01/maastb/dr1maastb.dat" dg_broker_config_file2 = "+FRA/maastb/dr2maastb.dat" diagnostic_dest = "/u01/app/oracle" Cluster communication is configured to use the following interface(s) for this instance 169.254.212.125 cluster interconnect IPC version:Oracle UDP/IP (generic) IPC Vendor 1 proto 2 Tue Jun 10 03:03:31 2014 PMON started with pid=2, OS id=14478 Tue Jun 10 03:03:31 2014 PSP0 started with pid=3, OS id=14480 Tue Jun 10 03:03:32 2014 VKTM started with pid=4, OS id=14482 at elevated priority VKTM running at (1)millisec precision with DBRM quantum (100)ms Tue Jun 10 03:03:32 2014 GEN0 started with pid=5, OS id=14486 Tue Jun 10 03:03:32 2014 DIAG started with pid=6, OS id=14488 Tue Jun 10 03:03:32 2014 DBRM started with pid=7, OS id=14491 Tue Jun 10 03:03:32 2014 PING started with pid=8, OS id=14493 Tue Jun 10 03:03:32 2014 ACMS started with pid=9, OS id=14496 Tue Jun 10 03:03:32 2014 DIA0 started with pid=10, OS id=14498 Tue Jun 10 03:03:32 2014 LMON started with pid=11, OS id=14500 Tue Jun 10 03:03:33 2014 LMD0 started with pid=12, OS id=14502 * Load Monitor used for high load check * New Low - High Load Threshold Range = [1920 - 2560] Tue Jun 10 03:03:33 2014 LMS0 started with pid=13, OS id=14504 at elevated priority Tue Jun 10 03:03:33 2014 RMS0 started with pid=14, OS id=14508 Tue Jun 10 03:03:33 2014 MMAN started with pid=16, OS id=14512 Tue Jun 10 03:03:33 2014 LMHB started with pid=15, OS id=14510 Tue Jun 10 03:03:33 2014 DBW0 started with pid=17, OS id=14514 Tue Jun 10 03:03:33 2014 LGWR started with pid=18, OS id=14516 Tue Jun 10 03:03:33 2014 CKPT started with pid=19, OS id=14518 Tue Jun 10 03:03:33 2014 SMON started with pid=20, OS id=14520 Tue Jun 10 03:03:33 2014 RECO started with pid=21, OS id=14522 Tue Jun 10 03:03:33 2014 RBAL started with pid=22, OS id=14524 Tue Jun 10 03:03:33 2014 ASMB started with pid=23, OS id=14526 Tue Jun 10 03:03:33 2014 MMON started with pid=24, OS id=14528 Tue Jun 10 03:03:33 2014 MMNL started with pid=25, OS id=14532 lmon registered with NM - instance number 2 (internal mem no 1) NOTE: initiating MARK startup Starting background process MARK Tue Jun 10 03:03:33 2014 MARK started with pid=26, OS id=14534 NOTE: MARK has subscribed Reconfiguration started (old inc 0, new inc 2) List of instances: 2 (myinst: 2) Global Resource Directory frozen * allocate domain 0, invalid = TRUE Communication channels reestablished Master broadcasted resource hash value bitmaps Non-local Process blocks cleaned out LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived Set master node info Submitted all remote-enqueue requests Dwn-cvts replayed, VALBLKs dubious All grantable enqueues granted Post SMON to start 1st pass IR Submitted all GCS remote-cache requests Post SMON to start 1st pass IR Fix write in gcs resources Reconfiguration complete Tue Jun 10 03:03:34 2014 LCK0 started with pid=28, OS id=14540 Starting background process RSMN Tue Jun 10 03:03:34 2014 RSMN started with pid=29, OS id=14542 ORACLE_BASE from environment = /u01/app/oracle Tue Jun 10 03:03:34 2014 DMON started with pid=30, OS id=14546 Tue Jun 10 03:03:35 2014 alter database mount This instance was first to mount Tue Jun 10 03:03:35 2014 ALTER SYSTEM SET local_listener='(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=10.17.42.51)(PORT=1521))))' SCOPE=MEMORY SID='maastb2'; NOTE: Loaded library: /opt/oracle/extapi/64/asm/orcl/1/libasm.so NOTE: Loaded library: System SUCCESS: diskgroup DG01 was mounted NOTE: dependency between database maastb and diskgroup resource ora.DG01.dg is established SUCCESS: diskgroup FRA was mounted NOTE: dependency between database maastb and diskgroup resource ora.FRA.dg is established Tue Jun 10 03:03:42 2014 NSS2 started with pid=31, OS id=14576 ARCH: STARTING ARCH PROCESSES Tue Jun 10 03:03:42 2014 ARC0 started with pid=34, OS id=14578 ARC0: Archival started ARCH: STARTING ARCH PROCESSES COMPLETE ARC0: STARTING ARCH PROCESSES Tue Jun 10 03:03:43 2014 ARC1 started with pid=35, OS id=14580 Tue Jun 10 03:03:43 2014 ARC2 started with pid=36, OS id=14582 Tue Jun 10 03:03:43 2014 ARC1: Archival started ARC2: Archival started ARC3 started with pid=37, OS id=14584 ARC1: Becoming the 'no FAL' ARCH ARC1: Becoming the 'no SRL' ARCH ARC1: Thread not mounted ARC2: Becoming the heartbeat ARCH ARC2: Thread not mounted Tue Jun 10 03:03:43 2014 Successful mount of redo thread 2, with mount id 728394567 Allocated 3981120 bytes in shared pool for flashback generation buffer Starting background process RVWR Tue Jun 10 03:03:43 2014 RVWR started with pid=38, OS id=14586 Physical Standby Database mounted. Lost write protection disabled ARC2: Becoming the active heartbeat ARCH ARC2: Becoming the active heartbeat ARCH ARC3: Archival started ARC0: STARTING ARCH PROCESSES COMPLETE Completed: alter database mount Tue Jun 10 03:03:47 2014 Starting Data Guard Broker (DMON) Tue Jun 10 03:03:47 2014 INSV started with pid=40, OS id=14604 Tue Jun 10 03:03:50 2014 Using STANDBY_ARCHIVE_DEST parameter default value as USE_DB_RECOVERY_FILE_DEST Tue Jun 10 03:03:52 2014 NSV0 started with pid=43, OS id=14622 Tue Jun 10 03:03:52 2014 Primary database is in MAXIMUM AVAILABILITY mode Changing standby controlfile to RESYNCHRONIZATION level Standby controlfile consistent with primary RFS[1]: Assigned to RFS process 14624 RFS[1]: Selected log 8 for thread 2 sequence 169 dbid 722024964 branch 847284763 Tue Jun 10 03:03:54 2014 Primary database is in MAXIMUM AVAILABILITY mode Standby controlfile consistent with primary Standby controlfile consistent with primary RFS[2]: Assigned to RFS process 14626 RFS[2]: Selected log 5 for thread 1 sequence 190 dbid 722024964 branch 847284763 Tue Jun 10 03:03:58 2014 RFS[3]: Assigned to RFS process 14628 RFS[3]: Opened log for thread 2 sequence 164 dbid 722024964 branch 847284763 Tue Jun 10 03:03:58 2014 RFS[4]: Assigned to RFS process 14630 RFS[4]: Opened log for thread 2 sequence 165 dbid 722024964 branch 847284763 Tue Jun 10 03:04:01 2014 RFS[5]: Assigned to RFS process 14606 RFS[5]: Opened log for thread 1 sequence 187 dbid 722024964 branch 847284763 Tue Jun 10 03:04:01 2014 RFS[6]: Assigned to RFS process 14612 RFS[6]: Opened log for thread 2 sequence 166 dbid 722024964 branch 847284763 Archived Log entry 1025 added for thread 2 sequence 165 rlc 847284763 ID 0x0 dest 2: Tue Jun 10 03:04:01 2014 RFS[7]: Assigned to RFS process 14632 RFS[7]: Opened log for thread 1 sequence 188 dbid 722024964 branch 847284763 Archived Log entry 1026 added for thread 2 sequence 164 rlc 847284763 ID 0x0 dest 2: RFS[4]: Opened log for thread 2 sequence 167 dbid 722024964 branch 847284763 RFS[3]: Opened log for thread 2 sequence 168 dbid 722024964 branch 847284763 Archived Log entry 1027 added for thread 1 sequence 187 rlc 847284763 ID 0x2b64d2eb dest 2: Archived Log entry 1028 added for thread 2 sequence 166 rlc 847284763 ID 0x2b64d2eb dest 2: Archived Log entry 1029 added for thread 1 sequence 188 rlc 847284763 ID 0x2b64d2eb dest 2: Tue Jun 10 03:04:03 2014 RSM0 started with pid=54, OS id=14652 RFS[5]: Opened log for thread 1 sequence 189 dbid 722024964 branch 847284763 Archived Log entry 1030 added for thread 2 sequence 167 rlc 847284763 ID 0x2b64d2eb dest 2: Archived Log entry 1031 added for thread 2 sequence 168 rlc 847284763 ID 0x2b64d2eb dest 2: Archived Log entry 1032 added for thread 1 sequence 189 rlc 847284763 ID 0x2b64d2eb dest 2: Tue Jun 10 03:04:07 2014 Changing standby controlfile to MAXIMUM AVAILABILITY level ALTER SYSTEM SET log_archive_dest_state_1='ENABLE' SCOPE=BOTH; ALTER SYSTEM SET log_archive_trace=0 SCOPE=BOTH SID='maastb2'; ALTER SYSTEM SET log_archive_format='arch_%t_%s_%r.arc' SCOPE=SPFILE SID='maastb2'; RFS[2]: Selected log 6 for thread 1 sequence 191 dbid 722024964 branch 847284763 ALTER SYSTEM SET standby_file_management='AUTO' SCOPE=BOTH SID='*'; ALTER SYSTEM SET archive_lag_target=0 SCOPE=BOTH SID='*'; Tue Jun 10 03:04:08 2014 ALTER SYSTEM SET log_archive_max_processes=4 SCOPE=BOTH SID='*';Archived Log entry 1033 added for thread 1 sequence 190 ID 0x2b64d2eb dest 1: ALTER SYSTEM SET log_archive_min_succeed_dest=1 SCOPE=BOTH SID='*'; ALTER SYSTEM SET db_file_name_convert='+DATA/maa','+DG01/maastb','+FRA/maa','+FRA/maastb' SCOPE=SPFILE; ALTER SYSTEM SET log_file_name_convert='+DATA/maa','+DG01/maastb','+FRA/maa','+FRA/maastb' SCOPE=SPFILE; ALTER SYSTEM SET fal_server='maa' SCOPE=BOTH; ALTER DATABASE RECOVER MANAGED STANDBY DATABASE THROUGH ALL SWITCHOVER DISCONNECT USING CURRENT LOGFILE Attempt to start background Managed Standby Recovery process (maastb2) Tue Jun 10 03:04:08 2014 MRP0 started with pid=55, OS id=14655 MRP0: Background Managed Standby Recovery process started (maastb2) Tue Jun 10 03:04:09 2014 Standby controlfile consistent with primary RFS[1]: Selected log 9 for thread 2 sequence 170 dbid 722024964 branch 847284763 Tue Jun 10 03:04:09 2014 Archived Log entry 1034 added for thread 2 sequence 169 ID 0x2b64d2eb dest 1: started logmerger process Tue Jun 10 03:04:13 2014 Managed Standby Recovery starting Real Time Apply Parallel Media Recovery started with 2 slaves Waiting for all non-current ORLs to be archived... All non-current ORLs have been archived. Clearing online redo logfile 1 +DG01/maastb/onlinelog/group_1.257.844716051 Clearing online log 1 of thread 1 sequence number 190 Tue Jun 10 03:04:15 2014 Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE THROUGH ALL SWITCHOVER DISCONNECT USING CURRENT LOGFILE Tue Jun 10 03:04:21 2014 Standby controlfile consistent with primary RFS[1]: Selected log 8 for thread 2 sequence 171 dbid 722024964 branch 847284763 Tue Jun 10 03:04:21 2014 Clearing online redo logfile 1 complete Clearing online redo logfile 2 +DG01/maastb/onlinelog/group_2.268.844716057 Tue Jun 10 03:04:21 2014 Archived Log entry 1035 added for thread 2 sequence 170 ID 0x2b64d2eb dest 1: Clearing online log 2 of thread 1 sequence number 191 Tue Jun 10 03:04:24 2014 Standby controlfile consistent with primary RFS[2]: Selected log 5 for thread 1 sequence 192 dbid 722024964 branch 847284763 Tue Jun 10 03:04:25 2014 Archived Log entry 1036 added for thread 1 sequence 191 ID 0x2b64d2eb dest 1: Clearing online redo logfile 2 complete Clearing online redo logfile 3 +DG01/maastb/onlinelog/group_3.269.844716063 Clearing online log 3 of thread 2 sequence number 171 Tue Jun 10 03:04:35 2014 Clearing online redo logfile 3 complete Clearing online redo logfile 4 +DG01/maastb/onlinelog/group_4.270.844716067 Clearing online log 4 of thread 2 sequence number 171 Clearing online redo logfile 4 complete Tue Jun 10 03:04:42 2014 Media Recovery Log +FRA/maastb/archivelog/2014_06_10/thread_2_seq_162.709.849841395 Media Recovery Log +FRA/maastb/archivelog/2014_06_09/thread_1_seq_183.717.849834887 Tue Jun 10 03:04:45 2014 Reconfiguration started (old inc 2, new inc 4) List of instances: 1 2 (myinst: 2) Global Resource Directory frozen Communication channels reestablished Tue Jun 10 03:04:45 2014 * domain 0 valid = 0 according to instance 1 Master broadcasted resource hash value bitmaps Non-local Process blocks cleaned out Tue Jun 10 03:04:45 2014 LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived Set master node info Submitted all remote-enqueue requests Dwn-cvts replayed, VALBLKs dubious All grantable enqueues granted Post SMON to start 1st pass IR Submitted all GCS remote-cache requests Post SMON to start 1st pass IR Fix write in gcs resources Reconfiguration complete Media Recovery Log +FRA/maastb/archivelog/2014_06_09/thread_1_seq_184.713.849834919 Tue Jun 10 03:04:56 2014 Media Recovery Log +FRA/maastb/archivelog/2014_06_10/thread_1_seq_185.710.849841391 Media Recovery Log +FRA/maastb/archivelog/2014_06_10/thread_1_seq_186.707.849841395 Media Recovery Log +FRA/maastb/archivelog/2014_06_10/thread_2_seq_163.706.849841397 Identified End-Of-Redo (switchover) for thread 2 sequence 163 at SCN 0x0.aff3c1 Resetting standby activation ID 0 (0x0) Tue Jun 10 03:05:07 2014 Media Recovery End-Of-Redo indicator encountered Media Recovery Continuing Media Recovery Log +FRA/maastb/archivelog/2014_06_10/thread_2_seq_164.704.849841439 Media Recovery Log +FRA/maastb/archivelog/2014_06_10/thread_2_seq_165.702.849841439 Media Recovery Log +FRA/maastb/archivelog/2014_06_10/thread_1_seq_187.700.849841441 Media Recovery Log +FRA/maastb/archivelog/2014_06_10/thread_2_seq_166.696.849841441 Media Recovery Log +FRA/maastb/archivelog/2014_06_10/thread_1_seq_188.481.849841441 Media Recovery Log +FRA/maastb/archivelog/2014_06_10/thread_2_seq_167.693.849841443 Tue Jun 10 03:05:20 2014 Media Recovery Log +FRA/maastb/archivelog/2014_06_10/thread_2_seq_168.691.849841443 Media Recovery Log +FRA/maastb/archivelog/2014_06_10/thread_1_seq_189.689.849841443 Media Recovery Log +FRA/maastb/archivelog/2014_06_10/thread_2_seq_169.444.849841449 Media Recovery Log +FRA/maastb/archivelog/2014_06_10/thread_1_seq_190.685.849841447 Tue Jun 10 03:05:31 2014 Media Recovery Log +FRA/maastb/archivelog/2014_06_10/thread_1_seq_191.681.849841465 Media Recovery Log +FRA/maastb/archivelog/2014_06_10/thread_2_seq_170.682.849841461 Media Recovery Waiting for thread 2 sequence 171 (in transit) Recovery of Online Redo Log: Thread 2 Group 8 Seq 171 Reading mem 0 Mem# 0: +DG01/maastb/onlinelog/group_8.261.844716089 Mem# 1: +FRA/maastb/onlinelog/group_8.611.844716093 Media Recovery Waiting for thread 1 sequence 192 (in transit) Recovery of Online Redo Log: Thread 1 Group 5 Seq 192 Reading mem 0 Mem# 0: +DG01/maastb/onlinelog/group_5.271.844716073 Mem# 1: +FRA/maastb/onlinelog/group_5.553.844716075
Acima temos alguns detalhes interessantes, começou com um o startup de uma instância do banco maastb mas neste caso o banco já inicia e fica comente em modo mount. Depois disso o processo do Broker inicializa e seta vários parâmetros, detecta o modo que o primary opera. Por fim, o novo standby inicia a sincronização de redo’s e efetivamente começa a operar com um physical standby. Os redo’s são aplicados (incluindo o End-Of-Redo) e tudo fica sincronizado.
Se você notou acima, durante o processo de startup (após o comando do switchover) a outra instância foi inciada. Isso ocorre pois o Broker deixa iniciada todas as instâncias que compõem o Oracle RAC do standby. Abaixo temos o alertlog desta outra instância do cluster, basicamente comandos simples (um simples mount), pois o serviço pesado já foi feito pela primeria instância que iniciou:
Tue Jun 10 03:03:00 2014 Instance shutdown complete Tue Jun 10 03:04:35 2014 Starting ORACLE instance (normal) ****************** Large Pages Information ***************** Total Shared Global Region in Large Pages = 0 KB (0%) Large Pages used by this instance: 0 (0 KB) Large Pages unused system wide = 0 (0 KB) (alloc incr 4096 KB) Large Pages configured system wide = 0 (0 KB) Large Page size = 2048 KB RECOMMENDATION: Total Shared Global Region size is 1026 MB. For optimal performance, prior to the next instance restart increase the number of unused Large Pages by atleast 513 2048 KB Large Pages (1026 MB) system wide to get 100% of the Shared Global Region allocated with Large pages *********************************************************** LICENSE_MAX_SESSION = 0 LICENSE_SESSIONS_WARNING = 0 Private Interface 'eth2:1' configured from GPnP for use as a private interconnect. [name='eth2:1', type=1, ip=169.254.172.112, mac=00-0c-29-3a-65-3a, net=169.254.0.0/16, mask=255.255.0.0, use=haip:cluster_interconnect/62] Public Interface 'eth0' configured from GPnP for use as a public interface. [name='eth0', type=1, ip=10.17.42.48, mac=00-0c-29-3a-65-26, net=10.17.42.0/26, mask=255.255.255.192, use=public/1] Public Interface 'eth0:1' configured from GPnP for use as a public interface. [name='eth0:1', type=1, ip=10.17.42.52, mac=00-0c-29-3a-65-26, net=10.17.42.0/26, mask=255.255.255.192, use=public/1] Public Interface 'eth0:2' configured from GPnP for use as a public interface. [name='eth0:2', type=1, ip=10.17.42.35, mac=00-0c-29-3a-65-26, net=10.17.42.0/26, mask=255.255.255.192, use=public/1] Picked latch-free SCN scheme 3 Autotune of undo retention is turned on. LICENSE_MAX_USERS = 0 SYS auditing is disabled Starting up: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production With the Partitioning, Real Application Clusters and Real Application Testing options. ORACLE_HOME = /u01/app/oracle/product/11.2.0.3/db_1 System name: Linux Node name: rac11stb01.tjsc.jus.br Release: 2.6.39-400.17.1.el6uek.x86_64 Version: #1 SMP Fri Feb 22 18:16:18 PST 2013 Machine: x86_64 VM name: VMWare Version: 6 Using parameter settings in server-side pfile /u01/app/oracle/product/11.2.0.3/db_1/dbs/initmaastb1.ora System parameters with non-default values: processes = 150 spfile = "+DG01/maastb/spfilemaastb.ora" sga_target = 1G control_files = "+DG01/maastb/controlfile/current.273.844715953" control_files = "+FRA/maastb/controlfile/current.579.844715953" db_file_name_convert = "+DATA/maa" db_file_name_convert = "+DG01/maastb" db_file_name_convert = "+FRA/maa" db_file_name_convert = "+FRA/maastb" log_file_name_convert = "+DATA/maa" log_file_name_convert = "+DG01/maastb" log_file_name_convert = "+FRA/maa" log_file_name_convert = "+FRA/maastb" db_block_size = 8192 compatible = "11.2.0.0.0" log_archive_dest_1 = "LOCATION=USE_DB_RECOVERY_FILE_DEST VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=maastb" log_archive_dest_2 = "service="maa"" log_archive_dest_2 = "LGWR SYNC AFFIRM delay=0 optional compression=disable max_failure=0 max_connections=1 reopen=300 db_unique_name="maa" net_timeout=30" log_archive_dest_2 = "valid_for=(all_logfiles,primary_role)" log_archive_dest_state_1 = "ENABLE" log_archive_dest_state_2 = "ENABLE" log_archive_min_succeed_dest= 1 fal_server = "maa" log_archive_trace = 0 log_archive_config = "DG_CONFIG=(maa,maastb)" log_archive_format = "arch_%t_%s_%r.arc" log_archive_max_processes= 4 archive_lag_target = 0 cluster_database = TRUE db_create_file_dest = "+DG01" db_recovery_file_dest = "+FRA" db_recovery_file_dest_size= 10G standby_file_management = "AUTO" thread = 1 undo_tablespace = "UNDOTBS1" instance_number = 1 remote_login_passwordfile= "EXCLUSIVE" db_domain = "" remote_listener = "rac11stb-scan.tjsc.jus.br:1521" audit_file_dest = "/u01/app/oracle/admin/maastb/adump" audit_trail = "DB" db_name = "maa" db_unique_name = "maastb" open_cursors = 300 pga_aggregate_target = 256M dg_broker_start = TRUE dg_broker_config_file1 = "+DG01/maastb/dr1maastb.dat" dg_broker_config_file2 = "+FRA/maastb/dr2maastb.dat" diagnostic_dest = "/u01/app/oracle" Cluster communication is configured to use the following interface(s) for this instance 169.254.172.112 cluster interconnect IPC version:Oracle UDP/IP (generic) IPC Vendor 1 proto 2 Tue Jun 10 03:04:41 2014 PMON started with pid=2, OS id=29428 Tue Jun 10 03:04:41 2014 PSP0 started with pid=3, OS id=29430 Tue Jun 10 03:04:42 2014 VKTM started with pid=4, OS id=29432 at elevated priority VKTM running at (1)millisec precision with DBRM quantum (100)ms Tue Jun 10 03:04:42 2014 GEN0 started with pid=5, OS id=29436 Tue Jun 10 03:04:42 2014 DIAG started with pid=6, OS id=29438 Tue Jun 10 03:04:42 2014 DBRM started with pid=7, OS id=29440 Tue Jun 10 03:04:43 2014 PING started with pid=8, OS id=29442 Tue Jun 10 03:04:43 2014 ACMS started with pid=9, OS id=29444 Tue Jun 10 03:04:43 2014 DIA0 started with pid=10, OS id=29446 Tue Jun 10 03:04:43 2014 LMON started with pid=11, OS id=29448 Tue Jun 10 03:04:43 2014 LMD0 started with pid=12, OS id=29450 * Load Monitor used for high load check * New Low - High Load Threshold Range = [1920 - 2560] Tue Jun 10 03:04:43 2014 LMS0 started with pid=13, OS id=29452 at elevated priority Tue Jun 10 03:04:43 2014 RMS0 started with pid=14, OS id=29456 Tue Jun 10 03:04:43 2014 LMHB started with pid=15, OS id=29458 Tue Jun 10 03:04:43 2014 MMAN started with pid=16, OS id=29460 Tue Jun 10 03:04:43 2014 DBW0 started with pid=17, OS id=29462 Tue Jun 10 03:04:43 2014 LGWR started with pid=18, OS id=29464 Tue Jun 10 03:04:43 2014 CKPT started with pid=19, OS id=29466 Tue Jun 10 03:04:43 2014 SMON started with pid=20, OS id=29468 Tue Jun 10 03:04:43 2014 RECO started with pid=21, OS id=29470 Tue Jun 10 03:04:43 2014 RBAL started with pid=22, OS id=29472 Tue Jun 10 03:04:43 2014 ASMB started with pid=23, OS id=29474 Tue Jun 10 03:04:43 2014 MMON started with pid=24, OS id=29476 Tue Jun 10 03:04:43 2014 MMNL started with pid=25, OS id=29478 lmon registered with NM - instance number 1 (internal mem no 0) NOTE: initiating MARK startup Starting background process MARK Tue Jun 10 03:04:44 2014 MARK started with pid=26, OS id=29483 NOTE: MARK has subscribed Reconfiguration started (old inc 0, new inc 4) List of instances: 1 2 (myinst: 1) Global Resource Directory frozen * allocate domain 0, invalid = TRUE Communication channels reestablished * domain 0 not valid according to instance 2 * domain 0 valid = 0 according to instance 2 Master broadcasted resource hash value bitmaps Non-local Process blocks cleaned out LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived Set master node info Submitted all remote-enqueue requests Dwn-cvts replayed, VALBLKs dubious All grantable enqueues granted Post SMON to start 1st pass IR Submitted all GCS remote-cache requests Post SMON to start 1st pass IR Fix write in gcs resources Reconfiguration complete Tue Jun 10 03:04:45 2014 LCK0 started with pid=28, OS id=29489 Tue Jun 10 03:04:45 2014 Starting background process RSMN Tue Jun 10 03:04:46 2014 RSMN started with pid=29, OS id=29491 ORACLE_BASE not set in environment. It is recommended that ORACLE_BASE be set in the environment Reusing ORACLE_BASE from an earlier startup = /u01/app/oracle Tue Jun 10 03:04:46 2014 DMON started with pid=30, OS id=29493 Tue Jun 10 03:04:47 2014 ALTER SYSTEM SET local_listener='(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=10.17.42.52)(PORT=1521))))' SCOPE=MEMORY SID='maastb1'; ALTER DATABASE MOUNT /* db agent *//* {2:12604:3042} */ NOTE: Loaded library: /opt/oracle/extapi/64/asm/orcl/1/libasm.so NOTE: Loaded library: System SUCCESS: diskgroup DG01 was mounted NOTE: dependency between database maastb and diskgroup resource ora.DG01.dg is established SUCCESS: diskgroup FRA was mounted NOTE: dependency between database maastb and diskgroup resource ora.FRA.dg is established ARCH: STARTING ARCH PROCESSES Tue Jun 10 03:04:53 2014 NSS2 started with pid=33, OS id=29514 Tue Jun 10 03:04:53 2014 ARC0 started with pid=34, OS id=29516 ARC0: Archival started ARCH: STARTING ARCH PROCESSES COMPLETE ARC0: STARTING ARCH PROCESSES Tue Jun 10 03:04:54 2014 ARC1 started with pid=35, OS id=29521 Tue Jun 10 03:04:54 2014 ARC2 started with pid=36, OS id=29523 Tue Jun 10 03:04:54 2014 ARC3 started with pid=37, OS id=29525 ARC1: Archival started ARC2: Archival started ARC2: Becoming the 'no FAL' ARCH ARC2: Becoming the 'no SRL' ARCH ARC2: Thread not mounted ARC1: Becoming the heartbeat ARCH Tue Jun 10 03:04:54 2014 Successful mount of redo thread 1, with mount id 728394567 Allocated 3981120 bytes in shared pool for flashback generation buffer Starting background process RVWR Tue Jun 10 03:04:54 2014 RVWR started with pid=38, OS id=29527 ARC3: Archival started ARC0: STARTING ARCH PROCESSES COMPLETE Physical Standby Database mounted. Lost write protection disabled Completed: ALTER DATABASE MOUNT /* db agent *//* {2:12604:3042} */ Tue Jun 10 03:04:59 2014 Starting Data Guard Broker (DMON) Tue Jun 10 03:04:59 2014 INSV started with pid=41, OS id=29542 Tue Jun 10 03:05:03 2014 NSV0 started with pid=40, OS id=29552 Tue Jun 10 03:05:11 2014 RSM0 started with pid=44, OS id=29558 Using STANDBY_ARCHIVE_DEST parameter default value as USE_DB_RECOVERY_FILE_DEST ALTER SYSTEM SET log_archive_trace=0 SCOPE=BOTH SID='maastb1'; ALTER SYSTEM SET log_archive_format='arch_%t_%s_%r.arc' SCOPE=SPFILE SID='maastb1';
De forma simples e com um único comando o switchover foi realizado através do Broker. Comparando com o artigo anterior sobre switchover os passos foram mais simples, mas claro que dependenmos de um Broker configurado e de uma sincronia entre as primary e standby operando em plenitude.
Status do ambiente #2
Como visto acima a troca de papeis entre o primary e standby foi concluída com sucesso. Observe abaixo no Broker que a sincronia entre eles está correta e que não existe nenhum gap entre eles:
DGMGRL> SHOW CONFIGURATION; Configuration - dgrac Protection Mode: MaxAvailability Databases: maa - Primary database maastb - Physical standby database Fast-Start Failover: DISABLED Configuration Status: SUCCESS DGMGRL> DGMGRL> SHOW DATABASE maa; Database - maa Role: PRIMARY Intended State: TRANSPORT-ON Instance(s): maa1 maa2 Database Status: SUCCESS DGMGRL> DGMGRL> SHOW DATABASE maa LogXptStatus; LOG TRANSPORT STATUS PRIMARY_INSTANCE_NAME STANDBY_DATABASE_NAME STATUS maa1 maastb maa2 maastb DGMGRL>
Na nossa tabela de controle (observe o hostname):
[oracle@rac11pri01 ~]$ sqlplus / as sysdba SQL*Plus: Release 11.2.0.3.0 Production on Tue Jun 10 00:34:53 2014 Copyright (c) 1982, 2011, Oracle. All rights reserved. Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production With the Partitioning, Real Application Clusters and Automatic Storage Management options SQL> SELECT * FROM testedg; no rows selected SQL>
AMBIENTE FINAL
Acredito que tenha ficado claro que através do Broker a troca de papeis pelo switchover foi bem mais simples. Com um único comando ocorreu a troca sem que tivéssemos que executar manualmente o “cancel” da sincronia e “abort” das bases. Indo além, não nos preocupamos com nada, o Broker fez tudo por nós.
Claro que nem tudo são flores, o Broker não corrigiu o CRS completamente, o novo primary ainda está definido para subir em modo mount e o novo standby aparece no CRS para ficar em open. Já corrigimos isso em outros artigos, mas os passo seriam (na primary):
[oracle@rac11pri01 ~]$ srvctl config database -d maa -v Database unique name: maa Database name: maa Oracle home: /u01/app/oracle/product/11.2.0.3/db_1 Oracle user: oracle Spfile: +DATA/maa/spfilemaa.ora Domain: Start options: mount Stop options: immediate Database role: PRIMARY Management policy: AUTOMATIC Server pools: maa Database instances: maa1,maa2 Disk Groups: DATA,FRA Mount point paths: Services: Type: RAC Database is administrator managed [oracle@rac11pri01 ~]$ [oracle@rac11pri01 ~]$ [oracle@rac11pri01 ~]$ [oracle@rac11pri01 ~]$ srvctl modify database -d maa -s OPEN [oracle@rac11pri01 ~]$ [oracle@rac11pri01 ~]$ [oracle@rac11pri01 ~]$ srvctl config database -d maa -v Database unique name: maa Database name: maa Oracle home: /u01/app/oracle/product/11.2.0.3/db_1 Oracle user: oracle Spfile: +DATA/maa/spfilemaa.ora Domain: Start options: open Stop options: immediate Database role: PRIMARY Management policy: AUTOMATIC Server pools: maa Database instances: maa1,maa2 Disk Groups: DATA,FRA Mount point paths: Services: Type: RAC Database is administrator managed [oracle@rac11pri01 ~]$
E no standby:
[oracle@rac11stb01 ~]$ srvctl config database -d maastb -v Database unique name: maastb Database name: Oracle home: /u01/app/oracle/product/11.2.0.3/db_1 Oracle user: oracle Spfile: Domain: Start options: open Stop options: immediate Database role: PHYSICAL_STANDBY Management policy: AUTOMATIC Server pools: maastb Database instances: maastb1,maastb2 Disk Groups: DG01,FRA Mount point paths: Services: Type: RAC Database is administrator managed [oracle@rac11stb01 ~]$ [oracle@rac11stb01 ~]$ [oracle@rac11stb01 ~]$ srvctl modify database -d maastb -s MOUNT [oracle@rac11stb01 ~]$ [oracle@rac11stb01 ~]$ [oracle@rac11stb01 ~]$ srvctl config database -d maastb -v Database unique name: maastb Database name: Oracle home: /u01/app/oracle/product/11.2.0.3/db_1 Oracle user: oracle Spfile: Domain: Start options: mount Stop options: immediate Database role: PHYSICAL_STANDBY Management policy: AUTOMATIC Server pools: maastb Database instances: maastb1,maastb2 Disk Groups: DG01,FRA Mount point paths: Services: Type: RAC Database is administrator managed [oracle@rac11stb01 ~]$
Pingback: Oracle e MAA – Artigo VIII – Fast-Start Failover | Have you hugged your backup today?
Pingback: Oracle e MAA – Artigo X | Blog Fernando Simon