Grid patching was failing with "Failed to bring down CRS service" (Clean of 'ora.gipcd' on 'host' failed)

We were trying to patch our GI home with the Oct 2020 PSU, but the patch would not proceed: it failed every time it tried to bring CRS down.

Executing patch validation checks on home /oracle/app/12.2.0/grid
Patch validation checks successfully completed on home /oracle/app/12.2.0/grid

Checking shared status of home.....

Bringing down CRS service on home /oracle/app/12.2.0/grid
Prepatch operation log file location: /oracle/app/grid/crsdata/host/crsconfig/crspatch_host_2021-02-10_09-02-31PM.log
Failed to bring down CRS service on home /oracle/app/12.2.0/grid

Execution of [GIShutDownAction] patch action failed, check log for more details. Failures:
Patch Target : host->/oracle/app/12.2.0/grid Type[crs]


CRS-4000: Command Start failed, or completed with errors.
2021/02/10 21:07:02 CLSRSC-117: Failed to start Oracle Clusterware stack
 

We then tried to bring the cluster down manually, to see where exactly the clean shutdown of CRS was failing.

 root@host (B)<-->{/oracle/app/12.2.0/grid/bin}:$ ./crsctl stop crs -f
 CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'host'
 CRS-2679: Attempting to clean 'ora.gipcd' on 'host'
 CRS-2680: Clean of 'ora.gipcd' on 'host' failed
 CRS-2799: Failed to shut down resource 'ora.gipcd' on 'host'
 CRS-2795: Shutdown of Oracle High Availability Services-managed resources on 'host' has failed
 CRS-4687: Shutdown command has completed with errors.
 CRS-4000: Command Stop failed, or completed with errors.
 root@host (B)<-->{/oracle/app/12.2.0/grid/bin}:$ ./crsctl stat res -t
 CRS-4535: Cannot communicate with Cluster Ready Services
 CRS-4000: Command Status failed, or completed with errors. 

The forced shutdown showed that the gipcd component was what prevented the cluster from shutting down cleanly.

We checked the gipcd status with:

root@host (B)<-->{/oracle/app/12.2.0/grid/bin}:$ ./crsctl stat res -t -init
 Name           Target  State        Server                   State details
 Cluster Resources
 ora.asm
       1        ONLINE  OFFLINE                               STABLE
 ora.cluster_interconnect.haip
       1        ONLINE  OFFLINE                               STABLE
 ora.crf
       1        ONLINE  OFFLINE                               STABLE
 ora.crsd
       1        ONLINE  OFFLINE                               STABLE
 ora.cssd
       1        ONLINE  OFFLINE                               STABLE
 ora.cssdmonitor
       1        OFFLINE OFFLINE                               STABLE
 ora.ctssd
       1        ONLINE  OFFLINE                               STABLE
 ora.diskmon
       1        OFFLINE OFFLINE                               STABLE
 ora.drivers.acfs
       1        OFFLINE OFFLINE                               STABLE
 ora.evmd
       1        ONLINE  INTERMEDIATE host                     STABLE
 ora.gipcd
       1        ONLINE  UNKNOWN      host                     STABLE
 ora.gpnpd
       1        ONLINE  ONLINE       host                      STABLE
 ora.mdnsd
       1        ONLINE  ONLINE       host                      STABLE
 ora.storage
       1        ONLINE  OFFLINE                               STABLE

gipcd was in the UNKNOWN state, which is why the cluster could not shut down properly.
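Spotting the one UNKNOWN row in that listing by eye is easy to get wrong, so a small filter can help. The following is a minimal sketch, assuming the two-line layout shown above (a resource name line followed by an instance line); the function name `resources_in_state` is hypothetical and not part of crsctl.

```shell
#!/bin/sh
# Hypothetical helper: read `crsctl stat res -t -init` output on stdin and
# print the name of every resource whose State column equals the given value.
resources_in_state() {
    wanted_state=$1
    awk -v want="$wanted_state" '
        /^[ \t]*ora\./ { name = $1; next }   # resource name line, e.g. " ora.gipcd"
        # instance line: <number> <target> <state> [server] <details>
        name != "" && $1 ~ /^[0-9]+$/ && $3 == want { print name }
    '
}
```

It could be used as `./crsctl stat res -t -init | resources_in_state UNKNOWN`, which for the listing above would report ora.gipcd.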

We checked all the related logs to find out why it was in the UNKNOWN state:

/oracle/app/grid/diag/crs/host/crs/trace/ohasd.trc
/oracle/app/grid/diag/crs/host/crs/trace/alert
/oracle/app/grid/diag/crs/host/crs/trace/gipcd.trc
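To avoid paging through each file, the logs can be filtered for the CRS message codes that appear in this incident. This is only a sketch: the function name `scan_crs_errors` is hypothetical, and the code list (CRS-2680, CRS-2758, CRS-5828) is simply the set seen in this post, not a complete list of relevant errors.

```shell
#!/bin/sh
# Hypothetical helper: print lines carrying the CRS codes seen in this incident.
# Arguments: one or more log files.
scan_crs_errors() {
    # The extra /dev/null argument makes grep prefix every match with its
    # file name even when only one log file is passed.
    grep -E 'CRS-(2680|2758|5828):' "$@" /dev/null
}
```

For example: `scan_crs_errors /oracle/app/grid/diag/crs/host/crs/trace/ohasd.trc /oracle/app/grid/diag/crs/host/crs/trace/gipcd.trc`.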

The logs were showing:

2021-02-11 11:05:17.651 [OHASD(10617174)]CRS-5828: Could not start agent '/oracle/app/12.2.0/grid/bin/orarootagent'. Details at (:CRSAGF00123:) {0:0:2} in /oracle/app/grid/diag/crs/host/crs/trace/ohasd.trc.
 2021-02-11 11:05:17.651 [OHASD(10617174)]CRS-5828: Could not start agent '/oracle/app/12.2.0/grid/bin/orarootagent'. Details at (:CRSAGF00126:) {0:0:2} in /oracle/app/grid/diag/crs/host/crs/trace/ohasd.trc.
 2021-02-11 11:05:17.657 [OHASD(10617174)]CRS-2758: Resource 'ora.gipcd' is in an unknown state.
 2021-02-11 11:05:17.688 [GPNPD(20644274)]CRS-2328: GPNPD started on node host.

We tried the steps from the note CRS is not starting after applying the latest RU in 12.2 (Doc ID 2373945.1), but they did not work:

crsctl disable crs
Rebooted the node (after the reboot the stack does not autostart, since we disabled it)
As the root user, ran the commands below
GRID_HOME/crs/install/rootcrs.sh -unlock
GRID_HOME/crs/install/rootcrs.sh -lock
Even after starting CRS, gipcd was still in the UNKNOWN state.

The gipcd log was still showing errors after applying the above workaround:

2021-02-10 17:19:35.164 : CSSCLNT:1286: clsssCommonClientExit: RPC failure, rc 3

2021-02-10 17:19:35.167 :GIPCDMON:1286: gipcdMonitorCssTerm: Successfully cleanup of connection to CSS
2021-02-10 17:19:35.167 :GIPCDMON:1286: gipcdMonitorThread: GIPCD received a shutdown msg from agent framework or some other thread died
2021-02-10 17:19:35.167 : GIPC:1286: gipcdsMemoryUnsubscribeMap: successfully un-subscribed the map file smid 00000000000018b8 tot 0
2021-02-10 17:19:35.168 : GIPCD:1286: gipcdSetThreadState: changing the status of monitorThread. current status gipcdThreadStatusOnline desired status gipcdThreadStatusOffline
2021-02-10 17:19:35.168 :GIPCDMON:1286: gipcdMonitorThread: Monitor thread is exiting..
2021-02-10 17:19:35.176 : GIPCD:1: gipcdMain: All threads terminated
2021-02-10 17:19:35.176 : GIPCD:1: gipcdMain: GIPCD terminated 

We also tried the solution from ora.gipcd not starting due to wrong ownership/permission (Doc ID 1949541.1), which did not work either:

Remove <ORACLE_BASE>/crsdata/<node>/output/gipcdOUT.trc and <ORACLE_BASE>/crsdata/<node>/output/gipcd.pid and restart. 
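Before removing anything, it can be worth looking at who owns those two files, since the note is about wrong ownership or permissions. A minimal sketch follows; the function name `list_gipcd_stale_files` is hypothetical, and it only lists the files, it does not delete them.

```shell
#!/bin/sh
# Hypothetical helper: show owner and permissions of the two gipcd files
# named in the note, if they exist.
# Arguments: <ORACLE_BASE> <node>
list_gipcd_stale_files() {
    oracle_base=$1
    node=$2
    for f in gipcdOUT.trc gipcd.pid; do
        path="$oracle_base/crsdata/$node/output/$f"
        # Print a long listing only for files that are actually present.
        [ -e "$path" ] && ls -l "$path"
    done
    return 0
}
```

In this environment the call would be `list_gipcd_stale_files /oracle/app/grid host`.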

Lastly, as root, we removed the gipcd resource and added it back with the proper permissions, following the note Clusterware fail to start CRS-5809 and CRS-2680 after the roll back of a 12c OCW PSU (Doc ID 2051046.1).

[[ Log in as root and execute the commands below ]]
1. Verify that ohasd is running

#  /oracle/app/12.2.0/grid/bin/crsctl check has

If it is not running, start only ohasd

#  /oracle/app/12.2.0/grid/bin/crsctl start crs -noautostart

 2. Delete the GIPCD resource.
#  /oracle/app/12.2.0/grid/bin/crsctl delete res ora.gipcd -init -f 

 3. Delete the GIPCD resource type
#  /oracle/app/12.2.0/grid/bin/crsctl delete type ora.gipc.type -init 

 4. Recreate the GIPCD resource type and the resource. 
 # cd  /oracle/app/12.2.0/grid/crs/template 

root@host(B)<-->{/oracle/app/12.2.0/grid/crs/template}:$ /oracle/app/12.2.0/grid/bin/crsctl add type ora.gipc.type -basetype ora.daemon.type -file /oracle/app/12.2.0/grid/crs/template/gipc.type -init


root@host(B)<-->{/oracle/app/12.2.0/grid/crs/template}:$ /oracle/app/12.2.0/grid/bin/crsctl add res ora.gipcd -type ora.gipc.type -attr "ACL='owner:root:rw-,pgrp:oinstall:rw-,other::r--,user:grid:rwx'" -init -unsupported
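Because steps 2 to 4 delete and re-create a clusterware resource, it may help to print the exact commands for review before running them as root. The sketch below only echoes the commands used above for a given Grid home; the function name `print_gipcd_recreate_commands` is hypothetical, and the ACL (owner root, group oinstall, user grid) is the one from this environment, so it is an assumption for any other system.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the gipcd delete/re-create commands.
# Argument: <GRID_HOME>
print_gipcd_recreate_commands() {
    grid_home=$1
    # Unquoted here-document: $grid_home is expanded, quotes are kept literally.
    cat <<EOF
$grid_home/bin/crsctl delete res ora.gipcd -init -f
$grid_home/bin/crsctl delete type ora.gipc.type -init
$grid_home/bin/crsctl add type ora.gipc.type -basetype ora.daemon.type -file $grid_home/crs/template/gipc.type -init
$grid_home/bin/crsctl add res ora.gipcd -type ora.gipc.type -attr "ACL='owner:root:rw-,pgrp:oinstall:rw-,other::r--,user:grid:rwx'" -init -unsupported
EOF
}
```

Calling `print_gipcd_recreate_commands /oracle/app/12.2.0/grid` prints the four commands in the order they were run here.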

After performing the above steps, shut down and start CRS, then check the gipcd status:

root@host (B)<-->{/oracle/app/12.2.0/grid/bin}:$ ./crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'host'
 CRS-2673: Attempting to stop 'ora.mdnsd' on 'host'
 CRS-2673: Attempting to stop 'ora.gpnpd' on 'host'
 CRS-2673: Attempting to stop 'ora.evmd' on 'host'
 CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'host'
 CRS-2677: Stop of 'ora.drivers.acfs' on 'host' succeeded
 CRS-2677: Stop of 'ora.mdnsd' on 'host' succeeded
 CRS-2677: Stop of 'ora.gpnpd' on 'host' succeeded
 CRS-2677: Stop of 'ora.evmd' on 'host' succeeded
 CRS-2793: Shutdown of Oracle High Availability Services-managed resources   on 'host' has completed
 CRS-4133: Oracle High Availability Services has been stopped.

Now start CRS and check that the services come up; ora.gipcd should now show ONLINE:

root@host(B)<-->{/oracle/app/12.2.0/grid/bin}:$ ./crsctl start crs
CRS-4123: Oracle High Availability Services has been started.
root@host (B)<-->{/oracle/app/12.2.0/grid/bin}:$ ./crsctl stat res -t -init
 Name           Target  State        Server                   State details
 Cluster Resources
 ora.asm
       1        ONLINE  OFFLINE                               STABLE
 ora.cluster_interconnect.haip
       1        ONLINE  OFFLINE                               STABLE
 ora.crf
       1        ONLINE  OFFLINE                               STABLE
 ora.crsd
       1        ONLINE  OFFLINE                               STABLE
 ora.cssd
       1        ONLINE  OFFLINE      host              STARTING
 ora.cssdmonitor
       1        ONLINE  ONLINE       host              STABLE
 ora.ctssd
       1        ONLINE  OFFLINE                               STABLE
 ora.diskmon
       1        OFFLINE OFFLINE                               STABLE
 ora.drivers.acfs
       1        ONLINE  ONLINE       host              STABLE
 ora.evmd
       1        ONLINE  INTERMEDIATE host              STABLE
 ora.gipcd
       1        ONLINE  ONLINE       host              STABLE
 ora.gpnpd
       1        ONLINE  ONLINE       host              STABLE
 ora.mdnsd
       1        ONLINE  ONLINE       host              STABLE
 ora.storage
       1        ONLINE  OFFLINE                               STABLE
