Environment: AIX 5.3 + Oracle 10.2.0.5 RAC
Scenario: In a two-node RAC cluster, VIP2 (the VIP of node2) unexpectedly moved to node1 and could not be moved back to node2. No fault was found on node2, and even restarting the server did not restore VIP2.
With VIP2 unavailable, node2 receives no new connections, all load falls on node1, and node1 has no failover target. If the original VIP2 problem cannot be solved, the last-resort way to make node2 usable again is to disable VIP2 on node1 and then add a new VIP on node2.
1. View Resource Status
oracle@gisdb1:/oracle $ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....B1.lsnr application    ONLINE    ONLINE    gisdb1
ora.gisdb1.gsd application    ONLINE    ONLINE    gisdb1
ora.gisdb1.ons application    ONLINE    ONLINE    gisdb1
ora.gisdb1.vip application    ONLINE    ONLINE    gisdb1
ora....B2.lsnr application    ONLINE    OFFLINE
ora.gisdb2.vip application    ONLINE    OFFLINE
ora.gisdb2.ons application    ONLINE    ONLINE    gisdb2
ora.gisdb2.gsd application    ONLINE    ONLINE    gisdb2
ora.nxgis.db   application    ONLINE    ONLINE    gisdb2
ora....s1.inst application    ONLINE    ONLINE    gisdb1
ora....s2.inst application    ONLINE    ONLINE    gisdb2
At this point the ora.gisdb2.vip resource cannot be started on node2. If you start it with crs_start, it comes up on node1 instead.
Naturally, the listener resource ora.gisdb2.LISTENER_GISDB2.lsnr on node2 depends on VIP2, so it cannot be started either.
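On a longer listing it can help to pick out the stuck resources mechanically. The following is a minimal sketch, assuming the five-column layout of crs_stat -t (Name, Type, Target, State, Host); the function names stuck_resources and sample_status and the sample rows are illustrative and only mirror the listing above.

```shell
#!/bin/sh
# List resources whose Target is ONLINE but whose State is OFFLINE.
# Reads `crs_stat -t` style output on stdin (column layout is an assumption).
stuck_resources() {
    awk '$3 == "ONLINE" && $4 == "OFFLINE" { print $1 }'
}

# Illustrative sample input; on a real cluster, pipe `crs_stat -t` instead.
sample_status() {
cat <<'EOF'
Name           Type           Target    State     Host
------------------------------------------------------------
ora....B1.lsnr application    ONLINE    ONLINE    gisdb1
ora.gisdb1.vip application    ONLINE    ONLINE    gisdb1
ora....B2.lsnr application    ONLINE    OFFLINE
ora.gisdb2.vip application    ONLINE    OFFLINE
ora.gisdb2.ons application    ONLINE    ONLINE    gisdb2
EOF
}

sample_status | stuck_resources
```

On the cluster itself the equivalent call would be: crs_stat -t | stuck_resources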
2. While working on the problem I traced the VIP2 startup process several times, without any useful result. I then looked at the OCR contents:
oracle@gisdb1> ocrdump
OCRDUMPFILE
oracle@gisdb1> vi OCRDUMPFILE
No information about VIP2 was found, which shows that the OCR contents are inconsistent. (Earlier, srvctl remove nodeapps -f had been run; the -f option forces execution. It can leave the OCR in an inconsistent state and is not recommended.)
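Instead of paging through the dump in vi, the search can be scripted. This is only a sketch: it assumes that key headers in OCRDUMPFILE are lines starting with "[", and the key names in the sample are illustrative stand-ins rather than lines copied from a real dump. The function names ocr_vip_keys and sample_dump are hypothetical.

```shell
#!/bin/sh
# Print dump key headers (lines starting with "[") that mention both the
# given node name and "vip", ignoring case. Reads the dump on stdin.
# The pipeline returns non-zero when nothing matches.
ocr_vip_keys() {
    grep -i '^\[' | grep -i "$1" | grep -i 'vip'
}

# Illustrative stand-in for a dump; the key names are assumptions.
sample_dump() {
cat <<'EOF'
[SYSTEM]
[DATABASE.NODEAPPS.gisdb1.VIP]
[DATABASE.NODEAPPS.gisdb2.ONS]
EOF
}

sample_dump | ocr_vip_keys gisdb2 || echo "no VIP keys found for gisdb2"
```

Against a real dump the call would be: ocr_vip_keys gisdb2 < OCRDUMPFILE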
3. When the ora.gisdb2.vip information in the OCR is damaged, the first thing to try is re-registering ora.gisdb2.vip.
Re-registering means deleting the ora.gisdb2.vip entry from the OCR and then recreating the VIP. The VIP can be recreated with VIPCA, or with crs_profile, crs_register, and the other crs_* commands.
4. Remove ora.gisdb2.vip.
Run the following command as the root user:
root@gisdb2: ./srvctl remove nodeapps -n gisdb2
(This command deletes the VIP, ONS, and GSD, so the VIP and ONS have to be recreated later. GSD exists only for backward compatibility, so recreating it is optional.)
PRKO-02112: "Some or all node applications are not removed successfully on node: gisdb2"
Clearly the command reported an error, and the nodeapps on node2 were not removed completely.
Then, view the crs resources:
oracle@gisdb1:/oracle $ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....B1.lsnr application    ONLINE    ONLINE    gisdb1
ora.gisdb1.gsd application    ONLINE    ONLINE    gisdb1
ora.gisdb1.ons application    ONLINE    ONLINE    gisdb1
ora.gisdb1.vip application    ONLINE    ONLINE    gisdb1
ora....B2.lsnr application    ONLINE    OFFLINE
ora.gisdb2.vip application    ONLINE    OFFLINE
ora.nxgis.db   application    ONLINE    ONLINE    gisdb2
ora....s1.inst application    ONLINE    ONLINE    gisdb1
ora....s2.inst application    ONLINE    ONLINE    gisdb2
We can see that ora.gisdb2.vip still exists, while the ora.gisdb2.ons and ora.gisdb2.gsd resources are no longer listed.
5. To remove ora.gisdb2.vip from the OCR, use crs_unregister, which deletes the resource's entry from the OCR.
Run the following command as the root user:
root@gisdb2:/ # ./crs_unregister ora.gisdb2.vip
Can't unregister 'ora.gisdb2.vip' because it is required by other resources.
CRS-0214: Could not unregister resource 'ora.gisdb2.vip'
Another error: ora.gisdb2.vip cannot be unregistered because other resources depend on it. To unregister the VIP, it may be necessary to unregister the listener and database resources that depend on it first (not tested).
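To see which resources hold the VIP, the REQUIRED_RESOURCES attribute in the crs_stat -p output can be scanned. A minimal sketch, assuming each profile prints NAME= before REQUIRED_RESOURCES= and that dependencies are space-separated; the function names dependents_of and sample_profiles and the sample profiles are illustrative, not captured from this cluster.

```shell
#!/bin/sh
# Print the NAME of every resource whose REQUIRED_RESOURCES lists the target.
# Reads `crs_stat -p` style NAME=/REQUIRED_RESOURCES= lines on stdin.
dependents_of() {
    awk -F= -v target="$1" '
        $1 == "NAME" { name = $2 }
        $1 == "REQUIRED_RESOURCES" {
            n = split($2, deps, " ")
            for (i = 1; i <= n; i++)
                if (deps[i] == target) print name
        }'
}

# Illustrative sample profiles (assumed values).
sample_profiles() {
cat <<'EOF'
NAME=ora.gisdb2.LISTENER_GISDB2.lsnr
TYPE=application
REQUIRED_RESOURCES=ora.gisdb2.vip

NAME=ora.gisdb2.vip
TYPE=application
REQUIRED_RESOURCES=

NAME=ora.gisdb1.LISTENER_GISDB1.lsnr
TYPE=application
REQUIRED_RESOURCES=ora.gisdb1.vip
EOF
}

sample_profiles | dependents_of ora.gisdb2.vip
```

On the cluster the call would be: crs_stat -p | dependents_of ora.gisdb2.vip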
In short, this attempt failed as well, which leaves us in a dilemma: as long as VIP2 cannot be removed, its information in the OCR cannot be refreshed, and as long as VIP2 is unavailable on node2, the RAC is effectively a single node and has lost high availability.
6. Since the OCR dump shows no VIP2 information, could we simply add the VIP2 information to the OCR directly? That experiment failed too: the error message said that ora.gisdb2.vip already exists and cannot be added.
(The detailed steps are not shown here; the manual VIP registration procedure is described later.)
7. After these failed experiments things looked rather hopeless; there seemed to be nothing left to try on VIP2. Inconsistent OCR information can only be fixed by unregistering and re-registering the resource, but VIP2 cannot be unregistered. Perhaps unregistering the instance and listener resources first would allow VIP2 to be unregistered.
However, this is a production environment and there was no time for testing, so we did not dare to try it. Then I thought of a more conservative approach, which worked well even though it is not perfect:
Since VIP2 cannot be budged, leave it alone and manually create a new VIP. The new VIP must not have the same name as ora.gisdb2.vip; here I created ora.gisdb2-test.vip.
NOTE: The crs_* commands are used below. For details about these commands, see the Oracle Clusterware and Oracle Real Application Clusters Administration and Deployment Guide.
7.1 First, create the VIP resource profile with crs_profile. The profile is the VIP's configuration, that is, the information displayed by crs_stat -p.
./crs_profile -create ora.gisdb2-test.vip -t application -d "vip on node2" -a $ORA_CRS_HOME/bin/usrvip -o oi=eth0,ov=192.168.10.14,on=255.255.0 -h gisdb2 -p favored
After the profile is created, the file ora.gisdb2-test.vip.cap appears in the $ORA_CRS_HOME/crs/profile (or $ORA_CRS_HOME/crs/public) directory, and you can inspect its content:
> cat ora.gisdb2-test.vip.cap
The file shows the same attributes, in the same structure, as crs_stat -p displays for a VIP.
7.2 After creating the VIP profile, register it in the OCR:
./crs_register ora.gisdb2-test.vip
After registration, the new resource appears in the list:
crs_stat -t
ora....B1.lsnr application    ONLINE    ONLINE    gisdb1
ora.gisdb1.gsd application    ONLINE    ONLINE    gisdb1
ora.gisdb1.ons application    ONLINE    ONLINE    gisdb1
ora.gisdb1.vip application    ONLINE    ONLINE    gisdb1
ora....B2.lsnr application    ONLINE    OFFLINE
ora.gisdb2.vip application    ONLINE    OFFLINE
ora.nxgis.db   application    ONLINE    ONLINE    gisdb2
ora....s1.inst application    ONLINE    ONLINE    gisdb1
ora....s2.inst application    ONLINE    ONLINE    gisdb2
ora....est.vip application    OFFLINE   OFFLINE   gisdb2
7.3 Set the permissions on the new ora.gisdb2-test.vip resource. Run the commands as the root user. (The correct owner can differ from resource to resource. If you do not know whether a resource should be owned by root or oracle, try root first; if starting the resource later reports an error, change the owner to oracle.)
[root@gisdb2]# crs_setperm ora.gisdb2-test.vip -o root
[root@gisdb2]# crs_setperm ora.gisdb2-test.vip -u user:oracle:r-x
7.4 Finally, start the ora.gisdb2-test.vip resource as the oracle user:
./crs_start ora.gisdb2-test.vip
7.5 View the status:
crs_stat -t
ora....B1.lsnr application    ONLINE    ONLINE    gisdb1
ora.gisdb1.gsd application    ONLINE    ONLINE    gisdb1
ora.gisdb1.ons application    ONLINE    ONLINE    gisdb1
ora.gisdb1.vip application    ONLINE    ONLINE    gisdb1
ora....B2.lsnr application    ONLINE    OFFLINE
ora.gisdb2.vip application    ONLINE    OFFLINE
ora.nxgis.db   application    ONLINE    ONLINE    gisdb2
ora....s1.inst application    ONLINE    ONLINE    gisdb1
ora....s2.inst application    ONLINE    ONLINE    gisdb2
ora....est.vip application    ONLINE    ONLINE    gisdb2
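Steps 7.1 to 7.4 can be collected into one reviewable plan before anything is run. The sketch below only prints the commands and executes nothing; the function name vip_plan and its argument order are hypothetical, the -d description is left out for brevity, and the values in the example call are the ones used in this article.

```shell
#!/bin/sh
# Print, without executing, the commands from steps 7.1 to 7.4 for one VIP.
# Arguments: resource name, node, interface, VIP address, netmask.
vip_plan() {
    res=$1 node=$2 nic=$3 addr=$4 mask=$5
    # 7.1 create the profile (the *.cap file)
    echo "./crs_profile -create $res -t application -a \$ORA_CRS_HOME/bin/usrvip -o oi=$nic,ov=$addr,on=$mask -h $node -p favored"
    # 7.2 register the profile in the OCR
    echo "./crs_register $res"
    # 7.3 set owner and permissions (as root)
    echo "crs_setperm $res -o root"
    echo "crs_setperm $res -u user:oracle:r-x"
    # 7.4 start the resource (as oracle)
    echo "./crs_start $res"
}

vip_plan ora.gisdb2-test.vip gisdb2 eth0 192.168.10.14 255.255.0
```

Printing the plan first makes it easy to check the resource name and address before touching a production OCR.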
At this point the new VIP still cannot serve clients, because a listener resource is needed as well. Trying to start the existing listener resource now returns an error,
because it depends on ora.gisdb2.vip. So a new listener resource that depends on ora.gisdb2-test.vip has to be created.
8. Add the listener resource. The method is similar to adding the VIP.
For the VIP we used crs_profile -create to generate the *.cap file, whose content has the same structure as the crs_stat -p output.
If you do not know which crs_profile parameters to use to generate the *.cap profile for a listener resource, you can start from the listener resource of a healthy node:
8.1 Run the following command as the root user:
crs_stat -p ora.gisdb1.LISTENER_GISDB1.lsnr > $ORA_CRS_HOME/crs/profile/ora.gisdb2.LISTENER_GISDB2_test.lsnr.cap
8.2 Edit ora.gisdb2.LISTENER_GISDB2_test.lsnr.cap. The content is simple: change the node information to gisdb2 and the VIP information to ora.gisdb2-test.vip.
Also change the owner and group of ora.gisdb2.LISTENER_GISDB2_test.lsnr.cap to oracle:oinstall.
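The edit in 8.2 can also be done with sed rather than by hand. This is a sketch under assumptions: it presumes that the node and VIP are held in the HOSTING_MEMBERS and REQUIRED_RESOURCES attributes, and it also rewrites NAME to the new resource name, which the text above does not mention explicitly. The function names retarget_profile and sample_profile and the shortened sample profile are illustrative.

```shell
#!/bin/sh
# Rewrite a listener profile copied from node1 so that it describes the new
# node2 listener. Reads the profile on stdin, writes the result to stdout.
# Attribute names are assumed from the usual `crs_stat -p` layout.
retarget_profile() {
    sed -e 's/^NAME=.*/NAME=ora.gisdb2.LISTENER_GISDB2_test.lsnr/' \
        -e 's/^HOSTING_MEMBERS=.*/HOSTING_MEMBERS=gisdb2/' \
        -e 's/^REQUIRED_RESOURCES=.*/REQUIRED_RESOURCES=ora.gisdb2-test.vip/'
}

# Shortened, illustrative profile; a real one has more attributes.
sample_profile() {
cat <<'EOF'
NAME=ora.gisdb1.LISTENER_GISDB1.lsnr
TYPE=application
HOSTING_MEMBERS=gisdb1
REQUIRED_RESOURCES=ora.gisdb1.vip
EOF
}

sample_profile | retarget_profile
```

Lines that match none of the three patterns pass through unchanged, so the remaining attributes of the copied profile are kept as they are.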
8.3 Register the ora.gisdb2.LISTENER_GISDB2_test.lsnr resource:
./crs_register ora.gisdb2.LISTENER_GISDB2_test.lsnr
8.4 Check the resources: ora.gisdb2.LISTENER_GISDB2_test.lsnr should now be listed.
Because this resource is tied to a listener, a listener named LISTENER_GISDB2_test must be created before the resource can be started.
9. Add ONS. This works the same way as adding the listener resource; if you followed the previous steps, adding ONS is no problem.
The final resources are as follows:
oracle@gisdb2:/oracle $ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....B1.lsnr application    ONLINE    ONLINE    gisdb1
ora.gisdb1.gsd application    ONLINE    ONLINE    gisdb1
ora.gisdb1.ons application    ONLINE    ONLINE    gisdb1
ora.gisdb1.vip application    ONLINE    ONLINE    gisdb1
ora....est.vip application    ONLINE    ONLINE    gisdb2
ora....B2.lsnr application    ONLINE    OFFLINE
ora....st.lsnr application    ONLINE    ONLINE    gisdb2
ora.gisdb2.ons application    ONLINE    ONLINE    gisdb2
ora.gisdb2.vip application    ONLINE    OFFLINE
ora.nxgis.db   application    ONLINE    ONLINE    gisdb2
ora....s1.inst application    ONLINE    ONLINE    gisdb1
ora....s2.inst application    ONLINE    ONLINE    gisdb2
Now that we have a new VIP and a new listener, the two OFFLINE resources can be ignored, and applications can connect to node2 again.