Abnormal state handling of Config shard in Oracle

Source: Internet
Author: User
Tags reserved oracle database sqlplus

I have been writing about 12.2 sharding database for a long time, and today I started the environment again, started the host, and then started the Listener and Shardcat databases and the Shard node database instance. When checking the status of Shard, the error is found:

(a) Question one: status display warnings:


Gdsctl>config Shard
Name Shard Group Status State Region Availability
----                -----------         ------    -----       ------    ------------
SH1 primary_shardgroup Ok deployed Region1 ONLINE
SH2 primary_shardgroup warnings deployed Region1-

Gdsctl>

In this case, we can use the Recover Shard command. It is wrong to pay attention to the help recover shard-h.

Gdsctl>recover shard-h
Syntax
RECOVER shard-gdspool pool-database db_name [-skip_first|-ignore_first]
[-full]

Purpose
Executes all DDL statements in specified database starting from the one, which
is previusly executed with errors. The command is intended to perform all
Skipped DDL changes after database administrator fixes shard issues.

Usage Notes
Use Skip_first to SKIP the. This is typically required after manual fix
done by the database administrator. For example, if CREATE TABLE statement fails
Because of lack of space, database administrator fixes the issue and re-executes
CREATE TABLE. To avoid ORA-39151 (table exists) in RECOVER SHARD him has to
Specify-skip_first.

Use the Ignore_first to mark the as obsolete. This are required when wrong DDL
Statement is specified and failed on all shards. Thus you need to mark it down
As obsolete.

Keywords and Parameters
Full:full recovery mode.
Gdspool:the GDS Pool (If not specified and there are only one gdspool with access
Granted to user, it'll be used by default
Ignore_first:make the failed DDL statement obsolete.
Shard:the name of the Shard.
Skip_first:skip the failed DDL statement

Examples
RECOVER Shard-database SHD1

Gdsctl>
The correct usage should be recover shard-shard.

We ran this command and found that the GSM did not start:


Gdsctl>recover Shard-shard SH2
GSM-45076:GSM is not running
Gdsctl>
Gdsctl>start GSM
GSM is started successfully
Gdsctl>

After starting, wait a moment (about 10 seconds), the check status will be restored to normal (here, we have not really used the recover shard, in the following question only normal use):


Gdsctl>config Shard
Name Shard Group Status State Region Availability
----                -----------         ------    -----       ------    ------------
SH1 primary_shardgroup Ok deployed Region1 ONLINE
SH2 primary_shardgroup warnings deployed Region1-

Gdsctl>
gdsctl>--wait about 10 seconds.
Gdsctl>
Gdsctl>config Shard
Name Shard Group Status State Region Availability
----                -----------         ------    -----       ------    ------------
SH1 primary_shardgroup Ok deployed Region1 ONLINE
SH2 primary_shardgroup Ok deployed Region1 ONLINE

Gdsctl>
So, after rebooting, there are database,listener, and GSM that need to be started.

(b) Question two: State shows DDL error
Then, in the following experiment, I drop off a duplicate table on the Shardcat and find that Shard state is not normal at this time. The state of Config Shard shows DDL error.

First, operate on the Shardcat, drop off the duplicate table


Sql> Conn app_schema/oracle
Connected.
sql>
sql> drop table products;
drop table Products
*
ERROR in line 1:
Ora-02557:cannot operate On sharded objects if Shard DDL is disabled
 
 
sql>        ;                         
sql>
sql>
sql> alter session enable Shard DDL;
 
Session altered.
 
Sql> Select COUNT (*) from the products;
 
  COUNT (*)
----------
  &nbs p;   1000
 
sql>
sql> drop table products;
 
table dropped.
 
Sql>
finds that the data is not synchronized at this time and can still be queried on shard node:


Sql>!hostname
Sdb2

Sql>
Sql> Conn App_schema/oracle
Connected.
Sql>
Sql> Select COUNT (*) from the products;

COUNT (*)
----------
1000

Sql>



Sql>!hostname
Sdb3

Sql>
Sql> Conn App_schema/oracle
Connected.
Sql>
Sql> Select COUNT (*) from the products;

COUNT (*)
----------
1000

Sql>
At this point, check the Shard state to become DDL error:


Gdsctl>config Shard
Name Shard Group Status State Region Availability
----                -----------         ------    -----       ------    ------------
SH1 primary_shardgroup Ok DDL error Region1 ONLINE
SH2 primary_shardgroup Ok DDL error Region1 ONLINE

Gdsctl>
We can see some information in the GSM Alertlog:
(Note: GSM log location in/u01/ora12c/app/oracle/diag/gsm/sdb1/sharddirector1/trace/alert_shardcat.log)

......
07-nov-2016 01:35:19 * (Connect_data= (Service_name=gds$catalog.oradbcloud) (Cid= (program=oracle) (HOST=SDB3) (USER= ORACLE12C))) * (Address= (PROTOCOL=TCP) (host=192.168.56.23) (port=34247)) * Establish * gds$catalog.oradbcloud * 0
2016-11-07t01:35:22.021482+08:00
07-nov-2016 01:35:22 * service_update * shardcat%11 * 0
07-nov-2016 01:35:22 * service_update * shardcat%1 * 0
2016-11-07t01:35:35.543868+08:00
Gsm-40135:catalog Request: "DDL request" (80). Target-"33". pool---"". Id= "72".
Payload: "A"
Gsm-40135:catalog Request: "DDL request" (80). Target-"34". pool---"". Id= "73".
Payload: "E"
2016-11-07t01:35:36.652686+08:00
Gsm-40148:database task failed:database: "SH1", Status 4, message: "Ora-01031:insufficient privileges
Ora-06512:at "SYS. Execasuser ", line 35
Ora-06512:at "SYS. Dbms_sys_sql ", line 1434
Ora-06512:at "SYS. Execasuser ", line 23
Ora-06512:at "Gsmadmin_internal. Executeddl ", line 68
Ora-06512:at Line 1 \ (ngsmoci_execute\)

"
Gsm-40148:database task failed:database: "Sh2", Status 4, message: "Ora-01031:insufficient privileges
Ora-06512:at "SYS. Execasuser ", line 35
Ora-06512:at "SYS. Dbms_sys_sql ", line 1434
Ora-06512:at "SYS. Execasuser ", line 23
Ora-06512:at "Gsmadmin_internal. Executeddl ", line 68
Ora-06512:at Line 1 \ (ngsmoci_execute\)

"
2016-11-07t01:35:36.700593+08:00
GSM-40157:GSM response id=72 type=80. Status: "Success with Info".
2016-11-07t01:35:37.010475+08:00
Gsm-40135:catalog Request: "RUNTIME DATABASE" (40). Target-"SH2". pool---"". Id= "74".
Payload: "(parameters= (ddlid=33) (autorcv=0))"
Gsm-40135:catalog Request: "RUNTIME DATABASE" (40). Target-"SH1". pool---"". Id= "75".
Payload: "(parameters= (ddlid=33) (autorcv=0))"
GSM-40157:GSM response id=73 type=80. Status: "Success".
2016-11-07t01:35:41.834561+08:00
07-nov-2016 01:35:41 * service_update * SHARDDIRECTOR1 * 0
......
In fact, from this log, we just see that there are ORA-01031 insufficient privileges error, but what is the lack of authority, we still have to go through Config Shard command to further look:


Gdsctl>config Shard-shard SH1
Name:sh1
Shard Group:primary_shardgroup
Status:ok
State:deployed
Region:region1
Connection string:sdb2:1521/sh1:dedicated
SCAN Address:
ONS Remote port:0
Disk Threshold, ms:20
CPU Threshold,%: 75
version:12.2.0.0
Last Failed ddl:alter database link "productsd ...
DDL error:ora-01031:insufficient Privileges
Ora-06512:at "SYS. Execasuser ", line 35
Ora-06512:at "SYS. Dbms_sys_sql ", line 1434
Ora-06512:at "SYS. Execasuser ", line 23
Ora-06512:at "Gsmadmin_internal. Executeddl ", line 68
Ora-06512:at Line 1 \ (ngsmoci_execute\)


Failed DDL id:33
Availability:online


Supported Services
------------------------
Name Preferred Status
----                                                            --------- ------
OLTP_RW_SRVC Yes Enabled

Gdsctl>

We see this last failed DDL, when ALTER DATABASE link appears with insufficient permissions.

So we found the reason for the fix:


--On the SH1:
[Oracle12c@sdb2 ~]$ Sqlplus "/As SYSDBA"

Sql*plus:release 12.2.0.0.2 Beta on Mon Nov 7 01:50:25 2016

Copyright (c) 1982, 2015, Oracle. All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition release 12.2.0.0.2-64bit Beta
With the partitioning, OLAP, Advanced Analytics and real application testing options

sql> grant ALTER DATABASE link to App_schema;

Grant succeeded.

Sql>

--On the SH2
[Oracle12c@sdb3 ~]$ Sqlplus "/As SYSDBA"

Sql*plus:release 12.2.0.0.2 Beta on Mon Nov 7 01:51:13 2016

Copyright (c) 1982, 2015, Oracle. All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition release 12.2.0.0.2-64bit Beta
With the partitioning, OLAP, Advanced Analytics and real application testing options

sql> grant ALTER DATABASE link to App_schema;

Grant succeeded.

Sql>

--On the Shardcat, gsm_env:
Gdsctl>config Shard
Name Shard Group Status State Region Availability
----                -----------         ------    -----       ------    ------------
SH1 primary_shardgroup Ok DDL error Region1 ONLINE
SH2 primary_shardgroup Ok DDL error Region1 ONLINE

Gdsctl>
Gdsctl>recover Shard-shard SH1
Gdsctl>
Gdsctl>recover Shard-shard SH2
Gdsctl>
Gdsctl>config Shard
Name Shard Group Status State Region Availability
----                -----------         ------    -----       ------    ------------
SH1 primary_shardgroup Ok deployed Region1 ONLINE
SH2 primary_shardgroup Ok deployed Region1 ONLINE

Gdsctl>


--On the SH1:
[Oracle12c@sdb2 ~]$ Sqlplus "/As SYSDBA"

Sql*plus:release 12.2.0.0.2 Beta on Mon Nov 7 01:50:25 2016

Copyright (c) 1982, 2015, Oracle. All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition release 12.2.0.0.2-64bit Beta
With the partitioning, OLAP, Advanced Analytics and real application testing options

sql> grant ALTER DATABASE link to App_schema;

Grant succeeded.

Sql>

--On the SH2
[Oracle12c@sdb3 ~]$ Sqlplus "/As SYSDBA"

Sql*plus:release 12.2.0.0.2 Beta on Mon Nov 7 01:51:13 2016

Copyright (c) 1982, 2015, Oracle. All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition release 12.2.0.0.2-64bit Beta
With the partitioning, OLAP, Advanced Analytics and real application testing options

sql> grant ALTER DATABASE link to App_schema;

Grant succeeded.

Sql>

--On the Shardcat, gsm_env:
Gdsctl>config Shard
Name Shard Group Status State Region Availability
----                -----------         ------    -----       ------    ------------
SH1 primary_shardgroup Ok DDL error Region1 ONLINE
SH2 primary_shardgroup Ok DDL error Region1 ONLINE

Gdsctl>
Gdsctl>recover Shard-shard SH1
Gdsctl>
Gdsctl>recover Shard-shard SH2
Gdsctl>
Gdsctl>config Shard
Name Shard Group Status State Region Availability
----                -----------         ------    -----       ------    ------------
SH1 primary_shardgroup Ok deployed Region1 ONLINE
SH2 primary_shardgroup Ok deployed Region1 ONLINE

Gdsctl>
You can see that, on the basis of fixing the problem, after running recover Shard-shard, you are back to normal.

Note that, if the root cause did not find, do not know is the database link permissions problem, only run recover shard is ineffective.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.