Practical example: Using strace to analyze database connection problems

Last Update:2015-05-13 Source: Internet

Author: User


The previous blog post described a number of system calls. Combined with accumulated experience, they can be used to analyze and solve problems in real work.

Problem: SQL*Plus installed on a Linux server in the Hong Kong data center cannot connect to the Oracle server in the company's Shenzhen data center. Running sqlplus xxxx/[email protected] hangs with no response, and after about 2 minutes it reports:

SQL*Plus: Release 11.2.0.1.0 Production on Tue Apr 21 15:44:04 2015

Copyright (c) 1982, Oracle. All rights reserved.

ERROR:
ORA-12547: TNS:lost contact

Environment:

Client address: 172.17.5.13 (Red Hat 5.8 + Oracle 11g client)
Server address: 192.168.1.48 (Solaris 9 + Oracle 10g server)

Tested:

1. tnsping rwdb works normally, which indicates that the client's tnsnames configuration is not the problem.

2. In another environment, a Red Hat 5.8 + Oracle 11g client connects to a Solaris 9 + Oracle 10g server successfully, so there is no compatibility issue.

So how do we continue troubleshooting? This is where strace comes in handy.

Execute on the client machine:

# strace -o strace.log sqlplus xxxx/[email protected]

While the command is hung, strace.log shows:

brk(0x9196000) = 0x9196000
socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 9
fcntl(9, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
connect(9, {sa_family=AF_INET, sin_port=htons(1521), sin_addr=inet_addr("192.168.1.48")}, +) = -1 EINPROGRESS (Operation now in progress)
times({tms_utime=1, tms_stime=1, tms_cutime=0, tms_cstime=0}) = 2186561939
mmap(NULL, 528384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2ae433057000
poll([{fd=9, events=POLLOUT}], 1, 60000) = 1 ([{fd=9, revents=POLLOUT}])
getsockopt(9, SOL_SOCKET, SO_ERROR, [-132438043876392960], [4]) = 0
fcntl(9, F_GETFL) = 0x802 (flags O_RDWR|O_NONBLOCK)
fcntl(9, F_SETFL, O_RDWR) = 0
getsockname(9, {sa_family=AF_INET, sin_port=htons(53136), sin_addr=inet_addr("172.17.5.13")}, [549755813904]) = 0
getsockopt(9, SOL_SOCKET, SO_SNDBUF, [366915001648168960], [4]) = 0
getsockopt(9, SOL_SOCKET, SO_RCVBUF, [366915001648239956], [4]) = 0
setsockopt(9, SOL_TCP, TCP_NODELAY, [1], 4) = 0
fcntl(9, F_SETFD, FD_CLOEXEC) = 0
rt_sigaction(SIGPIPE, {0x1, ~[ILL ABRT BUS FPE SEGV USR2 XCPU XFSZ SYS RTMIN RT_1], SA_RESTORER|SA_RESTART|SA_SIGINFO, 0x35a240ebe0}, {SIG_DFL, [], 0}, 8) = 0
write(9, "\0\324\0\0\1\0\0\0\1:\1,\fa \0\177\377\177\10\0\0\1\0\0\232\0:\0\0\10\0" ..., 212) = 212
read(9,
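In a longer log, the call the process is blocked in is the final line, which has no return value yet. The following is a minimal sketch for picking it out, not part of the original troubleshooting. It assumes the log was written with strace -o and without -f (so lines carry no PID prefix); the helper name last_unfinished_call is hypothetical.

```python
import re

# A completed strace line looks like "name(args) = result ...".
COMPLETED = re.compile(r"^\w+\(.*\)\s+=\s+\S+")

def last_unfinished_call(log_text):
    """Return the last non-empty log line if it has no return value, else None."""
    lines = [ln.strip() for ln in log_text.splitlines() if ln.strip()]
    if not lines:
        return None
    last = lines[-1]
    # A line without ") = result" is a syscall that has not returned yet.
    return None if COMPLETED.match(last) else last

# Shortened stand-in for the tail of strace.log.
sample = """\
write(9, "\\0\\324"..., 212) = 212
read(9,
"""
print(last_unfinished_call(sample))  # prints: read(9,
```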

Analysis:

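Here fd 9 is the client's socket to port 1521 on the server (192.168.1.48). The connection itself is established normally: connect() returns EINPROGRESS because the socket is non-blocking, and poll() then reports the socket as writable. The write() of the first message (212 bytes) also succeeds, but the process then hangs in read(), waiting for the server's reply. This gives us reason to suspect the network: is an unstable network causing the sqlplus failure?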
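The same sequence of calls can be reproduced outside sqlplus to test the path on its own. The sketch below is an illustration under assumptions, not what sqlplus actually executes: the function name connect_and_exchange and its parameters are hypothetical, it assumes Linux (select.poll), and it uses a read timeout so that it returns instead of hanging the way the traced blocking read() does.

```python
import errno
import select
import socket

def connect_and_exchange(host, port, payload,
                         connect_timeout_ms=60000, read_timeout_s=5.0):
    """Non-blocking connect, poll for writability, SO_ERROR check, write, read."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.setblocking(False)                 # like fcntl(F_SETFL, O_NONBLOCK)
        rc = sock.connect_ex((host, port))      # usually EINPROGRESS here
        if rc not in (0, errno.EINPROGRESS):
            return "connect failed: " + errno.errorcode.get(rc, str(rc))

        poller = select.poll()
        poller.register(sock, select.POLLOUT)
        if not poller.poll(connect_timeout_ms): # empty list means timeout
            return "connect timed out"

        # A pending error here means the handshake failed.
        err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
        if err != 0:
            return "connect failed: " + errno.errorcode.get(err, str(err))

        # Assumption: bound the wait instead of blocking forever in read().
        sock.settimeout(read_timeout_s)
        sock.sendall(payload)
        try:
            reply = sock.recv(4096)
        except socket.timeout:
            return "connected and sent, but no reply (read would block)"
        return "reply received: %d bytes" % len(reply)
    finally:
        sock.close()
```

Called as connect_and_exchange("192.168.1.48", 1521, data) with some test payload, a result of "connected and sent, but no reply" would match the pattern in the trace: the handshake and the write succeed, and the stall is in waiting for data to come back.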

Sure enough, testing showed that the VPN link between Hong Kong and Shenzhen was unstable and dropping packets:

# ping -S 172.17.5.13 1024x768 > ping.log

Packet loss of 14%:
530 packets transmitted, 455 packets received, 14% packet loss
round-trip (ms) min/avg/max = 11/11/17
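The 14% figure follows directly from the two packet counts. A small worked check (the function name packet_loss_percent is hypothetical):

```python
def packet_loss_percent(transmitted, received):
    """Percentage of packets that were sent but never answered."""
    if transmitted <= 0:
        raise ValueError("transmitted must be positive")
    return 100.0 * (transmitted - received) / transmitted

loss = packet_loss_percent(530, 455)
print("%d lost, %.1f%% loss" % (530 - 455, loss))  # prints: 75 lost, 14.2% loss
```

So 75 of the 530 packets were lost, about 14.2%, which ping reports as a whole-number 14%.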

After the network administrator switched the route from the VPN to a leased line, the problem was resolved and the sqlplus connection succeeded. This case also shows that sqlplus is quite demanding of the network: I had always assumed it could still connect despite some packet loss, but that turned out not to be true.

Summary:

strace analysis alone may not solve a problem outright; other factors also have to be considered. In this case the problem would eventually have been found by testing the network even without strace, but strace provides solid evidence and a point of reference. Here the stalled socket communication is that evidence; without it, what would point to the network as the cause?

This article is from the "Memory Fragment" blog; reprinting is not permitted.
