Problems and analysis
Today the company's sanity test (a basic usability test) found that the server failed to start. Checking the log showed that an exception thrown during startup was the cause.
Because the company's servers and databases are deployed on different hosts, the server compares its own clock with the database host's clock at startup. If the two differ by more than 600 s, an exception is thrown and startup fails. The allowed difference is stored in the database so that it can be changed later.
In the code, MyBatis is used to read the current time of the database host. The SQL statement is as follows:
SELECT TO_CHAR(CURRENT_TIMESTAMP, 'YYYY-MM-DD HH24:MI:SS') AS "DBTIME";
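A string in this format carries no time zone, so whoever reads it has to decide which zone it belongs to. The following is only a minimal sketch of that step using the standard java.time API, not the project's actual code: the class name DbTimeParser, the method toEpochMillis and the sample value are made up for illustration.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of the original project.
class DbTimeParser {
    // Java pattern that matches the SQL format 'YYYY-MM-DD HH24:MI:SS'.
    private static final DateTimeFormatter DB_FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Interprets a zone-less DB time string in the given zone
    // and returns the corresponding epoch milliseconds.
    static long toEpochMillis(String dbTime, ZoneId zone) {
        LocalDateTime local = LocalDateTime.parse(dbTime, DB_FORMAT);
        return local.atZone(zone).toInstant().toEpochMilli();
    }

    public static void main(String[] args) {
        String sample = "2024-01-01 00:00:00"; // made-up sample value
        // The same string maps to different instants in different zones.
        long asUtc = toEpochMillis(sample, ZoneOffset.UTC);
        long asPlus8 = toEpochMillis(sample, ZoneOffset.ofHours(8));
        System.out.println("Difference in ms: " + (asUtc - asPlus8));
    }
}
```

The point of the sketch is that a check built on such a string is only meaningful if the database host and the server host are configured with the same time zone; otherwise the zone offset itself shows up as a time difference.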
The server's local time is then compared with the database time by the following code:
final DateTime dbTime = systemMapper.getDBTime();
final long dbTimeMs = dbTime.getMilliseconds(TimeZone.getDefault());
final DateTime webAppTime = DateTime.now();
final long webAppTimeMs = webAppTime.getMilliseconds(TimeZone.getDefault());
// Calculate difference between WebApp time and DB time
final long timeDifferent = Math.abs(dbTimeMs - webAppTimeMs);
From the code, you can see that the database time and the server's local time are each converted to milliseconds using the default time zone, and the absolute value of their difference is taken. If the result exceeds the limit stored in the database (that is, 600 s), an exception is thrown and the server fails to start.
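The comparison against the limit is not shown in the snippet above. As a rough sketch of what such a check could look like, here is a self-contained version written with the standard java.time API instead of the project's DateTime type. The class name ClockSkewCheck, the methods skew and verify, the exception type and the sample timestamps are all assumptions made for illustration.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical startup check, not the project's actual implementation.
class ClockSkewCheck {
    // Absolute difference between the DB clock and the web app clock.
    static Duration skew(Instant dbTime, Instant appTime) {
        return Duration.between(dbTime, appTime).abs();
    }

    // Throws if the two clocks differ by more than the allowed limit.
    static void verify(Instant dbTime, Instant appTime, Duration maxSkew) {
        Duration actual = skew(dbTime, appTime);
        if (actual.compareTo(maxSkew) > 0) {
            throw new IllegalStateException("Clock skew between web app and DB is "
                    + actual.getSeconds() + "s, which exceeds the allowed "
                    + maxSkew.getSeconds() + "s");
        }
    }

    public static void main(String[] args) {
        // Made-up timestamps that are 15 minutes apart.
        Instant dbTime = Instant.parse("2024-01-01T00:00:00Z");
        Instant appTime = Instant.parse("2024-01-01T00:15:00Z");
        Duration maxSkew = Duration.ofSeconds(600); // the limit would come from the DB
        try {
            verify(dbTime, appTime, maxSkew);
            System.out.println("Clock check passed");
        } catch (IllegalStateException e) {
            System.out.println("Startup check failed: " + e.getMessage());
        }
    }
}
```

Working with Instant values sidesteps the default time zone entirely, since an Instant is always a point on the UTC timeline. A difference of exactly 600 s passes in this sketch; only a larger one is rejected.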
With the analysis done, I started to verify it: I connected to the two hosts and queried their time with the date command. The clocks differed by about 15 minutes, which is indeed more than 600 s.
So the question is: why was there suddenly such a large time difference? The server started normally yesterday, yet today it failed because the difference had grown. To be clear, the limit stored in the database has always been 600 s, and nobody changed it.
When I explained the cause of the startup failure in the work group chat, some colleagues suggested it might have been caused by a power failure on the machine. Searching on Baidu, I found that other people have run into similar situations: the Linux system time suddenly falls behind by anywhere from a few minutes to more than ten minutes, and there are also plenty of cases where it runs fast. I have not found the specific cause for now; the usual fix is simply to set the system time directly.
For now I do not know the answer to this question; I really do not understand this area. If anyone knows, please tell me in the comments O(∩_∩)O haha~
I wrote this article today to record the problem. I had never thought about the need to check the time between a server and a database on different hosts. Searching on Baidu, I instead found many articles about Android apps checking the time between client and server, which was very interesting.
Checking server time against database time