Perl reads wtmpx log files

Author: idle
Time:
Blog: blog.csdn.net/cg_ I
Email: B _dx@sohu.com
Environment: sco_sv scosysv 3.2 5.0.6 i386
Perl v5.8.7 built for i586-pc-sco3.2v5.0

Body

In most UNIX variants, logins and logouts are recorded in a file① named wtmpx (or wtmp). When a user's login behavior looks suspicious (for example, a user who normally logs in from one host suddenly logs in from somewhere else), this is usually the file we check. Its location differs between operating systems: on SCO UNIX it is in /etc and /var/adm②, and on Ubuntu it is in /var/log.
There are many different kinds of logs. The most common kind of log file consists entirely of lines of text. Unlike text lines, which are easy to parse, wtmpx logging produces an opaque binary file with its own special format. Fortunately, Perl is not afraid of such seemingly strange files.

Using unpack()

Perl has a function called unpack() that is designed specifically for parsing binary and structured data. Let's see how to use it to process wtmpx files. The format of wtmp and wtmpx files varies between UNIX variants; here I describe the wtmpx file on SCO UNIX. Figure 1-1 shows a plain-text dump of the first two records of a wtmpx file on SCO UNIX:

(Figure 1-1: plain-text dump of wtmpx records on SCO UNIX)

Unless you are already familiar with the structure of this file, this "ASCII dump" looks no different from garbage. So how do we learn the structure? The simplest way to understand a file format is to read the source code of the programs that read and write the file. If you are not familiar with C, this may sound discouraging. Fortunately, we don't need to understand much of it, or even look at most of the source code; we only need to read the part that defines the file format.
Most operating system programs that read and write wtmpx files take the file definition from a short C include file, usually /usr/include/utmp.h or utmpx.h. We only need to read the C data structure definition that describes the file format. Search for struct utmpx to find it. The lines following struct utmpx define the fields of the structure, and each should carry a comment in the C convention /* text */. To show how two versions of this structure can differ, let's compare the relevant fragments from two operating systems.
The following are the relevant fragments of utmp.h and utmpx.h on SCO UNIX:

    struct ut_exit_status {
        short __e_termination;  /* Process termination status */
        short __e_exit;         /* Process exit status */
    };
    #define e_termination __e_termination
    #define e_exit        __e_exit

    /*
     * Structure used to specify timeout in select(2) system call.
     */
    struct timeval {
        long tv_sec;   /* seconds */
        long tv_usec;  /* and microseconds */
    };

    struct utmpx {
        char  ut_user[32];   /* user login name */
        char  ut_id[4];      /* inittab id */
        char  ut_line[32];   /* device name (console, lnxx) */
    #ifdef MOD_FOR_GEMINI
        long  ut_pid;        /* process id */
    #else
        pid_t ut_pid;        /* process id */
    #endif
        short ut_type;       /* type of entry */
        struct ut_exit_status ut_exit;  /* process termination/exit status */
        struct timeval ut_tv;           /* time entry was made */
        long  ut_session;    /* session ID, used for windowing */
        long  ut_pad[5];     /* reserved for future use */
        short ut_syslen;     /* significant length of ut_host */
                             /* including terminating null */
        char  ut_host[257];  /* remote host name */
    };

The following fragment is from the utmp.h file of Ubuntu 12.04 LTS (64-bit):

    struct utmp {
        short   ut_type;              /* Type of record */
        pid_t   ut_pid;               /* PID of login process */
        char    ut_line[UT_LINESIZE]; /* Device name of tty - "/dev/" */
        char    ut_id[4];             /* Terminal name suffix,
                                         or inittab(5) ID */
        char    ut_user[UT_NAMESIZE]; /* Username */
        char    ut_host[UT_HOSTSIZE]; /* Hostname for remote login, or
                                         kernel version for run-level
                                         messages */
        struct  exit_status ut_exit;  /* Exit status of a process
                                         marked as DEAD_PROCESS; not
                                         used by Linux init(8) */
        /* The ut_session and ut_tv fields must be the same size when
           compiled 32- and 64-bit.  This allows data files and shared
           memory to be shared between 32- and 64-bit applications. */
    #if __WORDSIZE == 64 && defined __WORDSIZE_COMPAT32
        int32_t ut_session;           /* Session ID (getsid(2)),
                                         used for windowing */
        struct {
            int32_t tv_sec;           /* Seconds */
            int32_t tv_usec;          /* Microseconds */
        } ut_tv;                      /* Time entry was made */
    #else
        long   ut_session;            /* Session ID */
        struct timeval ut_tv;         /* Time entry was made */
    #endif
        int32_t ut_addr_v6[4];        /* Internet address of remote
                                         host; IPv4 address uses
                                         just ut_addr_v6[0] */
        char __unused[20];            /* Reserved for future use */
    };

These files give us all the clues we need to construct an unpack() statement. unpack() takes a data format template as its first argument and uses it to determine how to take apart the (usually binary) data passed as its second argument. unpack() splits the data according to the template and returns a list in which each element corresponds to an element of the template.
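As a minimal sketch of how a template drives unpack(), the snippet below packs three values into a binary string and takes them apart again. The three-field layout (an 8-byte name, a signed short, a signed 32-bit long) and the sample values are assumptions made up for illustration; they are not taken from a real wtmpx file.

```perl
use strict;
use warnings;

# Hypothetical layout: 8-byte space-padded name, signed short, signed 32-bit long
my $demo_template = 'A8 s l';

# Build a binary string from sample values (made up for this sketch)
my $binary = pack( $demo_template, 'root', 7, 1338718974 );

# unpack() returns one list element per data-carrying template item
my ( $name, $type, $when ) = unpack( $demo_template, $binary );

print "name=$name type=$type when=$when\n";
print 'bytes=', length($binary), "\n";
```

Note that the A format strips the trailing padding when unpacking, so $name comes back as a clean string.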
Let's build our template step by step from the C data structure in the SCO UNIX utmpx.h file. Several template letters are available; the ones we need are explained in Table 1-1, and the pack() section of the perlfunc manual page has the full list. Constructing a template is not always straightforward, because the C compiler sometimes inserts padding to satisfy alignment requirements. The pstruct command shipped with Perl can help with such problems.

Table 1-1: Translating the C code in utmpx.h into an unpack() template

C code               unpack() template  Description of template letter/repeat count
char ut_user[32]     A32                ASCII string, 32 bytes long (space-padded)
char ut_id[4]        A4                 ASCII string, 4 bytes long (space-padded)
char ut_line[32]     A32                ASCII string, 32 bytes long (space-padded)
pid_t ut_pid         s                  Signed short integer
short ut_type        s                  Signed short integer
short e_termination  s                  Signed short integer
short e_exit         s                  Signed short integer
long tv_sec          l                  Signed long integer (exactly 4 bytes, which may differ from the native long size on some machines)
long tv_usec         l                  Signed long integer (exactly 4 bytes, which may differ from the native long size on some machines)
long ut_session      l                  Signed long integer (exactly 4 bytes, which may differ from the native long size on some machines)
long ut_pad[5]       x20                Skip 20 bytes of padding
short ut_syslen      s                  Signed short integer
char ut_host[257]    Z257               Null-terminated ASCII string, 257 bytes long including the trailing \0
(none)               x③                 Padding (1 byte) inserted by the compiler

After the template is constructed, let's use it in the real code:

    #!/usr/bin/perl -w
    use strict;

    # Template for SCO UNIX utmpx
    my $template = 'A32 A4 A32 s s s s l l l x20 s Z257 x';

    # Let pack() tell us the record size that corresponds to the template
    my $recordsize = length( pack( $template, () ) );

    open my $wtmp, '<', '/etc/wtmpx' or die "unable to open wtmpx: $!\n";
    binmode $wtmp;    # the file holds binary data

    my ( $ut_user, $ut_id, $ut_line, $ut_pid, $ut_type, $ut_e_termination,
         $ut_e_exit, $tv_sec, $tv_usec, $ut_session, $ut_syslen, $ut_host ) = ();

    my $record;
    while ( read( $wtmp, $record, $recordsize ) ) {
        ( $ut_user, $ut_id, $ut_line, $ut_pid, $ut_type, $ut_e_termination,
          $ut_e_exit, $tv_sec, $tv_usec, $ut_session, $ut_syslen, $ut_host )
            = unpack( $template, $record );

        # ut_type 8 is DEAD_PROCESS, the record written when a session ends
        if ( $ut_type == 8 ) {
            $ut_host = '(exit)';
        }

        print "$ut_line:$ut_user:$ut_host:" . scalar localtime($tv_sec) . "\n";
    }

    close $wtmp;

Here is a fragment of this small program's output:

ttyp0:root:11.227.35.199:Sun Jun  3 10:22:54 2012

ttyp0::(exit):Sun Jun  3 10:23:41 2012

......
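To experiment with the template on a machine that has no SCO wtmpx file, one option is to build a synthetic record with pack() and decode it again. The sketch below does that; every field value is made up, and the ut_type value 7 is assumed to mean USER_PROCESS as in System V style utmp.h headers.

```perl
use strict;
use warnings;

# Template from Table 1-1 (SCO UNIX utmpx layout, as described in the article)
my $template = 'A32 A4 A32 s s s s l l l x20 s Z257 x';

# Made-up field values for a single login record
my @fields = (
    'root',             # ut_user
    'p0',               # ut_id
    'ttyp0',            # ut_line
    13435,              # ut_pid
    7,                  # ut_type (assumed to mean USER_PROCESS)
    0, 0,               # e_termination, e_exit
    1338718974, 0,      # tv_sec, tv_usec
    0,                  # ut_session
    14,                 # ut_syslen (host length plus terminating null)
    '11.227.35.199',    # ut_host
);

# Encode the values into one binary record, then decode it again
my $record  = pack( $template, @fields );
my @decoded = unpack( $template, $record );

printf "%d bytes, %d fields\n", length($record), scalar @decoded;
print join( ':', @decoded[ 2, 0, 11 ] ), "\n";    # line:user:host
```

The x items in the template consume bytes but return nothing, which is why a 14-item template yields only 12 decoded fields.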

Before we move on, a note about read(): its third argument is the number of bytes to read. Rather than hard-coding the record size (for example, "32" bytes), we use a handy property of pack() and let it tell us the size of the record that corresponds to the template:

my $recordsize = length( pack( $template, () ) );
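A derived record size is also useful as a sanity check on the template: if the amount of data is not a whole number of records, the template probably does not match the file. The following is only a sketch; record_size() and is_whole_records() are hypothetical helper names, and the byte counts stand in for a real file size.

```perl
use strict;
use warnings;

# Template from Table 1-1 (SCO UNIX utmpx layout, as described in the article)
my $template = 'A32 A4 A32 s s s s l l l x20 s Z257 x';

# Size of one record, derived from the template rather than hard-coded
sub record_size {
    my ($tmpl) = @_;
    return length( pack( $tmpl, () ) );
}

# True if $bytes is a whole number of records (hypothetical helper)
sub is_whole_records {
    my ( $bytes, $size ) = @_;
    return $size > 0 && $bytes % $size == 0;
}

my $size = record_size($template);
print "record size: $size\n";
print "3 records: OK\n" if is_whole_records( 3 * $size, $size );
print "truncated data detected\n" unless is_whole_records( 3 * $size - 5, $size );
```

In a real script the byte count could come from the -s file test on the wtmpx file.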

Calling an operating system (or other) binary

Because inspecting wtmpx files is such a common task, UNIX systems provide a command named last that prints the contents of this binary file in readable form (the who command, discussed in "Perl lists who is on the system", reads the utmpx file). The sample output below is almost the same as the output of the previous example:

root p0 ttyp0 13435 Sun Jun 3

......

Calling a binary such as last from Perl is easy. The following code prints every user name found in the current wtmpx file, without repeats:

    # Path of the last command binary (adjust for your system)
    my $lastexec = '/usr/bin/la';

    open my $last, '-|', "$lastexec" or die "unable to run $lastexec: $!\n";

    my %seen;
    while ( my $line = <$last> ) {
        last if $line =~ /^$/;    # stop at the first blank line

        # The user name is the first whitespace-separated field
        my $user = ( split( ' ', $line ) )[0];
        print "$user\n" unless exists $seen{$user};
        $seen{$user} = '';
    }

    close $last or die "unable to properly close pipe: $!\n";
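The same de-duplication logic can be tried without running last at all by reading from an in-memory filehandle. This is a sketch only: unique_users() is a hypothetical helper, and the sample text (including its column layout and closing line) is made up to resemble the output of last rather than copied from a real system.

```perl
use strict;
use warnings;

# Return the distinct user names, in first-seen order, from last-style lines
sub unique_users {
    my ($fh) = @_;
    my ( %seen, @users );
    while ( my $line = <$fh> ) {
        last if $line =~ /^$/;    # stop at the first blank line
        my $user = ( split ' ', $line )[0];
        next unless defined $user;
        push @users, $user unless $seen{$user}++;
    }
    return @users;
}

# Made-up sample text standing in for the output of last
my $sample = <<'END';
root   p0  ttyp0  13435  Sun Jun  3 10:22
idle   p1  ttyp1  13502  Sun Jun  3 10:40
root   p0  ttyp0  13611  Sun Jun  3 11:05

wtmp begins Sun Jun  3 10:22
END

open my $fh, '<', \$sample or die "unable to open in-memory sample: $!\n";
print "$_\n" for unique_users($fh);
close $fh;
```

Because the helper takes a filehandle, it works equally well with a handle opened on a pipe from last.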

If unpack() can do everything we need, why use the approach above? The reason is portability. As you have seen, the wtmp/wtmpx file format differs between systems, and a format change immediately invalidates your carefully built unpack() template.
The last command, however, can always be relied on to read the file in whatever format the system uses. By calling it, you stay relatively independent of changes to the underlying format. With the unpack() approach, you have to create and maintain a separate template string for every wtmp/wtmpx format you want to parse.
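To illustrate what maintaining several templates might look like, here is a sketch that keeps one template per platform in a hash. The platform labels are assumptions, and so is the Linux template: it is derived from the Ubuntu struct shown earlier under the assumption that UT_LINESIZE and UT_NAMESIZE are 32, UT_HOSTSIZE is 256, and the compiler inserts 2 padding bytes after ut_type. Verify it against your own headers before relying on it.

```perl
use strict;
use warnings;

# One template per record format, keyed by a platform label.
# The labels and the Linux template are assumptions for this sketch.
my %template_for = (
    # SCO UNIX utmpx, from Table 1-1
    sco_sv => 'A32 A4 A32 s s s s l l l x20 s Z257 x',

    # glibc utmp with 32-bit ut_session and ut_tv fields (assumed layout)
    linux  => 's x2 l A32 A4 A32 A256 s s l l l a16 x20',
);

# Look up the template and derive the matching record size
sub format_for {
    my ($platform) = @_;
    my $template = $template_for{$platform}
        or die "no wtmp template known for '$platform'\n";
    return ( $template, length( pack( $template, () ) ) );
}

for my $platform ( sort keys %template_for ) {
    my ( undef, $size ) = format_for($platform);
    print "$platform: $size bytes per record\n";
}
```

A script could pass a value such as $^O to format_for(), provided the hash keys are chosen to match what Perl reports on each system.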
Compared with unpack(), the biggest drawback of calling last is the extra parsing code your program needs in order to pull out the fields it wants. With unpack(), all fields are extracted from the data automatically. The simple parsing used in our last example will not always be enough. There are other techniques for writing more sophisticated parsers; as the Perl philosophy says, "There's more than one way to do it."

 

Note 1:

SCO UNIX has four such log files: utmp, utmpx, wtmp, and wtmpx. The first two are read by the who command, and the last two by the last command.

Note 2:

In fact, the files in both directories point to the /var/opt/k/sco/Unix/5.0.6ga/etc/wtmpx (or wtmp) file and the /var/opt/k/sco/Unix/5.0.6ga/etc/utmpx (or utmp) file.

Note 3:

For readers unfamiliar with C compiler byte alignment, here is a brief explanation. utmpx.h defines the structure under the directive #pragma pack(4), so its data members are aligned on boundaries of at most 4 bytes: the first member is placed at offset 0, and every later member must start at an offset that is a multiple of its own alignment (2 for a short, 4 for a long). If a member would not start on a suitable boundary, the compiler inserts unused padding bytes between members. The utmpx structure is well designed, and every member already starts at a suitably aligned offset. However, adding up the bytes occupied by the fields of one record gives 32 + 4 + 32 + 2 + 2 + 2 + 2 + 4 + 4 + 4 + 20 + 2 + 257 = 367 bytes. In Figure 1-1 the fields of the first record span offsets 0x000 to 0x016e (367 bytes), so the next record would start at 0x016f, which is not on a 4-byte boundary. The compiler therefore appends one byte of padding (hence the x at the end of the template), and each record occupies 368 bytes in total.
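The arithmetic in Note 3 can be checked with a few lines of Perl. This is a sketch under the note's own assumptions: the field sizes are the ones listed in Table 1-1, the record is padded to a multiple of 4 bytes, and round_up() is a hypothetical helper.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Field sizes in bytes, in declaration order, as listed in Table 1-1
my @field_sizes = ( 32, 4, 32, 2, 2, 2, 2, 4, 4, 4, 20, 2, 257 );

# Round $n up to the next multiple of $align
sub round_up {
    my ( $n, $align ) = @_;
    return $n + ( $align - $n % $align ) % $align;
}

my $raw     = sum(@field_sizes);     # bytes occupied by the fields themselves
my $padded  = round_up( $raw, 4 );   # record size after 4-byte alignment
my $padding = $padded - $raw;        # trailing bytes added by the compiler

printf "fields: %d bytes (0x000 to 0x%04x)\n", $raw, $raw - 1;
print "padded record: $padded bytes, trailing padding: $padding\n";
```

The padded size should agree with the record size that pack() reports for the template.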

 
