Importing the MariaDB Audit Log into Hive

Source: Internet
Author: User
Tags: gz file, hdfs dfs, logstash

We use MariaDB with its audit plugin: https://mariadb.com/kb/en/library/mariadb-audit-plugin/
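For reference, a minimal sketch of turning the plugin on (the variable names come from the MariaDB docs linked above; the credentials are illustrative assumptions, and the log path matches the one Logstash reads below):

#!/bin/bash
# Minimal sketch: load and enable the MariaDB audit plugin.
# Credentials are illustrative assumptions.
mysql -u root -p <<'SQL'
INSTALL SONAME 'server_audit';
SET GLOBAL server_audit_file_path = '/data/logs/mariadb/server_audit.log';
SET GLOBAL server_audit_logging = ON;
SQL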

The plugin gives no thought to downstream processing of the data; its log lines look like this:

20180727 11:40:17,aaa-main-mariadb-bjc-001,user,10.1.111.11,3125928,6493942844,query,account, ' Select    ID, company_id, user_id, Department, title, role, Create_time, Update_time, status,    Is_del, Receive_email    , contact From company     WHERE (  user_id = 101                  and Is_del = 0) ', 0

So it needs to be reformatted with Logstash:

input {
  file {
    path => ["/data/logs/mariadb/server_audit.log"]
    start_position => "end"
    codec => multiline {
      charset => "iso-8859-1"
      pattern => "^[0-9]{8}"
      negate => true
      what => "previous"
    }
  }
}

filter {
  if "quartz" in [message] {
    drop {}
  }
  mutate {
    gsub => [
      "message", "\s", " ",
      "message", "\s+", " "
    ]
  }
  dissect {
    mapping => {
      "message" => "%{ts} %{+ts},%{hostname},%{user},%{dbhost},%{connid},%{queryid},%{operate},%{db},%{object}"
    }
  }
  # The separators between the fields below are literal TAB characters,
  # not "\t" (see the note that follows).
  mutate {
    replace => {
      "message" => "%{ts}	%{hostname}	%{user}	%{dbhost}	%{operate}	%{db}	%{object}"
    }
  }
}

output {
  file {
    path => "/data/logs/mariadb/%{host}_%{+yyyy-MM-dd_HH}.gz"
    gzip => true
    codec => line {
      format => "%{message}"
    }
  }
}
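Before wiring this into the service, you can validate the config and run it in the foreground; a minimal sketch (the conf.d path is an assumed location for the config above):

#!/bin/bash
# Minimal sketch: validate the config, then run it in the foreground.
# The conf.d path is an assumed location for the config above.
/usr/share/logstash/bin/logstash --config.test_and_exit -f /etc/logstash/conf.d/mariadb_audit.conf
/usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/mariadb_audit.conf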

Attention!!!

In the replace step above I separate the fields with literal tab characters. If you edit the config in Vim, do not type \t; Hive will not recognize anything else as the field delimiter. To insert a real tab in Vim, press Ctrl+V first, then press Tab.

With :set list enabled, Vim displays the literal tabs as ^I, so you can verify they were inserted correctly.

The MariaDB audit log cannot be rotated by the hour on its own, so in the Logstash config above I write the log into one gzip file per hour and then push the files to HDFS. I tried various delivery methods along the way:

The Logstash webhdfs output plugin: not efficient enough, and we lost data.

syslog-ng

rsyslog

None of them worked reliably; in the end, calling the HDFS CLI directly turned out to be the cleanest solution.
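The push itself is a single HDFS CLI call per hourly file; a minimal sketch of what the cron entry below boils down to (active_namenode is a placeholder for the real namenode address):

#!/bin/bash
# Minimal sketch: push last hour's gzip file to HDFS, then delete it locally.
# active_namenode is a placeholder for the real namenode address.
fn="/data/logs/mariadb/${HOSTNAME}_$(date -u +"%Y-%m-%d_%H" -d "last hour").gz"
hdfs dfs -copyFromLocal "$fn" "hdfs://active_namenode/mysql_audit/$(date -u +"%Y-%m-%d")/" \
    && rm -f "$fn"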

I packaged the Logstash configuration together with the HDFS push command into an RPM; the RPM spec file is recorded below:

Name: logstash
Version: 1.0.0
Release: 1%{?dist}
Summary: specialized for MySQL audit log collection
License: GPL
AutoReqProv: no
%define __os_install_post %{nil}

%description

%prep

%build

%install
rm -rf $RPM_BUILD_ROOT
mkdir -p %{buildroot}/{apps,etc,usr,var/lib/logstash,var/log/logstash}
cp -r %{_builddir}/etc/* %{buildroot}/etc/
cp -r %{_builddir}/usr/* %{buildroot}/usr/
cp -r %{_builddir}/apps/* %{buildroot}/apps/

%post
chown -R root:root /usr/share/logstash
chown -R root /var/log/logstash
chown -R root:root /var/lib/logstash
chown -R root:root /apps/hadoop-2.6.0
/usr/share/logstash/bin/system-install /etc/logstash/startup.options
cat >> /etc/hosts << EOF
# for logstash push mysql audit to hdfs: fill in HDFS namenode and datanode hosts
EOF
echo "$(shuf -i 3-15 -n 1) * * * *" 'source /etc/profile; /apps/hadoop-2.6.0/bin/hdfs dfs -copyFromLocal /data/logs/mariadb/${HOSTNAME}_$(date -u +"\%Y-\%m-\%d_\%H" -d "last hour").gz hdfs://active_namenode/mysql_audit/$(date -u +"\%Y-\%m-\%d")/ && rm -f /data/logs/mariadb/${HOSTNAME}_$(date -u +"\%Y-\%m-\%d_\%H" -d "last hour").gz' >> /var/spool/cron/root
initctl start logstash

%files
%defattr(-,root,root)
/apps/hadoop-2.6.0
/etc/logstash
/usr/share/logstash
/var/lib/logstash
/var/log/logstash

%preun
if [ $1 -eq 0 ]; then
    # upstart
    if [ -r "/etc/init/logstash.conf" ]; then
        if [ -f "/sbin/stop" ]; then
            /sbin/stop logstash >/dev/null 2>&1 || true
        else
            /sbin/service logstash stop >/dev/null 2>&1 || true
        fi
        if [ -f "/etc/init/logstash.conf" ]; then
            rm /etc/init/logstash.conf
        fi
    # sysv
    elif [ -r "/etc/init.d/logstash" ]; then
        /sbin/chkconfig --del logstash
        if [ -f "/etc/init.d/logstash" ]; then
            rm /etc/init.d/logstash
        fi
    # systemd
    else
        systemctl stop logstash >/dev/null 2>&1 || true
        if [ -f "/etc/systemd/system/logstash-prestart.sh" ]; then
            rm /etc/systemd/system/logstash-prestart.sh
        fi
        if [ -f "/etc/systemd/system/logstash.service" ]; then
            rm /etc/systemd/system/logstash.service
        fi
    fi
    if getent passwd logstash >/dev/null; then
        userdel logstash
    fi
    if getent group logstash >/dev/null; then
        groupdel logstash
    fi
fi

%postun

%clean
rm -rf $RPM_BUILD_ROOT
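To build and install the package, a minimal sketch (the spec file name logstash.spec is an assumption, and the etc/, usr/, and apps/ trees must already be staged under the rpmbuild BUILD directory, since the spec copies from %{_builddir}):

#!/bin/bash
# Minimal sketch: build the RPM and install it.
# logstash.spec is an assumed file name; stage etc/, usr/ and apps/
# under the rpmbuild BUILD directory before building.
rpmbuild -bb logstash.spec
rpm -ivh ~/rpmbuild/RPMS/x86_64/logstash-1.0.0-1*.x86_64.rpm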

I bundled the Hadoop distribution (/apps/hadoop-2.6.0) into the RPM as well, which keeps things convenient.

Logstash is registered as a service and started automatically after the RPM is installed.

Now the logs land in HDFS partitioned by day; next we load them into Hive.
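The resulting HDFS layout, following the paths in the cron entry above, looks roughly like this (the date and host name are illustrative, taken from the sample log line):

#!/bin/bash
# Minimal sketch: inspect one day's worth of pushed files.
# The date and host name are illustrative.
hdfs dfs -ls /mysql_audit/2018-07-27/
# -> /mysql_audit/2018-07-27/aaa-main-mariadb-bjc-001_2018-07-27_11.gz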

First, create the Hive table; the columns follow the tab-separated fields that Logstash emits:

CREATE TABLE mysql_audit (
  ts string,
  hostname string,
  `user` string,
  dbhost string,
  operate string,
  db string,
  object string
)
PARTITIONED BY (dt int, hour string, module string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

The table is divided into three partition levels: dt (day), hour, and module.
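Once the hourly loads below have run, these partition keys let Hive prune cheaply; a minimal query sketch (the dt/hour/module values are illustrative):

#!/bin/bash
# Minimal sketch: query one hour of one module; the partition predicates
# keep the scan small. All literal values are illustrative.
hive -e "SELECT ts, hostname, \`user\`, operate, object
         FROM mysql_audit
         WHERE dt = 20180727 AND hour = '11' AND module = 'aaa-main-mariadb'
         LIMIT 10;"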

Then load the hourly files from HDFS into Hive with the script below:

#!/bin/bash
# Description: load HDFS MySQL audit gz to Hive
# Author: quke
# Date: 2018- -- -
source /root/.bash_profile

cur_date=$(date -u +"%Y-%m-%d" -d "last hour")
cur_date_short=$(date -u +"%Y%m%d" -d "last hour")
cur_hour=$(date -u +"%H" -d "last hour")

for fn in $(hdfs dfs -ls /mysql_audit/${cur_date}/*_${cur_hour}.gz | awk '{print $NF}'); do
    host_name=$(echo $fn | awk -F'[/_]' '{print $(NF-2)}')
    module=${host_name%-bjc*}
    echo "load data inpath 'hdfs://ossmondb${fn}' into table mysql_audit partition (dt=${cur_date_short},hour=${cur_hour},module='${module}');" >> hive.sql
done

hive -f hive.sql && rm -f hive.sql
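This load script still needs to be scheduled; you would typically run it hourly from cron, after the random-minute (3-15) push has had time to finish. A minimal sketch of such a crontab entry (the script path is an assumed name):

# Minimal crontab sketch; /root/bin/load_mysql_audit.sh is an assumed path.
# Minute 30 leaves room for the push (minute 3-15) to complete first.
30 * * * * /root/bin/load_mysql_audit.sh >> /var/log/load_mysql_audit.log 2>&1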

If you have any questions, you are welcome to get in touch and exchange ideas.
