PHP: statistics on image usage, backlinks, etc.

Source: Internet
Author: User
Tags php write apache log cron script nginx server
Recently I have wanted to build a statistical report on who uses the images, links, and other data on my website.

The setup for the image statistics is roughly as follows:
/var/www/html/1.jpg
/var/www/html/tracker.php
/var/www/html/.htaccess


RewriteEngine On
RewriteBase /
RewriteRule ^(.*)\.jpg$ tracker.php?id=$1 [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]


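<?php
// tracker.php: serve the image requested through the rewrite rule (?id=1 -> 1.jpg)
header('Content-Type: image/jpeg');
readfile($_GET['id'] . '.jpg');
// Optional logging. var_dump() prints instead of returning a string, so print_r(..., true) is used:
// file_put_contents('log.txt', $_GET['id'] . ' ' . $_SERVER['REMOTE_ADDR'] . ' ' . print_r(apache_request_headers(), true) . "\n", FILE_APPEND);
?>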

The code above can count how many times the image has been opened, along with the visitor's IP address and browser data. But how do I count backlinks? For example, if another website embeds this image, how can I count the websites that use my images (as opposed to someone simply opening the image in a browser)?

In addition, suppose I write a small plug-in and let users embed it in their own websites. What code should script.php contain so that I can tell which websites use my plug-in?

I just want to know how to write the code. How the returned data is stored in the database can be designed separately. Thank you.


Replies (solutions)

How can I count the websites that use my images (as opposed to someone simply opening the image in a browser)?

As long as the image is requested over HTTP, it can be counted.

In tracker.php you can read $_SERVER['HTTP_REFERER'] to get the source address, that is, the address of the page that referenced the image.
Then use a regular expression to extract the domain from that URL. After that you can simply GROUP BY photo.
Table structure:
id photo domain
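
A minimal sketch of the domain-extraction step described above. The reply suggests a regular expression; this sketch swaps in PHP's built-in parse_url() instead, which avoids hand-writing the URL pattern. The helper name referer_domain() and the decision to strip a leading "www." are assumptions for illustration, not something from the thread.

```php
<?php
// Sketch: reduce a Referer header to a host name suitable for the "domain" column.
// referer_domain() is a hypothetical helper name, not a PHP built-in.
function referer_domain(?string $referer): ?string
{
    if ($referer === null || $referer === '') {
        return null; // direct visit, or the browser withheld the header
    }
    $host = parse_url($referer, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return null; // malformed or relative value
    }
    $host = strtolower($host);
    // Assumption: treat "www.example.com" and "example.com" as the same site.
    return (strpos($host, 'www.') === 0) ? substr($host, 4) : $host;
}

// In tracker.php this might be called as:
// $domain = referer_domain($_SERVER['HTTP_REFERER'] ?? null);
```

Keep in mind that the Referer header is sent by the client and can be missing or forged, so the resulting counts are approximate.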

Your code can only count dynamic requests; it can do nothing about static URLs such as http://www.mydomain.com/1.jpeg
The correct method is to analyze the web server's log files.

The poster above is right. If all your images are served dynamically through PHP, your program is almost ready: read $_SERVER['HTTP_REFERER'] to obtain the source.
For static images, analyze the Apache log instead.

Your code can only count dynamic requests; it can do nothing about static URLs such as http://www.mydomain.com/1.jpeg
The correct method is to analyze the web server's log files.



Analyze access_log with PHP? That log grows by many MB every day. If the cron script runs every 5 minutes, how can the log be read efficiently? (The time window is from 5 minutes ago to now. Is it really necessary to read the entire log file?)
How should I loop over it: split on \r\n or something else? Then apply a regex to each line and insert it into the database?

Could you give me a piece of efficient code? Thank you.
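
For the "regex each line" step asked about above, here is a hedged sketch that parses one line of Apache's combined log format. The function name parse_combined_line() and the returned array keys are illustrative assumptions. The pattern assumes the standard combined layout and does not handle escaped quotes inside the user agent.

```php
<?php
// Sketch: parse one line of the Apache "combined" log format.
// parse_combined_line() and its array keys are hypothetical names.
function parse_combined_line(string $line): ?array
{
    // ip ident user [time] "method path protocol" status bytes "referer" "agent"
    $pattern = '/^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d{3}) (\S+) "([^"]*)" "([^"]*)"$/';
    if (preg_match($pattern, rtrim($line, "\r\n"), $m) !== 1) {
        return null; // not a combined-format line
    }
    return [
        'ip'      => $m[1],
        'time'    => $m[2],
        'method'  => $m[3],
        'path'    => $m[4],
        'status'  => (int) $m[5],
        'bytes'   => $m[6] === '-' ? 0 : (int) $m[6], // "-" means no body was sent
        'referer' => $m[7],
        'agent'   => $m[8],
    ];
}
```

Reading the file with fgets() returns one line per call, so there is no need to split on \r\n by hand; the rtrim() above removes either line ending.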

The poster above is right. If all your images are served dynamically through PHP, your program is almost ready: read $_SERVER['HTTP_REFERER'] to obtain the source.
For static images, analyze the Apache log instead.



Oh, thank you. One more question: does serving images through PHP cost more CPU, memory, and I/O than serving them as static files? Is the increase large?

1. Static files are read directly by the web server, whereas with PHP the PHP interpreter has to start first and the file is then read by the PHP script.
It takes no thought to see which is more efficient.
2. Log files are only appended to or deleted; existing content is never modified. So you only need to remember where you stopped reading last time and continue from there the next time.

Static files are faster,
PHP is slower.
It is best to use static files.

Not every virtual host lets you change the log storage path yourself.

So the project you are developing can only be used by webmasters who run their own cloud server. Of course, you don't really mind other people using your images, do you?

China's internet is open and few people care about these things; it should be in the spirit of sharing.


If one day you find that your images are being used a lot, a single rewrite (pseudo-static) rule is enough to turn hotlinked image requests into a logo or a well-crafted advertisement image.

Have you noticed that many images sometimes display "This image is from the XXX site, please visit **"? Those are all done with rewrite rules, in a single line. Whenever another site references one of the website's images, the rewrite rule serves a different image in its place.
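
The reply above describes doing this with a web-server rewrite rule. As a sketch of the same idea inside tracker.php instead (a swapped-in approach, not the rewrite rule itself), the check below decides whether a request looks like a hotlink. The name is_hotlink(), the placeholder file hotlink-notice.jpg, and the choice to allow requests with an empty Referer are all assumptions.

```php
<?php
// Sketch: decide whether an image request comes from a page on another site.
// is_hotlink() is a hypothetical helper name.
function is_hotlink(?string $referer, string $ownHost): bool
{
    if ($referer === null || $referer === '') {
        return false; // assumption: direct requests without a Referer are allowed
    }
    $host = parse_url($referer, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return false; // unparseable Referer: treat as not hotlinked
    }
    // Compare hosts case-insensitively and ignore a leading "www.".
    $normalize = function (string $h): string {
        $h = strtolower($h);
        return (strpos($h, 'www.') === 0) ? substr($h, 4) : $h;
    };
    return $normalize($host) !== $normalize($ownHost);
}

// Possible use in tracker.php (hotlink-notice.jpg is a made-up file name):
// $file = is_hotlink($_SERVER['HTTP_REFERER'] ?? null, $_SERVER['HTTP_HOST'])
//     ? 'hotlink-notice.jpg'
//     : $_GET['id'] . '.jpg';
// readfile($file);
```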

1. Static files are read directly by the web server, whereas with PHP the PHP interpreter has to start first and the file is then read by the PHP script.
It takes no thought to see which is more efficient.
2. Log files are only appended to or deleted; existing content is never modified. So you only need to remember where you stopped reading last time and continue from there the next time.



So how do I write PHP code to analyze the log file? Loop over it and match with regular expressions, but how do I remember where the last read stopped?
Over SSH, cat /var/log/httpd/access_log | grep "1.jpeg" prints every 1.jpeg entry from a log of hundreds of MB within one second.
I don't know how to do this in PHP, and it has to be efficient and consume few resources.
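
A rough PHP counterpart to the grep command above, as a sketch: it streams the file line by line with fgets(), so memory use stays small regardless of the log size. The generator name grep_lines() is an assumption, and a plain substring test with strpos() stands in for grep's pattern matching.

```php
<?php
// Sketch: yield every line of a file that contains $needle, one line at a time.
// grep_lines() is a hypothetical name; only one line is held in memory at once.
function grep_lines(string $path, string $needle): Generator
{
    if (!is_readable($path)) {
        return; // nothing to yield for a missing or unreadable file
    }
    $fh = fopen($path, 'rb');
    if ($fh === false) {
        return;
    }
    try {
        while (($line = fgets($fh)) !== false) {
            if (strpos($line, $needle) !== false) {
                yield rtrim($line, "\r\n");
            }
        }
    } finally {
        fclose($fh); // runs even if the caller stops iterating early
    }
}

// Example use:
// foreach (grep_lines('/var/log/httpd/access_log', '1.jpeg') as $line) { echo $line, "\n"; }
```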

I only know that nginx can be configured for referer-based anti-leech (hotlink) protection and can use a custom log format that writes the referer to the log.
Apache must have the same.
If you need PHP to analyze the log... how about importing it into SQL at regular intervals and then clearing the log?

If you analyze image usage from log files, I don't think it is very reliable. The moderator just said that log files are only appended to or deleted, so I am not sure how accurate the analysis would be.



For accuracy, you can set the server's time zone with date_default_timezone_set() and then call getdate() to obtain the current time. The crontab job runs every minute, so it only needs to regex-match all records from the minute before the current time.
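
A sketch of the "previous minute" test described above. The post mentions date_default_timezone_set() and getdate(); this sketch uses DateTimeImmutable instead, because the Apache timestamp already carries its own UTC offset and the comparison then works across time zones. The function name in_previous_minute() is an assumption.

```php
<?php
// Sketch: is an Apache log timestamp (e.g. "10/Oct/2023:13:55:36 +0800")
// inside the full minute before $now? in_previous_minute() is a hypothetical name.
function in_previous_minute(string $logTime, DateTimeImmutable $now): bool
{
    // "!" resets unspecified fields so no current-time values leak in.
    $t = DateTimeImmutable::createFromFormat('!d/M/Y:H:i:s O', $logTime);
    if ($t === false) {
        return false; // unparseable timestamp
    }
    // Start of the current minute, then step back one minute.
    $end = $now->setTime((int) $now->format('H'), (int) $now->format('i'), 0);
    $start = $end->modify('-1 minute');
    return $t >= $start && $t < $end;
}

// Example use from a cron script:
// $keep = in_previous_minute($record['time'], new DateTimeImmutable());
```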

The main problem is how to efficiently open only the last few records of a large file in PHP. Reading a little more than needed is fine, for example the last 1000 records of access_log on each run. My server has 32 GB of memory, but the CPU overhead must stay low. Thank you.
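
One way to read only the end of a large file, as a sketch: seek to the end and read fixed-size chunks backwards until enough newlines have been collected. The function name tail_lines() and the 64 KB default chunk size are assumptions.

```php
<?php
// Sketch: return the last $count lines of a file without reading the whole file.
// tail_lines() is a hypothetical name; $chunk is the number of bytes read per step.
function tail_lines(string $path, int $count, int $chunk = 65536): array
{
    if ($count <= 0 || !is_readable($path)) {
        return [];
    }
    $fh = fopen($path, 'rb');
    if ($fh === false) {
        return [];
    }
    fseek($fh, 0, SEEK_END);
    $pos = ftell($fh);
    $buffer = '';
    // Read backwards until the buffer holds more than $count newlines
    // (so the oldest wanted line is complete) or the file start is reached.
    while ($pos > 0 && substr_count($buffer, "\n") <= $count) {
        $read = min($chunk, $pos);
        $pos -= $read;
        fseek($fh, $pos);
        $buffer = fread($fh, $read) . $buffer;
    }
    fclose($fh);
    $buffer = rtrim($buffer, "\r\n");
    if ($buffer === '') {
        return [];
    }
    $lines = preg_split('/\r?\n/', $buffer);
    return array_slice($lines, -$count);
}

// Example use: $recent = tail_lines('/var/log/httpd/access_log', 1000);
```

This keeps the work proportional to the number of lines requested rather than to the size of the log.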

1. Static files are read directly by the web server, whereas with PHP the PHP interpreter has to start first and the file is then read by the PHP script.
It takes no thought to see which is more efficient.
2. Log files are only appended to or deleted; existing content is never modified. So you only need to remember where you stopped reading last time and continue from there the next time.



I found a way: http://httpd.apache.org/docs/2.2/programs/rotatelogs.html
Use rotatelogs to generate a new log file every minute. But why was the desired log not generated after I restarted Apache?

 
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
LogFormat "%h %l %u %t \"%r\" %>s %b" common
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %I %O" combinedio
CustomLog "logs/access_log" combined env=!dontlog
CustomLog "|sbin/rotatelogs -f logs/my_log 60" combined env=!dontlog
SetEnvIf Remote_Addr "127\.0\.0\.1" dontlog
SetEnvIf Remote_Addr "::1" dontlog
 

It may be related to the Apache version: http://apache.chinahtml.com/logs.html

You don't seem to have understood what I meant about reading log files.
A log file consists of variable-length records. Without an index, a given row cannot be located directly.
However, there is no need to build an index file yourself. After all, logs are "old news", so once a part has been read you never need to read it again.
The relevant file functions are:
ftell -- returns the current read/write position of the file pointer
fseek -- moves the file pointer to a given position

You only need to read the offset with ftell after each fgets and save it.
On the next run, retrieve the offset saved last time and use fseek to jump to it.
Then you can continue reading from there.

Some people may say that fgets is too inefficient because it processes one line at a time, but when you fread a large chunk at a time, the incomplete line at the end of the chunk is awkward to handle.
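
The fgets/ftell/fseek approach described above can be sketched as follows. The function name read_new_lines(), the idea of keeping the offset in a small side file, and the reset to offset 0 when the log has shrunk (for example after rotation) are assumptions added for illustration.

```php
<?php
// Sketch: return only the lines appended to $logPath since the previous call.
// The byte offset is remembered in $offsetPath between runs.
// read_new_lines() is a hypothetical helper name.
function read_new_lines(string $logPath, string $offsetPath): array
{
    $offset = is_file($offsetPath) ? (int) file_get_contents($offsetPath) : 0;
    $fh = fopen($logPath, 'rb');
    if ($fh === false) {
        return [];
    }
    // Assumption: if the file is now shorter than the saved offset,
    // the log was rotated or truncated, so start again from the beginning.
    if ($offset > fstat($fh)['size']) {
        $offset = 0;
    }
    fseek($fh, $offset);
    $lines = [];
    while (($line = fgets($fh)) !== false) {
        if (substr($line, -1) !== "\n") {
            break; // last line is still being written; pick it up next run
        }
        $lines[] = rtrim($line, "\r\n");
        $offset = ftell($fh); // position just after the last complete line
    }
    fclose($fh);
    file_put_contents($offsetPath, (string) $offset);
    return $lines;
}

// Example use from a cron script (paths are placeholders):
// foreach (read_new_lines('/var/log/httpd/access_log', '/tmp/access_log.offset') as $line) { /* parse */ }
```

Each run then costs roughly as much as the amount of new log data, not the total size of the file.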

Thank you.
