How to get the file format accurately with unpack in PHP

Source: Internet
Author: User
Tags: fread, unpack
Purpose: To identify user-uploaded XML

Issue 1: A user-uploaded "XML" file may have a falsified suffix. The file may actually be a script, such as PHP, disguised as XML.

Resolved: I use the following code to determine the real file type from the file's content instead of trusting the suffix.

Question 2: The code reads the first two bytes of the file with fread. This works well for identifying images, but it cannot distinguish XML from PHP, because their first two bytes are the same: '<?' (byte values 60 and 63, which is the 6063 case in the code below).

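    // Open the uploaded file in binary mode.
    if (($fp = fopen($this->path, 'rb')) === false)
    {
        throw new \Exception('Failed to open the file.');
    }
    // Read the first two bytes (the file signature), then release the handle.
    $read = fread($fp, 2);
    fclose($fp);
    if ($read === false || strlen($read) < 2)
    {
        throw new \Exception('The file is empty or could not be read.');
    }
    // Unpack both bytes as unsigned chars and join their decimal values.
    $info = unpack('C2chars', $read);
    $code = intval($info['chars1'].$info['chars2']);
    switch ($code)
    {
        case 3780: return 'pdf';    // "%P"
        case 5666: return 'psd';    // "8B"
        case 6033: return 'html';   // "<!"
        case 6063: return 'xml';    // "<?", which php also starts with
        default: throw new \Exception('The file format is outside the range the system can recognize.');
    }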
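
To make the collision concrete, here is a minimal sketch of why the two-byte code is identical for both file types, and of how a longer prefix could separate them. The function names twoByteCode and sniffPrefix are hypothetical and not part of the original code. The sketch also assumes the file starts directly with its opening tag (no byte order mark or leading whitespace), and it does not detect PHP code that appears later in a file beginning with '<?xml'.

```php
<?php
// Hypothetical helper mirroring the approach in the question:
// join the decimal values of the first two bytes into one integer.
function twoByteCode(string $data): int
{
    $info = unpack('C2chars', substr($data, 0, 2));
    return intval($info['chars1'] . $info['chars2']);
}

// Hypothetical helper: compare a longer prefix, since '<?' alone is ambiguous.
// Assumes no BOM or whitespace precedes the opening tag.
function sniffPrefix(string $data): string
{
    if (strncmp($data, '<?xml', 5) === 0) {
        return 'xml';
    }
    if (strncmp($data, '<?php', 5) === 0 || strncmp($data, '<?=', 3) === 0) {
        return 'php';
    }
    return 'unknown';
}

// Both inputs start with '<' (60) and '?' (63), so both yield 6063.
var_dump(twoByteCode('<?xml version="1.0"?><root/>'));
var_dump(twoByteCode('<?php echo 1; ?>'));

// The longer prefix tells them apart.
var_dump(sniffPrefix('<?xml version="1.0"?><root/>'));
var_dump(sniffPrefix('<?php echo 1; ?>'));
```

A prefix check like this is only a first filter. It says nothing about whether the rest of the file is valid XML.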

Reply content:

In fact, I don't think you need to make this so complicated. Don't worry too much about the suffix; what matters is the file content. Just parse the file with an XML parser such as simplexml: if it is not a well-formed XML document, the parser returns false. In addition, the final content can be converted to a string, which prevents any code inside the file from being executed.
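
A minimal sketch of that suggestion, assuming the simplexml and libxml extensions are available. The function name isWellFormedXml is hypothetical. Note that a well-formed XML document can still contain a '<?php ... ?>' processing instruction inside its root element, so this check only confirms that the content parses as XML. The upload should still be treated as data and never passed to include or require.

```php
<?php
// Hypothetical helper: returns true only if $contents parses as well-formed XML.
function isWellFormedXml(string $contents): bool
{
    if ($contents === '') {
        return false;
    }
    // Collect parse errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    // LIBXML_NONET disables network access while loading the document.
    $xml = simplexml_load_string($contents, 'SimpleXMLElement', LIBXML_NONET);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $xml !== false;
}

// A real XML document parses; a bare PHP script has no root element and does not.
var_dump(isWellFormedXml('<?xml version="1.0"?><root><item>1</item></root>'));
var_dump(isWellFormedXml('<?php echo 1; ?>'));
```

For an uploaded file, the contents could be read with file_get_contents and passed to this function as a plain string, so nothing in the file is ever executed.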
