Goal: Reliably identify user-uploaded XML files.
Issue 1: An uploaded "XML" file may have a tampered suffix, that is, the file is really a script (PHP, for example) disguised as XML.
Resolved: I use the following code to determine the file type from its content instead of trusting the suffix.
Question 2: The code reads the first two bytes of the file with fread. This works well for identifying images, but it cannot tell XML from PHP, because both start with the same two bytes, '<?'.
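// Open the file in binary mode.
if (($fp = fopen($this->path, 'rb')) === false) {
    throw new \Exception('Failed to open the file.');
}
// Read the two-byte header, then release the handle before any exception is thrown.
$read = fread($fp, 2);
fclose($fp);
if ($read === false || strlen($read) < 2) {
    throw new \Exception('The file is empty or could not be read.');
}
// Join the decimal values of the two bytes, e.g. "<?" (60, 63) becomes 6063.
$info = unpack('C2chars', $read);
$code = intval($info['chars1'] . $info['chars2']);
switch ($code) {
    case 3780:
        return 'pdf';
    case 5666:
        return 'psd';
    case 6033:
        return 'html';
    case 6063:
        return 'xml'; // a PHP file yields the same code
    default:
        throw new \Exception('The file format is not recognized by the system.');
}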
Reply content:
I don't think you need to make this so complicated. Don't worry too much about the suffix; what matters is the file content. Just parse the file with an XML parser such as SimpleXML. If the file is not a well-formed XML document, the parser returns false. In addition, the content you finally extract can be converted to plain strings, which prevents any code inside the file from being executed.
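The suggestion above could be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the helper name parseUploadedXml and the sample element name title are hypothetical, and it assumes the SimpleXML extension is enabled. It loads the upload as a string, lets simplexml_load_string decide whether it is well-formed, and reads values out only as strings.

```php
<?php
// Hypothetical helper: returns the parsed document, or null when the
// upload is not well-formed XML.
function parseUploadedXml(string $path): ?SimpleXMLElement
{
    $content = file_get_contents($path);
    if ($content === false || $content === '') {
        return null;
    }

    // Collect parse errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    // LIBXML_NONET forbids network access while the document is parsed.
    $xml = simplexml_load_string($content, SimpleXMLElement::class, LIBXML_NONET);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $xml === false ? null : $xml;
}

// Usage with two throwaway files: one real XML document, one PHP script.
$xmlFile = tempnam(sys_get_temp_dir(), 'up');
file_put_contents($xmlFile, '<?xml version="1.0"?><note><title>Hello</title></note>');
$phpFile = tempnam(sys_get_temp_dir(), 'up');
file_put_contents($phpFile, '<?php echo "not xml"; ?>');

$doc = parseUploadedXml($xmlFile);
if ($doc !== null) {
    // Cast to string: the value is treated as data and never evaluated.
    $title = (string) $doc->title;
}
$rejected = parseUploadedXml($phpFile) === null; // no root element, so parsing fails

unlink($xmlFile);
unlink($phpFile);
```

One caveat with this approach: a `<?php ... ?>` block is syntactically a valid XML processing instruction, so a file that contains such a block followed by a root element can still parse successfully. Well-formedness alone is therefore not proof that the file is free of PHP; the protection comes from only ever reading the content as data and never passing the uploaded file to include, require, or a PHP-enabled web path.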