Introduction
When you use WebClient to download remote resources, you often encounter URLs like these:
http://www.uushare.com/filedownload?user=icesee&id=2205188
http://www.guaishow.com/u/luanfujie/g9675/
From the URL alone we cannot tell whether it points to a Web page or to a file of some kind.
Some URLs do carry an extension, but it may be the wrong one, such as a GIF file served with a JPG extension.
If we cannot determine the real file type of the download source, we cannot save it in the correct format, which causes problems for later processing and for anyone opening the file by hand.
Fortunately, the response obtained through WebRequest reports the MIME type of the download source, which lets us determine the true format of the file and choose the extension to store it under.
Creating a MIME mapping dictionary
The first thing we need to do is create a mapping dictionary of MIME types to their corresponding extensions.
I found a list of MIME types on the Internet, converted it to program code with regular expressions, and pasted the result into the program.
The code generated this way is very long.
Note that many entries share the same MIME type but have different extensions. When filling the dictionary we keep only the first entry for each type and ignore the rest. For example, three entries in the list have the audio/x-aiff type, so the extensions of the second and third are not added to the dictionary and are never used in later operations.
If the extension that ends up mapped to a type is not the most common one for that type, you have to adjust the code manually. (This is the case in the examples below, where text/html maps to the dhtml extension and image/jpeg maps to the jpe extension.)
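The generated code is too long to reproduce, but the first-entry-wins rule can be sketched as follows. This is only an illustration under stated assumptions, not the author's generated code: the class name MimeTable, the helper AddIfAbsent, and the handful of sample rows are hypothetical. Here jpg and html are deliberately listed first, which is the kind of manual adjustment mentioned above; the tests later in this article assume the unadjusted list (jpe, dhtml).

```csharp
using System;
using System.Collections.Generic;

public static class MimeTable
{
    // Maps a lowercase MIME type to the extension used when saving the file.
    public static readonly Dictionary<string, string> MimeDic = Build();

    private static Dictionary<string, string> Build()
    {
        var dic = new Dictionary<string, string>();
        // Sample rows only (assumption); the order decides which extension wins.
        AddIfAbsent(dic, "audio/x-aiff", "aif");
        AddIfAbsent(dic, "audio/x-aiff", "aifc"); // ignored: type already mapped
        AddIfAbsent(dic, "audio/x-aiff", "aiff"); // ignored: type already mapped
        AddIfAbsent(dic, "image/gif", "gif");
        AddIfAbsent(dic, "image/jpeg", "jpg");    // listed first so it beats "jpe"
        AddIfAbsent(dic, "image/jpeg", "jpe");    // ignored: type already mapped
        AddIfAbsent(dic, "text/html", "html");
        return dic;
    }

    // Adds the pair only when the MIME type has no extension yet.
    private static void AddIfAbsent(Dictionary<string, string> dic, string mime, string ext)
    {
        if (!dic.ContainsKey(mime)) dic.Add(mime, ext);
    }
}
```

Reordering two rows is then enough to change which extension a type maps to, without touching the lookup code.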
After the dictionary has been built, you can use this method to get the extension of the MIME type:
string GetExtension(string contentType)
{
    // Return the extension of the first known MIME type found in the header value
    foreach (var f in MimeDic.Keys)
    {
        if (contentType.ToLower().IndexOf(f) >= 0) return MimeDic[f];
    }
    return null;
}
The IndexOf method is used here because the incoming contentType may also contain other information, such as the encoding.
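One side effect of a substring search is that a shorter key can match inside a longer type: a key application/x-tex is also found inside application/x-texinfo. A possible alternative, which is not the author's method, is to cut the header value at the first semicolon and look the media type up exactly. In the sketch below MimeLookup, GetExtensionExact, and the three-row table are hypothetical names and data chosen for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class MimeLookup
{
    // Hypothetical sample table; a full list would be much longer.
    private static readonly Dictionary<string, string> MimeDic =
        new Dictionary<string, string>
        {
            { "application/x-tex", "tex" },
            { "application/x-texinfo", "texinfo" },
            { "text/html", "html" }
        };

    // Returns the extension for a Content-Type header value, or null if unknown.
    public static string GetExtensionExact(string contentType)
    {
        if (string.IsNullOrEmpty(contentType)) return null;
        // Drop parameters such as "; charset=utf-8" and normalize the case.
        var semicolon = contentType.IndexOf(';');
        var mediaType = (semicolon >= 0 ? contentType.Substring(0, semicolon) : contentType)
            .Trim().ToLowerInvariant();
        string ext;
        return MimeDic.TryGetValue(mediaType, out ext) ? ext : null;
    }
}
```

The trade-off is that an exact lookup returns null for any type missing from the table, so the table has to be reasonably complete.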
Digression: I have seen complaints on the Internet that Web pages downloaded with WebClient easily come out garbled and that the page encoding is hard to determine. In fact, the ContentType reported through WebRequest contains both the MIME type and the encoding information.
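As a small illustration of that point, the .NET class System.Net.Mime.ContentType can split such a header value into its media type and charset. The wrapper names ContentTypeInfo, GetMediaType, and GetEncoding are assumptions made for this sketch; the header string would normally come from the response's ContentType property.

```csharp
using System;
using System.Net.Mime;
using System.Text;

public static class ContentTypeInfo
{
    // Returns the media type part of a Content-Type header, e.g. "text/html".
    public static string GetMediaType(string header)
    {
        return new ContentType(header).MediaType;
    }

    // Returns the declared encoding, or the fallback when no charset is given.
    public static Encoding GetEncoding(string header, Encoding fallback)
    {
        var charset = new ContentType(header).CharSet;
        return string.IsNullOrEmpty(charset) ? fallback : Encoding.GetEncoding(charset);
    }
}
```

For a header such as "text/html; charset=utf-8", the returned encoding can be passed to a StreamReader when reading the page text. Note that the ContentType constructor throws on a null or empty string, so a real caller would guard against a missing header.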
Generating the download file path
With the method above we can determine the file extension from the MIME type.
Next we write a method that generates the path of the download file. It does the following:
It parses the source URL and uses its file name part as the name of the download file.
If the URL contains no file name part (it is a bare domain name or a directory), the host name or the directory name is used as the file name instead.
It replaces the original extension in the URL (if any) with the extension determined from the incoming MIME type.
It checks whether a file with the same name already exists in the storage directory and, if so, appends a counter to the name until no file of that name exists.
The method does a bit too much to make a tidy example, but it is very practical, so I am sharing it here.
The code is:
string GenerateDownloadFilePath(string storageDirectory, Uri uri, string contentType)
{
    var ex = GetExtension(contentType);
    string up = null;   // URL path used as the source of the file name
    string upne = null; // file name without extension
    if (uri.LocalPath == "/")
    {
        // The URL is a bare domain name
        up = upne = uri.Host;
    }
    else
    {
        if (uri.LocalPath.EndsWith("/"))
        {
            // The URL is a directory
            up = uri.LocalPath.Substring(0, uri.LocalPath.Length - 1);
            upne = Path.GetFileName(up);
        }
        else
        {
            // The URL is a regular file URL
            up = uri.LocalPath;
            upne = Path.GetFileNameWithoutExtension(up);
        }
    }
    var name = string.IsNullOrEmpty(ex) ? Path.GetFileName(up) : upne + "." + ex;
    var fn = Path.Combine(storageDirectory, name);
    var x = 1;
    // Append (1), (2), ... until the name is not taken
    while (File.Exists(fn))
    {
        fn = Path.Combine(storageDirectory, Path.GetFileNameWithoutExtension(name) + "(" + x++ + ")" + Path.GetExtension(name));
    }
    return fn;
}
To verify its effect, we evaluate it through a unit test:
[TestMethod]
public void GenerateDownloadFilePathTest()
{
    var d = @"C:\Users\Public\Downloads";
    // A GIF file, downloaded normally
    Assert.AreEqual(@"C:\Users\Public\Downloads\35ad5275ed17904d4a2d40f3dacea80b.gif", GenerateDownloadFilePath(d, new Uri("/upload/2009-11/20091112231022422.gif"), "image/gif"));
    // The extension in the URL is gif, but the MIME type says the resource is image/jpeg. The stored extension is jpe because that is the extension mapped to image/jpeg in the dictionary MimeDic.
    Assert.AreEqual(@"C:\Users\Public\Downloads\35ad5275ed17904d4a2d40f3dacea80b.jpe", GenerateDownloadFilePath(d, new Uri("/upload/2009-11/20091112231022422.gif"), "image/jpeg"));
    // A Web page URL with query parameters. The stored extension is dhtml because that is the extension mapped to text/html in the dictionary MimeDic.
    Assert.AreEqual(@"C:\Users\Public\Downloads\filedownload.dhtml", GenerateDownloadFilePath(d, new Uri("http://www.uushare.com/filedownload?user=icesee&id=2205188"), "text/html"));
    // A Web page URL in directory form, without an explicit file name
    Assert.AreEqual(@"C:\Users\Public\Downloads\g9675.dhtml", GenerateDownloadFilePath(d, new Uri("http://www.guaishow.com/u/luanfujie/g9675/"), "text/html"));
    // Domain name form
    Assert.AreEqual(@"C:\Users\Public\Downloads\www.g.cn.dhtml", GenerateDownloadFilePath(d, new Uri("http://www.g.cn/"), "text/html"));
    Assert.AreEqual(@"C:\Users\Public\Downloads\g.cn.dhtml", GenerateDownloadFilePath(d, new Uri("http://g.cn"), "text/html"));
}
File download
Everything is in place; all that remains is the download method itself:
/// <summary>
/// Downloads the file to the specified directory and returns the path where it was stored.
/// </summary>
/// <param name="uri">The URL to download</param>
/// <param name="storageDirectory">The storage directory; if it already contains a file with the same name as the download, the new file is renamed automatically</param>
/// <returns>The path of the stored file</returns>
public string DownloadFile(Uri uri, string storageDirectory)
{
    var q = WebRequest.Create(uri).GetResponse();
    var s = q.GetResponseStream();
    var b = new BinaryReader(s);
    var file = GenerateDownloadFilePath(storageDirectory, uri, q.ContentType);
    FileStream fs = new FileStream(file, FileMode.Create, FileAccess.Write);
    // ContentLength is -1 when the server does not report a length,
    // so this read assumes the Content-Length header is present.
    var data = b.ReadBytes((int)q.ContentLength);
    // Write only the bytes actually read, which may be fewer than ContentLength
    fs.Write(data, 0, data.Length);
    fs.Close();
    b.Close();
    s.Close();
    return file;
}
The code is simple enough that it needs no explanation. Let's finish with the final test:
[TestMethod]
public void DownloadFileTest()
{
    var d = @"C:\Users\Public\Downloads";
    // First download
    Assert.AreEqual(@"C:\Users\Public\Downloads\filedownload.dhtml", DownloadFile(new Uri("http://www.uushare.com/filedownload?user=icesee&id=2205188"), d));
    // Second download: a file with the same name exists, so the new one is renamed automatically
    Assert.AreEqual(@"C:\Users\Public\Downloads\filedownload(1).dhtml", DownloadFile(new Uri("http://www.uushare.com/filedownload?user=icesee&id=2205188"), d));
    // Download a file that is actually a GIF despite its .jpg extension
    Assert.AreEqual(@"C:\Users\Public\Downloads\2naqyw8.gif", DownloadFile(new Uri("http://i38.tinypic.com/2naqyw8.jpg"), d));
}
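The download method reads the whole body in one call sized by ContentLength, which depends on the server reporting a length. When the length is unknown (ContentLength is -1), a chunked copy loop is a common alternative. The sketch below is an illustration under stated assumptions rather than part of the original sample: StreamSaver and SaveStream are hypothetical names, and the caller would pass the stream returned by GetResponseStream() as source.

```csharp
using System;
using System.IO;

public static class StreamSaver
{
    // Copies everything from source into a new file at path; returns the bytes written.
    public static long SaveStream(Stream source, string path)
    {
        var buffer = new byte[8192];
        long total = 0;
        // The using block closes the file even if reading fails midway.
        using (var fs = new FileStream(path, FileMode.Create, FileAccess.Write))
        {
            int read;
            // Read returns 0 only at the end of the stream.
            while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
            {
                fs.Write(buffer, 0, read);
                total += read;
            }
        }
        return total;
    }
}
```

Because the loop never holds more than one buffer in memory, it also copes with large downloads better than a single ReadBytes call.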
Conclusion
Compared with WebClient, WebRequest offers finer control. When WebClient cannot solve a problem, try WebRequest instead.
Download the sample source code and the XPS version of this article:
http://xiazai.jb51.net/200911/yuanma/asp.net_mime_down.rar
Reproduced from http://skyd.cnblogs.com/