Saturday, January 13, 2007

The Dark Side of File Uploads

I saw a December MSDN article, entitled Uploading Files in ASP.NET 2.0, and wanted to offer my comments on some gotchas with uploading files. I’ve spent countless hours and tried numerous hacks to tame file uploading, and I have plenty of bruises from hitting my head against the wall (figuratively speaking).

ASP.NET 1.x shipped with the HtmlInputFile control, while 2.0 has a brand-new FileUpload control, although its HTML counterpart is still there. As a quick recap, you declare an upload control as follows:

[ASP.NET 1.x]
Select File To Upload to Server:
<input id="MyFile" type="file" runat="server" />

[ASP.NET 2.0]
Select File To Upload to Server:
<asp:FileUpload id="FileUpload1" runat="server" />


Uploading files in ASP.NET is very inefficient. To be fair, IIS is a bigger offender than ASP.NET itself. When you pick a file and submit your form, IIS needs to suck it all in, and only then do you have access to the properties of the uploaded file(s). IIS 5 does it this way. IIS 6 does it this way. IIS 7 promises to be more like Apache in this respect. Until then, there’s not much you can do about the fact that you have to sit through a long upload and wait. Neither can you display a meaningful progress bar, because there’s no way to know how much has been transmitted at any given time.

Once IIS buffers your upload, ASP.NET takes it from there. By default, you can upload no more than 4096 KB (4 MB). To raise this limit, you need to adjust maxRequestLength in the <httpRuntime> config section.

The larger the file, the longer it takes to upload. ASP.NET kills requests that take too long; consequently you also need to increase executionTimeout.

The default is 90 seconds in 1.0 and 1.1, and 110 seconds in 2.0.
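
As a rough illustration, both settings live on the same element in web.config. The values below (20 MB and five minutes) are arbitrary placeholders, not recommendations. maxRequestLength is given in kilobytes and executionTimeout in seconds, and executionTimeout is only enforced when debug is set to false in the <compilation> element.

```xml
<configuration>
  <system.web>
    <!-- Placeholder values: 20480 KB = 20 MB, 300 seconds = 5 minutes -->
    <httpRuntime maxRequestLength="20480" executionTimeout="300" />
  </system.web>
</configuration>
```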

There’s also a new shutdownTimeout attribute, but I don’t understand its purpose yet.

Files That Are Too Large


It gets really interesting if someone uploads a file that is too large. Regardless of what your maxRequestLength setting mandates, IIS has to guzzle it (remember?), and then ASP.NET checks its size against your size limit. At this point it throws an exception. Peek inside the GetEntireRawContent() method of HttpRequest and you see this:

HttpRuntimeConfig config1 =
    (HttpRuntimeConfig) this._context.GetConfig("system.web/httpRuntime");

int num1 = (config1 != null) ? config1.MaxRequestLength : 0x400000;
if (this.ContentLength > num1)
{
    this.Response.CloseConnectionAfterError();
    throw new HttpException(400,
        HttpRuntime.FormatResourceString("Max_request_length_exceeded"));
}

The rest of this method assembles the file piece by piece in case the file was preloaded only in part, and then checks if its size exceeds the imposed limit. In either case, if an end-user uploads an oversized file, he/she will see a timeout “white page of death”. I put together a sample project with a custom error page, but I always get a white page instead.

Since it is theoretically possible that the file isn’t loaded in full, I’d like to know if one can configure IIS to read it in chunks. I haven’t seen any guidance or articles that explain how. If somebody out there knows, please share.

How Do I Save Face?

It’s difficult to explain to an end-user that it’s not their fault that the file happened to be too large or the page took too long to upload a file and timed out. Is there a way to tap into this process early and save face by failing gracefully? I have a couple of ideas, none of them perfect.


You may override Page.OnError and inspect the HTTP code, which should be 400, if the exception happens to be of type HttpException. This is kludgy.
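
A minimal sketch of this first idea, assuming ASP.NET 2.0 code-behind. The class name UploadPage and the UploadTooLarge.aspx page are hypothetical. Whether the exception reaches the page at all depends on when the request body gets read, so treat this as a starting point rather than a guaranteed fix:

```csharp
using System;
using System.Web;
using System.Web.UI;

// Hypothetical page class; the name is an assumption for illustration.
public class UploadPage : Page
{
    protected override void OnError(EventArgs e)
    {
        // Only intercept HttpExceptions that carry a 400 status code.
        HttpException ex = Server.GetLastError() as HttpException;
        if (ex != null && ex.GetHttpCode() == 400)
        {
            Server.ClearError();
            // Hypothetical friendly error page.
            Response.Redirect("~/UploadTooLarge.aspx");
        }
        else
        {
            base.OnError(e);
        }
    }
}
```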

You may also implement an HttpModule, set up its BeginRequest handler and compare Request.ContentLength with the size limit (which you can read straight from web.config). If ContentLength is too high, redirect to a page with a meaningful error message. I believe ContentLength may or may not reflect the total size of the uploaded file, so this approach is not 100% accurate.
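
The HttpModule idea could look roughly like the sketch below, which assumes ASP.NET 2.0 (the HttpRuntimeSection class does not exist in 1.x). The module name UploadSizeGuardModule and the UploadTooLarge.aspx target are made up for illustration, and the module still has to be registered under <httpModules> in web.config. Because the browser may still be busy sending the file when the check runs, it is not guaranteed to honor the redirect:

```csharp
using System;
using System.Web;
using System.Web.Configuration;

// Hypothetical module name; register it under <httpModules> in web.config.
public class UploadSizeGuardModule : IHttpModule
{
    public void Init(HttpApplication application)
    {
        application.BeginRequest += new EventHandler(OnBeginRequest);
    }

    public void Dispose()
    {
    }

    private static void OnBeginRequest(object sender, EventArgs e)
    {
        HttpContext context = ((HttpApplication)sender).Context;

        // Read the configured limit straight from web.config.
        HttpRuntimeSection runtime = (HttpRuntimeSection)
            WebConfigurationManager.GetSection("system.web/httpRuntime");

        // MaxRequestLength is expressed in KB; ContentLength is in bytes.
        long limitBytes = (long)runtime.MaxRequestLength * 1024;

        if (context.Request.ContentLength > limitBytes)
        {
            // Hypothetical friendly error page.
            context.Response.Redirect("~/UploadTooLarge.aspx");
        }
    }
}
```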

Custom HttpModule to Track Progress

I think the best and most accurate solution would be to implement an HttpModule whose sole purpose would be to read a file in chunks and keep the page alive. This way it won’t time out, and you’ll be able to track progress and cancel an upload. Telerik has such a server control + HttpModule combo for sure. Other vendors should have similar offerings.

Uploading Multiple Files

The samples I’ve seen demonstrate 3–5 file field controls, all statically declared. Why 3? Why 5? What’s the magic number? There’s none. Since none of them show how to add file fields on the fly, I decided to write a sample that does.

When you upload several files from the same page, you can access them all via the Request.Files collection:

HttpFileCollection uploads = HttpContext.Current.Request.Files;

Let’s declare one file field which you can treat as an instance of HtmlInputFile thanks to the runat="server" attribute.

<p id="upload-area">
    <input type="file" runat="server" size="60" />
</p>
<p>
    <a href="#" onclick="addFileUploadBox(); return false;">Add file</a>
</p>
<p>
    <asp:Button ID="btnSubmit" runat="server"
        Text="Upload"
        OnClick="btnSubmit_Click" />
</p>

This snippet has a link which adds a file field on the fly when clicked. addFileUploadBox is a JavaScript function that performs some DOM manipulation. I noticed that as long as you have at least one HtmlInputFile or FileUpload control on the page, you can add as many other file fields as you want, and ASP.NET will nicely package them into the Request.Files collection. Go figure.

By clicking the Add file link you add multiple file fields, assign their id and name attributes (otherwise the corresponding files won’t be thrown into Request.Files), and add them to the upload area.

All this with only a single server-side upload control! The server-side code on the bottom shows how to process all uploaded files. Remember to check for zero file size in case someone added an upload box but didn’t pick a file. Feel free to copy and paste sample code and play with it.
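
For reference, server-side processing along these lines might look like the sketch below. It assumes the btnSubmit_Click handler wired up in the markup above, a hypothetical page class MultiUploadPage, and an existing ~/App_Data/uploads folder that the worker process can write to. None of these names come from the original sample.

```csharp
using System;
using System.IO;
using System.Web;
using System.Web.UI;

// Hypothetical code-behind class; the name is an assumption for illustration.
public class MultiUploadPage : Page
{
    protected void btnSubmit_Click(object sender, EventArgs e)
    {
        HttpFileCollection uploads = Request.Files;

        // Assumed target folder; it must already exist and be writable.
        string targetFolder = Server.MapPath("~/App_Data/uploads");

        for (int i = 0; i < uploads.Count; i++)
        {
            HttpPostedFile file = uploads[i];

            // Skip upload boxes that were added but left empty.
            if (file.ContentLength == 0)
            {
                continue;
            }

            // Some browsers send the full client path, so keep only the file name.
            string fileName = Path.GetFileName(file.FileName);
            file.SaveAs(Path.Combine(targetFolder, fileName));
        }
    }
}
```

Files sharing a name overwrite each other in this sketch, so a real page would probably want to generate unique names.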

Conclusion

File uploading in ASP.NET is a very imprecise and imperfect science. This is one area I’d love to see improved in the future. If you find yourself struggling with it, don’t worry—you’re in good company. Stick with stock server controls for rudimentary uploads. Otherwise look around for third-party products.