A long journey through chunked transfer and file uploading

It has been a while since I wrote my last post … I know, I’ve still got quite a lot of stuff left open, but sooner or later you’ll get it. Don’t worry 😉

Over the last few days I’ve been working on improving some server-side APIs for a client of mine.

Currently they’re using a mix of AMF- and HTTP-based APIs but, since the AMF implementation is unfortunately a bit unstable, they want to move to a fully HTTP-based API … that sounded quite easy: take a bit of HTTP, dress it with POST and maybe XML (or whatever other ingredient you like most) and you’ll have a good base to start porting the API.

But I’d overlooked a small detail that turned out to be really painful to manage using just HTTP: efficiently sending big chunks of binary data to the server.

These were the requirements I had to fulfill:

1. I have potentially really big binary files to be sent quickly to the server;
2. I have to track upload progress somehow;
3. I must be able to trigger the upload without user interaction;
4. I should rely on HTTP only.

And here are the solutions I’ve tried to implement, and the conclusions I’ve reached (well, I didn’t try all the options because I already knew that a few of them were not suitable for my situation, but they might be useful to someone and so I’m reporting them anyway).

Single POST by using URLLoader

That sounds like a good starting point, but unfortunately it doesn’t meet requirements 1 and 2 for a few reasons. To be sent correctly to the server without using multipart (see below), the data must be encoded into an ASCII string somehow (raw bytes may contain characters that get interpreted as part of the HTTP protocol even though they belong to the data …); the easiest way seems to be base64 encoding, but unfortunately it increases the size of the data by ~1/3, which is way too much for large files. The other bad thing is that by using URLLoader you cannot track upload progress.
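To put a number on that ~1/3 overhead, here is a minimal Python sketch (not the ActionScript client code; the helper name `base64_overhead` and the sample sizes are just assumptions for illustration). Base64 maps every 3 raw bytes to 4 ASCII characters, so the payload grows by roughly a third whatever the file size.

```python
import base64
import os


def base64_overhead(raw_size: int) -> float:
    """Return the relative size increase caused by base64-encoding raw_size bytes."""
    payload = os.urandom(raw_size)       # random bytes standing in for a binary file
    encoded = base64.b64encode(payload)  # ASCII-safe representation of the same data
    return (len(encoded) - raw_size) / raw_size


if __name__ == "__main__":
    # Hypothetical file sizes, chosen only to show that the ratio stays constant.
    for size in (3_000, 300_000, 3_000_000):
        print(f"{size:>9} bytes -> +{base64_overhead(size):.1%}")
```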

discarded

Multiple POST by using URLLoader

That sounds better than before, because we can implement a simple API on the server side to accept multiple chunks of data. This way we can easily track progress by splitting the data into small chunks and sending them one by one to the server, along with a unique key that the server can use to assemble them together (a lot of other security options must be taken into account … but let’s not talk about those now).
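The chunk-plus-key idea can be sketched roughly like this in Python (just an illustration of the scheme, not the real client or server; `plan_upload`, `reassemble`, the field names and the 64 KB chunk size are all hypothetical). Each chunk travels with the upload key, its index and the total count, so progress is simply `(index + 1) / total` and the server can rebuild the file once it has every piece.

```python
import uuid
from typing import Dict, Iterator, List


def plan_upload(data: bytes, chunk_size: int = 64 * 1024) -> Iterator[Dict]:
    """Yield one message per chunk, all sharing the same upload key."""
    upload_key = uuid.uuid4().hex          # hypothetical key format
    total = -(-len(data) // chunk_size)    # ceiling division
    for index in range(total):
        start = index * chunk_size
        yield {
            "key": upload_key,
            "index": index,
            "total": total,
            "payload": data[start:start + chunk_size],
        }


def reassemble(messages: List[Dict]) -> bytes:
    """Server side: order the chunks by index and join them back together."""
    ordered = sorted(messages, key=lambda m: m["index"])
    if not ordered or len(ordered) != ordered[0]["total"]:
        raise ValueError("some chunks are missing")
    return b"".join(m["payload"] for m in ordered)
```

In a real client each message would become one POST, and the progress bar would be updated after each response.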

Unfortunately there’s still one problem left: data must be encoded using base64 …

discarded

Multiple file uploads by using a URLLoader

Things start to become more complicated, but this might be a good option. Since we simulate a file upload, we can send the data without encoding it with base64 by using multipart/form-data. The server can assemble the chunks together when the last one is uploaded and, as before, we can track upload progress.

But … well, Flash doesn’t let you simulate a file upload using URLLoader without user interaction. This means that if you trigger the first upload using a button it works, but it won’t for the following ones. Uff, we were close to an easy solution …
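For reference, this is more or less what a simulated file upload body looks like on the wire, sketched in Python rather than ActionScript (the function name `build_multipart`, the field name and the file name are assumptions). The binary payload sits untouched between the boundary lines, which is why no base64 is needed.

```python
import uuid
from typing import Optional, Tuple


def build_multipart(field_name: str, filename: str, payload: bytes,
                    boundary: Optional[str] = None) -> Tuple[str, bytes]:
    """Build a single-file multipart/form-data body; return (Content-Type, body)."""
    boundary = boundary or uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n"
        "\r\n"
    ).encode("ascii")
    tail = f"\r\n--{boundary}--\r\n".encode("ascii")  # closing boundary
    # The payload is copied as-is: raw bytes, no ASCII encoding step.
    return f"multipart/form-data; boundary={boundary}", head + payload + tail
```

The returned content type goes into the request's Content-Type header so the server knows which boundary to look for; the boundary must not appear inside the payload, which a random value makes very unlikely.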

discarded

Multiple file uploads by using a Socket connection

The biggest problem with the last solution was the security check … but what if we don’t use the URLLoader class and just simulate an HTTP upload by using a Socket connection on port 80 (or whatever else)? Sounds interesting and seems to work locally, but once you upload the script to the server our big friend (Mr. Security) comes into play again: obviously a Socket cannot connect to a server port without explicit permission from the server. And how do you provide that permission? You must reply to the message <policy-file-request/> with a valid policy file, either on the same port as the connection or on port 843.

Well, not that difficult, but there are two big issues: first of all I’m connecting to port 80, and the web server cannot reply to a message that is not valid HTTP (well … I could hack the web server … but that doesn’t sound like an option :P); and, even though I implemented a simple socket server running on port 843, it was never called by the Flash Player …

discarded (even if still a valid option, had port 843 been called)
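The policy exchange itself is tiny. Here is a minimal Python sketch of such a policy server (not the one I actually ran; the wide-open `domain="*"` rule and the names `policy_reply` and `serve_policy` are assumptions for illustration). Flash Player sends the request string terminated by a null byte and expects the policy XML back, also null-terminated.

```python
import socketserver
from typing import Optional

POLICY_REQUEST = b"<policy-file-request/>\x00"

# Assumed policy: any domain may open sockets to port 80. Tighten this for real use.
POLICY_FILE = (
    b'<?xml version="1.0"?>'
    b"<cross-domain-policy>"
    b'<allow-access-from domain="*" to-ports="80"/>'
    b"</cross-domain-policy>\x00"
)


def policy_reply(received: bytes) -> Optional[bytes]:
    """Return the policy file for a policy request, or None for anything else."""
    return POLICY_FILE if received == POLICY_REQUEST else None


class PolicyHandler(socketserver.BaseRequestHandler):
    def handle(self) -> None:
        reply = policy_reply(self.request.recv(1024))
        if reply is not None:
            self.request.sendall(reply)


def serve_policy(port: int = 843) -> None:
    """Blocking policy server; ports below 1024 usually need elevated privileges."""
    with socketserver.TCPServer(("", port), PolicyHandler) as server:
        server.serve_forever()
```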

Chunked transfer encoding by using a Socket connection

First a quick note: we cannot simulate chunked transfer encoding by using URLLoader because, after the first valid HTTP message, we must send a series of messages that are not valid HTTP on their own until the data has been sent completely.

On top of that, we must encode the data with base64 and we suffer from the security issues noted before.
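For anyone curious about what chunked transfer encoding looks like on the wire, here is a small Python sketch of the body framing only (headers and the Socket plumbing are left out, and the function name `encode_chunked` is an assumption). Every chunk is prefixed by its size in hexadecimal and followed by CRLF, and a zero-length chunk closes the body.

```python
from typing import Iterable


def encode_chunked(chunks: Iterable[bytes]) -> bytes:
    """Frame the given pieces of data as an HTTP/1.1 chunked message body."""
    out = bytearray()
    for chunk in chunks:
        if not chunk:
            continue  # a zero-length chunk would end the body too early
        out += f"{len(chunk):X}\r\n".encode("ascii")  # chunk size in hex
        out += chunk
        out += b"\r\n"
    out += b"0\r\n\r\n"  # last chunk, with no trailer fields
    return bytes(out)
```

The request carrying this body would declare `Transfer-Encoding: chunked` instead of a `Content-Length`.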

discarded (even if possibly an elegant option :P)

Streaming data to the server by using a Socket connection

I’ve thought a bit about how to get rid of the security issue caused by using a Socket connection (without hacking the HTTP server …) and the only solution I’ve found was to implement a custom socket streaming server to manage data transfer on a separate channel than HTTP, and use a simple HTTP API to trigger the right functions on the server when the data has been transferred correctly.

Unfortunately this breaks requirement number 4 …
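Just to give an idea of what such a custom channel could look like, here is a Python sketch of one possible framing (purely hypothetical: the 8-byte length header, the block size and the function names are my assumptions, not a description of a finished protocol). The sender announces the total length first, so the receiver can report progress and knows exactly when the transfer is complete.

```python
import struct
from typing import BinaryIO, Callable

HEADER = struct.Struct(">Q")  # hypothetical header: payload length, unsigned 64-bit big-endian


def send_stream(out: BinaryIO, payload: bytes, block_size: int = 8192) -> None:
    """Write the length header followed by the raw payload, block by block."""
    out.write(HEADER.pack(len(payload)))
    for offset in range(0, len(payload), block_size):
        out.write(payload[offset:offset + block_size])


def receive_stream(src: BinaryIO, on_progress: Callable[[int, int], None],
                   block_size: int = 8192) -> bytes:
    """Read one framed payload, calling on_progress(received, total) per block."""
    header = src.read(HEADER.size)
    if len(header) != HEADER.size:
        raise EOFError("stream ended before the header was complete")
    (total,) = HEADER.unpack(header)
    received = bytearray()
    while len(received) < total:
        block = src.read(min(block_size, total - len(received)))
        if not block:
            raise EOFError("stream ended before the payload was complete")
        received += block
        on_progress(len(received), total)
    return bytes(received)
```

Both functions work on any binary file-like object, so with a real connection they could be fed the object returned by `sock.makefile("rwb")`.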

discarded

So what to do? It seems that there are no really valid options for my situation. And so I’ve opted for the one that breaks just the least important requirement for me (since I have full access to the server and I can run whatever program I want there): streaming data to the server by using a Socket connection. This breaks requirement number 4, but the good thing is that I’m totally free to optimize the communication as I want, since I have full access to both the client and the server code related to the data transfer.

That sounds like a good tradeoff, even if the best option would probably be to keep relying on HTTP and just find a way to make Flash Player call port 843 when asking for the policy file … maybe the solution is just around the corner and I can’t see it … any hints?
