Implementing a resumable upload protocol in Node.js (tus resumable upload protocol)

Source: Internet
Author: User

I recently used Node.js to implement the tus resumable upload protocol, reworking the approach several times along the way.
Code address: https://github.com/maddemon/tus-demo
It is only a demo, so the code is incomplete (there is no exception handling, for example); I only wanted to implement the core logic. After all, it is not a real project.

Tus Protocol address.
It is an HTTP-based resumable upload protocol that relies on different HTTP methods and a few custom headers to make uploads resumable.
The methods and headers are:
1. HEAD
A HEAD request returns the status of the uploaded file. The server responds with the ranges of the file that have already been written, from which the client can work out what is still missing.
(PS: I just noticed that the protocol has changed; it did not use an offset before.)
My implementation differs slightly from tus, though: I support uploading one file from several threads at the same time, so I cannot follow the protocol exactly.
Example of the header received by the client:
Range: bytes=0-9,20-29
This means that bytes 0-9 and 20-29 have been written, so they do not need to be uploaded again when resuming.
If the header received is Range: bytes=0-1023 and the file length is 1024, the file has been uploaded completely.
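
As a rough illustration of the header above, here is a minimal sketch of how such a value could be built, parsed, and checked for completeness. The function names (formatRanges, parseRanges, isComplete) are made up for this example and are not taken from the demo code; the sketch also assumes the server has already merged adjacent ranges.

```javascript
// Build a header value such as "bytes=0-9,20-29" from inclusive [start, end] pairs.
function formatRanges(ranges) {
  return 'bytes=' + ranges.map(([start, end]) => `${start}-${end}`).join(',');
}

// Parse a value such as "bytes=0-9,20-29" (spaces around "=" and "," are tolerated).
function parseRanges(headerValue) {
  const list = headerValue.replace(/^\s*bytes\s*=\s*/, '');
  if (list.trim() === '') return [];
  return list.split(',').map((part) => {
    const [start, end] = part.trim().split('-').map(Number);
    return [start, end];
  });
}

// Assumption: ranges are already merged, so a finished upload is the
// single range 0 .. size-1.
function isComplete(ranges, size) {
  return ranges.length === 1 && ranges[0][0] === 0 && ranges[0][1] === size - 1;
}
```

With these helpers, isComplete(parseRanges('bytes=0-1023'), 1024) is true, while the 0-9,20-29 example is not complete for any file size.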

2. POST
A POST request creates a file. Its header must declare the size of the file, for example:
Content-Range: */1024 indicates that the file size is 1024 bytes.

Response: the server tells the client the address of the file, so that the client knows where to PUT the data. For example:
Location: http://tus.example.org/files/24e533e02ec3bc40c387f1a0e460e216
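
One way the server could come up with such an address is sketched below. This is only an assumption about how an identifier like the one above might be generated (16 random bytes rendered as 32 hex characters); createUpload and its parameters are hypothetical names, and the declared size is simply passed in by the caller.

```javascript
const crypto = require('crypto');

// Hypothetical helper: allocate an id for a new upload and build the
// Location value the server would return from the POST request.
function createUpload(baseUrl, size) {
  // 16 random bytes -> 32 hex characters, the same shape as the example id.
  const id = crypto.randomBytes(16).toString('hex');
  return { id, size, location: `${baseUrl}/files/${id}` };
}
```

For example, createUpload('http://tus.example.org', 1024).location has the same form as the Location header shown above.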

3. PUT
PUT requests upload the file content. To make the upload resumable, split the file into N small blocks and upload them one block at a time. If the connection is interrupted, the client can send a HEAD request next time to see which blocks have already been uploaded.
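
The client-side bookkeeping for this can be sketched roughly as follows. The names splitIntoBlocks and remainingBlocks are invented for the example, and the block size is whatever the client chooses; ranges are inclusive [start, end] pairs like the ones in the HEAD response.

```javascript
// Split a file of `size` bytes into inclusive [start, end] blocks
// of at most `blockSize` bytes each.
function splitIntoBlocks(size, blockSize) {
  const blocks = [];
  for (let start = 0; start < size; start += blockSize) {
    blocks.push([start, Math.min(start + blockSize, size) - 1]);
  }
  return blocks;
}

// Keep only the blocks that are not fully covered by a range the
// server reported as written.
function remainingBlocks(blocks, written) {
  return blocks.filter(([start, end]) =>
    !written.some(([ws, we]) => ws <= start && end <= we));
}
```

For a 30-byte file split into 10-byte blocks, a server response of 0-9 and 20-29 leaves only the block 10-19 to upload.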

4. GET
A GET request downloads the file; there is nothing more to say about it here.

The protocol is roughly the same as the original tus protocol, except that several blocks can be uploaded at the same time, so the upload is not sequential, and the HEAD response changes accordingly.

Implementation idea:
The client uploads the data in several blocks; the server receives each block and stores it. How does the server remember which blocks have been uploaded?
My idea is to keep a metadata file that records the upload status, file size, original file name, and so on.
For now the metadata is stored in a file, since this is just a demo. In real use, I think a database would be more appropriate.
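
To make the idea concrete, here is a minimal sketch of what such a metadata record could look like. The field names and the createMetadata and addRange helpers are assumptions for illustration, not the structure used in the demo repository. The record is a plain object, so it can be written to a file with JSON.stringify.

```javascript
// Hypothetical metadata record for one upload.
function createMetadata(fileName, size) {
  return { fileName, size, ranges: [] };
}

// Record a written block (inclusive offsets) and merge it with any
// overlapping or directly adjacent ranges.
function addRange(metadata, start, end) {
  const sorted = [...metadata.ranges, [start, end]].sort((a, b) => a[0] - b[0]);
  const merged = [];
  for (const [s, e] of sorted) {
    const last = merged[merged.length - 1];
    if (last && s <= last[1] + 1) {
      last[1] = Math.max(last[1], e); // extend the previous range
    } else {
      merged.push([s, e]); // gap before this range, start a new one
    }
  }
  metadata.ranges = merged;
  return metadata;
}
```

Because adjacent ranges are merged, a finished upload ends up with a single range from 0 to size - 1, which is easy to test for.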

My detours:
First attempt:
How can a block be written into the target file at the right place? I discussed this with colleagues, but we did not find a solution at the time, other than writing the blocks in sequence in append mode.
But how can the server guarantee the order when the client uploads from several threads, out of order?
The solution I settled on was to save each uploaded block as a separate small file. After each block arrives, the server checks whether all blocks have been uploaded and, if so, merges them.

To determine which blocks had been written, I had to traverse all the small files and check whether they were contiguous. Because file operations in Node.js are asynchronous, this took a lot of recursion, which was very painful. You can see it in the commit history on GitHub: the oldest versions are full of recursion, and the code is ugly and hard to maintain.
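
For comparison, the same approach can be sketched with async/await in place of recursion. This is not the code from the repository: it assumes, purely for illustration, that each part file is named "<start>-<end>" after its inclusive byte offsets, and the function names are invented.

```javascript
const fs = require('fs');
const path = require('path');

// Assumed naming scheme: a part file called "0-9" holds bytes 0..9.
function sortedParts(names) {
  return names
    .map((name) => {
      const [start, end] = name.split('-').map(Number);
      return { name, start, end };
    })
    .sort((a, b) => a.start - b.start);
}

// True when the sorted parts cover bytes 0 .. size-1 without a gap.
function isContinuous(parts, size) {
  let next = 0;
  for (const part of parts) {
    if (part.start !== next) return false;
    next = part.end + 1;
  }
  return next === size;
}

// Merge the parts into targetPath once every block has arrived.
async function mergeIfComplete(partsDir, targetPath, size) {
  const parts = sortedParts(await fs.promises.readdir(partsDir));
  if (!isContinuous(parts, size)) return false;
  await fs.promises.writeFile(targetPath, ''); // start from an empty file
  for (const part of parts) {
    const data = await fs.promises.readFile(path.join(partsDir, part.name));
    await fs.promises.appendFile(targetPath, data);
  }
  return true;
}
```

The sketch reads each part fully into memory, which is fine for small blocks; a real implementation would probably stream the parts.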

Second attempt:
To reduce the recursion, I introduced a global cache and required every block of a file to have the same size. After each block is uploaded, I record it in the global cache. That removes the need to traverse the small files, eliminates many asynchronous operations, and makes the code much clearer.
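
A cache of this kind might look roughly like the sketch below. The names uploadCache and markBlockUploaded are hypothetical, and the sketch relies on the fixed block size to turn a byte offset into a block index.

```javascript
// Hypothetical in-memory cache: file id -> block indexes received so far.
const uploadCache = new Map();

// Record the block that starts at `offset` and report whether the
// whole file has now been received.
function markBlockUploaded(fileId, size, blockSize, offset) {
  let entry = uploadCache.get(fileId);
  if (!entry) {
    entry = { totalBlocks: Math.ceil(size / blockSize), done: new Set() };
    uploadCache.set(fileId, entry);
  }
  // With a fixed block size, the offset identifies the block directly.
  entry.done.add(Math.floor(offset / blockSize));
  return entry.done.size === entry.totalBlocks;
}
```

Using a Set means that a block uploaded twice is only counted once. The obvious trade-off is that the cache lives in memory, so it is lost when the process restarts.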

Third attempt:
The small-file merge approach works, but it still feels clumsy. For a very large file, merging and traversal will be slow, and the parts are awkward to manage. If Node.js could write to a file at a given offset, the problem would go away. I went through the Node.js documentation and found that I had been opening the file with the w and w+ flags, which truncate the file, whereas r and r+ do not (and r+ also allows writing).
So I changed the code accordingly. Requests arrive concurrently, but writes to the same file must not run concurrently, so I need a lock: before writing a block, the code checks whether the file is locked.
This removed a lot of code and made what remains much clearer.
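
A minimal sketch of this idea, under some assumptions: the file is created once with the w flag, every block is then written through a handle opened with r+, and a per-file promise chain stands in for the lock. The names createFile, writeBlock, and writeQueues are invented for the example, and the queue is only one possible way to serialize writes, not necessarily how the demo does it.

```javascript
const fs = require('fs');

// Hypothetical per-file queue: each write waits for the previous one.
const writeQueues = new Map();

// Create the empty target file. 'w' truncates, so it is used only here.
async function createFile(filePath) {
  const handle = await fs.promises.open(filePath, 'w');
  await handle.close();
}

// Write one block at a byte offset. 'r+' opens the existing file for
// reading and writing without truncating it, so earlier blocks survive.
async function writeBlockNow(filePath, offset, data) {
  const handle = await fs.promises.open(filePath, 'r+');
  try {
    await handle.write(data, 0, data.length, offset);
  } finally {
    await handle.close();
  }
}

// Serialize writes to the same file by chaining them on a promise.
function writeBlock(filePath, offset, data) {
  const previous = writeQueues.get(filePath) || Promise.resolve();
  const next = previous
    .catch(() => {}) // a failed write should not block later ones
    .then(() => writeBlockNow(filePath, offset, data));
  writeQueues.set(filePath, next);
  return next;
}
```

With this in place, blocks can arrive in any order: writing "world" at offset 5 and then "hello" at offset 0 leaves "helloworld" in the file.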

Why Node.js? Because this was an interview assignment that required Node.js.
I still use fs.readSync when initializing the file metadata. A former colleague told me that the Sync methods should not be used in Node.js because they block the entire process, which is terrible.
So I personally think Node.js has obvious drawbacks, and the increase in code complexity is plain to see. As for JavaScript's supposed inherent advantage of being non-blocking, other languages can do the same, so why use Node.js? In addition, JavaScript's syntax is relatively weak, and many things are awkward to implement. I don't like Node.js much.
