Let's use offline (HTML5 offline storage)

Source: Internet
Author: User

What is an offline web application? At first glance, it sounds like a contradiction in terms. Web pages are things you download and render. Downloading implies a network connection. How can you download when you are offline? Of course, you can't. But you can download when you are online. That is how HTML5 offline applications work.

At its simplest, an offline web application is a list of URLs: HTML, CSS, JavaScript, images, or any other kind of resource. The home page of the offline web application points to this list, called a manifest file, which is just a text file located elsewhere on the web server. A web browser that implements HTML5 offline applications reads the list of URLs from the manifest file, downloads the resources, caches them locally, and automatically keeps the local copies up to date as they change. When you try to access the web application without a network connection, your web browser automatically switches over to the local copies instead.

From then on, most of the work is up to you, the web developer. A flag in the DOM tells you whether you are online or offline. Events fire when your offline status changes (one minute you are offline, the next you are online, or vice versa). But that is about it. If your application creates data or saves state, it is up to you to store that data locally while you are offline and synchronize it with the remote server once you are back online. In other words, HTML5 can take your web application offline. What you do once you are there is up to you.
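As a rough illustration of that division of labor, here is a minimal sketch of holding changes locally while offline and sending them once the connection returns. The class name OfflineQueue and the send callback are hypothetical, invented for this example. The only browser features assumed are the navigator.onLine flag and the online and offline events on window.

```javascript
// Minimal sketch: queue changes while offline, flush them when back online.
// OfflineQueue and send are hypothetical names, not part of any HTML5 API.
class OfflineQueue {
  constructor(send, isOnline) {
    this.send = send;         // function that delivers one item to the server
    this.isOnline = isOnline; // function returning true when a connection exists
    this.pending = [];        // items waiting for a connection
  }

  // Deliver immediately when online, otherwise hold the item locally.
  save(item) {
    if (this.isOnline()) {
      this.send(item);
    } else {
      this.pending.push(item);
    }
  }

  // Deliver everything that accumulated while offline, oldest first.
  flush() {
    while (this.pending.length > 0 && this.isOnline()) {
      this.send(this.pending.shift());
    }
  }
}

// Browser wiring (skipped outside a browser): navigator.onLine is the flag,
// and the window "online" event fires when the connection comes back.
if (typeof window !== "undefined" && typeof navigator !== "undefined") {
  const queue = new OfflineQueue(
    (item) => console.log("would send to server:", item),
    () => navigator.onLine
  );
  window.addEventListener("online", () => queue.flush());
  window.addEventListener("offline", () => console.log("now offline"));
}
```

A real application would also need to persist the pending items somewhere durable, since this sketch keeps them only in memory.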

Offline support

IE   Firefox   Safari   Chrome   Opera   iPhone   Android

no   yes       yes      yes      no      yes      yes

Who has been using offline?

The idea of offline web applications actually predates HTML5, and so do some implementations. In other words, HTML5 offers one way to go offline, but there are others. I will talk about one of them later in this chapter: Gears. Some of these early adopters have already switched to HTML5, and others are in the process of switching.

Gmail, Google's web-based email

Zoho, online productivity and collaboration apps

Remember the Milk, an online task management system

WordPress, an open-source personal publishing platform

Cache List

Offline web applications revolve around the cache list file, also known as the cache manifest. What is a list file? It is a list of all the resources your web application might need to access while it is disconnected from the network. To bootstrap the process of downloading and caching these resources, you need to point to the list file using the manifest attribute on your <html> element.

<!DOCTYPE html>

<html manifest="/cache.manifest">

<body>

...

</body>

</html>

Your cache list file can be located anywhere on your web server, but it must be served with the content type text/cache-manifest. If you are running an Apache-based web server, you probably just need to add an AddType directive to the .htaccess file at the root of your web directory:

AddType text/cache-manifest .manifest

Then make sure that the name of your cache list file ends with .manifest. If you use a different web server or a different Apache configuration, consult your server's documentation on controlling the Content-Type header.
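If you serve files from your own code rather than from Apache, the same rule applies: the cache list must go out as text/cache-manifest. The following is a minimal sketch of an extension lookup for a hand-written server. The function name contentTypeFor and the table of extensions are assumptions made for this example, not part of any server's API.

```javascript
// Sketch: pick a Content-Type from the file extension.
// contentTypeFor and CONTENT_TYPES are illustrative names only.
const CONTENT_TYPES = {
  ".manifest": "text/cache-manifest",
  ".html": "text/html",
  ".css": "text/css",
  ".js": "application/javascript",
  ".jpg": "image/jpeg",
};

function contentTypeFor(fileName) {
  // Take everything from the last dot onward as the extension.
  const dot = fileName.lastIndexOf(".");
  const ext = dot === -1 ? "" : fileName.slice(dot).toLowerCase();
  // Fall back to a generic binary type for anything unknown.
  return CONTENT_TYPES[ext] || "application/octet-stream";
}
```

In a Node.js http handler, for instance, the result could be passed to response.setHeader("Content-Type", ...) before the file is sent.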

FAQ:

Q: My web application spans more than one page. Do I need a manifest attribute on every page, or can I just put it on the home page?

A: Every page of your web application needs a manifest attribute that points to the cache list for the entire application.

OK, so every one of your HTML pages points to your cache list file, and your cache list file is served with the proper Content-Type header. But what goes in the list file? This is where things get interesting.

The first line of each cache list file is this:

CACHE MANIFEST

After that, every list file is divided into three sections: the "explicit" section, the "fallback" section, and the "online whitelist" section. Each section has a header on its own line. If the list file has no section headers, all the listed resources are implicitly in the "explicit" section. Try not to dwell on the terminology, or your head will explode.

Here is a valid list file. It lists three resources: a CSS file, a JavaScript file, and a JPEG image.

CACHE MANIFEST

/clock.css

/clock.js

/clock-face.jpg

This cache list file has no section headers, so all the listed resources are in the "explicit" section by default. Resources in the "explicit" section are downloaded and cached locally, and are used in place of their online counterparts whenever you are disconnected from the network. So, upon loading this cache list file, your browser will download clock.css, clock.js, and clock-face.jpg from the root directory of your web server. You can then unplug your network cable and refresh the page, and all of those resources will be available offline.

FAQ:

Q: Do I need to list my HTML pages in my cache list?

A: Yes and no. If your entire web application is contained in a single page, just make sure that page points to the cache list using the manifest attribute. When you visit an HTML page with a manifest attribute, the page itself is assumed to be part of the web application, so you don't need to list it in the list file itself. However, if your web application spans multiple pages, you should list all of the HTML pages in the list file. Otherwise the browser will not know that there are other HTML pages that need to be downloaded and cached.

Network section

Here is a slightly more complicated example. Suppose you want your clock application to track visitors, using a tracking.cgi script that the page loads dynamically. Caching this resource would defeat the purpose of tracking, so it should never be cached and never be available offline. Here is how you do that:

CACHE MANIFEST

NETWORK:

/tracking.cgi

CACHE:

/clock.css

/clock.js

/clock-face.jpg

This cache list file includes section headers. The line marked "NETWORK:" is the beginning of the "online whitelist" section. Resources in this section are never cached and are not available offline. (Attempting to load them while offline results in an error.) The line marked "CACHE:" is the beginning of the "explicit" section. The rest of the cache list file is the same as the previous example. Each of the three resources listed will be cached and available offline.
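To make the section rules concrete, here is a minimal sketch of how a cache list file could be split into its sections. The function parseManifest is a hypothetical helper written for this article. It covers only the rules described here (the required first line, comment lines starting with #, the three section headers, and the default "explicit" section) and is not a full implementation of the specification.

```javascript
// Sketch of a cache list parser. parseManifest is a hypothetical helper and
// handles only the rules described in this article.
function parseManifest(text) {
  const lines = text.split(/\r?\n/).map((line) => line.trim());
  if (lines[0] !== "CACHE MANIFEST") {
    throw new Error("first line must be CACHE MANIFEST");
  }
  const result = { cache: [], network: [], fallback: [] };
  let section = "cache"; // resources before any header are "explicit"
  for (const line of lines.slice(1)) {
    if (line === "" || line.startsWith("#")) continue; // blank line or comment
    if (line === "CACHE:") { section = "cache"; continue; }
    if (line === "NETWORK:") { section = "network"; continue; }
    if (line === "FALLBACK:") { section = "fallback"; continue; }
    if (section === "fallback") {
      // A fallback line holds two URLs separated by whitespace.
      const [pattern, substitute] = line.split(/\s+/);
      result.fallback.push({ pattern, substitute });
    } else {
      result[section].push(line);
    }
  }
  return result;
}
```

Feeding it the example above would place /tracking.cgi in the network list and the three clock resources in the cache list.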

Fallback section

There is one more type of section in a cache list file: the fallback section. In a fallback section, you can define substitutes for online resources that, for whatever reason, can't be cached or weren't cached successfully. The HTML5 specification offers this clever example of using a fallback section:

CACHE MANIFEST

FALLBACK:

/ /offline.html

NETWORK:

*

So what does this do? First, consider a site that contains millions of pages, like Wikipedia. You couldn't possibly download the entire site, nor would you want to. But suppose you could make part of it available offline. How would you decide which pages to cache? How about this: every page you ever look at on a hypothetical offline-enabled Wikipedia is downloaded and cached. That would include every encyclopedia entry you ever visited, every talk page (where you can hold impromptu discussions about a particular entry), and every edit page (where you can change a particular entry).

That is what this cache list does. Suppose every HTML page on Wikipedia (entry, talk page, edit page, history page) points to this cache list file. When you visit any page that points to a cache list, your browser says, "Hey, this page is part of an offline web application. Is it one I know about?" If your browser has never downloaded this particular cache list file, it sets up a new offline application cache, downloads all the resources listed in the cache list, and then adds the current page to the application cache. If your browser already knows about this cache list, it simply adds the current page to the existing application cache. Either way, the page you just visited ends up in the application cache. This is important. It means you can have an offline web application that "lazily" adds pages as you visit them. You don't need to list every single one of your HTML pages in the cache list.

Now look at the fallback section. The fallback section in this cache list has only a single line. The first part of the line (before the space) is not a URL. It is really a URL pattern. The single character (/) matches any page on your site, not just the home page. When you try to visit a page while you are offline, your browser looks for it in the application cache. If your browser finds the page there (because you visited it while online, and the page was implicitly added to the application cache at that time), it displays the cached copy. If your browser does not find the page in the application cache, it displays the page /offline.html, as given in the second half of that line, instead of an error message.

Finally, let's examine the network section. The network section in this cache list also has only a single line, containing a single character (*). This character has a special meaning in a network section. It is called the "online whitelist wildcard flag." It is a fancy way of saying that anything not in the application cache can still be downloaded from its original web address, as long as you have an internet connection. This is important for an "open-ended" offline web application. It means that, while you are browsing this hypothetical offline-enabled Wikipedia online, your browser will fetch images, videos, and other embedded resources normally, even if they are on a different domain. (This is common on large websites, even those that are not part of an offline web application: HTML pages are generated and served locally, while images and videos are served from a CDN on another domain.) Without this wildcard, our hypothetical offline-enabled Wikipedia would behave strangely while you were online. Specifically, it would not load any images or videos hosted on a different domain.

Is this example complete? No. Wikipedia is more than HTML files. It uses common CSS, JavaScript, and images on every page. Each of these resources would need to be listed explicitly in the "CACHE:" section of the list file for pages to display and behave properly offline. But the point of the fallback section is that you can have an open-ended offline web application that extends beyond the resources you have listed explicitly in the list file.
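The offline lookup described in this section can be sketched as a small function. This is only an illustration of the decision, since a real browser does all of this internally. The name resolveOffline and the example URLs are hypothetical, and the rule that the longest matching fallback pattern wins is an assumption of this sketch.

```javascript
// Sketch of how a request could be resolved while offline. resolveOffline is
// a hypothetical function, not a browser API.
function resolveOffline(url, cachedUrls, fallbacks) {
  // 1. A page or resource already in the application cache is served as is.
  if (cachedUrls.has(url)) return url;

  // 2. Otherwise look for fallback patterns that are a prefix of the URL.
  //    Assumption: when several match, the longest pattern wins.
  let best = null;
  for (const entry of fallbacks) {
    if (url.startsWith(entry.pattern) &&
        (best === null || entry.pattern.length > best.pattern.length)) {
      best = entry;
    }
  }
  if (best !== null) return best.substitute;

  // 3. Nothing cached and no fallback: the request fails.
  return null;
}

// Example with made-up URLs: one page visited earlier, one never visited.
const visited = new Set(["/wiki/HTML5", "/offline.html"]);
const fallbackRules = [{ pattern: "/", substitute: "/offline.html" }];
console.log(resolveOffline("/wiki/HTML5", visited, fallbackRules));
console.log(resolveOffline("/wiki/Gears", visited, fallbackRules));
```

The first call returns the cached page itself, and the second falls through to /offline.html because the pattern / is a prefix of every path on the site.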

Event Flow

So far, I've talked about offline web applications, the cache list, and the offline application cache in vague, semi-magical terms. Things get downloaded, browsers make decisions, and everything just works. You know better than that, right? I mean, this is web development we're talking about. Nothing ever just works.

First, let's talk about the flow of events, specifically DOM events. When your browser visits a page that points to a cache list, it fires off a series of events on the window.applicationCache object. I know this looks complicated, but trust me, this is the simplest version I could come up with that doesn't leave out important information.

1. As soon as your browser notices a manifest attribute on the <html> element, it fires a checking event. (All the events listed here are fired on the window.applicationCache object.) The checking event is always fired, regardless of whether you have previously visited this page or any other page that points to the same cache list.

2. If your browser has never seen this cache list before...

o It fires a downloading event, then starts to download the resources listed in the cache list.

o While it is downloading, your browser periodically fires progress events, which contain information such as how many files have been downloaded so far and how many are still queued for download.

o After all the resources listed in the cache list have been downloaded successfully, the browser fires one final event, cached. This is your signal that the offline web application is fully cached and ready to be used offline. That's it, you're done.

3. On the other hand, if you have previously visited this page or any other page that points to the same cache list, your browser already knows about this cache list. It may already have some resources in the application cache. It may even have the entire working offline web application in the application cache. So now the question is, has the cache list changed since the last time your browser checked it?

o If the answer is no, the cache list has not changed, and your browser immediately fires a noupdate event. That's it, you're done.

o If the answer is yes, the cache list has changed, and your browser fires a downloading event and starts re-downloading every single resource listed in the cache list.

o While it is downloading, your browser periodically fires progress events, which contain information such as how many files have been downloaded so far and how many are still queued for download.

o After all the resources listed in the cache list have been re-downloaded successfully, the browser fires one final event, updateready. This is your signal that the new version of your offline web application is fully cached and ready to be used offline. The new version is not yet in use. To switch to the new version without forcing the user to reload the page, you can manually call the window.applicationCache.swapCache() function.

If anything goes horribly wrong at any point in this process, your browser fires an error event and stops. Here is a brief list of things that can go wrong:

• The cache list returned an HTTP error 404 (Page Not Found) or 410 (Permanently Gone).

• The cache list was found and had not changed, but the HTML page that pointed to it failed to download properly.

• The cache list was found and had changed, but the browser failed to download one of the resources listed in it.
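The event sequences above can be summarized in a few lines of code. The following is a sketch only: expectedEvents is a hypothetical helper that returns the order of events for a given situation, with the simplifications that one progress event is reported per resource and that a failure is reported right after the download starts. The second half attaches listeners to window.applicationCache and is skipped in any environment where that object is not available.

```javascript
// Sketch of the event sequences described above. expectedEvents is a
// hypothetical helper; the event names are the ones fired on
// window.applicationCache.
function expectedEvents({ seenBefore, manifestChanged, resourceCount, downloadFailed }) {
  const events = ["checking"]; // always fired first
  if (seenBefore && !manifestChanged) {
    events.push("noupdate"); // known cache list, nothing new
    return events;
  }
  events.push("downloading");
  if (downloadFailed) {
    events.push("error"); // one failed resource aborts the whole process
    return events;
  }
  for (let i = 0; i < resourceCount; i++) {
    events.push("progress"); // simplification: one event per resource
  }
  // First visit ends with "cached", an update ends with "updateready".
  events.push(seenBefore ? "updateready" : "cached");
  return events;
}

// Browser wiring, skipped where the API is not available.
if (typeof window !== "undefined" && window.applicationCache) {
  const appCache = window.applicationCache;
  const names = ["checking", "downloading", "progress", "cached",
                 "noupdate", "updateready", "error"];
  for (const name of names) {
    appCache.addEventListener(name, () => console.log("appcache:", name));
  }
  // Switch to the new version without waiting for a reload.
  appCache.addEventListener("updateready", () => appCache.swapCache());
}
```

Logging every event this way is also a cheap debugging aid, since the error event by itself says nothing about what went wrong.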

The art of debugging

I want to call out two important points here. The first is something you just read, but I bet it didn't really sink in, so here it is again: if even a single resource listed in your cache list file fails to download properly, the entire process of caching your offline web application fails. Your browser fires the error event, but there is no indication of what the actual problem was. This can make debugging offline web applications even more frustrating than usual.

The second important point is something that is not, technically speaking, an error, but it will look like a serious browser bug until you understand what is going on. It has to do with exactly how your browser checks whether a cache list file has changed. This is a three-phase process. It is tedious but important, so pay attention.

1. Via normal HTTP semantics, your browser checks whether the cache list has expired. Just like any other file served over HTTP, the cache list comes with meta-information that your web server includes in the HTTP response headers. Some of these HTTP headers (Expires and Cache-Control) tell your browser how long it is allowed to cache the file without ever asking the server whether it has changed. This kind of caching has nothing to do with offline web applications. It happens for pretty much every HTML page, stylesheet, script, image, or other resource on the web.

2. If the cache list has expired (according to its HTTP headers), your browser asks the server whether there is a new version, and if so, downloads it. To do this, your browser issues an HTTP request that includes the last-modified date of the cache list file, which your web server included in the HTTP response headers the last time the browser downloaded the list file. If the web server determines that the file has not changed since that date, it simply returns a 304 (Not Modified) status. Again, none of this is specific to offline web applications. It happens for essentially every kind of resource on the web.

3. If the web server thinks the list file has changed since that date, it returns an HTTP 200 (OK) status code, followed by the contents of the new file along with new Cache-Control headers and a new last-modified date, so that steps 1 and 2 will work properly the next time. (HTTP is cool, and web servers are always planning for the future. If your web server absolutely must send you a file, it does everything it can to make sure it doesn't have to send it a second time.) Once it has downloaded the new cache list file, your browser compares the contents against the copy it downloaded last time. If the contents of the cache list file are the same as they were before, your browser will not re-download any of the resources listed in it.
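The three phases can be written down as a single decision function. This is a simplified sketch for reasoning about the process, not how any browser is implemented. The name checkManifest and the field names expiresAt, lastModified, and body are hypothetical, and all timestamps are plain numbers.

```javascript
// Sketch of the three-phase check. checkManifest and its field names are
// hypothetical; timestamps are plain numbers (milliseconds) for simplicity.
function checkManifest(cached, server, now) {
  // Phase 1: the HTTP cache entry is still fresh, so the server is not asked.
  if (now < cached.expiresAt) {
    return "fresh: server not contacted";
  }
  // Phase 2: conditional request; the server answers 304 if nothing changed.
  if (server.lastModified <= cached.lastModified) {
    return "304: not modified";
  }
  // Phase 3: the server sent a new copy (200); compare it with the old one.
  if (server.body === cached.body) {
    return "200: same contents, resources kept";
  }
  return "200: contents changed, re-download resources";
}
```

Only the last outcome leads to the resources in the cache list being downloaded again, which is why each of the earlier phases can hide an update from you.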

When you are developing and testing your offline web application, any one of these steps can trip you up. For example, say you deploy one version of your cache list file, then 10 minutes later you realize you need to add another resource to it. No problem, right? Just add another line and redeploy. Here is what will happen: you reload the page, your browser notices the manifest attribute, it fires the checking event, and then... nothing. Your browser stubbornly insists that the cache list file has not changed. Why? Because, by default, your web server is probably configured to tell browsers to cache static files for a few hours (via HTTP semantics, using Cache-Control headers). That means your browser never gets past step 1 of the three-phase process. Sure, the web server knows that the file has changed, but your browser never even gets around to asking. The last time your browser downloaded the cache list, the web server told it to cache the resource for a few hours, and 10 minutes later that is exactly what your browser is doing.

To be clear, this is not a bug, it is a feature. Everything is working exactly the way it is supposed to. If web servers had no way to tell browsers (and intermediate proxies) to cache things, the web would collapse overnight. But that is no comfort to you after you have spent a few hours trying to figure out why your browser won't notice your updated cache list. (And even better, if you wait long enough, it will mysteriously start working again! Because the HTTP cache expired! Just like it is supposed to! Kill me! Kill me now!)

So here is one thing you absolutely should do: reconfigure your web server so that your cache list file is not cacheable by HTTP semantics. If you are running an Apache-based web server, these two lines in your .htaccess file will do the trick:

ExpiresActive On

ExpiresDefault "access"

This actually disables caching for every file in that directory and all of its subdirectories. That is probably not what you want in production, so you should either qualify it with a <Files> directive so that it only affects your cache list file, or create a subdirectory that contains nothing but this .htaccess file and your cache list file. As usual, configuration details vary by web server, so consult your server's documentation for how to control HTTP caching headers.
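For servers written by hand, the same idea can be expressed as choosing caching headers per file. The sketch below is one possible approach under stated assumptions: the helper name cachingHeadersFor is hypothetical, cache list files are recognized by the .manifest extension, and "Cache-Control: no-cache" is used to make the browser revalidate the file on every visit.

```javascript
// Sketch: choose HTTP caching headers per file so that only the cache list
// bypasses HTTP caching. cachingHeadersFor is a hypothetical helper.
function cachingHeadersFor(fileName, maxAgeSeconds) {
  if (fileName.endsWith(".manifest")) {
    // Force the browser to revalidate the cache list on every visit.
    return { "Cache-Control": "no-cache", "Expires": "0" };
  }
  // Everything else may be cached for the given number of seconds.
  return { "Cache-Control": "max-age=" + maxAgeSeconds };
}
```

Keeping ordinary static files cacheable while exempting only the cache list mirrors the advice above about restricting the Apache setting to a single file.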

Once you have disabled HTTP caching on the cache list file itself, you can still run into situations where you have changed one of the resources in the application cache, but it is still at the same URL on your web server. Here, step 2 of the three-phase process trips you up. If your cache list file has not changed, the browser will never notice that one of the previously cached resources has changed. Consider the following example:

CACHE MANIFEST

# Rev 42

clock.js

clock.css

If you change clock.css and redeploy it, you will not see the changes, because the cache list file itself has not changed. Every time you change one of the resources in your offline web application, you need to change the cache list file itself. This can be as simple as changing a single character. The easiest way I have found is to include a comment line with a revision number. Change the revision number in the comment, and the web server will return the newly changed cache list file, your browser will notice that its contents have changed, and it will re-download all the resources listed in it.

CACHE MANIFEST

# Rev 43

clock.js

clock.css
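Bumping the revision comment is easy to forget, so it can be worth automating as part of deployment. Here is a minimal sketch, assuming the comment has exactly the form "# Rev N" on its own line as in the example above. The function name bumpRevision is hypothetical.

```javascript
// Sketch: increase the number in a "# Rev N" comment line by one.
// bumpRevision is a hypothetical helper for a deployment script.
function bumpRevision(manifestText) {
  let bumped = false;
  // The m flag lets ^ and $ match at the start and end of each line.
  const updated = manifestText.replace(/^# Rev (\d+)$/m, (match, n) => {
    bumped = true;
    return "# Rev " + (Number(n) + 1);
  });
  if (!bumped) {
    throw new Error("no '# Rev N' comment found");
  }
  return updated;
}
```

Applied to the first example, this turns "# Rev 42" into "# Rev 43" and leaves every other line untouched.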
