http://www.zhihu.com/question/20455165
gu linging, Baidu front-end engineer, http://Lync.in
What is semantics? Simply put, it means that machines can understand the content.
First, a few loose thoughts. For today's web, HTML is both the link that ties most web resources together and the carrier of content. When the web was first designed, Tim Berners-Lee may not have anticipated the scale it would reach or how deeply it would enter so many aspects of our lives. The original idea may have been simple: publish an index of web content and resources so that people could browse it easily.
However, as the web grew, the amount of information went beyond what humans could process by hand. People began using machines to process the content published on the web, and the search engine was born. Later, people designed various intelligent programs to process and mine the indexed content. As a result, it has become more and more important for machines to understand the content published on the web.
In fact, HTML carried a certain amount of "semantics" from the very beginning, including paragraphs, tables, images, headings, and so on. However, these mostly served to let user agents (UAs) such as browsers handle them appropriately. Gradually, machines also came to rely on the semantics provided by HTML, together with natural language processing techniques, to "read" the HTML documents they fetch from the Internet. But they cannot understand what "red text" means, or the meaning of content inside deeply nested table layouts, because too much existing content was designed specifically for visual browsers. Faced with this situation, two views emerged:
- We can bring machine understanding closer and closer to human understanding, so that whatever people can understand, machines can understand too;
- We should describe content with machine-readable, widely recognized semantic information at the time we publish it (HTML itself is already a small step in this direction).
I drew a figure to illustrate this. The figure shows that the semantic expressiveness of the content and the intelligence level of AI together determine how well machines can analyze and process web content. View 1 above heads toward human-level artificial intelligence, while view 2 is exactly the idea of Sir Tim Berners-Lee, the founder of the World Wide Web: the Semantic Web. I will not say much about the Semantic Web here. Simply put, it aims to turn all content, and the descriptions of the relationships between pieces of content, into resources on the web that are each identified by a unique URI, have clear semantics, and are machine-readable. Obviously, the ultimate goals of both paths are still far away: the first is hard to achieve technically, while the second faces too many obstacles in adoption.
I think the web semantics we can actually see and touch today is a small step in the second direction, namely improving the already widely recognized HTML standard. At first, we realized that we must return to the content itself, express its semantics properly, and then design different style descriptions for different user agents. This is what we call the separation of content and style. So when we provide content, the first thing to do is describe the content itself properly, without worrying for the moment about what the final presentation will look like.
The HTML specification has in fact always been working in the semantic direction. Many elements and attributes were designed with the goal of helping user agents, and even web crawlers, understand HTML documents better. HTML5 goes further: building on previous specifications, it modified or removed all presentational semantics and added many elements that express richer semantics. Why are such semantic elements meaningful? Because they are widely recognized. Semantics is, in essence, a consensus about symbols. The more widely and strongly that consensus is recognized, the more people can rely on it to implement all kinds of functionality.
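To make the point about consensus concrete, here is a minimal Python sketch using only the standard library. The class `LandmarkCollector`, the function `find_landmarks`, and both sample snippets are hypothetical and exist only for illustration. The idea it tries to show: a program can locate the parts of a page by name only when the author used element names that everyone has agreed on.

```python
from html.parser import HTMLParser

# Element names whose meaning is fixed by the HTML5 specification.
SEMANTIC_TAGS = {"header", "nav", "main", "article", "section", "aside", "footer"}


class LandmarkCollector(HTMLParser):
    """Record semantic HTML5 elements in document order (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.landmarks = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser hands us lowercased tag names.
        if tag in SEMANTIC_TAGS:
            self.landmarks.append(tag)


def find_landmarks(html):
    """Return the semantic element names found in an HTML string."""
    collector = LandmarkCollector()
    collector.feed(html)
    collector.close()
    return collector.landmarks


# Made-up sample markup: the same content written two ways.
DIV_SOUP = '<div class="top"><div class="menu">Home</div></div><div class="post">Hello</div>'
SEMANTIC = "<header><nav>Home</nav></header><article>Hello</article>"

print(find_landmarks(DIV_SOUP))  # []
print(find_landmarks(SEMANTIC))  # ['header', 'nav', 'article']
```

The first call finds nothing because class names such as `menu` are one site's private convention, while the second works on any page that uses the shared vocabulary.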
HTML5 is not the only specification that web semantics relies on. Besides the W3C and WHATWG, other organizations are also contributing to the extension and standardization of web semantics. As long as browser vendors and search engines support them, their specifications can become common infrastructure. For example, the microformats community and http://Schema.org have defined extensions on top of HTML and the microdata specification (http://www.w3.org/TR/html5/microdata.html) respectively, and Google, Bing, Yahoo! and other search engines, as well as mainstream browsers, have accepted the semantic extensions they define to varying degrees and applied them in production.
The following are two examples of Google applying extended semantics.
Google's search results can identify information about people on a crawled page based on the microformats hCard syntax.
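As a rough illustration of how a crawler might pick up hCard data, here is a small Python sketch. It is not how Google does it. The class `HCardParser`, the helper `parse_hcards`, and the sample person are all hypothetical. It assumes the classic hCard class names `vcard`, `fn`, `title`, and `org`, handles only those three properties, and assumes well-nested markup.

```python
from html.parser import HTMLParser

HCARD_PROPERTIES = {"fn", "title", "org"}  # small subset of hCard, for illustration
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}  # elements with no end tag


class HCardParser(HTMLParser):
    """Collect the text of hCard properties inside class="vcard" elements."""

    def __init__(self):
        super().__init__()
        self.cards = []
        self._open = []  # one entry per open element inside the current vcard

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if not self._open:
            if "vcard" not in classes:
                return  # outside any vcard: ignore
            self.cards.append({})
        prop = next((c for c in classes if c in HCARD_PROPERTIES), None)
        self._open.append(prop)

    def handle_endtag(self, tag):
        if self._open and tag not in VOID_TAGS:
            self._open.pop()

    def handle_data(self, data):
        # Text belongs to the innermost enclosing property element, if any.
        prop = next((p for p in reversed(self._open) if p), None)
        text = data.strip()
        if prop and text:
            card = self.cards[-1]
            card[prop] = (card.get(prop, "") + " " + text).strip()


def parse_hcards(html):
    parser = HCardParser()
    parser.feed(html)
    parser.close()
    return parser.cards


# Made-up sample page fragment.
SAMPLE_HCARD = """
<div class="vcard">
  <span class="fn">Jane Doe</span>,
  <span class="title">Editor</span> at
  <span class="org">Example Press</span>
</div>
"""

print(parse_hcards(SAMPLE_HCARD))
# [{'fn': 'Jane Doe', 'title': 'Editor', 'org': 'Example Press'}]
```

The markup looks like ordinary HTML to a browser, but because the class names follow a published convention, a program can recover a structured record from it.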
Google can also read rating scores and other information from microdata embedded in a web page.
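A similarly hedged Python sketch for microdata follows. The class `ItempropCollector`, the helper `collect_itemprops`, and the sample film are hypothetical. The attributes `itemscope`, `itemtype`, and `itemprop` come from the microdata specification, and `Movie`, `AggregateRating`, `ratingValue`, `bestRating`, and `ratingCount` are Schema.org terms. As a simplification, it flattens nested items into one dictionary and reads only text content, which a full microdata parser would not do.

```python
from html.parser import HTMLParser

VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}  # elements with no end tag


class ItempropCollector(HTMLParser):
    """Collect the text of itemprop elements, ignoring item nesting."""

    def __init__(self):
        super().__init__()
        self.properties = {}
        self._open = []  # itemprop name (or None) for each open element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attributes = dict(attrs)
        # An element that starts a nested item holds structure, not a text value.
        is_leaf = "itemprop" in attributes and "itemscope" not in attributes
        self._open.append(attributes["itemprop"] if is_leaf else None)

    def handle_endtag(self, tag):
        if self._open and tag not in VOID_TAGS:
            self._open.pop()

    def handle_data(self, data):
        name = self._open[-1] if self._open else None
        text = data.strip()
        if name and text:
            self.properties[name] = text


def collect_itemprops(html):
    collector = ItempropCollector()
    collector.feed(html)
    collector.close()
    return collector.properties


# Made-up sample page fragment describing a film and its rating.
SAMPLE_MICRODATA = """
<div itemscope itemtype="http://schema.org/Movie">
  <h1 itemprop="name">Example Film</h1>
  <div itemprop="aggregateRating" itemscope itemtype="http://schema.org/AggregateRating">
    <span itemprop="ratingValue">8.5</span> out of
    <span itemprop="bestRating">10</span> from
    <span itemprop="ratingCount">1200</span> votes
  </div>
</div>
"""

print(collect_itemprops(SAMPLE_MICRODATA))
# {'name': 'Example Film', 'ratingValue': '8.5', 'bestRating': '10', 'ratingCount': '1200'}
```

With values like these in hand, a search engine has what it needs to show a rating next to a result without guessing from the visible text.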
As for the semantics of individual HTML5 elements, I made a set of slides a while ago that includes the examples above. You can refer to it here: Semantic HTML (http://justineo.github.com/slideshows/semantic-html).