Question: Single-choice

One of the difficulties in building an SQL-like query language for the Web is the absence of a database (131) for this huge, heterogeneous repository of information. However, if we are interested in HTML documents only, we can construct a virtual schema from the implicit structure of these files. Thus, at the highest level of (132) , every such document is identified by its Uniform Resource Locator (URL) and has a (133) and a text. Also, Web servers provide some additional information, such as the type, length, and last modification date of a document. So, for data mining purposes, we can consider the set of all HTML documents as a relation:
Document (url, title, text, type, length, modif)
where all the (134) are character strings. In this framework, an individual document is identified with a (135) in this relation. Of course, if some optional information is missing from the HTML document, the associated fields will be left blank, but this is not uncommon in any database.
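
As a rough illustration of the relation described above, the following Python sketch models each HTML document as one entry with six string-valued fields and leaves missing optional information blank. The field names follow the passage; the sample URLs, the sample values, and the helper `select` are assumptions made up for this example and are not part of the original text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    """One HTML document; every field is a character string."""
    url: str            # identifies the document
    title: str = ""     # blank when the page has no title
    text: str = ""
    type: str = ""      # reported by the Web server
    length: str = ""    # reported by the Web server
    modif: str = ""     # last modification date, reported by the Web server


def select(documents, **conditions):
    """Return the documents whose fields equal all the given values."""
    return [
        d for d in documents
        if all(getattr(d, name) == value for name, value in conditions.items())
    ]


# Hypothetical sample data, invented for illustration only.
docs = [
    Document(
        url="http://example.org/a.html",
        title="Page A",
        text="Hello",
        type="text/html",
        length="5",
        modif="1996-01-01",
    ),
    # Optional information is missing here, so those fields stay blank.
    Document(url="http://example.org/b.html", text="No title here"),
]

untitled = select(docs, title="")
```

Here `select(docs, title="")` picks out the documents whose title field was left blank, which mirrors how a query over the relation would treat missing optional information.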






