Metadata is information about a document, such as who wrote it and who
published it.
|
|
HTML allows you to embed metadata about a document within the document
itself. Browsers do not display the metadata, but you can see it by
viewing the page source. Metadata may also be represented externally.
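
As an illustration of embedded metadata, the following minimal Python sketch collects the name/content pairs from the `<meta>` tags of an HTML page using the standard library's `html.parser`. The sample page, the class name `MetaTagCollector` and the function `extract_metadata` are hypothetical names invented for this example; this is not how Panoptic itself reads metadata.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collects (name, content) pairs from <meta> tags (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.metadata = []

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased; attribute values keep their case.
        if tag != "meta":
            return
        attr_map = dict(attrs)
        name = attr_map.get("name")
        content = attr_map.get("content")
        if name and content is not None:
            self.metadata.append((name, content))


def extract_metadata(html_text):
    """Return the list of (name, content) pairs found in html_text."""
    collector = MetaTagCollector()
    collector.feed(html_text)
    collector.close()
    return collector.metadata


# Hypothetical sample page: the metadata is in the source but is not
# part of the text a browser would display.
SAMPLE_PAGE = """<html><head>
<title>Annual Report</title>
<meta name="author" content="J. Smith">
<meta name="DC.Publisher" content="Example University">
</head><body><p>Visible text.</p></body></html>"""

for name, content in extract_metadata(SAMPLE_PAGE):
    print(f"{name}: {content}")
```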
|
|
There are many schemes for representing metadata. Panoptic supports
the Dublin Core standard, the AGLS (Australian Government Locator Service)
standard and the informal "Netscape" style of metadata. It is also
designed to permit metadata search over email folders. In addition, it
treats the URL of a web page, its outgoing links, mailto references,
referring anchor text and similar items as useful metadata.
|
|
A general web search engine like Panoptic should allow metadata search
over pages which use different metadata schemata. Panoptic does this
by mapping the various ways of representing a particular kind of metadata
into a single metadata search class, represented by a lowercase letter.
For example, the Netscape author field and the Dublin Core DC.Creator
field are both mapped to class a.
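
The mapping idea can be sketched in a few lines of Python. The field-to-class pairs below are a small subset taken from the class table on this page; the names `FIELD_TO_CLASS`, `metadata_class` and `group_by_class` are hypothetical, and the case-insensitive matching of field names is an assumption made for the example rather than documented Panoptic behaviour.

```python
# A subset of the field-to-class mappings listed on this page.
FIELD_TO_CLASS = {
    "author": "a",
    "dc.creator": "a",
    "dc.publisher": "p",
    "keywords": "s",
    "dc.subject": "s",
    "title": "t",
    "dc.title": "t",
}


def metadata_class(field_name):
    """Return the class letter for a field name, or None if unmapped.

    Field names are compared case-insensitively (an assumption).
    """
    return FIELD_TO_CLASS.get(field_name.strip().lower())


def group_by_class(pairs):
    """Group (field name, value) pairs under their metadata class letter."""
    grouped = {}
    for name, value in pairs:
        letter = metadata_class(name)
        if letter is not None:
            grouped.setdefault(letter, []).append(value)
    return grouped


# Two differently named author fields end up in the same class.
print(group_by_class([("author", "J. Smith"), ("DC.Creator", "K. Jones")]))
```

Because both schemes collapse into one class, a search form only needs to know the single letter, not every field name that feeds it.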
|
|
The Panoptic query language supports
search over metadata classes. For example, the query t:vice-chancellor
means "look for documents with vice-chancellor in their title
metadata", regardless of how the title metadata is actually
represented.
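
To make the class:term form concrete, here is a minimal Python sketch that splits a single query term such as t:vice-chancellor into its class letter and search text. It covers only the form shown above; the function name `parse_query_term` is hypothetical, and the full Panoptic query language may well accept syntax that this sketch does not handle.

```python
import re

# One lowercase class letter, a colon, then the text to search for.
CLASS_TERM = re.compile(r"^([a-z]):(.+)$")


def parse_query_term(term):
    """Split 'letter:text' into (letter, text).

    Returns (None, term) when no single-letter class prefix is present.
    """
    match = CLASS_TERM.match(term)
    if match:
        return match.group(1), match.group(2)
    return None, term


print(parse_query_term("t:vice-chancellor"))
print(parse_query_term("chancellor"))
```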
|
|
Everyday users are not expected to remember the letters assigned to
the various metadata classes! Normally, these letters are
automatically filled in by the appropriately customised version of the
advanced search form. The metadata classes are documented here for
power users and for search form designers.
|
| |
|
| |
| Metadata Class | Explanation | Metadata fields included |
|---|---|---|
| a | Author | Author, DC.Creator, DC.Author, DC.Contributor, from: (email) |
| b | Rights | DC.Rights |
| c | Description | DC.Description |
| d | Date (numeric qty) | DC.Date or Last Modified Date (from HTTP header) |
| e | Type | DC.Type |
| f | Format | DC.Format |
| g | Relation | DC.Relation |
| h | Outgoing links | href |
| i | Images | image names, ALT text |
| j | Availability/Identifier | DC.Identifier, AGLS.Availability |
| k | Anchor text referring to this document | text between <a> and </a> |
| l | Language | DC.Language |
| m | Mailto references | href mailto: |
| n | Source | DC.Source |
| o | Coverage | DC.Coverage |
| p | Publisher | DC.Publisher |
| q | Function | AGLS.Function |
| r | Recipients | to: (email), AGLS.audience |
| s | Subject/Keywords | keywords, DC.Subject, subject: (in the case of email) |
| t | Title | title, DC.Title |
| u | Hostname part of URL | from HTTP header |
| v | Filename part of URL | from HTTP header |
| w | available for customer definition | AGLS.Mandate |
| x | available for customer definition | ? |
| y | available for customer definition | ? |
| z | available for customer definition | ? |
| * | Anywhere. In any metadata field or in the page content. | - |
| |
|