Panoptic Metadata Classes

 
Metadata and Panoptic
 
Metadata is information about a document, such as who wrote it and who published it.
HTML allows you to embed metadata about a document within the document itself. Browsers do not display this metadata, but you can see it if you view the page source. Metadata may also be represented externally.
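As an illustration of embedded metadata, the following minimal sketch reads the <meta> tags of a page using Python's standard html.parser module. The sample page, the class name MetaTagCollector and the function extract_metadata are hypothetical and chosen only for this example; this is not how Panoptic itself reads metadata.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collect (name, content) pairs from <meta> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.metadata = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names, but not values.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content is not None:
            self.metadata.append((name, content))


def extract_metadata(html_text):
    """Return the embedded metadata of a page as a list of pairs."""
    collector = MetaTagCollector()
    collector.feed(html_text)
    collector.close()
    return collector.metadata


# Invented sample page: the metadata is in the source but is not displayed.
SAMPLE_PAGE = """<html><head>
<title>Annual Report</title>
<meta name="author" content="J. Smith">
<meta name="DC.Publisher" content="Example University">
</head><body><p>Visible text.</p></body></html>"""

print(extract_metadata(SAMPLE_PAGE))
```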
There are many schemes for representing metadata. Panoptic supports the Dublin Core standard, the AGLS (Australian Government Locator Service) standard and the informal "Netscape" style of metadata. Panoptic is also designed to permit metadata search over email folders. In addition, it treats the URL of the web page, outgoing links, mailto references, referring anchor text and similar items as useful metadata.
A general web search engine like Panoptic should allow metadata search over pages which use different metadata schemata. Panoptic does this by mapping multiple ways of representing particular metadata into a single metadata search class, represented by a lowercase letter. For example the Netscape author and the Dublin Core DC.Creator fields are mapped to class a.
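The mapping idea can be sketched as a simple lookup table. The entries below are a subset taken from the prefix-code table on this page; the names FIELD_TO_CLASS, metadata_class and group_by_class are hypothetical, and the case-insensitive matching is an assumption made for the example, not documented Panoptic behaviour.

```python
# Subset of the field-to-class mapping listed on this page.
# Keys are stored in lowercase; case-insensitive matching is an assumption.
FIELD_TO_CLASS = {
    "author": "a",
    "dc.creator": "a",
    "dc.author": "a",
    "dc.contributor": "a",
    "dc.description": "c",
    "dc.publisher": "p",
    "keywords": "s",
    "dc.subject": "s",
    "title": "t",
    "dc.title": "t",
}


def metadata_class(field_name):
    """Return the class letter for a metadata field name, or None if unmapped."""
    return FIELD_TO_CLASS.get(field_name.strip().lower())


def group_by_class(pairs):
    """Group (field name, value) pairs under their metadata class letter."""
    grouped = {}
    for name, value in pairs:
        letter = metadata_class(name)
        if letter is not None:
            grouped.setdefault(letter, []).append(value)
    return grouped


# Two different schemes for "author" end up in the same class.
print(group_by_class([("Author", "J. Smith"), ("DC.Creator", "A. Jones")]))
```

With this arrangement, a search restricted to class a would see both values, whichever scheme the page author used.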
The Panoptic query language supports search over metadata classes. For example, the query t:vice-chancellor means "look for documents with vice-chancellor in their title metadata", regardless of how the title metadata is actually represented.
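To make the class:term form concrete, here is a minimal sketch that splits a query term into its class letter and search word. The function name parse_term is hypothetical, and the rules it applies (a single lowercase letter followed by a colon) are inferred from the example above; the real Panoptic query parser may accept other forms.

```python
import string

# The class letters a to z described on this page.
VALID_CLASSES = set(string.ascii_lowercase)


def parse_term(term):
    """Split 't:word' into ('t', 'word'); return (None, term) if no class prefix."""
    prefix, separator, rest = term.partition(":")
    if separator and len(prefix) == 1 and prefix in VALID_CLASSES and rest:
        return prefix, rest
    # No recognisable class prefix: treat the whole term as an ordinary word.
    return None, term


print(parse_term("t:vice-chancellor"))
print(parse_term("vice-chancellor"))
```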
Everyday users are not expected to remember the letters assigned to the various metadata classes! Normally, these letters are automatically filled in by the appropriately customised version of the advanced search form. The metadata classes are documented here for power users and for search form designers.
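For search form designers, the following sketch shows one way a form handler could fill in the class letters on the user's behalf. The form field names, the FORM_FIELD_TO_CLASS table and the build_query function are all hypothetical, and prefixing each word separately is an assumption; they do not describe the actual Panoptic advanced search form.

```python
# Hypothetical form field names mapped to the class letters from this page.
FORM_FIELD_TO_CLASS = {
    "author": "a",
    "title": "t",
    "subject": "s",
    "publisher": "p",
}


def build_query(form_values):
    """Turn {'title': 'vice-chancellor'} into the query 't:vice-chancellor'."""
    terms = []
    for field, text in form_values.items():
        letter = FORM_FIELD_TO_CLASS.get(field)
        for word in text.split():
            # Fields without a class letter contribute plain, unprefixed words.
            terms.append(f"{letter}:{word}" if letter else word)
    return " ".join(terms)


print(build_query({"title": "vice-chancellor", "author": "smith"}))
```

The user only types words into labelled boxes; the letters never need to be remembered.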
 
Metadata Prefix Codes
 
Metadata Class  Explanation                                        Metadata fields included
a               Author                                             Author, DC.Creator, DC.Author, DC.Contributor, from: (email)
b               Rights                                             DC.Rights
c               Description                                        DC.Description
d               Date (numeric qty)                                 DC.Date or Last Modified Date (from HTTP header)
e               Type                                               DC.Type
f               Format                                             DC.Format
g               Relation                                           DC.Relation
h               Outgoing links                                     href
i               Images                                             image names, ALT text
j               Availability/Identifier                            DC.Identifier, AGLS.Availability
k               Anchor text referring to this document             text between <a> and </a>
l               Language                                           DC.Language
m               Mailto references                                  href mailto:
n               Source                                             DC.Source
o               Coverage                                           DC.Coverage
p               Publisher                                          DC.Publisher
q               Function                                           AGLS.Function
r               Recipients                                         to: (email), AGLS.audience
s               Subject/Keywords                                   keywords, DC.Subject, subject: (in the case of email)
t               Title                                              title, DC.Title
u               Hostname part of URL                               from HTTP header
v               Filename part of URL                               from HTTP header
w               available for customer definition                  AGLS.Mandate
x               available for customer definition                  ?
y               available for customer definition                  ?
z               available for customer definition                  ?
*               Anywhere (any metadata field or the page content)  -
 


Panoptic Search Engine

© Copyright CSIRO Australia, 1997-2004.