The Internet provides access to a wide variety of resources in a wide variety of formats. Hence the problem is not a lack of resources but how to connect a user with the resources appropriate to that user's needs. Search engines such as Google or Yahoo! index only a small portion of the resources available through the Internet. Much of what is out there is available through proprietary sources such as EBSCOhost or Cambridge Scientific Abstracts (CSA). However, many users turn to free search engines as their primary means of finding information on the World Wide Web. This makes such search engines major power brokers in the provision of information.
The actual algorithms used by Google, AltaVista, or Yahoo! are not disclosed to the public. However, we do know from their publications some of the basic concepts they use in indexing websites. One of the elements of a website examined by the "search bots" is the set of metadata tags included in a web page. Metadata tags (the term "metadata" just means data about data) tell the automatic indexing systems who created a resource available on the website, what type of resource it is, when it was created, and what its subject matter is.
In this phase you will be adding metadata tags to the web page you began in Phase One. We will use the Dublin Core metadata scheme. Dublin Core was created by librarians in an attempt to facilitate accurate automated indexing of websites by search engines. Dublin in this case refers to Dublin, Ohio—the location of OCLC Headquarters and the site where the Dublin Core initiative was formalized. You can read more about the Dublin Core Metadata Initiative (DCMI) at the Initiative website: http://dublincore.org/. You will be using Dublin Core to describe the following elements of your web page: the title, the creator, the subject, and a very brief description of the content.
First establish a connection to your ISP or use a computer with a permanent Internet connection.
As you did in Phase One, use the SSH Secure Shell Client to log onto UHUNIX.
You can move down two levels into your infocomm subdirectory in one step. At the % prompt type:
cd public_html/infocomm then press the Enter key.
You can check to see where you are in the directory structure by typing pwd and then pressing the Enter key. Reading from left to right will show you where you are from the topmost directory down to the subdirectory in which you are now working.
Unfortunately, it is all too easy to introduce significant errors into a web page. Rather than having to start over from scratch, we can ensure that we have a back-up version of our file to return to when things go really haywire by making a copy of our file before we work on it. We do this with the copy command.
The basic syntax of the command is:
cp [file name] [name of back-up file]
Remember to substitute the name of your file for galadriel_experiment.html in the example:
At the % prompt type:
cp galadriel_experiment.html galadriel_experiment_bk.html
then press the Enter key.
Use the list command (ls) to verify that your back-up copy was created.
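At the % prompt type:
ls
then press the Enter key.
The listing should include both your original file and its back-up copy, for example galadriel_experiment.html and galadriel_experiment_bk.html. (Substitute the name of your file for galadriel_experiment.html.)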
We need to include links to information about the coding and naming schema we are using in order for the indexing "bots" to accurately interpret the indexing information we are going to add to our web page.
At the end of Phase One, the head portion of your web page looked like this:
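Your own file may differ slightly. As a rough sketch, assuming Phase One left only a title element inside the head (the title text is an assumption borrowed from the example used later in this phase), the head portion would resemble:

```html
<head>
<title>Galadriel's LIS 694 Experiment</title>
</head>
```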
Metadata tags must be entered between your beginning and ending head tags. Use your down arrow key to move your cursor to the beginning of the line with the ending head tag (</head>). Add a couple of line spaces to give yourself some space to work.
Enter the links to the schema information as shown below.
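A minimal sketch of such links, assuming the conventional Dublin Core namespace addresses at purl.org (check these against your course materials before relying on them):

```html
<!-- Assumed convention: rel names the scheme prefix, href points to its definition -->
<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">
<link rel="schema.DCTERMS" href="http://purl.org/dc/terms/">
```

The prefix after "schema." (here DC) is the same prefix that begins each property name in the metadata tags, such as DC.title. If your page is written as XHTML, close each tag with " />" instead of ">".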
You can read the above addition as: Here is a link related to the Dublin Core schema. The hypertext reference (web address) of the destination is a persistent URL (PURL). If the owners of the destination web page reorganize their directories, they simply enter the new address as the PURL's destination in the PURL database. The PURL itself remains the same, so links that use it do not break.
Now add your first metadata tag. The tag must indicate the scheme used, the property to be described, the language of the value for the property, and the value itself. The name of the property has two parts: the scheme used and the actual property. We indicate the scheme (Dublin Core) and the property (title) as "DC.title". The language of our title in this case is English, so we indicate that by adding lang="en" within our tag. (For a listing of the 2-character language codes see: http://xml.coverpages.org/iso639a.html.) Finally, we indicate the value of the property by adding content="Galadriel's LIS 694 Experiment". (Remember to substitute the name of your character for Galadriel.)
Enter it as follows:
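Putting together the pieces just described (property name, language, and value), the tag can be sketched as:

```html
<!-- name = scheme.property, lang = language of the value, content = the value -->
<meta name="DC.title" lang="en" content="Galadriel's LIS 694 Experiment">
```

Note that the apostrophe in the title is safe here because the value is enclosed in double quotation marks.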
The creator of your web page is you. So substitute your name for mine when adding your creator tag as in the example below.
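A sketch of the creator tag, following the same pattern as the title tag. "Your Name Here" is a placeholder, not a name taken from the original example:

```html
<!-- Replace the placeholder with your own name -->
<meta name="DC.creator" lang="en" content="Your Name Here">
```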
For our subject metadata tag you can use the Library of Congress Subject Heading "Automatic indexing."
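Following the same pattern, a subject tag using that heading might look like this (a sketch; it does not record that the value comes from the Library of Congress Subject Headings):

```html
<meta name="DC.subject" lang="en" content="Automatic indexing">
```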
The last metadata tag we will add is a description. You can either use the one below or create your own.
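A sketch of a description tag is shown here. The wording of the description is purely illustrative and is an assumption, so feel free to replace it with a sentence of your own:

```html
<!-- Illustrative wording only; write a brief description of your own page -->
<meta name="DC.description" lang="en" content="An experimental web page created to practice adding Dublin Core metadata tags.">
```

Keep the description short and avoid double quotation marks inside the value, since they would end the content attribute early.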
Adding metadata tags should not change the appearance of your web page, but a missing closing angle bracket could. Check to see that the appearance of your web page has not changed. Then send the URL of your web page to your instructor with the message that you have completed Phase Two.