Using Search Engines and Web Directories Effectively to Promote Your Site

How to Get Clear Listings and Accurate Keyword Coverage
from the Most Popular Index Services
by Chip Chapin

 

Updated January 15, 2001
(minor rev May 6, 2003)


This is a reference guide for webmasters who want clearly readable listings and accurate, fair keyword coverage in the most popular search engines and directory services. It offers step-by-step instructions on how to prepare your pages and submit them to more than a dozen of the most popular search engines and directory services. There are no "secret tricks" here for fooling search engines, just advice on how to do a good job.

What's Here?

The first major section, Before You Submit Your Site, contains fundamental tips and techniques that will help you ensure that your site is properly and effectively indexed.  Included are sections on how to effectively use the document Title, Meta Keywords and Meta Description tags.  The next major section is the Checklist of Index Services. More than a dozen of the most important index services are described in detail, including the procedure for submitting a site. The remaining sections cover what to do after you submit your site to ensure the quality of your results, services and tools that can help you with submittals to multiple search engines, and finally references to other web sites that may be able to help you.

Why This Document?

Like most of my web documents, I began writing this guide for my own use.  Just like you, I needed to get my web pages properly indexed on search engines and web directories. This document was how I organized my notes on the subject as I learned.  I didn't write this because I was an expert. Just the opposite: I wrote this because I wasn't an expert, and I needed to learn.


Types of Index Services

I use the term index service to refer to any on-line service that provides some sort of index to other web sites.  There are two main types of index services:

Search Engines
A search engine is a system which systematically "crawls" through the world-wide-web looking for documents to index.  Part of its value is the unexpected nature of the things it turns up, and the completeness of its listings.  AltaVista and Google are among the best-known search engines (though AltaVista also offers a directory service).  The important thing here is that a search engine is constantly scouring the web for new listings or updates to old ones.  Your web site may already be listed, especially if there are links to it from other sites.  A submittal to a search engine just helps to ensure that your site is listed.  A number of names have been used to describe the actual programs which search engines use to scour the web, including spider, web crawler, robot, or just bot.  The terms do not appear to differ in any meaningful way, so I'll just use spider for them all.
Directories
An on-line directory can be thought of as similar to the Yellow Pages™.  One must submit their listing to the directory service, and often even pay for it.  The directory service does not crawl the web to automatically add new listings, or even to update old ones.  But since the listings are created by human editors, web sites can often be better-categorized.  Yahoo! is the leading example of a directory.

Directories usually concentrate on indexing web sites, while search engines typically index individual web pages.  The user of a search engine enters keywords and gets back a list of pages that contain those words.  The user of a directory might not use keywords at all but can browse the topical index for what interests him or her, such as Home > Entertainment > Music > Artists > By Genre > Blues > Allison, Luther.  If they do search for keywords, they will only get a hit if the keyword appears in the web site's description.

Before You Submit Your Site...

You've devoted time and effort to creating a web site and now you want people to find it. Whether yours is a commercial site, an information resource, a political manifesto or a simple personal home page, you want to connect with others who are searching the web for what you have to offer. You could skip this whole section, go straight to the Checklist of Index Services, and start submitting your site to all of them. But you run the risk of receiving poor listings, unreadable listings, or no listing at all.

One of the earliest and most famous maxims of the computer era is "Garbage In, Garbage Out". If you submit a sloppy representation of your site to the search engines and directories you will get sloppy listings, if any.

The designers and builders of search engines devote a great deal of science and engineering to their art just to answer such questions as: "What are the most important keywords on this page?"  and "Which pages are the best response to a query for certain keywords?".  On the other hand, builders of directories rely on their human editors to decide whether or not to include a page, which category to put it in, and how to describe it. In both cases you, as a web page owner, will get much better results if you make it very, very easy for man or machine to make these determinations.

The requirements of a good submittal are surprisingly simple. There are no tricks here; it's just a matter of knowing how to make the essence of your site very clear.

To achieve these goals your first task is to create a high quality web site or document that offers genuine value to others.  Before you spend much time on promotion, ask yourself if what you have is worth promoting.  Now ask yourself, "To whom?" and "How will it benefit them?"  The answers to these questions will help with your second task: to ensure that those answers are readily self-evident, either to a busy human editor or a search engine's keyword weighting algorithm.

Choose a Good Title

We've all heard the stories about how to manage a modern engineering project:

Or the first task in starting a rock band: Pick a great name.

These jokes are funny because we realize how silly it is to put external trappings so far ahead of actual content. And for web pages, quality content is certainly essential. But beyond that, the Title is probably the most critical and most often-neglected element in getting a high quality index listing.

How do most search engines list their entries? By Title. How do most web directories list their entries? By Title. Take some time and get it right. Visitors to an index site usually have only two clues to go on when they choose which sites in a list they are going to visit: the title and description. The title comes first, is typically bold faced, and is sometimes even in a larger font. Choose a good title.

Use the TITLE Tag

Some search engines and directories require you to enter the Title on a submittal form, but most read it directly from the TITLE tag on the page itself. The TITLE tag appears within the HEAD section of an HTML document (please don't confuse it with the H1 title that appears at the top of your page). It is absolutely essential that your carefully chosen page title appear within the TITLE tag.

Besides its role in the results listings, the words in the page title are often heavily weighted by search engines which use heuristics to decide which keywords are most important on a page. 

Use the H1 Tag

It is also helpful to have your title appear at or near the top of the page within an H1 tag.  Some search engines pay close attention to the structure of the document as defined by the header tags (H1, H2, etc.).  Words appearing in header tags may receive greater weight than other words.

Of course the contents of the TITLE and H1 tags don't have to be identical. Typically the highly visible H1 text is more succinct, and the less visible TITLE text may include additional keywords or phrases.

Effective Page Titles
  • Choose a title that captures what your page is all about.
  • The title should contain the most important keywords for your page.
  • Set it in the header using the TITLE tag.
  • Show the title in the body, near the top of the page, using an H1 tag.
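The checklist above is easy to verify mechanically before you submit a page. The following is only a minimal sketch in Python: the sample page, its title text, and the function name check_title are made up for illustration, and it checks only that a TITLE and an H1 are present, not whether they are well chosen.

```python
from html.parser import HTMLParser


class TitleH1Parser(HTMLParser):
    """Collect the text inside the TITLE tag and the first H1 tag."""

    def __init__(self):
        super().__init__()
        self._current = None    # tag we are currently inside, if any
        self._h1_done = False   # only the first H1 is recorded
        self.title = ""
        self.h1 = ""

    def handle_starttag(self, tag, attrs):
        # html.parser reports tag names in lower case
        if tag == "title" or (tag == "h1" and not self._h1_done):
            self._current = tag

    def handle_endtag(self, tag):
        if tag == self._current:
            if tag == "h1":
                self._h1_done = True
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.title += data
        elif self._current == "h1":
            self.h1 += data


def check_title(html):
    """Return a list of problems found with the page's TITLE and H1."""
    parser = TitleH1Parser()
    parser.feed(html)
    problems = []
    if not parser.title.strip():
        problems.append("missing or empty TITLE tag")
    if not parser.h1.strip():
        problems.append("missing or empty H1 tag")
    return problems


# Hypothetical sample page, for illustration only
SAMPLE_PAGE = """<HTML><HEAD>
<TITLE>Compact Disc Media: CD-R and CD-RW Reference Guide</TITLE>
</HEAD><BODY>
<H1>Compact Disc Media</H1>
<P>An introduction goes here.</P>
</BODY></HTML>"""

print(check_title(SAMPLE_PAGE))
print(check_title("<HTML><BODY><P>No title at all</P></BODY></HTML>"))
```

Note that the sample follows the earlier advice: the H1 is short, while the TITLE carries a few extra keywords.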

Here's what the AltaVista help pages have to say about the TITLE:

In the ranking rules that determine which pages will appear near the top of a list of matches, the HTML title is the most important element of the page. When people search for your kind of information, they are most likely to use those words that belong in your HTML title. When you leave the title blank, use unimportant words, or use the same title on many different pages, you've just thrown away the best way to attract traffic to your site.

Write a Good Description

Every search engine and directory listing includes a description of the page. Search engines get their descriptions directly from you, usually through the META Description tag. Directory editors write their own description based on what they see on your page. If your page contains a succinct statement of what's on it and why it's valuable, many editors will use that statement as a starting point. Directory submittal forms typically require you to include a description with the submittal -- you'll want something you can copy and paste.

There are always limits to the sizes of the descriptions that are displayed in listings. It's never more than about 250 characters, so that should be your absolute limit. The limits are often less than 200 characters. Strive to capture the essence of your page within the first 150 characters.
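These size limits can be checked with a few lines of Python. This is just a sketch: the thresholds are the rough figures given above (250, 200 and 150 characters), not limits published by any particular index service, and the function name is my own.

```python
def description_length_report(description):
    """Classify a description against the rough size limits given above."""
    n = len(description)
    if n > 250:
        verdict = "too long: over the 250-character absolute limit"
    elif n > 200:
        verdict = "risky: many listings cut off below 200 characters"
    elif n > 150:
        verdict = "ok, but make sure the first 150 characters stand alone"
    else:
        verdict = "ok"
    return n, verdict


# Hypothetical description, for illustration only
text = "A long-term reference on Compact Disc technology, covering CD-R and CD-RW media."
print(description_length_report(text))
```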

Your description and your title will comprise your entire listing on most search engines. Make it count. Make sure it is readable, and not too cryptic. Consider using complete sentences. Make sure it contains the appropriate keywords for your page. If you feel comfortable describing your page without using any of your keywords, then something is very wrong. I find that the process of writing the Title, Description and Keywords is iterative -- changes in one place suggest changes in the others.

Use the META Description Tag

Your description must appear in the <META name="Description"> tag in the HTML header. The majority of search engines will pick it up from there automatically and use it in their listings.
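To confirm that a page really carries a META Description, you can read it back the way a spider might. The sketch below uses Python's standard html.parser module; the sample page and the function name meta_description are hypothetical, and real spiders may well parse pages differently.

```python
from html.parser import HTMLParser


class MetaParser(HTMLParser):
    """Collect META tags of the form <META name="..." content="...">."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        name = attrs.get("name")
        content = attrs.get("content")
        if name and content is not None:
            # Store names in lower case so "Description" and
            # "description" are treated alike.
            self.meta[name.lower()] = content


def meta_description(html):
    """Return the META Description text, or None if the page has none."""
    parser = MetaParser()
    parser.feed(html)
    return parser.meta.get("description")


# Hypothetical sample page, for illustration only
PAGE = """<HTML><HEAD>
<TITLE>Compact Disc Media</TITLE>
<META name="Description" content="A long-term reference on CD-R and CD-RW media.">
</HEAD><BODY></BODY></HTML>"""

print(meta_description(PAGE))
```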

Include an Introduction

The META Description itself will not appear to readers of your page because it is part of the page header, not the body. A few search engines actually ignore the META Description and instead just describe your page using the first 150-250 characters of text directly from the body of the page. This can look pretty silly sometimes. For these cases, you can control how your page is listed by putting an introductory paragraph at the top of the document. The text should be similar to your META description.
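It can be instructive to see what such a search engine would show for your page. The following Python sketch approximates it by taking the first 200 characters of body text. The 200-character cutoff, the sample page, and the function name are assumptions for illustration; actual engines choose their own cutoff and may treat markup differently.

```python
from html.parser import HTMLParser


class BodyTextParser(HTMLParser):
    """Collect the visible text that appears inside BODY."""

    def __init__(self):
        super().__init__()
        self._in_body = False
        self._skip = 0          # depth inside SCRIPT or STYLE
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self._in_body = True
        elif tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag == "body":
            self._in_body = False
        elif tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if self._in_body and not self._skip:
            self.chunks.append(data)


def fallback_snippet(html, limit=200):
    """Return the first `limit` characters of body text, whitespace collapsed."""
    parser = BodyTextParser()
    parser.feed(html)
    text = " ".join(" ".join(parser.chunks).split())
    return text[:limit]


# Hypothetical page whose body starts with navigation text, not an introduction
PAGE = ("<html><head><title>T</title></head><body>"
        "<p>Home | Links | Email</p><h1>My CD Page</h1><p>Welcome!</p>"
        "</body></html>")

print(fallback_snippet(PAGE))
```

In this made-up page the listing would begin with navigation text, which is exactly the "pretty silly" result that an introductory paragraph at the top avoids.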

There are other good reasons for having an introduction. First, put yourself in the reader's shoes.  You are clicking through lots of web pages trying to find the information you need.  The easier it is for you to determine that a web site includes what you want, the more likely you are to spend time there.  Make it easy for your readers by telling them right up front what they are getting.

Second, some search engines use heuristics (clever rules that attempt to deduce meaning from the page text) to construct their indices. Many common heuristics give heavier weight to words that appear in titles and near the top of your page.  Good place for a descriptive introduction!  Make sure it uses the keywords that you've already identified.

A great example of an introduction is found at SelfPromotion.com.

Writer Lise Elsliger had some interesting cautions:

Although you can repeat the same words that you have mentioned on the title and Meta tag once or twice, do not do it too often since it is considered spamming.  Finally, if you wish to include pictures on the first page, include them after the first body of text and do not forget the alt tag.

Select Good Meta Keywords

Part of the optional "meta data" that can appear in your HTML header is a list of keywords.  This gives you, the web page author, the opportunity to supply a list of terms that you think would be particularly helpful and relevant to indexing your site.  Unfortunately, keyword lists have been heavily abused by authors seeking to cheat on their placement in search lists. Search engines have adapted to this. Today, many search engines just ignore the keyword lists. Others still treat them the same as text, so it at least gives you a chance to weight the proper keywords.

Getting the keyword list "just right" probably isn't worth the effort anymore, despite the claims of so many search engine consultants. But it is still worthwhile to spend a few minutes or more to come up with a good list of terms that might be used in a search that, you think, should turn up your site. Now check to see if these terms are used effectively in your page. Do they appear in the title and the introduction? Do they appear in other headings on your page? If not, then either these are not the right keywords or your writing needs to lean on these words more heavily.

Using the META Keywords Tag

Keywords are given in a list separated by commas.  They don't have to be single words, but can be short phrases as well.  If you find yourself coming up with words that don't actually appear in your document, stop and ask yourself  "why?".  Perhaps these are not appropriate keywords.  Or more likely you should be using these words in your introduction or in your text.  Several of the words in your Title should probably appear as keywords.
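The check described above (do my keywords actually appear in my document?) is simple to automate. Here is a minimal Python sketch. It assumes you already have the keyword list and the page text as plain strings; the sample values and function names are made up for illustration, and the matching is a plain case-insensitive substring test.

```python
def parse_keywords(content):
    """Split a comma-separated keyword list into lower-case terms."""
    return [term.strip().lower() for term in content.split(",") if term.strip()]


def keywords_missing_from_text(keywords_content, page_text):
    """Return the keywords or phrases that never appear in the page text."""
    text = " ".join(page_text.lower().split())   # collapse whitespace
    return [kw for kw in parse_keywords(keywords_content) if kw not in text]


# Hypothetical keyword list and page text, for illustration only
KEYWORDS = "CD-R, CD-RW, compact disc, optical media"
TEXT = "This guide covers compact disc technology, including CD-R and CD-RW."

print(keywords_missing_from_text(KEYWORDS, TEXT))
```

Any term that comes back in the list is a candidate either for removal from the keywords or for use in the introduction.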

What Consultants Say About Keywords

As noted above, I think search engine consultants place too much emphasis on keywords. This is probably an echo of how things were a few years ago. But they still have interesting things to say. According to Lise Elsliger, "Some of the search engines that use Meta Tags when classifying sites include: Alta Vista, Excite, Infoseek, HotBot and WebCrawler."  She goes on to list several creative tips for generating keywords. My own comments appear in brackets:

  1. Visit sites that rank high in your area of interest and look to see what keywords they use. [hint: look at the Page Source]
  2. Ask your friends, family, and around the office to find out what words people would use. This is a good way to make sure you don't overlook some important words.  [Personally, I would never do this, but then I don't stop to ask for directions either :-)]
  3. Finally, you can stick to the words of advice of Jim Sterne, author of World Wide Web Marketing, who said "people will look you up by company name, product name, product type." [I.e. be sure to include these obvious keywords.]
  4. Place words in order of importance (the most important first) [Possibly some index services use heuristics that weight the order]
  5. Put your keywords in the plural. The word "computers" will cover both the singular and the plural form during a search, but the same can't be said for the opposite (i.e. if your keyword is computer and the person types in computers it won't work...Picky, huh?) [This is because "computer" is a proper substring of "computers".  Note that it wouldn't work with all plurals, so you may need to make multiple entries.]
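The plural tip in item 5 is easier to see in code. This Python sketch assumes, as my bracketed comment does, that the index service matches a query as a simple substring of the keyword; that is an assumption about how some services behaved, not a documented rule.

```python
def substring_match(query, keyword):
    """True if the query text appears anywhere inside the keyword."""
    return query.lower() in keyword.lower()


# Keyword "computers" is found by a search for "computer" ...
print(substring_match("computer", "computers"))
# ... but keyword "computer" is not found by a search for "computers".
print(substring_match("computers", "computer"))
# Irregular plurals need separate entries: "mouse" is not inside "mice".
print(substring_match("mouse", "mice"))
```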

I found an outstanding article on keyword selection by Sumantra Roy. Originally titled "Finding the Perfect Keywords", the latest version is "Which keywords should you optimize your site for?".  He takes the suggestions above and kneads them into a full-fledged 8-step algorithm for finding the right keywords. He even includes a spreadsheet for calculating the Keyword Effectiveness Index (KEI)!  The article itself is clearly written and well worth reading.  It reveals the heights and depths to which people will go to get this right.

Via the original version of Sumantra's article I learned of two useful tools.

  1. JimTools Keyword Analyzer is a handy tool for analyzing keyword densities on specific web pages. It replaces an older tool that would use your keywords to query various search engines, and list the actual keywords that it found in the top ranking pages returned. I don't know why the old one was removed.
     
  2. http://selfpromotion.com/susser.t?GUEST=1stSearchRanking
    The "keyword susser" is similar in that you enter a list of keywords, plus your email address.  The susser then queries the GoTo.com Search Term Suggestions Tool with each of the terms you've entered and emails you a report, sorted according to how frequently the terms were actually used in a query during the past month.  If you try the GoTo.com tool once yourself, this will make more sense.  The advantage of the Susser is that it combines the GoTo results for multiple keywords that you enter, and that it mails you the report.

Don't "Dress Like Spam"

There are techniques for "spamming" search engines, essentially lying to them about the contents of your document.  These techniques are of some interest, not so we can use them ourselves, but so we can avoid the appearance of using them: there are reports of search engines retaliating against perceived spammers by simply de-listing them.

Here is a short list of things to avoid. Do not...

Make Every Page an Entry Page

When we build our web sites, we build them hierarchically. A visitor links from the home page to a document, a document tree, or a sequence of pages. When we start navigating at the home page, it's pretty easy to keep some mental context of where we are. And there's always the Back button on the browser...

But search engines will index every page on your site (if you let them). Visitors can and will enter your site on every page. I discovered this the hard way, only when I began to study my web logs (a whole separate topic). I don't think I can make this point any more clearly than whoever wrote this in the AltaVista help pages:

All pages are created equal, as far as search engines are concerned. Because of the volume of traffic that could come to your site by way of search engines, you should pay attention to every page. Make sure that whatever page a visitor sees first, the focus and purpose of the site and the context of that particular page are clear. You also should be sure to provide navigation buttons on every page.
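One quick way to act on this advice is to check that every page on your site links back to the home page. The following Python sketch does that for a handful of pages held in memory. The page names, the sample HTML, and the idea of identifying the home page by an exact href match are all simplifying assumptions.

```python
from html.parser import HTMLParser


class LinkParser(HTMLParser):
    """Collect the href value of every A tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def pages_without_home_link(pages, home_urls):
    """Return the names of pages that have no link to any of home_urls.

    pages maps a page name to its HTML text.
    """
    orphans = []
    for name, html in pages.items():
        parser = LinkParser()
        parser.feed(html)
        if not any(link in home_urls for link in parser.links):
            orphans.append(name)
    return orphans


# Hypothetical two-page site, for illustration only
SITE = {
    "intro.html": '<body><a href="index.html">Home</a><p>Intro</p></body>',
    "notes.html": '<body><p>Notes with no way back.</p></body>',
}

print(pages_without_home_link(SITE, {"index.html", "/"}))
```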

Checklist of Index Services

This section contains a list of important indexing sites to which you should submit your web site.  For a more exhaustive listing, see The Major Search Engines on Search Engine Watch.  There is an excellent article on Search Engine Sizes, also on Search Engine Watch, that not only gives some indication of which have the largest indexes, but also has many fascinating insights and useful links.

It is helpful to understand at the outset that consolidation in the search engine industry means that some of these index services actually share data.  For example, several search engines use the Inktomi data base, and many directories are based on the Open Directory Project. You can use this knowledge to prune the number of required submittals.

Yahoo!

Submit: See below...
Type: Directory

Yahoo! is currently the acknowledged 800 lb gorilla of portal sites.  They now offer many different services, but the original basis of their business was the Yahoo! directory. It continues to be the most popular directory service in the industry.

Like all other directories that I have seen, Yahoo! requires the submitter to first determine the most appropriate category for their page. You have to navigate the directory to that category and actually make the submittal from that specific category. This is burdensome, but customary for directories. Because of this, directory submittals always require more effort than search engine submittals. If you want to be in directories, get used to it.

The other important way in which directories differ from search engines is that nothing gets listed without being reviewed by a human editor. The editor decides whether or not to list it and how to write the description. They may even change the category. This approach helps to ensure their overall quality, which is one reason why directories are popular.

Submit Your Site To Yahoo!

The Yahoo! "Suggest Your Site" document offers a step by step procedure for submitting a site to their directory. The following series of steps is based on that document.

Step 1: Check to See if Your Site is Already in Yahoo!

"Check by searching Yahoo! for your site by title, or by browsing appropriate categories. "  Obviously "browsing" Yahoo could take forever, so I just search. What you're looking for are actual "Yahoo!" matches, not "Web Page" matches, which come from a separate search engine (currently Google). If you're not sure what the difference is, try doing a Yahoo search for some well-known public figure, say "Van Morrison". These are the kind of hits you're looking for.

If your site is already listed in Yahoo! but you want to change its comments, title, and/or placement, you need to use their change form.

Step 2: Find the Appropriate Category in Yahoo!

Yahoo! has devoted an entire document to describing factors you need to consider in choosing a category.  The Yahoo! directory starts with 14 Top-Level Categories such as Arts & Humanities or Reference which are more or less immutable. These are the categories which appear on the Yahoo! home page. Each of these contains any number of sub-categories.  New sub-categories are created all the time (by Yahoo!) to better organize information "from the general to the specific."

Yahoo! does not allow one to submit directly to the top-level categories.  Their advice is to "get as specific as possible. Dig deep into the directory, looking for the appropriate sub-category" where sites that are similar to yours are listed. Unfortunately, this is a labor-intensive task. Worse, it is usually a fruitless task because Yahoo! is very selective about adding submitted sites to its directory. I've heard that only 10% of submittals result in a new directory entry.

Commercial or Non-Commercial?

The first and most important category decision is to choose between commercial and non-commercial.  Yahoo! makes a fundamental distinction between these types of sites:

If you're suggesting a site that is in any way commercial--it sells something, promotes goods and services, or promotes a company that sells goods and services--then the site belongs somewhere in the Business and Economy section of Yahoo! If the site is promoting goods or services to other businesses, schools, or organizations, it probably belongs under Business and Economy > Business to Business. If the site is promoting goods or services for the general public, it probably belongs under Business and Economy > Shopping and Services.

Regionally Specific?

The next category decision is between regional and non-regional.  Yahoo! wants to place every site that is specific to a particular locale, or primarily of local interest, under the Regional category. Remember that a local business belongs under the Business and Economy category.

Personal Home Page?

Yahoo! has a specific sub-category dedicated to personal home pages: Society and Culture > People > Personal Home Pages.  All personal home pages ("Is the page about you?") are supposed to live there.  Yahoo staffers may then link the site into related categories if you have sufficient content about a specific topic.  You can help by suggesting these related categories on your submission form.

Look Familiar?

The final word of advice from Yahoo! is to look for other sites that are similar to yours.  When you've found them, then that's a good place to drop anchor.

Step 3: Suggest Your Site from the Appropriate Category

You actually have to go to the Appropriate Category page in order to suggest your site.  At the bottom of the page is a "Suggest a Site" link (It's just to the right of the copyright and the "Company Information" link).  The form will require your site's title, URL, and a brief description.

  Very clever of Yahoo! to require that one actually navigate their hierarchy in order to make a submittal.

Step 4: Choose Normal or "Business Express" Submittal

Like many other directory services, Yahoo! now offers "expedited consideration" of your submittal for a payment of US$199. This service is called Business Express, and it guarantees that your site will be reviewed within seven business days. This review does not guarantee that your site will be included in the directory! The payment is non-refundable, by the way...

Fortunately, they still offer their no-cost standard site submittal, but naturally there is no guarantee as to when they'll get to it.

Let's Try An Example

I have a web site devoted to Compact Disc technology (http://www.chipchapin.com/CDMedia/) that I would like to have listed in the Yahoo! directory. The page is neither commercial nor regional, which rules out two categories right away. The first step is to try to choose the appropriate top level category from the remaining 12. If there were a Technology category I'd go for it immediately, but since there isn't, I'm torn between Science and Computers. Another possibility would seem to be Reference, since my CD site is designed to be a long-term reference on the subject.

I briefly browse each of these three categories and see what subcategories are common. I rule out Reference since it seems to be primarily dictionaries and what-not. Even the FAQ section is very abbreviated. Science has an Engineering subcategory which seems promising, but I see that it's organized by discipline. I don't really want my CD site listed under "Optical Engineering".

That leaves Computers and Internet. Some further browsing leads me to Hardware > Peripherals, where the best match seems to be the CD-ROM category.

At the bottom right corner of the CD-ROM page is the Suggest a Site link. Clicking the link brings up a screen where they simply ask me to confirm that my site is not already listed, and that I've read their guidelines for submittal. Of course the answer is "yes" and I push the button.

I'm greeted with a page asking me to choose between standard and expedited ($199) processing. Naturally I choose standard...

This puts me in "Step 1" of their four-step submittal procedure. These four steps are different from the four steps of "getting ready to submit" that we just went through. The Step 1 page simply asks me to confirm the submittal category: Computers and Internet/Hardware/Peripherals/CD-ROM and provide some basic information about the site:

  • Title
  • URL
  • Description

Since I've already taken my own advice and prepared my site properly, I have already written a proper Title and Description for it. I cut and paste, agonize over the wording a bit, and I'm done with Step 1.

Whoops! Not quite done. My description exceeds their 200 character limit. After a couple of tries I finally get it trimmed enough. On to the next step.

Step 2 offers the submitter an opportunity to suggest:

Additional Categories
I think my site is pretty well categorized in terms of Yahoo!'s framework, so I leave this blank. If I had been torn over which of several categories to use, then this is where I would put the alternatives.
New Category?
I think Yahoo!'s "CD-ROM" category is misnamed, since it already includes information on other CD types such as CD-R and CD-RW. I suggest that they rename it "optical media" and give some reasons.

Continuing to Step 3, I enter my name and email address. There is also a section for "Geographical Location of the Site (if applicable)" which I leave blank.

Step 4 wraps things up. Here one can enter the "expiration date" if the site is going to go away after a period of time, or the dates of an event, if that's what one's site is about. Neither of these apply to me, so I leave them blank. There is also a place for any final comments, which I also leave blank. I click the "submit" button, get a "thank you" screen and that's that.

My experience in submitting a number of sites to Yahoo! is that it goes pretty fast once you get the hang of it, provided that you already have a good Title and Description.

Cross-Referencing

Yahoo! staff performs cross-referencing into multiple categories.  You have to submit your site under a single category, but there is a place on the submittal form to suggest other suitable categories.  For example, a genealogy conference in Denver appears in the Denver > Events category and is also linked to the category for Genealogy conferences.

Google

Submit: http://www.google.com/addurl.html
Type: Search Engine / Spider

Google has been my personal favorite search engine for several years (though the experience of writing this paper has opened my eyes to the advantages of several others).  It is currently the web search engine that Yahoo! users will get if their search terms aren't found in the Yahoo! directory. Google is very easy to submit to, and costs nothing.  Best of all, you only have to submit the URL for a single top-level page. They will follow, or "spider", all of your links to the other pages on your site and index them as well. For some time, Google has been the champion in terms of pages indexed, and they are very popular. In my opinion, if you only submit to one search engine, submit to Google.

Google is quite proud of the title of having the largest (most pages indexed) search engine. They are thus motivated to index as many pages as possible. Their site disclaims, "we do not add all submitted URLs to our index, and we cannot make any predictions or guarantees about when or if they will appear."  But when I made my first submittal to Google, around June 2000, it appeared in less than a day.

The submission form is very simple: just the URL and a line of comments.

Updates

Regarding updates, Google notes that, "Our crawler, Googlebot, ... updates its index on a regular basis, so updated or outdated link submissions are not necessary".  In fact they may even be ignored. After I changed all the filenames in one of my websites I kept getting hits on the old page names from Google. I tried resubmitting the site, but it took at least 3 weeks for the index to be properly updated.

Beware of Unintended Consequences

Like all crawlers, Google may index things that you did not intend. For example, my home page used to include links to some files that I used as reference material on my local computer. This included an HTML Guide, documentation for various programs, and cached copies of certain large web pages. When I moved the page out to the web, I rather absent-mindedly moved those files as well so I wouldn't have broken links. But when Google indexed my site it indexed these pages as well and I began to get many hits directly on them. To make matters worse, they didn't even have links back to my own pages!

You can use the Robot Exclusion Protocol to prevent this, or just make sure that your pages are devoid of any links you really don't want to support. In any case, make certain that all of your "leaf" pages have clear links to their parents or to your home page.
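With the Robot Exclusion Protocol, you place a robots.txt file at the top level of your site to tell well-behaved spiders which paths to stay out of. The Python sketch below shows what such a file might look like for the situation described above and uses the standard urllib.robotparser module to confirm its effect. The directory names and the example.com URLs are hypothetical, and whether a given spider honors the file is up to that spider.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep all spiders out of two directories
# that hold local reference material and cached pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /reference/
Disallow: /cache/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(url, agent="Googlebot"):
    """True if the rules above permit the given agent to fetch the URL."""
    return parser.can_fetch(agent, url)


print(allowed("http://www.example.com/index.html"))
print(allowed("http://www.example.com/reference/html-guide.html"))
```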

Lycos

Submit: http://home.lycos.com/addasite.html
Type: Directory, Search Engine / Spider

Lycos includes both a search engine and a directory.  The directory is based on the Open Directory system, which, like Yahoo, requires submittals from the Appropriate Category. The remainder of this article deals only with the Lycos search engine. Submittals to Open Directory are covered in a separate section below.

Lycos also owns HotBot, but they appear to use separate search engines. HotBot uses Inktomi while Lycos apparently uses their own.

The Lycos Search Engine

The Lycos search engine ignores Meta tags and builds its database from the words that it finds in the text of the web page.  Their site notes, "Once your site has been spidered, it will be entered into our catalog within 2 to 3 weeks." But my experience was that it was entered immediately.

The Lycos submission form couldn't be much simpler: just the URL and your email address. I submitted a page and in seconds got back a screen that said "We have successfully spidered your page." Subsequent lookup on Lycos showed that they had indeed already indexed my page! This real-time indexing is an enormously valuable feature that should make Lycos a "must" for your submittals.

On their site they give some suggestions for better indexing, which offer important clues to how their keyword analysis works:

Decide on two or three terms or phrases on which to focus your efforts-the hot keywords that our 7 million users are most likely to search for. Give those words priority in your HTML. Important words should appear more frequently in larger headings and close to the top of the screen if not actually on the page title. An introductory paragraph with descriptive text that mentions your "hot" keywords will help our software create a better abstract of your site.

Will Lycos "spider" your whole site?

According to Lycos:

The Lycos spider will try to travel through links contained in the webpage you submit. A good rule of thumb is to count on the spider traveling down one level from the page you submit.

Lycos says "you may submit more than one URL from your site," but doesn't encourage it. My reading of the above is that the Lycos spider can be relied on to index pages that are a single link away from the page you submit, but should not be expected to index anything two or more links away. My observations of Lycos' indexing results bear this out.

This means that one should think carefully about the structure of one's pages before making a Lycos submittal. For example, I have a site devoted to the JavaScript programming language (Chip Chapin's JavaScript Tutorials) that consists of about a dozen pages. The pages are linked to each other so that you can go through them all using "Previous" and "Next" links, but there is no link directly from each page to each of the other pages. If I asked Lycos to index one of these pages, it would presumably index only that page and its immediately adjacent neighbors. But my JavaScript site also has an index page containing links to all the other pages. Submitting the index page to Lycos results in the entire site being indexed.
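To make the "one level down" rule of thumb concrete, here is a minimal Python sketch that models a site as a table of links and computes which pages lie within one link of the submitted page. The five-page site and its page names are invented for illustration, and the one-link limit is simply the rule of thumb quoted above, not a description of how the Lycos spider is actually implemented.

```python
from collections import deque

def pages_within(links, start, max_depth=1):
    """Return the pages reachable from start by following at most max_depth links."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        page, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not follow links beyond the depth limit
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return seen

# Hypothetical tutorial site: pages chained by Previous/Next links,
# plus an index page that links to every page.
pages = ["page1", "page2", "page3", "page4", "page5"]
links = {"index": list(pages)}
for i, page in enumerate(pages):
    neighbors = []
    if i > 0:
        neighbors.append(pages[i - 1])   # "Previous" link
    if i < len(pages) - 1:
        neighbors.append(pages[i + 1])   # "Next" link
    links[page] = neighbors

from_middle = pages_within(links, "page3")
from_index = pages_within(links, "index")
print(sorted(from_middle))
print(sorted(from_index))
```

Submitting a page in the middle of the chain reaches only its two neighbors, while submitting the index page reaches the whole site.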

Appearance

Lycos listings start with the TITLE of the page, but they do not use your Meta Description. Instead, the first 200 or so text characters on the page are used. If your pages tend to start with nav bars and visit counters, you should consider reorganizing them so that some descriptive text comes first.
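One way to preview this kind of listing is to pull out the TITLE and the first 200 characters of text yourself. The sketch below does so with Python's standard html.parser module; the sample page is invented, and the 200-character limit is only my estimate from above, so treat the result as an approximation of what a listing might show.

```python
from html.parser import HTMLParser

class ListingPreview(HTMLParser):
    """Collect the TITLE text and the visible page text separately."""

    def __init__(self):
        super().__init__()
        self.title_parts = []
        self.text_parts = []
        self._in_title = False
        self._skip_depth = 0  # inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title_parts.append(data)
        elif not self._skip_depth:
            self.text_parts.append(data)

def preview(html, limit=200):
    """Return (title, first `limit` characters of page text)."""
    parser = ListingPreview()
    parser.feed(html)
    parser.close()
    title = " ".join("".join(parser.title_parts).split())
    text = " ".join(" ".join(parser.text_parts).split())
    return title, text[:limit]

# Hypothetical page that starts with a nav bar and a visit counter.
page = """<html><head><title>CD Media Reference</title></head>
<body><p>Home | Links | Contact</p><p>Visitors: 12345</p>
<h1>Compact Disc Media</h1><p>How CD-R and CD-RW discs work.</p></body></html>"""

title, snippet = preview(page)
print(title)
print(snippet)
```

In this example the snippet begins with the nav bar and counter text rather than the description of the page, which is exactly the problem described above.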

Updates

The Lycos spider will revisit each page "on a periodic basis" and should update the index automatically at that time. Because Lycos spiders new submittals in real time, I would assume that resubmitting a page will indeed hasten the updating process.

AltaVista

Submit: http://doc.altavista.com/addurl/
Type: Directory, Search Engine / Spider

AltaVista was one of the original web indexing sites, offered by good old DEC to tout their speedy hardware. They got off to an early start with good technology, and staked themselves a claim to the hearts and minds of many net users in the days when search engines were still a novelty, the words "e-commerce" had yet to ooze from anybody's lips, and AOL users had no direct access to the Internet.

AltaVista includes two distinct services: the AltaVista Search Index (a search engine) and the LookSmart directory.  Submission to the search engine is free, but it will cost you $199 ("Express Submit") to submit to LookSmart. The remainder of this section deals only with submittals to the AltaVista Search Index. There is a separate section for LookSmart.

Submit Your Site to AltaVista (Search Engine)

Submitting a page to AltaVista is as easy as entering the URL and clicking "Submit". They don't even ask for your email address. Click on the submittal address above to begin. The submittal page includes links for both types of submittal, Express Submit and Submit. Unless you're interested in paying to be in the LookSmart directory, just click Submit. The new page looks much like the previous one, but if you scroll down you'll see the submittal form and a box to enter the URL. Enter and click submit. That's about it.

Some Experience with AltaVista

The AltaVista index is maintained by a web crawler they call "Scooter". Scooter is in perpetual motion following all the links it can find. So if there are any links to your site from elsewhere, then it may already be indexed. At least that's the plan. All this sounds very similar to Google, but in practice my results with the two have been very different.

I submitted a single page on my site to both AltaVista and Google on the same day. Google proceeded to index my entire site and has returned about once a month to update its links (I know this because I read my web logs). AltaVista appears to have indexed only the page which I submitted and has not followed links to any of my other pages. So far as I can tell, "Scooter" has not been back since. So I think it's necessary to actively submit to AltaVista all the pages which you want indexed.

It's also desirable to "clean up" manually when your pages change their names. Just "submit" the old page: AltaVista is smart enough to realize that if a page (a) was already indexed and (b) is now "404", then it should be removed from the index.

Appearance

AltaVista listings consist of your TITLE and the first 150 characters of your Meta Description. This presents problems for those of us who are too lazy to update these fields in a multi-page document tree. I often write long documents and then split them into multiple pages. This has many benefits. But while I have gotten in the habit of updating the TITLE on each page, I confess to generally leaving the Meta Description unmodified. Of course this means that these document pages not only have the same "look and feel", they will also have nearly identical AltaVista listings! Not good. The moral of the story: write succinct descriptions that are unique for each page and which complement the Title.
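A quick check for this problem is to compare the first 150 characters of each page's Meta Description and flag any that collide. The following Python sketch assumes you have already gathered the descriptions into a dictionary; the file names and descriptions shown are hypothetical.

```python
from collections import defaultdict

def duplicate_listings(descriptions, limit=150):
    """Group pages whose Meta Descriptions match in the first `limit` characters."""
    groups = defaultdict(list)
    for page, description in descriptions.items():
        groups[description[:limit]].append(page)
    # Keep only the snippets shared by more than one page.
    return {snippet: sorted(names)
            for snippet, names in groups.items() if len(names) > 1}

# Hypothetical document split into pages; two still share one description.
descriptions = {
    "cdmedia1.html": "Reference guide to compact disc media technology.",
    "cdmedia2.html": "Reference guide to compact disc media technology.",
    "cdmedia3.html": "How CD-R dyes differ and what that means for disc life.",
}

dupes = duplicate_listings(descriptions)
for snippet, names in dupes.items():
    print(names, "share the description:", snippet)
```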

Excite

Submit: http://www.excite.com/info/add_url
Type: Directory, Search Engine / Spider

Like AltaVista, Excite is tied into the LookSmart directory network, and has the same $199 deal, which they also call Express Submit.  Your $199 buys you a guarantee that your site will be reviewed for the directory within two business days. They also offer Basic Submit for $99, which gets you a review within "approximately 8 weeks". See the section on LookSmart for further information.

However, also like AltaVista, Excite runs a search engine. You can still submit to the Excite Search Index at no charge. All of the following discussion relates only to the Excite search engine, not the directory.

How Excite Indexes Your Page

Like most current search engines, Excite will index your page according to a heuristic analysis of the actual page text, and will display the listing according to the TITLE and Meta Description that you supply. This illustrates once again how vital it is to prepare a good Title and Description. It's not clear whether Excite uses the Meta Keywords for anything.

"Heuristic", by the way, is how computer scientists describe algorithms that are inexact hackery. It fools the bosses and makes the marketing folks happy. One of the rules appears to be that the most important words in the page are those that appear near the top. Their page includes the following example...

Suppose you want users searching for "Hawaiian bed and breakfasts" to find your site ... Relegate unrelated topics to subsidiary pages. If you're advertising your Hawaiian bed and breakfast, don't use the homepage to emphasize price, the way the ocean looks from a bedroom window, or your famous pineapple rum concoction. Instead, emphasize bed, breakfast, Hawaii, and vacation.

The Excite instructions claim that they do not index FRAME contents. However, I found at least one framed page that had been indexed even though there was nothing in its NOFRAMES tag. If this issue were important to me (it isn't right now because I rarely use frames), I'd try some experiments before investing the effort in creating NOFRAMES versions of everything.

It's not clear whether or not Excite requires each page on your site to be submitted, or if it will follow your links.

Submit Your Site To Excite

To submit to Excite, click on the "addurl" link above. Ignore the two buttons for "express submit" and "basic submit". What you want is almost all the way at the bottom, where it says "Click here to add your site to the Excite Search Index only." Currently this takes you to http://www.excite.com/info/add_url_form

The submittal form requires your URL and email. There are also list boxes for Language, Geographical Location and Category. I assume the category box is for their own statistical purposes; it lets you choose from 18 broad categories, including "Other". Then click the submit button.

You next get a Thank You screen. Someone has cleverly placed a banner ad there that says "Warning: Your site may have errors", and which looks exactly like the official page text. Just ignore it.

Use the Back button on your browser to submit additional sites.

Updates

The Excite spider revisits all indexed pages from time to time -- no, I don't know yet how often that really is -- therefore it is not necessary to resubmit a page every time it changes. It's not clear whether resubmitting would expedite the update.

Appearance

Excite listings come in two flavors. The directory listings resemble Yahoo! and include a brief description written by the reviewer. The search engine listings are similar to AltaVista: they consist of your TITLE and the first 200 characters of your Meta Description. If a page has no Meta Description they just use the first text on the page.

NBCi

Submit: See below...
Type: Directory, Search Engine (Inktomi)

NBCi is another portal site, struggling to reinvent itself and grab a piece of Yahoo!'s pie.  They represent the aggregation of several earlier internet businesses. This from their web site:

NBCi was created through the combination of Snap, XOOM.com, NBC.com, NBC Interactive Neighborhood, AccessHollywood.com, VideoSeeker and a 10 percent equity stake in CNBC.com.

I was delighted to find that their strategy is based upon some profound insights:

NBCi's business philosophy is to acquire members whose lifetime value exceeds their acquisition cost.

Gosh, why didn't I think of that...

NBCi's search capability consists primarily of a proprietary directory called LiveDirectory, and the Inktomi search engine. While submittals to LiveDirectory are encouraged, there does not appear to be any way to directly submit to the Inktomi database.  Fortunately, you can submit to Inktomi through HotBot instead. The remainder of this article deals only with submittal to the LiveDirectory service.

Like other directory services, LiveDirectory is organized hierarchically. Currently (January 2001) it has 22 top-level categories.

Submittals to LiveDirectory are free, but you must become a member of NBCi in order to have submittal privileges. There is no charge for membership. If you try to submit a site you will be prompted to log in or become a member. It is noteworthy that your member ID will be displayed along with each listing that you submit.

Besides the free directory, NBCi also offers a "Top Sites" directory for which you must pay $199 (as of this writing they are offering a special promotional price of $99). Your listing must appear in LiveDirectory before it can be promoted to Top Sites. Payment does not appear to guarantee your entry to Top Sites.

Listings in Top Sites have a number of practical advantages over ordinary LiveDirectory listings. Besides appearing first in all searches, they also can appear at higher levels of the category tree. So for example, the category Science & Technology : Engineering does not allow LiveDirectory submittals (it's considered too broad), but many Top Sites entries appear there. This is a huge benefit, because visitors may not ever make it down another level in the directory hierarchy. According to NBCi, Top Sites also receive "enhanced keywording," whatever that is.

Submitting a Site to NBCi

Like Yahoo!, LiveDirectory requires that you navigate to the appropriate category in their index before you can submit a site. If you start with the Submit a Site link on their home page, you'll probably regret it. You'll have to navigate the categories without the benefit of seeing what sites are already there. It's better to start at the Search & Find page, and browse the categories until you find the best fit, and only then click the "Submit a site to this category" link which you'll find at the bottom of the page.

For this example, I'll submit my Compact Disc Media web page, the same one used in the Yahoo! walk-through.

The first step, then, is to find the appropriate category. I'm pleased to find the word "Technology" among the top-level categories, but am worried that other pages similar to mine will be under "Computing". Browsing Science & Technology : Engineering doesn't lead to anything that looks similar to my page. On the other hand, Computing : Hardware : Peripherals : Storage leads to several similar sites, so I decide to drop anchor there.

I click on the "Submit a site to this category" link at the bottom of the page which brings up the most elaborate submittal form I've seen so far. Actually it does not require much more information than Yahoo!, but NBCi puts it all in one form instead of spread across several pages.

Confirm Category
I have an opportunity to choose a sub-category from a listbox if I wish. None of the sub-categories are better than where I am, so I leave this alone.
Site Title
Duh. I copy and paste the title from my web page. Note that you are limited to 128 chars.
Site URL
Double-duh. They don't say whether or not to add a trailing "/" to directories. It's generally a good idea so I do so, but later find that it is automatically removed when the form is processed.
Short Description
Limited to 255 characters, which is more than most. In fact, all 255 chars will appear in the listing (which is one reason NBCi listings are much larger than most others).
Keywords
I can enter up to 10 additional keywords that do not appear in the title or description. This requires some consideration. My Meta Keyword tag includes about 30 keywords. It takes a few minutes to pick out 10 of them.
Owner/Agent/3rd Party
I'm asked to explain my association with the site being submitted. I click "owner".
Optional Information
There are a lot more boxes in the form, but all of the remaining information is optional. One can enter contact information for the webmaster, and geographic information for sites that are relevant to a particular locale. Lastly, there are a large number of checkboxes by which one can flag a site that contains multimedia elements. These will show up in the listings with special icons. I take special note of the fact that animations can be flagged, since many of my sites do include animations. But the CD site does not (yet) so I leave all unchecked.

I click "Submit", but the page comes back with an error: my description exceeds 255 characters. This is a good reminder -- I should have rewritten the description when I first submitted it to the other sites. I rewrite the description on my web page, then copy and paste it into the form. This time it is accepted.
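To avoid this kind of round trip, one could check the fields before pasting them into the form. The sketch below encodes the limits mentioned in the walk-through (128 characters for the title, 255 for the description, up to 10 keywords that do not already appear in the title or description). The function name and messages are my own invention, and the keyword check is a crude substring match, not NBCi's actual validation.

```python
def check_submission(title, description, keywords):
    """Return a list of problems found; an empty list means the fields look OK."""
    problems = []
    if len(title) > 128:
        problems.append(f"title is {len(title)} characters (limit 128)")
    if len(description) > 255:
        problems.append(f"description is {len(description)} characters (limit 255)")
    if len(keywords) > 10:
        problems.append(f"{len(keywords)} keywords supplied (limit 10)")
    # Keywords are meant to add to the title and description, not repeat them.
    covered = (title + " " + description).lower()
    for keyword in keywords:
        if keyword.lower() in covered:
            problems.append(f"keyword already appears in title or description: {keyword}")
    return problems

# Hypothetical submission with an overlong description and a redundant keyword.
problems = check_submission(
    "Compact Disc Media",
    "x" * 256,
    ["cd-r", "disc"],
)
for problem in problems:
    print(problem)
```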

NBCi puts up a Thank You page, including a summary of all the information you have provided. They say, "We will contact you by e-mail as soon as your site has been accepted, typically within 24 hours." I receive my confirmation email within about 30 minutes, but the site does not appear in the index until the next day.

Updates

For updates, NBCi takes advantage of the fact that submittals can only be made by registered members of their site. As a member, the sites you have submitted appear in a special "My Sites" box on the Search & Find page. You can use this box to check and update your listed pages.

Any category-based directory raises the question of cross-linking a listing between multiple categories. LiveDirectory does not, apparently, offer cross-linking, but it is possible that Top Sites can be cross-linked. The site help files include the following comment:

Once a LiveDirectory site is in our collection, you cannot submit the same site again or request for your site to be placed in multiple categories. You may request a different category for a site you've submitted using the Comments field on the site's update page, and an editor will review your suggestion and get back to you as soon as possible.

Appearance

NBCi listings have a distinctive "wordy" appearance. Titles are displayed in a large (13 or 14pt) font and descriptions can occupy as much as three full lines. The Title and Description text comes directly from your submittal form. All listings also include the username of the submitter and the date on which it was last updated, and listings can optionally be sorted by submittal date. Listings may also include a series of icons which supposedly denote the presence of multimedia content. Actually, of course, they just denote which buttons were pressed when the site was submitted, and are therefore useless.

Each listing also includes a link for "Site Details", which leads to some remarkable information. You can see who added the site to the directory and when, and when it was last updated. But most interesting is the "Keyword Ranking" information. For example, the site details for the "Smithsonian Institution Dinosaur Museum CD-ROM" show that it ranks #2 in popularity under the search term "dinosaur museums", #3 under searches for "smithsonian museum of natural history", and #58 among searches for "dinosaurs".

HotBot

Submit: http://hotbot.lycos.com/addurl.asp
Type: Search Engine, Directory

HotBot is now a part of Lycos, but they apparently do not share the same database. HotBot uses the Inktomi search engine and Open Directory. The submittal form referenced above is only for adding sites to the Inktomi database.  Like Yahoo! and NBCi, Open Directory submittals require you to navigate to the appropriate subcategory before you submit your site. The Open Directory submittal process is discussed below.

Submitting a Site to HotBot's Search Engine (Inktomi)

I click on the link shown above and get the submittal form for the Inktomi database. It only really asks for two bits of information from me, the URL and my email. There are also a bunch of checkboxes which, if left checked, will have you opt-in for some mail lists.

For this example I will use my CD Media web site. Before I submit it, I check to see if it is already indexed. It isn't, so I enter the URL, enter my email address, and make sure the opt-in boxes are all clear before clicking "Submit".

HotBot replies with a "Thank You" screen which kindly informs me that "Your Web site will be added to HotBot's index within the next 60 days." Great.

Open Directory

Submit: Start at http://dmoz.org/ ...
Type: Directory

Open Directory provides directory data for a bunch of index sites: AOL Search, AltaVista, HotBot, Google, Lycos, Netscape Search, and others.

The Open Directory Project is Open Content (not quite Open Source...). Essentially this means that anyone can download and publish the directory listings, in whole or in part, for free, just by agreeing to include a brief acknowledgement.

The Open Directory Project is often called ODP.

Submitting a Site to Open Directory

As with all other directory submittals that we've examined, Open Directory submittal requires that one first navigate to the appropriate category. Users of the Open Directory database sometimes change the categories. So, for example, there are 16 top-level categories at the ODP site, but HotBot only shows 14 top-level categories.

I'll use, once again, my CD Media web site for an example. The 16 top-level categories are similar to what we've seen in the other directories. In very short order I find myself in the Computers : Hardware : Storage : Optical category. This is as deep as the other directories have gone. But Open Directory has 11 subcategories at this stage. My web site consists of reference material for CD media technology, so I'm torn at first between the Reference and Media subcategories. But "Media" turns out to be full of commercial products, while "Reference" contains other sites like my own (but not as good, of course :-).

Near the bottom of the page in the category listing are links to "Submit a Site" and even to "Become an Editor". The latter seems very interesting, but a subject for another day, or at least another part of this writeup. I click "Submit" and get a new page with the submittal form.

The Open Directory submittal form is very straightforward. It begins by cautioning me to check their submission policies, which I do since I've never submitted to them before. No surprises there, but the next item gives me pause:

You need only submit your site once to the Open Directory. Once your site has been accepted into the Open Directory, it may take anywhere from two weeks to several months for your site to be listed on partner sites which use the Open Directory data, such as AOL Search, AltaVista, HotBot, Google, Lycos, Netscape Search, etc. We make updates of the data available weekly, but each partner has their own update schedule.

So if I had submitted this site to any of these directories, I should not submit it again here. At this particular point, the CDMedia site has only been submitted to the search engines on those other services, not to their directories. So I proceed.

The submittal form requires the site URL, Title and Description, and one can optionally supply one's email address. Description is limited to "25-30 words", or probably 150-200 characters. I enter the URL (with trailing slash), and use copy and paste to enter the Title and Description. Finally, I add my email address (why not?) and click "Submit".

There is a long delay on the next page while a banner ad from wired.com [fails to] load, and then Netscape fails to display the rest of the page :-(. I used "View Source" to see the text of the page. It is a "Thank You" page, saying that my submittal has been accepted, and repeating the warning quoted above: "it may take anywhere from two weeks to several months for your site to be listed".

Becoming an Editor of Open Directory

My experience submitting the CD Media site to Open Directory (described above) led me to explore becoming an editor. Each category in Open Directory contains a link that leads to an application form for that specific category. This makes it easy to apply in a specific category where your expertise will be meaningful. Once you've been accepted as an editor in some category, you use a different form to request permissions for new categories.

Submitting Your Application

The application form starts simply enough: I supply my name and email address, and choose a userid and password (use caution with the password, because it will later appear in a cleartext mail message). But the next two questions require a bit of thought.

It takes me a few minutes to write answers to these questions that I think are satisfactory. But the next section really requires some thought:

Fortunately, it is relatively easy for me to provide "2 or 3 URLs" in this particular category -- I spent weeks doing research on the web for my CD Media site, which includes an entire page of links with my own descriptions. I pick out 3 good ones that do not already appear in this Open Directory category and plug them in.

After submittal, I get a simple "thank you" page, and the news that I "will soon receive email with further instructions." Within 30 minutes an email does indeed arrive. It contains a unique registration key in the subject line and simply requires that I reply so that they can confirm my email address. Within a few minutes of my reply, a "thank you" message arrives advising me that my application is being considered. All very tidy and efficient.

I submit my application Friday evening. At 9am Saturday morning the approval arrives by email. It includes brief instructions, as well as my userid and password (in clear text). A much larger document also arrives, Open Directory Guidelines, which provides detailed instructions and standards for such things as title capitalization. I'm now an official editor of the category Computers : Hardware : Storage : Optical : Reference.

Good Citizenship

It shouldn't be necessary to say this, and experience suggests that those who need to hear it aren't listening, but one should not become an Open Directory editor solely to tout one's own pages. In fact, you'll soon get kicked out if you do. Rather it should be viewed as an opportunity to directly apply your knowledge, experience and energy in a way that benefits the world community. The Guidelines document sums it up:

The goal of editors should be to produce useful resources for the web public. We do not bar editors with business affiliations, since those editors with their own sites usually know their competition and related sites better than anyone. This knowledge can be ideal for helping build an authoritative directory. However, we will not tolerate editors who only add their own sites, or maliciously interfere with others' listings in the directory.

Direct Hit

Submit: http://www.directhit.com/util/addurl.html
Type: Search Engine / spider

Direct Hit at first seems to be "yet another search engine." However they actually have one of the most innovative approaches. It may not be everyone's cup of tea, but it's an effort to separate the wheat from the chaff on-line without resorting to armies of editors. The Direct Hit rankings are

based on actual search activity of millions of Internet users. New sites submitted or found when our spider crawls the web are listed in the search engine and given an opportunity to be found by searchers. Sites that users visit and spend time at for particular search topics are then ranked higher than sites that are consistently ignored.

I have a personal benchmark of how completely any given search engine has indexed the web: I just search for my own name. For my purposes, the best search engines find pages and pages of things that I and a few other Chip Chapins have been involved in. Direct Hit is unique among search engines in that it found absolutely nothing. This did not endear it to me.

However, one is compelled to admit that searches for "Chip Chapin" are quite rare. If Direct Hit does a superior job of the most popular searches, then they definitely will have something to offer.

The Direct Hit web site lists many partners, and some of the listings are confusing at first. For example, they claim that HotBot uses their search engine and tell us to look for the "Powered By Direct Hit" logo at the bottom of the results page. But when I search with HotBot my results show a "Powered by Inktomi" logo.

It turns out that was precisely because my search on HotBot (for my name) simply did not get any hits from Direct Hit. If you search on HotBot for, say, "Van Morrison", you will find that the first five listings under Web Results come from Direct Hit. Direct Hit also supplies the "Refine Your Search" terms at the very top of the HotBot listing page.

How Direct Hit Indexes a Site

Direct Hit calls their spider "Grabber."  Here are some observations culled from their site:

Submit Your Site To Direct Hit

Submittal is free and relatively painless.  Besides URL and your email, it also asks for a list of keywords.  It is unclear to me just how these keywords are used. They say that inclusion in their index may take "several weeks or longer". For this example I used my CD Media site.

Entering the URL and email was a no brainer. For the keywords, I just used a subset of the words on my page. I don't know if there's a limit on how many can be supplied. After clicking "Submit" I got a "thank you" page which warned, "Depending on how many sites have been added to the index recently, it may take several weeks or longer before your site begins to appear in the Direct Hit index."

Appearance

Direct Hit listings consist basically of Title and Meta Description. In some cases, the URL appears instead of the Title, and instead of a description you just have the first 255 characters of text on the page.

The thing that sets them apart is that entries are scored according to how many searchers click through to the various sites. To record this information, each link in a Direct Hit listing takes you first to a Direct Hit server, which records the click and then redirects you to the ultimate target.
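The general idea of counting clicks through a redirect can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the tracker.example.com address, the "q" and "u" parameter names, and the in-memory counter are hypothetical and say nothing about how Direct Hit actually builds its links or stores its data.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical tracking endpoint; not a real Direct Hit address.
TRACKER = "http://tracker.example.com/click"

def tracking_link(query, target):
    """Build the link shown in the listing: it points at the tracker, not the target."""
    return TRACKER + "?" + urlencode({"q": query, "u": target})

def handle_click(link, click_counts):
    """Record one click for (query, target) and return the URL to redirect to."""
    params = parse_qs(urlparse(link).query)
    query, target = params["q"][0], params["u"][0]
    click_counts[(query, target)] = click_counts.get((query, target), 0) + 1
    return target

click_counts = {}
link = tracking_link("van morrison", "http://www.example.org/van/")
first = handle_click(link, click_counts)
second = handle_click(link, click_counts)
print(link)
print(first, click_counts)
```

A real server would answer the request with an HTTP redirect to the returned URL; the counts could then feed whatever ranking formula the service uses.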

[Image: dh-score.gif, Direct Hit score graphic]

Entries are also scored by how long people spend there. I'm guessing, but Direct Hit probably estimates this from the time between clickthroughs from the Direct Hit listing.

Scores are shown graphically, as above. The highest score I've seen was 5.95. The lowest score was a little greater than 1. Apparently only the first page of results are displayed with scores, and only if they are over a certain threshold.

FAST / AllTheWeb

Submit: http://www.alltheweb.com/add_url.php3
Type: Search Engine / Spider

FAST is a Norwegian technology company that has entered into a partnership with Dell to create the AllTheWeb data center and search index. AllTheWeb uses FAST's search engine and is vying with Google to have the largest number of web pages in their index.  As of the end of October, 2000, they both are claiming over 575 million pages (Google also has many partially indexed pages, taking them over a billion). The FAST web site makes it clear that a search engine is not the end of their strategy but is simply a stepping stone. They have very explicit plans for providing searches and content to users of mobile phones.

The FAST crawler will not only fetch text documents, but also images, audio, and video.  In fact, they offer an alternate image compression format known as FAST Image Transfer (.fst files) that claims to be 2-3 times smaller than JPEGs and 10 times smaller than GIFs.  It requires a special Java decoder.  They also have a similar Video Transfer product. One supposes that these will become more important as users begin to use mobile phones to access multimedia content.

When I searched for my name on AllTheWeb I was surprised to find that they had already indexed my entire web site, plus many other references around the web. Then I remembered that one of my pages had been submitted to AllTheWeb several months ago when I tried out a search engine submission service. They evidently spidered the whole site and have kept it reasonably up to date since then. In fact, I've seen their spider's activity in my web logs at least once a month since it was first submitted.

Submit Your Site To AllTheWeb

AllTheWeb submittals simply require the URL and your email address. After you click "Submit" you get a "thank you" page that says your site has been "scheduled for addition to alltheweb.com at the next update of the index."

Appearance

AllTheWeb listings are very clean and consist simply of the Title (linked directly to the URL), Meta Description (or the first text on the page if there was no description), and the URL. Listings occupy the full width of the screen and there is a minimum of clutter. Descriptions can be up to 255 or 256 characters long, and they thoughtfully break them at a word boundary.
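If you want to see roughly where such a cut would fall in your own description, the Python sketch below truncates text to a limit without splitting a word. The 255-character default comes from my observation above; the function itself is my own illustration, not AllTheWeb's code.

```python
def truncate_at_word(text, limit=255):
    """Cut text to at most `limit` characters, breaking at a word boundary."""
    text = " ".join(text.split())  # normalize runs of whitespace
    if len(text) <= limit:
        return text
    # Look one character past the limit so a word ending exactly at the
    # limit is kept whole.
    cut = text[:limit + 1]
    head, sep, _ = cut.rpartition(" ")
    # If there is no space at all, fall back to a hard cut.
    return head if sep else text[:limit]

print(truncate_at_word("aaa bbb ccc", 7))
print(truncate_at_word("aaa bbb ccc", 6))
```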

I'd have to say that AllTheWeb listings are possibly the best I've seen. Unlike Google, they don't try to "hide" similar pages, a strategy which often seems to backfire in my searches.

Northern Light


Submit: http://www.northernlight.com/docs/regurl_help.html
Type: Search Engine / Spider

Northern Light is reported by Search Engine Watch as having one of the larger indexes: over 350 million pages as of October 2000, about the same as AltaVista.  Registration is free and very easy, requiring only the URL, your name and email address.  Their web crawler, "Gulliver," will follow all the links from the single page that you submit to index all the pages on your site.  They continually update their index by crawling the web.

Tips

They do not guarantee that a submitted site will be listed, and they admit that average turnaround time for submitted sites to appear in their index is about six weeks, which seems like an eternity to me.

The Northern Light search engine will read and index all of the text on your pages.  They do not use the contents of Meta or Alt tags, but only visible text.  Below are some of their criteria for weighting a page with respect to particular index terms.  It's well worth studying these.

Submit Your Site To Northern Light

The submittal page is simple and quick. They ask submitters for the URL of course, plus their name and email address. I enter the info for my CD Media page and click "submit". I'm taken to a "thank you" page and that's that.

According to their Webmaster FAQs, "Our average turnaround time for submitted sites to appear in our index is about six weeks."

Appearance

Northern Light generates their site listings directly from the TITLE tag and the first 170 or so characters of the body text.  They do not use the Meta Description field even if it is present. If there is no TITLE tag, they will use the first 80 or so characters of the text.

Multiple page listings from the same site are not shown by default. There is an icon to click if you want to see more. This is similar to Google.

The engineers behind Northern Light have managed to "add value" to their listings in a number of significant ways. Each listing includes a score, ranging from 1 to 99 or 100%, that denotes how well it matched your search. For example, in searching for my name, "Chip Chapin", the highest scores (92% and 89%) were from documents that had my name in the Title, and also had it appear in the body text very close to the top of the page. Listings that simply mentioned my name somewhere in the document tended to rank around 50%.

Besides the score, there is also an attempt to broadly classify each listing into categories such as Articles & General info, Directories & Lists, Bulletin boards, Newsgroups & Forums, Questions & Answers. The classification appears to be derived from a heuristic analysis of the page text. For example, anything with "FAQ" in its title is likely to be classified as "Questions & Answers". This classification is printed next to the score.

Sample listings from Northern Light

But wait, there's more. Not only are the listed pages classified, but so are the sites from which they are served. The site classification is derived from the URL itself.  So .com and .net sites will be listed as "commercial," .org sites are listed as "non-profit," sites with a ~ (tilde) in the URL (e.g. http://www.bayarea.net/~cchapin/) are listed as "Personal pages", and sites from a national top-level domain, such as .fi or .nl, would be listed by nationality ("Finnish site" and "Netherlands site" in these cases). There are also precedence rules. Thus "http://zeus.jesus.cam.ac.uk" is listed as an "Educational site" rather than as a "United Kingdom site".
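The URL-based site classification can be imitated with a few ordered rules. This is only a sketch of the behavior I observed, written in Python. The rule order, the test for educational sites (an "edu" top-level domain or an "ac" second-level label) and the "Unclassified" fallback are my assumptions; Northern Light has not published their actual rules.

```python
from urllib.parse import urlparse

# Country labels taken from the examples in the text; a real table would be far longer.
COUNTRY_LABELS = {
    "fi": "Finnish site",
    "nl": "Netherlands site",
    "uk": "United Kingdom site",
}


def classify_site(url: str) -> str:
    """Guess a site category from the URL alone, checking rules in precedence order."""
    parsed = urlparse(url)
    labels = (parsed.hostname or "").lower().split(".")
    tld = labels[-1]
    # A tilde in the path wins over the domain type (the bayarea.net example).
    if "~" in parsed.path:
        return "Personal pages"
    # Assumption: "edu", or "ac" just before a country code, marks an educational site.
    if tld == "edu" or (len(labels) >= 2 and labels[-2] == "ac"):
        return "Educational site"
    if tld in ("com", "net"):
        return "commercial"
    if tld == "org":
        return "non-profit"
    # Hypothetical fallback label for anything the rules do not cover.
    return COUNTRY_LABELS.get(tld, "Unclassified")


if __name__ == "__main__":
    for u in ("http://www.bayarea.net/~cchapin/", "http://zeus.jesus.cam.ac.uk"):
        print(u, "->", classify_site(u))
```

The point of the ordering is that the most specific rule is tested first, which is how a .uk host can end up as an "Educational site".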

Each listing also shows a date, which I believe is the modification date of the file. In some cases it says "Date Not Available".

Finally, Northern Light has a separate feature called Custom Search Folders. All the results from your search are classified yet again into categories that range from the obvious ("Commercial sites") to the intelligent ("Modern architecture"), to the truly surprising ("Apocrypha & Pseudepigrapha"). Some of these are quite profound. I was very surprised to see "Funeral products & services" in my list -- it turned out to contain several obituaries in which someone of my name appeared.

Northern Light has one other unique feature that deserves mention, though it's not relevant to getting our web sites listed. They have indexed a large number of printed periodicals: academic journals, newspapers and magazines. This is called their Special Collection, and copies are available for a fee.

Updates

Northern Light uses a spider (called Gulliver) which periodically revisits all listed sites. However their Webmaster FAQs included the statement that one could "forward your URL to us at Crawler@NorthernLight.com and we will submit it for re-crawling." It's unclear to me to what extent that will expedite the procedure.

Go.Com

Go.com / InfoSeek

Submit: http://addurl.go.com/dynamic/landNotLogged
Type: Search Engine, Directory

Not to be confused with GoTo.com, Go.com is a portal site owned by Walt Disney Internet Group.  It includes both directory and search engine components. The directory appears to be proprietary, and InfoSeek is their search engine.  I'm not certain, but I believe that InfoSeek itself is owned by Disney.

Appearance

Sample listing from Go.com

The Go.com search results listing offers its visitors a uniquely unpleasant visual experience. The listing is squeezed into a narrow column only 328 pixels wide, leaving the rest of the screen free, apparently for advertising. Strangely, even though the search engine reported 33 hits when I searched for my name, I could find no way to make it display more than 20 of them.

Searches for popular topics will return three sections:

Buttons at the top of the Web Search Results listing allow one to change the sort order and request "details". The details are a search relevancy score, a date (apparently the file modification date) and the file size. When I searched for my name, relevancy scores ranged from 58% down to 15%. It seems to be based on position in the document. On the other hand, when I searched for "CDR" and "subchannel", documents that simply matched CDR (or CD-R) in the title returned 100% relevancy, even though subchannel did not appear at all.

Submit Your Site To Go.com

When I first studied Go.com in October 2000, submittals to the search engine and to the directory were completely separate processes. Revisiting this site in January 2001, I find that this is no longer the case. There is now a uniform submittal procedure. Like many other portal sites, Go.com offers a paid submittal for $199 called Premium Service, and a no charge Basic Service.

Premium Service
  • Cost $199. No guarantee of inclusion.
  • Guaranteed site review within 48 hours.
  • Inclusion in the Premium Listing Service (if site is added).
  • Weekly spidering updates for one year (if site is added).
Basic Service
  • No charge.
  • May take up to 10 weeks to appear in the index.
  • Listing in search index only.
  • No guarantee of spidering.

It is entirely unclear to me at this time what exactly the Premium Listing Service might be.

Go.com submittals were once the easiest: just enter your URL and click Submit. They didn't even want your email address.  But now they have the most painful procedure of any search engine. This is the only index service that requires a phone number for all submittals. They require the title and URL, your name, email address and phone number. They also provide space in the form for the description, keywords, your company and job title. You also must agree to their terms of service, which actually state that a fee of $199 is required for each submittal. Since this contradicts their advertised Basic Service, let's assume that it is not enforceable.

For this example I use once again my CD Media site, using copy and paste for the description and keywords. With some reservations, I include my phone number. I'm listed anyway, so what difference does it make? I click "submit" and very quickly get a "Congratulations" page that says "Your site has been added to the GO.com Search Index." A quick search of the index does not reveal my site. I suppose it could take 10 weeks.

Becoming a "Go Guide"

"Go Guides" maintain the Go.com directory.

I decide to apply. After taking the "tour", I am prompted to first become a member of Go.com. Members must identify themselves as kids, teens or adults, and are asked to supply a birthdate and postal code (zip code) in addition to their name and email address. There is a "privacy policy" document, but no real explanation is provided as to why these fields are required.

There are also a bunch of checkboxes for "interests" and opt-in for "special offers". I cleared them all, but when I had to go back to the form to choose a different userid, several got reset. Regrettably I didn't notice in time.

Go.com members can become "Go Guides" instantly, simply by clicking a link. There is no vetting of the guides at this stage. You then select categories of the directory and apply to become an editor of that category. I requested access to the "Technology > Hardware > Peripherals and Storage" category, and was automatically approved at once. This seems to be a far cry from the Open Directory Project, which is fairly careful about who gets to edit their directory.

However, the system is not as bad as all that. When I try to suggest a new subcategory, I am not allowed to do so: I don't have enough "points". I have to earn points by suggesting new sites before I get additional privileges. The same applies to editing existing entries.

Updates

If you've paid for Premium Service, your site will be spidered weekly, which should be often enough. Otherwise, since they do not guarantee any spidering at all of Basic Service submittals, you should resubmit it. It may take up to 10 weeks for changes to appear in the index. There is a form on the submittal page for removing dead URLs.

LookSmart


Info:  http://submit.looksmart.com/info.jhtml
Type: Directory

As we've seen, the LookSmart directory is used by AltaVista and Excite. It is also used by MSN, which we haven't covered. Unlike the other directory services which we've looked at (Yahoo, NBCi, Open Directory, and Go.com), LookSmart requires payment before they will consider a page for inclusion. The others all offer no-charge submittals, though in some cases the delay is so long that you may prefer to pay.

LookSmart's pay-to-play system irritates me. But on reflection, I think in some ways it is a more honest way of doing business than others who offer no-charge submittals, but delay them so long as to be nearly useless. LookSmart is up front about it, and the two business day guarantee of their Express Submittal means you get more for your $199 than most other directories.

Submit Your Site To LookSmart

LookSmart offers two primary submittal options, Express Submit and Basic Submit.

Submittals are limited to 5 URLs per domain. Exceptions to the 5 URL limit are available, but you must contact LookSmart to discuss it.

LookSmart offers no-charge submittal to registered non-profit groups.

As with all other directories, they do not guarantee inclusion, and there is no refund if, after review, you are not included.

So far, I have not been willing to pay $199 to submit a site, and therefore have nothing further to add about LookSmart.

GoTo.com


Submit: See Below
Type: Pay-per-Click

GoTo.com pioneered the concept that web promoters will pay a certain amount for each "click" from the index.  Thus, with GoTo you can actually bid for higher index placement on particular keywords. The GoTo listing will show sites in order by how much they were willing to pay. 


When I first used GoTo several years ago, I felt that this pay-for-placement approach distorted the search results. Well, of course it does. But from the perspective of a web site promoter, high placement in the GoTo listings can be very cost effective. Remember, you only pay when someone clicks through.

GoTo now has deals with a number of portal sites such as HotBot, Lycos and AOL. The top few hits from GoTo receive preferred placement ahead of listings from their regular search engine. This greatly increases the value of high placement (top 3) in the GoTo index.

Here is a recent press release from GoTo about their relationship with Lycos (including HotBot):

Users who conduct searches on Lycos.com, HotBot.com and the directory pages of Lycos Network sites will be shown search results that include GoTo's top 3 listings in a section called "Featured Listings." In addition, GoTo listings in the 4th and 5th position will appear among the first 10 results in the "Web Sites" section on Lycos.com, and GoTo's 4th listing will appear in the number 10 position in HotBot's "Web Results" section.

Here is a similar release describing their relationship with AOL (including Netscape):

Users who conduct searches on AOL.com and AOL Service will be shown search results pages that include GoTo's top 3 listings, and users who conduct searches on Netscape Search will be shown search results pages that include GoTo's top 2 listings.

Appearance

GoTo.com Search Listing

As I was writing this, I made the unpleasant discovery that something (possibly an ad) on some of the GoTo search results pages causes Netscape 4.7 to crash. Microsoft IE worked fine. I have never had this problem with GoTo before.

The distinguishing feature of these listings is the legend Cost to Advertise at the end of each listing. In the example shown here, someone is paying 20 cents per click for the top listing. The third-place listing pays 15 cents. If you were to scroll down, you would find that the fifth-place listing drops to only 5 cents, and the 9th through 22nd listings pay only 1 cent each. After that come unpaid listings.
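To make the ordering concrete, here is a small Python sketch of pay-for-placement ranking: paid listings sorted by bid, highest first, followed by the unpaid ones. The class, the function names and the sample data are all hypothetical. I am assuming that ties keep their original order, which GoTo does not document as far as I know.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Listing:
    title: str
    bid_cents: Optional[int] = None  # None means an unpaid listing


def rank_listings(listings: List[Listing]) -> List[Listing]:
    """Paid listings first, ordered by bid (highest first), then unpaid ones."""
    paid = [item for item in listings if item.bid_cents is not None]
    unpaid = [item for item in listings if item.bid_cents is None]
    # list.sort() is stable, so equal bids keep their input order (an assumption here).
    paid.sort(key=lambda item: item.bid_cents, reverse=True)
    return paid + unpaid


def click_cost_dollars(bid_cents: int, clicks: int) -> float:
    """Total cost of a number of click-throughs at a given bid."""
    return bid_cents * clicks / 100


if __name__ == "__main__":
    sample = [Listing("C", 1), Listing("Unpaid"), Listing("A", 20), Listing("B", 15)]
    for item in rank_listings(sample):
        print(item.title, item.bid_cents)
    print(click_cost_dollars(20, 500))  # cost in dollars of 500 clicks at 20 cents
```

Since you pay only per click, the second function is the one to watch: a top bid of 20 cents is cheap until the clicks add up.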

The listings themselves are straightforward: the title comes from the Title field of the document and the description comes from the Meta Description field, if present, or from the first text on the page. Paid listings can rewrite the title and description, which is a pretty nice benefit for 1 cent per click. I like the clean appearance.

Submitting Your Site to GoTo.com

Follow the "List Your Site" links that appear at the bottom of each page. There are two basic packages that GoTo.com offers. The $99 Express Service includes

You must sign up for Express Service by phone.

The standard Online Signup is illuminating. Paid listings on GoTo.com each consist of five items:

Note that a single site can use separate titles and descriptions for every keyword that they bid on.

GoTo.com's software will perform some simple checks on your submittal. For example, I have a site devoted to the computer game Half-Life. I decided to try entering a listing using "Half-Life" as the keyword. However, I failed to include the term "Half-Life" in the description. The GoTo software caught this and gave me a warning. Good stuff.
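You can run the same kind of check yourself before submitting. The Python sketch below warns when the keyword you are bidding on does not appear in the description. It is my guess at the sort of test involved, not GoTo's actual code; in particular, treating "Half-Life" and "half life" as equivalent is my own assumption.

```python
import re
from typing import List


def _normalize(text: str) -> str:
    """Lowercase the text and treat hyphens and whitespace runs as single spaces."""
    return re.sub(r"[\s\-]+", " ", text.lower()).strip()


def check_listing(keyword: str, description: str) -> List[str]:
    """Return a list of warnings for a proposed paid listing (empty if none)."""
    warnings = []
    if _normalize(keyword) not in _normalize(description):
        warnings.append(f'description does not mention the keyword "{keyword}"')
    return warnings


if __name__ == "__main__":
    print(check_listing("Half-Life", "Maps and mods for the classic shooter"))
    print(check_listing("Half-Life", "Half-Life maps, mods and strategy guides"))
```

This is a plain substring test, so it will not catch a keyword that appears only as part of a longer, unrelated word.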

Entering these listings for many keywords can be tedious. GoTo.com offers a special spreadsheet for download that you can use to make large numbers of submittals at once.

Submittal is a 3-step process. After entering your desired listings, the second step is to provide contact information. You will be asked to supply your name, address, phone number and email address.

The third step is where you are actually asked for money. GoTo.com has a minimum payment of US$25 to open an account. If you deposit more money, it is all refundable except $25.

Updates

Once you have an account with GoTo.com, you can manage your listings and update your bids through their "DirecTraffic Center" on-line management tool. Unpaid listings cannot be modified.


After You Submit Your Site

After you've gone to all the trouble to submit your site to the various index services, you'd probably like to just forget about it and move on to other tasks. I know I would. But just as we all check our HTML after making changes to a document, so should we check our results with the search engines and directories.

The easiest way to do this is usually to search for your domain name or host name in each search engine. Not only will you find out which of your pages are listed and which are not, but you'll also see how they are listed. You may find that pages are listed with missing titles and cryptic or incorrect descriptions.

Once again, the author of the help file at AltaVista had some great things to say. These specific searching suggestions could be modified to work with most other search engines.

A search for host: followed by your domain name will provide a list of all the pages from your site that are in the AltaVista index. If the list is relatively small, you can easily see which pages lack titles and instances where the same title and/or description has been used multiple times. You can also check the dates, to see which pages might be obsolete. If you have changed domain names or file names or directory names at any time, do searches for link: followed by the old address to find out if there are pages (either at your site or elsewhere) that still link to the old addresses. You can immediately fix the old links inside your site and contact the webmasters of other sites, asking them to fix their old links.
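If you have several sites or a history of renamed pages, it may help to generate these queries rather than type them. Here is a minimal Python sketch that builds the host: and link: query strings described above. The function name and the sample old address are hypothetical, and other search engines may use different syntax.

```python
from typing import Iterable, List
from urllib.parse import urlparse


def audit_queries(site_url: str, old_urls: Iterable[str] = ()) -> List[str]:
    """Build AltaVista-style queries for checking how a site is indexed."""
    # Accept either a full URL or a bare host name.
    host = urlparse(site_url).hostname or site_url
    queries = [f"host:{host}"]
    # One link: query per old address, to find pages still pointing at it.
    for old in old_urls:
        queries.append(f"link:{old}")
    return queries


if __name__ == "__main__":
    for q in audit_queries("http://www.chipchapin.com/",
                           ["www.chipchapin.com/old-page.html"]):
        print(q)
```

Paste each resulting line into the search box and review the listings by hand, as described above.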

Pay for Promotion

Businesses listed in this section will promote your site for a fee, or sell tools to help you promote your site. Currently these are all search engine submittal services and tools.

WebMasterSuite

Info:  http://www.webmastersuite.com/
Type: Submittal Service

For $30 WebMasterSuite will submit a site to 400 search engines.  For no charge they will submit to 16 search engines, provided that you register with them.  I think they charge $30 for each site, which would be excessive if submitting multiple sites.  Also this doesn't correlate well with Lycos' recommendation of submitting each page of a site separately.  Here are the 16 search engines that they submitted for me: 555-1212.com, All the Web, AltaVista, Direct Hit, Google, InfoSeek, InfoSpace, IQ Seek, Multilinks, Northern Light, Oneseek, PeachPod, Peekaboo, Search It, WebCrawler, Zap Directory.

Trellian SubmitWolf

Info: http://www.trellian.com/swolf/index.html
Type: Submittal Tools

A commercial software product, cost $95. A free download is available (restrictions not clear).  They describe themselves as follows:

SubmitWolf PRO is an easy to use, professional website promotional tool, which automates the process of promoting your web pages on the internet. It can dramatically increase the number of visitors to your web site. It has a database of over 3500 search engines, link list and business directories.  SubmitWolf Pro can automatically register your website's details with 1500+ of these sites, simply and quickly achieving mass exposure.  Why pay a submission service to promote just a single URL, when for just $95 you can own the software which many such services themselves use. In addition, you can submit as many URLs as you like, at no extra cost.  Incidentally, over 1000 URL submission services rely on SubmitWolf technology.

Global Promote

Info: http://www.gloprom.com/
Type: Submittal Service

Describes itself as follows: "Global promote is a first class translation and submission service utilizing Trellian technology."  They will "promote" a web page for $25.  They also provide international translation services.

Restricting Indexing of Your Site (robots.txt)

There are ways of giving "hints" to search spiders or robots as to what parts of your site may be off-limits or not ready for public view.  One way is through the "robots.txt" file.  Details of creating the robots.txt file are found at http://info.webcrawler.com/mak/projects/robots/exclusion-admin.html.  There can be only a single robots.txt file for a given web site, and it must appear at the top level of the site, e.g. "http://www.chipchapin.com/robots.txt".
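As an illustration, the Python sketch below holds a small robots.txt and uses the standard library's urllib.robotparser to show how a well-behaved robot would interpret it. The /drafts/ and /private/ paths and the helper function are made up for this example; substitute the parts of your own site that you want left alone.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents: ask all robots to stay out of two directories.
ROBOTS_TXT = """\
User-agent: *
Disallow: /drafts/
Disallow: /private/
"""


def allowed(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Report whether a robot honoring robots_txt may fetch the given URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for path in ("/index.html", "/drafts/new.html"):
        url = "http://www.chipchapin.com" + path
        print(url, "allowed" if allowed(ROBOTS_TXT, url) else "disallowed")
```

Checking the file this way before you upload it can catch a typo that would otherwise hide your whole site.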

Another way is through the "Robots META tag" ( http://info.webcrawler.com/mak/projects/robots/meta-user.html) which can be included in the HEAD section of any HTML document.  For example:

<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">

means that a robot should neither index this document, nor analyse it for links.
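To check which directives a page actually carries, something like the following Python sketch can help. It uses the standard library's html.parser to collect the contents of any Robots META tag. The class and function names are my own, and a real robot may read the tag differently.

```python
from html.parser import HTMLParser
from typing import Set


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <META NAME="ROBOTS"> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: Set[str] = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, but not attribute values.
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for token in attr.get("content", "").split(","):
                token = token.strip().lower()
                if token:
                    self.directives.add(token)


def robots_directives(html: str) -> Set[str]:
    """Return the lowercased robots directives present in an HTML document."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


if __name__ == "__main__":
    page = '<HTML><HEAD><META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW"></HEAD></HTML>'
    print(sorted(robots_directives(page)))
```

An empty result means the page has no Robots META tag, so robots will fall back on their default behavior.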

Remember first of all that these are just hints and do not actually create any sort of security.  Secondly, a given web crawler robot may or may not actually obey these hints.

References and Links

SelfPromotion.com (http://selfpromotion.com/index.t)
"If you want more people to visit your website, you must know how to create search-engine-friendly webpages, and then submit your URL to all the major search engines. That's what this site is all about."  SelfPromotion.com is a poster-child for good use of the Title, Keywords, Description and Introduction.  It's a source for massive amounts of information on this whole subject.  And it's free!  The brainchild of Robert Woodhead, who sounds like a fine fellow.  The whole tone of his site says "techie".  I like it.
SearchEngineWatch (http://searchenginewatch.com/)
An entire site by Danny Sullivan devoted to the study of search engines.  This was recommended by Robert Woodhead at SelfPromotion.com.  It includes a lot of resources and is well worth spending some time on.  Among the many excellent articles, this one comparing search engine features is particularly cool: Search Engine Features For Webmasters.
WebMasterSuite (http://www.webmastersuite.com/)
Besides running a submittal service, one of the best things about WebMasterSuite is their collection of informative articles.  These really help to clarify how the different search engines work and what's important to a proper listing.  I particularly liked the first article:
Lise Elsliger, "Do's and Don'ts of Getting and Staying Listed in Major Search Engines"
PromotionBase.com (http://www.promotionbase.com/)
This is where I found the Keywords article by Sumantra Roy, which in turn led me to SelfPromotion.com and various useful tools.  It contains a nice assortment of articles, though they tend to be directed more toward marketeers than techies.
JimTools / JimWorld
This is where one of the keyword search tools appears, http://www.jimtools.com/keywords/index.html. The two sites seemed to be down for several days as I was writing this, but I can reach them now. I'm glad I persisted in trying to reach these sites because there is a wealth of material here.  Besides the keyword tool and many others, JimTools includes a free search engine submitter.  The submitter submits to 46 separate search engines, but no directories. The site is heavily sponsored and the ads are annoying, but at least they're all relevant to the subject of web site design and promotion. In fact the ads are unusually well-integrated into the overall design of the site and may be worth study for that reason. See his Ad Rates page for more info.
"Welcome to the Road to More Traffic"  -- Not the most appealing slogan, but that's what JimWorld is all about.  The site includes a lot of stuff, such as an article on "How to Interpret Server Log Files," and one on "How To Use Meta Tags".  The material presented is quite basic, but I still learned things from it.