"Search Engine" is a generic
term commonly used for Web Sites that match users searching
for information to a list of Web Sites that may meet their
needs. While all "Search Engines" perform the same
function, the methods they use to create their Web Site lists
(listings) vary greatly. Some "Search Engines" use
employees to review and build Web Site listings. Others generate
their listings using computer programs, commonly referred to as
robots. Some, like Yahoo, use both.
Employee Generated Listings Overview
Search Engines that rely on employees accept
or deny requests to add new Web Sites, generally submitted by
web site administrators. Each request contains information about
the site and suggestions for how the requester would like
the site to appear in the Search Engine's listings. Reviewers
generally verify that Web Sites meet the Search Engine's standards
and place approved sites into categories. End users looking for
information move through the categories (for example,
Entertainment/Music/Rock and Roll) to find web sites that meet
their needs.
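A directory of this kind can be pictured as a tree of categories with site listings at the leaves. The sketch below is only an illustration of that idea, not how any particular Search Engine stores its listings; the category names beyond Entertainment/Music/Rock and Roll, the `.test` site names, and the `browse` and `submit` helpers are all made up for the example.

```python
# Hypothetical directory: nested categories, with lists of sites at the leaves.
DIRECTORY = {
    "Entertainment": {
        "Music": {
            "Rock and Roll": ["example-rock-fans.test", "example-guitar-shop.test"],
            "Jazz": ["example-jazz-club.test"],
        },
    },
}


def browse(directory, path):
    """Walk a slash-separated category path and return what is stored there."""
    node = directory
    for name in path.split("/"):
        node = node[name]  # raises KeyError if the category does not exist
    return node


def submit(directory, path, site):
    """Simulate a reviewer approving a site and filing it under a category."""
    parts = path.split("/")
    node = directory
    for name in parts[:-1]:
        node = node.setdefault(name, {})   # create missing parent categories
    node.setdefault(parts[-1], []).append(site)


if __name__ == "__main__":
    print(browse(DIRECTORY, "Entertainment/Music/Rock and Roll"))
    submit(DIRECTORY, "Travel/Lodging", "example-inn.test")
    print(browse(DIRECTORY, "Travel/Lodging"))
```

Browsing a category that holds subcategories returns the subcategories; browsing a leaf returns the sites a reviewer filed there.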
Robot Generated Listings
Search engines that use robots also
accept requests to add new web pages to their listings. These
requests are passed to the robot. The robot visits the
site, reads the pages, follows the links on each page, and stores
the information it finds in the search engine's database.
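The visit, read, follow, and store cycle can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code of any real robot: the `SITE` dictionary stands in for fetching pages over the network, and the page contents, `LinkAndTextParser`, and `crawl` are names invented for the example.

```python
from collections import deque
from html.parser import HTMLParser


class LinkAndTextParser(HTMLParser):
    """Collects the href targets of <a> tags and the visible text of a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        if data.strip():
            self.text.append(data.strip())


# Hypothetical site: page path -> HTML source (stands in for real page fetches).
SITE = {
    "/": '<html><body>Welcome <a href="/rooms">Rooms</a> <a href="/rates">Rates</a></body></html>',
    "/rooms": '<html><body>Our rooms <a href="/">Home</a></body></html>',
    "/rates": '<html><body>Nightly rates <a href="/missing">Old link</a></body></html>',
}


def crawl(site, start="/"):
    """Visit every page reachable from `start` and store the text found on it."""
    database = {}              # page path -> text the robot stored
    queue = deque([start])
    seen = {start}
    while queue:
        path = queue.popleft()
        html = site.get(path)
        if html is None:       # broken link: nothing to read
            continue
        parser = LinkAndTextParser()
        parser.feed(html)
        database[path] = " ".join(parser.text)
        for link in parser.links:
            if link not in seen:   # never queue the same page twice
                seen.add(link)
                queue.append(link)
    return database


if __name__ == "__main__":
    for path, text in crawl(SITE).items():
        print(path, "->", text)
```

Note that a page is only stored if some link the robot can read leads to it, which is why pages reachable only through features the robot does not understand can go missing from the database.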
Rather than browsing through categories,
users looking for information through this type of search engine
enter search strings. The search engine then takes the search
string and tries to match it to the web site information stored
in the database earlier by the robot. Web Site matches are returned
to the user in the order that the search engine perceives to
be most relevant.
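As a rough sketch of matching and ordering, the example below scores each stored page by how often the words of the search string appear in it. Real search engines use their own, more involved relevance rules; the word-count scoring here is an assumption chosen for simplicity, and the `DATABASE` contents and function names are hypothetical.

```python
import re

# Hypothetical stored pages: site -> text the robot saved earlier.
DATABASE = {
    "inn.test": "bed and breakfast inn with mountain views and breakfast daily",
    "grill.test": "barbecue ribs and barbecue sauce recipes",
    "diner.test": "breakfast served all day",
}


def tokenize(text):
    """Split text into lowercase words, dropping punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def search(database, query):
    """Return matching sites, the ones with the most word hits first."""
    terms = tokenize(query)
    scored = []
    for site, text in database.items():
        words = tokenize(text)
        score = sum(words.count(term) for term in terms)
        if score > 0:
            scored.append((score, site))
    # Highest score first; ties are broken alphabetically by site name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [site for _, site in scored]


if __name__ == "__main__":
    print(search(DATABASE, "breakfast"))
    print(search(DATABASE, "barbecue"))
```

Under this scoring, a search for "breakfast" lists inn.test ahead of diner.test because the word appears twice in the inn's stored text and once in the diner's.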
Stated another way, what the robots find
when they visit your site determines if and how your site will
appear when users perform searches related to your company. It
is critical to understand how the different robots read
and store information in order to use their features to your
company's advantage.
What Robots do and don't understand
To begin with, robots are similar to very
old web browsers and can understand only a limited part of
the information that many web sites contain. For instance,
robots don't understand pages generated using tools like CGI
scripts, database scripts, or Shockwave. They may not read image
maps. The use of frames may either prevent the search engine
from finding pages within a web site, or cause the search
engine to send visitors into a site without the proper frame
"context" being established.
If the robot can't understand the web page
it visits, or understands only the wrong piece of it, the page
won't be stored in the search engine's database or won't be found
by users trying to find the products and services your company
provides.
Does all this mean that a web site has to be extremely simple
so that search engines can read and record it?
No, but it does mean that the web designer
needs to anticipate these problems and account for them.
Once you make your site readable by the
robot, it is important to give the robot the information you want
it to have. Unfortunately, you can't just fill out a form to tell
the robot about your site. It reads tags, both visible and behind
the scenes, to determine for itself what information your site
contains.
Have you ever performed a search for something
like "barbecue" and gotten back a result like this?
No Title
Bad Cat Fall Fling. October 12, 1997.
Cadet. 103 lbs. David Wason (BloomingWale)
Steve Truso (Hillsboro) 112 lbs. Jeffrey MaCava (WCW) Frederic Gevao...
It's not likely that many people will click on this site
over any other, is it? This web site's designer failed to design
the site so that it spoon-feeds the robot quality information
about the site's content. This site may be the greatest looking
and most useful site ever created, but it won't be of much use
if no one sees it.
As important as it is to give search engines
the information you want displayed, it is even more important
to make sure that the robot has stored an accurate representation
of your page's content. If the search engine holds information
about your "bed and breakfast" site that causes it
to come up in the search results of people interested in "eating
breakfast in bed", it won't do your Bed and Breakfast much
good. What's even worse, people who want your service won't
be able to find you.
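One way such a mismatch can arise is when pages are matched on individual words rather than on whole phrases. The sketch below assumes that kind of word-by-word matching purely for illustration; it is not a description of how any specific search engine works, and the page text and function names are invented.

```python
import re


def words(text):
    """Return the set of lowercase words in a piece of text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def word_overlap(page_text, query):
    """Words shared by the page and the query, ignoring word order."""
    return words(page_text) & words(query)


def phrase_match(page_text, query):
    """True only if the query appears in the page as an exact phrase."""
    return query.lower() in page_text.lower()


# Hypothetical text stored for a bed and breakfast site.
PAGE = "Welcome to our bed and breakfast"

if __name__ == "__main__":
    print(sorted(word_overlap(PAGE, "eating breakfast in bed")))
    print(phrase_match(PAGE, "bed and breakfast"))
    print(phrase_match(PAGE, "eating breakfast in bed"))
```

With word-by-word matching, the page shares "bed" and "breakfast" with the query "eating breakfast in bed", so it can surface for the wrong audience even though the exact phrase never appears on it.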
Robots collect the information contained
in your site's <TITLE>, <BODY>, and <META> tags
and use it, along with a few other factors, to determine whether your
site matches a user's search criteria and what information is
displayed to the user performing the search.
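To make this concrete, here is a small sketch of pulling the <TITLE>, <META>, and <BODY> content out of a page and building a listing from it. The fallback rules (showing "No Title" when the title is missing, and using the first 100 characters of body text when there is no description) are assumptions made for the example; each search engine has its own rules. The sample pages, `PageSummaryParser`, and `summarize` are hypothetical names.

```python
from html.parser import HTMLParser


class PageSummaryParser(HTMLParser):
    """Gathers the title, named META tags, and body text of one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self.body_text = []
        self._in_title = False
        self._in_body = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)          # tag and attribute names arrive lowercased
        if tag == "title":
            self._in_title = True
        elif tag == "body":
            self._in_body = True
        elif tag == "meta" and attrs.get("name"):
            self.meta[attrs["name"].lower()] = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "body":
            self._in_body = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._in_body and data.strip():
            self.body_text.append(data.strip())


def summarize(html):
    """Build the listing a search engine might show for this page."""
    parser = PageSummaryParser()
    parser.feed(html)
    body = " ".join(parser.body_text)
    return {
        "title": parser.title.strip() or "No Title",        # assumed fallback
        "description": parser.meta.get("description") or body[:100],
        "keywords": parser.meta.get("keywords", ""),
    }


# Hypothetical pages: one that describes itself, one that does not.
GOOD = ('<HTML><HEAD><TITLE>Pine Lodge Bed and Breakfast</TITLE>'
        '<META NAME="description" CONTENT="A quiet bed and breakfast inn.">'
        '<META NAME="keywords" CONTENT="bed and breakfast, inn, lodging">'
        '</HEAD><BODY>Welcome to Pine Lodge.</BODY></HTML>')
BARE = '<HTML><BODY>Bad Cat Fall Fling. October 12, 1997. Cadet. 103 lbs.</BODY></HTML>'

if __name__ == "__main__":
    print(summarize(GOOD))
    print(summarize(BARE))
```

The page without a <TITLE> or <META> description ends up listed as "No Title" followed by whatever text happens to come first in the body, while the page that describes itself controls what the searcher sees.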
More Info
This was designed as a very basic overview.
There are many search engines and directories, each with its
own specific characteristics. If you're interested in learning
more and have some time on your hands, Search
Engine Watch is a great source for more in-depth info.