By Danny Sullivan, Search Engine Watch, Oct 26, 2001
Some crawler-based search engines make it easy to
confirm that your web page is in their index. With
others, it can be more difficult. Below are the best
ways to find your web pages in the major crawler-based
search engines.
Please note that these commands can also be useful
for web searchers who wish to refine their queries,
as explained more in the Site Search section of the
Power Searching For Anyone page.
The Search Engine Alliances page explains where some
search engines get or give their listings to others.
Also, Search Engine Watch members have access to pages
that explain in depth the different data sources that
each major search engine uses and how they display
information.
AltaVista
AltaVista has commands that can be used to easily
narrow your search to a single URL or to pages within
a particular web site. These commands can also be
combined with query terms by those who wish to refine
their search results.
URL Search
To find a single page listed in AltaVista's crawler-based
index, you can use the "url:" command. Simply
preface the URL you wish to locate with this command,
such as:
url:http://searchenginewatch.com/webmasters/meta.html
If the URL is in the index, it will be displayed.
You can also use this command to find pages within
a particular section of a web site. For example, this:
url:http://searchenginewatch.com/webmasters/
would list all the pages from within the /webmasters/
area of Search Engine Watch. This can be a useful
way to find all the pages from your web site, if it
resides within someone else's domain.
Site Search
To locate all the URLs listed from a particular
web site, use the "host:" command, such
as:
host:searchenginewatch.com
Use only the actual domain name. Omit the http://
prefix. Also, be aware that using the www prefix can
make a difference. For instance, the query below:
host:www.searchenginewatch.com
would bring back only the pages AltaVista has found
from Search Engine Watch with the www prefix. However,
Search Engine Watch can also be reached without the
www prefix. In fact, this is the more common way that
people come to Search Engine Watch. Consequently,
AltaVista has actually indexed many more pages from
the site without the www prefix. To see these pages,
the first example shown would have to be used.
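Because the www prefix changes what comes back, it can
help to run both forms of the query. Below is a minimal
Python sketch that builds the two host: queries from an
address. The function name host_queries is hypothetical,
and the code only formats the text to type into the search
box; it does not contact AltaVista.

```python
from typing import List


def host_queries(address: str) -> List[str]:
    """Return AltaVista host: queries without and with the www prefix.

    Hypothetical helper for illustration; it only formats query text.
    """
    name = address.strip().lower()
    # The host: command takes only the domain name, so drop http:// and any path.
    if name.startswith("http://"):
        name = name[len("http://"):]
    name = name.split("/")[0]
    # Normalize to the bare domain, then emit both variants.
    bare = name[len("www."):] if name.startswith("www.") else name
    return ["host:" + bare, "host:www." + bare]


for query in host_queries("http://www.searchenginewatch.com/"):
    print(query)
# prints:
# host:searchenginewatch.com
# host:www.searchenginewatch.com
```

Running both queries and comparing the counts shows which
form of the domain the crawler has indexed more heavily.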
AllTheWeb.com/FAST Search
At FAST Search, commands can be used to find a single
URL or multiple web pages from a particular site,
as explained below:
URL Search
To find a single page listed in FAST's crawler-based
index, you can use the "url.all:" command:
url.all:searchenginewatch.com/webmasters/meta.html
This command will also work to bring up a single
URL that is listed in the FAST-powered results used
by Lycos.
Site Search
To locate all the URLs listed from a particular web
site, use the "url.host:" command, such
as:
url.host:searchenginewatch.com
Use only the actual domain name. Omit the http://
prefix. Also, be aware that using the www prefix can
make a difference, as described with AltaVista.
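As a rough sketch, both FAST commands can be derived from
one page address. The helper name fast_queries and the
"page"/"site" keys are assumptions made for illustration.
Following the examples above, the http:// prefix is dropped
in both cases, and nothing is sent to FAST.

```python
from typing import Dict


def fast_queries(page_url: str) -> Dict[str, str]:
    """Build the url.all: and url.host: query text for one page address.

    Hypothetical helper; it formats strings and performs no search.
    """
    address = page_url.strip()
    # The examples above omit the http:// prefix, so remove it if present.
    if address.startswith("http://"):
        address = address[len("http://"):]
    # Everything before the first slash is the host name.
    host = address.split("/")[0]
    return {"page": "url.all:" + address, "site": "url.host:" + host}


queries = fast_queries("http://searchenginewatch.com/webmasters/meta.html")
print(queries["page"])  # url.all:searchenginewatch.com/webmasters/meta.html
print(queries["site"])  # url.host:searchenginewatch.com
```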
Google
At Google, commands can be used to find a single
URL or multiple web pages from a particular site,
as explained below:
URL Search
To find a single page listed in Google's
crawler-based index, you can use the "allinurl:"
command, such as:
allinurl:searchenginewatch.com/webmasters/meta.html
The allinurl command works the same way as AltaVista's
url command, which means you can also use it to find pages
within a particular section of a web site. Be sure to omit
the http:// prefix.
Please note that if you are trying to find web pages
with words both in the URL and in the document itself,
you'll need to use the special "inurl" command.
This is explained more in the URL Search section of
the Power Searching For Anyone page.
Site Search
To locate all the URLs listed from a particular web
site, use the "site:" command in combination
with a word or words that you know appear on all the
pages. For example:
site:searchenginewatch.com searchenginewatch
would bring up all (or nearly all) of the pages Google
lists from Search Engine Watch, because all the pages
should have the word "searchenginewatch"
on them as part of the footer text.
You must use the site command in combination with
a search term. It will not work otherwise.
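The requirement for a search term can be made explicit
in a small Python sketch. The function name
google_site_query is hypothetical, and the check simply
mirrors the rule stated above; the code builds the query
text only and does not call Google.

```python
def google_site_query(domain: str, common_term: str) -> str:
    """Build a Google site: query paired with a word found on every page.

    Hypothetical helper; raises ValueError when no search term is given,
    because the site: command is described above as needing one.
    """
    term = common_term.strip()
    if not term:
        raise ValueError("the site: command needs a search term")
    return "site:" + domain.strip() + " " + term


print(google_site_query("searchenginewatch.com", "searchenginewatch"))
# prints: site:searchenginewatch.com searchenginewatch
```

A word from a shared footer or navigation bar is a good
choice for the term, since it should appear on every page.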
Inktomi
Inktomi powers some of the results used by a variety
of different search engines. Below is how to locate
a single URL or multiple URLs within Inktomi-powered listings.
URL Search
To find a particular URL listed in Inktomi's crawler-based
index, you can use the "originurl:" command.
Simply preface the URL you wish to locate with this
command, such as:
originurl:http://searchenginewatch.com/webmasters/meta.html
If the URL is in the index, it will be displayed.
This command has been tested to work on the following
Inktomi-powered services:
AOL Search
GoTo
HotBot
The originurl command will bring up an individual
URL listed in the Inktomi-powered results of these
services. It does not work at iWon, LookSmart or MSN
Search.
Keep in mind that not all Inktomi partners tap into
the entire Inktomi database. That's one reason why
you may find a URL at one service but not at another.
Site Search
To locate all the URLs listed from a particular
web site, use the "domain:" command, such
as:
domain:searchenginewatch.com
Use only the actual domain name. Omit the http://
prefix. As explained above for AltaVista, using the
www prefix can also make a difference.
Unfortunately, the domain command works inconsistently
at different Inktomi-powered services. Here's a rundown:
HotBot: Use the command. If you have any listings
in the Open Directory, these will be shown first.
You'll know these are Open Directory listings because
they will have a "More like this" link underneath
them. Bypass these and find a listing for your site
that instead has a "See results from this site
only" link. Select that link, and you'll see
all the pages listed in Inktomi from your web site.
iWon & LookSmart: Using the command will list
all pages from a web site.
At the following services, the command fails to operate
because "clustering" (as explained on the
Search Assistance Features page) prevents you from
seeing more than a few pages from your site.
AOL Search
GoTo
MSN Search
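The behavior described above for the two Inktomi commands
can be summarized as a small lookup table. This is only a
sketch: the names INKTOMI_SUPPORT and services_supporting
are hypothetical, and the True/False values restate what
this page reports rather than anything tested by the code.

```python
from typing import List

# True means the command is reported above to work at that service.
# HotBot's domain: result needs the extra "See results from this site
# only" step described above.
INKTOMI_SUPPORT = {
    "AOL Search": {"originurl": True, "domain": False},
    "GoTo": {"originurl": True, "domain": False},
    "HotBot": {"originurl": True, "domain": True},
    "iWon": {"originurl": False, "domain": True},
    "LookSmart": {"originurl": False, "domain": True},
    "MSN Search": {"originurl": False, "domain": False},
}


def services_supporting(command: str) -> List[str]:
    """List the services where the given command is reported to work."""
    names = [name for name, cmds in INKTOMI_SUPPORT.items() if cmds[command]]
    # Sort case-insensitively so "iWon" is not pushed to the end.
    return sorted(names, key=str.lower)


print(services_supporting("originurl"))  # ['AOL Search', 'GoTo', 'HotBot']
print(services_supporting("domain"))     # ['HotBot', 'iWon', 'LookSmart']
```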
Directory Search
Inktomi has a special command that lets you find
pages within a specific area of a web site. This is
the "originurlpath:" command, and you use
it in combination with the domain command, such as:
domain:searchenginewatch.com originurlpath:webmasters
This would find pages from within the /webmasters/
area of Search Engine Watch. In other words, everything
within this area would be listed:
http://searchenginewatch.com/webmasters/
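A minimal sketch of combining the two commands for the
/webmasters/ area follows. The function name
inktomi_directory_query is hypothetical; it strips the
surrounding slashes from the directory name to match the
example query above and only returns the text to type.

```python
def inktomi_directory_query(domain: str, directory: str) -> str:
    """Combine domain: and originurlpath: into one Inktomi query.

    Hypothetical helper; it formats the query text and runs no search.
    """
    # The example above gives the directory without slashes ("webmasters").
    path = directory.strip("/ ")
    return "domain:" + domain.strip() + " originurlpath:" + path


print(inktomi_directory_query("searchenginewatch.com", "/webmasters/"))
# prints: domain:searchenginewatch.com originurlpath:webmasters
```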
Directories: Yahoo, LookSmart & The Open Directory
Directories are search engines that are powered by
human beings, rather than by crawling the web. Because
humans are involved, directories tend to list only
a few pages per web site. This means that you probably
won't need to use special site or URL
commands to locate your listings. In fact, of the
three major directories, only Yahoo has any specific
command like this. At Yahoo, you can use the "u:"
command to locate specific URLs, like this:
u:searchenginewatch.com
That would bring up any pages from Yahoo's human-compiled
listings that contain "searchenginewatch.com"
within the URL, if it is done from the Yahoo Directory
page, as opposed to the regular Yahoo home page (which
brings back Google results).
At the web's two other major directories, LookSmart
and the Open Directory, you'll find that searching
for your domain or a portion of your domain should
bring up many or all of your listings.
For example, by entering "searchenginewatch.com"
or "searchenginewatch," I would be able
to find most of my human-compiled listings in both
places.
LookSmart also provides a detailed guide to locating
your URL within its service and the listings it provides
to partners:
How to Find Your Listing in the LookSmart Network
http://submit.looksmart.com/info.jhtml?page=find
Outsourcers
All the search engines and directories mentioned
above produce their own listings. However, there remain
some major search engines that simply outsource the
results they provide. For example, MSN
Search uses information from both LookSmart and Inktomi.
Trying to check your listings at such a place is difficult,
because two or more data sets are involved. Below
is a guide to what happens at such places:
AOL Search
Use the tips described above for the Open Directory to
find your Open Directory listings and for Inktomi
to find your Inktomi listings.
HotBot
Using the Inktomi site search command described above
will bring up both your Open Directory and Inktomi
listings. There is no command to bring up all your
pages included in HotBot's Direct Hit powered "Top
Ten" listings.
Lycos
Use the tips described above for FAST to find your FAST
listings. There is no way to find all your listings
from the Open Directory or Direct Hit information
the service uses.
MSN Search
Use the tips described above for Inktomi to find your
Inktomi listings. There is no way to find all your
listings from LookSmart.
Netscape
Use the tips described above for the Open Directory to
find your Open Directory listings and for Google to
find your Google listings.
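The outsourcer notes above can be condensed into one
lookup, sketched here in Python. The names
OUTSOURCER_SOURCES and checkable_sources are hypothetical.
A value of True means this page describes a way to find
your listings from that source at that service; False
means it says there is none.

```python
from typing import List

# Data sources per service, as described in this section.
OUTSOURCER_SOURCES = {
    "AOL Search": {"Open Directory": True, "Inktomi": True},
    "HotBot": {"Open Directory": True, "Inktomi": True, "Direct Hit": False},
    "Lycos": {"FAST": True, "Open Directory": False, "Direct Hit": False},
    "MSN Search": {"Inktomi": True, "LookSmart": False},
    "Netscape": {"Open Directory": True, "Google": True},
}


def checkable_sources(service: str) -> List[str]:
    """Return the data sources whose listings can be checked at a service."""
    sources = OUTSOURCER_SOURCES[service]
    return [name for name, can_check in sources.items() if can_check]


print(checkable_sources("Lycos"))       # ['FAST']
print(checkable_sources("MSN Search"))  # ['Inktomi']
```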
Other Resources
There are services that can check search engines
for your URL automatically, including checking
how your pages rank for particular keyword
phrases. A list of these is maintained for Search
Engine Watch members in the Position Checking/Tracking
section of the Search Engine Optimization Toolbox
page.
Also, this page gives you just a taste of some of
the powerful searches that can be done with search
engines. See the Power Searching page for an at-a-glance
guide to other types of searches.