Yahoo! Slurp is Yahoo!'s web-indexing robot. The
Yahoo! Slurp crawler collects documents from the Web to build a
searchable index for search services using the Yahoo! search engine.
These documents are discovered and crawled because other web pages
contain links pointing to them.
As part of the crawling effort, the Yahoo! Slurp crawler takes
robots.txt standards into account to ensure it does not crawl or
index pages that you would not like to have returned via Yahoo!
Search Technology. If robots.txt disallows crawling of a page, the
page is neither considered for inclusion nor placed in the search
engine's database.
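As a rough illustration of how a robots.txt rule is evaluated, the sketch below uses Python's standard urllib.robotparser module on a hypothetical robots.txt file. The file contents, the example URLs, the helper name slurp_may_fetch, and the "Slurp" user-agent token are assumptions made for illustration. This is not Yahoo!'s actual implementation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the paths are illustrative only.
ROBOTS_TXT = """\
User-agent: Slurp
Disallow: /private/

User-agent: *
Disallow:
"""


def slurp_may_fetch(robots_txt: str, url: str, agent: str = "Slurp") -> bool:
    """Return True if the given robots.txt text allows `agent` to fetch `url`."""
    parser = RobotFileParser()
    # parse() accepts the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for candidate in (
        "http://example.com/private/page.html",
        "http://example.com/public/page.html",
    ):
        print(candidate, slurp_may_fetch(ROBOTS_TXT, candidate))
```

With this sample file, URLs under /private/ are disallowed for the Slurp agent, while other agents fall back to the permissive "*" record.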
Yahoo! Slurp follows HREF links. It does not follow SRC links. This
means that Yahoo! Slurp does not retrieve or index individual frames
referred to by SRC links.
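To make the HREF/SRC distinction concrete, here is a minimal sketch that separates the two kinds of links in a page using Python's standard html.parser module. The class name LinkSplitter, the helper split_links, and the sample markup are hypothetical; the code only illustrates the distinction and says nothing about how Yahoo! Slurp is actually built.

```python
from html.parser import HTMLParser


class LinkSplitter(HTMLParser):
    """Collect HREF targets and SRC targets into separate lists."""

    def __init__(self):
        super().__init__()
        self.href_links = []  # links a crawler that follows HREF would queue
        self.src_links = []   # frame/image/script sources, not followed

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases attribute names, so HREF and href both match.
        for name, value in attrs:
            if value is None:
                continue
            if name == "href":
                self.href_links.append(value)
            elif name == "src":
                self.src_links.append(value)


def split_links(html: str):
    """Return (href_links, src_links) found in the given HTML text."""
    splitter = LinkSplitter()
    splitter.feed(html)
    splitter.close()
    return splitter.href_links, splitter.src_links
```

A crawler behaving as described above would queue only the first list and ignore the second.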
Yahoo! Slurp has support for frames and makes an effort to crawl
complex URLs such as those generated by forms, content generation
systems, and dynamic page generation software.
How can I prevent Yahoo! Slurp from following
links from a particular page or archiving a copy of a page?
Yahoo! Slurp obeys the noindex meta-tag. If you place:
<META NAME="robots" CONTENT="noindex">
in the head of your web document, Yahoo! Slurp will retrieve the
document, but it will not index the document or place it in the
search engine's database.
Yahoo! Slurp recognizes the following robots meta-tags:

| Meta tag | Effect |
| --- | --- |
| <META NAME="robots" CONTENT="noindex"> | Yahoo! Slurp will retrieve the document, but it will not index the document. |
| <META NAME="robots" CONTENT="nofollow"> | Yahoo! Slurp will not follow any links that are present on the page to other documents. |
| <META NAME="robots" CONTENT="noarchive"> | Yahoo! maintains a cache of all the documents that it fetches, so that users can still access the indexed content if the original host is inaccessible or the content has changed. If you do not wish a document from your site to be archived, place this tag in the head of the document, and Yahoo! will not provide an archived (cached) copy of the document. |
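The following sketch shows one way to check which robots meta-tag directives a page declares, using Python's standard html.parser module. The names RobotsMetaParser and robots_directives and the sample page are hypothetical, and splitting the CONTENT value on commas is an assumption about how combined directives are written. It is meant as a self-check for page authors, not as a description of Yahoo! Slurp's parser.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <META NAME="robots" CONTENT="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute names arrive lowercased; values keep their original case.
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        # Assumption: multiple directives are separated by commas.
        for token in attr_map.get("content", "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def robots_directives(html: str) -> set:
    """Return the set of lowercase robots directives declared in the page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


if __name__ == "__main__":
    page = '<html><head><META NAME="robots" CONTENT="noindex"></head></html>'
    print(robots_directives(page))
```

A page owner could run this against a saved copy of a page to confirm that the intended noindex, nofollow, or noarchive directive is actually present in the head.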
Yahoo! Slurp indexes not only the title and meta tags,
but also the full text of web pages. Including quality content
in a web page is therefore as important as including keywords in
the title and meta tags.
When Yahoo! finds a page that contains the searched keyword, it
lists the page in its search engine results pages (SERPs). The
position of the page depends on its content. A page that contains
the keyword may still be left out, however, either because its
content is poor or because Yahoo! could not find the page.
How can I attract Yahoo! Slurp to crawl my site?
There are three ways to attract the Yahoo! crawler to your site:
1. Get links from sites that are regularly crawled
by the Yahoo! robot. Yahoo! will then visit and crawl your site
regularly. Regular visits from Yahoo! are a good sign and help a
lot in getting a good ranking.
2. According to Yahoo!, browsing a site with the Yahoo!
Companion toolbar triggers the Yahoo! Slurp bot.
3. Use the infamous PFI/PPC program Site Match.
This type of inclusion guarantees inclusion in the Yahoo! index,
so there is no problem using it.