Sites to be indexed
1) All sites listed in the GOU-SEX.com directory are fully indexed.
2) Other sites are indexed if a section of their home page (title, description, or keywords) contains any of the following keywords or their word forms:
title:
*porno*, *porn*, *sex*, *adult*, *fucking*, *fuck*, *BDSM*, *bizzare*, *femdom*, *bondage*, *strapon*, *pregnant*, *panti*, *pantihouse*, *mastrubation*, *squir*, *handjob*, *ass*, *anal*, *amateur*, *fetish*, *fisting*, *teen*, *erotica*, *porn*, *sex*, *erotic*, *VDSM*, *Bondage*, *lesbian*, *trans*, *gay*, *fisting*, *latex*, *fetish*.
description:
*porno*, *sex*, *adult*, *fucked*, *porno*, *prostitutes*, *sex*, *erotic*.
keywords:
*porno*, *sex*, *adult*, *porn*, *sex*.
The search robot performs a case-insensitive comparison, and the symbol * stands for any number of characters (for example, a page whose title section contains the word Sex or the word SexShop will be indexed). All other sites (pages) are ignored by the search robot.
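The matching rule can be illustrated with a minimal Python sketch. It is not the robot's actual code: the function name section_matches and the short pattern list are assumptions made for illustration, and the standard-library fnmatch module stands in for whatever matcher the robot really uses.

```python
from fnmatch import fnmatchcase

# Hypothetical, shortened pattern list used only for this illustration.
TITLE_PATTERNS = ["*sex*", "*adult*"]


def section_matches(section_text, patterns):
    """Return True if the text matches any wildcard pattern, ignoring case."""
    lowered = section_text.lower()
    # '*' matches any number of characters; lowercasing both sides
    # makes the comparison case-insensitive.
    return any(fnmatchcase(lowered, pattern.lower()) for pattern in patterns)


print(section_matches("SexShop", TITLE_PATTERNS))         # True
print(section_matches("Gardening tips", TITLE_PATTERNS))  # False
```

Note that fnmatch also treats ? and [ ] as special characters, which the description here does not mention, so this is only an approximation of the rule.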
Using robots.txt
Our search robot is named SexSearchBot. It obeys robots.txt and the exclusions defined there. The robot re-reads the robots.txt file once a week.
Directives in robots.txt for our robot
Examples:
Complete ban:
User-agent: SexSearchBot
Disallow: /
Ban on indexing the cgi-bin folder:
User-agent: SexSearchBot
Disallow: /cgi-bin/
If the server is heavily loaded and does not have time to process download requests, use the "Crawl-delay" directive. It lets you tell the search robot the minimum period of time (in seconds) to wait between finishing the download of one page and starting the download of the next. For compatibility with robots that do not fully follow the robots.txt standard, the "Crawl-delay" directive should be added to the group that begins with the "User-agent" record, immediately after the "Disallow" ("Allow") directives.
User-agent: SexSearchBot
Disallow: /cgi-bin/
Crawl-delay: 2 # sets a timeout of 2 seconds
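To check how such a file is likely to be interpreted, the rules can be fed to Python's standard urllib.robotparser module. This is only a sketch of how a standards-following parser reads the directives, not a description of how SexSearchBot itself is implemented; the host example.com is a placeholder.

```python
from urllib.robotparser import RobotFileParser

# The same rules as in the example, written without extra spaces.
ROBOTS_TXT = """\
User-agent: SexSearchBot
Disallow: /cgi-bin/
Crawl-delay: 2 # sets a timeout of 2 seconds
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# example.com is a placeholder host used only for this illustration.
print(parser.can_fetch("SexSearchBot", "http://example.com/cgi-bin/test.cgi"))  # False
print(parser.can_fetch("SexSearchBot", "http://example.com/index.html"))        # True
print(parser.crawl_delay("SexSearchBot"))                                       # 2
```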
Relevance
Sorting Documents
When sorting documents before returning search results, GOU by default considers two parameters of each document: relevance and popularity. Documents are sorted first by relevance and, where relevance is equal, by popularity.
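The two-level ordering can be sketched in Python as a sort on a (relevance, popularity) pair. The document records, field names, and numbers below are made up for illustration; only the ordering rule comes from the description above.

```python
def rank_documents(documents):
    """Sort by relevance, then by popularity, highest values first."""
    return sorted(
        documents,
        key=lambda doc: (doc["relevance"], doc["popularity"]),
        reverse=True,
    )


# Hypothetical documents with illustrative scores.
docs = [
    {"url": "http://a.example/", "relevance": 80.0, "popularity": 0.5},
    {"url": "http://b.example/", "relevance": 95.0, "popularity": 0.1},
    {"url": "http://c.example/", "relevance": 80.0, "popularity": 2.0},
]

for doc in rank_documents(docs):
    print(doc["url"])
# http://b.example/
# http://c.example/
# http://a.example/
```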
Calculating Relevance
The relevance of each document is calculated as the cosine of the angle between the weight vector of the document and the weight vector of the search query, multiplied by 100%. The number of vector coordinates equals the number of words in the query (including all word forms and synonyms) multiplied by the number of sections. Each coordinate of the vector corresponds to one query word in one document section. The value of that coordinate is calculated from the section weights and from whether the word is the one given in the query or one of its forms. One additional coordinate equals the average distance between the query words in the document. For the query vector, this coordinate is 0.
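The cosine formula itself can be sketched in a few lines of Python. How the individual coordinates are derived from section weights is not specified here in enough detail to reproduce, so the function below simply takes two ready-made vectors; the function name and the zero-vector handling are assumptions.

```python
import math


def relevance_percent(doc_vector, query_vector):
    """Cosine of the angle between two weight vectors, as a percentage."""
    dot = sum(d * q for d, q in zip(doc_vector, query_vector))
    doc_norm = math.sqrt(sum(d * d for d in doc_vector))
    query_norm = math.sqrt(sum(q * q for q in query_vector))
    if doc_norm == 0 or query_norm == 0:
        # Assumption: a zero vector is treated as having no relevance.
        return 0.0
    return 100.0 * dot / (doc_norm * query_norm)


# A document whose weights point in the same direction as the query's
# scores about 100; a document sharing no coordinates scores 0.
print(relevance_percent([2.0, 4.0, 0.0], [1.0, 2.0, 0.0]))
print(relevance_percent([0.0, 0.0, 3.0], [1.0, 2.0, 0.0]))
```

Because the query vector's distance coordinate is 0, that coordinate adds nothing to the dot product but does increase the document vector's length, so a larger average distance between the query words lowers the cosine.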
Popularity Rating
The GOU search engine supports its own method of calculating the popularity rating, called SexRank. By default, the popularity rank calculation uses only links between different sites.
Method of Calculating SexRank Popularity
The popularity rating is calculated in two stages. In the first stage, the initial parameter value of each server is divided by the number of links from that server, which gives the weight of a single link from that server. In the second stage, the weights of all links pointing to a page are summed for each page. This sum is the popularity of the page.
By default, the initial parameter value for each server is 1.
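The two stages can be sketched in Python as follows. This is an illustration under stated assumptions, not the engine's implementation: a "server" is taken to be the URL's host name, only links between different hosts are counted (both in the per-server link count and in the sums), and the function name and example URLs are hypothetical.

```python
from collections import Counter, defaultdict
from urllib.parse import urlparse


def popularity(links, initial_value=1.0):
    """Compute page popularity from (source_url, target_url) link pairs."""
    # Assumption: only links between different hosts are taken into account.
    cross_site = [
        (src, dst) for src, dst in links
        if urlparse(src).hostname != urlparse(dst).hostname
    ]

    # Stage 1: weight of one link = initial value / number of links from the server.
    links_per_server = Counter(urlparse(src).hostname for src, _ in cross_site)
    link_weight = {
        server: initial_value / count for server, count in links_per_server.items()
    }

    # Stage 2: popularity of a page = sum of the weights of links pointing to it.
    scores = defaultdict(float)
    for src, dst in cross_site:
        scores[dst] += link_weight[urlparse(src).hostname]
    return dict(scores)


# Hypothetical link graph.
example_links = [
    ("http://a.example/1", "http://b.example/x"),
    ("http://a.example/2", "http://c.example/y"),
    ("http://c.example/y", "http://b.example/x"),
    ("http://a.example/1", "http://a.example/2"),  # same site, ignored
]
print(popularity(example_links))
# {'http://b.example/x': 1.5, 'http://c.example/y': 0.5}
```

Here a.example has two outgoing cross-site links, so each weighs 0.5, while the single link from c.example weighs 1.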