Algorithms for ranking link distances via @martinibuster

There is a class of link-related algorithm that is not discussed nearly enough. This article is intended as an introduction to link distance ranking algorithms, something that can play a role in how sites rank. In my opinion, it is important to be aware of it.

Does Google use it?

Although the algorithm discussed here comes from a patent filed by Google, Google's official statement on patents and research papers is that it produces many of them, that not all of them are used, and that some are used differently from what is described.

That said, the details of this algorithm seem to resemble the contours of what Google has officially said about how it handles links.

Complexity of Calculations

Two sections of the patent (Producing a Ranking for Pages Using Distances in a Web Link Graph) address the complexity of the calculations:

"Unfortunately, this variation of PageRank requires solving the entire system for each seed separately. Therefore, as the number of seed pages increases, the complexity of the computation increases linearly, thereby limiting the number of seeds that can be practically used."

"Hence, what is needed is a method and an apparatus for producing a ranking for pages on the Web using a large number of diversified seed pages…"

The above highlights the difficulty of performing these calculations across the entire Web because of the enormous number of data points involved. It is easier to perform them niche by niche, by topic.

What is interesting about this statement is that the original Penguin algorithm was calculated roughly once a year or so. Sites that were penalized remained penalized until the next, seemingly random, date on which Google recalculated the Penguin score.

At one point, Google's infrastructure needed to be improved. Google is constantly building out its infrastructure but apparently does not advertise it. The Caffeine web indexing system is one of the rare exceptions.

Real-time Penguin was deployed in the fall of 2016.

It is remarkable that these calculations are difficult. That hints at the possibility that Google performs a periodic calculation for the entire Web, then assigns scores based on the distances between trusted sites and all other sites: one gigantic calculation, carried out each year.

Thus, when a SERP is calculated via PageRank, the distance scores are calculated as well. This is very similar to the process known as the Penguin algorithm.

"The system then assigns lengths to the links based on properties of the links and properties of the pages attached to the links. The system then computes shortest distances from the set of seed pages to each page in the set of pages based on the lengths of the links between the pages. Next, the system determines a ranking score for each page in the set of pages based on the computed shortest distances."

What does the system do?

The system creates a score based on the shortest distance between a seed set and the pages being ranked. That score is then used to rank those pages.
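As a thought experiment, the shortest-distance scoring the patent describes can be sketched as a multi-source Dijkstra search from the seed set. Everything here (the graph encoding, the link lengths, and the exp(-distance) scoring) is an illustrative assumption, not the patent's actual implementation:

```python
import heapq
import math

def distance_rank(links, seeds):
    """Multi-source Dijkstra from the seed set; shorter distance = higher score.

    links: dict mapping page -> list of (target_page, link_length) tuples,
    where link_length would come from link/page properties (hypothetical here).
    seeds: iterable of trusted seed pages, each at distance 0.
    """
    dist = {s: 0.0 for s in seeds}
    heap = [(0.0, s) for s in seeds]
    heapq.heapify(heap)
    while heap:
        d, page = heapq.heappop(heap)
        if d > dist.get(page, math.inf):
            continue  # stale heap entry; a shorter path was already found
        for target, length in links.get(page, []):
            nd = d + length
            if nd < dist.get(target, math.inf):
                dist[target] = nd
                heapq.heappush(heap, (nd, target))
    # Turn distances into ranking scores: pages closer to the seeds score higher.
    return {page: math.exp(-d) for page, d in dist.items()}
```

Pages unreachable from any seed never appear in the result at all, which loosely mirrors the idea that pages with no path back to the trusted set get no benefit.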

So this is a layer on top of the PageRank score that helps weed out manipulated links, based on the theory that manipulated links will naturally have a longer link distance between the spam page and the trusted seed set.

It can be said that ranking a web page involves three processes:

Indexing
Ranking
Ranking modification (usually related to personalization)

That is an extreme simplification of the ranking process.

It is interesting to note that this distance ranking takes place during the ranking part of the process. There seems to be no chance for a page to rank well without being associated back to the seed set.

Here is what the patent says:

"A possible variation of PageRank that would reduce the effect of these techniques is to select a few 'trusted' pages (also referred to as the seed pages) and discover other pages which are likely to be good by following the links from the trusted pages."

This is an important distinction: knowing at which stage of the ranking process the seed distances are calculated helps us formulate our link strategy.

This is different from Yahoo's TrustRank. YTR turned out to be biased.

It can be said that Majestic's Topical TrustFlow is an improved version, in line with a research paper that demonstrated it is more accurate to use a seed set organized by niche. The research also showed that a seed set organized by topic works better than one that is not.

So it makes sense that Google's distance ranking algorithm would also organize its seed set into groups of niche topics.

As I understand it, this Google patent calculates distances between a seed set and other pages, and assigns distance scores.

Reduced Link Graph

"In a variation on this embodiment, the links associated with the computed shortest distances constitute a reduced link graph."

This means there is a map of the Internet known as a link graph, and a smaller version of it, the reduced link graph, from which spam-filled web pages have been filtered out. Sites that primarily get links from outside the reduced link graph may never enter it. The dirty links get no traction.

What is a reduced link graph?

I will keep this short and sweet. The link to the paper is presented below.

What you really need to know is this part:

"The early success of link analysis algorithms was based on the assumption that links confer authority. However, many links today exist for purposes other than conferring authority; such links introduce noise into link analysis and undermine the quality of retrieval.

To provide high quality search results, it is important to detect them and reduce their influence… With the help of a classifier, these noisy links are detected and dropped. After that, link analysis algorithms are run on the reduced link graph."
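The classify-then-reduce step described in that quote can be sketched as follows. The `is_noisy` predicate is a stand-in for the paper's trained classifier, which is not specified here:

```python
def reduce_link_graph(links, is_noisy):
    """Build a reduced link graph by dropping classifier-flagged links.

    links: dict mapping source page -> list of target pages.
    is_noisy: stand-in for a trained classifier; here, any predicate
    on (source, target) pairs that returns True for noisy links.
    """
    return {
        source: [t for t in targets if not is_noisy(source, t)]
        for source, targets in links.items()
    }
```

Link analysis algorithms (PageRank, distance ranking, and so on) would then be run on the returned graph instead of the full one.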

Read this PDF for more information on reduced link graphs.

If you are getting links from sites such as news organizations, it may be fair to assume they are inside the reduced link graph. But are they part of the seed set? Perhaps we should not obsess over that.

Does Google say that negative SEO does not exist?

"… the links associated with the shortest calculated distances constitute a reduced link graph"

A reduced link graph is different from a link graph. A link graph can be described as a map of the entire Internet, organized by the links between sites, pages, or even parts of pages.

Then there is the reduced link graph, which represents a map of everything minus the sites that do not meet specific criteria.

A reduced link graph can be a map of the Web minus the spam sites. Sites outside the reduced link graph will have no effect on those located inside it, because they are on the outside.

This is probably why a spam site linking to a normal site does no harm to the non-spam site. Because the spam site is outside the reduced link graph, it has no effect. The link is ignored.

Could this be the reason why Google is so confident that it captures link spam and that there is no negative SEO?

Does Distance from the Seed Set Mean Less Ranking Power?

I do not think it is necessary to try to map out what the seed set is. What is more important, in my opinion, is to be aware of link neighborhoods and how they relate to where you obtain your links.

At one time, Google displayed a PageRank score for every page, so I can remember which kinds of sites tended to have low scores. There is a class of sites with weak PageRank and weak Moz DA that are nevertheless closely linked to sites that, I think, are probably just a few clicks away from the seed set.

What Moz DA measures is an approximation of a site's authority. It's a good tool. However, what Moz DA measures may not be the distance from a seed set, which is impossible to know because the seed set is a Google secret.

I am not putting down the Moz DA tool; keep using it. I am just suggesting that you widen your criteria for what counts as a useful link.

What does it mean to be close to a seed set?

A Stanford University paper asks, on page 17, what a good notion of proximity is. The answers are:

Multiple connections
Quality of connection
Direct and indirect connections
Length, degree, weight

This is an interesting consideration.
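To illustrate how proximity factors like these might feed a distance calculation, here is a hypothetical link-length assignment in which better links get shorter lengths. The inputs and the weighting are invented for illustration; they do not come from the patent or the Stanford paper:

```python
def link_length(anchor_quality, page_quality, num_outlinks):
    """Hypothetical link-length function: better links get shorter lengths.

    anchor_quality, page_quality: scores in (0, 1], purely illustrative.
    num_outlinks: pages with many outgoing links dilute each link,
    which we model as a longer length per link.
    """
    base = 1.0 / max(anchor_quality * page_quality, 1e-6)
    return base * (1.0 + 0.1 * num_outlinks)
```

A shortest-distance search using lengths like these would naturally favor paths made of high-quality, focused links, which is the intuition behind "quality of connection" and "degree" above.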

Take Away

Many people worry about anchor text ratios and the DA/PA of inbound links, but I think those considerations are a bit dated.

The problem with DA / PA is a backtrack on trying to get links from pages with a PageRank of 4 or more, which was a practice that has started from a randomly selected PageRank score, the number 4.

When we talk or think about links in the context of ranking, it may be worth making distance ranking part of that conversation.

Read the patent here

Images by Shutterstock, Modified by the author