How Google Search Works website

httpv://www.youtube.com/watch?v=MoW60fuHn64

There is a website that you might or might not have visited: How Search Works. If you have not seen it, I highly encourage you to check it out. It is a small site that covers the advances Google has made in crawling, its algorithms, how it fights spam, and its publicly documented removal policies. It is really concrete, detailed stuff. Here are some things you might not have noticed on the site:

If you go to the main page and scroll through a little, you’ll notice it’s almost like an infographic, but it’s actually interactive, so you can click around and find all kinds of Easter eggs. What is really interesting is that if you scroll down to the bottom of the page, it tells you how long you have been on that page and how many searches Google has handled in that time, say 5.7 million or something along those lines. Google handles over 2 billion searches a day or something along those lines, so it’s pretty neat to do the math and extrapolate how many searches a day Google handles. The counter is static, based on when the site launched, but it’s kind of an interesting thing to play with.

As you look through the site, you will also find that Google talks about how it does evaluation, so there are some videos on how it evaluates search quality. A reminder about a common misconception: Google sends the results of new algorithms out to quality raters, who decide which results look better to them. The raters don’t know which algorithm is being evaluated, and Google uses their votes to judge which results are better. But Google does not take those votes or ratings from the quality raters and apply them directly in its ranking algorithm.

What is interesting is that they showed the funnel for 2012, where they went through 118,000 ideas, just playing around with new ways of generating search results. Using the ratings they have gotten from the quality raters, they are able to say in general which ones look like promising experiments.

For example, from there we did 10,000 of what we call side-by-sides, where you get these side-by-side sets of search results. It’s like a blind taste test, and you ask people which one they like better. Based on that, we did 7,000 of what we call live traffic experiments, where we actually take an experiment and put it out on our main website, and we look at how often people click on various search results to try to determine whether we’re making the search results better. The net result was that we were able to launch 665 algorithmic changes, things that changed on our search results page, in 2012. It’s kind of interesting to put that into context: that’s roughly two changes to how we generate the search results page every single day for the entire year. So it’s kind of funny when people come in and ask, “Well, OK, what happened on such-and-such day?”, because there’s usually a lot of stuff happening: things rolling out, new data being deployed. And those are actual changes we’re talking about, not just data being refreshed. So that gives you a little bit of a feel for the scale of how many different changes we’re exploring at any given point.

Now, the part of How Search Works that I enjoy the most is the spam section. There’s a lot of nitty-gritty detail there, all kinds of information that you might not have seen before. For example, there’s a spam carousel that is updated periodically, so you can actually get to see spam right after we’ve removed it. We show you a screenshot, so you don’t run into the danger of getting infected by malware or something, but it’s literally like you can watch over our shoulder as we’re removing spam. You get a chance to see the sort of stuff that we have to deal with every single day.

Right below the spam carousel, you’ll see that we have different types of spam, so we talk about the categories of spam. I think that’s pretty helpful to know, because it lets you know the sort of stuff that we have to deal with. The major categories are cloaking or sneaky redirects; hacked sites; hidden text or keyword stuffing; parked domains; pure spam, which is just another name for black hat, where any savvy user will hopefully be able to recognize it as absolute spam; things like spammy free hosts or dynamic DNS providers; thin content with very little added value; unnatural links from a site; unnatural links to a site; and user-generated spam, where you might have good content up front but maybe so many spam comments that it’s actually causing a worse user experience in the search results. There are more specific, more granular, more detailed things within each one of those. Unnatural links from a site, for example, might involve someone who is selling links that pass PageRank. But that gives you an idea of the overall categories that we look at whenever we’re actually fighting spam.

The other thing that’s kind of interesting, if you scroll down the page and look a little bit, is that we give you several different graphs. We actually tell you, month by month, the actions that we’ve taken on spam: what types of actions and how many actions we took. If you look, you’ll see that the vast majority of what we tackle is what we classify as pure spam, or black hat spam. That just means it’s gibberish, stuff that anybody would be able to recognize if they’re sufficiently savvy, maybe machine generated. Hopefully it’s the sort of thing that anybody would look at and be like, “Wow, I hope I don’t see that in my search results.”

Something that you might not notice is that the next biggest category in recent years has been hacked sites. It’s kind of funny, because back in 2010 there was someone who wrote something like, “What has the webspam team been doing? We haven’t seen a lot of action from them recently.” And we were actually engaged in a pitched battle, hand-to-hand combat, with hacked sites, which, if you were just a regular SEO or even a regular black hat SEO back then, you might not have noticed as much. So it’s not the case that we were taking a break or taking things easy. We were working very hard on spam; it was just spam that most people hadn’t encountered. And we’re going to keep working on all those kinds of things. You can get those kinds of insights when you look through these graphs and see, OK, this is the history of the sort of stuff that Google has had to tackle in terms of spam.

What’s also interesting is that we’ve started to do more and more messaging over time. Now, we could probably do better and think about other ways to give more concrete, more actionable messages to webmasters, and we’re going to keep exploring there. But when you look at the milestones in terms of what we’ve done in terms of communication, it actually is pretty exciting, and you can see the volume spike up as we’ve started to give more and more information. At this point, for pretty much any direct action that the manual webspam team takes that affects your ranking, the webmaster will get a message about that. That’s really helpful, because at least you know there’s an issue and you can start to dig into it and investigate a little bit.

So it’s kind of interesting: I’m looking at one graph that says in January of 2013 we sent over 431,000 messages as a result of the actions that we took on the webspam team. The other thing that you should think about is the scale at which we’re operating. Remember, those are manual webspam actions, which then generated some sort of message to the webmaster. The idea that we could have a one-on-one conversation with 431,000 different owners of websites sort of shows you the scale that we’re operating at and why it’s hard. So far we haven’t figured out a way to have a one-on-one conversation with every single webmaster who wants to rank number one or has questions about a past or potential webspam action.

But what you can see below that is a graph that shows the reconsideration requests that have been submitted. For a random week in 2013, there were roughly 5,000 reconsideration requests. Bear in mind, this is interesting: over a month, 431,000 messages go out, and in a week we get 5,000 reconsideration request messages. If you take that week-long baseline and turn it into a month, call it about 20,000 reconsideration request messages that we handle during the month. What’s interesting about that is, if you do the math, it basically means that of all the people we alert that there has been a manual action, at least right now, less than five percent request reconsideration. That means that most of the time we’re killing spam and the spammers are not saying, “Hey, this is not right, I want to contest that.” They’re actually saying, “OK, you caught me. I’m going to move on and try to do it on a different URL where you won’t catch me next time.” So it’s kind of neat to take some of these numbers, compare them, and play a little bit with what kind of insights we can get from these kinds of graphs.

It also shows you the scale of the problem. If you have 20,000 people a month who want to talk to you about why they think their website should rank highly when we think that it has at least violated the guidelines, you can see the sort of difficulties we have in trying to talk to everybody. We’ll keep trying to do better, and we’ll keep trying to be more transparent. But I think it’s fantastic that we’ve got this How Search Works website. We’ve got some dashboards where you can see how things are going, and you can even see live examples of spam as they get thrown out. We’ll keep looking at ways to make things even better, but I think you’ll really enjoy the website. If you get a chance, check it out, dig in, and absorb some of the new information that’s available on the website. Thanks very much.
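
If you want to check the back-of-the-envelope math from the video yourself, here is a minimal Python sketch. It uses only the rounded figures quoted above. The variable names are my own, and the four-weeks-per-month conversion is the same rough approximation used in the talk, not an official Google calculation.

```python
# Rounded figures as quoted in the video; all names are illustrative.
LAUNCHES_2012 = 665            # algorithmic changes launched in 2012
DAYS_IN_2012 = 366             # 2012 was a leap year
MESSAGES_JAN_2013 = 431_000    # webmaster messages sent ("over 431,000")
REQUESTS_PER_WEEK = 5_000      # reconsideration requests in a sample week
WEEKS_PER_MONTH = 4            # rough week-to-month conversion (assumption)

# "Roughly two changes ... every single day"
launches_per_day = LAUNCHES_2012 / DAYS_IN_2012

# "Call it about 20,000" reconsideration requests a month
requests_per_month = REQUESTS_PER_WEEK * WEEKS_PER_MONTH

# Share of messaged site owners who go on to request reconsideration
reconsideration_share = requests_per_month / MESSAGES_JAN_2013

print(f"Launches per day: {launches_per_day:.2f}")
print(f"Reconsideration requests per month: {requests_per_month:,}")
print(f"Share requesting reconsideration: {reconsideration_share:.1%}")
```

With these rounded inputs the share comes out at roughly 4.6 percent, which is consistent with the “less than five percent” mentioned in the video.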