Search & Internet Marketing Manager SEO BLOG

Elias Kai Google-Kai.com

Archives Posts

Fresh Google Index

September 3rd, 2007 by elias.kai

A site known as Bodog.com has been offline for at least one week (from the 27th of August until the 3rd of September).
Bodog.com is still cached and indexed by Google, and it even shows up in the top search results.

I think Google should pay more attention to whether a site is actually online and take action when it is not.

http://www.google.com/advanced_search?

If you remove all the uninteresting parameters from the search URL, you’ll find that as_qdr is responsible for date restrictions. For example, here’s how to restrict a search for [IRAQ] to pages first seen by Google’s crawler in the past 24 hours (I would love to see this go down to the last 3 minutes, with a fresh Ajax design):

http://www.google.com/search?q=iraq&as_qdr=d

Note that you’ll only find new web pages and not pages that were cached and updated in the past 24 hours. That means you won’t find homepages from popular sites or other frequently-updated pages. If the date range is small, you’ll mostly find news and blog posts.

The amazing thing is that you can change the value of as_qdr to custom intervals. I will list some possible values of the as_qdr parameter:

d[number] – past number of days (e.g.: d10)
w[number] – past number of weeks
y[number] – past number of years

For example, http://www.google.com/search?q=iraq&as_qdr=d10 lets you search for pages that contain “Iraq” and were first seen by Google’s crawler in the past 10 days.
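The URLs above can also be put together programmatically. Here is a minimal Python sketch, assuming only the three as_qdr units listed in this post (d, w, y); the function name build_search_url and its arguments are made up for illustration and are not part of any Google API.

```python
from urllib.parse import urlencode

# Base search URL as used in the examples in this post.
GOOGLE_SEARCH = "http://www.google.com/search"

# Only the units listed in this post; other values may exist.
KNOWN_UNITS = ("d", "w", "y")


def build_search_url(query, unit=None, count=None):
    """Build a search URL, optionally restricted with as_qdr.

    unit  -- "d", "w" or "y" (days, weeks, years), or None for no restriction
    count -- optional positive integer, e.g. 10 for "d10"
    """
    params = {"q": query}
    if unit is not None:
        if unit not in KNOWN_UNITS:
            raise ValueError("unit must be one of %r" % (KNOWN_UNITS,))
        if count is not None and count < 1:
            raise ValueError("count must be a positive integer")
        # "d" alone means the past day; "d10" means the past 10 days.
        params["as_qdr"] = unit if count is None else "%s%d" % (unit, count)
    return GOOGLE_SEARCH + "?" + urlencode(params)


print(build_search_url("iraq", "d"))
print(build_search_url("iraq", "d", 10))
```

The two print calls reproduce the two example URLs from this post, so you can swap in your own query and interval.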

Defend Google Proxy Hacking

August 17th, 2007 by elias.kai

How can a third party remove your pages and site from Google’s index?

Beware of all .info sites.

With the introduction of “Big Daddy,” Google crawls from many different data centers; they also changed the algorithm substantially at the same time. According to Dan, “It appears that the changes include moving some of the duplicate content detection down to the crawlers.” This is problematic. In short:

1. The original page exists in at least some of the data centers.
2. A copy (proxy) gets indexed in one data center, and that gets sync’d across to the others.
3. A spider visits the original, checks to see if the content is duplicate, and erroneously decides that it is.
4. The original is dropped or penalized.

So … the problem is that if you flood Google with massive amounts of duplicate content, it exposes a vulnerability. Eventually the algorithm makes a mistake, and your content is no longer authoritative.
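To make the four steps easier to follow, here is a toy Python simulation. It is only a sketch under loud assumptions: it is not Google’s actual algorithm, the DataCenter class, the sync and crawl functions, and the example URLs are all hypothetical, and “duplicate detection” is reduced to comparing a hash of the page text.

```python
import hashlib


def fingerprint(text):
    """Hash of the page text; a stand-in for real duplicate detection."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class DataCenter:
    """Hypothetical data center: maps a content fingerprint to the URL
    it currently treats as the authoritative copy."""

    def __init__(self, name):
        self.name = name
        self.index = {}


def sync(source, targets):
    """Step 2: naive sync, where the source's entries overwrite the targets'."""
    for dc in targets:
        dc.index.update(source.index)


def crawl(dc, url, text):
    """Crawler-side duplicate check against one data center's index."""
    fp = fingerprint(text)
    owner = dc.index.get(fp)
    if owner is not None and owner != url:
        return "dropped"  # Steps 3 and 4: judged a duplicate of 'owner'
    dc.index[fp] = url
    return "indexed"


content = "original article text"
original = "http://example.com/article"
proxy = "http://proxy.example.info/fetch?u=example.com/article"

dc_a, dc_b = DataCenter("A"), DataCenter("B")

first_visit = crawl(dc_a, original, content)  # Step 1: original is indexed in A
proxy_visit = crawl(dc_b, proxy, content)     # Step 2: proxy copy is indexed in B
sync(dc_b, [dc_a])                            # ...and synced across to A
revisit = crawl(dc_a, original, content)      # Steps 3-4: original now looks like the copy

print(first_visit, proxy_visit, revisit)
```

In this toy model the original is “dropped” on the revisit only because the sync step lets the proxy URL overwrite it as the owner of the content, which is the kind of mistake the steps above describe.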
