Google Operating System Unofficial news and tips about Google


Tuesday, 17 July 2007

Finding Related Web Pages

Posted on 10:25 by Unknown
Google is the only major search engine that offers a "similar pages" feature, but few people use it. Launched in September 1999 as GoogleScout (to scout: to explore, to investigate), the feature shows around 30 web pages related to a search result.

For example, to find sites related to Google Reader, click on the "similar pages" link placed after the snippet and you'll discover other feed readers, Google Reader's blog, information about feeds, and blog platforms.


The related pages are generated by analyzing the link structure of the web. A patent from 2000 explains how this feature works: "a first set of hyperlinked documents that have a forward link to the selected hyperlinked document is provided. Additionally, a second set of hyperlinked documents that are pointed to by the forward links in the hyperlinked documents in the first set is provided. A value is assigned to each forward link in each of the hyperlinked documents in the first set, with the value being reduced for a forward link if there are multiple hyperlinked documents from the same host as the hyperlinked document that includes the forward link. A score is generated for each hyperlinked document in the second set according to the values of the forward links pointing to the hyperlinked document. Accordingly, a list of related hyperlinked documents is generated from the second set according to the score of the hyperlinked documents."
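The steps quoted from the patent can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Google's actual implementation: the `links` dictionary, the `related_pages` function and the example URLs are hypothetical, and the patent excerpt only says that a link's value is "reduced" when several linking pages share a host, so the rule used here (the pages of one host split a single vote) is a guess. The default of 30 results mirrors the "around 30 web pages" mentioned above.

```python
from collections import Counter, defaultdict
from urllib.parse import urlsplit


def related_pages(links, selected, top_n=30):
    """Rank pages that are co-linked with `selected`.

    `links` maps each page URL to the set of URLs it links to.
    Returns a list of (url, score) pairs, best first.
    """
    # First set: pages that have a forward link to the selected page.
    first_set = [page for page, targets in links.items() if selected in targets]

    # Count how many first-set pages come from each host.
    pages_per_host = Counter(urlsplit(page).netloc for page in first_set)

    # Second set: everything the first set links to, scored by link values.
    scores = defaultdict(float)
    for page in first_set:
        # Assumed reduction rule: pages from one host share a single vote.
        value = 1.0 / pages_per_host[urlsplit(page).netloc]
        for target in links[page]:
            if target != selected:
                scores[target] += value

    # Highest score first; ties are broken alphabetically by URL.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Hypothetical link graph: two pages on blog-a and one on blog-b link to the reader.
READER = "http://reader.example/"
BLOGLINES = "http://bloglines.example/"
NETVIBES = "http://netvibes.example/"

example_links = {
    "http://blog-a.example/post1": {READER, BLOGLINES, NETVIBES},
    "http://blog-a.example/post2": {READER, BLOGLINES},
    "http://blog-b.example/feeds": {READER, NETVIBES},
    "http://blog-c.example/misc": {BLOGLINES},  # no link to READER, so ignored
}

print(related_pages(example_links, READER))
```

In this toy graph, Netvibes ranks above Bloglines even though both are linked from two pages: both Bloglines links come from the same host, so together they count as one vote, while Netvibes gets half a vote from blog-a and a full vote from blog-b.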

Basically, you're expecting that many sites that link to Google Reader will also link to its competitors and to related information. This is very similar to Amazon's recommendations: "customers who bought this item also bought".

How can you use this feature?
  • find the competitors of a company (e.g.: DaimlerChrysler)
  • find similar music (e.g.: Regina Spektor)
  • find sites similar to one you like (e.g.: Digg)
  • explore a domain (like machine learning)
  • refine the search results (search for bass and find the related pages for the first result about the fish)

Unfortunately, Google's implementation has a major flaw: because many pages link to popular sites like Blogger, Flickr and StatCounter, you'll sometimes find these sites in the list of related links even if they're completely unrelated. Greg Linden calls this "the Harry Potter problem" when talking about Amazon's recommendation system: "The first version of similarities was quite popular. But it had a problem, the Harry Potter problem. Oh, yes, Harry Potter. Harry Potter is a runaway bestseller. Kids buy it. Adults buy it. Everyone buys it. So, take a book, any book. If you look at all the customers who bought that book, then look at what other books they bought, rest assured, most of them have bought Harry Potter."
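A toy example makes the flaw easy to see. The sketch below uses plain co-link counting, which is a simplification and not Google's or Amazon's actual method; the `co_linked` function and all the site names are made up for illustration. Every blog in the sample links to the same stats widget, so the widget comes out on top for two unrelated topics.

```python
from collections import Counter


def co_linked(links, selected):
    """Count how often each site is linked from pages that also link to `selected`."""
    counts = Counter()
    for targets in links.values():
        if selected in targets:
            counts.update(t for t in targets if t != selected)
    return counts


# Hypothetical link graph: every blog embeds the same popular stats widget.
links = {
    "knitting-blog-1": {"yarn-shop", "knitting-forum", "statcounter"},
    "knitting-blog-2": {"yarn-shop", "statcounter"},
    "chess-blog-1": {"chess-club", "opening-database", "statcounter"},
    "chess-blog-2": {"chess-club", "statcounter"},
}

for page in ("yarn-shop", "chess-club"):
    print(page, co_linked(links, page).most_common())
```

For both the yarn shop and the chess club, "statcounter" gets the highest count, ahead of the sites that are actually on topic.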

So even if GoogleScout doesn't work well all the time, it's a great tool for research and serendipitous discoveries (add a bookmarklet to your browser to use this feature for any site you visit). Another way to find related pages is to search for a site in Google Directory and to click on its category. Similicio.us uses the bookmarks from del.icio.us to complete this sentence: "people who bookmarked this site also bookmarked...", while the untrustworthy Alexa fills in the blanks for "people who visit this site also visit...". Google also uses similar ideas to provide recommendations based on your search history.
Posted in Nostalgia, Web Search | No comments