The New York Times and CNET have an insightful story today about the latest addition to Google's data center network, sometimes called the Googleplex. It is massive. Read this excerpt from the article:
The rate at which the Google computing system has grown is as remarkable as its size. In March 2001, when the company was serving about 70 million Web pages daily, it had 8,000 computers, according to a Microsoft researcher granted anonymity to talk about a detailed tour he was given at one of Google's Silicon Valley computing centers. By 2003 the number had grown to 100,000.
Today even the closest Google watchers have lost precise count of how big the system is. The best guess is that Google now has more than 450,000 servers spread over at least 25 locations around the world. The company has major operations in Ireland, and a big computing center has recently been completed in Atlanta. Connecting these centers is a high-capacity fiber optic network that the company has assembled over the last few years.
Last year I wrote about "Google data centers and dark fiber connections" explaining how Google was buying up unused "dark" fiber from failing telecom companies, and using it to tie together its massive data centers around the world. Danny Hillis says "Google has constructed the biggest computer in the world, and it's a hidden asset."
Chris Gulker has an interesting take on the story:
Markoff and Hansell peg Google's 'computer' at 450,000 servers in 25 centers worldwide. If that's true, and positing 900 million computer users in the world, then each Google server supports some 2,000 users.
2,000 users is no big deal for a web server. But it can be a heavy load for heavier, more traditional server-based networked applications. So one wonders how scalable Google's applications - Google Spreadsheets, Writely, et al. - might be as more customers flock to them, especially since Google will want to keep its search engine and attached ad-serving processes humming along at top speed as well.
This sounds about right. I remember at Napster we were able to support about 8,000 users per server, but that was more like a simple P2P search engine connecting users to search results. No heavy-duty computing. Web-wide search engines eat up enormous compute cycles. Web-based applications like maps, spreadsheets, and word processors will chew up cycles too. A plan of 2,000 users per server is probably adequate today, but Google will need more servers per user if application use takes off.
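The users-per-server figures quoted above are simple back-of-envelope divisions. A minimal sketch, using the estimates cited in the post (450,000 Google servers, 900 million computer users, 8,000 users per Napster server - none independently verified):

```python
# Back-of-envelope users-per-server ratios from the figures quoted above.
# All counts are estimates cited in the post, not verified numbers.

def users_per_server(total_users: int, servers: int) -> float:
    """Total potential users divided by server count.

    Note: this is total addressable users, NOT simultaneous load,
    which is the distinction raised in the comments below.
    """
    return total_users / servers

google = users_per_server(900_000_000, 450_000)  # Gulker's estimate
print(f"Google: ~{google:,.0f} users per server")
```

This kind of ratio says nothing about concurrent load; a server "supporting" 2,000 users may only ever see a small fraction of them at once.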
Microsoft stunned Wall Street last month when it announced a plan to spend up to $2 billion more than expected next year. Much of that money will go to build out a world class computing infrastructure to rival, and exceed, Google. The NYT story says "Microsoft's Internet computing effort is currently based on 200,000 servers, and the company expects that number to grow to 800,000 by 2011 under its most aggressive forecast, according to a company document."
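Microsoft's projection implies a striking growth rate. A quick sketch of the compound annual growth implied by going from 200,000 servers to 800,000 over roughly five years (the timeline is my assumption, reading "by 2011" against a 2006 baseline):

```python
# Implied compound annual growth rate (CAGR) for Microsoft's server count,
# per the NYT figures quoted above: 200,000 servers growing to 800,000.
# Assumes a 5-year horizon (2006 -> 2011), which is an interpretation.
start, end, years = 200_000, 800_000, 5
cagr = (end / start) ** (1 / years) - 1
print(f"Implied growth: ~{cagr:.1%} per year")
```

Quadrupling in five years works out to roughly 32% more servers every year, sustained, under the most aggressive forecast.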
This is a multi-billion dollar battle of the titans in a fight for the world's Internet users. There is no doubt that more and more applications and services will be hosted on the web and served to a browser. This infrastructure is what enables that to happen at the speed of light.
Don -
I think you touched on this... But, just the CURRENT crawling aspect of the operation requires far more cycles/processor/hour than providing the search results back to the browser EVER will. (Well, EVER is a strong word here and subject to change :)
If the hosted apps are adopted at anywhere near the expected rates, they still won't touch the horsepower required to keep up with the sprawl of the net over time, since the net is growing at a far faster rate than app adoption.
GB
Posted by: Gerald Buckley | June 14, 2006 at 08:31 AM
Hello Don,
Gulker says Google has 450,000 servers and there are 900 million computer users in the world, which means Google "supports" 2,000 users per server. Then he says that 2,000 users is no big deal for a server.
You go on to say Napster supported 8,000 users per server.
Obviously not all 900 million users will be using Google services, and certainly not all at the same time. Does the 2,000 users per server figure imply simultaneous use? Or is dividing the total number of computer users by the number of servers some standard web-server calculus? If not, as I assume, the 900-million metric is not very useful, since it does not indicate how many users are, on average, actually using Google's servers. And if 2,000 simultaneous users per server is no big deal, Google probably has a while to go before it gets there, right?
Just curious.
Posted by: Doug Cummings | June 14, 2006 at 12:31 PM
Considering Verizon's FiOS (45 Mbps to the home) and AT&T (25 Mbps) spending over $4 billion on its Project Lightspeed, bandwidth available to consumers will increase dramatically. With that in mind, I hope those data centers come online quickly! I switched to search.msn.com recently and I love it; the only major drawback I find is that it's a bit slower than google.com when I search the tail end.
Posted by: Charles Tran | June 14, 2006 at 03:00 PM
Wow, as soon as I posted my last comment, I came across this link: http://search.msn.com/s/jobs/openings/search%20jobs.htm
Apparently search.msn.com hand-crafts results in real time on the tail end. I guess that's why my queries have been slow when I search the tail end.
From the "Hand crafted results" job posting, it seems like they have a team of over 100 handcrafters responding to queries in less than 3.8 seconds... that's eye-opening.
Posted by: Charles Tran | June 14, 2006 at 03:07 PM
It'll be interesting to see which architecture scales better. This is basically the biggest test there is of the two companies' technologies. Microsoft probably has the most riding on this, since if its infrastructure doesn't scale well, it will reflect badly on its server software sales.
Posted by: Jason Kolb | June 14, 2006 at 06:51 PM
With SaaS taking off and a multitude of companies wanting to provide all sorts of web services, should we all be buying stock in data centers?
Posted by: bored | June 14, 2006 at 08:41 PM