The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

Which Free OS for Sharded Architecture

Not a Linux guy, but I'm trying to figure out which Linux distro (or FreeBSD, or ...) would make the most sense for a "sharded" architecture, something like (but not really) Google's.

The idea is rock solid, bullet proof, effectively scalable to any size by throwing more boxes and bandwidth at it.

Each box could end up being a DB, a Biz box, or a web server, etc.

The OS would have to be free as in, well, free.

Probably looking at PostgreSQL, Mono, Apache as primary components.
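One way to picture "throwing more boxes at it" is consistent hashing, where adding a box only moves the keys that the new box takes over. The following is a minimal sketch under stated assumptions, not a description of how Google or anyone else in this thread does it: the `HashRing` class, the box names, and the replica count are all hypothetical.

```python
import bisect
import hashlib


def _point(label: str) -> int:
    """Map a string to a stable integer position on the ring."""
    return int(hashlib.sha256(label.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent-hash ring. Box names are hypothetical."""

    def __init__(self, boxes, replicas=100):
        self.replicas = replicas  # virtual points per box, smooths the spread
        self._points = []         # sorted ring positions
        self._owner = {}          # ring position -> box name
        for box in boxes:
            self.add_box(box)

    def add_box(self, box: str) -> None:
        """Place `replicas` virtual points for this box on the ring."""
        for i in range(self.replicas):
            p = _point(f"{box}#{i}")
            self._owner[p] = box
            bisect.insort(self._points, p)

    def box_for(self, key: str) -> str:
        """Return the box owning the first ring point after the key's position."""
        p = _point(key)
        idx = bisect.bisect(self._points, p) % len(self._points)
        return self._owner[self._points[idx]]


ring = HashRing(["box-a", "box-b", "box-c"])
print(ring.box_for("user:1001"))
```

With this layout, calling `ring.add_box("box-d")` later reassigns only the keys that now fall to `box-d`; keys owned by the other boxes stay where they were.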
Curt LaMontagne
Monday, July 23, 2007
The problem is the software you need for administration, monitoring, distributing computation, and distributing the database. For example, Yahoo does a lot more than this proposal covers, and they have tens of millions of lines of custom code in all of their tools. :-)

Each of those high profile companies has plenty of custom work which they use behind the scenes.

But it all has to start somewhere, right?

One of the advantages of Linux is the large knowledge base for it, and many developers already target Linux by default, so you can reuse their tools or even hire them.

Most Linux distributions support a lot of customization. You could compile the kernel to include just what the servers need, remove unnecessary services, remove packages with commands like apt-get remove xxx, "remaster" some distros to create installation CDs/images that include your modifications, learn to install the image over the network "one click" style, clone live images, use Xen for virtualization, and so on.

But on top of that goes the software that needs to work across computers, which is not as popular or as readily available. There are certainly people working on such software, though.

The difference between Linux and FreeBSD is that the most popular Linux distros work with binary packages, while FreeBSD generally works with source code that is compiled on demand. Sure, Yahoo might have a lot of binary packages as well, and Yahoo has its own package manager that works "one click" style.

It all has to start with a first step, though.
Joao Pedrosa
Monday, July 23, 2007
Which Linux distro? It's a little tough to choose one, but I think Ubuntu could be used. Ubuntu is used mostly on the desktop and less on the server, but I use it on the server, as many people do.

Ubuntu is based on Debian and has commercial support in case you need it. Ubuntu releases new versions every 6 months or so, but some versions meant for enterprise adoption are supported for up to 5 years. The difference is that these enterprise versions come out only every 2 to 3 years or so. :-)

Ubuntu is free and as long as you maintain your own copy of the repository files (which is easy to do), you shouldn't have a major problem with it despite the difficulty of supporting your custom installations.

In comparison, Red Hat is commercial. Red Hat does provide a free distro called Fedora, which is under constant development just like Ubuntu, but it's not as well supported as the commercial version (Red Hat).
Joao Pedrosa
Monday, July 23, 2007
Check out:

Ubuntu Server Edition

The Server Edition, built on the solid foundation of Debian, which is known for its robust server installations, has a strong heritage for reliable performance and predictable evolution.
Joao Pedrosa
Monday, July 23, 2007
Gentoo or FreeBSD. Install once and mirror the install to the other boxes. Sure, it may take longer, but things will be set up how you want them.
Aspiring College Developer
Tuesday, July 24, 2007
Thanks, guys. Ubuntu Server and Gentoo both look like they'd be good candidates.
Curt LaMontagne
Tuesday, July 24, 2007
LiveJournal hashes based on roles to MySQL clusters hosted on Linux. You can learn more here:

Flickr has a very interesting sharding architecture:

Google of course makes extensive use of sharding and

I can imagine sharding built on top of a distributed infrastructure like hadoop:

And hibernate has a sharding system in beta:

Hopefully these sources will help.
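To make the idea of hashing by role to a cluster concrete, here is a minimal sketch. It is only an illustration under assumptions, not how LiveJournal, Flickr, or Hibernate actually route requests: the `CLUSTERS` layout, the host names, and the `shard_for` helper are all hypothetical.

```python
import hashlib

# Hypothetical layout: each role has its own list of shard hosts.
CLUSTERS = {
    "db": ["db-0", "db-1", "db-2", "db-3"],
    "web": ["web-0", "web-1"],
}


def shard_for(role: str, key: str) -> str:
    """Pick the host for `key` within the cluster that serves `role`."""
    hosts = CLUSTERS[role]  # raises KeyError for an unknown role
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Treat the first 8 bytes as an unsigned integer, then reduce it
    # modulo the number of shards in this cluster.
    index = int.from_bytes(digest[:8], "big") % len(hosts)
    return hosts[index]


print(shard_for("db", "user:42"))
```

Plain modulo hashing like this is easy to reason about, but changing the number of hosts in a cluster remaps most keys, so real deployments often keep a lookup table or use a scheme that limits how many keys move.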
Tuesday, July 24, 2007
Of the components you listed, Mono at first glance seems like the one that would be most affected by the distribution you choose. Disclaimer: that's just a guess based on the fact that it's fairly complicated, somewhat less widely adopted than the other apps, and currently under active development. If Mono has a favorite distro, I'd start looking there.

Given Mono's Novell connection, I'd have to guess SUSE as a starting point. Does the Mono site have any info on distro support?
D. Lambert
Tuesday, July 24, 2007
Avoid Gentoo if you're "not a Linux guy".  Gentoo's gimmick - everything is compiled to your specification - is aimed at tinkerers and hobbyists.  The right sysadmin can turn Gentoo into the world's best Linux server, but if you don't have that background, you're likely to find it a frustrating experience.
Wednesday, July 25, 2007
Honestly, if you're looking for a simple server distribution that doesn't require bleeding-edge software, why not go with Debian? It's rock solid, supports multiple architectures, and is easily maintained remotely.
Derek
Friday, August 03, 2007

This topic is archived. No further replies will be accepted.
