Planning Hardware Upgrades

December 29, 2006

TrustNet is coming along very nicely, but I’m beginning to notice that I need some slight hardware improvements to have room for growth.

First of all, the current server is perfectly adequate and is expected to scale well. Here’s what TrustNet is currently running on:

  • Pentium 4 2.4GHz
  • 1GB ECC RAM, in dual channel
  • 2 x 200GB SATA drives
  • 1 x 250GB SATA drive
  • All 3 drives are in a RAID-1.

The actual problem is the case I picked for it. It is indeed silent, but the design is weird and non-standard. Worst of all, it uses a very unusual power supply, with fans that are dying and that I can’t easily replace. The supply itself seems to be impossible to find here as well, and the way it’s assembled makes it pretty much impossible to replace without removing the motherboard. And if that weren’t enough, the case design results in lots of dust getting sucked into it.

So the case is going to be the first thing to go, replaced with a rack case. Along with this, I’m also in the process of getting a rack to house it in. Having to replace fans on the server while it was running, and oiling a fan on a running power supply (eek!), sure made me start to really appreciate the advantages of rack-mounted hardware.

The other change I’m going to make is upgrading the RAM to 3GB total. The current amount is sufficient, but a bit tight. The database is getting larger by the day, and more RAM is always a good thing. The server runs Linux with the overcommit_memory parameter set to 2, which increases memory requirements.

The overcommit_memory kernel parameter can be changed by writing the new value to /proc/sys/vm/overcommit_memory. It determines how the kernel allocates memory. The default setting in most systems is 0. Here are the possible values:

  • 0: Allow programs to allocate more memory than exists, within a heuristic limit.
  • 1: Make malloc() always succeed.
  • 2: Allow allocating memory up to swap_space + fraction of physical memory defined in overcommit_ratio.
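
To check which of these modes a machine is using, the value can be read back from the same /proc file. The following is a minimal Python sketch, not part of the server’s actual setup; the function names and the wording of the descriptions are my own, and only the file path and the three values come from the text above.

```python
from pathlib import Path

# Path given in the text above; present on Linux systems.
OVERCOMMIT_PATH = Path("/proc/sys/vm/overcommit_memory")

# Short descriptions of the three documented values (wording is illustrative).
MODES = {
    0: "heuristic overcommit",
    1: "always overcommit",
    2: "strict accounting (swap + overcommit_ratio% of RAM)",
}


def describe_mode(raw: str) -> str:
    """Turn the raw file contents (e.g. "2\n") into a readable description."""
    mode = int(raw.strip())
    return f"{mode}: {MODES.get(mode, 'unknown mode')}"


def read_mode(path: Path = OVERCOMMIT_PATH) -> str:
    """Read the current setting from /proc and describe it."""
    return describe_mode(path.read_text())


if __name__ == "__main__":
    # Only attempt the read where the file exists (i.e. on Linux).
    if OVERCOMMIT_PATH.exists():
        print(read_mode())
```

Changing the value means writing to the same file as root, so this sketch deliberately only reads it.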

Why do values 0 and 1 exist? Because many programs allocate a large chunk of memory and then leave it unused. This happens because of fixed allocations that are larger than necessary “just in case”, allocations made for code paths that never actually run, memory leaks, etc. Allowing overcommit lets more programs run on the system.

The problem, of course, is what happens if an application asks for 768MB RAM on a 512MB system (which will succeed under values 0 and 1), then actually goes and uses it. What happens is quite bizarre: the infamous OOM Killer rises from the nether depths of the Linux kernel, evaluates all running processes with a magic formula, and decides which one of them to kill to free some memory. Note that the process being killed doesn’t have to be the one that made the system run out of memory! This means that a completely innocent process can get sacrificed to allow the memory hog to keep running. To add to the weirdness of the situation, the process gets a SIGKILL. That means that the program being terminated can’t do anything about it, and has no chance to do a graceful shutdown.

Since having random processes die suddenly is a very bad thing on a server, mine runs with overcommit_memory=2. Under this setting, the kernel will never allow allocating more memory than is available, and if that is attempted, malloc() will fail. Important things, like database servers, are written to deal with memory allocation failure in a sane way, so this is a much better alternative. That, however, means that less memory than usual is available, so I need to add more RAM to have a comfortable margin.
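
To get a feel for how much this setting restricts allocations, here is a rough Python sketch of the limit described for value 2: swap space plus overcommit_ratio percent of physical RAM. The function name is made up for illustration, the 1GB swap size is an assumption (the server’s real swap size isn’t stated here), and 50 is the kernel’s default overcommit_ratio. The real kernel calculation also excludes huge pages, which this ignores.

```python
def commit_limit_mb(ram_mb: float, swap_mb: float, overcommit_ratio: int = 50) -> float:
    """Approximate allocation limit under overcommit_memory=2, in MB.

    limit = swap + RAM * overcommit_ratio / 100
    (huge pages, which the kernel excludes, are ignored here)
    """
    return swap_mb + ram_mb * overcommit_ratio / 100


if __name__ == "__main__":
    ASSUMED_SWAP_MB = 1024  # assumption: swap size is not given in the post

    # Current 1GB of RAM versus the planned 3GB, both with the default ratio.
    for ram_mb in (1024, 3072):
        limit = commit_limit_mb(ram_mb, ASSUMED_SWAP_MB)
        print(f"{ram_mb} MB RAM -> about {limit:.0f} MB can be allocated")
```

With these assumed numbers, going from 1GB to 3GB of RAM raises the limit from about 1536MB to about 2560MB. On a running system the kernel reports the actual figure as CommitLimit in /proc/meminfo.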

I will attempt to perform the upgrade during a grid downtime, which will be next Wednesday if I have everything required by then.


TrustNet HUD 0.52 Released

December 28, 2006

I’ve had some problems with the latest release. Apparently something broke (not sure what exactly) and some scanners weren’t connecting to the network correctly. If that happens to you, and you have an older copy, attach it to get the update again. If not, IM me (Dale Glass) for a new copy.

Full changelog follows:


TrustNet HUD 0.50 Released

December 27, 2006

This version has a new feature: tool discovery.

The idea here is that there are many tools on SL that perform some action on an avatar, but require inputting the name by hand: security orbs and similar systems, weapons, etc. I created a protocol that allows the scanner to ask tools that speak the protocol to identify themselves. The scanner then presents a list of all the tools that were detected, and lets the user choose which one to use on the avatar that was clicked.

I have documented the protocol on the LSL Wiki.


TrustNet HUD 0.49 Released

December 26, 2006

Finally the next TrustNet HUD version is ready! This has been delayed quite a lot due to the SL Scripters Trade Show, but it was worth it. The show went very well, and I made contact with some very interesting people. I also got some valuable feedback.

The major news in this version is access to the website. Expect it to be slow (due to implementation issues) and buggy for now, but the basics of it work. I’d like to get feedback on it, so contact me with any bug reports or feature suggestions. Full changelog follows.



TrustNet Talk Log

December 22, 2006

I just did a presentation of the TrustNet system at the SL Scripters Trade Show. This is the log of the event.

Various junk such as login/logout notifications have been removed, and the log has been colorized a bit to make it nicer (although this could use some improvement still).
