an eddy in the bitstream

Day: February 5, 2006

Fedora Core 4 Upgrade, Day Two

First thing this morning the load average on louvin was up over 6. Bad news. As I watched, it steadily dropped back down to 0.05ish, which is normal. The only thing I could see running was about a dozen index.cgi processes under httpd, so perhaps someone was banging pretty hard on the web server at that particular moment. I seriously need to consider moving my site over to a mod_perl or PHP based site, to help performance.

Turned off SpamAssassin till I can research more how I might use it. It eats up a lot of memory doing nothing right now.

A couple more Apache config tweaks. The per-user public_html (~user) feature is turned off by default in version 2, so I enabled that for my ~karpet/ urls. Also needed to tweak my templates for perm links on my blog to explicitly include the index.cgi part of the url. Definitely time to move to a mod_perl solution. Maybe Catalyst would be up to the job.

I’m going to spend a couple hours trying to get new hard drives added. So I have rerouted traffic back to dellpc in the meantime. When I’m all done, I need to blow all the dust out of the Gateway and move it back downstairs to its basement home. I’ve been working on it up here in my office and it’s damn loud with 5 hard drives spinning and all those fans.

Several attempts with the Seagate 9g scsi3 ST19171WC (80pin) drives failed to work. I tried both the internal scsi bus and the Adaptec scsi2 PCI card, both with the SCA adapter (68pin->80pin and 50pin->80pin). In all cases the drives failed to spin up. Tried several jumper settings, especially the ME/MTR jumper (which toggles the spin up time for the drive). No luck. Finally just gave up. These drives were cheap, 7200rpm scsi drives I bought as a case of 10 for $100 about 4 years ago. They worked ok (with SCA adapters) in my old Mac G3. The spin up issue was real there too but I fixed it with some SCSI toolkit software that I can no longer recall (FWB?).

So I moved on. I may either junk the drives or find someone who can use them. If you can use them, send me mail.

I got the external scsi drive to work, an older Seagate 9gig scsi2 drive. 20MB/s isn’t that fast, but it’ll do. Created a partition with fdisk and then used this one-liner to create the ext3 filesystem:

mke2fs -j /dev/sdd1

And then edited /etc/fstab to mount at /opt2.
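
A minimal sketch of that fstab edit, in case it helps next time. The helper name add_fstab_entry is made up for illustration, and the "defaults 1 2" fields are an assumption borrowed from the other ext3 entries on this box. It refuses to add a device that is already listed:

```shell
#!/bin/sh
# Hypothetical helper: append an fstab entry unless the device is already listed.
# usage: add_fstab_entry FSTAB DEVICE MOUNTPOINT [FSTYPE]
add_fstab_entry() {
    fstab="$1"; dev="$2"; mnt="$3"; fstype="${4:-ext3}"
    if grep -q "^$dev[[:space:]]" "$fstab"; then
        echo "$dev already listed in $fstab" >&2
        return 1
    fi
    # fields: device, mount point, type, options, dump flag and fsck pass
    printf '%s\t%s\t%s\tdefaults\t1 2\n' "$dev" "$mnt" "$fstype" >> "$fstab"
}
```

Something like add_fstab_entry /etc/fstab /dev/sdd1 /opt2, followed by mkdir /opt2 and mount /opt2, would cover the step above. Trying it against a copy of fstab first seems wise.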

Don’t know yet what I plan to do with this extra 9gig, but I guess my overall goal is to run all my old scsi drives into dust so I can justify getting either (a) new bigger scsi drives (>36g) or (b) going the full IDE route with an IDE PCI card and some beefy 300g drives. This Gateway server has all kinds of scsi support, but the drives are just so expensive, and not really worth the cost now that IDE/ATA drives are so fast and cheap. At least for my applications (small low-traffic web/mail/file server). I’ve got one open IDE bus slot (slave on the primary bus) for when my next scsi drive dies. After that, I’ll look into the PCI card controller. In the last 5 years of this machine, it’s only lost 2 drives: one of the original 9g scsi drives, and a 40g IDE drive I was using as a backup drive. Those both crapped out in the last year, so I figure the other drives are due to die on me.

Would like to figure out how to send my named/bind logging to a separate log file. Just haven’t yet found the magic lines for my /etc/syslog.conf file (or maybe directly in /etc/named.conf).
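
One possible answer, as a hedged sketch: BIND 9 has its own logging statement in named.conf, so syslog.conf may not need touching at all. The function below only prints a candidate stanza. The channel name named_file, the log path, and the rotation numbers are all assumptions, and on a chrooted named the path would be relative to the chroot and must be writable by the named user:

```shell
#!/bin/sh
# Hypothetical helper: print a BIND 9 logging stanza that sends the default
# category to its own file. Nothing here edits named.conf.
# usage: named_logging_stanza [LOGFILE]
named_logging_stanza() {
    logfile="${1:-/var/log/named.log}"
    cat <<EOF
logging {
    channel named_file {
        file "$logfile" versions 3 size 5m;
        severity info;
        print-time yes;
        print-category yes;
    };
    category default { named_file; };
};
EOF
}

named_logging_stanza /var/log/named.log
```

The output could be pasted into /etc/named.conf and checked with named-checkconf before restarting named.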

Noticed at last reboot that my GRUB settings for ide-scsi are now deprecated. Apparently that’s a 2.6 kernel thing. So I’ll keep the old settings if I need to boot in 7.2 (2.4 kernel) and just update my new FC4 config. Here’s the new GRUB settings (thank you google):

default=0
timeout=5
splashimage=(hd1,1)/boot/grub/splash.xpm.gz
hiddenmenu
title Fedora Core (2.6.15-1.1830_FC4)
        root (hd1,1)
        kernel /boot/vmlinuz-2.6.15-1.1830_FC4 ro root=/dev/sdb2 hdd=ide-cd
        initrd /boot/initrd-2.6.15-1.1830_FC4.img
title Fedora Core (2.6.11-1.1369_FC4)
        root (hd1,1)
        kernel /boot/vmlinuz-2.6.11-1.1369_FC4 ro root=/dev/sdb2 hdd=ide-cd
        initrd /boot/initrd-2.6.11-1.1369_FC4.img
title Redhat 7.2
        root (hd1,0)
        kernel /boot/vmlinuz ro root=/dev/sdb1 hdd=ide-scsi
        initrd /boot/initrd-2.4.20-28.7.img

Oh, and I notice that RedHat conveniently aliases /etc/grub.conf to /boot/grub/grub.conf. How nice. I discovered that by accident when mindlessly typing

vi /etc/grub.conf

and then realized as it opened that that was the wrong location. Guess others have done what I just did. 🙂

Found out I was wrong about that extra NIC I had lying around. Just popped it in, rebooted and configured. Full 100Mbps, full duplex. So I left it in; now have two NICs. Will be fun to play with that in future (and one less card sitting in drawer).

[root@louvin network-scripts]# ethtool eth1
Settings for eth1:
        Supported ports: [ TP MII ]
        Supported link modes:   10baseT/Half 10baseT/Full 
                                100baseT/Half 100baseT/Full 
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full 
                                100baseT/Half 100baseT/Full 
        Advertised auto-negotiation: Yes
        Speed: 100Mb/s
        Duplex: Full
        Port: MII
        PHYAD: 32
        Transceiver: internal
        Auto-negotiation: on
        Supports Wake-on: pumbg
        Wake-on: d
        Current message level: 0x00000007 (7)
        Link detected: yes
[root@louvin network-scripts]# cat ifcfg-eth1
DEVICE=eth1
BOOTPROTO=static
BROADCAST=10.0.0.255
IPADDR=10.0.0.51
NETMASK=255.255.255.0
NETWORK=10.0.0.0
ONBOOT=yes
TYPE=Ethernet
[root@louvin network-scripts]# ifconfig eth1
eth1      Link encap:Ethernet  HWaddr 00:10:B5:0D:B0:25  
          inet addr:10.0.0.51  Bcast:10.0.0.255  Mask:255.255.255.0
          inet6 addr: fe80::210:b5ff:fe0d:b025/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:242 errors:0 dropped:0 overruns:0 frame:0
          TX packets:7 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:14984 (14.6 KiB)  TX bytes:528 (528.0 b)
          Interrupt:9 

All external traffic is routed to 10.0.0.50, so if I need to do any internal LAN transfers that might chew bandwidth, I could now use the alternate card for that. Not really a performance worry since I get such little traffic, but still, geek fun.

Ok. Time to clean this machine out and move it downstairs and get on with my life. Lots of housecleaning to do before my family returns this evening.

One more thing: I’ve been doing all this work to the melodious sounds of KCRW Music via iTunes radio. Great station in my hometown of LA. A really nice Johnny Cash 4-part tribute I heard yesterday as part of their pledge drive.

Update

Looks like the high load levels from Apache are due to web robots hitting my site in rapid succession. Seems to be related to the spammers who were abusing my webalizer logs in the referrers section. I have turned off the referrers reporting in webalizer; still would like to figure out a way to deny httpd response to those abusers.
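
A rough way to see who is doing the hammering, sketched under the assumption that the access log is in Apache's common or combined format (client address as the first field). The function name top_clients and the example log path are made up:

```shell
#!/bin/sh
# Hypothetical helper: list the busiest client addresses in an access log.
# usage: top_clients LOGFILE [HOW_MANY]
top_clients() {
    log="$1"; n="${2:-10}"
    # first field is the client address; count hits per address, busiest first
    awk '{ print $1 }' "$log" | sort | uniq -c | sort -rn | head -n "$n"
}

# Example (path is an assumption):
#   top_clients /var/log/httpd/access_log 20
# A persistent offender could then be dropped at the firewall, e.g.
#   iptables -I INPUT -s 192.0.2.10 -j DROP
```

Blocking by address is a blunt instrument, since robots tend to move around, but it at least shows whether the load comes from a handful of hosts.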

Learned a new command for the bash shell: hash -r (same as rehash under t/csh)
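
A small demonstration of why hash -r matters, using two throwaway directories and a made-up command called greet (both are assumptions for the demo). The shell remembers where it first found a command, so a newer copy placed earlier in PATH can go unnoticed until the table is cleared:

```shell
#!/bin/sh
demo=$(mktemp -d)
mkdir "$demo/a" "$demo/b"

# only b/greet exists at first
printf '#!/bin/sh\necho from-b\n' > "$demo/b/greet"
chmod +x "$demo/b/greet"
PATH="$demo/a:$demo/b:$PATH"
greet            # found in b; the shell remembers that location

# now a copy appears earlier in PATH
printf '#!/bin/sh\necho from-a\n' > "$demo/a/greet"
chmod +x "$demo/a/greet"
greet            # may still run the remembered b/greet

hash -r          # forget all remembered locations
result=$(greet)  # PATH is searched again, so a/greet is found
echo "$result"

rm -rf "$demo"
```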

Upgrade to Fedora Core 4

Or, how I spent one weekend in February when it was 10 degrees F outside.

Background

Since 2000 I have been running my own Linux server (hostname: louvin) to host peknet.com (and over time, several other domains). It’s been a terrific way to learn about Linux, hosting and system administration. I know what the big boys feel like when it comes to system failures, backup requirements, power and network outages, crashed drives and corrupted filesystems. Even though it’s on a small scale (1 server on a home LAN of 3-4 machines), I’ve run the same gauntlet that any other web hosting company has. The only difference being, I’ve never charged anyone for hosting.

Over the years I’ve learned a lot about what to do and not to do as a system administrator. There have been system crashes when I didn’t know enough about Linux so I just reinstalled the system. That probably happened 3 or 4 times in the first year or so. Downtime is always bad when others are depending on their website and email to keep working, so I’ve lost some sleep over it all. As I learned more and moved to a stable system, that has happened less often. I have been running RedHat 7.2 since 2001, and I think I’ve only reinstalled once (after the main / drive crashed).

In 2002 I decided I was getting out of the hosting business. I had learned what I wanted to learn and the responsibilities far outweighed the [non-fiduciary] payback. And I didn’t want anyone to suffer downtime on my watch, especially since it was all gratis. I didn’t kick everyone off all at once though, since I had made some commitments to folks. So as their domains came up for renewal, I found them nice homes (just like puppies).

So by 2005, I was only hosting 3 domains: peknet.com, wholefarmcoop.com, and jeansulivan.com. peknet and jeansulivan are personal sites, so I didn’t care about uptime for those necessarily (though I too like my email to keep flowing). wholefarmcoop.com, on the other hand, is a for-profit site that I voluntarily administer and design. The Coop is a group of central Minnesota farmers who sell their wares via community dropoff centers and local markets. Since their site is a mishmash of homegrown scripts and db it’s far easier to host it myself than rely on (even a friendly) webhosting company.

However, the time has come to move wholefarmcoop to a real webhosting company and put them on firmer ground. Since becoming their host in summer of 2003, I think we’ve had pretty close to 98% uptime, so that’s a pretty good number for a home-hosted commercial site. And they haven’t had to pay hosting charges. But still, they need some guaranteed uptime, and I need to redesign their site to make it more portable and user/admin friendly.

I want to rewrite the wholefarmcoop site to use PHP and MySQL, both because I need to develop those skills professionally and because those technologies will allow someone else to step in and take over when I decide it’s time to retire as volunteer webmaster.

Trouble is, louvin’s RedHat 7.2 install is so old that I can’t find updated RPMs any more for the newer versions of MySQL and PHP and Apache, and I want to use software that isn’t 3-5 years old. Being in the technology business means constantly running to keep from slipping behind too far on the development curve. I want to use Apache 2.x, MySQL 5 and PHP 5 (all the latest versions) and while I could compile all those myself, I think it’s time for louvin to get a facelift.

So with the wife and kid out of town this weekend, I am going to attempt to upgrade louvin from RedHat 7.2 (circa 2001) to Fedora Core 4 (FC4) (circa 2005, the latest release from RedHat). I know this will take the better part of a day in order to make sure I do it right, and it’s good to have un-interrupted time. These are things I’ve learned.

I’ve also learned that when doing system maintenance, I will enjoy it more if I’m not worried that all the websites are down and email is bouncing while I’m working. So this time I have a plan: I’ve got an old Dell laptop sitting on the shelf. A clean install of FC4 on the laptop (hostname: dellpc) and a little rsync magic, and I can reroute all incoming traffic to my domains to dellpc while I work on louvin. All mail will spool on dellpc and all sites will be accessible while I fiddle and tinker with the upgrade. Going from 7.2 to FC4 is a huge leap in technology terms (especially with Linux, which has come a tremendous way in the last 5 years). I’ll be switching from a 2.4 to 2.6 kernel (I just run the stock kernels). And making moves to Apache 2.x. I also want to replace majordomo with mailman while I’m at it, for the one mailing list I host [address omitted to keep the spammers at bay].

I’m also going to try documenting this whole experience (this document) both for my sake (so I can remember why/when I did certain things) and in the event it proves useful for any other chronic do-it-yourselfers out there tackling similar issues.

What needs to happen

Technical overview:

louvin
A Gateway 6400 server running RedHat 7.2. This guy’s getting upgraded to Fedora Core 4 (FC4).

IP
10.0.0.50
hostname
louvin
active services
http, ssh, smtp, svn, majordomo, imap, named/bind/dns. Also the backup server for my home LAN (using rsync over ssh).
specs
 BIOS Date 4/26/01
 single Pentium III 800MHz
 memory:
        640M (2G max)
        PC133-compliant, registered, parity, 
        ECC SDRAM (yeah, the expensive kind)
        2 of 4 slots empty
 2 PCI SCSI buses onboard
 2 IDE buses
 1 CD-ROM, 1 CD-RW
 4 hard drives:
   9 gig scsi - /opt    /dev/sdc
   9 gig scsi - /home   /dev/sda (also 1.5g of swap)
  36 gig scsi - /       /dev/sdb
 300 gig ide  - /backup /dev/hdb (only drive on the bus)

I have an Adaptec scsi2 PCI card inserted for external drives, which has never really been used (I’m storing it there as much as anything). I’ve also tried to get the 2nd internal scsi PCI bus to work with these cheap 9g SCSI3 drives I bought for $9/each. I hope to try again if I have time this weekend, and toss the drives if they don’t work. They’re full height (2 inches) which makes them heavy and useless if they don’t work with louvin.

dellpc
IP
10.0.0.20
hostname
dellpc
active services
http, ssh, smtp.
specs
Dell Latitude CPi laptop. Pentium II 266MHz (pretty slow). 190M RAM. 6g drive. Good enough for what I’m using it for: backup services while louvin is down.

Getting Started

I installed FC4 on dellpc early last week in anticipation. Installation was fine. I rsynced over my web data and set up Apache appropriately. Did the same with Postfix. Should be ready to hold the fort indefinitely while the upgrade happens.

One last rsync just before we switch the router config:

root@louvin 2% pwd
/opt/webdocs
root@louvin 3% ls
jeansulivan.com/  log/  lost+found/  peknet.com/  wholefarmcoop.com/
root@louvin 4% rsync -a -e ssh . dellpc:/opt/webdocs

Now make the switch at the router:

telnet router
en
show nat
set nat entry delete all
set nat entry add 10.0.0.20    # routes all traffic from outside to dellpc

Now open multiple terminals with: tail -f somelog to keep track of traffic to dellpc.

And in /var/log/httpd:

watch ls -lht

to monitor the size and access time for all logs at once

All traffic is now pointing at dellpc, so it’s safe to take louvin down for the upgrade. This is really the way to go! No time pressure, or at least, not as much.

Wanted to repartition the 36g drive to preserve the existing 7.2 install. The existing install only takes about 4g of the drive, so I wanted to resize it into 3 partitions: 6g for 7.2, 4g for swap (right now the swap space is on /dev/sda), and the rest for the new FC4 install. (More than 3 partitions gets me into extended partition land, which I like to avoid.) Assuming the upgrade goes well, I can reformat the 6g partition eventually and use it the next time I upgrade.

However, the SystemRescue CD wouldn’t boot on 3+ attempts. Might be problems with detecting the SCSI buses, since they use a non-standard SCSI driver. In fact, when I first got this machine, RedHat 7.0 was brand-new and it wouldn’t boot without a special driver. That special driver (Symbios LSI Logic Corp iirc) was later made part of the standard RedHat package in 7.2, which is why I upgraded to 7.2 asap. This was all in 2001; my memory is a little fuzzy since I also got married, bought a house and suffered along with all the world as the barbarian hordes invaded my country.

Tried to reboot into old 7.2 but that failed too. Then I realized I had added 3 new hardware pieces while the system was down: connected the 2nd internal scsi bus cable to the scsi2-scsi3 adapter I need for my cheap 9g drives; an old PCI ethernet card (which I think I will toss after this, since google tells me it’s only 10Mbps anyway); and connected my external scsi drive to the Adaptec PCI scsi2 card that’s been installed for years. I disconnected all 3 to be safe, then was able to boot as normal. So I re-tried the System Rescue CD and voila! it worked.

I’ve had really good luck with the System Rescue CD (http://www.sysresccd.org/). It contains QtParted, an OSS Partition Magic clone. I’ve used it to non-destructively re-partition existing NTFS (Windows) drives in order to install a dual-boot Linux system. In this case, however, I couldn’t resize the existing partitions because they contain ext3 (Linux) filesystems. So instead I took a tip from this helpful link and resized them directly from the command line:

root@cdimage % e2fsck -f /dev/sdb1
[....]
root@cdimage % resize2fs -f /dev/sdb1 6G
[....]

NOTE I had to use the -f options to force both. Otherwise, resize2fs kept complaining that I needed to run e2fsck -f (which I did, about 3 times with no errors).

That only made my filesystem smaller. The partition was still 36g. So I ran parted and resized it to 6g.

root@cdimage % parted
(parted) select /dev/sdb
(parted) resize
Partition number? 1
Start? [0.0156]?
End? [xxxxxxxx]? 6000
(parted) quit
root@cdimage % reboot

Hopefully that did it. If not, I always have a backup. 🙂

On to the FC4 install.

I did a fresh install, rather than an upgrade, since it will be easier to set up a clean installation than one based on the patched-together 7.2 system. It’s had so many RPMs manually installed over the years that I can’t remember them all anymore. I only need some basic services (ssh, httpd, postfix) to get me started, and now that yum is mature, I’ll be able to keep up to date more easily (one of the reasons I am upgrading).

Clicking through the keyboard setup, etc., till I’m asked to partition my hard drives. I choose the manual option. My resize of the sdb1 partition appears to be successful. I now have nearly 30g of free space on that drive. I add my 4g of swap and set the rest to ext3 and mount at /. I also delete the swap partition from the sda drive and set it to reformat as ext3 and mount as /home2 (since the other 7g partition on that drive is /home).

Click on through the standard install. I choose the server setup. Enable GRUB (which unfortunately will install on my backup IDE drive rather than my boot (sdb) drive; my current boot scenario is from GRUB on a floppy though, so this should at least make it easier to power up without babysitting and entering all the GRUB stuff manually at each boot … I know I know, there are probably GRUB experts out there who wince at that, but despite reading lots of GRUB documentation and trying many different things, I never could get GRUB to properly install on the boot partition on any drive).

I enable the firewall and allow ssh, http/s, and smtp by default, since that’s 99% of the network traffic this box will support.

NOTE: I disable the new SELinux security feature. This is probably a good feature but I have found it causes things not to work that I expect to (like Apache). So I just disable it and rely on old fashioned security via iptables, ssh and common sense.

I’m in America/Chicago central time (the shame! the Twin Cities never seem to be an option for setting the default timezone).

Set my root password, choose my packages (basic server plus some devel extras (gcc, etc.)) and we’re off to the races!

I get to eat lunch while the install process grinds on.

Start time: approx 1:25pm End time: approx 1:45pm

On reboot the normal bios info and memory check flashes by. But then just a prompt. No message, but obviously no boot disk was found. It boots fine from floppy and CD, so I wonder if the BIOS boot order is screwy. Has this been my GRUB issue all along?! Hard reboot and let’s find out.

I drill through the BIOS settings. Boot order is floppy, CD, hard drive. Looks right. Hard drives … all 4 show up. But wait! The IDE (/dev/hdb) drive is set last, after the 3 scsi drives. I reorder them and move the IDE drive first since that’s where GRUB is installed. Reboot. Now I get a ‘GRUB Geom error’. Well, at least GRUB is trying to load. That’s progress. But I need to get it booted in order to fix it, so pop in the floppy I’ve been using for years with GRUB on it.

Here are the GRUB commands I use to boot:

 root (hd1,1)        # second partition on second drive (base-0)
 kernel /boot/vmlinuz ro root=/dev/sdb2 hdd=ide-scsi
 initrd /boot/initrd
 boot
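
Since the base-0 numbering trips me up every time, here is a tiny sketch that spells out what a GRUB (legacy) device spec means. The function name grub_spec_explain is made up. Note that the drive number follows the BIOS drive order, not the Linux device names, so it can shift when the BIOS order changes:

```shell
#!/bin/sh
# Hypothetical helper: explain a GRUB legacy spec such as "(hd1,1)".
# Both numbers in the spec count from zero.
grub_spec_explain() {
    spec=$(printf '%s' "$1" | tr -d '()')   # "(hd1,1)" -> "hd1,1"
    drive=${spec%%,*}                       # "hd1"
    drive=${drive#hd}                       # "1"
    part=${spec##*,}                        # "1"
    echo "BIOS drive #$((drive + 1)), Linux partition number $((part + 1))"
}

grub_spec_explain "(hd1,1)"
```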

After booting, rebooting and googling lots, I finally figured out that the BIOS was not seeing the IDE drive geometry correctly. Rather than fix it, I tried reinstalling GRUB to see if that would correct it.

This documentation helped.

grub> setup (hd0)

Did the trick. Kinda. The GRUB boot parameters weren’t quite right. It was trying to boot from (hd2,1) instead of (hd1,1). Manually editing the GRUB boot params fixed it at boot time.

Another problem: while trying to convince the BIOS that the IDE drive was on the up and up, I switched the drive’s place on the bus from the slave position to the master position. So even though I could get GRUB to boot the FC4 install correctly, now it errors on boot because it can’t mount the IDE filesystem /backup.

This is where having gone to the trouble of preserving the 7.2 installation saves the day. I reboot into the 7.2 install on /dev/sdb1, mount the FC4 install at /fc4, and I can fix both problems: edit /fc4/boot/grub/grub.conf to pass GRUB the correct boot params, and edit /fc4/etc/fstab to fix the mount issue for the IDE drive.

Change grub.conf: change (hd2,1) to (hd1,1)
Change fstab:     change /dev/hdb1 to /dev/hda1
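
The two changes can also be scripted instead of made by hand in vi. This is a sketch, not what I actually ran. The helper name replace_in_file is made up, it keeps a .bak copy, and it treats the old string as a sed basic regular expression, so it only suits simple strings like these that contain no '#' or regex specials:

```shell
#!/bin/sh
# Hypothetical helper: replace every occurrence of OLD with NEW in FILE,
# keeping the original as FILE.bak.
# usage: replace_in_file FILE OLD NEW
replace_in_file() {
    file="$1"; old="$2"; new="$3"
    cp -p "$file" "$file.bak" &&
    sed "s#$old#$new#g" "$file.bak" > "$file"
}

# With the FC4 install mounted at /fc4 from the 7.2 system:
#   replace_in_file /fc4/boot/grub/grub.conf '(hd2,1)' '(hd1,1)'
#   replace_in_file /fc4/etc/fstab '/dev/hdb1' '/dev/hda1'
```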

Reboot and magic. It all boots correctly into Fedora Core 4. I’m ready to set up my system, and I no longer need to rely on my GRUB floppy disk. Happy days are here again.

The thing that strikes me at this moment is how quickly I was able to solve this issue. In less than an hour I had it fixed. Five years ago when I started it would have taken me days, probably. It’s obvious, I know, but experience really helps, and I didn’t know how much I’ve learned until trouble came along.

Now I have ssh access to louvin, so I can log in from my Mac (where I’m writing this and from where I’m monitoring dellpc). So I don’t need to switch keyboards anymore. And I can copy/paste what I’m doing here. Lord, I love terminal/ssh. Where would I be without the command line? Which of course leads to: Windows sucks.

So here’s the new drive set up:

 [root@louvin ~]# df -h
 Filesystem            Size  Used Avail Use% Mounted on
 /dev/sdb2              24G  1.5G   22G   7% /
 /dev/hda1             276G  9.4G  252G   4% /backup
 /dev/sda1             1.3G   35M  1.2G   3% /home2
 /dev/sda2             7.1G  2.3G  4.5G  34% /home
 /dev/sdc1             8.4G  1.2G  6.9G  15% /opt

Looks like I forgot to mount the old 7.2 partition anywhere. It’s at /dev/sdb1. I’ll do that now, since I’ll need access to my old config files (they’re also in /backup/louvin but it’ll be easier in the long run to have all partitions mounted by default).

Edited /etc/fstab and added this line:

 /dev/sdb1               /72                     ext3    defaults        1 2

Now mount it and show what we’ve got:

 [root@louvin ~]# mkdir /72
 [root@louvin ~]# mount /72
 [root@louvin ~]# df -h
 Filesystem            Size  Used Avail Use% Mounted on
 /dev/sdb2              24G  1.5G   22G   7% /
 /dev/hda1             276G  9.4G  252G   4% /backup
 /dev/sda1             1.3G   35M  1.2G   3% /home2
 /dev/sda2             7.1G  2.3G  4.5G  34% /home
 /dev/sdc1             8.4G  1.2G  6.9G  15% /opt
 /dev/sdb1             5.8G  3.4G  2.2G  61% /72

yum

Time to get things set up. First, I know that FC4 has had many RPMs updated since it was first released 4 or 5 months ago. yum is the preferred way of updating RPMs. I have recently done a clean FC4 install and full yum update/download for my work PC, and since it took a long time to download all the new RPMs, I backed them up to louvin once yum was done on that machine. So in /backup/alpc/yum_rpms I have all the updated RPMs I need to get this server completely up to date. Shouldn’t have to download more than a few RPMs that may have been released in the last couple weeks.

So I need to configure yum and tell it to look at my local cache instead of downloading the updates. I’ve found this site a helpful starting place for FC4 installs: http://stanton-finley.net/fedora_core_4_installation_notes.html

Looks like newrpms.sunsite.dk is down right now, so I’ll ignore that RPM repository for this setup.

[root@louvin tmp]# vi /etc/yum.repos.d/freshrpms.repo
[root@louvin tmp]# vi /etc/yum.repos.d/dries.repo
[root@louvin tmp]# vi /etc/yum.repos.d/newrpms.repo
[root@louvin tmp]# rpm --import http://freshrpms.net/packages/RPM-GPG-KEY.txt
[root@louvin tmp]# rpm --import http://dries.ulyssis.org/rpm/RPM-GPG-KEY.dries.txt
[root@louvin tmp]# rpm --import http://newrpms.sunsite.dk/gpg-pubkey-newrpms.txt
[....after sitting and waiting a minute with no response....]
^C
[root@louvin tmp]# rpm --import /usr/share/doc/fedora-release-*/*GPG-KEY*
[root@louvin tmp]# yum update
[........]

When yum starts downloading the headers for specific RPMs, I hit Ctrl-C and cancel the operation. This way I can symlink in my own cache of RPMs:

[root@louvin updates]# cd /var/cache/yum/updates
[root@louvin updates]# mv packages/ packages.orig
[root@louvin updates]# ln -s /backup/alpc/yum_rpms/packages packages
[root@louvin updates]# ls -l
total 2860
drwxr-xr-x  2 root root    4096 Feb  4 16:02 headers
lrwxrwxrwx  1 root root      30 Feb  4 16:03 packages -> /backup/alpc/yum_rpms/packages
drwxr-xr-x  2 root root    4096 Feb  4 15:56 packages.orig
-rw-r--r--  1 root root  397008 Feb  3 21:56 primary.xml.gz
-rw-r--r--  1 root root 2507776 Feb  4 16:00 primary.xml.gz.sqlite
-rw-r--r--  1 root root     951 Feb  3 21:56 repomd.xml

Then restart the update:

[root@louvin updates]# yum update

It says I need to download 242M of RPMs for 202 packages. But when I say ‘y’ at the prompt, it re-calculates based on the RPMs it finds in my local cache and only ends up downloading 40 RPMs. Saved myself some work there, and now my cache is up to date should I need to do this process over again for any reason.

Then while yum churns, I’ll start copying my old config files to the appropriate place and other admin chores.

init.d services

First I want to turn off all the services RedHat installs by default that I don’t need. All the NFS stuff, CUPS, Howl, Sendmail (I use Postfix instead), xfs, and any others I don’t need. When I started using RedHat I just left everything on, assuming I needed it because I didn’t know any better. Now I read the comments at the head of each /etc/init.d/ script to see what the service is doing, and end up turning most stuff off. Fewer services means better security and performance (and faster startup times). I still use ntsysv from the terminal; a classic that still works fine. I don’t bother removing RPMs or otherwise pruning files; I don’t mind the disk space, just the processing power. Who knows: I might decide to turn some things on some day, so leaving them installed seems wise.
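
Reading those header comments goes faster with a little helper. A minimal sketch: the function name show_header is made up, and it assumes the description sits in the unbroken run of # lines at the top of each script, which is how the RedHat init scripts I have looked at are laid out:

```shell
#!/bin/sh
# Hypothetical helper: print the leading block of '#' lines from a script
# and stop at the first line that is not a comment.
show_header() {
    awk '/^#/ { print; next } { exit }' "$1"
}

# Page through every init script header:
#   for f in /etc/init.d/*; do echo "== $f"; show_header "$f"; done | less
```

For anyone who prefers the command line to ntsysv, chkconfig --list shows the current settings and chkconfig NAME off disables a service.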

Apache

FC4 ships with Apache 2 by default, and I was running a copy of 1.3.x that I compiled myself. Thankfully, I’ve already figured out how to adjust my config files because I’ve installed 3 or 4 FC4 systems already. It’s pretty much a straight copy/paste. A new Apache is a big reason for this whole upgrade. Now I can run mod_perl2 and can easily get RPMs for other Apache modules.

It’s all about the RPMs.

[root@louvin conf]# /etc/init.d/httpd start
Starting httpd:                                            [  OK  ]

All good. Till DNS is up again and I reroute traffic back to louvin, I can confirm all is well by pointing my browser at http://louvin/ where I see the Apache test page.

Postfix

Postfix is a great MTA package. So much easier to understand and configure than Sendmail, ime. Once I figured out how to configure it to turn away spammers and others looking for open relays, I liked it even better. Instead of 1000s of bouncing messages, now I’m down to dozens.

I currently use SpamBouncer and procmail to filter out just my personal mail. I see FC4 comes with SpamAssassin which I have heard good things about. I may investigate that down the road for domain-wide control.

Here’s what I use in my postfix main.cf file to turn away spammers:

smtpd_helo_required = yes
strict_rfc821_envelopes = yes
smtpd_hard_error_limit = 2
address_verify_poll_count = 1
default_process_limit = 20
disable_vrfy_command = yes

That seems to help a lot. I got a lot of help via google and the postfix archives at arriving at that config. IIRC that was about a year ago. Maybe a little more. Having a kid has killed my memory cells.

To get postfix up and working, I need to un-install the default Sendmail and then copy over my old config files.

First stop the daemon:

[root@louvin postfix]# /etc/init.d/sendmail stop
Shutting down sendmail:                                    [  OK  ]
Shutting down sm-client:                                   [  OK  ]

Then remove the RPMs:

[root@louvin postfix]# yum erase sendmail

Copy the old config files and create the .db files that postfix uses:

[root@louvin postfix]# cp /72/etc/postfix/aliases .
[root@louvin postfix]# cp /72/etc/postfix/virtual .
cp: overwrite `./virtual'? y
[root@louvin postfix]# postmap virtual
[root@louvin postfix]# newaliases

Now the main.cf file is a little tricky. I was running a recently compiled version of Postfix under 7.2 so I know the config will work. I just want to preserve the originals just in case. I also want to pare the file down and remove all the helpful comments now that I know more about what I’m doing (years later…). So I take my old config file and strip out the comments, then symlink main.cf to my spare copy:

[root@louvin postfix]# grep -v '^#' main.cf | grep -e . > main.cf.spare
[root@louvin postfix]# mv main.cf main.cf.wordy
[root@louvin postfix]# ln -s main.cf.spare main.cf
[root@louvin postfix]# /etc/init.d/postfix start
Starting postfix:                                          [  OK  ]

Checking /var/log/maillog shows that all is well.

Users

Now that the two important services are happy, I need to update my /etc/passwd and /etc/shadow files to preserve all the ownership attributes and passwords for my users. This shouldn’t be too terrible. Just a little precaution: diff the existing and new files to make sure RedHat hasn’t changed the default UID/GID values for the system users (nobody, apache, etc.).

Looks ok. Some differences but nothing major. Just in case, I’ll just move the new files aside for the time being, in case I need to manually change an ID value or something.
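
A plain diff is noisy because every human user shows up in it. A more targeted check, sketched here with a made-up function name (compare_passwd), reports only accounts that exist in the new file but not the old one, or whose UID/GID changed:

```shell
#!/bin/sh
# Hypothetical helper: compare two passwd files by user name.
# usage: compare_passwd OLD_PASSWD NEW_PASSWD
compare_passwd() {
    old="$1"; new="$2"
    awk -F: '
        # first file: remember uid:gid for each user
        NR == FNR { ids[$1] = $3 ":" $4; next }
        # second file: report users the old file lacks
        !($1 in ids) { print "missing in old: " $1; next }
        # ... and users whose uid:gid differ
        ids[$1] != ($3 ":" $4) { print "changed: " $1 " " ids[$1] " -> " $3 ":" $4 }
    ' "$old" "$new"
}

# Example, before swapping the files:
#   compare_passwd /72/etc/passwd /etc/passwd
```

Any "missing in old" line is a system user the new install expects, and any "changed" line is a candidate for ownership trouble after the old files are copied in.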

[root@louvin etc]# mv passwd passwd.fc4
[root@louvin etc]# mv shadow shadow.fc4
[root@louvin etc]# cp /72/etc/passwd .
[root@louvin etc]# cp /72/etc/shadow .

and now the test:

[root@louvin etc]# cd ~karpet
[root@louvin karpet]# pwd
/home/karpet

Magic. My old users/groups are now restored. I love Unix/Linux. I haven’t said that enough today but it’s true. This whole process is made so much easier thanks to the great technology I get to use.

Now I need to test ssh for karpet. This is going to be a little tricky, I know, because I have now changed my sshd config/id and my old password-less login will likely fail. So tail -f /var/log/messages

OK. I can see there’s already a problem. My slick little passwd file change neglected to spot a new user. I get this error:

Feb  4 16:58:10 louvin sshd[345]: fatal: Privilege separation user sshd does not exist

when I try to log in via ssh. The solution is to add a line to my /etc/passwd file:

[root@louvin etc]# grep ssh passwd.fc4 
sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
[root@louvin etc]# grep ssh passwd
[root@louvin etc]# vi passwd
[root@louvin etc]# init.d/sshd restart
Stopping sshd:                                             [FAILED]
Starting sshd:                                             [  OK  ]

Now I can ssh in. But it asks for my password, so I need to re-set that up:

[karpet@cartermac:~]$ ssh louvin "echo `cat ~/.ssh/id_dsa.pub` >> ~/.ssh/authorized_keys"

Now I have to remove the old keys from my ~/.ssh/known_hosts file and all is well.
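
One wrinkle with the append above: running it twice leaves the same key in authorized_keys twice. A small sketch of an append that skips duplicates, assuming a scratch directory and a made-up key (add_key is a hypothetical helper, not part of OpenSSH):

```shell
#!/bin/sh
# Hypothetical helper: append a public key line to an authorized_keys
# file only if that exact line is not already there.
add_key() {
    key=$1
    file=$2
    mkdir -p "$(dirname "$file")"
    chmod 700 "$(dirname "$file")"
    touch "$file"
    chmod 600 "$file"
    # -q: quiet, -x: whole-line match, -F: fixed string (no regex)
    grep -qxF -- "$key" "$file" || printf '%s\n' "$key" >> "$file"
}

# Demo against a scratch directory with a made-up key
scratch=$(mktemp -d)
fake_key='ssh-dss AAAAB3NzaC1kc3MAAACBexample karpet@cartermac'
add_key "$fake_key" "$scratch/.ssh/authorized_keys"
add_key "$fake_key" "$scratch/.ssh/authorized_keys"   # second call adds nothing
key_count=$(grep -c . "$scratch/.ssh/authorized_keys")
echo "$key_count"
rm -rf "$scratch"
```

For the known_hosts cleanup, OpenSSH versions that support it can drop the stale entries with `ssh-keygen -R louvin` instead of a manual edit.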

That reminds me: need to update my /etc/motd — that personal touch.

[karpet@cartermac:~]$ ssh louvin
Last login: Sat Feb  4 17:06:53 2006 from 10.0.0.10
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Welcome to louvin, main server for peknet.com
  Contact karpet@peknet.com with questions.


be kind

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ karpet@louvin 1%

Beautiful.

It’s now 5pm and I’ve been at this for 8 hours. Time to get the essentials done and eat some supper.

Meanwhile yum has finished updating everything for me, including a new kernel. Perhaps I should reboot at this point and make sure everything works automatically as it should.

Good thing I tested. Needed to change UID for postfix user in /etc/passwd, and change /no/shell to /sbin/nologin in the same place. Looks like SpamAssassin (which I turned on via ntsysv) is writing to /var/log/maillog as spamd as well, so that’s good to know.

Another reboot to make sure there wasn’t something else I missed. At least I don’t have to enter all my GRUB statements each time. This upgrade is good if for nothing else than I fixed GRUB.

Ah. I had turned on tux (kernel level http server) to play with it, but of course it’s listening on port 80 which prevents Apache from starting. Turned off tux till I can figure out how to make it listen on a different port (or if I want to play with it at all…).

iptables. damn. I forgot that with the firewall on I need to explicitly open ports for named, etc. Created a little script to help me:

#!/bin/sh
#
# iptables wrapper


PORT=$1

iptables -I INPUT -p tcp --destination-port $PORT -j ACCEPT
iptables-save > /etc/sysconfig/iptables
iptables -L

SVN

For sanity’s sake, I created a svn repository for my local machine, so I can track changes I make to /etc. My existing svn root is /opt/svn, so as root:

cd /opt/svn
svnadmin create louvin
chmod 700 louvin
cd /etc
svn import . file:///opt/svn/louvin/etc -m init
cd ..
mv etc etc.orig
svn co file:///opt/svn/louvin/etc etc
cd etc
ls -l shadow /etc.orig/shadow
rsync -a /etc.orig/ .
ls -l shadow /etc.orig/shadow

That last trickery with the ls -l shadow and rsync is to restore the proper permissions on the files in /etc. shadow is a good test since it should be read-only by root. SVN automatically uses the existing umask (I think) but just changing my umask won’t help, since /etc files all have different perms per file. There are 100s of files, so I let rsync do the hard part of comparing before and after (the original with the new checked out version).

The whole point is to import the files and then check them back out for everyday use, but preserving the correct original permissions (which SVN loses in translation).
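
The permission-restoring idea can be tried on scratch directories first. A minimal sketch, assuming GNU chmod and GNU stat; restore_perms is a hypothetical helper, and unlike rsync -a it only copies mode bits, not ownership or timestamps:

```shell
#!/bin/sh
# Hypothetical helper: copy permission bits from every file in an
# original tree onto the same relative path in a working tree.
# Assumes GNU chmod (for --reference).
restore_perms() {
    orig=$1
    work=$2
    ( cd "$orig" && find . -type f ) | while IFS= read -r rel; do
        if [ -e "$work/$rel" ]; then
            chmod --reference="$orig/$rel" "$work/$rel"
        fi
    done
}

# Demo with scratch directories standing in for /etc.orig and /etc
scratch=$(mktemp -d)
mkdir "$scratch/orig" "$scratch/work"
echo secret > "$scratch/orig/shadow"; chmod 400 "$scratch/orig/shadow"
echo secret > "$scratch/work/shadow"; chmod 644 "$scratch/work/shadow"

restore_perms "$scratch/orig" "$scratch/work"
mode=$(stat -c %a "$scratch/work/shadow")   # GNU stat: octal mode
echo "$mode"
rm -rf "$scratch"
```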

I’ll leave /etc.orig around for the time being, till I’m satisfied I didn’t break anything.

I’m going to wait a little while before flipping the switch and making this machine live. The dellpc solution seems to be holding up just fine, and I’d rather wait till I’m good and ready.

Lingering Issues

damn. bind/named still isn’t working for some reason.

and the reason is… iptables. Even though I’ve opened port 53 for incoming requests, it gets a little more complicated because of how DNS works. Seems I need to open port 53 not only for tcp packets but for udp as well. That simple. Took me an hour on Google to find the answer…
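
DNS answers most queries over udp and falls back to tcp for things like zone transfers and large responses, so both need to be open. A sketch of how the wrapper script could cover both protocols; open_port_cmds is a hypothetical name, and it only prints the commands (a dry run) rather than touching the firewall:

```shell
#!/bin/sh
# Hypothetical variant of the wrapper: emit the iptables commands for
# one port on both tcp and udp. It only prints them, so nothing
# changes until the output is reviewed and run as root.
open_port_cmds() {
    port=$1
    for proto in tcp udp; do
        echo "iptables -I INPUT -p $proto --destination-port $port -j ACCEPT"
    done
}

cmds=$(open_port_cmds 53)
echo "$cmds"
```

Once the output looks right, it could be piped to `sh` as root, followed by the same `iptables-save > /etc/sysconfig/iptables` step as before.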

IMAP service with the new dovecot server wasn’t playing nicely with my Mozilla IMAP client. I think it’s because of how dovecot fails to map mailbox paths to the filesystem on the server. In any case, I have switched back to the older cyrus IMAP server I had used in the past. Seems a little flaky; could be my config.

/etc/group needed some migration attention. It helped to sort it by group name and then diff the old and new. Having files under SVN has already saved my ass a couple times, since it’s now so easy to say “now what did this file look like before?”
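
The sort-then-diff trick for the group file looks roughly like this. A sketch with made-up sample data; diff_groups is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: diff two group-format files after sorting both
# by group name, so ordering differences do not show up as noise.
diff_groups() {
    a=$(mktemp); b=$(mktemp)
    sort -t: -k1,1 "$1" > "$a"
    sort -t: -k1,1 "$2" > "$b"
    diff "$a" "$b"
    status=$?
    rm -f "$a" "$b"
    return $status
}

# Made-up sample data
old=$(mktemp); new=$(mktemp)
printf '%s\n' 'wheel:x:10:root,karpet' 'apache:x:48:' 'karpet:x:500:' > "$old"
printf '%s\n' 'apache:x:48:' 'wheel:x:10:root' 'karpet:x:500:' > "$new"

changes=$(diff_groups "$old" "$new") || true   # diff exits 1 when files differ
echo "$changes"
rm -f "$old" "$new"
```

In this sample the only line diff reports is the wheel entry, where the old file has an extra member. The reordered lines produce no noise.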

As of 9:15pm I’ve switched traffic over and am monitoring mail and web. All seems ok, with just a couple glitches with missing Perl modules I had custom installed. Still TODO:

  • copy over old crontabs
  • copy over old root user scripts, dir
  • verify updatedb/locate db is getting created
  • set up the Swish-e/SWISHED server again with our new Apache/mod_perl2 server
  • ??

Over and out.
