The Agony of Upgrading Fedora

Time for a rant.

A few years ago I decided it was finally time to learn Linux after having used DOS, Windows, and the Mac OS for years. My plan of attack was to run my own domain – savagexi.com, complete with a website, blogs, mail server, DNS server and DHCP server. And, if I ever found the time, MythTV.

Back then Fedora seemed like the best choice, and every year or so I upgrade the servers in the basement to the latest version. Upgrading Fedora always sucks, but my experience over the weekend warrants a big, resounding F.

When working on my own machines, I tend to go beyond flying by the seat of my pants to wanton recklessness. There’s nothing quite like a nasty error message (disk failure, missing partition, broken boot loader, misconfigured X server etc.) to focus the mind and teach you how things really work. Over the years, my reckless attitude has cost me only once, when a disk drive that was part of a Logical Volume gave up the ghost and screeched to a dreadful halt. And even then, I almost managed to rescue the data I needed off the remaining disk, finding out five minutes too late what I should have done instead of what I did. Since then, I’ve eschewed LVM and gone with nice, simple RAID 1 arrays (two disks that mirror each other, so if one breaks you can get your data from the other) to at least provide a modicum of redundancy.

The impetus for upgrading this time around was spam. I’ve always heard how wonderful greylisting is, and after one too many emails about navigating the love canal with confidence, it was time to take action. But of course I ran into a roadblock – setting up greylisting on Fedora 6 using a program called PostGrey didn’t work because it conflicted with SELinux (see, I’m a glutton for punishment, using SELinux on a home network). Of course that took some doing to figure out, since Fedora 6 doesn’t bother to actually log a message about the problem. So after reading the Fedora 8 release notes about how PostGrey and SELinux are best of buddies, I decided it was worth the pain to upgrade the email server.

From past experience, I was under no illusion it would be easy. But little did I suspect just how dreadful it would be. I decided to do the upgrade using a network install since I don’t have a DVD burner (yeah, yeah), which means the bytes are downloaded on demand across the Internet. It actually works pretty well if you pick a fast mirror, such as facebook. But when things go wrong you have to stop the installation, reboot the machine, Google around a bit, fix whatever the problem is, start the installation over and redownload the bytes. Remember the stop-reboot-fix-install sequence; I must have done it twenty times.

Day 1

Attempt #1. Things got off to a rousing start with Anaconda, the Fedora installer, complaining that the disk partitions on the two drives in the machine had to be labeled. Of course Anaconda should have just fixed the problem itself, but no, it is a remarkably unhelpful program.

Attempt #2. So stop-reboot-google around-fix the problem – and try again. This time Anaconda bitched about not finding any valid partitions, or in English, it couldn’t read the 2 hard-drives on the machine and thus couldn’t update them. Since I had just rebooted the machine, it stretched the imagination that Anaconda could be so dumb. But either way, back to the stop-reboot-fix-reboot-start cycle. Except this time there was no fix, since the machine booted just fine.

Attempt #3. Try again. This time I gave in to Anaconda when it offered the choice of wiping the drives clean, and hit the next button. I then quickly decided that was a bad move, and hit the back button. No luck. Although the installation hadn’t started yet (I was on a screen asking me some question I don’t remember), when I rebooted the machine I was greeted by the message GRUB. Mind you, not a grub prompt, just four capital letters that spelled GRUB. Ugh.

So it was now time to dig out the Fedora 6 rescue disk and run it. It couldn’t find any partitions either, and dumped me at a command prompt. From there I could run the ever exciting program fdisk, which lets you manage the partitions on your disk. fdisk is a nice, easy to use program, but it’s living on the edge – one false move and you can easily delete your data. From fdisk I noted that the machine had two hard drives, the first 80GB and the second 60GB. I also saw that the first drive (80GB) no longer had a partition table, thanks to Anaconda. Working backwards, I recreated its partitions. That was easy to do: since the two drives are part of a RAID array, I assumed the first partition on the first disk should be 60GB, matching the size of the second drive.

Attempt #4. Reboot and … get greeted by the ever-friendly GRUB message again.

Attempt #5. Reboot, but this time I hit the F12 key to open the Boot menu. I then noticed that the last choice in the boot menu was to start a utility disk, which miraculously opened to a grub prompt (it wasn’t until the next day I figured out how to run grub from the rescue disk, although I suspected it was possible). Of course I don’t know diddly about Grub, so it took another 30 minutes of Googling to figure out how to fix the problem (basically reinstall grub on the drive).

Attempt #6. This time, Anaconda had the decency to recognize my partitions and even offered me a chance to upgrade them. Hooray. Pushing my luck, I hit the next button, and watched Anaconda check the dependencies for all installed packages. 5%, 10%, 15%, 25%, 26%…and then nothing. Of course.

More Googling, and finally enlightenment. Turns out I was hardly the first to run into this show-stopper bug. If that wasn’t bad enough, the bug was still open 2 months after it was reported, and none of the mirrors had been updated (there has been a respin of the Fedora 8 CDs, but it’s hardly useful if I can’t get to it). So I read through the whole thread, and in one of the comments a Fedora developer had posted a link to an “update image” on his website. After a bit of research, I figured out what an update image is and how to use it.

Attempt #7. If at first you don’t succeed, try, try again. This time Anaconda got past the dependency checker, and amazingly enough finished. Success was near at hand. NOT.

Attempt #8. Reboot the machine and watch in horror as the dreaded GRUB message rears its ugly head.

So back to the rescue disk – which of course can’t mount any partitions (wtf?) and spits me out to a Linux prompt. Back to fdisk. And once again enlightenment – Disk #1 had once again lost its partition table. Fix it. Boy, this is getting tedious.

Attempt #9. Surely things are fixed by now. Reboot. And then watch in amazement as the computer tries to load Fedora Core 6, spits out pages and pages of errors, and unceremoniously dumps me to a login prompt. Of course the login prompt doesn’t work. WTF?

Ah – my favorite pastime, loading the rescue disk. Try fdisk again, everything looks ok. So the next obvious thing is that the RAID array is broken somehow. Go read about mdadm, which is the Linux program for creating and managing software RAID arrays. Using the wonders of Google, I found a very helpful article that explains how to rescue your RAID array. Following the instructions, I remount the array and discover that only Disk #2 is available. And then it dawns on me – somehow Anaconda only updated Disk #2, thus leaving Disk #1 with Fedora 6 in a very broken state. So a bit more Googling, and I learn how to add Disk #1 back into the array. And then nothing. Hmm. More Googling – how exactly do you know what a RAID array is doing?

That didn’t take long, and I stare in wonderment as something actually goes right – mdadm is happily resyncing Disk #1 with Disk #2 and says it will be done in a bit over an hour. At this point it’s 3:30 am, so I call it a day.
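For anyone asking the same question about what a RAID array is doing: the Linux kernel reports software RAID status, including rebuild progress, in the text file /proc/mdstat. Here is a minimal Python sketch of pulling the progress figures out of it. The sample text, the device names and the function name resync_progress are all made up for illustration (this is not the output from my machine), and the exact layout of the file can vary between kernel versions, so treat the regular expressions as assumptions.

```python
import re

# Hypothetical sample of /proc/mdstat during a rebuild; device names and
# numbers are invented for illustration.
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md0 : active raid1 sda1[2] sdb1[1]
      58593664 blocks [2/1] [_U]
      [=>...................]  recovery =  8.5% (4981120/58593664) finish=62.3min speed=14321K/sec

unused devices: <none>
"""


def resync_progress(mdstat_text):
    """Return {array: (percent_done, minutes_left)} for arrays being rebuilt."""
    progress = {}
    current = None
    for line in mdstat_text.splitlines():
        # A line such as "md0 : active raid1 ..." starts a new array section.
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # The progress line names the operation, a percentage and an estimate.
        status = re.search(
            r"(?:recovery|resync)\s*=\s*([\d.]+)%.*?finish=([\d.]+)min", line
        )
        if status and current:
            progress[current] = (float(status.group(1)), float(status.group(2)))
    return progress


if __name__ == "__main__":
    for name, (percent, minutes) in resync_progress(SAMPLE_MDSTAT).items():
        print(f"{name}: {percent}% done, about {minutes} minutes left")
```

On a live machine you would pass in the contents of /proc/mdstat instead of the sample string. An array that is not rebuilding has no progress line, so it simply does not show up in the result.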

Day 2

Attempt #10. After a good night’s sleep, it was time for more fun. The RAID array had successfully fixed itself overnight, so crossing my fingers I rebooted the machine. My heart sank when I was greeted with lines and lines of warnings about disk overflow errors. But wait, those were for the extra partition on Disk #1 (remember only the first 60GB are used in the RAID array, leaving 20GB free). Once the cruft had cleared, the machine managed to boot all the way to the Fedora 8 welcome screen. Hallelujah! Of course a fair bit was broken, including the DNS server, which meant at least a few hours in BIND hell (BIND and I simply don’t get along). But first things first.

However, I was worried about the disk overflow errors. For some reason, the kernel thought the 20GB partition was smaller than it really was. A bit of Googling turned up a couple of potential causes and solutions, but none worked. So back to fdisk. I figured the best course of action was to just delete the 2nd partition and recreate it.

Attempt #11. After recreating the problem partition, it was time to reboot the machine. And of course back to my old friend GRUB. I have no idea how I ended up back there, but clearly old flings die hard. But at this point I was an old hand at moving on, and rescue disk in hand, it was time to work some magic at the grub prompt. And to be on the safe side, I Googled around a bit more to see if somehow I had mistakenly configured GRUB with RAID and could kick this habit once and for all. Fortunately, I turned up this gem of an article and promptly changed things around based on its recommendations.

Attempt #12. And finally, one day later, a clean boot to Fedora 8 (minus of course BIND being unhappy).

Denouement

It beats me how any normal person manages to maintain their own Linux system – I only succeed through sheer determination and stubbornness. I realize that Fedora recommends a clean install with each new version, but to do that without losing your personal data and system configuration takes knowledge and effort beyond almost anyone who lives on this planet, including myself. So overall – I give Fedora an F for its horribly broken upgrade program.

And of course the kicker – PostGrey still doesn’t work with SELinux on Fedora 8. But at least in FC8 it’s polite enough to actually log an error. So anyone for creating and compiling their own policy files? Ah, I feel another rant coming along about SELinux.

  1. ToddY
    January 30, 2008

    Heh, thanks for sharing.

    Yep, TCO (Total Cost of Ownership) is why I gave up on Linux years ago. I just didn’t want to work that damn hard on it.

    And yes, I’m sure it’s better now than it was (thank god for google to find answers)
    but I’m also older and so am willing to put even less effort into it. 🙂

    I’m sure there are a ton of people/organizations who Linux is the perfect answer for their personality/needs/etc, but just reading your post reaffirmed why it isn’t for me anymore.

  2. Dylan
    January 30, 2008

    Wow – you could probably turn this into some kind of screenplay for a tech version of Indiana Jones…

    As with anything new- learning takes time, patience, and even sometimes Reading-TFM…

    I have been a Debian user for nearly 7 years now, and their package management system alone is worth it. Seriously, who in their right mind would want to install from scratch with each new version? That route sounds like a recipe for the kind of tragedy that plays out in your rant above.

    Getting started in linux can be hard, but there are a couple of tips which can help.

    1. join a local user group, or at least their mailing list- this continues to save me countless hours when working with something new from time to time…

    2. get a text-based install of debian or ubuntu. these are modern, well-supported distributions with a SANE package management system.

    3. spend some time looking at linux books at your local book store- maybe even buy one. many of the tips will save time and effort.

    Learning while doing is difficult- so make sure you put in some of the learning before attempting the doing.

    Good luck —

  3. James
    January 30, 2008

    Ubuntu inherits Debian’s efforts in making upgrades as easy as breathing. The saying about the old debian install program was that it didn’t matter it was horrible, since you only install Debian once, and upgrade forever after. My desktop’s Debian install from 2000 is still going strong after two major system upgrades (P3 to Athlon 2500+ to Core2Duo).

  4. Charlie
    January 30, 2008

    Hey Dylan and James,

    I’ve definitely considered moving over to Ubuntu. What’s stopped me is a) once Fedora is actually installed it works pretty well for me and b) it seemed like it would be more painful to switch than upgrade. Of course, this weekend’s experience makes that a bit laughable.

    And good tips Dylan – I’m a big fan of [Safari](http://safari.oreilly.com) which has lots of good Linux books online.

  5. Charlie Savage –
    January 30, 2008

    ToddY – So what did you end up with?

  6. Robert
    January 30, 2008

    Charlie,

    I know the Linux upgrade horrors – I tried to get my WLAN to work under Debian once (still doesn’t work but I’m satisfied with using wires now).

    Which leads me to the question: why didn’t you use the Danish boot disk?

    Robert.

  7. January 30, 2008

    It’s not just Linux. I’ve experienced similar issues with Plone and MacPorts (and probably others that I’ve suppressed the memory of). I think upgrades/updates are hard to make easy and to make work in all the configurations that people set up on their own systems. The funny thing is, I always wind up living on the edge by tweaking the dickens out of whatever I use. I should be amazed and thankful that I don’t have these kind of problems every time I do any upgrade.

  8. ToddY
    January 30, 2008

    Mac OS X for personal use, and Windows happens to be my current required dev platform for work.

    Something of an old article, but one that I found interesting at the time.

    http://www.paulgraham.com/mac.html

  9. Charlie Savage –
    January 31, 2008

    Hey Robert,

    Memories of the Danish boot disk kept floating through my mind. I can’t remember, though, how we got to that state – was I trying to set up a dual boot system or something?

  10. Charlie Savage –
    January 31, 2008

    Hi Allan,

    I agree – upgrades are hard – and system upgrades are harder. But like you, I try to forget my horror stories as soon as possible. But I figured I’d write one down for once!

  11. January 31, 2008

    It’s not that Fedora sucks; it’s that you admittedly have (what is to you) a “production” environment that you treat as if it is a development environment. Here’s what I’ve learned over time:

    * Keep your application data on partitions that are separate from your OS partition. If possible, keep the OS on a separate disk altogether. Drives fail, installations get borked. Eggs in one basket, etc.

    * Generate a ks.cfg that maps out *exactly* what you want to do prior to upgrading. At times Anaconda can suck but you can bypass most of the ugliness by using kickstart. You can specify drive layout, whether to wipe the partitions, etc. It’s the best aspect of Red Hat’s product that not enough people use. Every Fedora or RHEL install generates a working ks.cfg in root’s home dir. Use it as a jumping off point.

    * Download a copy of VMware server and test out said ks.cfg prior to upgrading. It’s free. Set up a VM on your desktop and play around with it until you’re comfortable. That way the computer you destroy will only be a virtual one.

    * Download the Fedora tree to one of the partitions you won’t be wiping and set your ks.cfg to install from that partition. Keep it on a local server and always be suspect of mirrors that may be a day or two behind. Okay so this one is optional 🙂

    You could easily have this problem with other Linux derivatives; don’t let people tell you different … even in Ubuntu. Upgrading from 6.06 to 6.10 broke a *lot* of working configurations. You just have to do the requisite planning and testing if your data is important to you.

  12. Charlie Savage –
    February 1, 2008

    Hey Brian,

    I treat it even worse than a development environment – it’s more like a teaching environment. So no doubt I’m partially at fault here.

    And thanks for the great info about ks.cfg, I’d never heard of it before. I’ll check it out.

    And last, I stick with Fedora because I do like it, and at this point, am fairly familiar with it.

  13. February 2, 2008

    It sounds like you’re pretty set on Fedora, but I hope maybe I can nudge you in another direction. My nudging is not because I think Fedora’s awful, but because my single greatest pearl of wisdom after using linux as my primary OS over the last 10 years is this: the single greatest factor in determining your enjoyment of linux is the package management system.

    If you haven’t tried an RPM based system (Redhat/Fedora/Suse/etc), an APT based system (Debian/Ubuntu/etc), and a Portage based system (Gentoo) then there’s a reasonable likelihood you could be happier somewhere else. Different people find different systems intuitive.

    It sounds to me like you might really enjoy the Gentoo way of looking at things. There isn’t any upgrading from version to version. Portage is a rolling upgrade system that treats updating as a diff between your system and the newest stuff. This can be a blessing and a curse, but you sure learn a lot.

    Don’t underestimate the value of shopping around. I used Redhat for years before I shopped around and it turned out to be one of my least favorite.

    All that aside, I’m glad you made it. That reboot-google sequence drives me _nuts_.

  14. Charlie Savage –
    February 2, 2008

    Hi Matt,

    Thanks for the recommendation. Shopping around strikes me as good advice. I’d like to get a bit of experience in the Gentoo and Debian world. One of the MapBuzz developers (Andres) is a big fan of Gentoo, and uses it as his main development machine.

  15. Andre P.
    February 22, 2008

    Upgrades of the latest versions of Ubuntu have gone very smoothly for me.

  16. March 13, 2008

    I guess I should consider myself lucky that the .isos of Fedora 8 are (apparently) corrupt (I tried from work and home on different machines, different OSs, burning software, different brands of blank disks, different mirrors). I’d gotten fed up with Ubuntu because of its crappy sound card support and was considering switching. Uh, maybe not.

  17. steve@waweru.net
    March 16, 2008

    They change Fedora so radically with each version I feel like ditching Linux. Change is good but has to make sense.
