Friday, March 1, 2013

Grub re-install

Problem
All I get is "grub" or a "grub>" prompt when I try and boot

Solution
You have to install GRUB on the MBR (Master Boot Record). To do this, follow these steps:

READ THIS FIRST !!

First you will need to know what Grub calls the hard disk drive partition that holds the required files.
A quick aside :- There are three ways of defining hard disk drives and their partitions. The first, that you're most probably familiar with, is Windows/MS-DOS letters (such as C: or D: ).

The second is Linux's method, which is to give the first device (hard disk drive or CD-ROM drive) on the first IDE (ribbon) cable the name /dev/hda, the second device (hard disk drive or CD-ROM drive) on the first cable is called /dev/hdb, the first device (hard disk drive or CD-ROM drive) on the second cable is called /dev/hdc and the second device on the second cable is called /dev/hdd .

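So, you've got the names hda, hdb, hdc, and hdd for all four of your possible IDE attached devices, although you probably only have a hard disk on hda and a CD-ROM / DVD drive on hdb. SCSI and SATA disks are named /dev/sda, /dev/sdb and so on instead, which is why the fdisk output further down shows /dev/sda.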

The hard disks are, probably, cut up into partitions that are numbered from one. So the first partition on the first hard disk attached to the first IDE cable will be called /dev/hda1, while the second will be called /dev/hda2 and, for further example, the fifth partition on the second hard disk on the second IDE cable would be called /dev/hdd5.
Get the idea?

Now to the third way of naming a hard disk and partition. Grub uses the letters "hd" followed by a number starting at zero to name the hard disks. To denote a particular partition, a comma and a further number, again starting at zero, are added. All of this is surrounded by parentheses ().

So to Grub, the first hard disk drive attached to the first IDE/SCSI cable is called (hd0) , and to specify the first partition on that drive you would need to type (hd0,0)
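
The translation between the Linux names and the Grub names can be sketched as a small shell function. This is only an illustration: linux_to_grub is a made-up helper name, and it assumes that the BIOS lists the hard disks in the same order as the Linux letters and that every earlier letter really is a hard disk (not a CD-ROM drive). On a real machine, check the result against what the find command reports.

```shell
# Hypothetical helper: turn a Linux disk/partition name into Grub notation.
# ASSUMPTION: letter a maps to hd0, b to hd1, and so on. This holds only
# when the BIOS disk order matches the Linux letters.
linux_to_grub() {
    printf '%s\n' "$1" | awk '
        /^\/dev\/[hs]d[a-z]([1-9][0-9]*)?$/ {
            # Character 8 is the drive letter, anything after it is the partition.
            disk = index("abcdefghijklmnopqrstuvwxyz", substr($0, 8, 1)) - 1
            part = substr($0, 9)
            if (part == "") {
                printf "(hd%d)\n", disk
            } else {
                # Linux counts partitions from one, Grub from zero.
                printf "(hd%d,%d)\n", disk, part - 1
            }
            ok = 1
        }
        END { exit !ok }'
}

linux_to_grub /dev/hda1    # prints (hd0,0)
linux_to_grub /dev/hda     # prints (hd0)
```

The function returns a non-zero status for anything that does not look like /dev/hdXN or /dev/sdXN, so it can be used safely in a script.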

(First of all, enter your BIOS setup and, in the Boot Sequence window, choose to boot from the CD-ROM first.) Once the server is up from the live CD, you need to find out which partition contains the boot directory.
Run the command below at the grub> prompt to find it.

grub> find /boot/grub/stage1

and Grub will return the hard disk name and partition that holds that file (see above for how Grub names these). However, if you have a separate /boot partition, remove /boot from the above command.

grub> find /grub/stage1
(hd0,0)

The output of this command, (hd0,0) here, is the name of the hard disk and partition that holds the stage1 file.
So (hd0,0) tells you that your /boot/ folder lives on the first partition of the first hard disk on the first IDE cable. If it had returned (hd0,1) instead, that would have shown you that the /boot/ folder lived on the second partition of the first hard disk on the first IDE cable.
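
Going the other way, from the name that find returns to a Linux device you can mount, might look like the sketch below. grub_to_linux is a hypothetical helper, and the optional second argument (hd for IDE, sd for SCSI/SATA) is my own convention. The same caveat applies as for any such mapping: it assumes the BIOS disk order matches the Linux letters.

```shell
# Hypothetical helper: turn a Grub name such as (hd0,0) into a Linux device.
# $1 = Grub name, $2 = Linux prefix, "hd" (default) or "sd".
# ASSUMPTION: hd0 is the disk with letter a, hd1 the one with letter b, etc.
grub_to_linux() {
    printf '%s\n' "$1" | awk -F'[(),]' -v prefix="${2:-hd}" '
        /^\(hd[0-9]+(,[0-9]+)?\)$/ {
            # With this field separator $2 is "hdN" and $3 is the partition.
            letter = substr("abcdefghijklmnopqrstuvwxyz", substr($2, 3) + 1, 1)
            part = ($3 == "") ? "" : $3 + 1
            printf "/dev/%s%s%s\n", prefix, letter, part
            ok = 1
        }
        END { exit !ok }'
}

grub_to_linux '(hd0,0)' sd    # prints /dev/sda1
grub_to_linux '(hd0,1)'       # prints /dev/hda2
```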

Once we have found this out, we need to pass it to Grub in the next commands.

The root command tells Grub where to base all of its file path searches from. We take the hard disk and partition, given by the find command and use it with the root command, like so :-

grub> root (hd0,0)

Next comes the kernel command. This tells Grub the name of the kernel (core part of Linux) that you want to load when, later, you do the boot command.

As there is no way that you can remember the full name of the kernel, you can use the tab key facility in Grub (the tab key is that one with two opposite facing horizontal arrows that sit above the Caps Lock key on most keyboards).
HOLD ON !!

Let me explain the root command with an example. If I do not give root (hd0,0) and press tab after typing kernel /, Grub offers no completions and returns an error instead. This happens because we have not told Grub a base hard disk and partition where it can look for kernel files.

grub> find /grub/stage1
(hd0,0)

grub> kernel /
Error 12: Invalid device requested
Now I give root (hd0,0) to Grub. Grub can now suggest options, as it knows where to look for possible files.
grub> root (hd0,0)
Filesystem type is ext2fs, partition type 0x83

grub> kernel /
Possible files are: grub symvers-2.6.9-100.ELsmp.gz boot symvers-2.6.9-89.35.1.ELsmp.gz vmlinuz-2.6.9-89.35.1.ELsmp initrd-2.6.9-100.ELsmp.img grub.OLD
System.map-2.6.9-023stab053.2-enterprise System.map-2.6.9-89.35.1.ELsmp initrd-2.6.9-89.35.1.ELsmp.img config-2.6.9-89.35.1.ELsmp System.map-2.6.9-100.ELsmp message
config-2.6.9-100.ELsmp initrd-2.6.9-023stab053.2-enterprise.img vmlinuz-2.6.9-100.ELsmp lost+found message.ja vmlinux-2.6.9-023stab053.2-enterprise
vmlinuz-2.6.9-023stab053.2-enterprise

That's enough of the root stuff. Carry on below.
Load the kernel. If you don't know which kernel the server was running, follow the steps below.
From the live CD shell, mount the hard drive partition that holds /boot. If /boot is a separate partition, mount that; otherwise mount the / partition. In the example below, it's a separate partition.
[root@vps9 grub]# fdisk -l

Disk /dev/sda: 139.9 GB, 139978604544 bytes
255 heads, 63 sectors/track, 17018 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          65      522081   83  Linux
/dev/sda2              66         587     4192965   82  Linux swap
/dev/sda3             588        1109     4192965   82  Linux swap
/dev/sda4            1110       17018   127789042+   5  Extended
/dev/sda5            1110        1370     2096451   83  Linux
/dev/sda6            1371       17018   125692528+  83  Linux
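
In a listing like this, fdisk puts a * in the Boot column of the active (bootable) partition. If you would rather not pick it out by eye, a rough sketch is to filter the listing with awk. boot_flagged is a made-up function name, and the here-document below simply repeats a few lines of the sample output so the snippet can be run anywhere.

```shell
# Hypothetical helper: print every partition whose Boot column holds "*".
# Reads fdisk -l style output on stdin.
boot_flagged() {
    awk '$1 ~ /^\/dev\// && $2 == "*" { print $1 }'
}

# On a live system you would run:  fdisk -l /dev/sda | boot_flagged
# Here we feed it part of the sample listing instead.
boot_flagged <<'EOF'
Device Boot Start End Blocks Id System
/dev/sda1 * 1 65 522081 83 Linux
/dev/sda2 66 587 4192965 82 Linux swap
/dev/sda5 1110 1370 2096451 83 Linux
EOF
# prints /dev/sda1
```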

How do you confirm which is the /boot partition? Look for the * in the "Boot" column of the fdisk output. Now mount it.

mkdir /oldboot
mount /dev/sda1 /oldboot
cat /oldboot/grub/grub.conf

Note the lines for the kernel that is loaded by default. E.g.:

title Virtuozzo (2.6.9-023stab053.2-enterprise)
root (hd0,0)
kernel /vmlinuz-2.6.9-023stab053.2-enterprise ro root=LABEL=/ console=ttyS0,57600 console=tty debug
initrd /initrd-2.6.9-023stab053.2-enterprise.img

Go back to the grub prompt and pass the kernel line

grub> kernel /vmlinuz-2.6.9-023stab053.2-enterprise ro root=LABEL=/

NOTE:-
If the kernel fails to boot with "root=LABEL=/", find the / partition and supply it directly, e.g. root=/dev/sda5. You can check which partition carries the / label by using e2label.
[root@vps9 grub]# e2label /dev/sda5
/
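
Swapping the label for the device in a kernel line copied from grub.conf is a one-line sed substitution. The sketch below wraps it in a made-up function, use_root_device, and only prints the edited line. You still type the result at the grub> prompt yourself.

```shell
# Hypothetical helper: replace root=LABEL=... in a kernel line with a device.
# $1 = kernel line, $2 = device holding the / filesystem (e.g. /dev/sda5).
use_root_device() {
    # "|" is the sed delimiter so the slashes in the device need no escaping.
    printf '%s\n' "$1" | sed "s|root=LABEL=[^ ]*|root=$2|"
}

use_root_device 'kernel /vmlinuz-2.6.9-023stab053.2-enterprise ro root=LABEL=/' /dev/sda5
# prints kernel /vmlinuz-2.6.9-023stab053.2-enterprise ro root=/dev/sda5
```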
Now pass the initrd
grub> initrd /initrd-2.6.9-023stab053.2-enterprise.img
Boot the kernel you have just loaded

grub> boot
Linux will now boot.
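
To recap the sequence, the root, kernel and initrd lines of one grub.conf stanza map directly onto what you type at the prompt, followed by boot. The sketch below prints that sequence for a stanza fed on stdin. stanza_to_prompt is a hypothetical helper, and it assumes you pass it a single stanza, not the whole grub.conf.

```shell
# Hypothetical helper: turn ONE grub.conf stanza (on stdin) into the
# commands to type at the grub> prompt. The title line is skipped.
stanza_to_prompt() {
    awk '$1 == "root" || $1 == "kernel" || $1 == "initrd" {
             sub(/^[ \t]+/, "")      # drop any indentation
             print "grub> " $0
         }
         END { print "grub> boot" }'
}

stanza_to_prompt <<'EOF'
title Virtuozzo (2.6.9-023stab053.2-enterprise)
root (hd0,0)
kernel /vmlinuz-2.6.9-023stab053.2-enterprise ro root=LABEL=/ console=ttyS0,57600 console=tty debug
initrd /initrd-2.6.9-023stab053.2-enterprise.img
EOF
```

The lines come out in the order they appear in the stanza, which in a normal grub.conf is already root, kernel, initrd.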

Re-installing Grub from within Linux
Once the server is up, ssh into it. From the command prompt, enter the grub-install command. This takes one parameter: the name of the hard disk whose master boot record (MBR) will have Grub installed on it.
[root@vps9 grub]# grub-install /dev/sda
Installation finished. No error reported.
This is the contents of the device map /boot/grub/device.map.
Check if this is correct or not. If any of the lines is incorrect,
fix it and re-run the script `grub-install'.

# this device map was generated by anaconda
(fd0) /dev/fd0
(hd0) /dev/sda
That will do.
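
Checking the device map by eye is fine for two lines. On a box with more disks, a small lookup may help. This is a sketch only: lookup_device is a made-up name, and it reads a device.map on stdin and prints the Linux device for one Grub name.

```shell
# Hypothetical helper: print the Linux device that a Grub name maps to.
# $1 = Grub name without parentheses (e.g. hd0); device.map comes on stdin.
lookup_device() {
    awk -v want="($1)" '!/^[ \t]*#/ && $1 == want { print $2 }'
}

# On the server itself:  lookup_device hd0 < /boot/grub/device.map
lookup_device hd0 <<'EOF'
# this device map was generated by anaconda
(fd0) /dev/fd0
(hd0) /dev/sda
EOF
# prints /dev/sda
```

If the device printed for hd0 is not the disk you passed to grub-install, fix device.map and re-run grub-install, as its output suggests.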

If you're still having boot problems
Grub error messages
The complete list of error messages is at the end of this HowTo.
The two that I've bumped into are
Error 15
File not found.
This normally means that you have mistyped the file name. Try using the tab key to help you fill in Grub commands.
Error 17
Unable to mount (use) the partition.
This may be because you have mistyped the number (remember, Grub counts from zero and not one), or because the partition that you pointed at does not have a valid file system.
Kernel panics
If you get an error message, while booting, along these lines :-

Kernel panic: No init found. Try passing init= option to kernel

Your kernel needs something called an "initrd" and can't find one. There are a number of reasons that this can happen.
-> You haven't put an initrd statement in your grub.conf or entered one at the Grub prompt.

Easy one to fix, just make sure that you have the correct (and correctly spelt) initrd for Grub to pass to the kernel. Have a look at the "All I get is either a "grub>" prompt or just "grub" when I try and boot" section of this HowTo.
-> The "root=" parameter on the kernel statement does not point to the correct hard disk and partition.

The "root=" parameter on the kernel statement often says "root=LABEL=/", which often works just fine, but sometimes you have to be more exact. I've only found this to be a problem when I'm using a separate /boot and root (/) partition, or when Mandrake is involved. So change the "LABEL=/" bit to the partition that contains your root (/) folder. If your root (/) partition is on /dev/hda6, for example, then make the root statement look like "root=/dev/hda6".
-> The initrd file has become corrupted or been deleted.

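You'll need to boot Linux from either a distribution/rescue CD or a rescue diskette. chroot takes a directory, not a device, so first mount your root (/) partition on an empty directory such as /mnt by entering, from the shell command prompt, mount /dev/hdxy /mnt . Here the "x" is the letter of the hard disk and the "y" is the number of the partition. Then enter chroot /mnt . So, if your root (/) partition is /dev/hda2 , enter mount /dev/hda2 /mnt followed by chroot /mnt . If /boot is a separate partition, mount it on /mnt/boot before the chroot.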

Then change directory to the /boot folder, move the old .img file out of the way - assuming it's still there - by renaming it to *.img.old , and then create a new initrd by typing mkinitrd -v -f initrd-KERNEL-VERSION.img KERNEL-VERSION . Replace "KERNEL-VERSION" with the version of the kernel that you are trying to load. If you do a full listing of the /boot folder you'll see the same numbers and letters in the full kernel file's name (eg. for the kernel called "vmlinuz-2.4.22-10mdk" , you would want to create an initrd called "initrd-2.4.22-10mdk.img" and the kernel version would be "2.4.22-10mdk" ).
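
The naming rule is mechanical, so a sketch like the one below can build the mkinitrd command from the kernel's file name. mkinitrd_command is a hypothetical helper. It only prints the command, so you can check it before running it yourself.

```shell
# Hypothetical helper: build the mkinitrd command for a vmlinuz-* file name.
# It prints the command and does NOT run mkinitrd.
mkinitrd_command() {
    case $1 in
        vmlinuz-?*) version=${1#vmlinuz-} ;;   # strip the "vmlinuz-" prefix
        *) echo "not a vmlinuz-* file name: $1" >&2; return 1 ;;
    esac
    echo "mkinitrd -v -f initrd-${version}.img ${version}"
}

mkinitrd_command vmlinuz-2.4.22-10mdk
# prints mkinitrd -v -f initrd-2.4.22-10mdk.img 2.4.22-10mdk
```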

Footnote: - Search order at boot up time. Your PC will look for an operating system in a number of places, in an order set out in the BIOS. If you find that your PC refuses to look for an operating system in either your floppy diskette drive (if you are attempting to boot from a rescue diskette) or from your CD-ROM / DVD drive (if you are trying to boot from an installation CD / DVD), then you'll need to enter your BIOS setup.

To enter the BIOS setup screens you will need to press either the Del key or the F2 key during the POST checks (which one is dependent on your PC). So, turn your PC on and while it is giving you all of those messages about how much RAM you have and what disks it knows about, press the relevant key for your PC. Keep pressing until you are presented with either a blue or grey BIOS screen.

Using a combination of the cursor arrow keys, the tab key and the enter key, navigate your way to the option to change the boot order.

On an AMIBIOS (grey) screen, you will need to move right to the Boot option, press Enter and then move down to the Boot Device Priority option and press Enter , then select the first device, press Enter and select from the list. When you've picked the correct boot device (Floppy or CDROM), press the Esc key to exit up the levels, then move across to Exit and select Exit saving changes .

On an Award BIOS (blue) screen, move down to the Advanced BIOS Features, press Enter, then move down to the First Boot Device, again press Enter and select from the list. Once done, press the Esc key to move back up levels and then across and down to the Save & Exit Setup option.

There are other BIOSs out there, but these are the only two that I have access to. Hopefully, though, you'll have got the idea of what to do from the above description. And you can always escape out of trouble by repeatedly pressing the Esc key.


Grub Error messages :-

1 : Filename must be either an absolute filename or blocklist This error is returned if a file name is requested which doesn't fit the syntax/rules listed in the "Filesystem syntax and semantics" section of the GRUB manual.
2 : Bad file or directory type This error is returned if a file requested is not a regular file, but something like a symbolic link, directory, or FIFO.
3 : Bad or corrupt data while decompressing file This error is returned if the run-length decompression code gets an internal error. This is usually from a corrupt file.
4 : Bad or incompatible header in compressed file This error is returned if the file header for a supposedly compressed file is bad.
5 : Partition table invalid or corrupt This error is returned if the sanity checks on the integrity of the partition table fail. This is a bad sign.
6 : Mismatched or corrupt version of stage1/stage2 This error is returned if the install command points to incompatible or corrupt versions of the stage1 or stage2. It can't detect corruption in general, but this is a sanity check on the version numbers, which should be correct.

7 : Loading below 1MB is not supported This error is returned if the lowest address in a kernel is below the 1MB boundary. The Linux zImage format is a special case and can be handled since it has a fixed loading address and maximum size.
8 : Kernel must be loaded before booting This error is returned if GRUB is told to execute the boot sequence without having a kernel to start.
9 : Unknown boot failure This error is returned if the boot attempt did not succeed for reasons which are unknown.
10 : Unsupported Multiboot features requested This error is returned when the Multiboot features word in the Multiboot header requires a feature that is not recognized. The point of this is that the kernel requires special handling which GRUB is probably unable to provide.
11 : Unrecognized device string This error is returned if a device string was expected, and the string encountered didn't fit the syntax/rules listed in the "Filesystem syntax and semantics" section of the GRUB manual.

12 : Invalid device requested This error is returned if a device string is recognizable but does not fall under the other device errors.
13 : Invalid or unsupported executable format This error is returned if the kernel image being loaded is not recognized as Multiboot or one of the supported native formats (Linux zImage or bzImage, FreeBSD, or NetBSD).
14 : Filesystem compatibility error, cannot read whole file Some of the filesystem reading code in GRUB has limits on the length of the files it can read. This error is returned when the user runs into such a limit.
15 : File not found This error is returned if the specified file name cannot be found, but everything else (like the disk/partition info) is OK.
16 : Inconsistent filesystem structure This error is returned by the filesystem code to denote an internal error caused by the sanity checks of the filesystem structure on disk not matching what it expects. This is usually caused by a corrupt filesystem or bugs in the code handling it in GRUB.
17 : Cannot mount selected partition This error is returned if the partition requested exists, but the filesystem type cannot be recognized by GRUB.
18 : Selected cylinder exceeds maximum supported by BIOS This error is returned when a read is attempted at a linear block address beyond the end of the BIOS translated area. This generally happens if your disk is larger than the BIOS can handle (512MB for (E)IDE disks on older machines or larger than 8GB in general).
19 : Linux kernel must be loaded before initrd This error is returned if the initrd command is used before loading a Linux kernel.
20 : Multiboot kernel must be loaded before modules This error is returned if the module load command is used before loading a Multiboot kernel. It only makes sense in this case anyway, as GRUB has no idea how to communicate the presence of such modules to a non-Multiboot-aware kernel.
21 : Selected disk does not exist This error is returned if the device part of a device- or full file name refers to a disk or BIOS device that is not present or not recognized by the BIOS in the system.
22 : No such partition This error is returned if a partition is requested in the device part of a device- or full file name which isn't on the selected disk.
23 : Error while parsing number This error is returned if GRUB was expecting to read a number and encountered bad data.
24 : Attempt to access block outside partition This error is returned if a linear block address is outside of the disk partition. This generally happens because of a corrupt filesystem on the disk or a bug in the code handling it in GRUB (it's a great debugging tool).
25 : Disk read error This error is returned if there is a disk read error when trying to probe or read data from a particular disk.
26 : Too many symbolic links This error is returned if the link count is beyond the maximum (currently 5), possibly the symbolic links are looped.
27 : Unrecognized command This error is returned if an unrecognized command is entered on the command-line or in a boot sequence section of a configuration file and that entry is selected.
28 : Selected item cannot fit into memory This error is returned if a kernel, module, or raw file load command is either trying to load its data such that it won't fit into memory or it is simply too big.
29 : Disk write error This error is returned if there is a disk write error when trying to write to a particular disk. This would generally only occur during an install or set active partition command.
30 : Invalid argument This error is returned if an argument specified to a command is invalid.
31 : File is not sector aligned This error may occur only when you access a ReiserFS partition by block-lists (e.g. the command `install'). In this case, you should mount the partition with the `-o notail' option.
32 : Must be authenticated This error is returned if you try to run a locked entry. You should enter a correct password before running such an entry.
33 : Serial device not configured This error is returned if you try to change your terminal to a serial one before initializing any serial device.
34 : No spare sectors on the disk This error is returned if a disk doesn't have enough spare space. This happens when you try to embed Stage 1.5 into the unused sectors after the MBR, but the first partition starts right after the MBR or they are used by EZ-BIOS.
