Using LVM and understanding the risks of using defaults

In the past couple of months we have had a few calls from users who lost part of an LVM volume group and, in most cases, were never able to recover any data. I have done some research on a small file system which shows that LVM in a striped configuration does not provide any significant recovery options. You may recover some files if the primary drive in the LVM group is not the one that fails. If the primary drive fails you may recover only fragments of the bigger files, and they will all be dropped into lost+found.

Let’s talk about LVM and the defaults that get used. It’s common for a distro to put the root (/) file system on LVM by default. This adds risk even with a single drive, no matter how many partitions you create.

If you have two drives in the system and create an LVM volume group, you should mirror the two drives so you can recover in the event of a drive failure. The same applies with more than two drives: be sure to configure mirroring.
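A hedged sketch of what that looks like (device and volume names are examples, not the setup used later in this post; the block is guarded so it is a no-op unless run as root on a machine with the LVM tools, and should only ever be pointed at scratch disks):

```shell
#!/bin/sh
# Hedged sketch: a mirrored LV instead of the default linear/striped layout,
# so one failed PV does not take the data with it. Device names are examples.
# Guarded: does nothing unless run as root with the LVM tools installed.
VG=vgmirror
if [ "$(id -u)" -eq 0 ] && command -v lvcreate >/dev/null 2>&1; then
    pvcreate /dev/sdc1 /dev/sdd1
    vgcreate "$VG" /dev/sdc1 /dev/sdd1
    # -m 1 keeps one extra copy of every extent; --mirrorlog core avoids
    # needing a third PV to hold the mirror log on older LVM2 releases.
    lvcreate -m 1 --mirrorlog core -L 400M -n lvmirror "$VG"
fi
```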

Defaults are not always best in every situation. Users often assume that the defaults were set by people who know more about their system than they do. This is not true; you should know your system better than the people who selected the defaults.

Now that I have had my brief rant, if you feel like using LVM I suggest you consult the LVM HOWTO (http://tldp.org/HOWTO/LVM-HOWTO/) to understand why LVM is included in distributions, in which instances you may benefit from it, and in which instances it is not beneficial.

Regardless of what you decide to do I recommend that you always keep a backup of

/etc/lvm/

someplace safe. Also consider keeping a record of the backup superblock locations that are printed when you create the file system.
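A minimal sketch of both backups (the destination path is an example; the archive step is guarded so it only runs as root, and the dumpe2fs line assumes the lvol3 device used later in this post, so it is shown commented out):

```shell
#!/bin/sh
# Keep a dated copy of the LVM metadata somewhere safe (DEST is an example
# path; copy the archive off the machine afterwards).
DEST=${DEST:-$HOME/lvm-meta-$(date +%Y%m%d).tar.gz}
if [ "$(id -u)" -eq 0 ] && [ -d /etc/lvm ]; then
    tar -czf "$DEST" -C /etc lvm
fi
# While the filesystem is still healthy, record its backup superblock
# locations too (device name is an example; requires root):
#   dumpe2fs /dev/vgken/lvol3 | grep -i 'backup superblock' > lvol3-superblocks.txt
```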

LVM stripe recovery.

In my example I use a VMware instance with two 0.5 GB disks, /dev/sdc and /dev/sdd, to create a volume group named vgken containing a logical volume called lvol3.
Here is what I did.

I created a single partition on each drive using all available space, changed the partition type to Linux LVM (8e), and saved the partition table:

fdisk -l
Disk /dev/sdc: 536 MB, 536870912 bytes
64 heads, 32 sectors/track, 512 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes

Device Boot Start End Blocks Id System
/dev/sdc1 1 512 524272 8e Linux LVM

Disk /dev/sdd: 536 MB, 536870912 bytes
64 heads, 32 sectors/track, 512 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes

Device Boot Start End Blocks Id System
/dev/sdd1 1 512 524272 8e Linux LVM

Then I created the volume group vgken including the two partitions I had created:

[root@centos5_3-1 ~]# vgcreate vgken /dev/sdc1 /dev/sdd1
Logging initialised at Mon Nov 16 06:52:52 2009
Set umask to 0077
Wiping cache of LVM-capable devices
Adding physical volume '/dev/sdc1' to volume group 'vgken'
Adding physical volume '/dev/sdd1' to volume group 'vgken'
Archiving volume group "vgken" metadata (seqno 0).
Creating volume group backup "/etc/lvm/backup/vgken" (seqno 1).
Volume group "vgken" successfully created
Wiping internal VG cache

Here is the output of pvdisplay that shows the physical volumes in the volume group:

[root@centos5_3-1 ~]# pvdisplay
Logging initialised at Mon Nov 16 06:52:56 2009
Set umask to 0077
Scanning for physical volume names
--- Physical volume ---
PV Name /dev/sdc1
VG Name vgken
PV Size 511.98 MB / not usable 3.98 MB
Allocatable yes
PE Size (KByte) 4096
Total PE 127
Free PE 127
Allocated PE 0
PV UUID aZDQ1S-XP8m-bOHI-LU56-v2LM-RmjS-4xP50S

--- Physical volume ---
PV Name /dev/sdd1
VG Name vgken
PV Size 511.98 MB / not usable 3.98 MB
Allocatable yes
PE Size (KByte) 4096
Total PE 127
Free PE 127
Allocated PE 0
PV UUID CgGLuS-KpPm-2HOq-dmOI-RgCx-PTtg-0AmiP4
Wiping internal VG cache

Next I create a logical volume lvol3 in volume group vgken that uses the maximum space available on the physical volumes. I know it is the maximum because any size above 1016M produced an error saying there were not enough free PEs:

[root@centos5_3-1 ~]# lvcreate -L 1016M -n lvol3 vgken
Logging initialised at Mon Nov 16 06:53:59 2009
Set umask to 0077
Setting logging type to disk
Finding volume group "vgken"
/dev/cdrom: open failed: Read-only file system
Archiving volume group "vgken" metadata (seqno 1).
Creating logical volume lvol3
Creating volume group backup "/etc/lvm/backup/vgken" (seqno 2).
Found volume group "vgken"
Creating vgken-lvol3
Loading vgken-lvol3 table
Resuming vgken-lvol3 (253:2)
Clearing start of logical volume "lvol3"
Creating volume group backup "/etc/lvm/backup/vgken" (seqno 2).
Logical volume "lvol3" created
Wiping internal VG cache

Next I create a journaling filesystem on the logical volume:

[root@centos5_3-1 ~]# mkfs -t ext3 -m 2 -v /dev/vgken/lvol3
mke2fs 1.39 (29-May-2006)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
130048 inodes, 260096 blocks
5201 blocks (2.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=268435456
8 block groups
32768 blocks per group, 32768 fragments per group
16256 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376

Writing inode tables: done
Creating journal (4096 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 37 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.

Let’s view the information stored in /etc/lvm/backup about my volume group. This file could be essential to a recovery effort, particularly if the drive that fails is the one containing /etc; in this instance /etc is not on the volume group:

[root@centos5_3-1 ~]# cat /etc/lvm/backup/vgken
# Generated by LVM2 version 2.02.32-RHEL5 (2008-03-04): Mon Nov 16 06:53:59 2009

contents = "Text Format Volume Group"
version = 1

description = "Created *after* executing 'lvcreate -L 1016M -n lvol3 vgken'"

creation_host = "centos5_3-1" # Linux centos5_3-1 2.6.18-164.6.1.el5 #1 SMP Tue Nov 3 

16:18:27 EST 2009 i686
creation_time = 1258372439 # Mon Nov 16 06:53:59 2009

vgken {
id = "4JI09f-wnHT-LKbq-T14Q-AQOk-K7oi-Dga0MM"
seqno = 2
status = ["RESIZEABLE", "READ", "WRITE"]
extent_size = 8192 # 4 Megabytes
max_lv = 0
max_pv = 0

physical_volumes {

pv0 {
id = "aZDQ1S-XP8m-bOHI-LU56-v2LM-RmjS-4xP50S"
device = "/dev/sdc1" # Hint only

status = ["ALLOCATABLE"]
dev_size = 1048544 # 511.984 Megabytes
pe_start = 384
pe_count = 127 # 508 Megabytes
}

pv1 {
id = "CgGLuS-KpPm-2HOq-dmOI-RgCx-PTtg-0AmiP4"
device = "/dev/sdd1" # Hint only

status = ["ALLOCATABLE"]
dev_size = 1048544 # 511.984 Megabytes
pe_start = 384
pe_count = 127 # 508 Megabytes
}
}

logical_volumes {

lvol3 {
id = "f2y0cP-uyLF-YCKB-waZX-7l4Q-aA1z-tza0jB"
status = ["READ", "WRITE", "VISIBLE"]
segment_count = 2

segment1 {
start_extent = 0
extent_count = 127 # 508 Megabytes

type = "striped"
stripe_count = 1 # linear

stripes = [
"pv0", 0
]
}
segment2 {
start_extent = 127
extent_count = 127 # 508 Megabytes

type = "striped"
stripe_count = 1 # linear

stripes = [
"pv1", 0
]
}
}
}
}

Then I power off and remove the primary hard disk, sdc (note in the output below that after reboot the surviving disk is re-enumerated as sdc), boot back up and try to mount the logical volume:

[root@centos5_3-1 ~]# mount /dev/vgken/lvol3 /mnt2
mount: special device /dev/vgken/lvol3 does not exist

When I scan the physical volumes, the annoying cdrom error appears as always, and you can see that one physical volume is missing:

[root@centos5_3-1 ~]# pvscan
Logging initialised at Mon Nov 16 07:26:54 2009
Set umask to 0077
Wiping cache of LVM-capable devices
Wiping internal VG cache
Walking through all physical volumes
/dev/cdrom: open failed: Read-only file system
Attempt to close device '/dev/cdrom' which is not open.
Couldn't find device with uuid 'aZDQ1S-XP8m-bOHI-LU56-v2LM-RmjS-4xP50S'.
PV unknown device VG vgken lvm2 [508.00 MB / 0 free]
PV /dev/sdc1 VG vgken lvm2 [508.00 MB / 0 free]
Total: 2 [0.99 GB] / in use: 2 [0.99 GB] / in no VG: 0 [0 ]
Wiping internal VG cache

Trying to restore the volume group from backup does not work if a physical volume is missing:

[root@centos5_3-1 ~]# vgcfgrestore vgken
Logging initialised at Mon Nov 16 07:27:37 2009
Set umask to 0077
Wiping cache of LVM-capable devices
/dev/cdrom: open failed: Read-only file system
Attempt to close device '/dev/cdrom' which is not open.
Couldn't find device with uuid 'aZDQ1S-XP8m-bOHI-LU56-v2LM-RmjS-4xP50S'.
Couldn't find all physical volumes for volume group vgken.
Restore failed.
Wiping internal VG cache

Looking at the partition list in /proc shows that only the one surviving LVM disk remains:

[root@centos5_3-1 ~]# cat /proc/partitions
major minor #blocks name

8 32 524288 sdc
8 33 524272 sdc1
253 0 1572864 dm-0
253 1 393216 dm-1

Now I power off and add a new hard disk, then do a pvscan. The disks are enumerated again (the surviving PV now appears as /dev/sdd1), and the new blank drive leaves the missing PV showing as an unknown device:

[root@centos5_3-1 ~]# pvscan
Logging initialised at Mon Nov 16 07:41:18 2009
Set umask to 0077
Wiping cache of LVM-capable devices
Wiping internal VG cache
Walking through all physical volumes
/dev/cdrom: open failed: Read-only file system
Attempt to close device '/dev/cdrom' which is not open.
Couldn't find device with uuid 'aZDQ1S-XP8m-bOHI-LU56-v2LM-RmjS-4xP50S'.
PV unknown device VG vgken lvm2 [508.00 MB / 0 free]
PV /dev/sdd1 VG vgken lvm2 [508.00 MB / 0 free]
Total: 2 [0.99 GB] / in use: 2 [0.99 GB] / in no VG: 0 [0 ]
Wiping internal VG cache

Trying to restore the volume group from backup does not work if a physical volume does not have the uuid that the backup configuration file is looking for:

[root@centos5_3-1 ~]# vgcfgrestore vgken
Logging initialised at Mon Nov 16 07:41:41 2009
Set umask to 0077
Wiping cache of LVM-capable devices
/dev/cdrom: open failed: Read-only file system
Attempt to close device '/dev/cdrom' which is not open.
Couldn't find device with uuid 'aZDQ1S-XP8m-bOHI-LU56-v2LM-RmjS-4xP50S'.
Couldn't find all physical volumes for volume group vgken.
Restore failed.
Wiping internal VG cache

To allow the volume group to accept the new drive into the group, we must write the missing PV’s UUID onto it:

[root@centos5_3-1 ~]# pvcreate --uuid aZDQ1S-XP8m-bOHI-LU56-v2LM-RmjS-4xP50S /dev/sdc
Logging initialised at Mon Nov 16 07:42:00 2009
Set umask to 0077
Wiping cache of LVM-capable devices
Set up physical volume for "/dev/sdc" with 1048576 available sectors
Physical volume "/dev/sdc" successfully created
Wiping internal VG cache

A restore of the volume group from backup works and reports no errors now:

[root@centos5_3-1 ~]# vgcfgrestore vgken
Logging initialised at Mon Nov 16 07:42:03 2009
Set umask to 0077
/dev/cdrom: open failed: Read-only file system
Restored volume group vgken
Wiping internal VG cache

Another try to restore reports nothing because the volume group is already restored:

[root@centos5_3-1 ~]# vgcfgrestore vgken

Now we scan to see what the status of the volume group is, it’s obvious we have repaired the volume group:

[root@centos5_3-1 ~]# vgscan
Logging initialised at Mon Nov 16 07:42:09 2009
Set umask to 0077
Wiping cache of LVM-capable devices
Wiping internal VG cache
Reading all physical volumes. This may take a while...
Finding all volume groups
/dev/cdrom: open failed: Read-only file system
Attempt to close device '/dev/cdrom' which is not open.
Finding volume group "vgken"
Found volume group "vgken" using metadata type lvm2
Archiving volume group "vgken" metadata (seqno 2).
Archiving volume group "vgken" metadata (seqno 3).
Creating volume group backup "/etc/lvm/backup/vgken" (seqno 3).
Finding volume group "VolGroup00"
Found volume group "VolGroup00" using metadata type lvm2
Wiping internal VG cache

Here is where I activate the volume group and make it known to the kernel:

[root@centos5_3-1 ~]# vgchange -ay vgken
Logging initialised at Mon Nov 16 07:42:38 2009
Set umask to 0077
Using volume group(s) on command line
Finding volume group "vgken"
Archiving volume group "vgken" metadata (seqno 3).
Archiving volume group "vgken" metadata (seqno 4).
Creating volume group backup "/etc/lvm/backup/vgken" (seqno 4).
Found volume group "vgken"
Creating vgken-lvol3
Loading vgken-lvol3 table
Resuming vgken-lvol3 (253:2)
Activated logical volumes in volume group "vgken"
1 logical volume(s) in volume group "vgken" now active
Wiping internal VG cache

Since the volume group was missing a member, and the most important one (the primary device), we must run fsck and restore the superblock from a superblock backup. The first and second superblock backups were on the primary drive, so I have to use the third. This is why you should always keep a screenshot or log of the superblock locations from when you created the filesystem. I use the -y flag because we know there will be many errors to acknowledge and we don’t want to have to stick a pen in the y key of the keyboard. The journal was on the primary drive’s stripe, so we lose the journal, and then many errors fly past the screen; for the purposes of this post I truncated the output at the first error:

[root@centos5_3-1 ~]# fsck.ext3 -b 163840 /dev/vgken/lvol3 -y
e2fsck 1.39 (29-May-2006)
Superblock has an invalid ext3 journal (inode 8).
Clear? yes

*** ext3 journal has been deleted - filesystem is now ext2 only ***

Resize inode not valid. Recreate? yes

/dev/vgken/lvol3 was not cleanly unmounted, check forced.
Pass 1: Checking inodes, blocks, and sizes
Root inode is not a directory. Clear? yes

Now I mount the volume group’s logical volume under /mnt2. Since I had copied all the files from the /space directory onto this logical volume, we can get an idea of how many files are left and the space used by them. Keep in mind that they are all now located in /mnt2/lost+found under directories named like ######

A count of the files and directories in the recovered logical volume versus the original filesystem that was copied onto it:

[root@centos5_3-1 mnt2]# ls -lRa | wc -l
27930
[root@centos5_3-1 mnt2]# ls -lRa /space | wc -l
31437
[root@centos5_3-1 mnt2]# ls -lRa | grep -v drw | wc -l
22017
[root@centos5_3-1 mnt2]# ls -lRa /space | grep -v drw | wc -l
24540
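Incidentally, `find -type f` is a less fragile way to make these counts than piping ls through grep. A stand-in demo, with throwaway directories in place of /mnt2 and /space so it can run anywhere:

```shell
#!/bin/sh
# Stand-in comparison: temp dirs substitute for /mnt2 (recovered) and /space
# (original); find -type f counts regular files only, no grep filtering needed.
ORIG=$(mktemp -d); REC=$(mktemp -d)
touch "$ORIG/a" "$ORIG/b" "$ORIG/c"   # pretend the original had 3 files
touch "$REC/a"                        # ...and only 1 was recovered
orig_count=$(find "$ORIG" -type f | wc -l)
rec_count=$(find "$REC" -type f | wc -l)
echo "recovered $rec_count of $orig_count files"
```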

Doing a disk space check, we can see the difference in bytes/blocks used between /space and /mnt2, and it’s obvious we lost quite a few files:

[root@centos5_3-1 mnt2]# df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sdb2 1091020 947908 87688 92% /space
/dev/mapper/vgken-lvol3
1024024 368276 634944 37% /mnt2

I viewed some of the recovered files and they did contain data. Some bigger files were truncated at segment boundaries, since with no journal there was no data available to recover the parts of the stripe where the missing physical volume once existed. Still, even with the primary drive gone, a fair number of files were present within the filesystem.

Now that we have tested what happens if the primary physical volume member fails we will test what happens if the secondary physical volume member fails.

Now I power off, remove the old sdd, add a new hard disk as sdd, and do a pvscan. The new drive leaves the missing PV showing as an unknown device:

[root@centos5_3-1 ~]# pvscan
Logging initialised at Mon Nov 16 07:55:46 2009
Set umask to 0077
Wiping cache of LVM-capable devices
Wiping internal VG cache
Walking through all physical volumes
/dev/cdrom: open failed: Read-only file system
Attempt to close device '/dev/cdrom' which is not open.
Couldn't find device with uuid 'CgGLuS-KpPm-2HOq-dmOI-RgCx-PTtg-0AmiP4'.
PV /dev/sdc1 VG vgken lvm2 [508.00 MB / 0 free]
PV unknown device VG vgken lvm2 [508.00 MB / 0 free]
Total: 2 [0.99 GB] / in use: 2 [0.99 GB] / in no VG: 0 [0 ]
Wiping internal VG cache

Trying to restore the volume group from backup does not work if a physical volume does not have the uuid that the backup configuration file is looking for:

[root@centos5_3-1 ~]# vgcfgrestore vgken
Logging initialised at Mon Nov 16 07:56:13 2009
Set umask to 0077
Wiping cache of LVM-capable devices
/dev/cdrom: open failed: Read-only file system
Attempt to close device '/dev/cdrom' which is not open.
Couldn't find device with uuid 'CgGLuS-KpPm-2HOq-dmOI-RgCx-PTtg-0AmiP4'.
Couldn't find all physical volumes for volume group vgken.
Restore failed.
Wiping internal VG cache

To allow the volume group to accept the new drive into the group, we must write the missing PV’s UUID onto it:

[root@centos5_3-1 ~]# pvcreate --uuid CgGLuS-KpPm-2HOq-dmOI-RgCx-PTtg-0AmiP4 /dev/sdd
Logging initialised at Mon Nov 16 07:56:55 2009
Set umask to 0077
Wiping cache of LVM-capable devices
Set up physical volume for "/dev/sdd" with 1048576 available sectors
Physical volume "/dev/sdd" successfully created
Wiping internal VG cache

A restore of the volume group from backup works and reports no errors:

[root@centos5_3-1 ~]# vgcfgrestore vgken
Logging initialised at Mon Nov 16 07:57:13 2009
Set umask to 0077
/dev/cdrom: open failed: Read-only file system
Restored volume group vgken
Wiping internal VG cache

Here is where I activate the volume group and make it known to the kernel:

[root@centos5_3-1 ~]# vgchange -ay vgken

Logging initialised at Mon Nov 16 07:57:29 2009
Set umask to 0077
Using volume group(s) on command line
Finding volume group "vgken"
Archiving volume group "vgken" metadata (seqno 4).
Archiving volume group "vgken" metadata (seqno 5).
Creating volume group backup "/etc/lvm/backup/vgken" (seqno 5).
Found volume group "vgken"
Creating vgken-lvol3
Loading vgken-lvol3 table
Resuming vgken-lvol3 (253:2)
Activated logical volumes in volume group "vgken"
1 logical volume(s) in volume group "vgken" now active
Wiping internal VG cache

Now I run fsck on the filesystem to clear up any issues with missing blocks. This produced loads of errors, which I have not included in this post, and it dumped many files into lost+found that had nothing but …… inside them, so this test did not recover much data either; in fact, it recovered even less than when the primary physical volume was lost. The superblock survived this time, so there was no need to supply the -b flag.

[root@centos5_3-1 ~]# fsck.ext3 /dev/vgken/lvol3 -y

Here I compare the number of files in the undamaged filesystem with the damaged one, and you can see there is quite a difference in the number of files and directories left after the recovery effort. Even with a partial recovery like this we still can’t expect much data within the files, although I did note that some files were fully recovered in the secondary-physical-volume failure case:

[root@centos5_3-1 mnt2]# ls -lRa | wc -l
2598
[root@centos5_3-1 mnt2]# ls -lRa /space | wc -l
31437
[root@centos5_3-1 mnt2]# ls -lRa | grep -v drw | wc -l
2068
[root@centos5_3-1 mnt2]# ls -lRa /space | grep -v drw | wc -l
24540

You can see the used bytes and blocks are also much smaller than in the filesystem we originally copied over to the failed group:

[root@centos5_3-1 mnt2]# df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
1523568 1359848 85080 95% /
/dev/sda1 101086 12015 83852 13% /boot
tmpfs 154020 0 154020 0% /dev/shm
/dev/sdb2 1091020 947908 87688 92% /space
/dev/mapper/vgken-lvol3
1024024 495552 507668 50% /mnt2

So to sum things up, when you consider using LVM:

Always backup your filesystem.

When using LVM make sure you have a backup copy of /etc/lvm.

When selecting a layout for your root device, be aware that LVM is not going to benefit you in any way if you have a single drive and will always have a single drive.

Using LVM with more than one drive will provide a decent level of recovery if you use mirroring, and almost none if you use striping only.

LVM is commonly used to extend a logical device by adding additional physical devices to it. I submit to you that it is often far easier to mount another drive somewhere under the / filesystem and move files there to add space.

Setting up a personal iptables firewall

Linux has many protections available, from account security to SELinux to AppArmor, but one of the most overlooked is iptables. iptables is a program for building a firewall that does whatever you need. When most people think of a firewall, they picture a machine sitting between their router and computer that stops access attempts, and that is indeed a firewall; however, you can also run a firewall on the local machine itself, protecting you directly regardless of what protections (or lack thereof) exist on the network. With a few simple rules you can guard yourself against outside intrusion quite easily. You can do creative things like limit how many connections to a particular port you will allow before denying that IP range, or deny establishment of inbound connections entirely, or accept only the hosts you specify. It’s a very versatile tool, and I’m going to show you the basics of using it on your local machine to increase your security.

First, let’s start by making a simple iptables bash script we can execute. Change to the root user and create a basic script in your home directory called iptables.sh:

#!/bin/bash
IPT=/sbin/iptables
ETH="eth0"

$IPT -F

$IPT -P INPUT DROP
$IPT -A INPUT -i lo -j ACCEPT
$IPT -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT

$IPT -P OUTPUT DROP
$IPT -A OUTPUT -o lo -j ACCEPT
$IPT -A OUTPUT -m state --state NEW,ESTABLISHED,RELATED -j ACCEPT

$IPT -P FORWARD DROP

This is about the most basic set of protection you can build with iptables. When I say basic I don’t mean it’s weak protection; on the contrary, this will deny almost all outside connections that weren’t established by you previously. Let’s review exactly what we’re doing here. There are four basic sections to the script:

This section flushes the current iptables so you’re working from a blank slate:

$IPT -F

This section describes how to handle input to the machine from the network. The default policy drops everything; the rules we define accept all loopback traffic and accept traffic in the ESTABLISHED or RELATED states (i.e. we initiated the connection).

$IPT -P INPUT DROP
$IPT -A INPUT -i lo -j ACCEPT
$IPT -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT

This section describes how to handle output from the machine to the network. The rules we define accept all loopback traffic and accept any outbound traffic in the NEW, ESTABLISHED, or RELATED states, which means we can start new connections outward.

$IPT -P OUTPUT DROP
$IPT -A OUTPUT -o lo -j ACCEPT
$IPT -A OUTPUT -m state --state NEW,ESTABLISHED,RELATED -j ACCEPT

This section describes forwarding rules. Unless your machine is the one doing NAT/masquerading, you can ignore requests for forwarding, so we drop all forwarded traffic, since this script is meant to protect an individual machine.

$IPT -P FORWARD DROP

Now this is pretty basic and likely not exactly what you’re looking for, since it prevents all connections originating from outside from getting through at all. One thing you probably want is to open up the SSH port so you can access your machine from outside, and that is easy: we just add a line to the input section. This line accepts input with destination port 22, the port sshd listens on.

$IPT -A INPUT -p tcp -m tcp --destination-port 22 -j ACCEPT

If you’re running a web server you may want to open up port 80 (http) and 443 (https).

$IPT -A INPUT -p tcp -m tcp --destination-port 80 -j ACCEPT
$IPT -A INPUT -p tcp -m tcp --destination-port 443 -j ACCEPT

If you want to open up FTP, for example, you need to open both port 21 (ftp control) and port 20 (ftp-data).

$IPT -A INPUT -p tcp -m tcp --destination-port 20 -j ACCEPT
$IPT -A INPUT -p tcp -m tcp --destination-port 21 -j ACCEPT

So now we have a fairly complete personal protection script that allows ftp, ssh, and both http and https.

#!/bin/bash
IPT=/sbin/iptables
ETH="eth0"

$IPT -F

$IPT -P INPUT DROP
$IPT -A INPUT -i lo -j ACCEPT
$IPT -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
$IPT -A INPUT -p tcp -m tcp --destination-port 22 -j ACCEPT
$IPT -A INPUT -p tcp -m tcp --destination-port 80 -j ACCEPT
$IPT -A INPUT -p tcp -m tcp --destination-port 443 -j ACCEPT
$IPT -A INPUT -p tcp -m tcp --destination-port 20 -j ACCEPT
$IPT -A INPUT -p tcp -m tcp --destination-port 21 -j ACCEPT

$IPT -P OUTPUT DROP
$IPT -A OUTPUT -o lo -j ACCEPT
$IPT -A OUTPUT -m state --state NEW,ESTABLISHED,RELATED -j ACCEPT

$IPT -P FORWARD DROP

With your firewall set up like this, one thing you will notice fairly quickly in the logs is a lot of brute-force SSH attacks on your server. If you have a good password they will likely never break it, but it is still not good to let them continually attack your server, since handling the incoming requests takes resources and bandwidth. So let’s slow them down to a trickle; most will quickly move on to a better target. This is very easy to do: we add the following two rules to the INPUT chain, placed before the general rule that accepts port 22 (otherwise that ACCEPT would match first and the limit would never trigger). They watch for NEW state connections to the SSH port, count them per source address, and once a single IP address has made three attempts within 360 seconds its packets are dropped. After 360 seconds without activity from that address the entry expires, which keeps you from being locked out of your own server for more than six minutes.

$IPT -A INPUT -i $ETH -p tcp -m tcp --dport 22 -m state --state NEW -m recent --update --seconds 360 --hitcount 3 --name SSHATTEMPTS --rsource -j DROP
$IPT -A INPUT -i $ETH -p tcp -m tcp --dport 22 -m state --state NEW -m recent --set --name SSHATTEMPTS --rsource

That brings us to a basic iptables setup for your local machine that looks like this:

#!/bin/bash
IPT=/sbin/iptables
ETH="eth0"

$IPT -F

$IPT -P INPUT DROP
$IPT -A INPUT -i lo -j ACCEPT
$IPT -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
$IPT -A INPUT -i $ETH -p tcp -m tcp --dport 22 -m state --state NEW -m recent --update --seconds 360 --hitcount 3 --name SSHATTEMPTS --rsource -j DROP
$IPT -A INPUT -i $ETH -p tcp -m tcp --dport 22 -m state --state NEW -m recent --set --name SSHATTEMPTS --rsource
$IPT -A INPUT -p tcp -m tcp --destination-port 22 -j ACCEPT
$IPT -A INPUT -p tcp -m tcp --destination-port 80 -j ACCEPT
$IPT -A INPUT -p tcp -m tcp --destination-port 443 -j ACCEPT
$IPT -A INPUT -p tcp -m tcp --destination-port 20 -j ACCEPT
$IPT -A INPUT -p tcp -m tcp --destination-port 21 -j ACCEPT

$IPT -P OUTPUT DROP
$IPT -A OUTPUT -o lo -j ACCEPT
$IPT -A OUTPUT -m state --state NEW,ESTABLISHED,RELATED -j ACCEPT

$IPT -P FORWARD DROP

You could move this script to /usr/local/sbin and call it from rc.local, add the hooks required to run it from your distribution’s /etc/init.d, or just paste it into rc.local itself; what matters is that it is called on start-up. This covers a basic introduction to using iptables to protect your local machine with a minimum of inconvenience and maximized protection. If the machine is not one you’re directly protecting but a firewall for the network, you could also set up DHCP on it along with network address translation (NAT) and route specific ports to specific machines. The possibilities are endless, and I hope this article opened your eyes to some of the potential iptables has for securing your network. Good luck!
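One hedged way to persist the rules across reboots, on systems that ship iptables-restore, is to keep the policy in restore format and load it at boot. A sketch that writes a reviewable ruleset file (the path is an example, and only a subset of the rules above is shown):

```shell
#!/bin/sh
# Write the core policy in iptables-restore format for review. As root you
# would load it at boot with: iptables-restore < "$RULES"
# (the file path is an example)
RULES=${RULES:-/tmp/iptables.rules}
cat > "$RULES" <<'EOF'
*filter
:INPUT DROP [0:0]
:FORWARD DROP [0:0]
:OUTPUT DROP [0:0]
-A INPUT -i lo -j ACCEPT
-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
-A INPUT -p tcp -m tcp --dport 22 -j ACCEPT
-A OUTPUT -o lo -j ACCEPT
-A OUTPUT -m state --state NEW,ESTABLISHED,RELATED -j ACCEPT
COMMIT
EOF
grep -c '^-A' "$RULES"    # quick sanity check: number of rules written
```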

P2V From Windows XP to a KVM Virtual Host Using Centos 5.4

KVM, the Kernel-based Virtual Machine, is a great tool for quickly getting virtual copies of physical environments up and running. It can be as simple as pointing the command at an image of the machine’s hard drive. I will describe this process, and for the sake of completeness I will show how to make it work with libvirt and virt-manager as well. You could skip the GUI and bring up the image using the qemu-kvm command if you prefer.

For this procedure we are using a CentOS 5.4 host server and a Windows XP source machine.

First, we prepare the running source machine, which is Windows XP.

Since KVM presents fairly generic IDE hardware, it helps to merge the standard IDE driver entries into the source machine’s registry before transferring a copy of the hard drive image. I did this by using regedit to back up the registry, then right-clicking the registry file and selecting Merge.

A copy of the registry file can be found here: mergeide.reg.

Then, on the host server, we start a netcat session that outputs to an image file on the server:

nc -vvnl 3333 > /var/lib/libvirt/images/hdimage.raw

This runs in the foreground and will sit there until the transfer is done, then return to the command line. The size of the image can be checked with ls -alh hdimage.raw.

On the client we boot from a live CD (anything that gets you to a shell and includes the dd and nc commands). You will need to enable networking as well, either via DHCP or by using ifconfig to bring up a static configuration. Once this is done, start the hard drive image transfer. Here I am assuming only one hard drive, and that it is /dev/hda.

dd if=/dev/hda | nc -w 60 -vvn 192.168.1.xxx 3333

Once the transfer is complete, we rename the image from hdimage.raw to winxp.img (or some other appropriate name) and boot it with KVM.

mv /var/lib/libvirt/images/hdimage.raw /var/lib/libvirt/images/winxp.img
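Before trusting the copy, it’s worth verifying the transfer; checksumming /dev/hda on the source (from the live CD) and the image on the host should give identical results. A local stand-in for the pipe that shows the verification idea (all paths are throwaway examples; on the real systems you would run md5sum on /dev/hda and on winxp.img):

```shell
#!/bin/sh
# Local stand-in: stream a fake "disk" through a pipe (as dd | nc does over
# the network) and confirm the copy is bit-identical.
SRC=$(mktemp); DST=$(mktemp)
dd if=/dev/urandom of="$SRC" bs=1024 count=512 2>/dev/null  # fake 512 KB disk
dd if="$SRC" bs=64k 2>/dev/null | cat > "$DST"              # stands in for: dd | nc
cmp -s "$SRC" "$DST" && echo "transfer verified"
```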

We can test the machine image by booting it with qemu-kvm, or just move on to setting it up in virt-manager:

qemu-kvm -hda winxp.img

This next section deals with bridged networking, which is necessary to let the guest image access the local network like a normal machine. It only needs to be done once, before adding your guest images.

We begin by editing the

/etc/sysconfig/network-scripts/ifcfg-br0

and ifcfg-eth0 files. Of course, how you fill these in will depend entirely on your local network setup.

ifcfg-br0

DEVICE=br0
TYPE=Bridge
BOOTPROTO=static
BROADCAST=192.168.1.255
IPADDR=192.168.1.xxx
NETMASK=255.255.255.0
NETWORK=192.168.1.0
GATEWAY=192.168.1.1
ONBOOT=yes

ifcfg-eth0

DEVICE=eth0
HWADDR=xx:xx:xx:xx:xx:xx
ONBOOT=yes
BRIDGE=br0

At this point you will need to reboot or restart networking. Do not do this remotely, as you will of course lose your connection. Once this is done you will have a bridge with eth0 as a member, and the guest interfaces will also become members of the bridge.

The transfer process gives us a file that can be loaded into KVM, but the virt-manager (libvirt) utility expects an XML config file to be present telling it how to run the image.

The libvirt default storage location is /var/lib/libvirt/images, and that is where we have been putting our image.

libvirtd reads all the xml config files when it starts. The config can be created manually or through the gui. These files live in /etc/libvirt/qemu/.

Open the virt-manager gui, and right click the localhost connection.

Select New

Select Fully Virtualized

CPU architecture = x86_64

Hypervisor: kvm

We have to select ‘local install media’ to get past this point (we need a bootable live CD in the drive, or preferably a bootable ISO located somewhere on the filesystem, to use temporarily)

OS Type: Select

OS Variant: Choose what makes sense here

Choose the bootable cd or iso

Choose hard drive image location

File (disk image):

Location: /var/lib/libvirt/images/winxp.img (You can choose an image that was already transferred and it should not overwrite it, or you can just tell it to create one, which you can overwrite later when transferring the image and renaming it.)

Network/Shared Physical Device:eth0 (Bridge br0)

Memory: Select the appropriate amount of memory and cpu to allocate

Finish the installation.

Since we had to point at a CD or ISO, it will boot from it. Once it does, you can select shutdown and stop the process.

The xml config file should now be created in /etc/libvirt/qemu/

If you want to create an xml file by copying a previous one, it should be of the same
general OS type, and three things will need to be modified each time.

The UUID will need to be unique, and the MAC address should be unique. Changing the last couple of hex digits should be sufficient. You will also need to modify the image location for the hard drive image for the machine.
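The three changes can be sketched as a small shell sequence. The stub xml below stands in for a real /etc/libvirt/qemu/&lt;guest&gt;.xml, and the file names (newguest.xml, newguest.img) are made up for illustration; the point is simply which fields get rewritten.

```shell
# Stub config standing in for a copied guest xml.
cat > newguest.xml <<'EOF'
<domain type='kvm'>
  <uuid>00000000-0000-0000-0000-000000000000</uuid>
  <devices>
    <disk type='file'>
      <source file='/var/lib/libvirt/images/winxp.img'/>
    </disk>
    <interface type='bridge'>
      <mac address='52:54:00:aa:bb:cc'/>
    </interface>
  </devices>
</domain>
EOF

# Fresh UUID (uuidgen, or the kernel's generator as a fallback).
NEW_UUID=$(uuidgen 2>/dev/null || cat /proc/sys/kernel/random/uuid)
# Random MAC in the QEMU/KVM 52:54:00 prefix, built from three random bytes.
NEW_MAC=$(printf '52:54:00:%02x:%02x:%02x' $(od -An -N3 -tu1 /dev/urandom))

sed -i "s|<uuid>.*</uuid>|<uuid>$NEW_UUID</uuid>|" newguest.xml
sed -i "s|mac address='[^']*'|mac address='$NEW_MAC'|" newguest.xml
sed -i "s|winxp.img|newguest.img|" newguest.xml
```

In practice you would run this against a copy of the real config in /etc/libvirt/qemu/ and then let libvirtd pick it up.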

One modification is necessary for some winxp and 2003 machines in the xml config.

In the features section, acpi should be enabled by adding it.

The resulting section will look like

<features>
   <pae/>
   <acpi/>
</features>

Without this the XP image was constantly restarting itself. Your mileage may vary, and depending on the source configuration you may need to leave this out. It is the acpi line that matters.

Once configuration files are created or modified, you must make them active, and the quickest way to do so is by restarting libvirtd. You can also define the image with virsh (virsh define /path/to/machine.xml). Here we will restart.

Close the virt-manager gui, then /etc/init.d/libvirtd restart.

Once the machine xml file is created, it will either point to an existing hd image that has already been transferred, or you can transfer a new image to the server and name it appropriately for this machine.

From here you can use the gui to start and stop the image, etc. Enjoy!

Setting up OpenLDAP 2.4 with the cn=config feature.

After my last post about the new features in OpenLDAP 2.4, I decided to write a post that gives step-by-step instructions on setting up OpenLDAP 2.4 and enabling the cn=config feature. If you wish to convert your existing OpenLDAP installation to use the cn=config feature, skip Section A and go right to Section B. For the purpose of this document we will call the OpenLDAP daemon slapd regardless of what other distros call it, the domain BASE will be dc=kens,dc=lan, and the config directory will be referred to as /etc/openldap.

Section A
Install OpenLDAP 2.4 and make sure slapd is not running at this point.
Edit the /etc/openldap/slapd.conf file to include your domain; my example uses
dc=kens,dc=lan as the BASE
set the root password by typing

slappasswd

Enter the password and paste the output into /etc/openldap/slapd.conf in the rootpw area.

cp /etc/openldap/DB_CONFIG.example /var/lib/ldap/DB_CONFIG
chown -R ldap:ldap /var/lib/ldap

create the initial ldap domain entry and administrator account by
making the file:
init.ldiff

with the contents, edited to your domain of course and matching the values you have used in the files /etc/openldap/slapd.conf, and /etc/openldap/ldap.conf:

dn: dc=kens,dc=lan
objectclass: dcObject
objectClass: organization                                                       
o: kens.lan
dc: kens

dn: cn=admin,dc=kens,dc=lan
objectclass: organizationalRole
cn: admin

Now save the file and run:

slapadd -l init.ldiff

Start up slapd.
Running slapcat should now show your entries.

Section B:
Enable the config database in /etc/ldap/slapd.conf by adding the following 3 lines right above the first database definition. Normally the line would read database bdb. Stop slapd prior to doing this.

database config

rootdn "cn=admin,cn=config"
rootpw config

to set the config password to something else type

slappasswd

enter the password twice and then copy the output:
{SSHA}5T+9VFI9cieYZCog8GKY3nDj10RmyUfT
and paste this for the rootpw instead of using config.

cd /etc/openldap
mkdir slapd.d
slaptest -f slapd.conf -F slapd.d
chown -R ldap:ldap *

So you know that slapd.conf is no longer active, rename it to slapd.old.
You should now be able to open up a connection to the container:

cn=config
with username:
cn=admin,cn=config
and password config

From here you will be able to edit the runtime configuration of the ldap server and the changes will be realized as soon as the modification is made, without restarting the server.
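For example, to change the server's log level on the fly you could apply an LDIF like the following with ldapmodify -x -D cn=admin,cn=config -W. Here olcLogLevel is the cn=config attribute corresponding to the old loglevel directive, and "stats" is just an example value:

```
dn: cn=config
changetype: modify
replace: olcLogLevel
olcLogLevel: stats
```

The change takes effect the moment the modify succeeds; nothing needs to be restarted.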

If any of this is too much for you then you can contact a Pantek Engineer at
1-877-Linux-Fix and we will be able to help you.

Setting up system auth via PAM/LDAP on Debian Etch

There are many ways to set up ldap and many versions of ldap that you can use. Each system and version has its differences, so what I’m going to demonstrate here is how to take a stock Debian minimal install and turn it into one that authenticates off ldap. Starting from a totally minimal base install with working network, first edit your sources list to add contrib and non-free.

Edit /etc/apt/sources.list to read

deb http://http.us.debian.org/debian etch main contrib non-free
deb http://security.debian.org/ etch/updates main contrib non-free
deb-src http://http.us.debian.org/debian etch main contrib non-free
deb-src http://security.debian.org/ etch/updates main contrib non-free

Update the system to the current security fixed version of everything by issuing:

apt-get update && apt-get upgrade

If prompted accept all upgrade choices.

Now we’re going to install what I consider essential apps, a reasonable build environment for later, and slapd, libnss-ldap, and libpam-ldap:

apt-get install autoconf automake1.9 bison build-essential bzip2 colordiff ctags \ 
debconf-utils debian-keyring elinks flex gcc-4.1-locales gdb gpm htop ldap-utils \ 
libltdl3-dev libmudflap0-dev libnss-ldap libpam-ldap libtool \
linux-headers-`uname -r` lynx mimedecode mime-support ncftp2 netcat nmap \
openssh-blacklist openssh-client openssh-server psmisc screen slapd ssh sysstat \
sysv-rc-conf telnet telnetd urlview vim vim-scripts

After the applications finish downloading, let’s start editing some files. Replace the default /etc/ldap/slapd.conf with the following:

include         /etc/ldap/schema/core.schema
include         /etc/ldap/schema/cosine.schema
include         /etc/ldap/schema/nis.schema
include         /etc/ldap/schema/inetorgperson.schema
include         /etc/ldap/schema/misc.schema

pidfile         /var/run/slapd/slapd.pid
argsfile        /var/run/slapd/slapd.args
loglevel        0
modulepath      /usr/lib/ldap
moduleload      back_bdb
sizelimit 500
tool-threads 1
backend         bdb
checkpoint 512 30

database        bdb
suffix          "dc=fakedom,dc=dom"
rootdn          "cn=admin,dc=fakedom,dc=dom"
rootpw          (run slappasswd and paste output here)
directory       "/var/lib/ldap"
lastmod         on

access to attrs=userPassword,shadowLastChange
by dn="cn=admin,dc=fakedom,dc=dom" write
by anonymous auth
by self write
by * none

access to *
by dn="cn=admin,dc=fakedom,dc=dom" write
by * read

Then we’re going to modify /etc/nsswitch.conf to look like the following:

# /etc/nsswitch.conf
#
# Example configuration of GNU Name Service Switch functionality.
# If you have the `glibc-doc-reference’ and `info’ packages installed, try:
# `info libc "Name Service Switch"‘ for information about this file.

passwd:         compat ldap
group:          compat
shadow:         compat ldap

hosts:          files dns
networks:       files

protocols:      db files
services:       db files
ethers:         db files
rpc:            db files

netgroup:       nis

Then we replace the default (if it exists) /etc/libnss-ldap.conf with the following:

base dc=fakedom,dc=dom
uri ldap://127.0.0.1
ldap_version 3
rootbinddn cn=admin,dc=fakedom,dc=dom

Then we create /etc/pam_ldap.conf with the following:

host 127.0.0.1
base dc=fakedom,dc=dom
uri ldap://127.0.0.1
ldap_version 3
rootbinddn cn=admin,dc=fakedom,dc=dom
pam_password exop

Then we create the base ldap.conf file /etc/ldap/ldap.conf with the following:

BASE    dc=fakedom,dc=dom
URI     ldap://127.0.0.1

Now we need to use slappasswd to generate a password for our accounts; run it once for the admin account and once for the testy account.

slappasswd

Now the tricky part, and what you don’t find in tutorials on setting this system up: we need to create a base.ldif file to import. Let’s call it /root/base.ldif:

dn: dc=fakedom,dc=dom
objectClass: top
objectClass: dcObject
objectClass: organization
o: fakedom.dom
dc: fakedom

dn: cn=admin,dc=fakedom,dc=dom
objectClass: simpleSecurityObject
objectClass: organizationalRole
cn: admin
description: LDAP administrator
userPassword: (Paste output from slappasswd)

dn: ou=People,dc=fakedom,dc=dom
ou: People
objectClass: organizationalUnit
objectClass: top

dn: uid=testy,ou=People,dc=fakedom,dc=dom
uid: testy
cn: testy
objectClass: account
objectClass: posixAccount
objectClass: top
loginShell: /bin/bash
uidNumber: 10000
gidNumber: 10000
homeDirectory: /home/testy
gecos: Testy,,,,
userPassword: (Paste output from slappasswd)

Then we need to start slapd:

/etc/init.d/slapd restart

Then we need to add our base.ldif information into the database:

ldapadd -x -W -D 'cn=admin,dc=fakedom,dc=dom' -f /root/base.ldif  (enter password when prompted)

Restart slapd:

/etc/init.d/slapd restart

Now to perform a little testing; let’s see if we get our information back for the testy account.

getent passwd | grep testy (should return testy’s entry)

If you get an entry for testy, all is well, and we can start a service that relies on the information in the ldap database, like the telnet daemon we installed earlier:

/etc/init.d/openbsd-inetd start
telnet localhost

and use testy’s login credentials. If it works, you’re set and ready to authenticate any application that uses PAM against ldap!

If it doesn’t, carefully review the above instructions and ensure all steps have been taken in the correct order. A single step out of order can cause the entire setup to fail. Once you have the base setup working, I also suggest you take a look at phpLDAPadmin (http://phpldapadmin.sourceforge.net/wiki/index.php/Main_Page); although it also requires installing a webserver, it makes management and adding new containers and classes much easier than creating ldifs by hand each time.

Setting up Heartbeat for high availability

So you’re interested in having your services be highly available in the case of an outage? Well, the easy way to do that is with a simple heartbeat setup.

The first thing I recommend for anyone attempting to set up heartbeat is to run ntpdate and turn on ntpd against a common local ntp server. If the clocks become seriously out of sync, heartbeat will sometimes misbehave.

The actual installation of heartbeat itself is simple; it is part of the standard distribution in RHEL 5 or CentOS 5. The way we are going to set this up is known as an active/passive configuration: the service is active on only one node of the cluster at a time, and the other node takes over only if the active node becomes unavailable.

yum install heartbeat

Next we need to create some basic files for heartbeat to use.

The first thing we need to set up is ha.cf; it contains all the information on how the cluster should behave: what the nodes are, how to log, and how long before something should be considered dead. There are additional options you can set here, but for simple testing and ease of use these are good starting values:
/etc/ha.d/ha.cf

logfile /var/log/ha-log
logfacility local0
udpport 694
keepalive 5
warntime 10
deadtime 30
initdead 60
bcast eth0
auto_failback on
node server1.domain.tld
node server2.domain.tld

The most significant item here is auto_failback, which means heartbeat will restore services to their preferred servers as soon as things are stable again.

The next thing that needs to be set up is which resources each node is primarily responsible for. In the event of a node failure, the other node will take over operation of those processes if possible, which means the services need to be installed on a shared device, or at least use shared resources or synchronized configurations.
/etc/ha.d/haresources

server1.domain.tld 192.168.1.77 named
server2.domain.tld 192.168.1.78 httpd

It is critical that you use the same name here that you gave the node in ha.cf. If they do not match, “Bad Things Will Happen”(tm). The format is simple: node, IP, service. The IP address should NOT be the main IP address of the machine but an alias used just for this service.

The nodes need to share a pre-generated key that is exactly the same on both; all of these files should contain exactly the same data, otherwise you are likely to have significant problems.
/etc/ha.d/authkeys

auth 2
2 sha1 ThisIsAKeyThatMustMatch

Copy each of these files to the other server’s /etc/ha.d directory, and make sure authkeys is readable only by root (chmod 600), or heartbeat will refuse to start. I also suggest you set up an /etc/hosts entry for each of your nodes in case dns is unavailable. Some people keep the files in sync on both servers by mounting /etc/ha.d from an nfs or samba share at boot; others rsync them on a regular basis or use version control to keep things synced. There are a lot of ways to do it, but at this point let’s take the easiest way and just copy the files from server1.domain.tld to server2. You can accomplish this with the following command, provided you have ssh running on both machines.

scp /etc/ha.d/* root@server2.domain.tld:/etc/ha.d

On both servers you want to start heartbeat and let it take care of starting the daemons it controls or needs to take control of.

chkconfig heartbeat on
chkconfig httpd off
chkconfig named off

Heartbeat itself will take care of starting the daemons if they are not already running. Now shut down both systems and watch as they come back up; if one boots significantly faster than the other, it may immediately acquire the other machine’s resources until that machine is fully online and heartbeat has started to respond. If you have console access you should be able to test failover by downing one of the machines’ Ethernet interfaces.

If everything worked correctly you should now have a redundant machine that will take over control of another machine in the event of a machine failure.

Unix permissions

Many people migrating from Windows are immediately intimidated by Unix permissions … what does chmod 0751 mean? What does chmod g+rw mean? There is no reason to be alarmed or intimidated.

Unix permissions are simple and flexible once you understand the basics.

There are two basic ways of dealing with Unix permissions; the first is numeric. Numeric notation is considered more complex by new administrators but is generally faster for older ones. It also works on some very old versions of Unix where the newer, more readable symbolic notation doesn’t.

Before we can seriously examine permissions we need to understand what we’re looking at, so let’s take a look at a directory listing:

core ~$ ls -al
drwxr-xr-x  2 ralph users 4096 2009-03-31 14:43 .
drwxr-xr-x 16 root users 4096 2009-10-22 09:28 ..
-rw-r--r--  1 ralph users    6 2009-01-28 13:07 file.1
-rwxr-x---  1 ralph users  521 2009-01-28 13:23 test.sh
core ~$

So what does all of this information mean? The left-most column, made up of characters like -rwxd, holds the permissions: the first character is the “special” column, the next three are the owner’s permissions, the next three are the group’s permissions, and the final three are everyone else’s. Next you see the owner and group for the item, then the size of the file, followed by the date and time the file was last modified, and finally the name of the item itself. So for starters, let’s take a good look at file.1.

-rw-r--r-- means there are no special items set on this file; the owner has read (r) and write (w) access, the group has only read (r) access, and everyone else may also read (r) the file.

The only complex part of grasping how this works is understanding that each permission triad is three bits, written as a single octal digit. Execute permissions (x) have a value of 1, write permissions (w) a value of 2, and read permissions (r) a value of 4; a - character means none and is equal to 0. To calculate the numeric permissions you simply add up the values for each triad. So for file.1: owner rw- is 6, group r-- is 4, everyone else r-- is 4, giving 0644, or 644 in shorthand. The leading character in the ls listing is not a permission at all but the file type: a - is a normal file, a d a directory, an l a link. The leading digit in four-digit numeric notation instead holds the special bits (setuid, setgid, and the sticky bit, which you may occasionally see set); it can typically be left off and is then assumed to be zero.
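The arithmetic can be checked right at the shell; the triads for file.1 work out like this:

```shell
# read=4, write=2, execute=1; each triad's digit is the sum of its set bits
r=4; w=2; x=1
owner=$((r + w))   # rw- = 6
group=$((r))       # r-- = 4
other=$((r))       # r-- = 4
echo "${owner}${group}${other}"   # prints 644
```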

Permissions become especially tricky when you consider directories: an r bit on a directory means you can read its contents, a w means you can create, modify, and delete items in it, and an x means you can change into it. So if you don’t want your users to be able to ls the /home directory, you might set it to 711, which allows anyone to change into the directory and access subdirectories they have rights to, but not to get a directory listing.
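A quick sketch you can run in any scratch directory (the file and directory names are made up, and stat -c assumes GNU coreutils):

```shell
# 0644 on a file: owner rw, group r, other r.
touch notes.txt
chmod 0644 notes.txt
stat -c '%a %A' notes.txt    # 644 -rw-r--r--

# 711 on a directory: anyone may cd in (x), only the owner may list it (r).
mkdir -p scratch
chmod 0711 scratch
stat -c '%a %A' scratch      # 711 drwx--x--x
```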

On newer versions of *nix there is also a symbolic notation, ‘chmod who±bits’. For example, ‘chmod u+rw filename’ gives the owner (user) read-write access to the file, while ‘chmod g-rwx filename’ removes all of the group’s access to it.
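The symbolic and numeric views line up exactly, which a short sketch makes visible (file name is made up; stat -c assumes GNU coreutils):

```shell
touch test.sh
chmod 0750 test.sh      # owner rwx, group r-x, other ---
chmod g-rwx test.sh     # strip all group permissions
stat -c '%a' test.sh    # 700
chmod u+rw,g+r test.sh  # owner rw (already present here), group gains read
stat -c '%a' test.sh    # 740
```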

There are also advanced bits which can be set, like setuid and the sticky bit, but those range beyond the scope of this article; they are more rarely used and can easily be dangerous to normal system operations and security.

Reasons for upgrading to OpenLDAP 2.4

After using OpenLDAP for years I decided to research and test the new features of OpenLDAP 2.4. In the past I have always been accustomed to making changes in the slapd.conf file and then restarting the server to realize them; it’s just the way things are done. All that has changed with the new version if you decide to use the cn=config feature. Replication in a master/slave configuration was always the norm when someone wanted a backup ldap server; I have even set up master/slave and slave/master pairs for more redundancy. Now the syncrepl mechanism supports many methods of replication, which makes OpenLDAP 2.4 superior to older versions, and it also has quite a few advantages over Active Directory.

The major advantages of 2.4 are the following:

Performance has been enhanced greatly. The speed of the database is only limited by the memory bandwidth of the machine. Searches are almost instant even on large databases.

The cn=config backend allows you to modify, delete, or add almost all runtime settings and schema where changes take effect immediately, no downtime required.

Syncrepl is greatly enhanced in quite a few ways. Replicating the slapd config is now possible with syncrepl and cn=config, which lets you fully replicate an entire server to one or many other servers. Delta-syncrepl addresses bandwidth concerns, ensures ordered updates, and falls back to plain syncrepl if a consumer loses sync with the log. Push-mode replication is another feature of the syncrepl mechanism, implemented by running a syncrepl consumer on top of back-ldap so changes are pushed to a downstream server. MultiMaster replication supports full N-way replication and conflict resolution.

Some other notable features that are new:

Enhanced TLS configuration allows settings to be configured individually per item, so you can have different certificates for different clients.

There are new overlays: slapo-constraint, slapo-dds, and slapo-memberof, plus new features in existing overlays.

Monitoring of back-*db cache fill-in and of non-indexed searches.

There is now session tracking control, and subtree delete support in back-sql.

Values in multi-valued attributes are sorted for faster matching.

I am quite impressed with the new OpenLDAP 2.4 offerings and feel that there should be many people who are prepared to upgrade to 2.4 and reap the benefits of doing so.

Mail Abuse

Did you just get notified by your ISP that your machine is an open spam relay and your access will be turned off if you don’t shut it down?

Is all of your mail being returned as rejected because you’re blacklisted?

By the time you reach the end of this article you will understand how to keep yourself from becoming blacklisted, and you’ll also know how to implement some basic features that make you a less desirable target for spammers, regardless of which mail transport agent you prefer.

Let’s start with the old standby, Sendmail. Starting with version 8.9, sendmail does not permit relaying of messages by default, so the initial problem typically becomes how to specify which hosts are allowed to relay through your domain. There are two ways to configure sendmail: manually, or using the M4 config generator. All of the features below depend on using M4 to create your sendmail configuration files, because then it’s fairly simple to set up a file (/etc/mail/relay-domains or /etc/mail/access) which lists each host or address that is or is not allowed to relay through your system. You just enter this into the m4 configuration:

FEATURE(relay_hosts_only)
FEATURE(access_db)

This will let you reject, relay, or reject with a specific error messages from any given host or address without making you an open relay.

There are additional features you can use to deter people who are likely trying to send you spam directly also and I’ll briefly explain them.

FEATURE(`greet_pause', `2500')
APPENDDEF(`confENVDEF', `-DPIPELINING=0')

This feature makes the remote client wait a certain amount of time (here 2500 milliseconds) before it is greeted; most spam bots are made to dump their messages as quickly as possible and may not wait at all if they don’t immediately make a connection. The second item turns pipelining off, which means the client must wait for a response to each SMTP command; this has lost some effect in recent years as spambots increasingly obey the various SMTP RFCs.

define(`confMAX_RCPTS_PER_MESSAGE', `25')

What this says is that no single message delivered to this domain may go to more than 25 recipients at once. If you do a lot of mailing lists or company-wide emails with large cc lists, this probably isn’t a good option to use, but for most smaller ISPs and personal domains it works well.

You may be able to further restrict relaying from bad hosts by enabling the following feature, which turns on real-time black hole lists. This hands some control of your mail to a remote administrator, which is often useful for those who don’t have time to keep up on spam reports.

FEATURE(dnsbl)

The last major sendmail feature I consider essential is actually more for denial of service than it is for preventing spam although often one is the result of the other.

define(`confREFUSE_LA', `15')

What this says is, if the load on the machine goes above 15 refuse all connections until the load drops back under 15. Unless you have a machine capable of maintaining a load of 15 and remaining stable I recommend setting this option. If you have a very heavy machine with plenty of ram and lots of processors you might be able to get away with a number higher than 15, but for most sites 15 is well above what they can handle on a sustained basis.

After making these changes, back up your current sendmail.cf, regenerate it from the .mc file with m4 (or make -C /etc/mail on distributions that provide a Makefile), and then restart sendmail to put the changes in place.

Next let’s take a look at Postfix. It is one of the more consistently configured mail agents, and its configuration remains fairly standard across most distributions. The only place you need to make changes is in the main.cf file, and you generally only need to change a few variables to set up safe relaying for your domains if required, or to deny it altogether.

mynetworks_style = host
mynetworks = hash:/etc/postfix/networks
smtpd_recipient_restrictions = permit_mynetworks reject_unauth_destination reject_unknown_recipient_domain

Then run postmap against /etc/postfix/networks after adding the hosts you wish to allow to relay through this host, and restart the postfix daemon.

If you wish to further make your host undesirable for spammers you can set these additional smtpd_recipient_restrictions:

reject_invalid_hostname reject_unauth_pipelining reject_unknown_sender_domain reject_non_fqdn_recipient

For the most part these function exactly as you’d expect: they disallow using a non-existent hostname or a domain for which there are no records, they require mail be delivered to and from a fully qualified domain name, and they turn off unauthorized pipelining. Combined with the earlier settings that keep you from being an open relay, these make you a much less desirable spam target.

The one additional spam-prevention measure you can take is setting up an rbl, or several. This is done by adding the line

reject_rbl_client url.of.rbl

for each of the rbl’s you wish to use to the smtpd_recipient_restrictions line.
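Putting the pieces from this section together, the resulting main.cf restriction list might look like this (zen.spamhaus.org is only an example rbl; substitute the lists you trust):

```
smtpd_recipient_restrictions =
    permit_mynetworks
    reject_unauth_destination
    reject_unknown_recipient_domain
    reject_invalid_hostname
    reject_unauth_pipelining
    reject_unknown_sender_domain
    reject_non_fqdn_recipient
    reject_rbl_client zen.spamhaus.org
```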

The last of the three major mail servers to discuss is Exim, and this one is particularly thorny because Exim is set up significantly differently across distributions, and even inside the same distribution there may be multiple ways it can be set up (in Debian/Ubuntu’s case). This makes it hard to give a single example that always works across the board. Exim’s relaying is controlled by its ACLs, which can become fairly complex. By default Exim doesn’t allow relaying, and in some cases without modification it won’t even accept mail.

The first item of note is the behavior of the various access control lists; for example, acl_smtp_rcpt *must* be set or no mail will be accepted over smtp at all.

The next major area to look at is the domain list, to see what you’re accepting mail for locally and what you’re willing to relay for. Unless you have a specific reason to allow relaying (you’re acting as a secondary mta), the relay_domains directive should not be set or should be empty, and relay_hosts should only be set if you need to allow relaying from, say, a particular subnet, much like sendmail.

By default Exim’s philosophy is flexibility, and its ACLs provide a lot of opportunity for deterring spammers. You can do things as simple as

deny host = 1.2.3.4

or

verify = helo

or even so complex that they require their own configuration files and external programs.
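For reference, conditions like these live inside an ACL that acl_smtp_rcpt points at. A minimal sketch follows; the ACL name and IP address are illustrative, and it assumes a domainlist named local_domains is defined (your distribution’s default config usually provides one):

```
acl_smtp_rcpt = acl_check_rcpt

begin acl

acl_check_rcpt:
  deny    hosts   = 1.2.3.4
  require verify  = helo
  accept  domains = +local_domains
  deny    message = relay not permitted
```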

Exim of course supports implementing black holes, the format for defining what black holes to use is

deny dnslists = url.of.rbl

in the case of blacklists which expect the inverted IP appended to their domain name (e.g. blackholes.mail-abuse.org), or alternately it can use the format

deny dnslists = url.of.rbl/$sender_address_domain

for blacklists that are based on lookups of the domain directly (e.g. dsn.rfc-ignorant.org).

That covers the basic setup on all three MTAs for preventing your mail server from becoming an open relay and sending mail that might inadvertently get you blacklisted. But did you know there are other ways to get blacklisted? For example, some blacklists will add you for not having an “abuse@yourdomain.com” or “postmaster@yourdomain.com” account; the SMTP RFCs say those accounts must exist and must be read by a human. There are blacklists for backscatter, the flood of bounce messages generated when an address at your domain is forged as the sender of spam that gets accepted at a remote domain but can’t be delivered; if the spam run is wide enough, the bounces can amount to a denial of service against the forged domain’s server. There are literally blacklists for every violation of the RFCs imaginable, and it’s fairly important to stay off the major blackhole lists if you want your mail delivered.

You can perform a multi-rbl check at several sites:
http://checker.msrbl.com/
http://www.anti-abuse.org/multi-rbl-check/
http://multirbl.valli.org/

Please remember that if your domain or ip range shows up as being blacklisted by any significant number of the rbl’s checked by those sites, you will have trouble getting your mail delivered in a timely manner if at all.

In conclusion, mail is a simple but flexible protocol with a wide variety of transfer agents and configuration methodologies, but if you want your mail to flow cleanly it is critical that you act as a good internet neighbor, prevent spam from originating at your server, and obey the RFCs.

Photoshop holding you back from going opensource?

So you’re thinking about moving to Linux but you hate to give up your good photo editing program. Well, let me tell you about GIMP (the GNU Image Manipulation Program). GIMP is written and developed under X11, but essentially the same code also runs on MS Windows and Mac OS X.

It’s easy to use, it has most of the functionality of Photoshop, and best of all, it’s free!

GIMP’s current version, 2.6, is a streamlining of the old software, and the expected changes for version 2.8 look even better. Its customizable user interface makes it ideal for people who want things to look a certain way: you can change anything from icon size to recoloring widgets, and with the docking feature you can move things into tabs or leave them spaced out to your preference. It also has a full-screen mode if you prefer to get the most from your workspace.

Some of the more notable features of GIMP 2.6 are the reworked photo-enhancing capabilities, digital retouching, and broad hardware support. You can save files in everything from the common JPG, TIFF and GIF to more specialized formats such as multi-resolution, multi-color-depth Windows icon files. The architecture allows you to extend GIMP’s format support with plug-ins, and you can find some rare format support in the handy GIMP plugin registry. Because GIMP features a transparent virtual file system, it is possible to load and save files to and from remote locations using protocols such as FTP, HTTP or even SMB (MS Windows shares) and SFTP/SSH. To save disk space, any format can be saved with an archive extension such as ZIP, GZ or BZ2, and GIMP will automatically compress the file without any extra steps!

Not only does GIMP let you manipulate and edit pre-existing images; you can also create new images from a set of templates, or get advanced and specify your own image size (in inches, pixels, millimeters, points, picas and more), adjust the X/Y scale, and customize your color format (RGB, grayscale, etc.). Once you’ve got a blank image to work with, you can create layers with text or your own images.

One of the most appealing features of GIMP is that it works on almost any operating system you might be using, so once you learn it you have a very versatile tool you can use anywhere. A short list of places GIMP runs: almost any Linux, Unix, or BSD with a GUI, any version of Windows newer than XP, and Mac OS X.

GIMP also provides a handy Help menu with direct links to just about everything you could want to help you learn the program, including the incredibly useful plug-in registry, documentation, tutorials, more advanced features, and easy-to-read how-tos.