Linux BLog: June 2015

Tuesday 23 June 2015

Migratepv and Replacepv

What is the difference between migratepv and replacepv?

The replacepv command simply moves all the logical partitions on one physical volume to another physical volume. The command is designed to make it easy to replace a disk in a mirrored configuration.

The migratepv command is very similar.

The biggest difference is that migratepv allows you to copy the LPs on a logical volume basis, not just on a physical volume basis. For example, if you have a disk that has two logical volumes on it and you want to reorganize and put each logical volume on a different disk, migratepv can do it.

migratepv -l lv01 hdisk1 hdisk2
migratepv -l lv02 hdisk1 hdisk3

In this case, the logical partitions from logical volume lv01 are moved from hdisk1 to hdisk2.
The logical partitions from logical volume lv02 are moved to hdisk3.

Changing default Gateway in Linux SUSE

To change the default route permanently in SUSE Linux, add an entry to the /etc/sysconfig/network/routes file.

For example, to make 192.168.2.1 the default route, add the following line to /etc/sysconfig/network/routes:

default 192.168.2.1 - -

Using route command:

To route all the traffic via 192.168.1.1 gateway connected via eth1 network interface:

# route add default gw 192.168.1.1 eth1

To view routes configured:

# netstat -rn
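To confirm which gateway the routes file will set at the next network restart, the default entry can be read back from the file. This is a minimal sketch; the function name default_gateway is hypothetical, and it assumes the gateway is the second column of the "default" line, as in the example entry above.

```shell
# default_gateway: read a SUSE-style routes file on stdin and print the
# gateway (second column) of the first line whose destination is "default".
default_gateway() {
    awk '$1 == "default" { print $2; exit }'
}
```

Usage: default_gateway < /etc/sysconfig/network/routes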

Network Card changes to '__tmpxxxx' instead of ethX after reboot in Linux

We faced a strange issue recently and would like to share it with LazySystemAdmin readers.

Problem / Issue:

After upgrading the kernel, we rebooted the server. After the reboot, some of the Ethernet network cards (NICs) were renamed to '__tmpxxxx' instead of ethX. The interfaces kept coming up as '__tmpxxxx' even after two more reboots of the server.

The "ifconfig -a" output showed the invalid interface names after the reboot:

root:testsrv1# ifconfig -a | grep HW
__tmp1428126851 Link encap:Ethernet HWaddr 1C:C1:DE:72:4D:53
__tmp1516900339 Link encap:Ethernet HWaddr 1C:C1:DE:72:4D:52
__tmp1854964292 Link encap:Ethernet HWaddr 78:E7:D1:FB:B1:2F
__tmp1950613216 Link encap:Ethernet HWaddr 78:E7:D1:FB:B1:2E
bond0 Link encap:Ethernet HWaddr 1C:C1:DE:72:4D:50
eth0 Link encap:Ethernet HWaddr 1C:C1:DE:72:4D:50
eth1 Link encap:Ethernet HWaddr 68:B5:99:B4:9F:E8
eth4 Link encap:Ethernet HWaddr 1C:C1:DE:72:4D:50
eth5 Link encap:Ethernet HWaddr 1C:C1:DE:72:4D:51

In the backup files taken before the reboot, the network cards looked like this:

root:testsrv1# cd /root/backup-testsrv1-24-05-12/
root:testsrv1# grep HW network-interfaces.24-05-12
bond0 Link encap:Ethernet HWaddr 1C:C1:DE:72:4D:50
eth0 Link encap:Ethernet HWaddr 1C:C1:DE:72:4D:50
eth1 Link encap:Ethernet HWaddr 68:B5:99:B4:9F:E8
eth2 Link encap:Ethernet HWaddr 78:E7:D1:FB:B1:2E
eth3 Link encap:Ethernet HWaddr 78:E7:D1:FB:B1:2F
eth4 Link encap:Ethernet HWaddr 1C:C1:DE:72:4D:50
eth5 Link encap:Ethernet HWaddr 1C:C1:DE:72:4D:51
eth6 Link encap:Ethernet HWaddr 1C:C1:DE:72:4D:52
eth7 Link encap:Ethernet HWaddr 1C:C1:DE:72:4D:53

Why does this happen? What is the root cause?

It is because of the behavior of udev. udev does not load modules sequentially; it loads them in parallel. If the machine has multiple network drivers, the Ethernet device ordering is therefore non-deterministic.

So the "HWADDR=" parameter is required in the ifcfg files to pin each interface name to a specific card.

How to fix this issue?

To prevent this from occurring, set the "HWADDR=" parameter in /etc/sysconfig/network-scripts/ifcfg-ethX. Add the HWADDR line to each ifcfg-eth* file (or remove any "#" at the start of that line if it is commented out) and restart the network service.

# vi /etc/sysconfig/network-scripts/ifcfg-ethX
# service network restart
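The DEVICE/HWADDR pairs for the ifcfg files can be drafted from a saved "ifconfig -a" listing such as the backup shown earlier. This is only a sketch: the function name hwaddr_lines is hypothetical, and it assumes the old net-tools output format where the MAC address is the last field of the "HWaddr" line.

```shell
# hwaddr_lines: read saved "ifconfig -a" output on stdin and print one
# "DEVICE=... HWADDR=..." pair for every ethX line that carries a HWaddr.
hwaddr_lines() {
    awk '$1 ~ /^eth[0-9]+$/ && /HWaddr/ {
        printf "DEVICE=%s HWADDR=%s\n", $1, $NF
    }'
}
```

Usage: hwaddr_lines < network-interfaces.24-05-12

Review the result before copying it into the ifcfg files. Interfaces enslaved to a bond report the bond's MAC address in ifconfig (eth0 and eth4 above both show the bond0 address), so for those the permanent address has to be taken from elsewhere, for example the "Permanent HW addr" lines in /proc/net/bonding/bond0.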

Kblockd Process - High Utilization in Linux

I came across a situation where multiple kblockd processes were using the CPU heavily and driving up the load on Linux servers. I wondered what kblockd is and why it uses so much CPU, but there is not much information about it on the internet. After a lot of research, below is what I learned.

What is kblockd?

In general, the kblockd kernel threads are responsible for performing low-level disk operations.

Why are kblockd processes using the CPU heavily and causing server load?

High utilization of these threads could indicate that the server's I/O queue is backed up and the server cannot complete its disk writes quickly enough. Most of the time, at that point, the SD driver fails the I/O and returns the failure to EXT3, which then aborts the journal for safety reasons. If this happens in the middle of a transaction, the transaction is abandoned and a rollback is attempted. The kblockd activity can also be a symptom of the server running low on memory and starting to fail normal kernel memory allocations.

Most of the time this is either a sign of pathological behavior by the kernel or merely a symptom of an overloaded server, depending on the server's workload and hardware. Please note that there is always a potential for hangs when something can't allocate memory.

What can be done to resolve this issue?

For now, I have not found any immediate fix for this issue. However, keeping your kernel version and block device driver modules up to date might help. Upgrading the Linux server to the latest available service pack level is also recommended, as it might fix known bugs in earlier versions.

Disable Ping on Linux server

How do you disable ping responses on a Linux server? Here are the quick steps:

To disable ping:

echo 1 > /proc/sys/net/ipv4/icmp_echo_ignore_all

To enable ping:

echo 0 > /proc/sys/net/ipv4/icmp_echo_ignore_all

That's all..!
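The echo commands above only last until the next reboot. One way to make the setting permanent is to record it in a sysctl configuration file. The helper below is a sketch under assumptions: the name persist_sysctl is hypothetical, and /etc/sysctl.conf as the default file is assumed to be where your distribution reads boot-time sysctl settings.

```shell
# persist_sysctl KEY VALUE [FILE]: make "KEY = VALUE" the only assignment
# of KEY in FILE (default: /etc/sysctl.conf, an assumed location).
persist_sysctl() {
    key=$1
    value=$2
    file=${3:-/etc/sysctl.conf}
    [ -f "$file" ] || : > "$file"
    tmp=$(mktemp) || return 1
    # Keep every line that does not assign KEY, then append the new value.
    awk -F= -v key="$key" '{ k = $1; gsub(/[ \t]/, "", k) } k != key' "$file" > "$tmp"
    printf '%s = %s\n' "$key" "$value" >> "$tmp"
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

Usage (as root): persist_sysctl net.ipv4.icmp_echo_ignore_all 1, followed by "sysctl -p" to load the file immediately.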

Reset Failed Login Count in Linux

Depending on the PAM configuration of the Linux server, the Pluggable Authentication Module (PAM) may lock an account after too many failed login attempts.

To check the failed login attempts and see whether the counter needs to be reset, type faillog -u <username>:

root@testsrv:~ # faillog -u user1
Username Failures Maximum Latest
user1 15 0

Reset the counter with the -r flag:

root@testsrv:~ # /usr/bin/faillog -r -u user1
Username Failures Maximum Latest
user1 0 0

If you are root but cannot become the user with su, you also need to reset the pam_tally login counter:

root@testsrv:~ # su - user1
su: incorrect password
root@testsrv:~ # /sbin/pam_tally --user user1 --reset
User user1 (672) had 34
root@testsrv:~ # su - user1
user1@testsrv:~ $
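To find which accounts need a reset in the first place, the faillog listing can be filtered by failure count. This is a rough sketch: the function name users_over_limit is hypothetical, and it assumes the column layout shown above (a header line, then the user name in column 1 and the failure count in column 2).

```shell
# users_over_limit MIN: read faillog output on stdin and print the users
# whose failure count (column 2) is at least MIN. The header is skipped.
users_over_limit() {
    awk -v min="$1" 'NR > 1 && $2 + 0 >= min + 0 { print $1 }'
}
```

Usage: faillog -a | users_over_limit 5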

File System Extension on live Linux VMware Guest using vmdisk size extended

This article explains how to extend a filesystem on a live Linux VMware guest when the existing virtual disk has been enlarged rather than a new disk added.

We had a scenario as follows:

1. The file system had to be extended while mounted and live, without a reboot.

2. It was a Linux guest on VMware that required a FS extension from 600 GB to 900 GB. The FS was on a single 600 GB disk, /dev/sdb.

3. When assigning storage, the team increased the underlying disk to 900 GB rather than adding a new disk.

4. Even after the extension, /dev/sdb was not picking up the additional 300 GB of space. [ rescan or partprobe did not help here ]

Note: The same procedure applies when the underlying partition has been changed ( using fdisk ).

The following steps were taken to make the kernel recognize the new partition structure and to extend the filesystem.

First we verified the disk sizes and allocations

# pvs
# vgs
# lvdisplay -m /dev/vg_name/lv_name [ to get the underlying block devices ]

Next we had the partition table re-read for the underlying block device:

# blockdev --rereadpt /dev/sdb
OR
# sfdisk -R /dev/sdb

Note that if you are doing this on a physical machine with multipath involved, you need to re-read the partition tables of all the underlying disks.

Now that the partition table has been re-read, the PV needs to be resized to the new disk size; otherwise it will still show the old size.

# pvresize /dev/sdb

Check pvs / vgs output to see whether the new size is detected:

# pvs
# vgs

Once you have the new size detected, you can use the standard procedure to extend the filesystems

# lvextend -L +300G /dev/vg/lv
# resize2fs /dev/vg/lv

Check whether the new file systems are showing the correct sizes:

# df -h
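The sequence above can be collected into a small helper that prints the commands for a given disk and logical volume so they can be reviewed before anything runs. This is a sketch only: the function name extend_plan is hypothetical, it executes nothing itself, and it assumes an ext filesystem on the LV (hence resize2fs, as in the steps above).

```shell
# extend_plan DISK LV SIZE: print, in order, the commands used above to
# pick up a grown disk and extend one logical volume. Nothing is executed.
extend_plan() {
    disk=$1
    lv=$2
    size=$3
    printf '%s\n' \
        "blockdev --rereadpt $disk" \
        "pvresize $disk" \
        "lvextend -L +$size $lv" \
        "resize2fs $lv"
}
```

Usage: extend_plan /dev/sdb /dev/vg/lv 300G prints the four commands; check the pvs / vgs output between steps as described above rather than running them blindly.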

[Screenshots of the entire activity, performed on a test VM with a test VG and LV created for the purpose: verify the current size of the mounted volume; check the available space on the underlying disk(s); increase the size of the VMware disk in the virtual machine settings in vCenter rather than adding a new disk; make the new size/partition visible on the system without a reboot or taking the volume offline; extend the LV; resize the FS.]

Quick HOWTO : Flush DNS Cache in Linux

nscd (Name Service Cache Daemon) provides a caching service for name service requests in Linux.

To configure the nscd caching service, edit /etc/nscd.conf

To flush the DNS cache on a Linux server, restart nscd with either of the following:

# /etc/init.d/nscd restart

# service nscd restart

Hope this helps..

Quick How To : Reduce SWAP partition Online without reboot in Linux

Recently I had a request to reduce the swap space and allocate that space to another LV on one of our servers. Below is what I followed, and it worked perfectly for me. :)

Make sure you have enough physical memory to hold the swap contents.
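That check can be made from /proc/meminfo before running swapoff. The sketch below compares the swap currently in use with the memory the kernel reports as available. The function name swap_fits_in_ram is hypothetical, and it assumes the usual meminfo field names; on older kernels without MemAvailable it falls back to MemFree, which is a more pessimistic figure.

```shell
# swap_fits_in_ram: read /proc/meminfo-style text on stdin, print the swap
# in use and the available memory (both in kB), and return 0 only when the
# swapped-out data would fit in available memory.
swap_fits_in_ram() {
    awk '
        $1 == "MemAvailable:" { avail = $2; have_avail = 1 }
        $1 == "MemFree:"      { memfree = $2 }
        $1 == "SwapTotal:"    { stotal = $2 }
        $1 == "SwapFree:"     { sfree = $2 }
        END {
            if (!have_avail) avail = memfree
            used = stotal - sfree
            printf "swap_used_kb=%d mem_available_kb=%d\n", used, avail
            exit (used > avail)
        }'
}
```

Usage: swap_fits_in_ram < /proc/meminfo && echo "safe to swapoff"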

Now, turn the swap off:

# sync

# swapoff <YOUR_SWAP_PARTITION>

Now check the status

# swapon -s

Then, use the fdisk command:

# fdisk <YOUR_HARDDISK_Where_SWAP_Resides>

List partitions with "p" command
Find and delete your swap partition with "d" command
Create a smaller Linux-Swap partition with "n" command
Make sure it is a Linux-Swap partition (type 82) (Change with "t" command)
Write partition table with "w" command

Run "partprobe" to make the kernel re-read the partition table. (It is very important to do this before proceeding further.)

Then,

mkswap <YOUR_NEW_SWAP_PARTITION>

swapon <YOUR_NEW_SWAP_PARTITION>

Check to make sure swap is turned on:

swapon -s

Now you can use your free space to increase space for other Logical volumes (LV).

Use fdisk command to create new partition, then

# partprobe

# pvcreate <NEW_PARTITION_YOU_CREATED>

# vgextend <VG_TO_INCREASE> <YOUR_NEW_PV>

# lvextend -L +SIZE_TO_INCREASE <LV_NAME>

Note: It is extremely important to sync and turn the swap off before you change any partitions. If you forget to do this, YOU WILL LOSE DATA!!

Clean reboot of hung Linux Server

In day-to-day system administration, you may come across a situation where your Linux server hangs or freezes and does not respond even to Ctrl+Alt+Del on the console, so that you have to do a hard reboot by pressing the reset button. As everyone knows, hard reboots are not good and can corrupt the file systems. So what to do now?

There is a way in Linux.

Hold down the Right Alt and SysRq keys and press this sequence:

R E I S U B

This will terminate the processes, sync the disks, remount the file systems read-only and reboot your machine cleanly.

Of course, for this to work, you need to enable the feature on the running kernel first!

On 2.6 kernel

echo 1 > /proc/sys/kernel/sysrq

This will do the trick.

In some distributions, you can enable this feature at boot time.

On Fedora and RHEL, edit the file /etc/sysctl.conf and change the line kernel.sysrq = 0 to kernel.sysrq = 1

Hope this helps someone..!

Troubleshooting Linux Server Performance issues

System admins often get complaints that servers are responding slowly and performing poorly. One of the main reasons for this is that the server is heavily loaded, that is, overloaded.

How do you troubleshoot these server performance issues in Linux?

There can be a number of reasons for high load on the server such as,

Inadequate RAM/CPU
Slower Hard disk drives
Unoptimized software applications / Modules

In this article, I am going to explain how to identify the bottleneck and where you need to focus.

1) First, let's check the server load:

First let us look at the server load. You can execute the "uptime" command to find the current load, but the "top" command is a better choice. Top also shows how many CPUs are being reported; you should be able to see something like cpu00, cpu01, etc.

A load of ~1 for each CPU is reasonable. More than 1 for each CPU indicates that processes are waiting on resources like CPU, memory or I/O. The higher the value, the more processes are queued for resources and the more heavily loaded your server is. For example, you're fine with a load of 7.80 if you have 8 CPUs.

Another thing to consider while looking at the load via uptime or top is to understand what it shows.

15:33:35 up 180 days, 5:17, 6 users, load average: 8.76, 6.77, 5.42

The first value (8.76) is the load average over the last 1 minute, while the second (6.77) and third (5.42) are the averages over the last 5 and 15 minutes respectively. Since the most recent value is the highest, this is probably a spike; let's look further.

Are you comfortable with your server's load? Sometimes servers can handle much more load than the figure suggests. Load averages are not that accurate after all and cannot always be the ultimate deciding factor. Move ahead if your load is something to worry about.

Note: A P4 CPU with HT (Hyper-Threading) technology is reported as 2 CPUs in top, even though the server has one physical CPU. For example, on a server with 4 physical HT CPUs, top reports 8 CPUs.
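The rule of thumb of roughly 1 per CPU can be turned into a quick calculation. The sketch below divides a load average by the CPU count and flags anything above 1; the function name load_per_cpu and the labels it prints are hypothetical, and the CPU count should be the number of CPUs the system reports (including HT siblings, as noted above).

```shell
# load_per_cpu LOAD NCPU: print the load per CPU (two decimals) followed
# by "overloaded" when it exceeds 1, otherwise "ok".
load_per_cpu() {
    awk -v load="$1" -v cpus="$2" 'BEGIN {
        per = load / cpus
        status = (per > 1) ? "overloaded" : "ok"
        printf "%.2f %s\n", per, status
    }'
}
```

Usage on a live system: load_per_cpu "$(cut -d' ' -f1 /proc/loadavg)" "$(getconf _NPROCESSORS_ONLN)"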

2) Check for RAM Memory:

Note: Perform the checks multiple times, to reach a fine conclusion.

# free -m

The output should look similar to this:

# free -m
             total       used       free     shared    buffers     cached
Mem:          1963       1912         50          0         28        906
-/+ buffers/cache:        978        985
Swap:         1027        157        869

Look at the output. Don't panic that almost all the RAM is used up. The "-/+ buffers/cache" row shows that 985 MB of RAM is still free once buffers and cache, which the kernel can reclaim, are excluded. As long as that figure is comfortable and your server is not using much swap, you're fine on RAM.

Whenever the server does not have enough memory to hold all the application processes and data, it starts to use swap, which is part of your disk mapped as memory. Swap is comparatively very slow and can slow your system down further. Keep in mind: the higher the swap usage, the slower your system.

At least 200 MB free after buffers/cache and not more than 200 MB of swap usage is good.
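The two figures to compare against those thresholds can be pulled out of the free output directly. This sketch assumes the older procps layout shown above (with separate "buffers" and "cached" columns); newer versions of free print a different set of columns, and the function name mem_summary is hypothetical.

```shell
# mem_summary: read old-format "free -m" output on stdin and print the
# memory that is free or reclaimable (free + buffers + cached) and the
# swap in use, both in MB.
mem_summary() {
    awk '
        $1 == "Mem:"  { reclaimable = $4 + $6 + $7 }
        $1 == "Swap:" { swap_used = $3 }
        END { printf "reclaimable_free_mb=%d swap_used_mb=%d\n", reclaimable, swap_used }'
}
```

Usage: free -m | mem_summary

For the sample output this sums to 984 MB, one less than the 985 in the "-/+ buffers/cache" row, most likely because free rounds each column separately.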

If you find that RAM is the issue, look at the top output to see which application process is using the most memory. If your application is taking too much memory, you should probably look into optimizing the application and its related scripts.

Alternatively, you can increase the RAM as well.

3) Check if I/O (input/output) usage is excessive

If there are too many read/write requests on a single hard disk drive, it will become slow and you'll have to upgrade to a faster drive (with higher RPM and more cache). The alternative is to split the load across multiple drives by spreading the data using RAID. To identify whether you have I/O issues:

# top

Read the value under the "iowait" section (in some cases %wa) for each CPU. Ideally, it should be near 0%. If you see a higher value, which can happen during a load spike, recheck it several times before drawing a conclusion. Anything above 15% is bad. Next, you can check the speed of your hard disk drive to see if it's really lagging.

Use the "df -h" command to check which drive your data/filesystem resides on.

# hdparm -Tt /dev/sda

The output:

/dev/sda:
Timing cached reads: 1484 MB in 2.01 seconds = 739.00 MB/sec
Timing buffered disk reads: 62 MB in 3.00 seconds = 20.66 MB/sec

The cached reads look great, but that figure only measures reads served from the Linux buffer cache in memory, without touching the disk. The buffered disk reads, which do hit the disk, are at just 20.66 MB/sec. Anything below 25 MB/sec is something you should worry about.
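The comparison against the 25 MB/sec rule of thumb can be scripted by pulling the rate out of the "buffered disk reads" line. This is a sketch; the function name disk_read_check is hypothetical, the threshold defaults to the 25 MB/sec figure used in this article, and the parsing assumes the output layout shown above.

```shell
# disk_read_check [MIN]: read "hdparm -Tt" output on stdin and print the
# buffered disk read rate followed by "slow" if it is below MIN MB/sec
# (default 25), otherwise "ok".
disk_read_check() {
    awk -v min="${1:-25}" '/buffered disk reads/ {
        rate = $(NF - 1)
        verdict = (rate + 0 < min + 0) ? "slow" : "ok"
        printf "%s MB/sec %s\n", rate, verdict
    }'
}
```

Usage: hdparm -Tt /dev/sda | disk_read_check

Run it a few times on an otherwise idle system, since a single measurement can vary.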

4) Check the CPU consumption:

# top

Check the top output to find out if you're using too much CPU power. Look at the idle value beside each CPU entry. Anything below 45% is something you should really worry about. As a next step, look at the %sy, %us, %id and %wa values as well. The top output also shows which process is using the most CPU.

In this example, the problem was I/O usage and a slow hard disk. We need to upgrade to a faster drive or implement a RAID kind of solution.

Troubleshooting can never be covered completely in one article, and no article can teach you everything you need to reach expert level. You need to keep learning.

Hope this helps..! Happy Troubleshooting..!