Archive for March, 2008

25
Mar

Why I love strace

Strace is a tool that belongs in every system administrator's toolbox. Not only does it help with troubleshooting simple problems (e.g. missing libraries in a newly created chroot, which ldd mysteriously fails to report), it also helps with debugging very complex system problems and performance issues.

Recently I ran into a very strange problem with one of our RHEL 3 servers. SSH and su logins hung, other daemons also hung during startup, and the only way to reboot or shut down the server was to physically press the reset or power button. All of this could have been caused by problems at either the software or the hardware level. The first suspect was a bad RAID controller, but testing showed this to be a false lead. After more tests and brainstorming, hardware problems were definitely ruled out, so the problem had to be on the software side. But what could it be?

After a few more false leads I tried tracing the system calls made by the su command and found very interesting results.

$ strace -f -s 1024 -o /tmp/su.strace.out su -
[-- cut --]
3138 open("/dev/audit", O_RDWR) = 3
3138 fcntl64(3, F_GETFD) = 0
3138 fcntl64(3, F_SETFD, FD_CLOEXEC) = 0
3138 ioctl(3, 0x801c406f

This is where the strace output ends and the su command hangs. The audit device file is opened (file descriptor 3), and as soon as the first request is dispatched to this device (an ioctl system call on file descriptor 3) the command freezes. This suggested that disabling audit on the server would make the problem go away.
As a test, I temporarily stopped the audit daemon and tried to switch to another user, and the problem was indeed gone.
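
Reading the tail of the trace to find the call that never returned can also be scripted. Here is a minimal Python sketch, assuming the "pid syscall(args) = result" line layout that strace -f -o writes. The function name find_unfinished_calls is made up for illustration, and the parsing is deliberately naive (it does not try to cope with every kind of line strace can emit).

```python
import re

# "<pid> <syscall>(" at the start of a line marks the beginning of a traced call.
CALL_RE = re.compile(r"^(\d+)\s+(\w+)\(")
# "<pid> <... syscall resumed>" marks the completion of an interrupted call.
RESUMED_RE = re.compile(r"^(\d+)\s+<\.\.\. (\w+) resumed>")
# A finished call ends its argument list with ")" followed by "= <result>".
RESULT_RE = re.compile(r"\)\s+= ")


def find_unfinished_calls(lines):
    """Return {pid: line} for calls that were started but never returned."""
    pending = {}
    for raw in lines:
        line = raw.rstrip("\n")
        resumed = RESUMED_RE.match(line)
        if resumed:
            # The earlier "<unfinished ...>" call for this pid has completed.
            pending.pop(resumed.group(1), None)
            continue
        call = CALL_RE.match(line)
        if not call:
            continue  # signal notices, exit notices, blank lines
        pid = call.group(1)
        if RESULT_RE.search(line):
            pending.pop(pid, None)   # call returned normally
        else:
            pending[pid] = line      # no result yet: candidate for a hang
    return pending
```

Fed the lines of /tmp/su.strace.out, it should report the ioctl on file descriptor 3 as the one call still pending for process 3138.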

After searching for similar problems with the audit daemon I found an article in the Red Hat knowledge base describing exactly the same issue (http://kbase.redhat.com/faq/FAQ_79_6169.shtm).
From the article:

When the free space in the filesystem holding the audit logs is less than 20%, the above notify command will error out and auditd will enter suspend mode. This causes all system calls to block.

So this behaviour is not a bug but an actual feature of the software. :o) From a security point of view it is expected behaviour: an attacker could fill up the filesystem where the audit logs are stored before the attack, and audit would be disabled, leaving no logs of his activity. So it is better not to allow ANY activity on the system if audit is unable to write to its logs. Still, this kind of behaviour renders the system completely useless to legitimate users.
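
A condition like this is easy to watch for before it bites. The following is a rough Python sketch of such a check, using the 20% figure quoted from the article. The function names are hypothetical, the directory to check is an assumption (the post does not say where the audit logs live on that server), and auditd may well compute free space differently than shutil.disk_usage does.

```python
import shutil


def free_fraction(path):
    """Fraction (0.0 to 1.0) of the filesystem holding `path` that is free."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0  # pseudo-filesystems can report a total size of zero
    return usage.free / usage.total


def audit_space_low(path, threshold=0.20):
    """True when free space drops below the threshold (20% by default)."""
    return free_fraction(path) < threshold
```

Called from cron as something like audit_space_low("/var/log/audit") (path assumed), it could send a warning well before logins start to hang.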

The topic of this post is not audit, so I will stop here. The important thing is that strace led us directly to the source of the problem. Resolving issues like this would be much more complex and time-consuming without this great little tool. :)

09
Mar

Setting Pidgin Status with Python or How to Waste a Perfectly Good Saturday

I was very bored today. Tired from working on Ratuus (don’t go there, the site is under heavy construction :)), I needed something to take my mind off everything. And what better way to do that than playing with Python, Pidgin and D-BUS? :D

To cut a long story short, I needed something that would update my Pidgin status message with information about the song I am currently listening to. Until recently I was using the Rhythmbox player, and there is a perfect little Pidgin plugin called Current Track that works with it. Last week I discovered gmusicbrowser and fell in love immediately. It is fast and rich in features but still simple to use. Exactly what I want from an audio player. (Hm, I just noticed it is written in Perl. Now that Python is used for everything, this comes as a big surprise.)

gmusicbrowser already has a plugin called NowPlaying, which runs a command whenever the song changes. I just needed to write the command that would inform Pidgin about the change. So this seemed like a perfect exercise for a slow Saturday. :)

A quick search on Pidgin and D-BUS turned up extensive documentation on the Pidgin API accessible through D-BUS. There is even a working example of how to change the status message! :)
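
For the curious, changing the status message over D-BUS looks roughly like the Python sketch below. It is not the script from this post. The service name, object path and PurpleSavedstatus* method names are the ones I know from the Pidgin D-BUS documentation, so check them against your Pidgin version. The helper names connect_to_pidgin and set_status_message are invented here, and the connection part assumes dbus-python is installed and Pidgin is running.

```python
def connect_to_pidgin():
    """Return a proxy for Pidgin's D-BUS interface.

    Assumes dbus-python is installed and Pidgin is running in this session.
    """
    import dbus  # imported lazily so the module loads without dbus-python

    bus = dbus.SessionBus()
    obj = bus.get_object("im.pidgin.purple.PurpleService",
                         "/im/pidgin/purple/PurpleObject")
    return dbus.Interface(obj, "im.pidgin.purple.PurpleInterface")


def set_status_message(purple, message):
    """Keep the current status type (Available, Away, ...) and swap the text."""
    current = purple.PurpleSavedstatusGetCurrent()
    status_type = purple.PurpleSavedstatusGetType(current)
    # An empty title creates a transient saved status.
    status = purple.PurpleSavedstatusNew("", status_type)
    purple.PurpleSavedstatusSetMessage(status, message)
    purple.PurpleSavedstatusActivate(status)
```

With Pidgin running, set_status_message(connect_to_pidgin(), "Some message") should do the trick. Passing the proxy in as an argument keeps the function easy to try out with a stand-in object.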

But that was too simple, so I got another idea. Some time ago I wrote a small daemon in C that binds to a specific port and displays a random bofh-excuses fortune message when someone telnets to it. (Seems like I have a lot of spare time. I should really find a hobby!) Something similar to telnet bofh.jeffballard.us 666 (here for more information). So I was thinking about implementing the same thing for my Pidgin status. Random BOFH excuses in your status message! How cool geeky is that!

The result of all that is a short (~60 lines of code) Python script that will set your Pidgin status message to:

a) your current song

pidgin_status.py -m The Real McKenzies – Outta Scotch

b) a random line from a file

pidgin_status.py -f /usr/local/share/bofh-example

c) anything you give as a command-line argument

pidgin_status.py Some very interesting and funny status message

The only difference between a) and c) is the icon that is shown. In example a) there will be a small musical note, while in examples b) and c) a nice arrow pointing to the right will be shown.
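
The three modes above boil down to a bit of argument handling. Here is a minimal sketch of how that part could look. It is an illustration only, not the contents of pidgin_status.py: the function name pick_message and the "music"/"plain" labels standing in for the two icons are made up, and only the -m and -f switches come from the examples above.

```python
import argparse
import random


def pick_message(argv):
    """Return (kind, message); kind is "music" for -m, otherwise "plain"."""
    parser = argparse.ArgumentParser(prog="pidgin_status.py")
    parser.add_argument("-m", dest="music", action="store_true",
                        help="treat the remaining words as the current song")
    parser.add_argument("-f", dest="file",
                        help="pick a random line from this file")
    parser.add_argument("words", nargs="*", help="free-form status text")
    args = parser.parse_args(argv)

    if args.file:
        # Mode b): one random non-empty line from the given file.
        with open(args.file, encoding="utf-8") as handle:
            lines = [line.strip() for line in handle if line.strip()]
        if not lines:
            parser.error("no usable lines in " + args.file)
        return "plain", random.choice(lines)

    # Modes a) and c): join the words; only the icon label differs.
    kind = "music" if args.music else "plain"
    return kind, " ".join(args.words)
```

Calling pick_message(sys.argv[1:]) gives back the text to hand over to Pidgin, plus a hint about which icon to use.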

In the middle of testing I noticed this strange message:

Being from Serbia myself, I find this extremely funny. Although I didn’t know Serbian hackers were so notorious! :)

I hope someone will find it useful. In any case, I am accepting donations for a long and adventurous vacation. As you can see, I really need it! :D

pidgin_status.py