Archive for July, 2007

31
Jul

Linux LDOM

Linux kernel developers announced that they have successfully ported Logical Domain (LDOM) technology from the Solaris kernel. Even better, it is already in the kernel! Good job, guys! Another big step for the Linux virtualization community.

Speaking of Solaris, Sun has opened the source of its cluster suite. Can we expect some improvements in Linux cluster software?

26
Jul

HP-UX UNIX95 Compatibility

HP-UX is well known for how easy it makes patch and product management. These operations are handled by software called Software Distributor (SD). SD rarely fails, but when it does, the failures can be very strange.

One of those weird situations happened to me last week. I downloaded a patch bundle from the HP site and tried to create a depot. It is a very simple procedure – untar the bundle, run the create_depot_hp-ux_11 script, and the script and SD do everything else. But here comes the weird part – a checksum error for every patch in the bundle.

# create_depot_hp-ux_11
DEPOT: /var/depot
BUNDLE: BUNDLE
TITLE: Patch Bundle
UNSHAR: y
PSF: depot.psf
Expanding patch shar files…
x - PHCO_23651.text
x - PHCO_23651.depot [compressed]
ERROR: wc results of PHCO_23651.depot are 7082 23582 522240 should be 7082 18520 522240
x - PHKL_18543.text
x - PHKL_18543.depot [compressed]
ERROR: wc results of PHKL_18543.depot are 146386 592281 20377600 should be 146386 524212 20377600

I checked the checksum of the bundle itself and it seemed perfectly fine. What a puzzle, eh?
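Comparing the reported and the expected numbers field by field narrows the problem down a bit. Here is a minimal sketch of that comparison; the function name wc_diff is my own invention for illustration and is not part of SD:

```shell
# wc_diff: compare two "lines words bytes" triples as printed by wc.
# Hypothetical helper for illustration; not part of Software Distributor.
#   $1 $2 $3 = actual lines/words/bytes
#   $4 $5 $6 = expected lines/words/bytes
wc_diff() {
    [ "$1" = "$4" ] || echo "lines differ: $1 vs $4"
    [ "$2" = "$5" ] || echo "words differ: $2 vs $5"
    [ "$3" = "$6" ] || echo "bytes differ: $3 vs $6"
    return 0
}

# Numbers taken from the PHCO_23651.depot error message.
wc_diff 7082 23582 522240 7082 18520 522240
```

For both patches only the word count differs, while the line and byte counts match. That suggests the files themselves are fine and that it is the word counting that behaves differently.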

Here is the story. HP-UX is supposed to be compatible with the UNIX95 specification, but for some reason this compatibility mode breaks SD. The mode is switched on by an environment variable called UNIX95. So if you ever notice a problem like this, first check whether this variable is set on your server; if it is, simply unset it and SD will be fully functional again.

# set|grep UNIX95
UNIX95=yes
# unset UNIX95
# create_depot_hp-ux_11
DEPOT: /var/depot
BUNDLE: BUNDLE
TITLE: Patch Bundle
UNSHAR: y
PSF: depot.psf
Expanding patch shar files…
x - PHCO_23651.text
x - PHCO_23651.depot [compressed]
x - PHKL_18543.text
x - PHKL_18543.depot [compressed]
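If something else on the server relies on UNIX95 being set, you may not want to unset it for the whole session. A small sketch of a wrapper that removes the variable only for one command (the name without_unix95 is hypothetical, just for illustration):

```shell
# without_unix95: run a command with UNIX95 removed from its environment,
# leaving the calling shell's own setting untouched.
# Hypothetical helper name, for illustration only.
without_unix95() {
    (
        # The unset happens inside a subshell, so it does not leak out.
        unset UNIX95
        "$@"
    )
}

# Usage, in the directory with the unpacked bundle:
#   without_unix95 create_depot_hp-ux_11
```

The subshell keeps the change local, so the variable is still set in your session after the script finishes.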

Happy patching! :)

12
Jul

AIX 6 ready for download!

As I previously announced, the IBM AIX 6 Beta would be openly available for free download and testing. That time has come, and you can start downloading it right now from this page. More info here.

AIX 6 should bring a lot of new features, especially in virtualization and high availability. Some of them are ported directly from fault-tolerant systems, which should make for even more stable and reliable systems. There will be no official support for the Beta, but you can ask for help on one of the IBM forums.

This openness is a pretty new thing for IBM. The change in policy was probably influenced by Sun’s opening of Solaris to the community. But even though some changes have started, IBM is still far from open source and from opening the code of its products to the open source community. That is a pity, because I would really like to see the same usability features in other UNIX operating systems. Sadly, even Linux is far behind AIX when it comes to usability.

10
Jul

Snow in Buenos Aires

Yesterday Buenos Aires received its first snowfall in 89 years! As I don’t watch TV or follow any other form of news, I would probably never have heard about this if Tatiana hadn’t surprised me this morning with the news. It is even more surprising since Marica and I visited her two years ago at approximately the same time of year, and the weather was anything but cold. We wore shorts and T-shirts all the time, even in the evening, and we used to sunbathe under the Obelisco.

This seems to be one of the coldest winters in South America in a long, long time. Global warming works in strange ways, indeed. While you think about it, enjoy the pictures of beautiful Buenos Aires covered in snow.

[Two photos: Buenos Aires in Snow]

Photos are courtesy of Jeff.

10
Jul

DIA from SQL

Today I needed a fast way to generate a DIA diagram from a MySQL database structure, and I found a great little project on SourceForge called, very conveniently, sql2dia. The tool is actually a Perl script which generates an XML file (yes, if you ever wondered, the DIA file format is pure XML. Nice, eh? :)). sql2dia does not support relations between tables, but this feature is planned for future releases. Anyway, it does exactly what I needed it to do – create a diagram of the tables in my PostfixAdmin database (don’t ask why… yet :)). Here is an example of how it works, and the result.

[ home sql2dia ] # ./mysql2dia -D -d maildb -h localhost \
> -u username -p password -o mail.dia
Debug mode activated.
Creating header…
Creating object admin…
Creating object domain…
Creating object vacation…
Creating object mailbox…
Creating object log…
Creating object alias…
Creating object domain_admins…
Creating footer…
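Since the output is supposed to be plain XML, a quick sanity check is to look at the first few bytes of the generated file. This is only a sketch: the function name is_plain_xml is hypothetical, and I am assuming that the file starts with the usual `<?xml` declaration. As far as I know, a diagram re-saved from Dia with compression enabled would not pass this check.

```shell
# is_plain_xml: succeed if the given file starts with an XML declaration.
# Hypothetical helper; ASSUMES the generated diagram begins with "<?xml".
is_plain_xml() {
    # dd reads only the first 5 bytes, so big diagrams are not scanned.
    [ "$(dd if="$1" bs=5 count=1 2>/dev/null)" = "<?xml" ]
}

# Usage, with the file produced by the command above:
#   is_plain_xml mail.dia && echo "mail.dia looks like plain XML"
```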

[Image: diagram generated by sql2dia]

08
Jul

32 * 2 = 16h

Last week I had an interesting assignment: upgrading an AIX 5.2 server from the 32bit to the 64bit kernel. The process should be pretty straightforward and is very nicely explained in the AIX documentation, but as usual, any action that requires stopping applications has to be done after working hours – in this case after 9pm. Considering that all the changes, the system reboot and the application stop/start sequence should not take more than 45 minutes, this is not a big problem. But, as many times before, I didn’t count on the good ol’ friend of all system administrators – Murphy.

But let’s start from the beginning. The first thing I did was to check whether the server supports a 64bit environment and which version of the kernel is currently running.

# bootinfo -y
64
# bootinfo -K
32
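The two checks above can be folded into one small decision helper. This is just a sketch; the function name kernel_switch_needed is made up for illustration, and it only looks at the two numbers that bootinfo prints:

```shell
# kernel_switch_needed: decide from the two bootinfo values whether a
# switch to the 64-bit kernel is possible and still pending.
# Hypothetical helper, for illustration only.
#   $1 = hardware bits       (output of `bootinfo -y`)
#   $2 = running kernel bits (output of `bootinfo -K`)
kernel_switch_needed() {
    if [ "$1" != "64" ]; then
        echo "hardware is ${1}-bit: 64-bit kernel not possible"
        return 2
    fi
    if [ "$2" = "64" ]; then
        echo "already running the 64-bit kernel"
        return 1
    fi
    echo "64-bit hardware, ${2}-bit kernel: switch is possible"
    return 0
}

# On the AIX host:
#   kernel_switch_needed "$(bootinfo -y)" "$(bootinfo -K)"
```

The same call can be repeated after the reboot, where it should report that the 64-bit kernel is already running.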

So, the hardware on this server is 64bit (as expected) and the active kernel is 32bit. Now, let’s stop the applications. The only important application on this server is a production Oracle database, and we have to stop it before the reboot. (An important thing to note at this point is the version of the database: it is the old 8.1.7.4 release of Oracle.)

# su - oracle
% sqlplus /nolog

SQL*Plus: Release 8.1.7.0.0 - Production on Wed Jul 4 21:01:20 2007

(c) Copyright 2000 Oracle Corporation. All rights reserved.

SQL> conn / as sysdba
Connected.
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> exit
Disconnected from Oracle8i Enterprise Edition Release 8.1.7.4.0 - Production
JServer Release 8.1.7.4.0 - Production

In order to be able to execute 64bit binaries, we must edit /etc/inittab so that the syscall64 kernel extension is loaded during boot. This is needed even with the 64bit kernel.

# mkitab "load64bit:2:wait:/etc/methods/cfg64 >/dev/console 2>&1"
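If you are not sure whether the entry already exists, the step can be made repeatable. A minimal sketch follows; the function name ensure_itab is hypothetical, and I am assuming that `lsitab -a` prints one record per line, each starting with its identifier:

```shell
# ensure_itab: add an inittab record, or change it if a record with the
# same identifier is already there.
# Hypothetical helper; ASSUMES `lsitab -a` prints one record per line.
#   $1 = full record, "identifier:runlevel:action:command"
ensure_itab() {
    id=${1%%:*}                       # everything before the first colon
    if lsitab -a | grep "^${id}:" >/dev/null; then
        chitab "$1"                   # record exists: replace it
    else
        mkitab "$1"                   # record missing: create it
    fi
}

# Usage on the AIX host:
#   ensure_itab "load64bit:2:wait:/etc/methods/cfg64 >/dev/console 2>&1"
```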

The switch to the 64bit kernel is done by simply relinking the paths to the kernel and libraries, updating the boot image on the boot device, and rebooting. Simple as that.

# ln -sf /usr/lib/boot/unix_64 /unix
# ln -sf /usr/lib/boot/unix_64 /usr/lib/boot/unix
# bosboot -a
# shutdown -Fr
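Since these four commands end in a reboot, it does not hurt to look at them once more before running them. The sketch below only prints the plan and refuses to do so if the kernel image is missing. The function name kernel_switch_plan and the optional second argument are my own additions for illustration:

```shell
# kernel_switch_plan: print (do not run) the commands that point /unix at
# another kernel image and rebuild the boot image.
# Hypothetical helper, for illustration only.
#   $1 = image name, e.g. unix_64
#   $2 = directory holding the images (defaults to /usr/lib/boot)
kernel_switch_plan() {
    bootdir=${2:-/usr/lib/boot}
    image="$bootdir/$1"
    if [ ! -f "$image" ]; then
        echo "kernel image not found: $image" >&2
        return 1
    fi
    echo "ln -sf $image /unix"
    echo "ln -sf $image $bootdir/unix"
    echo "bosboot -a"
    echo "shutdown -Fr"
}

# On the AIX host, review the plan first:
#   kernel_switch_plan unix_64
```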

After the reboot, I checked the version of the running kernel to see if the change had actually taken place.

# bootinfo -K
64

Perfect! So simple, isn’t it? I just love it when things go so smoothly. Now let’s start Oracle.

# su - oracle
% sqlplus /nolog
Could not load program sqlplus:
Symbol resolution failed for sqlplus because:
Symbol pw_post (number 272) is not exported from dependent module /unix.
Symbol pw_wait (number 273) is not exported from dependent module /unix.
Symbol pw_config (number 274) is not exported from dependent module /unix.
Symbol aix_ora_pw_version3_required (number 275) is not exported from dependent module /unix.
Examine .loader section symbols with the 'dump -Tv' command.

“Argh, this can’t be happening!” I thought, so I tried again. Surprisingly, that didn’t help. After the initial shock, I looked at the message more carefully and tried to figure out what the hell it meant. The kernel doesn’t export the symbols Oracle needs – so maybe the Oracle kernel extension is not loaded. Let’s check.

# loadext -r
   
Oracle Kernel Extension Loader for AIX
Copyright (c) 1998,1999 Oracle Corporation
   
sh: /usr/sbin/crash: not found
No Kernel Extension is currently running.

I was on the right trail. But this is strange: the Oracle kernel extension is loaded from /etc/inittab during boot, so it SHOULD be loaded. Maybe the inittab got corrupted.

# lsitab -a|grep ora
orapw:2:wait:/etc/loadext -l /etc/pw-syscall

It is there. In desperation, I thought that maybe the syscall64 extension had not been loaded, causing the Oracle extension to fail (although it should not matter).

# genkex|grep syscall
4635e70 390 /usr/lib/drivers/syscalls64.ext

It is there too. Let’s try to load the Oracle extension manually, maybe it will work now.

# loadext -l /etc/pw-syscall
   
Oracle Kernel Extension Loader for AIX
Copyright (c) 1998,1999 Oracle Corporation
   
Kernel Extension Version: 3
SYS_SINGLELOAD: Exec format error
kmid: 0 (0x0)
path: '/etc/pw-syscall'
libpath: ''

Maybe this extension does not support the 64bit environment?

# strings /etc/pw-syscall|head -3
Kernel Extension Version: 3
$Revision: 1.9 $
Supported Oracle Instances: 32-bit & 64-bit

Now I am puzzled even more.

At this point I felt stuck. Reverting to the 32bit kernel was not an option, as this was only one part of a big migration process on this server. On the other hand, Oracle had to be up and running by morning – this is a very important production server. As I am not an Oracle guru and there was no one from the DB team around to ask for advice, I asked Google for help. As many times before, it proved to be a wise choice. Other people had already had this problem and solved it by applying a small Oracle patch.

The important thing here is that Oracle version 8 does not support the 64bit kernel on AIX out of the box. It requires patch number 2896876 in order to do so.

After applying this patch you get a new kernel extension, which loads without complaining.

# genkex|grep syscall
466c850 1218 /etc/pw-syscall64
4641ec0 390 /usr/lib/drivers/syscalls64.ext
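The same check can be scripted, for example before starting the database from a startup script. This is a sketch only; the function name kext_loaded is hypothetical, and it assumes the three-column genkex output shown above (address, size, path):

```shell
# kext_loaded: read genkex output on stdin and succeed if a kernel
# extension with exactly the given path is listed.
# Hypothetical helper; ASSUMES the path is the third column.
#   $1 = path of the extension, e.g. /etc/pw-syscall64
kext_loaded() {
    awk -v want="$1" '$3 == want { found = 1 } END { exit (found ? 0 : 1) }'
}

# Usage on the AIX host:
#   genkex | kext_loaded /etc/pw-syscall64 || echo "Oracle extension missing"
```

Matching the whole path, instead of grepping for "syscall", avoids confusing /etc/pw-syscall64 with /usr/lib/drivers/syscalls64.ext.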

Now, let’s try to start Oracle.

# su - oracle
% sqlplus /nolog

SQL*Plus: Release 8.1.7.0.0 - Production on Thu Jul 5 00:47:45 2007
   
(c) Copyright 2000 Oracle Corporation. All rights reserved.
   
SQL> conn / as sysdba
Connected to an idle instance.
SQL> startup
ORACLE instance started.
   
Total System Global Area  178704276 bytes
Fixed Size                    73620 bytes
Variable Size             135630848 bytes
Database Buffers           41943040 bytes
Redo Buffers                1056768 bytes
Database mounted.
Database opened.
SQL> exit
Disconnected
% ^D

Nice. :) The next thing is to change inittab to load the new Oracle kernel extension,

# chitab "orapw:2:wait:/etc/loadext -l /etc/pw-syscall64"

then stop Oracle and reboot the server again to see how it behaves after the reboot. Luckily everything worked fine, so at 1am I could finally go home. It was about time, since I had been there for almost 16 hours (hence the subject of the post). Ah, flexible working hours, the pleasure of being a system administrator, isn’t it? :)

01
Jul

Usability… WTF is that?

One of the very important things in software development is usability. Software should be user friendly and easy to use. Contrary to popular opinion, software for system administration should not be an exception. After all, system administrators are still humans (although some people don’t agree with that :). So it has always been a mystery to me why some OS developers, or at least developers of user space tools, try to complicate things as much as they can.

A perfect example of this is Veritas Volume Manager. Other UNIX LVM technologies provide very logical and simple tools for LVM administration, but VxVM seems to follow a “the more confusing, the better” philosophy. Take the simple task of checking how much free space is left in a Disk Group (Volume Groups are called Disk Groups in Veritas Volume Manager :)).

# vxdg -g rootdg free
GROUP  DISK     DEVICE   TAG    OFFSET   LENGTH  FLAGS
rootdg rootdisk c1t0d0s2 c1t0d0 46595904 96722880 -

Now, the other fields are self-explanatory, but WTF are Offset and Length?! Well, Offset is the number of the block where the free space begins, and Length is the size of the free space in blocks. I agree this is very informative and useful output, but why name the fields like this? Why not use simple names like “Used space” and “Free space”? Hm, beats me.

But the fun doesn’t end there. If there is no free space in your Disk Group, the vxdg command will not tell you so; it will just print the header and exit. Very user friendly, isn’t it? :)
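Until someone fixes that, a few lines of shell can turn the output into something more readable. The sketch below sums the LENGTH column and prints the total in megabytes. The function name vx_free_mb is hypothetical, and the conversion assumes that one block is 512 bytes, which you should verify on your own system:

```shell
# vx_free_mb: sum the LENGTH column of `vxdg free` output (read from
# stdin) and print the total in whole megabytes.
# Hypothetical helper. ASSUMPTION: one block is 512 bytes, so 2048
# blocks make 1 MB. LENGTH is taken to be the sixth column.
vx_free_mb() {
    awk 'NR > 1 { blocks += $6 }            # skip the header line
         END    { print int(blocks / 2048) }'
}

# Usage on the VxVM host:
#   vxdg -g rootdg free | vx_free_mb
```

With the sample output above this prints 47227, so roughly 46 GB under the 512-byte assumption. When vxdg prints only the header, it prints 0 instead of leaving you guessing.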

Don’t get me wrong, I am not saying VxVM is a bad piece of software. I think it is very powerful and has features that many other volume managers lack. But the people at Symantec could really hire a usability expert to work on VxVM; it would be the challenge of a lifetime. :)