~ pjvenda / blog

05 April 2013

djbdns logging to syslog instead of multilog

Recently I came across an issue in the djbdns service on a Linux host that I am setting up. After resolving around 250 remote queries, tinydns just stopped responding with no error messages. The same happened with dnscachex. A reboot would give me another allowance of ~250 queries. I was not happy.

After chasing this bug for a ridiculous amount of time, which included a fair bit of strace, tcpdump, reboots, googling, scripting tests, comparing it with a working installation, etc, I got to the bottom of it.

To make a long story short, turns out that daemontools - a support package that monitors djbdns services - was firing up multilog which in turn was failing to run due to invalid permissions set in its log directory (multilog was executed under dnslog:nofiles and its log directory was owned by root:root, so there was little chance of multilog writing files there).

Logging in djbdns is implemented with a FIFO pipe between tinydns and the logger process - usually multilog. As multilog never really started, tinydns locked up after pumping enough data into the log pipe which apparently filled up its input buffer (guess I could find out exactly how large it is...). Changing ownership of the log directory to dnslog:nofiles fixed the problem.

drwxr-sr-x 2 dnslog nofiles 4096 Mar 19 03:18 main
-rwxr-xr-x 1 root   root      98 Mar 24 02:32 run
-rw-r--r-- 1 dnslog nofiles    0 Oct 14 17:59 status
drwx--S--- 2 root   root    4096 Mar 24 02:33 supervise
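The lock-up described above is ordinary pipe behaviour: once the kernel's pipe buffer is full and nothing reads from it, the writer blocks. The following is a minimal sketch, not taken from djbdns, that reproduces the stall and measures how much data fits in the pipe. Here sleep stands in for a logger that never reads, and the 1 KiB block size and the temporary counter file are arbitrary choices of mine.

```shell
#!/bin/sh
# Measure how much data a writer can push into a pipe that nobody reads.
# 'sleep' plays the part of a logger that never consumes its input.
count_file=$(mktemp)
echo 0 > "$count_file"

(
    blocks=0
    # Write 1 KiB at a time; dd blocks once the pipe buffer is full and
    # fails (broken pipe) when the reader side finally goes away.
    while dd if=/dev/zero bs=1024 count=1 2>/dev/null; do
        blocks=$((blocks + 1))
        echo "$blocks" > "$count_file"
    done
) | sleep 2

blocks_written=$(cat "$count_file")
rm -f "$count_file"
echo "writer stalled after ${blocks_written} KiB"
```

On current Linux kernels the default pipe capacity is 64 KiB, so I would expect a figure close to 64 there; other systems differ.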
Then I took the chance to make a further tweak: on a Linux system with no storage, I was not interested in having djbdns write log files locally. Having a syslog daemon available, configured to forward data to a remote server, made the solution quite obvious. All I needed was to get djbdns to forward logs to the local syslog daemon.

This is surprisingly easy to do but not at all obvious. The logger process is started by daemontools via the wrapper script /service/tinydns/log/run:

#!/bin/sh
exec setuidgid dnslog multilog t ./main

Instead of multilog, logger can be used to pipe data to syslog like so:

#!/bin/sh
exec /usr/bin/logger -p local5.debug -t tinydns

I chose facility local5, log level debug and program name tinydns to mark these log entries, but these parameters are user defined. See the logger man page for more information. The same technique can be applied to dnscachex.
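As a sketch of that last remark, the script below writes an equivalent log/run file for dnscachex. The target directory (./dnscachex-log unless LOG_DIR is set) is a placeholder of mine, not a path from my setup; on a real host it would be the service's own log directory.

```shell
#!/bin/sh
# Generate a log/run wrapper for dnscachex that pipes its output to syslog.
# LOG_DIR is a hypothetical placeholder; override it to suit your layout.
LOG_DIR="${LOG_DIR:-./dnscachex-log}"
mkdir -p "$LOG_DIR"

cat > "$LOG_DIR/run" <<'EOF'
#!/bin/sh
exec /usr/bin/logger -p local5.debug -t dnscachex
EOF
chmod 755 "$LOG_DIR/run"

# Parse the generated script without executing it, as a basic sanity check.
sh -n "$LOG_DIR/run" && echo "run script written to $LOG_DIR/run"
```

The tag given to -t is what a $programname filter on the syslog server matches against, so it has to agree with whatever rule is used there.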

On the remote syslog server, I segregate these logs with the following rsyslog configuration rules [remember to rotate these log files!]:

$template HostDirFile_tinydns,"/var/log/%HOSTNAME%/tinydns.log"
$template HostDirFile_dnscache,"/var/log/%HOSTNAME%/dnscache.log"
# DNS services
if $syslogfacility-text == 'local5' and $programname == 'tinydns' then ?HostDirFile_tinydns
& ~
if $syslogfacility-text == 'local5' and $programname == 'dnscachex' then ?HostDirFile_dnscache
& ~

Job done.

19 March 2012

Creating Mac OS X Lion Installation media without having purchased Lion @ App store

Summary

It is possible to create a bootable Mac OS X Lion installation disk without having purchased the OS from Apple's online App Store. I am _not_ talking about a recovery system, but about a USB stick, DVD or partition from which the OS can be fully installed or reinstalled onto a computer with an empty disk and no Internet connection. Skip to the process if you will.

  1. Summary
  2. Background
  3. Process
  4. About the Internet installation process
  5. Conclusion / References

All the guides I found on the Internet about creating Lion installation USB sticks or DVDs rely on the premise that the user has purchased the OS on the App Store, and the process involves re-downloading it. I have not purchased Mac OS X Lion on the App Store so I am unable to re-download it from there without buying it again. The other alternative, about 3x more expensive, would be to buy Apple's Mac OS X Lion USB drive. Not going to happen either.

Background

Apple's latest Mac OS X Lion has a new model of installation and recovery. They have stopped distributing their OS on physical media. Instead people can now purchase Lion on the App Store, from where it is delivered as a download. Because of this, a new mechanism to allow for operating system recoveries or reinstalls has been implemented. This new model of recovery relies initially on a 650MB bootable hidden recovery partition, labelled 'Recovery HD', containing what I would call a 'Recovery Mac OS X'. From here users can run recovery tools such as Disk Utility or a terminal, restore a Time Machine backup or reinstall the OS.

In fairness, creating physical installation media is not necessary because hardware released after Lion came out can boot in recovery mode from the Internet. A full restore or installation onto an empty disk can then be done entirely from the Internet. Older hardware still compatible with Lion can do the same via a boot volume created by the Lion Recovery Assistant tool (same content as in the recovery partition). Both methods have the downside of requiring an Internet connection and time to download about 4GB of data.

I like their new installation and recovery method. It ends up being more flexible than its predecessors. The ability to install Lion without any type of media is great!
But some people - including me - would like to have some sort of physical media from which a full OS install could be made. Even though I did not use them more than once, I kept the original install discs of 10.4 Tiger, 10.5 Leopard and 10.6 Snow Leopard.
Creating physical Lion installation media is feasible and fairly easy too. It is likely that the number of guides over the Internet about creating Mac OS X Lion installation media has reached triple digits by now.

However all the guides I have read (admittedly not all) assume that the OS has been bought at the Apple App Store. They all rely on extracting that InstallESD.dmg by re-downloading Lion from the App Store, eventually by making use of the command + click modifier to force re-download.
This excludes all Apple buyers that obtained their latest operating system by buying a MacBook or iMac computer recently. Like me.

I legally own a copy of Mac OS X Lion because it was pre-installed on a new laptop, which makes it legal but not purchased at the App Store. When I go to the App Store, Lion does not appear as a 'purchased' product under my Apple ID (makes sense). Therefore if I wanted to re-download Lion from the App Store I would have to buy it again... Not going to happen. Apple also sells Lion on physical media (a USB stick) but it costs about 3x the price of the standard, online install of Lion... Not a good solution either.

Someone somewhere on the Internet has claimed that once the OS jumps a minor version, it would show up to download. I could not reproduce this as the OS got updated from 10.7.2 to 10.7.3.

It just does not seem fair that I do not get the same features as if I had purchased Lion on the App Store.

How to: Create a Mac OS X Lion installation volume without having purchased it from the App Store

The limitation of not having bought Lion on the App Store is not being able to re-download the OS's installer, from which InstallESD.dmg (the 3.6GB image with the full install tree) can be extracted. The image can instead be obtained by running the Internet recovery process to reinstall OS X Lion on a blank disk.

Step 0. Before beginning

Ensure that what you are doing is legal and under Apple's terms and conditions. My laptop came pre-installed with Mac OS X Lion so I am eligible to perform this bare metal recovery.
It is important to keep the operating system installed on the internal disk functional. This process requires no changes to the internal disk. In fact if you don't have a working OS on the internal disk, you'll need a second Mac to get this done.

Step 1. Prepare the target disk

Find a 20GB+ external disk, USB or Firewire (Thunderbolt should work too) with disposable data (all data on the external disk will be deleted, of course). Use Disk Utility to create a new GUID partition scheme with one partition (labelled with whatever you like, but preferably different from the internal disk's label) formatted with 'Mac OS Extended (Journaled)' file system. Remember to apply the changes.

This will be your target install disk! I have labelled mine 'Lion Install'.


Target disk before preparation

Creating the GUID partition scheme on the target disk

Creating the Install partition on the target disk

Step 2. Go into Internet Recovery mode

Ensure that you have Internet connectivity via Wi-Fi or Ethernet, as the computer is about to be restarted (erm, remember to memorise the remaining steps, print them or view them on another device).

Connect the external disk and shut down.

Hold cmd-r while pressing the power button to start up the computer into Recovery mode. Release cmd-r after the Apple symbol appears.

You should have booted into recovery mode which has no user accounts, a grey background and starts with the 'Choose your language' screen.

This simpler OS has been loaded from a hidden partition of the internal disk or directly off the Internet. How cool is that??

Step 3. Initiate Lion reinstallation into the target media
Choose the option 'Reinstall Mac OS X Lion' and select your external disk as the target (in my case labelled 'Lion Install').

Soon after this you will be asked to accept an EULA and Apple will verify your eligibility to perform this installation. If Apple says you're good to go, which should be guaranteed on any hardware released after Mac OS X Lion, the download process begins.

The recovery program mentions that 'your computer will reboot automatically': this is important, because the reinstallation process requires no interaction and will happily stop after the OS is fully installed on the target media, at which point the files we require will have been deleted.

Step 4. Interrupt the installation process
The installation process must not be allowed to finish. I ensured I was present when the download finished and the computer rebooted. At that point, I hijacked the process and forced the computer to boot into the internal disk's OS instead of the external disk's installer program. Simply disconnecting the external disk from the computer immediately after it reboots should suffice to start up into the internal disk's OS.

If the computer is allowed to reboot into the installer program, that is fine, but a reboot must be forced before the installation ends, because at that point the installer program, which is exactly what we're after, is deleted.

Step 5. Extract InstallESD.dmg
Having booted back to a functional OS X, connect the external disk onto which Lion Internet Recovery was initiated and it should have the following files:

$ ls -lR
total 0
drwx------  15 pjvenda  staff  510 12 Dec 14:10 Mac OS X Install Data/

./Mac OS X Install Data:
total 7448928
-rw-------  1 pjvenda  staff       13324 12 Dec 12:42 InstallESD.chunklist.partial
-rw-------@ 1 pjvenda  staff  3788832912 12 Dec 14:10 InstallESD.dmg
-rw-r--r--  1 pjvenda  staff         916 12 Dec 12:42 InstallESD.dmg.partialState
-rw-r--r--  1 pjvenda  staff         182 12 Dec 14:10 MacOSXInstaller.choiceChanges
-rw-r--r--  1 pjvenda  staff       10884 22 Jul 05:44 MacOS_10_7_IncompatibleAppList.pkg
-rw-r--r--  1 pjvenda  staff         435 12 Dec 14:10 OSInstallAttr.plist
-rw-r--r--@ 1 pjvenda  staff      863920  6 Oct 14:07 boot.efi
-rw-r--r--  1 pjvenda  staff         408 12 Dec 14:10 com.apple.Boot.plist
-rw-r--r--@ 1 pjvenda  staff        6306 12 Dec 14:10 ia.log
-rw-r--r--  1 pjvenda  staff         786 12 Dec 14:10 index.sproduct
-rw-r--r--@ 1 pjvenda  staff    24087081  6 Oct 14:08 kernelcache
-rw-r--r--  1 pjvenda  staff         618 12 Dec 14:10 minstallconfig.xml

Locate and keep the file InstallESD.dmg by copying it to somewhere safe.
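Since this file is the whole point of the exercise, it is worth checking that the copy is intact before wiping the external disk. A minimal sketch, assuming a POSIX shell: copy_verified is a helper name I made up, and the demonstration runs on a scratch file rather than the real image.

```shell
#!/bin/sh
# copy_verified SRC DST: copy a file, then compare checksums of both copies.
# Uses POSIX cksum (a CRC); swap in a stronger hash tool if you prefer.
copy_verified() {
    src=$1
    dst=$2
    cp "$src" "$dst" || return 1
    # Reading from stdin keeps the file name out of cksum's output,
    # so the two results can be compared as plain strings.
    sum_src=$(cksum < "$src")
    sum_dst=$(cksum < "$dst")
    if [ "$sum_src" = "$sum_dst" ]; then
        echo "copy verified: $dst"
    else
        echo "checksum mismatch: $dst" >&2
        return 1
    fi
}

# Self-contained demonstration on a scratch file.
demo_dir=$(mktemp -d)
printf 'not a real disk image\n' > "$demo_dir/source.dmg"
copy_verified "$demo_dir/source.dmg" "$demo_dir/copy.dmg"
```

For the real thing the call would look something like copy_verified "/Volumes/Lion Install/Mac OS X Install Data/InstallESD.dmg" ~/InstallESD.dmg, with the volume name adjusted to whatever label the target disk was given.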

InstallESD.dmg holds the complete Mac OS X Lion installation program and getting to it was the reason to execute this process.

There it is, we were successful at obtaining InstallESD.dmg legally without having bought Lion at the App Store, simply by initiating an Internet Recovery and interrupting installation after the program had downloaded the OS.

The rest of the process of creating a bootable full Lion installation media is the same as in any guide on the Internet from after the step of re-downloading the OS from the App store.


Burning InstallESD.dmg image into the prepared target disk

Contents of Lion installation media prepared from InstallESD.dmg image

About the Internet Recovery process

Luckily Apple's implementation of Internet recovery is simple:

  1. Download OS X Installer onto target drive;
  2. Reboot from the target drive and run the installer;
  3. Delete the installer just before rebooting again into the newly installed OS;

So, between steps 1) and 2), what is left on the target disk is actually the full Mac OS X Lion's installer program.

$ ls -lR
total 0
drwx------  15 pjvenda  staff  510 12 Dec 14:10 Mac OS X Install Data/

./Mac OS X Install Data:
total 7448928
-rw-------  1 pjvenda  staff       13324 12 Dec 12:42 InstallESD.chunklist.partial
-rw-------@ 1 pjvenda  staff  3788832912 12 Dec 14:10 InstallESD.dmg
-rw-r--r--  1 pjvenda  staff         916 12 Dec 12:42 InstallESD.dmg.partialState
-rw-r--r--  1 pjvenda  staff         182 12 Dec 14:10 MacOSXInstaller.choiceChanges
-rw-r--r--  1 pjvenda  staff       10884 22 Jul 05:44 MacOS_10_7_IncompatibleAppList.pkg
-rw-r--r--  1 pjvenda  staff         435 12 Dec 14:10 OSInstallAttr.plist
-rw-r--r--@ 1 pjvenda  staff      863920  6 Oct 14:07 boot.efi
-rw-r--r--  1 pjvenda  staff         408 12 Dec 14:10 com.apple.Boot.plist
-rw-r--r--@ 1 pjvenda  staff        6306 12 Dec 14:10 ia.log
-rw-r--r--  1 pjvenda  staff         786 12 Dec 14:10 index.sproduct
-rw-r--r--@ 1 pjvenda  staff    24087081  6 Oct 14:08 kernelcache
-rw-r--r--  1 pjvenda  staff         618 12 Dec 14:10 minstallconfig.xml

Problem solved!

My initial plan to obtain installation media from the Internet recovery method was significantly more involved. It consisted of doing an Internet based install onto an empty disk while routing the Internet connection via another host which intercepted all traffic. By analysing this traffic, I would hopefully be able to filter important payloads (hopefully most files would be downloaded via plaintext HTTP, but SSL mitm was also within reach).

I reckoned that at some point a full installation tree would be created that could be reused on bootable media, ideally including the InstallESD.dmg image.

Fortunately, soon after I started analysing data (which provided some very interesting results), I realised that the simplest scenario was that Internet recovery simply downloaded the installer onto the target media and ran it from there. So I did a couple of quick tests and, sure enough, that was the case.

Nonetheless I had the chance to analyse the network traffic exchanged between laptop and Apple's servers, which revealed the most interesting insights into the process.

Apple's approach to OS distribution and installation has always been fairly unrestrictive from a technical point of view. There are no serial keys, no activations, no obvious applications of DRM, etc. I reckon the risk they take in facilitating illegal copying of their OS is far outweighed by hardware limitations and especially the pricing model of Mac OS X. OS X is very cheap by any standard and even more so considering how technically good it is. I also think that their legal customer base is a far better investment into the business than working against pirates: they already know it is an arms race that companies tend to lose consistently.

Conclusion / References

All that's left is to provide a number of the references I used from the Internet to do this work and a few final remarks. I hope it has been as informative and useful for you as it was for me.

While researching for this, I came across hundreds of online blog posts, news articles, original and copied howtos, copies of copied howtos, etc. After writing this post (believe it or not) I found one link written by somebody that had the same idea about extracting installation media from Internet recovery. Only one.

Some information about InstallESD.dmg's integrity (mine is below):

$ md5deep InstallESD.dmg
412cee9c4c77c04c9c8489c363a7e2e4  /Volumes/New HD/Mac OS X Install Data/InstallESD.dmg

Resources about Lion recovery disk assistant, Recovery mode and Internet recovery

And finally, Ars Technica provides the best Mac OS X guides. These are the nerdiest, most detailed guides I've ever seen about an operating system. About all 7 versions of Mac OS X in fact. And they're great!

11 December 2011

Quick tips for Mac OS X

Thought I should share a few tips I learnt today. All these work in multiple versions of OS X although I didn't research exactly which apart from Lion.

I can't think of many situations where these would be required in day-to-day use of Mac OS X. If you don't know why you would need to use these or what for, then don't bother; you don't need them.

Hide or unhide files from Finder: the invisible attribute

Finder does not display files that have the 'invisible' attribute enabled on them. Makes sense. Just like Windows, there are files that the OS does not want users messing about with.

But users know better, so I have created two files 'public' and 'secret_to_finder' with text in them. The file 'secret_to_finder' was hidden. GetFileInfo and SetFile allow for these attributes to be listed and manipulated.

$
$ # list files
$
$ ls -l public secret_to_finder
-rw-r--r--@ 1 pjvenda  staff  7 10 Dec 16:28 secret_to_finder
-rw-r--r--  1 pjvenda  staff  8 10 Dec 16:28 public
$
$ # get attributes of file 'secret_to_finder'
$
$ GetFileInfo secret_to_finder
file: "/Users/pjvenda/secret_to_finder"
type: "\0\0\0\0"
creator: "\0\0\0\0"
attributes: aVbstclinmedz
created: 12/10/2011 16:28:40
modified: 12/10/2011 16:28:48
$
$ # get attributes of file 'public'
$
$ GetFileInfo public
file: "/Users/pjvenda/public"
type: "\0\0\0\0"
creator: "\0\0\0\0"
attributes: avbstclinmedz
created: 12/10/2011 16:28:42
modified: 12/10/2011 16:28:52

Three details to note in the listing above: ls happily shows invisible files; ls marks files that carry extended attributes with an '@' symbol; and the attribute list of 'secret_to_finder' contains a capital 'V' (this is the same file that ls marked with '@').

To make 'secret_to_finder' visible to Finder again, clear the attribute with SetFile:

$
$ # change file attribute
$
$ SetFile -a v ./secret_to_finder
$
$ # list files again
$
$ ls -l public secret_to_finder
-rw-r--r--  1 pjvenda  staff  7 10 Dec 16:28 secret_to_finder
-rw-r--r--  1 pjvenda  staff  8 10 Dec 16:28 public
$
$ # check that hidden file is no longer hidden
$
$ GetFileInfo ./secret_to_finder
file: "/Users/pjvenda/secret_to_finder"
type: "\0\0\0\0"
creator: "\0\0\0\0"
attributes: avbstclinmedz
created: 12/10/2011 16:28:40
modified: 12/10/2011 16:28:48

Visible again.
man {ls,SetFile,GetFileInfo} is your friend.
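For scripting, the capital 'V' can be picked out of GetFileInfo's output mechanically. The following is a rough sketch: is_invisible is a hypothetical helper of mine that reads GetFileInfo output on standard input, and the sample text is copied from the listing above so that the snippet runs without the developer tools installed.

```shell
#!/bin/sh
# is_invisible: read GetFileInfo output on stdin and succeed (exit 0)
# when the attribute string contains a capital 'V' (invisible to Finder).
is_invisible() {
    attrs=$(sed -n 's/^attributes: //p')
    case "$attrs" in
        *V*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Sample output taken from the GetFileInfo run on 'secret_to_finder'.
sample='file: "/Users/pjvenda/secret_to_finder"
type: "\0\0\0\0"
creator: "\0\0\0\0"
attributes: aVbstclinmedz'

if printf '%s\n' "$sample" | is_invisible; then
    echo "hidden from Finder"
else
    echo "visible in Finder"
fi
```

On a Mac the same test would be run as GetFileInfo somefile | is_invisible.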

View disk volumes on the command line: disk utility does not reveal everything

Mac OS X has a way of hiding partitions by using a particular partition type code: Apple_Boot. Having this type on a partition makes it invisible to Disk Utility. But it is there and I want to mount it.
It is actually very simple. diskutil handles this partition as if it was a common visible one (likely Apple_HFS).

$ # list disks and partitions (or slices :)
$
$ diskutil list
/dev/disk0
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:      GUID_partition_scheme                        *320.2 GB   disk0
   1:                        EFI                         209.7 MB   disk0s1
   2:          Apple_CoreStorage                         319.3 GB   disk0s2
   3:                 Apple_Boot Recovery HD             650.0 MB   disk0s3
/dev/disk1
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:                  Apple_HFS Macintosh HD           *319.0 GB   disk1
/dev/disk2
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:     FDisk_partition_scheme                        *8.0 GB     disk2
   1:                 Apple_Boot Recovery HD             650.0 MB   disk2s1
$
$ # disk2s1 is hidden from Disk Utility but diskutil is able to mount it
$ # (actually disk0s3 is of the same type - hidden too)
$
$ diskutil mount disk2s1
Volume Recovery HD on disk2s1 mounted
$
$ # quick check
$
$ mount
(...)
/dev/disk2s1 on /Volumes/Recovery HD (hfs, local, nodev, nosuid, journaled, noowners)

Mounted hidden partitions are handled by Finder just as any other mounted volume. man diskutil is your friend.
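To find such partitions without reading the table by eye, the type column can be filtered. This is only a sketch: hidden_partitions is a name I made up, it assumes the column layout shown above, and the sample rows are copied from that listing so it runs anywhere.

```shell
#!/bin/sh
# hidden_partitions: read 'diskutil list' output on stdin and print the
# identifier (last column) of every partition of type Apple_Boot.
hidden_partitions() {
    awk '$2 == "Apple_Boot" { print $NF }'
}

# Sample rows copied from the listing above.
sample='   0:      GUID_partition_scheme                        *320.2 GB   disk0
   1:                        EFI                         209.7 MB   disk0s1
   2:          Apple_CoreStorage                         319.3 GB   disk0s2
   3:                 Apple_Boot Recovery HD             650.0 MB   disk0s3
   1:                 Apple_Boot Recovery HD             650.0 MB   disk2s1'

printf '%s\n' "$sample" | hidden_partitions
```

On a Mac this would be fed with diskutil list | hidden_partitions, and each identifier printed can be handed to diskutil mount as shown earlier.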

Mount and unmount volumes and images on the command line: good for hidden volumes or image files

Handling disk images in Mac OS X is ridiculously simple. This is a powerful feature that OS X makes use of for all the right reasons, and then some more.
Using DMGs could hardly be made easier. Mounting and unmounting is done in Finder by double clicking icons and clicking 'eject' symbols or dragging icons to bins... What if the DMG file is invisible to Finder? One can make it visible (as shown earlier in this post) or it can be mounted via the command line with hdiutil like so:

$ # find out if BaseSystem.dmg is hidden
$
$ GetFileInfo ./BaseSystem.dmg 
file: "/Volumes/Recovery HD/com.apple.recovery.boot/BaseSystem.dmg"
type: "devi"
creator: "ddsk"
attributes: aVbstclinmedz
created: 10/06/2011 14:04:11
modified: 10/06/2011 14:04:11
$
$ # .dmg is hidden to Finder
$ # but it can still be mounted
$
$ hdiutil mount ./BaseSystem.dmg 
Checksumming Driver Descriptor Map (DDM : 0)…
     Driver Descriptor Map (DDM : 0): verified   CRC32 $81E6D0AF
Checksumming  (Apple_Free : 1)…
                    (Apple_Free : 1): verified   CRC32 $00000000
Checksumming Apple (Apple_partition_map : 2)…
     Apple (Apple_partition_map : 2): verified   CRC32 $1025E215
Checksumming Macintosh (Apple_Driver_ATAPI : 3)…
  Macintosh (Apple_Driver_ATAPI : 3): verified   CRC32 $F1E8BA9E
Checksumming  (Apple_Free : 4)…
                    (Apple_Free : 4): verified   CRC32 $00000000
Checksumming disk image (Apple_HFS : 5)…
..............................................................................
          disk image (Apple_HFS : 5): verified   CRC32 $97F66EDE
Checksumming  (Apple_Free : 6)…
                    (Apple_Free : 6): verified   CRC32 $00000000
verified   CRC32 $2F452569
/dev/disk5           Apple_partition_scheme          
/dev/disk5s1         Apple_partition_map             
/dev/disk5s2         Apple_Driver_ATAPI              
/dev/disk5s3         Apple_HFS                       /Volumes/Mac OS X Base System
$
$ # verify mount
$
$ mount
(...)
/dev/disk5s3 on /Volumes/Mac OS X Base System (hfs, local, nodev, nosuid, read-only, noowners, mounted by pjvenda)

The mounted image is fully useable in Finder as any normal image. man hdiutil is your friend.
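When mounting from a script it helps to capture where the image ended up. A small sketch, assuming the three-column device table shown at the end of the hdiutil output above; mount_points is a hypothetical helper and the sample rows are copied from that listing.

```shell
#!/bin/sh
# mount_points: read 'hdiutil mount' output on stdin and print the mount
# point (third column) of every slice that has one. Mount points may
# contain spaces, so everything from the first '/' of the third column
# to the end of the line is kept.
mount_points() {
    sed -n 's|^/dev/[^[:space:]]*[[:space:]]\{1,\}[^[:space:]]\{1,\}[[:space:]]\{1,\}\(/.*\)$|\1|p'
}

# Sample rows copied from the listing above.
sample='/dev/disk5           Apple_partition_scheme
/dev/disk5s1         Apple_partition_map
/dev/disk5s2         Apple_Driver_ATAPI
/dev/disk5s3         Apple_HFS                       /Volumes/Mac OS X Base System'

printf '%s\n' "$sample" | mount_points
```

The path it prints is also what I would pass to hdiutil detach when done with the image; check man hdiutil for the exact forms accepted on your version.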

Permanently disable spotlight indexing on a specific volume on any host

There is a way to ensure that a certain volume is never indexed by Spotlight regardless of which computer it is connected to. All that is required is to create a file called .metadata_never_index on the root of the said volume and Spotlight will refuse to touch it. This can be done in any OS capable of writing onto the volume's file system, not necessarily a Mac.

$ # start from a clean, indexable volume
$
$ mdutil -i on /Volumes/My\ Book
/Volumes/My Book:
 Indexing enabled. 
$
$ # create .metadata_never_index file
$
$ touch /Volumes/My\ Book/.metadata_never_index 
$
$ # disable spotlight indexing
$
$ mdutil -d /Volumes/My\ Book
/Volumes/My Book:
 Indexing and searching disabled.
$
$ # attempt to enable spotlight again
$
$ mdutil -i on /Volumes/My\ Book
/Volumes/My Book:
 Indexing and searching disabled.

If Spotlight is already indexing the volume, creating .metadata_never_index does not stop it immediately, although disconnecting and re-connecting the volume will. Once the file is in place, the disk will not be indexed by any Mac OS X system.
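Because the marker is just an empty file, the step is easy to wrap for reuse on whatever machine the disk is plugged into. A minimal sketch: never_index is a hypothetical helper name, and a scratch directory stands in for a mounted volume.

```shell
#!/bin/sh
# never_index VOLUME_ROOT: drop the marker file that tells Spotlight to
# leave the volume alone. Any OS that can write to the volume can do this.
never_index() {
    root=$1
    if [ ! -d "$root" ]; then
        echo "not a directory: $root" >&2
        return 1
    fi
    touch "$root/.metadata_never_index" && echo "marked: $root"
}

# Demonstration on a scratch directory standing in for a mounted volume.
fake_volume=$(mktemp -d)
never_index "$fake_volume"
ls -a "$fake_volume"
```

With a real disk the argument would be the mount point, for example never_index "/Volumes/My Book".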

Disable ongoing spotlight indexing on a specific volume

I've stopped counting the times I've inserted someone else's external disk or USB stick to have OS X immediately hogging CPU and I/O bandwidth for hours indexing everything it can read. In most cases, these volumes will very rarely be connected to my system, so there's absolutely no point in indexing them. Moreover, when it is someone else's drive, I don't want my OS snooping through every directory.
Mac OS X allows standard user accounts to manage Spotlight indexing on non-system volumes, which is a very nice touch. Administrative privileges are required for system volumes.
So to disable Spotlight indexing immediately, the tool to use is mdutil.

$ mdutil -d /Volumes/My\ Book
/Volumes/My Book:
 Indexing and searching disabled.

There is no mention of the '-d' switch on mdutil's man page but mdutil's online help has it. To enable indexing again: mdutil -i on /Volumes/My\ Book. Also, removing the drive and connecting it again does not resume Spotlight, so this disables Spotlight permanently for that specific disk on the host where it was done, but Spotlight instances running on other Macs may still index it. To disable Spotlight permanently for the target disk on any Mac have a look at the previous suggestion in this post.

Usage: mdutil -pEsa -i (on|off) -d volume ...
 Utility to manage Spotlight indexes.
 -p             Publish metadata.
 -i (on|off)    Turn indexing on or off.
 -d             Disable Spotlight activity for volume (re-enable using -i on).
 -E             Erase and rebuild index.
 -s             Print indexing status.
 -a             Apply command to all volumes.
 -V vol         Apply command to all stores on the specified volume.
 -v             Display verbose information.
NOTE: Run as owner for network homes, otherwise run as root.

I also found a GUI tool that does this with buttons: Spotless

View extended file attributes and access control lists

While copying data off my old home directory onto a new installation of Mac OS X, I found a few directories that I was not able to delete.

$ ls -ld Documents
drwx------+ 2 pjvenda  staff  68 10 Dec 15:55 Documents/
$ rm -rf Documents
rm: Documents: Permission denied

What? I'm the owner, and I own the top directory, so why the permission issue? Sure enough, root is able to delete it, but that is no answer to the problem. The clue here is the '+' symbol in the privilege section of ls's output.
This is a feature that is fairly common among modern file systems but seldom used, introduced into HFS+ in 10.4/Tiger: file system ACLs. Server editions of this operating system do provide a GUI to manage ACLs, but not the desktop version. ACLs may be controlled by using fsaclctl.
ls not only detects that the files or directories have ACLs applied to them, but it also shows details about the said ACLs.

$ ls -lde Documents
drwx------+ 2 pjvenda  staff  68 10 Dec 15:55 Documents/
 0: group:everyone deny delete

Ah! So nobody is allowed to delete the directory as per ACL #0. All I had to do was to get rid of the ACL - this can be done with chmod (err, surprise!).

$ # delete ACL with index 0
$
$ chmod -a# 0 Documents
$ ls -lde Documents
drwx------  2 pjvenda  staff  68 10 Dec 15:55 Documents/
$
$ # ACLs are gone. I can delete the directory now
$
$ rmdir Documents

Done. Alternatively chmod could be used explicitly with the same result like so: chmod -a 'everyone deny delete' Documents.
man {ls,chmod} is your friend
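With more than one entry on a file it is handy to list the ACL indexes before removing anything. A rough sketch, assuming the ls -lde layout shown above; acl_entries is a name I made up, and the sample is the Documents listing from this post.

```shell
#!/bin/sh
# acl_entries: read 'ls -lde' output on stdin and print "<index> <rule>"
# for each ACL line. ACL lines are indented and start with a numeric
# index followed by a colon; ordinary listing lines are skipped.
acl_entries() {
    sed -n 's/^ *\([0-9][0-9]*\): \(.*\)$/\1 \2/p'
}

# Sample output copied from the listing above.
sample='drwx------+ 2 pjvenda  staff  68 10 Dec 15:55 Documents/
 0: group:everyone deny delete'

printf '%s\n' "$sample" | acl_entries
```

The first field of each printed line is the index that chmod -a# expects.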

I looked into most of these small tasks while examining Lion's new recovery/installation model. The Recovery HD volume is hidden but useable via the command line; inside it there is a BaseSystem.dmg file that is invisible to Finder, which can be tackled by either mounting it in a terminal or unhiding it.

Credit

27 November 2011

The joys of hardware RAID

Hardware RAID

After having used software RAID on Linux for longer than I'd care to admit, I decided to go business and get a proper RAID controller. I mean having a decent motherboard with a bunch of unused bandwidth (2 channel PCI-X bus), it seemed only fair to make use of it.

I was primarily looking for a good SATA-II PCI-X controller with more than 4 ports. The short list came down to LSI Logic MegaRAID 300-8x, Adaptec 2820SA and 3ware 9550SX-8. Availability and cost end up being the same thing in this case. Most can be bought new but they are extortionately expensive. Alternatively there's the 2nd hand market on eBay... but few cards of this type are there. Eventually I got the 12 port version of the 3ware card (9550SX-12) plus a cache battery (!!).

3ware 9550SX-12

Advantages

The whole point of this was to free up system resources from RAID duties (mostly kernel tasks eating away system time, which isn't that much for RAID 1) but more importantly to gain performance by making use of more disks over multiple high bandwidth channels. This was achieved by the 3ware controller which does a wonderful job at managing devices and RAID volumes on its own, independently of the operating system. In addition, the Linux kernel does include a driver that supports the card and the vendor's management tool (tw_cli) is very good.

Below is a quick listing of me detaching two independent disks and reattaching them in a RAID 1 array. The backup battery unit had not been charge-tested yet by the controller (a 20+ hour process), so it refused to enable functionality that depended on it.

# tw_cli /c0 show

Unit  UnitType  Status         %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------------
u0    RAID-1    OK             -       -       -       1862.63   OFF    OFF    
u1    JBOD      OK             -       -       -       931.513   OFF    OFF    
u2    JBOD      OK             -       -       -       931.513   OFF    OFF    
u3    RAID-1    OK             -       -       -       186.254   OFF    OFF    

Port   Status           Unit   Size        Blocks        Serial
---------------------------------------------------------------
p0     NOT-PRESENT      -      -           -             -
p1     OK               u0     1.82 TB     3907029168    WD-WCAZA3206335     
p2     NOT-PRESENT      -      -           -             -
p3     OK               u0     1.82 TB     3907029168    WD-WCAZA3189743     
p4     NOT-PRESENT      -      -           -             -
p5     NOT-PRESENT      -      -           -             -
p6     OK               u1     931.51 GB   1953525168    5QJ0RVB7            
p7     OK               u2     931.51 GB   1953525168    5QJ0ZA08            
p8     NOT-PRESENT      -      -           -             -
p9     NOT-PRESENT      -      -           -             -
p10    OK               u3     189.92 GB   398297088     B41AARNH            
p11    OK               u3     189.92 GB   398297088     B41AB7KH            

Name  OnlineState  BBUReady  Status    Volt     Temp     Hours  LastCapTest
---------------------------------------------------------------------------
bbu   On           No        Testing   OK       OK       0      xx-xxx-xxxx  

# tw_cli /c0/u2 del
Deleting /c0/u2 will cause the data on the unit to be permanently lost.
Do you want to continue ? Y|N [N]: Y
Deleting unit c0/u2 ...Done.
# tw_cli /c0/u1 del
Deleting /c0/u1 will cause the data on the unit to be permanently lost.
Do you want to continue ? Y|N [N]: Y
Deleting unit c0/u1 ...Done.
# tw_cli /c0 add type=raid1 disk=6-7 storsave=balance
Creating new unit on controller /c0 ... Done. The new unit is /c0/u1.
Setting Storsave policy to [balance] for the new unit ... Done.
Setting default Command Queuing policy for unit /c0/u1 to [on] ... Done.
Setting write cache=ON for the new unit ...Failed
.  BBU is not ready. Use /c0/u1 set cache=ON command 
  to change the write cache policy when the BBU is ready.

# tw_cli /c0 show

Unit  UnitType  Status         %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------------
u0    RAID-1    OK             -       -       -       1862.63   OFF    OFF    
u1    RAID-1    OK             -       -       -       931.312   OFF    OFF    
u3    RAID-1    OK             -       -       -       186.254   OFF    OFF    

Port   Status           Unit   Size        Blocks        Serial
---------------------------------------------------------------
p0     NOT-PRESENT      -      -           -             -
p1     OK               u0     1.82 TB     3907029168    WD-WCAZA3206335     
p2     NOT-PRESENT      -      -           -             -
p3     OK               u0     1.82 TB     3907029168    WD-WCAZA3189743     
p4     NOT-PRESENT      -      -           -             -
p5     NOT-PRESENT      -      -           -             -
p6     OK               u1     931.51 GB   1953525168    5QJ0RVB7            
p7     OK               u1     931.51 GB   1953525168    5QJ0ZA08            
p8     NOT-PRESENT      -      -           -             -
p9     NOT-PRESENT      -      -           -             -
p10    OK               u3     189.92 GB   398297088     B41AARNH            
p11    OK               u3     189.92 GB   398297088     B41AB7KH            

Name  OnlineState  BBUReady  Status    Volt     Temp     Hours  LastCapTest
---------------------------------------------------------------------------
bbu   On           No        Testing   OK       OK       0      xx-xxx-xxxx  

# dmesg | tail
(...)
[58946.312871] 3w-9xxx: scsi0: AEN: INFO (0x04:0x001A): Drive inserted:port=7.
[58946.371418] 3w-9xxx: scsi0: AEN: INFO (0x04:0x001F): Unit operational:unit=2.
[58946.396867] sd 0:0:2:0: [sdc] Attached SCSI disk
[59352.626254] scsi 0:0:1:0: Direct-Access     AMCC     9550SX-12  DISK  3.08 PQ: 0 ANSI: 5
[59352.626400] sd 0:0:1:0: Attached scsi generic sg1 type 0
[59352.626770] sd 0:0:1:0: [sdc] 1953103872 512-byte logical blocks: (999 GB/931 GiB)
[59352.627651] sd 0:0:1:0: [sdc] Write Protect is off
[59352.627654] sd 0:0:1:0: [sdc] Mode Sense: 23 00 00 00
[59352.628233] sd 0:0:1:0: [sdc] Write cache: disabled, read cache: disabled, doesn't support DPO or FUA
[59352.783431]  sdc: unknown partition table
[59352.886156] sd 0:0:1:0: [sdc] Attached SCSI disk

Disadvantages

Downsides of this solution were few and, at the time, mostly negligible. The 3ware driver for Linux is functional but there are reports of implementation issues related to interrupt management and PCI interaction. It is a universal 3ware driver, maintained by the vendor, that supports a multitude of similar controllers, but it seems that updates are focused on supporting new cards. Another con of the hardware RAID route is that the on-disk format of the data is managed by the card, which means there is a strong possibility that disks and RAID volumes become readable only by compatible 3ware controllers (using the same on-disk format). This reduces flexibility and increases risk in case the controller fails. There is documentation on the Internet that describes this.

In use

Despite the disadvantages, which I considered at first but digested over the initial period of testing, I decided to go ahead and move my server from software to hardware RAID. Both my 1TB and 2TB disks were made into RAID1 volumes, which the operating system happily uses as if they were single disks. Very cool. I treated these volumes as simple disks, which I partitioned and gave to LVM.

The card supports and handles hot-swapping and moving disks between physical ports well. I disconnected and connected disks while the volumes were up and all went smoothly. I can't be sure now, but I don't think the card rebuilt the entire volumes - just the blocks that had changed. Swapping ports was no trouble either (even online): all disks were recognised and put into the correct volumes. Booting worked well too, so no complaints in terms of functionality.

However, and in line with reports on the Internet, performance under concurrent access was not great: the system more or less locked up while multiple heavy I/O operations were taking place. Sure, every system becomes sluggish when lots of I/O is happening, but operations in memory, using cached files, using the shell, etc. normally keep working smoothly as long as they do not need to touch the disks. Not so with the 3ware, where the shell would become unresponsive to keyboard input. Single operation performance, on the other hand, was great! Can't remember the numbers - I must have them written down somewhere.

Breakage

A few months into the break-in period, I started finding ext4 errors reported by the kernel. It was also my fault for not enabling automatic fsck in fstab (in a nutshell, it's the last column of fstab entries: '2' for non-root volumes, '1' for root, '0' for swap; see 'man fstab' for more info).
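As a rough illustration of that last fstab column, here is a small sketch that lists the mount points whose sixth field is 0 or missing, i.e. the ones fsck skips at boot. The function name and the sample entries are made up for the example; swap is expected to show up in the output.

```shell
#!/bin/bash

# Print mount points from fstab-formatted input (stdin) whose sixth
# field is 0 or absent: these are never checked at boot.
# Function name and sample data are illustrative only.
fstab_unchecked() {
    awk '
        /^[ \t]*(#|$)/      { next }        # skip comments and blank lines
        $6 == "" || $6 == 0 { print $2 }
    '
}

fstab_unchecked <<'EOF'
# <fs>        <mountpoint> <type> <opts>   <dump> <pass>
/dev/sda1     /            ext4   noatime  0      1
/dev/vg/home  /home        ext4   noatime  0      0
/dev/vg/data  /data        ext4   noatime  0      2
/dev/sda2     none         swap   sw       0      0
/dev/vg/tmp   /tmp         ext4   noatime
EOF
```

On a live system the same function could be fed with `fstab_unchecked < /etc/fstab`.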

Keep calm and carry on.

Not happy about it, I decided to fix the errors, scan the disks, look in host and guest system logs and look for hardware faults in the controller logs. None found. A bit of research into Xen, ext4, LVM, 3ware, etc. revealed few clues.

Assuming it might be an issue with ext4, I tried changing a few less important file systems back to ext3, which may be bad in many ways, but _not_ in stability. Soon into this operation errors became frequent, appeared under ext3 too and, worryingly, operations on one file system were generating errors in other file systems (eek!!). So something was badly wrong. At this point the host OS's root filesystem started to fall apart and important files went missing.

Now panic.

In disaster recovery mode, I decided not to touch anything, verified that the file server VM was working (it was), bought a large external disk and proceeded to copy all the important information out of it over the network (which took the best part of 2 days). This is exactly the type of trouble that RAID1 won't get you out of: file system corruption. Fortunately the Xen guest images were largely unaffected so I was mostly OK, although I was not fully aware of the extent of the damage at the time.

Incident analysis

Frankly I don't know what caused the file system corruption. However, the simple fact that corruption happened under ext4 *and* ext3, and that operations on one file system caused problems in other file systems, leads me to look away from the file system itself and into some lower layer of code. Below the file system there is VFS, LVM and the 3ware driver in the kernel. Further downstream, we have the controller itself and the disks. Any of the above can interfere with more than one file system simultaneously, and I imagine it likely would if something misbehaved. Another variable to throw into the mix is, of course, Xen 4.1.1.

Given that I don't often have this type of issue, I decided to take back the last change that I had introduced: the RAID1 hardware implementation.

I went back to software RAID, reinstalled the server and performed some tests, which went well. I'm using the same disks as I didn't find any fault in them and I am also using the same controller card, except all disks are now being exported directly rather than in a RAID volume (some would call this JBOD exports). I couldn't resist using the controller's 1GB of battery-backed read/write cache memory... Hopefully it is not faulty.

Conclusion

If the same problem does not happen again, then I have to assume that something in the driver or hardware RAID1 implementation is wrong or does not play nicely with Linux and/or with Xen. In the meantime I will also try to buy another SATA-II PCI-X card, but this time RAID is purely optional.

17 May 2011

Enabling a full XEN domU login console

So I got rid of vserver and I'm rebuilding my server with xen. I'm building a XEN 4.1 with Gentoo XEN kernels for domain 0 and unprivileged domains. There were a number of issues with the process but I managed to get a stable fully functional dom0 kernel going. Unprivileged domains will have to be built from scratch as the current file systems were tweaked for the vserver environment.

The base file system is a Gentoo amd64 stage3 mounted in loopback mode. I also have a functional domU kernel so it was time to create a sample configuration file and fire up a virtual host with

xm create <config_file> -c
It seemed to boot up properly but console output ceased immediately after the kernel booted - the point at which process 1 is called: init. Some theory as to why this happens can be found here: http://www.xen.org/files/xensummit_4/xensummit_linux_console_slides.pdf

So to enable a fully functional xen login console the following is required (as always, there are other methods for similar or different purposes):

  • Make sure your domU kernel has all serial ports disabled. This may not be required but it will save some potential hassle because of how xen handles domU kernels;

  • Make sure your domU file system is populated with a bare base of device files in /dev (console, null, etc.). Gentoo's stage {1,2,3} base filesystems have all the necessary files;

  • Configure the kernel's virtual terminal driver to use xen's subsystem by adding the following command line parameter
    xencons=xvc
    As far as I understand, this is the default for current XEN kernels, so this parameter may not be required (it wasn't in my case but it's here for the sake of completeness);

  • Configure the kernel's console to output to a xvc type terminal. This is done by adding
    console=xvc0
    to the domU's kernel command line;

  • Adding kernel command line parameters can be done by editing the configuration file and adding (or adding to) an 'extra=' entry with whatever command line parameters are required. Specifically for this case, that would be
    extra = 'xencons=xvc console=xvc0'
    If 'extra=' already exists and contains something, just add the console parameter at the end:
    extra = 'parameter=value param2=value2 xencons=xvc console=xvc0'

  • Observe the kernel bootup messages looking for lines with 'console'. There should be one similar to:
    Xen virtual console successfully installed as xvc0

At this point there should be a working console past the init process, and service startup output (rc*) will be visible. However, it is likely that a login prompt won't appear. If that's the case and you want one, read on.

  • /etc/inittab can be set up to fire respawning login terminals at character devices, such as serial ports or the xen console (xvc0). One or more terminal lines are probably already in /etc/inittab with getty processes such as
    c1:12345:respawn:/sbin/agetty 38400 tty1 linux
    I modified one of those to point at /dev/xvc0 rather than at /dev/tty1:
    c1:12345:respawn:/sbin/agetty 38400 xvc0 linux
    (in case you're wondering, the first parameter c1 is only a label). In addition, for xen domU virtual hosts, there is little point in having any other login terminals, so the remaining (at tty2, tty3 and so on) can safely be commented out;

  • Remember to set up a root password...;

  • The final step is to get your system to allow root logins on the xen console. /etc/securetty contains a list of terminal devices over which root logins are allowed, to which 'xvc0' needs to be added (no /dev/);
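The securetty step above can be sketched as a small idempotent helper. This is only an illustration: the function name is made up, and the demo works on a scratch file rather than the real /etc/securetty (which needs root to edit).

```shell
#!/bin/bash

# Append a terminal name to a securetty-style file unless it is already
# listed. Both arguments are illustrative; on a real domU the file would
# be /etc/securetty and the change needs root.
allow_root_tty() {
    local tty="$1" file="$2"
    grep -qxF -- "$tty" "$file" 2>/dev/null || printf '%s\n' "$tty" >> "$file"
}

# Demonstrate on a scratch file rather than the live one.
scratch=$(mktemp)
printf 'console\ntty1\n' > "$scratch"
allow_root_tty xvc0 "$scratch"
allow_root_tty xvc0 "$scratch"    # second call is a no-op
cat "$scratch"
rm -f "$scratch"
```

Matching the whole line with a fixed string (-x -F) avoids treating something like 'xvc01' as if it were already 'xvc0'.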

Done!

A few more things I learnt while setting up this template file system:

  • When creating sparse loopback file systems, make sure the host file system can accommodate the entire file, or else the loopback file system will become corrupt;
  • Linux does strange things when it runs out of space on /;
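For the sparse file lesson, a quick way to see the gap between what a loopback image claims and what the host has actually allocated is sketched below. It is a minimal example assuming standard wc, du and df behaviour; the function name and output format are invented for illustration.

```shell
#!/bin/bash

# Report, in KiB, a file's apparent size, the space actually allocated
# to it, and the free space on the file system holding it.
# Function name and output format are illustrative only.
sparse_report() {
    local file="$1" apparent allocated free
    apparent=$(( ( $(wc -c < "$file") + 1023 ) / 1024 ))
    allocated=$(du -k "$file" | awk '{ print $1 }')
    free=$(df -Pk "$file" | awk 'NR == 2 { print $4 }')
    echo "apparent=${apparent} allocated=${allocated} free=${free}"
    # The file can still grow by (apparent - allocated) KiB on the host.
    if [ $(( apparent - allocated )) -gt "$free" ]; then
        echo "WARNING: host file system cannot hold the fully allocated file"
    fi
}

# Demo: a 10 MiB sparse file created by seeking past the end with dd.
img=$(mktemp)
dd if=/dev/zero of="$img" bs=1024 count=0 seek=10240 2>/dev/null
sparse_report "$img"
rm -f "$img"
```

If the warning fires, the image can run out of backing space as soon as the guest starts filling it.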

kthxby!

21 April 2011

code snippet and a quote

Code snippet unrelated to quote and quote unrelated to code snippet.

Code snippet [needed this for a while]:

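#!/bin/bash

# fsuffix <path>: look for files named <path>-<number> and work out the
# highest existing suffix and the next free one.
# Sets LSTSFX_OUT (last existing file, empty if none) and FSTSFX_OUT
# (first free name); also returns the next suffix as exit status (0-255).
function fsuffix() {
       local LEN=2
       # ${1} full path
       FILE=$(basename "${1}")
       DIR=$(dirname "${1}")

       # get last file of the specified type
       LS=$(ls -1 "${1}"-* 2>/dev/null | grep -E "${1}-[0-9]+$" | sort -nr | head -n 1)

       if [ -z "${LS}" ]; then
               FSTSUF=1
               LSTSUF=0
               LST=""
               LSTSFX_OUT=""
       else
               # force base 10 so zero-padded suffixes such as 08 are not read as octal
               LSTSUF=$((10#$(echo "${LS}" | grep -Eo "[0-9]+$")))
               FSTSUF=$((LSTSUF+1))
               printf -v LST "%s-%0${LEN}d" "${FILE}" "${LSTSUF}"
               LSTSFX_OUT="${DIR}/${LST}"
       fi

       printf -v FST "%s-%0${LEN}d" "${FILE}" "${FSTSUF}"

       FSTSFX_OUT="${DIR}/${FST}"
       return ${FSTSUF}
}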

Quote:
I'm going to record this in your file, under commendations-oh, there's a lot of space here. "Did well...enough"

31 March 2011

Ducati Desmosedici RR

One of the past weekend's highlights was spotting a rare bike... Alice.


Alice is a Ducati Desmosedici RR: a £40k road legal MotoGP motorcycle replica. This is a rare find, particularly due to its price, but also because of scarce availability.

There's a lot special about this bike from an engineering point of view.


Here's a detailed technical description and comparison between a Desmosedici RR and a GPn: http://robotpig.net/__automotive/ducati_desmosedici.php?page=1
And this is a very good succinct 3d model of a desmodromic valve: http://www.seastarsuperbikes.co.uk/ducatiengines.html

From a rider's standpoint this is probably as close as one can get to a road legal MotoGP bike. And that's ~200bhp at the rear wheel, slipper clutch, 171kg (== over 1000 bhp per 1000kg), a beautiful growling Ducati V-4 engine attached to a glorious - barely legal - exhaust and the handling manners of a race bike.
This is _not_ my type of bike, but it's as exciting as a GT1 racing car. As such, I would not pass up the opportunity to try it out. :-]

This week's reading list

A covert distributed file system implemented on top of hacked printers.
http://www.remote-exploit.org/wp-content/uploads/2011/03/Printers-Gone-Wild.pdf
Video here: http://www.remote-exploit.org/?page_id=764

A more generic, yet much longer and deeper printer hacking presentation. Included in the discussion are the issues of firmware infection and remote attacks to printers with malicious physical consequences.
http://archive.hack.lu/2010/Costin-HackingPrintersForFunAndProfit-slides.pdf

Cisco's guide of IPv6 for dummies. This is a long PDF presentation that is well worth the time to go through.
http://ipv6forum.se/wordpress/wp-content/uploads/2009/01/ipv6-for-dummies-se-090120.pdf

TCP hijacking state of the art (in the context of proxy services)
http://www.squid-cache.org/~adrian/talks/20080510%20BSDCan%20TCP%20Hijacking%202.pdf
Complements well with this tool: http://intrepidusgroup.com/insight/mallory/

fun!

08 March 2011

eroded compact disc

Metallica: the black album, an album of a very rare breed of musical work. I bought this CD a long time ago - around 1994 (about 17 years ago). Shortly after I bought it, maybe a year or so, I noticed these tiny cracks appearing around the edges. Those tiny cracks have been growing as if erosion or corrosion has been taking place in the reflective material.

eroded compact disc?
No other CD I own has ever had this kind of issue. I'm guessing this was a manufacturing defect, perhaps a one-off or an entire batch, who knows? The last track doesn't play any more - there's no reflective material left to cover the entire surface that contains it.

Isn't it ironic that this kind of physical wear would develop in a compact disc - one of the most robust digital media ever made, originally developed and aimed at the consumer market as a reliable medium to record and play back music? The fact that this was a one-off in my collection and that a (quick) google search revealed nothing of this kind probably means that this is indeed rare.

But a curious one, nonetheless...

14 February 2011

long traceroutes

Physical distance and network distance have an interesting relationship. Even though intra and inter-city links can be horribly slow and therefore not a good measure of physical distance, the same is not as true for international and inter-continental links.

From the UK (a.a.a.a) to Japan (z.z.z.z) via the Internet:

traceroute to z.z.z.z (z.z.z.z), 30 hops max, 40 byte packets
 1  b.b.b.b (b.b.b.b)  0.710 ms  0.606 ms  0.636 ms
 2  c.c.c.c (c.c.c.c)  3.065 ms  2.447 ms  2.817 ms
 3  d.ukcore.bt.net (d.d.d.d)  2.810 ms  2.793 ms  2.483 ms
 4  e.e.e.e (e.e.e.e)  7.235 ms  6.432 ms  7.068 ms
 5  f.ukcore.bt.net (f.f.f.f)  6.449 ms  6.360 ms  6.221 ms
 6  * g.eu.bt.net (g.g.g.g)  6.151 ms  5.906 ms
 7  h.eu.bt.net (h.h.h.h)  88.322 ms  88.575 ms  88.227 ms
 8  i.eu.bt.net (i.i.i.i)  88.652 ms  88.257 ms  88.850 ms
 9  * * *
10  j.j.j.j (j.j.j.j)  164.092 ms  163.958 ms  163.862 ms
11  k.kddnet.ad.jp (k.k.k.k)  151.157 ms  150.945 ms  150.999 ms
12  l.kddnet.ad.jp (l.l.l.l)  262.545 ms  264.215 ms  262.769 ms
13  m.kddnet.ad.jp (m.m.m.m)  284.810 ms  278.666 ms  276.577 ms
14  n.kddi.ne.jp (n.n.n.n)  270.713 ms  270.666 ms  282.805 ms
15  o.kddi.ne.jp (o.o.o.o)  278.036 ms  278.105 ms  278.458 ms
16  p.kddi.ne.jp (p.p.p.p)  269.161 ms  266.220 ms  270.104 ms
17  q.kddi.ne.jp (q.q.q.q)  279.407 ms  283.477 ms  279.755 ms
18  r.r.r.r (r.r.r.r)  295.864 ms  266.645 ms  267.897 ms
19  s.s.s.s (s.s.s.s)  268.964 ms  268.345 ms  267.745 ms
20  z.jp (z.z.z.z)  265.074 ms  275.107 ms  263.744 ms


The key hops are 6 to 7 (~82ms) [UK-Europe], 8 to 10 (~76ms) [Europe-Japan] and 11 to 12 (~110ms). These represent the respective network distances of the links between UK and somewhere in Europe and between somewhere in Europe and Japan.
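The hop-to-hop differences quoted above can be reproduced with a bit of awk. This is just a sketch: the function name is made up, it takes the lowest RTT of each hop as the representative value, and it skips hops that only returned '*'.

```shell
#!/bin/bash

# For each traceroute hop on stdin, print the hop number, its best (lowest)
# RTT in ms and the increase over the previous hop that answered.
# Hops that only returned "*" are skipped. Name is illustrative.
hop_deltas() {
    awk '
        $1 ~ /^[0-9]+$/ {
            best = -1
            for (i = 2; i <= NF; i++)
                if ($i == "ms" && (best < 0 || $(i-1) + 0 < best))
                    best = $(i-1) + 0
            if (best < 0) next
            if (seen) printf "%d %.3f %+.3f\n", $1, best, best - prev
            else      printf "%d %.3f\n", $1, best
            prev = best; seen = 1
        }
    '
}

hop_deltas <<'EOF'
 6  * g.eu.bt.net (g.g.g.g)  6.151 ms  5.906 ms
 7  h.eu.bt.net (h.h.h.h)  88.322 ms  88.575 ms  88.227 ms
 8  i.eu.bt.net (i.i.i.i)  88.652 ms  88.257 ms  88.850 ms
 9  * * *
10  j.j.j.j (j.j.j.j)  164.092 ms  163.958 ms  163.862 ms
EOF
```

The third column is the extra latency picked up since the previous answering hop; the large jumps are the long-haul links.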

From the UK (a.a.a.a) to Australia (z.z.z.z) via a private link:

traceroute to z.z.z.z (z.z.z.z), 30 hops max, 40 byte packets
 1  b.b.b.b (b.b.b.b)  0.522 ms  0.491 ms  0.827 ms
 2  c.c.c.c (c.c.c.c)  2.811 ms  2.788 ms  2.753 ms
 3  d.d.d.d (d.d.d.d)  0.567 ms  0.543 ms  0.515 ms
 4  e.e.e.e (e.e.e.e)  0.467 ms  0.771 ms  0.556 ms
 5  f.f.f.f (f.f.f.f)  6.864 ms  8.488 ms  11.544 ms
 6  g.g.g.g (g.g.g.g)  305.581 ms  306.083 ms  306.598 ms
 7  z.z (z.z.z.z)  305.507 ms  305.506 ms  305.377 ms


In this case, the important hops are 4 to 5 (~8ms) [likely intercity] and 5 to 6 (~295ms) [obviously intercontinental].

An interesting piece of trivia to take away from the latter case is that packets travelled half the world and back at a difficult-to-conceive 408 216 000 km/h or 253 653 663 mph. Or, from a scientific perspective, 0.378*c [where c is the speed of light in vacuum].

My trusty UDP packets travelled half the world and back at an amazing ~37.8% of the speed of light! That's including routing and other network processing, variable speeds of propagation through different media, etc.
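The arithmetic behind that figure is simply round-trip distance divided by round-trip time, compared against c. The sketch below shows the calculation; note that the distance is an assumption, since the post does not state which one was used. Roughly 17,320 km each way happens to land close to the quoted 0.378*c with the ~305.5 ms round trip from the traceroute.

```shell
#!/bin/bash

# Average speed of a round trip as a fraction of c.
# $1 = round-trip distance in km, $2 = round-trip time in ms.
# Function name is illustrative only.
fraction_of_c() {
    awk -v km="$1" -v ms="$2" 'BEGIN {
        c = 299792.458                 # speed of light in vacuum, km/s
        printf "%.3f\n", (km / (ms / 1000)) / c
    }'
}

# ASSUMPTION: ~17,320 km each way; the post does not state the distance.
fraction_of_c $(( 2 * 17320 )) 305.5
```

A different distance estimate (cable routes are longer than great-circle paths) would shift the result accordingly.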