Saturday, December 15, 2012

ZFS

I finally managed to get my desktop PC back online: after changing its motherboard, memory and hard drive, it's ready to use. Besides regular use, my idea is to store all my content on it until I get a proper NAS (if that ever happens).

Eventually I came to the point where I had to choose which file system to use. I could stay with ext4, which is a very popular and good file system for general use, but there are other options that provide, for example, better data integrity.

ZFS is widely regarded as one of the safest file systems. Originally developed by Sun, it is designed to provide excellent data integrity. Unfortunately for Linux users, license conflicts prevent it from being included in the kernel. Nonetheless, there is an implementation using FUSE, which I plan to use and show here.

The content I plan to store is mostly family pictures, videos and music, all of which are already compressed, so file system compression will not do much for me. ZFS supports compression, but I won't use it. If you plan to store text files or other uncompressed formats, it is probably a good idea to give it a try.

After installing it and getting it running (it has a daemon called zfs-fuse), the first thing to know is that ZFS is more than just a file system: it is also a volume manager. That means you don't even need to partition the disk. You can give it a whole disk, a partition, or even a file, and a file is what I will use for testing here.

First, let's create the file. I'll use a 512MB file as the data store:
# dd if=/dev/zero of=rawdisk.1 bs=4096 count=131072
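
As a quick sanity check of the dd arguments above, the file size is simply the block size times the block count. A minimal Python sketch of that arithmetic (the constant names are mine, chosen only for illustration):

```python
# Values taken from the dd command above: bs=4096 count=131072
BLOCK_SIZE = 4096
BLOCK_COUNT = 131072

size_bytes = BLOCK_SIZE * BLOCK_COUNT
size_mib = size_bytes // (1024 * 1024)

print(size_bytes)  # 536870912
print(size_mib)    # 512
```
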
After that we create a pool, which is like a container that can (but doesn't have to) hold multiple volumes. For this test I will use the pool as it is:
# zpool create testpool /home/paco/zfs-test/rawdisk.1
This not only creates the file system but also mounts it on /testpool, so it is ready to use. The pool information is stored in /var/lib/zfs/

The status of the pool can be checked with:
# zpool status
  pool: testpool
 state: ONLINE
 scrub: none requested
config:

    NAME                             STATE     READ WRITE CKSUM
    testpool                         ONLINE       0     0     0
      /home/paco/zfs-test/rawdisk.1  ONLINE       0     0     0

errors: No known data errors
So far so good. Now I copy a couple of images there and, after that, unmount the file system:
# zfs unmount /testpool
Now I'll try to simulate data corruption. Since the device that I'm using for the pool is a file, I can modify some bytes of it and see how it goes. The images I copied have the string "JFIF" near the start, so let's try to find that in the file:
# hexdump -C rawdisk.1 | less 

0040ac00  ff d8 ff e0 00 10 4a 46  49 46 00 01 02 01 00 48  |......JFIF.....H|
0040ac10  00 48 00 00 ff e1 1d 7a  45 78 69 66 00 00 49 49  |.H.....zExif..II|
0040ac20  2a 00 08 00 00 00 0f 00  0f 01 02 00 09 00 00 00  |*...............|
0040ac30  c2 00 00 00 10 01 02 00  10 00 00 00 cb 00 00 00  |................|
0040ac40  12 01 03 00 01 00 00 00  01 00 00 00 1a 01 05 00  |................|

Let's corrupt "JFIF" by overwriting the "IF" part with random data:
# echo $(((0x0040ac00)+8))

 4238344

# dd if=/dev/random of=rawdisk.1 seek=4238344  bs=1 count=2 conv=notrunc 

Let me explain this a bit. The first column of the hexdump is the offset of the first byte in the row, in our case 0x0040ac00 (hexadecimal), which is the position of the first byte, "ff". From there we start counting: 1 is d8, 2 is ff, 3 is e0 and so on, until we get to "IF", which is 49 and 46 at positions 8 and 9 (man ascii if you want to double check). With this information we use dd to write two random bytes from /dev/random at that position, and conv=notrunc keeps dd from truncating the rest of the file. To verify, you can run hexdump again and grep for 0040ac00: you will notice that "JFIF" is now "JF**", where each * is a random byte.
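
The same offset arithmetic and in-place overwrite can be sketched in Python. This is only an illustration of what the dd command does, not a replacement for it: the helper names `byte_offset` and `overwrite_random` are made up for this example, and the demo runs against a small throwaway file rather than the real pool file.

```python
import os
import tempfile


def byte_offset(row_offset: int, column: int) -> int:
    """Absolute offset of the byte at `column` (0-15) in a hexdump -C row."""
    return row_offset + column


def overwrite_random(path: str, offset: int, count: int) -> None:
    """Overwrite `count` bytes at `offset` in place (like dd conv=notrunc)."""
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(os.urandom(count))


# Same arithmetic as the shell expression above.
offset = byte_offset(0x0040AC00, 8)
print(offset)  # 4238344

# Demonstration on a throwaway file that starts like the JPEG header above.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01")
    demo_path = tmp.name

overwrite_random(demo_path, 8, 2)  # clobber the "IF" bytes

with open(demo_path, "rb") as f:
    data = f.read()
os.remove(demo_path)

print(data[6:8])  # b'JF' (untouched); bytes 8 and 9 are now random
```

Opening the file in "r+b" mode is what keeps the rest of the file intact, which is the role conv=notrunc plays for dd.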

I repeated this a couple of times and then mounted the file system again:
# zfs mount testpool

So far so good. Now let's run a scrub, which verifies the checksums of the data in the pool:
# zpool scrub testpool

And let's check the output:
# zpool status testpool   
  pool: testpool
 state: ONLINE
status: One or more devices has experienced an error resulting in data
    corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
    entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: scrub completed after 0h0m with 2 errors on Fri Dec 14 23:59:27 2012
config:

    NAME                             STATE     READ WRITE CKSUM
    testpool                         ONLINE       0     0     2
      /home/paco/zfs-test/rawdisk.1  ONLINE       0     0     4

errors: 2 data errors, use '-v' for a list

It managed to detect 2 errors. Let's see more detail:
# zpool status -v testpool
  pool: testpool
 state: ONLINE
status: One or more devices has experienced an error resulting in data
    corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
    entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: scrub completed after 0h0m with 2 errors on Fri Dec 14 23:59:27 2012
config:

    NAME                             STATE     READ WRITE CKSUM
    testpool                         ONLINE       0     0     2
      /home/paco/zfs-test/rawdisk.1  ONLINE       0     0     4

errors: Permanent errors have been detected in the following files:

        /testpool/1920-1200-11295.jpg
        /testpool/1920-1200-217.jpg

These are the two image files that we modified. Very impressive! I'm not using a mirror, so ZFS can't repair them, but it is possible to add another disk, partition or file as a mirror, in which case it would recover the damaged data from the healthy copy.
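
To make the idea concrete, here is a toy Python sketch of checksum-based detection and mirror recovery. It is not how ZFS is implemented: SHA-256 stands in for whatever checksum the pool actually uses, and `checksum` and `read_verified` are hypothetical names invented for this example.

```python
import hashlib


def checksum(block: bytes) -> str:
    """Stand-in checksum; the real pool may use a different algorithm."""
    return hashlib.sha256(block).hexdigest()


def read_verified(copies, expected):
    """Return the first copy whose checksum matches, or None if all are bad."""
    for copy in copies:
        if checksum(copy) == expected:
            return copy
    return None


original = b"\xff\xd8\xff\xe0\x00\x10JFIF"
expected = checksum(original)            # stored when the block was written
damaged = original[:8] + b"\x00\x00"     # "IF" overwritten, as in the test above

single = read_verified([damaged], expected)              # no mirror
mirrored = read_verified([damaged, original], expected)  # with a mirror

print(single)                # None: corruption detected, nothing to repair from
print(mirrored == original)  # True: the healthy copy is returned
```

With a single copy the mismatch can only be reported, which matches the "Permanent errors" message above. With a second copy there is something to recover from.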

References:
  1. Wikipedia's article about ZFS
  2. ZFS implementation for Linux using fuse
  3. Solaris ZFS Administration Guide 
Spanish version here (linuxlatino)

Friday, November 2, 2012

Other people's code

A tough week, like the last few have been. Work is demanding, but for the moment it is still fun. I came to the conclusion that I enjoy modifying or picking up other people's code more than my own.

The magic behind this is that I can (in some way) get inside the original author's mind and figure out the reasons why they chose to implement this or that in one way or another.

You learn a lot about a person by reading their code. Every time it happens I end up focusing more on the mind behind the code than on the code itself... which, professionally speaking, is not entirely a good thing.

Anyway, lately I have had to look at some pretty bad code, but even the process of redoing and optimizing it is highly rewarding.

This week's novelty was determining whether an IP address (IPv4) belongs to a specific network and netmask.
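
A minimal Python sketch of one common way to do this check (not necessarily the way it was done at work): convert the dotted-quad strings to 32-bit integers, AND both the address and the network with the mask, and compare. The function names `ipv4_to_int` and `in_network` and the sample addresses are made up for illustration. The standard library's ipaddress module is used only as a cross-check.

```python
import ipaddress


def ipv4_to_int(address: str) -> int:
    """Convert a dotted-quad IPv4 string to a 32-bit integer."""
    parts = [int(p) for p in address.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError(f"not a valid IPv4 address: {address!r}")
    value = 0
    for p in parts:
        value = (value << 8) | p
    return value


def in_network(address: str, network: str, netmask: str) -> bool:
    """True if `address` falls inside `network` under `netmask`."""
    mask = ipv4_to_int(netmask)
    return (ipv4_to_int(address) & mask) == (ipv4_to_int(network) & mask)


inside = in_network("192.168.1.42", "192.168.1.0", "255.255.255.0")
outside = in_network("192.168.2.42", "192.168.1.0", "255.255.255.0")
print(inside)   # True
print(outside)  # False

# Cross-check with the standard library.
net = ipaddress.ip_network("192.168.1.0/255.255.255.0")
stdlib_inside = ipaddress.ip_address("192.168.1.42") in net
print(stdlib_inside)  # True
```
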

Tuesday, July 3, 2012

drg to sbg

After releasing version 1.2.11 I thought my job was done and wasn't planning to improve it in any way, until I received an email with a bug report. My first thought was "wait, people are still using this?" and it seems they are. The project has close to 5k downloads in total, and 3 days after releasing 1.2.12 the download count was already at 19.

Since people are still using it, I'll do my best to improve it. This is my TODO list (for version 2.0):

  • Migrate to git (I like git a lot)
  • Drop ChangeLog (git log will provide that)
  • Change version scheme (2 digits instead of 3)
  • Allow raw output (I use a small function to parse the original output to be SBaGen friendly)
  • Allow dumping the image (every drg file has an image)
  • Drop openssl dependency (currently only used for base64 decoding, seems overkill)
  • Build drg (if you have an image, description and SBaGen code, you can make your own .drg file)

I have to give special thanks to Loïc H. for providing me with a nice set of Unofficial doses to test. Merci beaucoup, mon ami.

Friday, June 15, 2012

Music and tags

After a long search for a tool that lets me modify music files' tags in a text interface, I decided to write my own. The main reason was that I couldn't find one that I liked, and the command line tools (there are a lot of them) didn't cut it.

My requirements are simple: I need a tool that I can use over ssh to organize and modify the tags in my music files. X11 forwarding and VNC are not an option since my client (Efika MX Smartbook) is limited and I can't stand the delay.

I decided to write it in C, for no particular reason other than it being my favorite programming language. But I have to admit, if you ever consider doing GUI programming you should use an object-oriented programming language. Most major GUI toolkits are built on one. "Oh, but what about gtk+?" you may ask. Well, gtk+ uses C, but its developers realized the need for OOP and built their own object-oriented framework for C, called GObject, and gtk+ is built on top of that.

It will use ncurses and TagLib, hence the name nctagger.

That said, my code is still unusable. It is so basic that I haven't even pushed anything to github. I'm sure I will publish something eventually, but well, you know, "my new job's a hassle and kids have the flu".

Thursday, June 7, 2012

Pomodoro

Recently I have been trying to improve my productivity. During my (quick) search I came across the Pomodoro Technique, a time management method created by Francesco Cirillo in the 80s. Wikipedia explains it much better than I ever will.

Basically you work for 25 minutes (a pomodoro) and after that you get a 5-minute break. After 4 pomodoros you get a longer break, from 15 to 20 minutes.
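
The cycle is easy to express in code. Here is a small Python sketch, with a made-up function name `pomodoro_schedule` and the long break fixed at 15 minutes (the low end of the range above), that builds the sequence of work and break periods:

```python
def pomodoro_schedule(pomodoros, work=25, short_break=5, long_break=15, per_cycle=4):
    """Return a list of (label, minutes) pairs for the given number of pomodoros."""
    schedule = []
    for n in range(1, pomodoros + 1):
        schedule.append(("work", work))
        # Every fourth pomodoro is followed by the long break.
        if n % per_cycle == 0:
            schedule.append(("long break", long_break))
        else:
            schedule.append(("short break", short_break))
    return schedule


plan = pomodoro_schedule(4)
total = sum(minutes for _, minutes in plan)

for label, minutes in plan:
    print(f"{label}: {minutes} min")
print(total)  # 130
```
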

Luckily for me there's a module available for my main work tool: Emacs. The module works well except that it doesn't beep when the pomodoro ends. Adding this feature was super easy, even for someone who has never coded in Lisp before, like myself.

For Emacs's beep (visible beep) notifications to work on Awesome WM, the following needs to be added to your .emacs:

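;; Set the urgency flag on the Emacs frame so the window manager highlights it.
(defun urgent-hint ()
  (let ((wm-hints (append (x-window-property "WM_HINTS" nil "WM_HINTS" nil nil t) nil)))
    ;; #x100 is the urgency bit in the WM_HINTS flags field.
    (setcar wm-hints (logior (car wm-hints) #x00000100))
    (x-change-window-property "WM_HINTS" wm-hints nil "WM_HINTS" 32 t)))

;; Three-argument wrapper that ignores its arguments; the bell setup below does not use it.
(defun urgent-hint-helper (a b c)
  (urgent-hint))

;; Call urgent-hint whenever Emacs would ring the bell.
(setq ring-bell-function 'urgent-hint)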