Debugging SquashFS and tools
Recently, I was playing with squashfs and its SquashFS LZMA companion. I hit a segmentation fault while building a squashfs image from the ASUS EEE read-only image. I'll try to explain my debugging process so that people who are not familiar with it can follow along (and learn something). I ran gdb on the core dump and discovered that the problematic function is write_file_blocks_dup (gdb output):
Program terminated with signal 11, Segmentation fault.
#0 0x08050b84 in write_file_blocks_dup ()
(gdb) bt
#0 0x08050b84 in write_file_blocks_dup ()
#1 0x080519e9 in write_file ()
#2 0x0805257c in dir_scan2 ()
#3 0x08052502 in dir_scan2 ()
#4 0x08052502 in dir_scan2 ()
#5 0x08052502 in dir_scan2 ()
#6 0x08052502 in dir_scan2 ()
#7 0x08052c60 in dir_scan ()
#8 0x0805506e in main ()
(gdb) print reader_buffer
$1 = 139977672
A segfault often happens when you try to read or write memory that you don't own (or that you have already freed). Looking at that function (write_file_blocks_dup), I found an interesting piece of code:
Code:
    if(read_buffer->c_byte) {
        read_buffer->block = bytes;
        bytes += read_buffer->size;
        file_bytes += read_buffer->size;
        if(block < thresh) {
            buffer_list[block].read_buffer = NULL;
            queue_put(to_writer, read_buffer);
        } else
            buffer_list[block].read_buffer = read_buffer;
    } else {
        buffer_list[block].read_buffer = NULL;
        alloc_free(read_buffer);
    }
    buffer_list[block].start = read_buffer->block;
    buffer_list[block].size = read_buffer->size;
    progress_bar(++cur_uncompressed, estimated_uncompressed, columns);
Notice the alloc_free call, which frees read_buffer. Immediately after it there are statements that still access that memory (which is no longer valid). So I changed it to:
Code:
    if(read_buffer->c_byte) {
        read_buffer->block = bytes;
        bytes += read_buffer->size;
        file_bytes += read_buffer->size;
        if(block < thresh) {
            buffer_list[block].read_buffer = NULL;
            queue_put(to_writer, read_buffer);
        } else
            buffer_list[block].read_buffer = read_buffer;
    } else {
        buffer_list[block].read_buffer = NULL;
    }
    buffer_list[block].start = read_buffer->block;
    buffer_list[block].size = read_buffer->size;
    progress_bar(++cur_uncompressed, estimated_uncompressed, columns);
    alloc_free(read_buffer);
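For readers who want to play with the idea outside of mksquashfs, here is a minimal sketch of one way to avoid this kind of use-after-free: copy the fields you need first, and free the buffer only on the path that no longer needs it. The struct layouts and the record_block function are hypothetical stand-ins (not the real mksquashfs definitions), and plain free() stands in for alloc_free.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the mksquashfs structures; only the fields
 * used in the excerpt above are modelled. */
struct file_buffer {
    long long block;
    int size;
    int c_byte;
};

struct buffer_entry {
    struct file_buffer *read_buffer;
    long long start;
    int size;
};

/* Record one block in the list. The buffer's fields are copied into the
 * entry BEFORE the buffer may be released, so nothing reads freed memory. */
void record_block(struct buffer_entry *entry, struct file_buffer *buf)
{
    assert(entry != NULL && buf != NULL);

    entry->start = buf->block;
    entry->size = buf->size;

    if (buf->c_byte) {
        /* Still needed later: keep the pointer, the caller frees it. */
        entry->read_buffer = buf;
    } else {
        /* Not needed any more: free it only after the copies above. */
        entry->read_buffer = NULL;
        free(buf);
    }
}
```

Calling record_block with a buffer whose c_byte is 0 fills in start and size and releases the buffer. With a non-zero c_byte the entry keeps the pointer and the caller stays responsible for freeing it.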
I made a patch for it and sent it to the author via the squashfs SourceForge page. That's one bug less...
But here it comes again: I hit another problem. My mksquashfs process got stuck at around 60% and the progress bar stopped moving. So I attached gdb to the PID of the mksquashfs process and discovered that it was waiting in the get_fragment function on a pthread condition variable, as shown below:
0xb7f32c01 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/tls/libpthread.so.0
(gdb) bt
#0 0xb7f32c01 in pthread_cond_wait@@GLIBC_2.3.2 ()
from /lib/tls/libpthread.so.0
#1 0x0804dcad in get_fragment ()
#2 0x0804e104 in duplicate ()
#3 0x0805155f in write_file_frag_dup ()
#4 0x080516f8 in write_file_frag ()
#5 0x08051a36 in write_file ()
#6 0x0805257c in dir_scan2 ()
#7 0x08052502 in dir_scan2 ()
#8 0x08052502 in dir_scan2 ()
#9 0x08052502 in dir_scan2 ()
#10 0x08052502 in dir_scan2 ()
#11 0x08052502 in dir_scan2 ()
#12 0x08052502 in dir_scan2 ()
#13 0x08052502 in dir_scan2 ()
#14 0x08052c60 in dir_scan ()
#15 0x0805506e in main ()
I tried to see which file it hangs on while building the filesystem, and it hangs at the following:
mksquashf 32709 root 3u REG 8,3 607996129 1484897 /asus-eee/asus-eee.lza
mksquashf 32709 root 4r REG 8,3 395 277747 /asus-eee/usr/share/games/frozen-bubble/gfx/menu/backgrnd-closedeye-right-green.png
I started it again, and it hung again at the same file:
mksquashf 6544 root 3u REG 8,3 607996129 1101584 /asus-eee/asus-eee.lza
mksquashf 6544 root 4r REG 8,3 395 277747 /asus-eee/usr/share/games/frozen-bubble/gfx/menu/backgrnd-closedeye-right-green.png
It seems that the pthread implementation is broken somehow. I looked at the get_fragment function and discovered that mksquashfs is waiting in the following piece of code:
Code:
    pthread_mutex_lock(&fragment_mutex);
    while(fragment_table[fragment->index].pending)
        pthread_cond_wait(&fragment_waiting, &fragment_mutex);
    pthread_mutex_unlock(&fragment_mutex);
    disk_fragment = &fragment_table[fragment->index];
    size = SQUASHFS_COMPRESSED_SIZE_BLOCK(disk_fragment->size);
Notice that there's no error checking on the pthread functions at all (for example on pthread_mutex_lock), so I would have to add error checking to see what's going wrong around the pthread calls (it looks like some kind of pthread runaway). But it's late, and this would take a long time, as I'd have to put error checks around every pthread call, so I'm leaving this as an exercise for the reader of this blog!
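If you want a head start on that exercise, here is a rough sketch of what the error checking could look like. PTHREAD_CHECK and wait_not_pending are names I made up for illustration, they are not part of mksquashfs. The sketch also swaps pthread_cond_wait for pthread_cond_timedwait, which is my own assumption about how one might spot a wait that nobody ever wakes up, not something the original code does.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: abort with a readable message if a pthread call
 * returns non-zero. pthread functions return the error number directly
 * instead of setting errno. */
#define PTHREAD_CHECK(call)                                        \
    do {                                                           \
        int rc_ = (call);                                          \
        if (rc_ != 0) {                                            \
            fprintf(stderr, "%s:%d: %s failed: %s\n",              \
                    __FILE__, __LINE__, #call, strerror(rc_));     \
            abort();                                               \
        }                                                          \
    } while (0)

/* Wait until *pending becomes 0, or until timeout_sec seconds pass.
 * Returns 0 when the flag was cleared, ETIMEDOUT when it was still set
 * at the deadline. Any other pthread error aborts the program. */
int wait_not_pending(pthread_mutex_t *mutex, pthread_cond_t *cond,
                     const int *pending, int timeout_sec)
{
    struct timespec deadline;
    int rc = 0;
    int still_pending;

    assert(timeout_sec >= 0);

    /* A default condition variable measures its timeout against
     * CLOCK_REALTIME, so the absolute deadline is built from that clock. */
    if (clock_gettime(CLOCK_REALTIME, &deadline) != 0) {
        perror("clock_gettime");
        abort();
    }
    deadline.tv_sec += timeout_sec;

    PTHREAD_CHECK(pthread_mutex_lock(mutex));
    while (*pending && rc == 0)
        rc = pthread_cond_timedwait(cond, mutex, &deadline);
    still_pending = *pending;   /* read while the mutex is still held */
    PTHREAD_CHECK(pthread_mutex_unlock(mutex));

    if (rc != 0 && rc != ETIMEDOUT) {
        fprintf(stderr, "pthread_cond_timedwait failed: %s\n", strerror(rc));
        abort();
    }
    return still_pending ? ETIMEDOUT : 0;
}
```

Build it with -pthread. If a wait like the one in get_fragment were replaced by something along these lines, an ETIMEDOUT result would suggest that no thread ever cleared the pending flag (or cleared it without signalling the condition), rather than that a pthread call failed.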
Good luck! :)
1 comment
The bug has also been fixed in the Squashfs CVS repository for some time (go to the summary page off http://squashfs.org.uk, which has directions for getting the current CVS version).
A new release with this and other fixes should have been released some time ago, and will be released shortly.
Phillip