Sunday, December 21, 2008

PERL: Making Ctrl+C ineffective

** Before starting: I am not responsible for any consequences of running the scripts below, so think carefully before trying them out!! **

If you have been using Perl for a while, you know that when a program behaves in an unexpected or bizarre way, pressing Ctrl+C stops it. But did you know that you can disable this? By "disable" I mean that the program keeps running even after you press Ctrl+C. Surely that has got you thinking!! Let’s see how to achieve it.

This takes a single line at the start of your program:
$SIG{'INT'} = 'IGNORE';
But how do you check it? Try this piece of code:
$SIG{'INT'} = 'IGNORE';
my @arr = (0..10);
foreach (@arr) {
    print("$_\n");
    sleep 1;
}
While this snippet is running, pressing Ctrl+C will not stop your program.
This worked on Windows XP for me....Cool!! Isn’t it??
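
Ignoring Ctrl+C for the whole run is often more than you need. Perl's local can limit the ignore to one block of code, and the previous setting comes back when that block ends. Here is a minimal sketch; the helper name without_interrupts is made up for illustration and is not a built-in:

```perl
use strict;
use warnings;

# Hypothetical helper: run a piece of code with Ctrl+C (the INT signal) ignored.
sub without_interrupts {
    my ($code) = @_;
    local $SIG{INT} = 'IGNORE';    # previous setting is restored when this sub returns
    return $code->();
}

my $result = without_interrupts(sub {
    for my $n (0 .. 3) {
        print "$n\n";
        sleep 1;                   # pressing Ctrl+C here has no effect
    }
    return 'finished';
});

print "Protected block $result; Ctrl+C behaves normally again.\n";
```

Anything that runs after the call to without_interrupts can be stopped with Ctrl+C as usual.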

Let’s say that instead of ignoring Ctrl+C, you want the program to do something when it is pressed, such as print a message or exit. This is how it can be done:
sub INT_handler { #function starts
    print("Don't Interrupt!\n");
    #exit(0); - uncomment this to exit the program here instead of carrying on
}
$SIG{'INT'} = \&INT_handler; #a reference to the handler function
my @arr = (0..10);
foreach (@arr) {
    print("$_\n");
    sleep 1;
}
While the program is running, pressing Ctrl+C will print “Don’t Interrupt!”
But beware: on my setup, pressing Ctrl+C a second time stopped the program, which suggests the handler was put back to its default after the first call. On platforms where a handler stays installed (most Unix-like systems), every Ctrl+C should simply print the message again.

This worked on Windows XP for me.
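
If a second Ctrl+C ends the program, one likely cause is that the platform reset the handler to its default after the first call. A common workaround is to re-install the handler from inside itself. This is a minimal sketch, assuming that is what happens on your system; the counter $interrupts and the shorter loop are only for illustration:

```perl
use strict;
use warnings;

my $interrupts = 0;    # how many times Ctrl+C has been seen (illustrative only)

sub int_handler {
    $SIG{INT} = \&int_handler;    # re-install first, in case the platform reset it
    $interrupts++;
    print "Don't Interrupt! (Ctrl+C number $interrupts)\n";
}

$SIG{INT} = \&int_handler;

for my $n (0 .. 3) {
    print "$n\n";
    sleep 1;    # a signal may cut this sleep short
}
```

On systems that keep the handler installed, the extra assignment is harmless.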

Underlying facts:
Ctrl+C is really a signal sent to the program. When you press Ctrl+C, a signal called INT is delivered to the running process, and by default that signal stops the program. When we say $SIG{'INT'} = 'IGNORE', we tell Perl to ignore this signal, so it no longer affects program execution.
INT is only one of many signals. To see the others, you can run this code:
foreach (sort keys %SIG) { #%SIG is a hash keyed by signal name, much as %ENV is keyed by environment variable name
    print "$_\n"; #lists the signals supported by the platform
}
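
%SIG gives only the signal names. If you also want the signal numbers, Perl's Config module records both, in $Config{sig_name} and $Config{sig_num}. Here is a minimal sketch that pairs the two lists; the hash name %signo is just for illustration:

```perl
use strict;
use warnings;
use Config;

defined $Config{sig_name} && defined $Config{sig_num}
    or die "This perl does not report its signal list\n";

# The two strings are space-separated and line up position by position.
my @names = split ' ', $Config{sig_name};
my @nums  = split ' ', $Config{sig_num};

my %signo;
@signo{@names} = @nums;

# Print by number, then by name, since some numbers have more than one name.
for my $name (sort { $signo{$a} <=> $signo{$b} or $a cmp $b } keys %signo) {
    printf "%-10s %d\n", $name, $signo{$name};
}
```

The list differs from platform to platform, so expect different output on Windows and on Unix-like systems.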

You are now free to play around with these....Enjoy!!

Monday, December 8, 2008

PERL: Approaches for FileSize

Hi,
I’ve been a great fan of scripting languages, especially PERL (Practical Extraction and Report Language). My objective through this blog is to bring out some anomalies, or rather some surprising facts, that can make your program’s output differ from what you expect.

Here’s one.
Consider a situation where you need to find out the size of a file present on the disk, say C:\test.txt.

The content of test.txt is:
-------------------------------------------------------------
Hi,"\n"
How about some interesting facts in PERL"\n"
Look at this!!!"\n"
Interesting!!"\n"
-Chetan"\n"
"\n"
-----------------------------------------------------------
There are two ways to approach this situation.

1. open(FH, "< C:\\test.txt") or die $!; #open the file in read mode or exit with the error message
print "Size of file (in bytes) is: ";
print -s FH; # -s gets the size of the file behind the filehandle
The output of the above snippet is: Size of file (in bytes) is: 90

2. open(FH, "< C:\\test.txt") or die $!; #open the file in read mode or exit with the error message
my @filecontent = <FH>; #get the contents of the file in an array, one line per element
my $count = 0; #initialize a counter
foreach (@filecontent) #go through the file contents line by line
{
    $count += length($_); #get the length of each line and add it to the counter
}
print "Size of file (in bytes) is: $count"; #prints the summed length
The output here is: Size of file (in bytes) is: 84

Oh!! What’s the difference here? How come the same file gives two different sizes? Is anything wrong with the way either program is run? Which is the correct approach?

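Explanation: The hitch is not length($_) itself, which does count the "\n" at the end of each line. The difference comes from how the file is read. On Windows, each line of a text file ends with two bytes on disk, a carriage return and a line feed ("\r\n"), and when Perl reads a file in text mode (the default) it turns each pair into a single "\n". So each of the 6 lines is one character shorter in memory than on disk, and that accounts for the 6 missing bytes (90 - 84 = 6).
The correct size is the one given by the first approach, because -s reports the bytes actually stored on disk. If you really need to count bytes by reading the file, call binmode(FH) right after open so that no translation takes place. Now you know which approach you can bank on the next time you need the size of a file!!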
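
The 90 versus 84 gap can be reproduced on any platform. This is a minimal sketch, assuming test.txt was saved with Windows "\r\n" line endings. It writes the same six lines to a temporary file and reads them back twice, once through Perl's :crlf layer (what Windows text mode does) and once as raw bytes. The helper name summed_length and the temporary file are only for illustration:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# The six lines of test.txt, without their line endings.
my @lines = ('Hi,', 'How about some interesting facts in PERL',
             'Look at this!!!', 'Interesting!!', '-Chetan', '');

my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;                          # write the bytes exactly as given
print {$out} "$_\r\n" for @lines;      # Windows-style line endings
close $out or die "Cannot close $path: $!";

# Hypothetical helper: add up length() of every line read with the given open mode.
sub summed_length {
    my ($file, $mode) = @_;
    open(my $in, $mode, $file) or die "Cannot open $file: $!";
    my $total = 0;
    while (my $line = <$in>) {
        $total += length $line;
    }
    close $in;
    return $total;
}

my $on_disk = -s $path;                          # bytes stored on disk
my $text    = summed_length($path, '<:crlf');    # "\r\n" read as "\n"
my $raw     = summed_length($path, '<:raw');     # no translation

print "-s reports:        $on_disk\n";    # 90
print "text-mode lengths: $text\n";       # 84
print "raw-mode lengths:  $raw\n";        # 90
```

Reading raw bytes gives the same answer as -s, so the two approaches only disagree when line endings are translated.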