Tuesday, 11 January 2011

The C Programming Language (K&R) 01x09—Escape Sequences—Exercise 1-10

I am making my way through the book “The C Programming Language” by Brian Kernighan and Dennis Ritchie aka K&R. Some may look at the code in the previous blog posts and go: that’s sloppy, sloppy code. It may be, but that’s not the point. I’m following what’s in the book with minor variations for compiler issues. It’s about the code, but it’s not about the code. Think about it.

Exercise 1-10 from K&R

Write a program to copy its input to its output, replacing each tab by \t, each backspace by \b, and each backslash by \\. This makes tabs and backspaces visible in an unambiguous way.

So I’m writing this code from scratch.

I know what I have to do. First, get input from the keyboard. That’s straightforward. I will use the getchar() function declared in <stdio.h> (<cstdio> in C++).
I then have to test whether the character is one of three special characters: tab, backspace, or backslash. If it’s not, I don’t do anything with the character; it gets printed as is. If it is a special case, I have to display it as \t, \b, or \\.

I immediately recognize a problem. The variable c, holding the character input, is an integer and holds a single character. The visible form of a tab (ASCII 09), “\t”, is a string of two characters: a backslash and the letter t. The getchar() function returns one character as an int and putchar() takes one character as its argument, so c works fine with them. The putchar() function won’t take a string.
I have two options. First, look for a function that takes a string and prints it to the console, but at this point in the book, strings haven’t been discussed so I won’t go down that road. Second, use what’s been discussed in the book. That means using putchar() twice for each of these special cases. That’s what I did.
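
As a rough sketch of that second option, a special character can be written with two putchar() calls: one for the backslash and one for the letter that identifies it. The helper name put_escaped is made up for this illustration and isn’t from the book.

```cpp
#include <cstdio>

// Hypothetical helper (not from K&R): print a visible two-character
// escape such as \t using only putchar().
// Returns true if both characters were written successfully.
bool put_escaped(int second)
{
     // The first character is always a backslash; the second one
     // identifies which special character was read.
     return putchar('\\') != EOF && putchar(second) != EOF;
}
```

Calling put_escaped('t') writes a backslash and then the letter t to the console.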

In the special cases, the first character to print will be a backslash. The second character will vary depending on what was typed. I defined a variable, extra, as an integer to hold this extra character.

I use a chain of if/else if statements to test for the three conditions. Chaining the tests saves time when one of the conditions is true, because the remaining tests are skipped. I could have used a switch statement but it hasn’t been covered yet and I don’t see it as an improvement on the code I have below.
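
For comparison only, here is roughly how the same test might look with a switch statement. The function name escape_letter is an assumption for this sketch; it returns the letter to print after the backslash, or '\0' when the character needs no special handling.

```cpp
// Hypothetical helper: map a special character to the letter that
// follows the backslash in its visible form, or '\0' if c is ordinary.
int escape_letter(int c)
{
     switch (c) {
     case '\t':          // tab
          return 't';
     case '\b':          // backspace
          return 'b';
     case '\\':          // backslash
          return '\\';
     default:            // any other character is printed unchanged
          return '\0';
     }
}
```

It does the same job as the if chain, so which one reads better is mostly a matter of taste.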

I wrote the code such that the extra variable is either null (‘\0’) or the extra character. If we have an extra character, the code prints a backslash followed by the second character; otherwise it prints the c character without changes.

Sample Code.

I am using Visual C++ 2010 and creating the code as a console application.

// The standard library includes the system function.
#include <cstdlib>

// C++ standard I/O library
#include <cstdio>

// Replace tab, backspace and backslash
// characters with the visible text: \t, \b, \\
int main()
{
     // Character input variable.
     int c = 0;

     // Extra character for the special cases.
     // Set to null if not a special case.
     int extra = '\0';

     while ((c = getchar()) != EOF) {
           // Replace tab.
           if (c == '\t')
                extra = 't';
           // Replace backspace.
           // A backspace typed at the console usually never
           // reaches the program because input is buffered.
           else if (c == '\b')
                extra = 'b';
           // Replace backslash.
           else if (c == '\\')
                extra = '\\';

           // Display output.
           if (extra) {
                // Special case: backslash, then the extra character.
                putchar('\\');
                putchar(extra);
                // Reset extra char to null.
                extra = '\0';
           }
           else
                // Ordinary character: print it unchanged.
                putchar(c);
     }

     // keep console window open
     system("pause");

     // return some value
     return 0;
} // end main

As far as I can tell, console input read with getchar() in Visual C++ 2010 is line-buffered, so a backspace typed at the keyboard edits the input line instead of being passed to the program, and the backspace branch in this code never executes. A function that reads keys without buffering or echoing would solve this problem.

I tested the program by entering the following: a<tab>b<tab>c<tab><enter><enter> \a<tab>\b<tab>\c<tab><enter><enter><ctrl>+Z<enter>

The code works.
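
Since a backspace can’t be typed into the console, one way to check that case is to run the same replacements on a string in memory. This is only a sketch: the function name make_visible and the use of std::string are my own assumptions and go beyond what the book has covered at this point.

```cpp
#include <string>

// Hypothetical helper: apply the same replacements as the program,
// but to a string in memory instead of console input.
std::string make_visible(const std::string &in)
{
     std::string out;
     for (std::string::size_type i = 0; i < in.size(); ++i) {
          char c = in[i];
          if (c == '\t')
               out += "\\t";     // tab becomes backslash + t
          else if (c == '\b')
               out += "\\b";     // backspace becomes backslash + b
          else if (c == '\\')
               out += "\\\\";    // backslash becomes two backslashes
          else
               out += c;         // everything else is copied as is
     }
     return out;
}
```

For example, passing a string that holds x, a backspace and y returns the four characters x, \, b, y.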
