Monday 24 January 2011

The C Programming Language (K&R) 01x17—Tab for Spaces—Exercise 1-20

Exercise 1-20 of “The C Programming Language” by Brian Kernighan and Dennis Ritchie, aka K&R, deals with parsing text entered by a user.

Exercise 1-20. Write a program detab that replaces tabs in the input with the proper number of blanks to space to the next tab stop. Assume a fixed set of tab stops, say every n columns. Should n be a variable or a symbolic parameter?

By this point in the book, writing this code, even from scratch, should be straightforward, but there is some work to figure out tab stops since a tab moves to fixed stops. It’s not the case that every tab is replaced with n number of spaces. The number of spaces will vary. Second, to replace a tab ‘\t’, which is one character, with several spaces, means stretching the char array. You can avoid this issue by outputting the text immediately and not storing any data in a character array.

On my computer, a tab takes up 8 characters. Here is an example.

Column:       1234567891123456789212345678931234567940
Element:      0123456789112345678921234567893123456794
              abcdT   abcabcT abcT    abc...

The first tab appears at element 4, or position 5 on the screen. 5 mod 8 is 5, and the tab width, 8, less 5 is 3, so three spaces have to be added after the tab's own position.

The tab character itself also has to be replaced with a space, so the total number of spaces is: 1 + Tab width (8) less ((Element position + 1) mod 8). When (Element position + 1) is not a multiple of the tab width, (Element position + 1) mod 8 equals (Element position mod 8) + 1, and the equation simplifies to: Tab width less (Element position mod 8). The simplified form holds for any tab width. It is also the safer one to use: when (Element position + 1) is a multiple of the tab width (e.g., element 7), the longer form gives 9 instead of the correct 1.

As for how to store the value of the tab width, it’s better to create a variable, TabWidth. If it’s a variable, it can be changed at run time to suit a user’s system without recompiling the program. If the value is stored as a symbolic constant, the value can only be changed by recompiling the program. The symbolic constant (e.g., #define TABWIDTH 8) is replaced by its literal value by the preprocessor before compilation.


Sample Code.

I am using Visual C++ 2010 and created the sample code as a console application.

// Function prototype.
int detab(char text[]);

// The standard library includes the system function.
#include <cstdlib>

// Standard I/O library.
#include <cstdio>


int main()
{
     int c, i = 0;
     char line[80];

     // Get user input from keyboard.
     while ((c = getchar()) != EOF)
     {
           // Store char; drop any that would overflow the array.
           if (c != '\n')
           {
                if (i < 79)
                {
                     line[i] = c;
                     ++i;
                }
           }
           else
           {
                // Store null char to mark end.
                line[i] = '\0';
                // Parse text to replace tabs w spaces.
                detab(line);
                // Output text.
                printf("%s\n\n", line);
                // Reset counter.
                i = 0;
           }
     }

     // Keep console window open.
     system("pause");

     // Return some value.
     return 0;

} // end main

// Replace each tab with spaces up to the next tab stop.
int detab(char text[])
{
     int iFrom, iTo = 0;
     char temp[80];
     int SpaceCount = 0;
     int TabWidth = 8;

     // Stop at the end of the text, or when temp (80 chars) is full.
     for (iFrom = 0; text[iFrom] && iTo < 79; ++iFrom)
     {
           // Find tab.
           if (text[iFrom] == '\t')
           {
                // Replace tab with spaces.
                // Calc the number of spaces to fill up to tab stop.
                SpaceCount = TabWidth - (iTo % TabWidth);
                // Leave room for the null char.
                for (; SpaceCount && iTo < 79; --SpaceCount, ++iTo)
                     temp[iTo] = ' ';
           }
           else
           {
                // Not a tab.
                // Store as is.
                temp[iTo] = text[iFrom];
                ++iTo;
           }
     }

     // Store null char to mark end.
     temp[iTo] = '\0';

     // Copy temp to text.
     iTo = 0;
     while ((text[iTo] = temp[iTo]) != '\0')
           ++iTo;

     // Store null char to mark end.
     text[iTo] = '\0';

     // Return something.
     return 0;
}
 
Output.


Sunday 23 January 2011

The C Programming Language (K&R) 01x16—Char Arrays & Functions—Section 1.10


Section 1.10 of “The C Programming Language” by Brian Kernighan and Dennis Ritchie aka K&R looks at the scope of variables in C. There is a difference between a variable declared inside a function and one declared outside. The first is confined to the function in which it is declared: it exists only while the function is being executed and dies when the program leaves the function. Static variables are discussed elsewhere and provide an exception. The second is global in scope and is created when the program begins. It can be accessed by any function.

The sample code from 01x13 is modified to use global variables.

The variables are declared as they were in the functions but are placed at the top of the source file (i.e., outside of any function).

The functions contain the same variable “declarations” except the keyword extern is added in front.

I haven’t seen extern used a great deal, for a couple of reasons. First, as long as the program is running, these global variables take up memory even if they won’t be used again. A key objective should be to use as little RAM as possible. Second, global variables can be changed in ways that make debugging difficult, which can lead to unexpected results. I avoid using them, but they do serve a purpose at times.

Sample Code.

I am using Visual C++ 2010 and created the sample code as a console application.

// Function prototype.
int getline();
void copy();

// The standard library includes the system function.
#include <cstdlib>

// Standard I/O library.
#include <cstdio>

#define MAXLINE 1000 /* maximum input line length */

int max; /* maximum length seen so far */
char line[MAXLINE]; /* current input line */
char longest[MAXLINE]; /* longest line saved here */


int main()
{
     int len; /* current line length */
     extern int max;
     extern char longest[];

     max = 0;

     while ((len = getline()) > 0)
           if (len > max) {
                max = len;
                copy();
           }

     if (max > 0) /* there was a line */
           printf("%s", longest);

     // Keep console window open.
     system("pause");

     // Return some value.
     return 0;

} // end main

/* getline: read a line into s, return length */
int getline()
{
     int c, i;
     extern char line[];

     for (i=0; i < MAXLINE-1 && (c=getchar())!=EOF && c!='\n'; ++i)
           line[i] = c;

     if (c == '\n') {
           line[i] = c;
           ++i;
     }

     line[i] = '\0';

     return i;
}

/* copy: copy 'from' into 'to'; assume to is big enough */
void copy()
{
     int i;
     i = 0;
     extern char line[], longest[];

     while ((longest[i] = line[i]) != '\0')
           ++i;
}


Output.

The longest line is…


The C Programming Language (K&R) 01x15—Char Arrays & Functions—Exercise 1-19


There’s one last bit of section 1.9 of “The C Programming Language” by Brian Kernighan and Dennis Ritchie aka K&R:

Exercise 1-19. Write a function reverse(s) that reverses the character string s. Use it to write a program that reverses its input a line at a time.

s[] is a character array holding the user’s input. The function will take element 0 and put it in element n, put element n in element 0, and keep going until the string is reversed. There are many ways this can be done, but the objective is clear: s[] has to contain the characters in reverse order, and the array is returned to the caller with the changed data.

If you work simply with the array variable, you can’t copy element n to m without losing m. You need some type of temporary variable.

Approach 1: Use the copy function to create a duplicate of s[]. Then loop through the duplicate backwards and store the data in s[]. Requires one loop counter, but double the memory for the user data.

Approach 2: Start at s[n] and store that data in a temp variable. Move s[0] to s[n] and temp to s[0]. Continue inward through s[] until you’re at the midpoint.

I’m not certain which approach is faster, but the second clearly uses less memory and even with 4GB or more of RAM, minimizing memory is a good objective.

*   *   *

The highlights show the changes to the code from here.
 

Sample Code.

I am using Visual C++ 2010 and created the sample code as a console application.

// Function prototype.
int getline(char line[], int maxline);
void copy(char to[], char from[]);
void reverse(char s[]);

// The standard library includes the system function.
#include <cstdlib>

// Standard I/O library.
#include <cstdio>

#define MAXLINE 1000 /* maximum input line length */

int main()
{
     int len; /* current line length */
     int max; /* maximum length seen so far */

     char line[MAXLINE]; /* current input line */
     char longest[MAXLINE]; /* longest line saved here */
     max = 0;

     while ((len = getline(line, MAXLINE)) > 0)
           if (len > max) {
                max = len;
                copy(longest, line);
           }

     if (max > 0) /* there was a line */
     {
           reverse(longest);
           printf("%s", longest);
     }

     // Keep console window open.
     system("pause");

     // Return some value.
     return 0;

} // end main

/* getline: read a line into s, return length */
int getline(char s[],int lim)
{
     int c, i;

     for (i=0; i < lim-1 && (c=getchar())!=EOF && c!='\n'; ++i)
           s[i] = c;

     if (c == '\n') {
           s[i] = c;
           ++i;
     }

     s[i] = '\0';

     return i;
}

/* copy: copy 'from' into 'to'; assume to is big enough */
void copy(char to[], char from[])
{
     int i;
     i = 0;

     while ((to[i] = from[i]) != '\0')
           ++i;
}

// Reverse order of char array.
void reverse(char s[])
{
     // S is a character array.
     // This function will reverse the
     // order of the content.
     // 1 to n, n to 1.

     // Temp hold.
     int c = 0;
     // Counters.
     int iFrom = 0;
     int iTo = 0;

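     // Find length of the text, stopping at the newline
     // or, if there is none, at the null char.
     while (s[iTo] != '\n' && s[iTo] != '\0')
           ++iTo;

     // Calc midpoint.
     int Midpoint = iTo / 2;

     // Char array holds:
     // [user data], [\n], [\0]
     // Step back to the last char of the user data.
     --iTo;

     // Loop through array.
     for (; iFrom < Midpoint; ++iFrom, --iTo)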
     {
           // Temp holder.
           c = s[iTo];
           // Front to back.
           s[iTo] = s[iFrom];
           // Back to front.
           s[iFrom] = c;
     }
} // end reverse


Output.

The longest line, printed in reverse order, is…


Saturday 22 January 2011

The C Programming Language (K&R) 01x14—Char Arrays & Functions—Exercises 1-16-18


Still on Section 1.9 of “The C Programming Language” by Brian Kernighan and Dennis Ritchie aka K&R and their discussion of character arrays and how to use them as an argument for a function. This entry continues from here. That entry deals with the sample code in the book on finding the longest line of text entered by a user on the keyboard. This entry deals with their exercises 1-16 to 1-18.
 
Exercise 1-16. Revise the main routine of the longest-line program so it will correctly print the length of arbitrarily long input lines, and as much as possible of the text.

It’s not clear to me what this exercise is about. I suspect it has to do with how console inputting works on their system. Does input stop when it hits the far right of the screen? Not sure. If the problem arises because only the first N characters are printed per line, then see below.

Exercise 1-17. Write a program to print all input lines that are longer than 80 characters.

Because of how the getchar() function is implemented in my compiler, this issue doesn’t show up. I can type lines far longer than 80 characters without it creating any problems; therefore, I can’t write code to fix a problem that doesn’t exist. I could simulate the problem, but won’t.

If you want to print only 80 characters per line, you would use a loop that prints when the modulus of the character counter and 80 is zero, except that when you hit the null character you print what’s left.

Exercise 1-18. Write a program to remove trailing blanks and tabs from each line of input, and to delete entirely blank lines.

This exercise will be the focus of my sample code, but I have some questions. It seems the trailing blanks run from the last word to the end of the line, but what about the tabs? All tabs or trailing tabs? It’s not clear to me. Let’s assume it’s only trailing tabs and spaces that should be removed. There’s one problem with this: you won’t see any difference on the screen. Nothing, a space, and a tab all look the same. So, I’ll replace spaces with B and tabs with T. That way you can see.

As for blank lines, they exist when the element at index zero is the newline character, or ASCII 10. They are easy enough to find.

Pseudocode.

While Input is not EOF.
     Get input until line is filled.

     If first character is newline, ignore.
     If first character is NOT newline, replace trailing spaces with B and trailing tabs with T and output line.

Time to write the code.

*    *    *

I had more difficulty coding this exercise than I expected. Part of the delay was learning certain aspects of the language, or making sure I had something correct, but mostly I followed the previous code instead of my pseudocode. Lesson learnt.
 

Sample Code.

I am using Visual C++ 2010 and created the sample code as a console application.

// Function prototype.
int PrintLine(char line[], int maxline);

// The standard library includes the system function.
#include <cstdlib>

// Standard I/O library.
#include <cstdio>

#define MAXLINE 1000 /* maximum input line length */

int main()
{
     int c;
     int i = 0;
     char LineIN[MAXLINE]; /* current input line */

     // Main loop
     while ((c=getchar()) != EOF)
     {
           // Get line of text from user.
           if (i < MAXLINE-1 && c != EOF && c!='\n') {
                // Add char to the line.
                LineIN[i] = c;
                ++i;
           }
           else {
                // Print the line.

                // Newline was not stored above.
                if (c == '\n') {
                     LineIN[i] = c;
                     ++i;
                }

                // End of char array.
                LineIN[i] = '\0';

                // Print line.
                PrintLine(LineIN, MAXLINE);

                // Reset counter.
                i = 0;
           } // end if
     } // end while

     // Keep console window open.
     system("pause");

     // Return some value.
     return 0;

} // end main

int PrintLine(char line[], int maxline)
{
     int i, len;

     // Don't print blank lines.
     if (line[0] == '\n')
     {
           // Show we don't print blank lines.
           printf("Blank line.\n");
           // Return.
           return 0;
     }

     // Find len of the char array.
     // Stop at the newline char, or at '\0' if the line was cut short.
     for (len = 0; line[len] != '\n' && line[len] != '\0'; ++len)
           ;

     // Look for trailing tabs or spaces.
     while (len)
     {
           --len;
           if (line[len] == ' ')
                line[len] = 'B';
           else if (line[len] == '\t')
                line[len] = 'T';
           else
                // No more trailing items.
                // End checking.
                len = 0;
     }
    
     // Output line.
     printf("%s", line);

     // Return something.
     return 0;
} // End PrintLine.


Output.



Friday 21 January 2011

The C Programming Language (K&R) 01x13—Char Arrays & Functions—Section 1.9


Section 1.9 of “The C Programming Language” by Brian Kernighan and Dennis Ritchie aka K&R discusses character arrays and how to use them as an argument for a function. The general rule is that a function receives its argument as a value and not a reference to a variable. That means the called function can manipulate the argument without affecting the variable in the calling function. That’s not the case with a character array. The function can change the array even though the variable is declared in the calling function. Note: there are exceptions for variables declared outside of a function (i.e., global scope) and where pointers are used.

The sample code in section 1.9 inputs characters from the keyboard one line at a time and finds the longest line from all entered.

Two lines of code of interest.

for (i=0; i < lim-1 && (c=getchar())!=EOF && c!='\n'; ++i)

The loop will continue provided three conditions are true:

1. The current character count (i) is less than the limit less 1,
2. The current character is not EOF, AND
3. The current character is not a newline.

The first condition is required because of the maximum number of elements the array can hold. Without it, you could exceed the limit and write data to memory that is being used elsewhere. This leads to corrupted data and crashed programs.

while ((to[i] = from[i]) != '\0')

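C allows code that you wouldn’t expect to work, but it does. A single equal sign is the assignment operator, not the equality test. The assignment copies one character, and the value of the assignment, the character just copied, is then compared with the null character. The condition is true until it reaches the end of the string, as marked with the null character.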

Sample Code.

I am using Visual C++ 2010 and created the sample code as a console application.

// Function prototype.
int getline(char line[], int maxline);
void copy(char to[], char from[]);

// The standard library includes the system function.
#include <cstdlib>

// Standard I/O library.
#include <cstdio>

#define MAXLINE 1000 /* maximum input line length */

int main()
{
     int len; /* current line length */
     int max; /* maximum length seen so far */

     char line[MAXLINE]; /* current input line */
     char longest[MAXLINE]; /* longest line saved here */
     max = 0;

     while ((len = getline(line, MAXLINE)) > 0)
           if (len > max) {
                max = len;
                copy(longest, line);
           }

     if (max > 0) /* there was a line */
           printf("%s", longest);

     // Keep console window open.
     system("pause");

     // Return some value.
     return 0;

} // end main

/* getline: read a line into s, return length */
int getline(char s[],int lim)
{
     int c, i;

     for (i=0; i < lim-1 && (c=getchar())!=EOF && c!='\n'; ++i)
           s[i] = c;

     if (c == '\n') {
           s[i] = c;
           ++i;
     }

     s[i] = '\0';

     return i;
}

/* copy: copy 'from' into 'to'; assume to is big enough */
void copy(char to[], char from[])
{
     int i;
     i = 0;

     while ((to[i] = from[i]) != '\0')
           ++i;
}


Output.
  
The longest line is…