Validating ASCII text files

To check whether files are valid ASCII text, I wrote a program in C with options for stricter or looser checking.

The UNIX operating system was designed to use plain ASCII text files as a common container format for many different purposes: such a file can store any data that can be encoded as a finite sequence of ASCII characters. That is almost, but not quite, any data, since ASCII is an American-oriented character encoding, a consequence of UNIX having been developed at Bell Labs in the USA. Despite the great importance of this file format, there seems to be no single definition of what a valid plain ASCII text file is, and there seems to be no standard utility program that will validate that a file satisfies a precise definition of one. Because of this curious situation, I was motivated to write a C program to validate plain ASCII text files.

I named it "picky" because it picks out any deviations from a precise definition of a plain ASCII text file. It has several options to handle variants of that ideal definition. For example, tab characters are required in some text files, and so picky has an option to allow tab characters. It also has an option to give a detailed report of any deviations. The standard UNIX "file" command cannot handle these situations and so cannot replace picky or similar validation programs.
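To make the idea of a strict definition with an optional relaxation concrete, here is a minimal sketch in C. It is not taken from picky.c. The function name is_plain_ascii, the allow_tabs parameter, and the rule itself (printable ASCII plus newline, with tab optional) are assumptions chosen only for illustration, and the actual definition and options used by picky may differ.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not taken from picky.c: returns true when every
   byte in buf is a printable ASCII character (0x20..0x7E) or a newline.
   A tab (0x09) is accepted only when allow_tabs is true. */
bool is_plain_ascii(const unsigned char *buf, size_t len, bool allow_tabs)
{
    for (size_t i = 0; i < len; i++) {
        unsigned char c = buf[i];
        if (c >= 0x20 && c <= 0x7E)
            continue;               /* printable ASCII, including space */
        if (c == '\n')
            continue;               /* line terminator */
        if (c == '\t' && allow_tabs)
            continue;               /* tab, only when explicitly allowed */
        return false;               /* other control byte, DEL, or non-ASCII */
    }
    return true;
}
```

Under this assumed rule, a carriage return is rejected, so a file with DOS line endings would fail the check unless a further option were added to allow it.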

Another use is to track down mysterious "invisible" characters that sometimes appear in text files and are difficult to detect because they are nonprinting. They can cause compilers to produce error messages when they appear in a program file, yet the errors go away when you retype the affected lines. My "picky" program can quickly find such bad characters in a file.
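Locating such a character amounts to scanning the bytes while keeping track of the line and column. The following sketch in C shows one way this could be done. It is not the code in picky.c: the struct bad_char type and the find_bad_char function are hypothetical names, and the sketch assumes that tabs and newlines are acceptable while every other byte outside 0x20..0x7E is bad.

```c
#include <stddef.h>

/* Hypothetical result type for this sketch; not part of picky. */
struct bad_char {
    size_t line;          /* 1-based line number */
    size_t column;        /* 1-based byte position within the line */
    unsigned char byte;   /* value of the offending byte */
};

/* Scans buf and records the position of the first byte that is not
   printable ASCII (0x20..0x7E), newline, or tab. Returns 1 if such a
   byte was found and stored in *out, 0 if the buffer is clean. */
int find_bad_char(const unsigned char *buf, size_t len, struct bad_char *out)
{
    size_t line = 1, column = 1;

    for (size_t i = 0; i < len; i++) {
        unsigned char c = buf[i];
        if (c == '\n') {            /* start counting a new line */
            line++;
            column = 1;
            continue;
        }
        if ((c < 0x20 && c != '\t') || c > 0x7E) {
            out->line = line;
            out->column = column;
            out->byte = c;
            return 1;
        }
        column++;
    }
    return 0;
}
```

For example, a byte 0xA0 sitting between "int" and "y" on the second line of a source file would be reported at line 2, column 4, which is exactly the kind of character that looks like a space on screen but confuses a compiler.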

The source code of my program to validate ASCII text (picky.c), the associated picky man page (picky.1), and the plain text version of the picky man page (picky.txt) are available here. Get the picky tarball picky-2.6.tar.gz. A DOS/Windows version is contained in picky.zip.

Last Updated Tue Sep 01 2015
Michael Somos <ms639@georgetown.edu>
Michael Somos "http://somos.crg4.com/"