Now available at
I hope this utility increases access to information stored in CHM archives.
It is explained below.
August 16, 2007
Copyright 2007 by Jamal Mazrui
Modified GPL License
Running on Windows 98 and above, CHM2TXT (chm2txt.exe) is a command line
utility that converts a file from Compiled HTML format (.chm) to structured
text (.txt). Combining multiple HTML and graphics files, the CHM format is
commonly used for software documentation, e.g., what is displayed by
pressing F1. The usual help viewing program, however, can be challenging to
search globally or to read continuously. A single, structured text file
provides an alternative in such cases. CHM2TXT is a free, open source
program that seeks to fill an observed need of many users. One current
limitation is that topics are ordered alphabetically, rather than according
to the outline view of the CHM file.
The command line syntax of CHM2TXT is as follows:
chm2txt "SourceFile.chm" "TargetFile.txt"
A file name should include a leading path, either absolute or relative, if
the file is not located in the current directory. Quotes around a file name
may be omitted if it does not include a space character. If the target is
omitted, the output file is named like the source except for the extension.
Status messages are displayed on the console (via standard output) during
the conversion process.
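As an illustration of the syntax above, here is a minimal Python sketch of
how another program might invoke CHM2TXT. The function names
(default_target, build_command, convert) are hypothetical, the sketch
assumes chm2txt.exe can be found on the PATH, and it assumes that an omitted
target ends up in the same directory as the source. None of this is part of
CHM2TXT itself.

```python
import subprocess
from pathlib import PureWindowsPath
from typing import List, Optional


def default_target(source: str) -> str:
    """Guess the output name used when the target is omitted.

    Assumption: the source path is kept and only the extension becomes .txt.
    """
    return str(PureWindowsPath(source).with_suffix(".txt"))


def build_command(source: str, target: Optional[str] = None,
                  exe: str = "chm2txt") -> List[str]:
    """Build the argument list: chm2txt SourceFile.chm [TargetFile.txt]."""
    args = [exe, source]
    if target is not None:
        args.append(target)
    return args


def convert(source: str, target: Optional[str] = None,
            exe: str = "chm2txt") -> int:
    """Run one conversion and return the process exit code.

    Status messages go straight to the console. The meaning of the exit
    code is not documented here, so treat it as informational only.
    """
    return subprocess.run(build_command(source, target, exe)).returncode
```

Passing the arguments as a list lets the subprocess module add quotes around
names that contain spaces, so the caller does not have to.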
The chm2txt.exe executable may be copied to and run from any directory. The
program creates a workspace in a subdirectory of the user's temporary
directory. Batch files or other applications may invoke CHM2TXT in order to
convert multiple files with a single command, or to provide a graphical user
interface for specifying source and target files. For example, such
capabilities are included in the EdSharp editor available at
The text file produced by CHM2TXT observes a few conventions that facilitate
navigation in editors that implement the "Homer editor interface." Besides
EdSharp, TextPal is another such application, available at
A structured text document is divided into sections separated by a character
sequence consisting of a hard page break (form feed) and a line break, that
is, ASCII codes 12, 13, and 10. The first section is the table of contents,
and the remaining
sections are the body. Each topic name in the contents is also a section
heading in the body.
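The layout just described can be read back with a few lines of Python. This
is only a sketch under stated assumptions: the names (SEPARATOR,
read_structured_text, parse_document, section_heading) are hypothetical, the
contents section is assumed to list one topic per non-blank line, the heading
of a body section is assumed to be its first non-blank line, and the latin-1
encoding is a guess chosen because it never fails to decode.

```python
from typing import List, Tuple

# Form feed (ASCII 12), carriage return (13), line feed (10).
SEPARATOR = "\x0c\r\n"


def read_structured_text(path: str, encoding: str = "latin-1") -> str:
    """Read a file without newline translation so that CR LF is preserved."""
    with open(path, "r", encoding=encoding, newline="") as handle:
        return handle.read()


def parse_document(text: str) -> Tuple[List[str], List[str]]:
    """Split a document into (topic names from the contents, body sections)."""
    sections = text.split(SEPARATOR)
    contents = [line.strip() for line in sections[0].splitlines() if line.strip()]
    return contents, sections[1:]


def section_heading(section: str) -> str:
    """Return the first non-blank line of a section, or '' if there is none."""
    for line in section.splitlines():
        if line.strip():
            return line.strip()
    return ""
```

Opening the file with newline="" matters: with Python's default newline
handling, each CR LF pair would be read as a single LF and the separator
would no longer match.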
Relevant Homer keys for navigation are as follows.
Press Control+PageDown to go to the next section, or Control+PageUp for the
previous section.
Press F6 to go from a topic in the contents to its corresponding section in
the body. Press Shift+F6 to reverse that, going from a section in the body
to its topic in the contents.
Press Control+F6 to search for a section based on text in its topic name.
Press Alt+F6 to search for the next match.
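For readers who want to script similar lookups outside an editor, the
following Python sketch finds body sections by text in their topic names. It
is not how EdSharp or TextPal implement these keys. The names (headings,
find_matches) are hypothetical, the heading of a section is assumed to be its
first non-blank line, and the case-insensitive substring match is an
assumption as well.

```python
from typing import List

# Section separator: form feed, carriage return, line feed (ASCII 12, 13, 10).
SEPARATOR = "\x0c\r\n"


def headings(text: str) -> List[str]:
    """Return the heading of each body section (everything after the contents)."""
    result = []
    for section in text.split(SEPARATOR)[1:]:
        lines = [line.strip() for line in section.splitlines() if line.strip()]
        result.append(lines[0] if lines else "")
    return result


def find_matches(text: str, query: str) -> List[int]:
    """Return indexes of body sections whose heading contains the query.

    The first index corresponds to a first search and each later index to a
    "next match" request.
    """
    wanted = query.lower()
    return [index for index, heading in enumerate(headings(text))
            if wanted in heading.lower()]
```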
A structured text document may also be converted to an equivalent HTML
version, with a table of contents linked to section headings. Press
Control+H to convert the current document to HTML format. Press Control+S
to save it to disk. Press F5 to launch it in the default web browser.
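To make the idea of an equivalent HTML version concrete, here is a rough
Python sketch that builds a page with a table of contents linked to section
headings. It is an illustration only, not the conversion the editors perform.
The function name to_html, the anchor names (section1, section2, ...), and
the choice to wrap section text in pre elements are all assumptions, as is
taking each heading from the first non-blank line of its section.

```python
import html

# Section separator: form feed, carriage return, line feed (ASCII 12, 13, 10).
SEPARATOR = "\x0c\r\n"


def to_html(text: str, title: str = "Document") -> str:
    """Convert structured text to HTML with a linked table of contents."""
    toc_items = []
    body_parts = []
    # Skip the first section (the plain-text contents); it is rebuilt as links.
    for number, section in enumerate(text.split(SEPARATOR)[1:], start=1):
        lines = [line for line in section.splitlines() if line.strip()]
        if not lines:
            continue
        heading = html.escape(lines[0].strip())
        anchor = f"section{number}"
        toc_items.append(f'<li><a href="#{anchor}">{heading}</a></li>')
        body_parts.append(f'<h2 id="{anchor}">{heading}</h2>')
        body_parts.append("<pre>" + html.escape("\n".join(lines[1:])) + "</pre>")
    page = ["<!DOCTYPE html>", "<html>",
            f"<head><title>{html.escape(title)}</title></head>",
            "<body>", "<ul>"]
    page += toc_items + ["</ul>"] + body_parts + ["</body>", "</html>"]
    return "\n".join(page)
```

The result can be written to a file and opened in a web browser, for example
with the standard webbrowser module.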
I developed CHM2TXT with the Perl Developer Kit 7.0 from
It incorporates Perl 5.8, as well as the libraries Text::CHM,
HTML::Stripper, and File::OldSlurp from the Comprehensive Perl Archive
Network (CPAN).
The distribution archive, chm2txt.zip, contains Perl source code
(chm2txt.pl) and the batch file to compile it (compile.bat). The code is
covered by a modified version of the GNU General Public License (GPL), which
is explained at
Essentially, software that uses the code must be open source, except that I
am willing to relax GPL conditions in a particular case if persuaded that a
greater good would result.
I welcome feedback, which helps CHM2TXT improve over time. When reporting a
problem, the more specifics the better, including steps to reproduce it, if
possible. If you happen to be a programmer, please consider contributing
code that fixes a problem or improves functionality.
The latest version of CHM2TXT is available at the same URL,