This is the mail archive of the docbook@lists.oasis-open.org mailing list for the DocBook project.



Re: [docbook] Braille


I have been doing some research into Braille and found many resources on the 
Internet that have been most helpful in giving me some insight into the scale 
of the problem and technical challenges posed.

While doing my research I encountered this gem, XML TO BRAILLE, from Computers 
to Help People, Inc. (CHPI) [http://www.chpi.org/]. It is free and open 
source; a tarball is available from http://www.chpi.org/xml2brl-0.2.tar.gz. 
Currently xml2brl only works on Linux, but press releases indicate that they 
are working on a Windows port. Simple tests on my box have given great 
results. I have yet to get the time to perform tests on complex documents. 
What I like is that xml2brl is easy to use and optimized for technical 
literature.

For your convenience I attached the README file for xml2brl. I hope 
attachments can go to this list. If they do not, either download the tarball 
or send me mail and I will send them to you off list.

-- 
Sean Wheller
Technical Author
sean@inwords.co.za
http://www.inwords.co.za
Registered Linux User #375355

Attachment: README
Description: Text document

Title: The xml2brl Program

THE xml2brl PROGRAM

This is Release 0.2 of the xml2brl program. Changes from the previous release are listed in the file ChangeLog. This README file details any changes in usage. The most notable is that the program now handles some MathML.

If you are reading the plain-text README file, you may find it useful to load README.html into your browser. This will enable you to go directly to the sites where you can download the libraries upon which this software depends. Once you have installed xml2brl, you can also get a braille copy by running README.html through the program. It is written in xhtml, which is an xml flavor.

The braille translation part of the xml2brl program is based on BRLTTY. All the necessary BRLTTY files for use in the U.S. have been included in the package. However, if you need different contraction tables or different text tables, you must obtain them from BRLTTY. You can download the latest version of BRLTTY from http://dave.mielke.cc/brltty.

Besides BRLTTY, this software depends on the following libraries:
glib ftp://ftp.gtk.org/pub/gtk/v2.4
libxml2 http://www.xmlsoft.org
gdome2 http://gdome2.cs.unibo.it

You must download the latest versions of these libraries and install them in the above sequence.

The program accepts input files in either xml or plain text and in many natural languages (which may be in UTF-8 Unicode) and produces a brf file suitable for printing directly on an embosser. The brf file has the same format as the files on Web-Braille and should behave exactly the same.

xml files must be well-formed. They are transcribed as specified by a semantic-actions (.sem) file. If no such file exists for a given root element, a prototype file is created. Its name is formed by adding ".sem" to the name of the root element, for example, "dtbook3.sem". The user must then edit this file to obtain proper transcription. The program will print a warning message if the editing step is omitted. Instructions on how to do this editing are given in the section "SEMANTIC-ACTIONS FILES" in this document.
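
The naming rule above can be sketched in a few lines of Python. This is only an illustration, not code from xml2brl: the function name sem_file_name is hypothetical, and dropping the namespace part of the root tag is an assumption, since this README does not say how namespaces are treated.

```python
import xml.etree.ElementTree as ET


def sem_file_name(xml_text: str) -> str:
    """Return the semantic-actions file name implied by the root element.

    Raises xml.etree.ElementTree.ParseError if the text is not well-formed.
    """
    root = ET.fromstring(xml_text)
    # ElementTree reports namespaced tags as "{uri}local". Keeping only the
    # local part is an assumption made for this sketch.
    local_name = root.tag.rsplit("}", 1)[-1]
    return local_name + ".sem"


print(sem_file_name("<dtbook3><book/></dtbook3>"))  # prints dtbook3.sem
```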

The program tests whether a file is xml. If not, it assumes a plain text file. In this file, lines may be of any length. Paragraphs should be separated by blank lines. Lines within paragraphs are concatenated before translation, with blanks in place of newlines. If a blank line is desired in the output, use three blank lines.
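
As a rough illustration of the plain-text rules just described, the following Python sketch joins the lines of each paragraph with blanks and emits an empty string where a blank line was requested. It is not taken from xml2brl. The function name is hypothetical, and treating "three or more" consecutive blank lines as a request for a blank output line is an assumption (the README only says "three").

```python
def plain_text_paragraphs(text: str) -> list:
    """Split plain text into paragraphs; "" marks a requested blank line."""
    result = []
    current = []  # lines of the paragraph being collected
    blanks = 0    # consecutive blank lines seen since the last text line
    for line in text.splitlines():
        if line.strip():
            # Three or more blank lines before this text ask for a blank
            # line in the output (assumption: "or more" is allowed).
            if blanks >= 3 and result:
                result.append("")
            blanks = 0
            current.append(line.strip())
        else:
            if current:
                # A blank line ends the paragraph; join its lines with blanks.
                result.append(" ".join(current))
                current = []
            blanks += 1
    if current:
        result.append(" ".join(current))
    return result


sample = "First line\nsecond line\n\nNext paragraph\n\n\n\nAfter a gap\n"
print(plain_text_paragraphs(sample))
# prints ['First line second line', 'Next paragraph', '', 'After a gap']
```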

Whether the file is xml or plain text, paragraphs are indented two spaces. There is a braille page number in the lower right-hand corner of each page. If an xml file contains print page numbers, and this has been specified in the semantic-actions file, a page-separator line is placed between print pages, and the print page number appears in the upper right-hand corner, preceded by the letters a, b, etc.

The command line is:
xml2brl inputfile outputfile
If you omit both inputfile and outputfile the program acts as a filter, taking input from stdin and delivering output to stdout. This enables xml2brl to be used in a chain of printer drivers, with output directly to an embosser, if desired. If you wish to specify an output file but take input from stdin, use a minus sign in place of inputfile. Options are set in a configuration file discussed in the section "CONFIGURATION FILE".
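
For anyone driving the program from a script, the three invocation forms described above can be captured in a small helper. This is a sketch under stated assumptions: the helper name xml2brl_argv is hypothetical, and because the README does not describe giving an input file without an output file, the sketch refuses that combination rather than guessing.

```python
def xml2brl_argv(inputfile=None, outputfile=None):
    """Build the argument list for the invocation forms the README describes."""
    if inputfile is None and outputfile is None:
        # Filter mode: input from stdin, output to stdout.
        return ["xml2brl"]
    if outputfile is None:
        # Not described in the README, so this sketch does not guess.
        raise ValueError("an output file is required when an input file is given")
    # A minus sign stands in for inputfile when input comes from stdin.
    return ["xml2brl", inputfile if inputfile is not None else "-", outputfile]


print(xml2brl_argv("book.xml", "book.brf"))  # prints ['xml2brl', 'book.xml', 'book.brf']
print(xml2brl_argv(outputfile="book.brf"))   # prints ['xml2brl', '-', 'book.brf']
print(xml2brl_argv())                        # prints ['xml2brl']
```

The resulting list can be handed to Python's subprocess.run; in filter mode the document would be supplied through its input argument. The file names book.xml and book.brf are placeholders.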

The author wishes to acknowledge his debt to the BRLTTY team. To learn more about BRLTTY go to its official website, http://dave.mielke.cc/brltty. The section "FILES" below tells which files have been copied from BRLTTY.

Like BRLTTY, this software is under the GNU General Public License (GPL). The non-BRLTTY portions are copyright by the author, John J. Boyer, director@chpi.org. The libraries listed previously are all part of the GNOME project and are under the GNU Lesser General Public License (LGPL). Details are given in the file COPYING.

INSTALLATION

This is an alpha release. Therefore, it is best to install it in a subdirectory of your home directory. To do this, download the distribution tarball into your home directory, then type
tar xfz xml2brl-xxx.tar.gz
This will create the directory xml2brl-xxx, where xxx is a version number. After installing any necessary libraries, go to the xml2brl-xxx directory and type "make". This will create the xml2brl program. If you wish to re-create the program, first type "make clean" and then "make".

Before you try to run the program, execute the following statement at the command prompt:
export LD_LIBRARY_PATH='/usr/local/lib'
You may wish to add this command to your .bashrc script.

SEMANTIC-ACTIONS FILES

These files tell xml2brl how to handle your documents. Whenever the program encounters a new root element, it creates a prototype semantic actions file. Each line in this file has two columns. The first column is the word "no", signifying that no semantic action has been specified. The second column may contain one of the following: an element name; an element name, followed by a comma, followed by an attribute name; an element name, followed by a comma, followed by an attribute name, followed by a comma, followed by the first few characters of an attribute value. The program prints a message saying it is creating this file, then terminates. Semantic files have names composed of the root element name and '.sem'.

To get xml2brl to transcribe your document correctly, you must edit the semantic-actions file, replacing the word "no" in the first column with an appropriate semantic action, such as "para" for paragraph, "heading1" for the main heading, etc. The file sem_enum.h contains a list of valid semantic actions, most of which should be self-explanatory. If you rerun the program without editing the semantic-actions file, it prints a message saying that the output will be unformatted. You can add comments to the file by using a number sign (#) as the first non-blank character in a line.
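
To make the file layout concrete, here is a minimal Python sketch that reads a semantic-actions file as described in this section. It is not the parser used by xml2brl. It assumes the two columns are separated by whitespace, which the README does not state, and the element and attribute names in the sample (h1, p, span, div, class, note) are made up for illustration; only the actions "no", "para" and "heading1" come from the text above.

```python
from typing import NamedTuple, Optional


class SemEntry(NamedTuple):
    action: str                   # "no", or a semantic action such as "para"
    element: str
    attribute: Optional[str]
    value_prefix: Optional[str]   # first few characters of an attribute value


def parse_sem(text: str) -> list:
    """Parse semantic-actions text into entries, skipping comments and blanks."""
    entries = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        fields = stripped.split(None, 1)  # assumption: whitespace-separated columns
        if len(fields) != 2:
            raise ValueError("expected two columns: %r" % line)
        action, selector = fields
        # element[,attribute[,value-prefix]]
        parts = selector.strip().split(",", 2)
        parts += [None] * (3 - len(parts))
        entries.append(SemEntry(action, *parts))
    return entries


sample = """\
# edited by hand
heading1 h1
para p
no span,class
no div,class,note
"""
entries = parse_sem(sample)
unedited = [e for e in entries if e.action == "no"]
print(len(entries), len(unedited))  # prints 4 2
```

Listing the entries still marked "no" is a quick way to see which parts of a prototype file remain to be edited.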

If you transcribe a new document with the same root element, but with additional element names, attribute names or values, these will be added to the end of the semantic-actions file, preceded by the comment "#appended entries". You may then edit the new entries. If you wish the program to continue to take no action for an entry, leave it unchanged. Do not comment it out, because the program will then no longer recognize the entry and will add it to the end of the file again as a new entry.
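
The appending behaviour, and the reason commenting out an entry backfires, can be modelled with a short Python sketch. This is a guess at the logic based only on the description above, not the actual implementation. The function name and the sample element names are hypothetical, and whitespace-separated columns are assumed.

```python
def append_new_entries(sem_text: str, selectors: list) -> str:
    """Append 'no' entries for selectors the file does not list yet."""
    known = set()
    for line in sem_text.splitlines():
        stripped = line.strip()
        # Comment lines are ignored, so a commented-out entry is not "known".
        if stripped and not stripped.startswith("#"):
            fields = stripped.split(None, 1)
            if len(fields) == 2:
                known.add(fields[1].strip())
    # Keep first-seen order and drop duplicates.
    missing = [s for s in dict.fromkeys(selectors) if s not in known]
    if not missing:
        return sem_text
    if sem_text and not sem_text.endswith("\n"):
        sem_text += "\n"
    return sem_text + "#appended entries\n" + "".join("no %s\n" % s for s in missing)


existing = "para p\n#no span,class\n"
print(append_new_entries(existing, ["p", "span,class", "table"]), end="")
```

In this sample the commented-out "span,class" entry is appended again alongside the genuinely new "table" entry, which is the effect the paragraph above warns about.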

Several semantic-actions files are provided with the program. There is one each for dtbook3 files, such as those produced by Bookshare.org; xhtml files, with or without included MathML; Microsoft Word files exported as xml; and docbook files.

CONFIGURATION FILE

As mentioned previously, options for xml2brl are set by a configuration file. This file is called "xml2brl.cfg" and resembles the semantic-actions files. Each line has two columns, a keyword, such as CellsPerLine, and a value such as 40. Comments are preceded by "#". The keywords should be self-explanatory.
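
A minimal Python sketch of reading such a file follows. It is an illustration only: the function name is hypothetical, the columns are assumed to be separated by whitespace, and only whole-line comments are handled, since the README does not say whether a "#" may follow a value on the same line. CellsPerLine is the only keyword taken from the text above.

```python
def parse_config(text: str) -> dict:
    """Read keyword/value lines into a dict, skipping comments and blanks."""
    options = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        fields = stripped.split(None, 1)  # assumption: whitespace-separated columns
        if len(fields) != 2:
            raise ValueError("expected a keyword and a value: %r" % line)
        keyword, value = fields
        options[keyword] = value.strip()
    return options


sample = "# width of the embossed line\nCellsPerLine 40\n"
print(parse_config(sample))  # prints {'CellsPerLine': '40'}
```

Values are kept as strings here; converting them to numbers would depend on which keywords the real file supports.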

FILES

The following files have been copied without change from BRLTTY: brldefs.h brl.h countries.cti ctb_compile.c ctb_definitions.h ctb.h ctb_translate.c en-us-g2.ctb misc.h tbl.c tbl.h text.nabcc.tbl. Note the following exceptions: The line "include countries.cti" in en-us-g2.ctb has been changed to "include specsym.cti". The misc.c file was cut down to only the functions needed by xml2brl and these functions were considerably modified.

The following files were produced by the author:

brffilt.c: A small filter for viewing brf files on a braille display with translation mode in BRLTTY turned off. It can also be used as a prototype for writing other filters. To compile it, use the command line "gcc -o brffilt -O2 -Wall brffilt.c"

ChangeLog: log of changes made from release to release

COPYING: Detailed license

dtbook3.sem: Semantic-actions file for books from Bookshare.org

en-us-mathtext.ctb: Translation table for math documents

examine_document.c: Traverses the DOM tree to determine characteristics of the document, such as whether it contains math. Also does preprocessing.

html.sem: Semantic-actions file for xhtml documents

Makefile: For compiling the whole program.

readconfig.c: Reads and processes the configuration file

readconfig.h: Header file for above

README: plain-text version of the following

README.html: This file.

read_TextTable.c: Basically a wrapper for the functions in tbl.c

semantics.c: Contains functions for handling semantic-actions files and tables

semantics.h: Header file for semantics.c; includes sem_enum.h

sem_enum.h: list of valid semantic actions. Note that if you change this file you must recompile the entire program.

sem_rout.c: Contains non-trivial semantic routines or routines that may vary with natural language

sem_rout.h: Header file for above

specsym.cti: Special symbols needed in translation of xml files

transcribe_chemistry.c: Handles chemical formulas in DOM tree

transcribe_document.c: This is the basic transcription routine which traverses the DOM tree and calls transcribe_paragraph, transcribe_math, etc., as needed.

transcribe_graphic.c: Handles SVG graphics in the DOM tree

transcribe_math.c: Handles MathML and other xml math notations

transcribe_music.c: transcribes music notation expressed in xml

transcribe_paragraph.c: Handles "paragraphs", including headings, in the DOM tree

transcriber.c: Contains the low-level transcription routines, including the routine for transcribing plain text.

transcriber.h: Header file for above

w_wordDocument.sem: Semantic-actions file for Microsoft Word documents exported as xml

xml2brl.c: The main program.

xml2brl.cfg: Configuration file

xml2brl.h: Header file for main program

