The Editor allows
the language in which data entry is to be done to be selected dynamically
during data entry. One can, for example, start the data
entry in Tamil and, after entering some text, switch to Gurmukhi and
type in an explanation in Punjabi. The software supports more than eleven
scripts from which a choice can be made. The Roman script used for
English is also directly supported by the Editor. Unlike applications
under Microsoft Windows, which require a switch to a different keyboard
layout, the IITM Editor allows switching scripts using a pull-down menu.
4. Support for a very comprehensive character set
The Editor permits effortless data entry
and display of complex conjunct characters (familiarly known as the Samyuktakshars),
and permits Matras (vowel extensions) and special
symbols to be typed in without any difficulty. These Matras and special
symbols will be very useful for a teacher preparing lessons on the writing
methods for different scripts. The system supports data entry for sixteen
vowels, forty-six consonants (a superset of the consonants from different
languages) and more than eight hundred conjunct characters which can be
formed from the basic set of consonants. The system thus caters to more
than twelve thousand individually recognizable aksharas.
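The text does not say how the figure of twelve thousand is arrived at. One plausible accounting, offered here only as a sketch, is to assume that every consonant or conjunct may carry any one of the sixteen vowels and that the vowels also stand alone. That pairing rule and the function name are assumptions of this example; the three input figures are the ones quoted above.

```python
# Figures quoted in the text above.
VOWELS = 16
CONSONANTS = 46
CONJUNCTS = 800  # "more than eight hundred", so treated as a lower bound


def akshara_lower_bound(vowels, consonants, conjuncts):
    """Count standalone vowels plus every (base, vowel) pairing.

    Assumption (not stated in the source): each consonant or conjunct
    base can combine with any one of the vowels.
    """
    bases = consonants + conjuncts
    return vowels + bases * vowels


print(akshara_lower_bound(VOWELS, CONSONANTS, CONJUNCTS))  # 13552
```

Under this assumption the lower bound works out to 13,552, which is consistent with the claim of more than twelve thousand aksharas.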
Besides these, standard Punctuation
marks and numerals are supported. These can be entered in the local language
input mode directly.
In this respect, the IITM
Editor scores over other approaches based on Unicode or ISCII, where
data entry proceeds from a set of aksharas that, in general, does not
provide many of the symbols required by the writing systems in
vogue. If required, text in Unicode or ISCII may be converted to the
syllable-based representation used by the software, thus permitting fairly comprehensive
text processing.
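The first step of such a conversion, grouping a Unicode string into aksharas, may be sketched as follows. This is only an illustration for Devanagari: the pattern is a simplification chosen for this example (it ignores the nukta, Vedic marks and other scripts), and it does not produce the Editor's own syllable codes, whose format is not described here.

```python
import re

# Simplified Devanagari character classes (an assumption of this sketch).
CONSONANT = "[\u0915-\u0939]"
VIRAMA = "\u094d"
MATRA = "[\u093e-\u094c]"
MODIFIER = "[\u0901-\u0903]"  # candrabindu, anusvara, visarga
VOWEL = "[\u0905-\u0914]"     # independent vowels

# An akshara is either a consonant cluster with an optional matra (or a
# closing virama) and modifiers, or an independent vowel with modifiers.
AKSHARA = re.compile(
    f"(?:{CONSONANT}{VIRAMA})*{CONSONANT}(?:{MATRA}|{VIRAMA})?{MODIFIER}*"
    f"|{VOWEL}{MODIFIER}*"
)


def split_aksharas(text):
    """Return the aksharas found in a Devanagari Unicode string."""
    return AKSHARA.findall(text)


# The word "samskrta" splits into three aksharas: sam, skr, ta.
print(split_aksharas("\u0938\u0902\u0938\u094d\u0915\u0943\u0924"))
```

Each item returned is one syllable-sized unit, which is the level at which a syllable-based representation would assign its codes.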
The set of characters
supported by the Editor also includes special symbols used in Vedic texts
and classical (Carnatic) music. This feature is again very useful for academicians
and teachers who produce printed text illustrating different aspects
of Vedic and musical rendering. The Editor permits Vedic notation conforming
to Yajur Vedic texts as well as Sama Vedic texts.
6. Support for other World Languages
One may use the Editor to prepare
documents in other languages whose writing systems are phonetic.
Sinhalese, Bali, Hebrew, Greek and Japanese Hiragana are supported, along
with a few other scripts. One may also add a new language to the Editor,
provided the language support files and the font files for the language are supplied.
Languages/Scripts supported by the Editor
The Multilingual Editor is
based on the concept of "syllable-based coding", that is, a representation
that corresponds to the sounds of the aksharas of Indian languages. This
way, any language whose writing system follows a phonetic approach may
be accommodated into the set of languages supported by the Editor. It is
this representation which has helped us develop an enhanced version of
the Editor which can speak the text during data entry and thus help visually
handicapped persons use a computer in their own mother tongue. The syllable-based
representation is also very useful for transliteration across languages
which are based on the same set of sounds (phonemes).
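The idea that scripts sharing a set of sounds can be mapped onto one another may be illustrated with Unicode, where the Indic blocks follow a parallel layout inherited from ISCII. The sketch below shifts each character by the distance between two blocks. It is an analogy only: the Editor works on its own syllable codes rather than on Unicode, and the function name and table are assumptions of this example. A letter absent from the target script (Tamil, for instance, has no aspirated consonants) would map to an unassigned code point, which the sketch does not check.

```python
# Start of each script's block in the Unicode Standard.
BLOCK_START = {
    "devanagari": 0x0900,
    "bengali": 0x0980,
    "gurmukhi": 0x0A00,
    "gujarati": 0x0A80,
    "oriya": 0x0B00,
    "tamil": 0x0B80,
    "telugu": 0x0C00,
    "kannada": 0x0C80,
    "malayalam": 0x0D00,
}
BLOCK_SIZE = 0x80
SHARED = "\u0964\u0965"  # danda and double danda are shared by all scripts


def transliterate(text, source, target):
    """Shift every character of the source block into the target block."""
    src, dst = BLOCK_START[source], BLOCK_START[target]
    out = []
    for ch in text:
        offset = ord(ch) - src
        if ch not in SHARED and 0 <= offset < BLOCK_SIZE:
            out.append(chr(dst + offset))
        else:
            out.append(ch)  # spaces, Roman text, punctuation stay as they are
    return "".join(out)


# "kamala" written in Devanagari, rendered in Telugu.
print(transliterate("\u0915\u092e\u0932", "devanagari", "telugu"))
```

Because the mapping is a fixed shift, applying it in the reverse direction restores the original text.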
The Editor supports text preparation in all the official languages of India.
All the official languages are covered by about ten scripts, even though
more than one script may be used for a language.
The scripts are Devanagari (for Hindi and Marathi, besides Sanskrit), Gujarati,
Gurmukhi, Bengali (and Assamese), Oriya, Telugu, Tamil, Kannada and Malayalam.
Urdu is also a national language, though its writing system is quite different.
A special version of the Editor caters to Urdu.
The latest version of the Editor includes support for preparing text in Bharati
Braille, including Nemeth codes, so as to permit direct printing of Braille
documents on standard Braille embossers.
The following are some languages
having writing systems similar to those of the Indian languages, where
the letter or character shape seen in text corresponds directly to a sound.
These languages are also supported in the Editor:
Sinhalese, Bali, Oromo (Ethiopia),
Hebrew, Greek, Japanese Hiragana, Avesta and Arabic.
It is possible to accommodate
almost all the South East Asian languages, which include Thai, Burmese,
Malay, Tibetan and others.
One might view the Editor
as a data entry method for typing in composite letters, i.e., characters
which are built up from multiple shapes. A syllable is precisely that.
Text involving accent marks may also be viewed as consisting of such characters,
so the Editor may be used for their data entry as well. It
will be of help in preparing text in the International Phonetic Alphabet.
Application Areas for IITM Software
In the earlier section it
was mentioned that the IITM Editor not only serves as a simple and useful
word processor but also produces a file which may be processed using the
text processing functions provided by the local language library. The local
language library forms the application programmer's interface for handling
text strings in Indian scripts.
Given below are the typical
application areas where the text prepared by the Editor would find immediate
use.
1. Text processing for linguistic analysis
There are countless possibilities
for linguistic analysis with respect to Indian languages. The text prepared
by the Editor may be analyzed for:
The grammatical structure of sentences.
The frequencies of words and aksharas in ancient texts and manuscripts.
The metres used in poetry, with automatic identification of the "Chandas".
Concordance generation.
General linguistic analysis, e.g., morphological studies, parsing, etc.
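Two of the analyses listed above, word frequencies and concordance generation, may be sketched in a few lines. The functions below are illustrative only and are not part of the local language library, whose interface is not described here. They assume the text is available as an ordinary string with words separated by spaces; the sample sentence in Roman transliteration is likewise just an example.

```python
from collections import Counter


def word_frequencies(text):
    """Count how often each space-separated word occurs."""
    return Counter(text.split())


def concordance(text, keyword, width=2):
    """List every occurrence of keyword with `width` words of context
    on either side (a simple keyword-in-context listing)."""
    words = text.split()
    lines = []
    for i, word in enumerate(words):
        if word == keyword:
            left = words[max(0, i - width):i]
            right = words[i + 1:i + 1 + width]
            lines.append(" ".join(left + [word] + right))
    return lines


sample = "satyam vada dharmam chara satyam eva jayate"
print(word_frequencies(sample).most_common(1))
print(concordance(sample, "satyam", width=1))
```

The same counting would apply to aksharas once the text has been split into syllables instead of words.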
2. Typesetting Documents
The documents prepared
by the Editor may be immediately typeset using TeX to produce high quality
multilingual printouts. The utilities for typesetting are also provided
with the IITM Software Package. The Rich Text Format files generated by the
Editor may also be used in Microsoft Windows based typesetting applications.
3. Generating multilingual Web Pages
The Editor can be
used to prepare web pages supporting multilingual text in all the different
Indian languages. One can prepare the required HTML documents by
typing them in directly with the Editor. The llf2html.exe
utility supplied with the Editor may then be used to produce the final HTML
file. Under Linux a similar utility is available (llf2html). Another useful
utility is the converter which generates files in the PDF format, suitable
for preparing e-texts.
4. Importing Multilingual text into other applications.
The Editor
permits compatible Windows applications to import multilingual text with
great ease. The cut/copy and paste features of the Editor will be of great
help in preparing data for applications such as Email and DTP (using Microsoft
Word or a similar application). The Editor may be used to generate a .rtf
format file which compatible applications may freely import. One can also
cut and paste into a Microsoft Instant Messenger window and send a message
effortlessly.
Scholars will find this feature
very useful for preparing articles in English (using Word) which incorporate
quotations or other references to Multilingual text.
Under Windows, the Editor
will also generate a .rtf file which may be viewed using WordPad or Word,
or with the specially designed RTF viewer for Linux. The .rtf format is fairly
standard and it will thus be possible to exchange documents between Windows
and Linux in this format. Of course, the .llf file itself could be exchanged,
since a version of the Editor runs under Linux as well.
The IITM Package also provides
utilities for converting the local language documents to Unicode text as
well as PDF. Since these two are generally accepted as universal formats,
one gets the advantage of using the editor for preparing text for several
other applications which are based on these formats. Specifically, the
PDF format can be used to prepare e-books in Indian languages.
5. Preparation of Statistical Tables and Schedules which can be seen and understood in all the languages of India.
The ability of the
IITM software to display a given text in all the different languages makes
it ideal for use in the preparation of tables, maps and announcements
involving names of places, persons, etc. The automatic transliteration capability
allows for transparent handling of the information across all the languages.
6. Direct feature to handle Roman Text.
English text (Roman)
may be typed into the document very easily without having to change any
software setting. A hot key is provided for this (details follow). The
"tconvert" utility (as well as "tview") may be used to convert Roman transliterated
text into Indian scripts. The utility supports four different transliteration
schemes. This way, many manuscripts which have been prepared in the past
using transliteration may be converted directly into the required Indian
scripts. Transliteration is covered in a subsequent section.
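The general principle behind converting Roman transliterated text may be sketched as below. The letter table is a hypothetical toy scheme invented for this example, covering only a handful of letters. It is not one of the four schemes supported by "tconvert", which are not named in this section, and the output is Unicode Devanagari rather than the Editor's own format. The point illustrated is that a vowel following a consonant becomes a matra, and that adjacent consonants are joined into a conjunct.

```python
VIRAMA = "\u094d"  # suppresses the inherent "a" of a consonant

# Hypothetical toy scheme: Roman letter -> Devanagari consonant.
CONSONANTS = {
    "k": "\u0915", "g": "\u0917", "t": "\u0924", "d": "\u0926",
    "n": "\u0928", "m": "\u092e", "r": "\u0930", "l": "\u0932",
    "s": "\u0938",
}
# Roman vowel -> (independent form, matra used after a consonant).
VOWELS = {
    "a": ("\u0905", ""), "aa": ("\u0906", "\u093e"),
    "i": ("\u0907", "\u093f"), "ii": ("\u0908", "\u0940"),
    "u": ("\u0909", "\u0941"), "uu": ("\u090a", "\u0942"),
    "e": ("\u090f", "\u0947"), "o": ("\u0913", "\u094b"),
}


def to_devanagari(roman):
    """Convert text in the toy Roman scheme to Devanagari."""
    out = []
    pending = False  # True while the last consonant still awaits a vowel
    i = 0
    while i < len(roman):
        # Prefer a two-letter token ("aa") over a one-letter token ("a").
        for size in (2, 1):
            token = roman[i:i + size]
            if token in CONSONANTS or token in VOWELS:
                break
        else:
            token = roman[i]  # unknown character, copied through
        if token in CONSONANTS:
            if pending:
                out.append(VIRAMA)  # join consonants into a conjunct
            out.append(CONSONANTS[token])
            pending = True
        elif token in VOWELS:
            independent, matra = VOWELS[token]
            out.append(matra if pending else independent)
            pending = False
        else:
            if pending:
                out.append(VIRAMA)
            out.append(token)
            pending = False
        i += len(token)
    if pending:
        out.append(VIRAMA)  # bare consonant at the end of the text
    return "".join(out)


print(to_devanagari("raama"))
print(to_devanagari("indra"))
```

A full scheme would need a larger table (aspirates, retroflex letters, anusvara and so on), but the handling of matras and conjuncts stays the same.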