Concordance for the volumes of Deivathin Kural.
1. Concordance - the basic motivation
The seven volumes of Deivathin Kural constitute a huge encyclopedia
of information relating to our scriptures, history, languages, life styles,
puranic history of temples and a lot more.
This massive store of knowledge in seven volumes is a rich source of
information for us to
understand what our scriptures say,
research into our past,
answer questions about Sanathana Dharma,
find out how places got their names,
see how Sanskrit stands as a base language from which words
in different Indian languages, as well as other languages of the
world, have come,
and explore many more intriguing thoughts.
With a thousand pages or more in each volume, it will not be
easy for any interested person to cite the section or page
in a volume that refers to a phrase, concept or principle.
A word list for each volume would run into several thousand entries,
and it is for this reason that a printed index accompanying each volume
is practically ruled out.
The computer age gives us an opportunity to attempt
generating the Concordance.
2. The richness of the text in the seven volumes
The volumes of Deivathin Kural are written in Tamil and cover
a vast range of words and phrases from Tamil and Sanskrit
literature.
The reader will often find
that Maha Periyava has provided, in a simple manner,
explanations which automatically include meanings for
many words, thus making each volume almost like a
dictionary. This will help linguistic scholars cite
the volumes when they need to provide a meaning
for an obscure usage of a word or phrase.
The volumes include hundreds of quotations which serve to
explain a concept, refer to historical events, recall proverbs,
trace the etymology of words and more.
The references to Slokas and lines of Poetry from Scriptures and
Literature are a noteworthy aspect of the contents of each volume.
This reference is provided by actually quoting the original text
that is being referenced. Often footnotes carry additional details
of the location of the reference in its original source. In essence,
this authenticates the reference beyond doubt.
3. Organization of text seen in the volumes
The following picture illustrates the basic organization of text
in each volume. A major concept is explained through a number
of sub-sections, with specific titles, and each subsection in turn
is split into short essays relating to a general topic pertinent to
the concept and the sub-section. Each essay is given a title and the
title itself can provide clues to what one may be searching for.
Each essay is made up of paragraphs and may include quotations
from scriptures and from Sanskrit and Tamil literature. Interestingly, Maha Periyava
often resorted to citing English words to illustrate an idea which the
reader familiar with English may easily understand, whereas the same
idea expressed in Tamil may call for more effort, simply on account of the
difficulties with Tamil or Sanskrit vocabulary.
Important words and phrases are generally given in quotes for the
reader to appreciate their role in the context of the subject or concept
under discussion.
Most essays include footnotes which serve as additional information
for the reader to relate the text to well established references.
Many essays include long and sometimes very long paragraphs.
Such long paragraphs were not a problem for readers of the earlier
generations but may pose difficulties for the modern younger
generation. Also, Tamil writing style allows long compound words
built up by combining three or even four basic words. Here are
a few examples.
4. Word statistics
The table below provides details of the results of analysis of the words
from all the volumes of Deivathin Kural.
Vol | All words in the volume | All Long words | Root Filtered Long words* | All Short words | Root Filtered Short words* | Quoted words [Phrases] | English words
Vol-1 | 126955 | 19846 | 15098 (2232) | 43830 | 28083 (1754) | 636 [1166] | 576
Vol-2 | 163834 | 26111 | 19211 (2637) | 53554 | 31836 (2122) | 1541 [1497] | 996
Vol-3 | 148802 | 26546 | 20566 (2023) | 50500 | 33289 (2493) | 1610 [1592] | 992
Vol-4 | 178436 | 28584 | 21317 (2912) | 62101 | 33289 (3014) | 1865 [1902] | 638
Vol-5 | 180020 | 29438 | 21907 (2458) | 59217 | 33869 (2778) | 1685 [2465] | 509
Vol-6 | 213077 | 30980 | 23203 (3399) | 71352 | 40707 (6048) | 2881 [3135] | 706
Vol-7 | 150154 | 23355 | 19029 (4238) | 50688 | 34014 (3340) | 2073 [1644] | 688
* Explanation of Figures in parentheses is given at the end of this section
All words refer to the total count of the words in the volume. A word in
this context is a string of aksharas bounded by a space at
both ends. This list will include words which have just one akshara,
numbers, words in English, Roman numerals and possibly some special symbols. The figures
are not exact counts but can be taken as
probably correct to the nearest 50.
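The definition of a word given above can be sketched in Python as follows. This is only an illustration, assuming the text of a volume is available as a plain Unicode string; the function name count_words and the sample string are hypothetical and are not taken from the application at this site.

```python
def count_words(text: str) -> int:
    """Count every space-bounded token, with no filtering at all.

    Single aksharas, numbers, English words, Roman numerals and stray
    symbols are all counted, as in the "All words" column.
    """
    return len(text.split())


# Hypothetical sample mixing Tamil words, a number, English and a numeral.
sample = "தெய்வத்தின் குரல் 7 volumes iv ஓம்"
print(count_words(sample))  # 6
```

Punctuation attached to a word is counted as part of that word here, which is one reason such counts are approximate.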
All Long words are words more than 6 aksharas in length. In this list,
duplicates within a paragraph have been eliminated, but duplicates across
titles, and across paragraphs within the same title, are retained. Long words tend to have
fewer duplicates.
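A possible way to measure word length in aksharas and to collect long words is sketched below. It assumes that one akshara is a base letter together with the vowel signs or pulli attached to it, so that a consonant carrying a pulli counts as an akshara of its own; the site may count differently. The names split_aksharas and long_words are hypothetical.

```python
import unicodedata


def split_aksharas(word: str) -> list[str]:
    """Group each base character with the combining marks that follow it."""
    aksharas: list[str] = []
    for ch in word:
        # Tamil vowel signs and the pulli have Unicode category Mc or Mn.
        if aksharas and unicodedata.category(ch).startswith("M"):
            aksharas[-1] += ch
        else:
            aksharas.append(ch)
    return aksharas


def long_words(paragraphs: list[str], min_len: int = 7) -> list[str]:
    """Collect words of min_len or more aksharas.

    Duplicates are dropped within a paragraph only, so the same word
    may appear again for a later paragraph.
    """
    result: list[str] = []
    for para in paragraphs:
        seen: set[str] = set()
        for word in para.split():
            if word not in seen and len(split_aksharas(word)) >= min_len:
                seen.add(word)
                result.append(word)
    return result


print(len(split_aksharas("குரல்")))  # 3
paras = ["ஸம்ஸ்காரங்கள் குரல் ஸம்ஸ்காரங்கள்", "ஸம்ஸ்காரங்கள்"]
print(long_words(paras))
```

The default min_len of 7 corresponds to "more than 6 aksharas"; the same helper with a range of 3 to 6 would give the short words.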
Root Filtered Long words constitute the set obtained by grouping long words on
their first four aksharas. What this means is that words sharing the first
four aksharas are likely to be derivations of one root with varying suffixes. Identifying
the basic root words permits the long words list to be shrunk to a much
smaller set. This set is therefore the important collection of words most
likely to be searched for in the context of scriptures, historical data etc.
All Short words are obtained by identifying all the words having 3-6 aksharas.
Duplicates within a paragraph have been eliminated but duplication across titles
is retained. This set is perhaps the clue to common words which are also
part of the set used in the scriptures, historical texts, names of places etc.
Root Filtered short words are obtained by identifying words with common
roots within the set and selecting one or two to represent the root. This set is
likely to include most words one is likely to look for. Identifying the root words
is done based on the first three aksharas in a word.
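The root filtering described above, for both long and short words, can be pictured as grouping words by their leading aksharas and keeping one or two members of each group. The sketch below is only an assumption about how this might be done: the names root_filter and akshara_prefix are hypothetical, the akshara splitting treats a base letter plus its attached signs as one akshara, and the representatives are simply the first words encountered.

```python
import unicodedata
from collections import defaultdict


def split_aksharas(word: str) -> list[str]:
    """Group each base character with the combining marks that follow it."""
    aksharas: list[str] = []
    for ch in word:
        if aksharas and unicodedata.category(ch).startswith("M"):
            aksharas[-1] += ch
        else:
            aksharas.append(ch)
    return aksharas


def akshara_prefix(word: str, n: int) -> str:
    """Return the first n aksharas of a word as a string."""
    return "".join(split_aksharas(word)[:n])


def root_filter(words: list[str], prefix_len: int, keep: int = 1) -> dict[str, list[str]]:
    """Group words by their akshara prefix and keep a few per group."""
    groups: dict[str, list[str]] = defaultdict(list)
    for word in words:
        key = akshara_prefix(word, prefix_len)
        if word not in groups[key]:
            groups[key].append(word)
    return {key: members[:keep] for key, members in groups.items()}


# Illustrative short-word list; prefix_len=3 as described for short words.
words = ["தர்மம்", "தர்மத்தை", "சாஸ்திரம்", "சாஸ்திரங்கள்"]
for root, chosen in root_filter(words, prefix_len=3).items():
    print(root, chosen)
```

For long words the same function would be called with prefix_len=4. A prefix rule of this kind can split one root into two groups when a suffix changes the third or fourth akshara, which may be one reason a manual scan is also used.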
Quoted words constitute the set of all words seen within quotes in each
titled section. These may be words of just one akshara or longer ones.
The count of words under quoted phrases refers to all the instances of
a phrase consisting of 2 to 20 words seen inside quotation marks. The
quoted words and phrases are likely to be of great interest to viewers.
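The separation of quoted words from quoted phrases could be done along the following lines. This is a sketch under assumptions: only straight and curly double quotation marks are recognised, and the name quoted_items is hypothetical. The limits of 2 and 20 words are taken from the description above.

```python
import re

# Matches text inside straight ("...") or curly (“...”) double quotes.
QUOTED = re.compile(r'"([^"]+)"|“([^”]+)”')


def quoted_items(text: str) -> tuple[list[str], list[str]]:
    """Return (quoted single words, quoted phrases of 2 to 20 words)."""
    words: list[str] = []
    phrases: list[str] = []
    for match in QUOTED.finditer(text):
        content = (match.group(1) or match.group(2)).strip()
        count = len(content.split())
        if count == 1:
            words.append(content)
        elif 2 <= count <= 20:
            phrases.append(content)
        # Empty quotes and quotes longer than 20 words are ignored.
    return words, phrases


sample = 'He said "dharma" and then “satyam vada” to the students.'
print(quoted_items(sample))  # (['dharma'], ['satyam vada'])
```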
English words refer to all instances of use of English within a titled section.
In this list, common words such as the articles "a" and "the", and words like "and", "of" etc., have been
removed.
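Picking out English words from Tamil text and dropping the common ones might look like the sketch below. The stop list contains only the four words named above, since the full list used at this site is not known; the name english_words is hypothetical, and runs of Latin letters are assumed to be English.

```python
import re

# Only the words named in the text; the actual list is presumably longer.
STOP_WORDS = {"a", "the", "and", "of"}


def english_words(text: str) -> list[str]:
    """Return runs of Latin letters that are not in the stop list."""
    tokens = re.findall(r"[A-Za-z]+", text)
    return [t for t in tokens if t.lower() not in STOP_WORDS]


sample = "இது the concept of Dharma and a law"
print(english_words(sample))  # ['concept', 'Dharma', 'law']
```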
(*) For Root Filtered words, the figure in parentheses indicates
the number of words selected through a manual scan. This manually selected list
will be helpful for offline reference. More information
on this is given in The approach to creating the concordance.
5. Variations in spelling words
Text in all the volumes relates to concepts, ideas, and common
aspects of life across the country. When a word relates to a language
other than Tamil, writing the same in Tamil can often lead to variations.
Tamil orthography (writing system) employs fewer symbols compared
to Devanagari and most other scripts of India's languages. Consequently
the same word, typically from Sanskrit, may admit of multiple representations
in the Tamil script.
On account of this, one may find the same word typed differently on
different pages across the volumes. Searching for a word may then
pose difficulties if matches cannot be obtained due to spelling
differences. However, this need not make locating a word
frustrating, since only matches in the first three aksharas
count. It is therefore adequate to remember possible variations for
the first few aksharas of the word in the query.
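Matching on the first three aksharas, while allowing for alternative spellings, can be sketched as a lookup in a prefix index. Everything here is an assumption made for illustration: the names build_prefix_index and lookup are hypothetical, the word list is made up, and the two spellings shown are merely an example of a Sanskrit word written in two ways in Tamil script.

```python
import unicodedata
from collections import defaultdict


def split_aksharas(word: str) -> list[str]:
    """Group each base character with the combining marks that follow it."""
    aksharas: list[str] = []
    for ch in word:
        if aksharas and unicodedata.category(ch).startswith("M"):
            aksharas[-1] += ch
        else:
            aksharas.append(ch)
    return aksharas


def prefix_of(word: str, n: int = 3) -> str:
    """Return the first n aksharas of a word as a string."""
    return "".join(split_aksharas(word)[:n])


def build_prefix_index(words: list[str]) -> dict[str, set[str]]:
    """Map each three-akshara prefix to the words that start with it."""
    index: dict[str, set[str]] = defaultdict(set)
    for word in words:
        index[prefix_of(word)].add(word)
    return index


def lookup(index: dict[str, set[str]], spellings: list[str]) -> list[str]:
    """Collect matches for every spelling variant supplied in the query."""
    hits: set[str] = set()
    for spelling in spellings:
        hits |= index.get(prefix_of(spelling), set())
    return sorted(hits)


index = build_prefix_index(["சாஸ்திரம்", "சாஸ்திரங்கள்", "சாத்திரம்", "தர்மம்"])
print(lookup(index, ["சாஸ்திரம்"]))               # one spelling only
print(lookup(index, ["சாஸ்திரம்", "சாத்திரம்"]))  # both variants
```

Supplying both variants in the second query finds the word that the first query misses, which is the point of keeping possible variations in mind.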
The image below illustrates some of the most common differences
observed in the volumes when the same word is seen typed
differently. One may keep this point in mind when searching for words,
typically from Sanskrit or even English.
6. Rationale behind the current approach
At this point in time when Web Technologies have advanced to a high level
of sophistication, one can certainly question the need to have
an independent search application as provided at this site. While this is a
legitimate point, we wish to state the important reasons behind our efforts
to create this site along with the search application.
One does not look for keyword matching alone when attempting a search.
The person looking for information has a "context" in mind and this context
is difficult to specify in terms of keywords alone. That is perhaps the reason
why search engines return hundreds of results since all of them work on
keywords or keyword based phrases, for ascertaining the context is a very
complex issue.
At this site, the context is restricted to information contained in the volumes
of Deivathin Kural. Besides, the context is also enhanced when the type of
word one is looking for is also specified, not in the form of a keyword but as a
parameter in search. This helps considerably in practice as it allows a person
to specify the context, which in turn helps the application narrow down the
results.
Certainly no claim is made about the specific advantage of this approach.
One must concede, however, that the massive store of information about our culture, history,
life styles, languages etc., contained in the seven volumes does benefit from
a facility for locating information that is structured to
cater to the context specified above.
These ideas are further elaborated in the section on The approach to creating the concordance.
7. Possible enhancements
At this point in time (Nov. 2018) the exercise of concordance generation has
limited itself to preparing a word list. The somewhat arbitrary approach to filtering
words for concordance may be helpful but it is possible that one might think
of specifying contexts differently, say based on what scholars have traditionally
looked for in the volumes while citing a reference. This may lead to a search
feature that might identify phrases (other than those which appear in quotes),
where the context relates to specific scriptures, history, epigraphical references
and such.
Viewers who find our exercise a worthy attempt may kindly offer their views
on this. Also, the computer savvy among the younger generation may come forward
to take this effort forward.
8. Offline versions of word lists
The downloads page has links
to downloadable files for offline reference. These lists are shorter in that they
display only a few thousand words for each volume. Yet, one may find that
these lists offer a first level reference for the roots, which may lead to a
much larger list through the search application at this site.
Concordance search
The concordance search page at this site will help you search for words in
all the volumes of Deivathin Kural