Tuesday, April 22, 2014

download sanskrit ocr


online translation of sanskrit to hindi

Online translation of Sanskrit<---> Hindi
Online Sanskrit<---> English
Click >> Sanskrit Dictionary


Sanskrit, OCR


Almost every Greek and Latin text is freely available on the Internet, but the same can hardly be said for Sanskrit. However, Sanskrit's online presence has slowly increased over the past few years, and it is set to increase more and more in the years to come. This "online presence" is directly related to how many Sanskrit texts are available on the Internet. Texts, though, are of two kinds: scanned text, which exists as a collection of large images, and digitized text, which exists as a "text" file (just like this web page). Scanned text is OK, but it is often difficult to use. Digitized text, meanwhile, can be searched and processed more easily.
To turn scanned text into digitized text quickly and easily, we need a tool that can perform optical character recognition (OCR for short). This page is a short guide to using OCR yourself. If you have a copy of a text that is not freely available on the web, please consider using an OCR program to spread it and make it more usable for everybody.


Only a few Devanagari OCR programs are available for public use. The most useful one is SanskritOCR. The software is difficult to find these days, but a few copies still survive on the web. You can download a copy of the software by clicking the link below:

Using SanskritOCR

Creating a new project

Before you can start processing a text, you need to create and save a new project file. You can do so by clicking either File -> New Document or the white page button:
New document
Save the file in the .sat (Stapel-Datei) format. You must save before you start OCR.
Now that you have your project space set up, it's time to bring in some material to process.

Importing files

You can import files either by scanning directly into the program or by using saved images on your computer. I only have experience with saved images, so I will describe that here. To load image files, select File -> Open Image Files. You can import any number of image files at once. The program, however, recognizes only three image formats: .bmp, .jpg, and .png.
Your imported files will be listed in the Tools window. Use this window to view the different pages in your project.
Tools window
If you do not see the Tools window anywhere, click Ansicht -> Werkzeugfenster to display it.
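If you have a whole folder of page images, a quick check like the one below can tell you which files should import cleanly and which need converting first. This is only a sketch: the function name is hypothetical, the extension list is simply the three formats named above, and matching extensions case-insensitively is my assumption about how the program behaves.

```python
from pathlib import Path

# The three formats named in this guide. Treating upper-case extensions
# as equivalent is an assumption, not documented program behaviour.
SUPPORTED = {".bmp", ".jpg", ".png"}

def split_by_support(filenames):
    """Return (ready, needs_conversion) lists of file names."""
    ready, needs_conversion = [], []
    for name in filenames:
        if Path(name).suffix.lower() in SUPPORTED:
            ready.append(name)
        else:
            needs_conversion.append(name)
    return ready, needs_conversion

ready, todo = split_by_support(["page01.png", "page02.tif", "page03.JPG"])
print(ready)  # ['page01.png', 'page03.JPG']
print(todo)   # ['page02.tif']
```

Anything that lands in the second list can be converted with an image processing program before you import it.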

Preparing the image

You can rotate the image by 90 degree increments:
Do not worry if your image is slightly tilted. SanskritOCR will still be able to read it.
Once your image has the right orientation, you must mark the Devanagari portions of the image so that SanskritOCR will be able to scan more effectively. You are required to do this; SanskritOCR will not let you proceed if you do not. To do so, click the markup tool:
Markup tool
Using this tool, click and drag on the document to create boxes. If your image is at an uncomfortable size, you can use the zoom tools to adjust the image:
Zoom in and out
If the page consists only of Devanagari, you can put everything in one large box:
Everything in one box
But otherwise, it's best to draw multiple boxes around the separate blocks of Devanagari:
Multiple boxes
Your text output will be presented in the order in which the boxes are numbered. You can use the tools next to the markup tool to delete and renumber the boxes in the image. You can also drag the edges of a box to better fit the text.

Cleaning and scanning

Before the scanning begins, you are required to clean the image. To do so, click the Clean image tool:
Clean tool
SanskritOCR will automatically clean, realign, and pre-process your text:
Cleaned text
All that's left is the final scan, which you can run by clicking the Start recognition tool:
Recognition tool
Results will depend on the quality of your scan. In the best case, the text is represented perfectly. In the worst, odd or unlikely combinations will appear. If the scanned text is faded, patchy, blurry, or so on, complete gibberish could be the result as well.
Watch out for gray boxes, which mark areas that could not be processed. Using more and smaller boxes in your markup may help. Oddly enough, you can also try using fewer and larger boxes.

Saving, processing, and using your output

The output from the scan cannot be saved directly. To copy your output to your computer's clipboard, click Recognised text -> --> Clipboard. For general-purpose processing, leave the Transcription mode on program transcription. The Range box will likely be grayed out. Click OK, then click OK again to close the dialog box that pops up.
Your text can now be pasted into a separate file, where it can be saved. Try pasting it in your favorite text program. If you do, you will probably get data like this:
kï¸catkåntåvirahagu?½å svådhikåråtpramattaµ ¸åpenåstaºgamitamadvimå var¹abhogye½a bhartuµ / yak¹a¸cadh÷?ï janakatanayåsnånapu½yodake¹u snigdhacchåyåtaru¹u vasatiº råmagiryå¸rame¹u // 1 //
You must process the text one more time. To do so, you can use the Sanscript tool provided on this site. Set your input scheme to SanskritOCR, and set your output scheme to whatever you like.
Your final product may still have artifacts or strange characters from SanskritOCR's raw output. Certain vowels or consonant groups may have been converted in odd ways. For these and many other reasons, proofreading is very important!
Now you have some raw text! Feel free to format and use this text however you like. If you have made a clean copy of the text, I encourage you to share it with sites like Sanskrit Documents or GRETIL so that people all over the world can have access to the fruits of your labor.
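As a first pass at proofreading, it can help to list every character that looks out of place before reading the text line by line. The sketch below assumes you chose Devanagari as your output scheme; the function name and the list of allowed punctuation are my own choices, not anything SanskritOCR or Sanscript provides. It does not replace reading the text against the scan.

```python
def suspect_characters(text):
    """Return (position, character) pairs that are not Devanagari,
    whitespace, digits, or one of the punctuation marks allowed below."""
    allowed_punct = set("/|-.,()")  # assumed list; adjust for your text
    found = []
    for pos, ch in enumerate(text):
        is_devanagari = "\u0900" <= ch <= "\u097f"  # Unicode Devanagari block
        if not (is_devanagari or ch.isspace() or ch.isdigit() or ch in allowed_punct):
            found.append((pos, ch))
    return found

print(suspect_characters("राम?"))  # [(3, '?')]
```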

Common Questions

  • How do I save my project?
    SanskritOCR saves automatically. You can just close the program when you're done. Your work will also be saved if SanskritOCR crashes. If SanskritOCR freezes or starts acting strangely, don't be afraid to close the program and start it up again.
  • My results are terrible! What am I doing wrong?
    Use large grayscale images. If your image is not grayscale already, SanskritOCR will convert it for you, but this conversion seems to severely affect program output.
  • Can I process more than one page at a time?
    No, unfortunately.
  • I have an image file that SanskritOCR does not recognize. What should I do?
    Use an image processing program to change the file format. Irfanview is a good choice: it's lightweight, fast, free, and extremely powerful.
  • I have a PDF/TIFF/DJVU file that I would like to split into separate pages. How can I do this?
    As before, use an image processing program to split the file up. Irfanview does not have much support for these files built-in, but if you download some plugins the program will be able to handle all of these file types. If you are using Irfanview, try opening your file and selecting View -> Multipage images -> Extract all pages.
  • Is SanskritOCR still under development?
    Yes. Dr. Hellwig is working specifically on Hindi OCR, but the software will likely be able to deal with Sanskrit as well. The official website for SanskritOCR can be found here. The site mentions a major relaunch for the program, but there is no date provided.
If you have more questions, you can try emailing Dr. Hellwig.

Wednesday, April 16, 2014

Basic mathematical operations in Sanskrit

|| संस्कृतम ||

|| संस्कृत अध्ययन कार्ये स्वागतम् अस्तु ||

Welcome to learning Sanskrit

|| Sanskrit_Adhyayan || Sanskrit_Sanskruti || Bharatiya_Knowledge_Traditions || Sanskrit_Jagruti ||

सरल गणित-प्रक्रिया: -

१. योगे युति: स्यात् क्षययो: स्वयोर्वा धनर्णयोरन्तरमेवयोग:||

अन्वय - क्षययो: (राश्यो: -,-) योगे युति: स्यात्, (एवं) स्वयो: (+,+) (राश्यो: योगे युति:) वा| धन (+) ऋणयो: (-)

अन्तरं एव योग: स्यात्|

Meaning : Two positive numbers, or two negative numbers, are added together; but for a positive and a negative number, their difference is their sum.

Explanation : This verse states the basic rule for the addition of signed integers.
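The rule can be sketched in code by working only with magnitudes and signs. The function name is hypothetical, and giving the result the sign of the larger magnitude is the usual convention, which the translation above does not spell out.

```python
def add_signed(a, b):
    """Add two integers using only their magnitudes and signs."""
    same_sign = (a >= 0) == (b >= 0)
    if same_sign:
        # Like signs: add the magnitudes and keep the common sign.
        total = abs(a) + abs(b)
        return total if a >= 0 else -total
    # Unlike signs: the sum is the difference of the magnitudes,
    # carrying the sign of the number with the larger magnitude.
    diff = abs(abs(a) - abs(b))
    larger = a if abs(a) >= abs(b) else b
    return diff if larger >= 0 else -diff

print(add_signed(-3, -4))  # -7
print(add_signed(5, -8))   # -3
```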

२. स्वयोरस्वयो स्‍वम् वध: स्वर्णघाते|

क्षयो भागहारे अपि चैवम् निरुक्तम्||

अन्वय - स्वयो: (+*+), अस्वयो: (-*-) वध: स्वम् (+) (भवति|) स्व-ऋण-घाते (+*-)

वध: क्षय: (-) (भवति)| भागहारे (/) अपि च एवम् निरुक्तम्|

Meaning : The product of two positive numbers, or of two negative numbers, is positive. The product of a positive and a negative number is negative. The same holds for division.

Explanation : This verse states the rule for the multiplication and division of signed integers.
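A small sketch of the same sign rule follows. The function names are hypothetical, and excluding zero is my own choice to keep the example simple.

```python
def product_sign(a, b):
    """Sign (+1 or -1) of a*b or a/b for non-zero a and b."""
    if a == 0 or b == 0:
        raise ValueError("this sketch only handles non-zero quantities")
    like_signs = (a > 0) == (b > 0)
    return 1 if like_signs else -1

def multiply_signed(a, b):
    # Multiply the magnitudes, then apply the sign rule.
    return product_sign(a, b) * (abs(a) * abs(b))

def divide_signed(a, b):
    # Division follows the same sign rule as multiplication.
    return product_sign(a, b) * (abs(a) / abs(b))

print(multiply_signed(-3, -4))  # 12
print(multiply_signed(3, -4))   # -12
print(divide_signed(-8, 2))     # -4.0
```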

३. योगान्तरं तेषु समानजात्योर्विभिन्नजात्योश्च पृथक् स्थितिश्च||

अन्वय - तेषु (अव्यक्तयोगान्तरेषु) समानजात्यो: योग: अन्तरं च (भवति) विभिन्नजात्यो: पृथक् स्थिति: च


Meaning : (In the addition/subtraction of variables) similar terms are added and subtracted; different terms remain separate.

Explanation : For example, when adding expressions in the variables a, b and c, adding/subtracting similar terms gives this result -

(5a+7a-2a)+(3b+2b)+(4c-c+2c) = 10a+5b+5c
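The same grouping can be sketched in code by keeping one running total per variable. The function name and the representation of terms as (coefficient, variable) pairs are assumptions made for this illustration.

```python
from collections import defaultdict

def combine_like_terms(terms):
    """terms: iterable of (coefficient, variable) pairs."""
    totals = defaultdict(int)
    for coefficient, variable in terms:
        # Similar terms are summed; different variables stay separate.
        totals[variable] += coefficient
    return dict(totals)

terms = [(5, "a"), (7, "a"), (-2, "a"), (3, "b"), (2, "b"),
         (4, "c"), (-1, "c"), (2, "c")]
print(combine_like_terms(terms))  # {'a': 10, 'b': 5, 'c': 5}
```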

४. अस्मिन् विकार: खहरे न राशावपि प्रविष्टेष्वपि निसृतेषु|

बहुष्वपिस्याल्लयसृष्टिकाले अनन्ते अच्युते भूतगणेषु यद्वद||

अन्वय - अस्मिन् खहरे राशौ बहुषु प्रविष्टेषु अपि (अथवा राशौ) निसृतेषु अपि विकार: न स्यात्|

यद्वत् सृष्टिलयकाले बहुषु अपि भूतगणेषु अनन्ते अच्युते (प्रविष्टेषु) विकार: न स्यात्|

Meaning : This infinite quantity does not change even when any number is added to it or subtracted from it, just as the 'Brahmanda' (universe) is not altered when many lives enter into it at the end of the world.

Explanation : This verse explains the concept of infinity.
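Floating-point infinity behaves in the way the verse describes, which makes for a quick illustration. This is only an analogy in Python, not a model of the verse; note also that Python raises ZeroDivisionError for 1/0 instead of returning infinity, so the sketch starts from math.inf directly. The function name is hypothetical.

```python
import math

def unchanged_by(changes):
    """True if adding each value in `changes` leaves infinity unchanged."""
    return all(math.inf + change == math.inf for change in changes)

print(unchanged_by([1, -1, 10**6, -10**100]))  # True
```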

Samskrit Amarkosh (Thesaurus)

Samskrit Amarkosh (Thesaurus) Chapter 1

Samskrit Amarkosh (Thesaurus) Chapter 2

Samskrit Amarkosh (Thesaurus) Chapter 3

Basics of Geometry in Sanskrit || संस्कृतम ||

|| संस्कृतम ||

|| वदतु संस्कृतम || पठतु संस्कृतम ||


१) अस्तित्वं भासते यस्य न शक्यं मापनं खलु |

निरकारोsपि साकारो स बिन्दुरिति कथ्यते ||

Meaning : A place whose existence is experienced/seen but which cannot be measured is called a point.

Explanation : A point is a location in space. A point has no length, width or height; it just specifies an exact location.

२) बिन्दुनाम य: समूह: स्यात घनविस्तार: वर्जित: |

दीर्घाकार: स भूमित्याम रेखाखंड इति स्मृत: ||

Meaning : A line is a straight one-dimensional geometric figure formed by a collection of points having no thickness and volume (and extending infinitely in both directions).

३) भवतो दैर्घ्यविस्तारौ घनता नैव विद्यते |

प्रुष्टभागसमम् रूपं भूमित्याम् प्रतलं हि तत् ||

Meaning : An (imaginary) flat surface that is infinitely large and with zero thickness or volume is defined as 'plane' in geometry.

४) रेखाया: प्रतलस्यापि द्वयम् यदि परस्परम् |

न संस्प्रुश्यति भूमित्याम् प्रतलम् हि तत् ||

Meaning : Two lines (in a plane) or two planes that do not intersect or meet are called parallel (lines and planes respectively).

५) प्रमाणम् नवतिर्यस्य काटकोण: स: उच्यते |

ततोSधिको विशाल: स्यात् तन्यूनो लघुरुच्यते ||

Meaning : The triangle that has one interior angle measuring 90 degrees is called a 'right triangle'. The one with an angle greater than that (more than 90 degrees) is called an 'obtuse triangle', and the one with less than that (all interior angles measuring less than 90 degrees) is called an 'acute triangle'.

६) तिस्रो भुजा समा यस्य समानभुज उच्यते |

तथा द्वयम् समानम् स्यात् स समद्विभूज भवेत् ||

Meaning : The triangle whose three sides are equal is called an 'equilateral triangle'. Similarly, the triangle whose two sides are equal is called an 'isosceles triangle'.
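The two classifications above, by angle and by side, can be sketched as follows. The function names are hypothetical, and the label 'scalene' for three unequal sides is the standard modern term, not one given in the verse. Exact comparison with 90 is fine for whole-number angles, but measured values would need a tolerance.

```python
def classify_by_angles(angles):
    """angles: the three interior angles of a triangle, in degrees."""
    largest = max(angles)
    if largest == 90:
        return "right"
    return "obtuse" if largest > 90 else "acute"

def classify_by_sides(sides):
    """sides: the three side lengths of a triangle."""
    distinct = len(set(sides))
    if distinct == 1:
        return "equilateral"
    if distinct == 2:
        return "isosceles"
    return "scalene"  # not named in the verse

print(classify_by_angles([90, 60, 30]))  # right
print(classify_by_sides([5, 5, 8]))      # isosceles
```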

७) अंशयोगत्रिकोणस्य शतं चाशीति लक्षणम् |

चतुर्भुजस्य कोणानां योगो द्विगुणितो भवेत् ||

Meaning : The sum of the angles of a triangle is 180 degrees, while the sum of the angles of a quadrilateral is twice that, i.e. 360 degrees.
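Both figures in the verse are instances of the general polygon formula (n - 2) * 180 degrees. The verse itself states only the two cases; the general formula and the function name below are additions for illustration.

```python
def interior_angle_sum(sides):
    """Sum of the interior angles, in degrees, of a simple polygon."""
    if sides < 3:
        raise ValueError("a polygon needs at least 3 sides")
    return (sides - 2) * 180

print(interior_angle_sum(3))  # 180
print(interior_angle_sum(4))  # 360
```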

८) त्रिभुजस्य फलशरीरम् समदलकोटीभुजार्धसंवर्ग: |

समपरिणाहस्यार्धम् विष्कम्भार्धहतमेव वृत्तफलम् ||

Meaning : The area of a triangle is ((1/2)*(perpendicular height)*base), while the area of a circle is (pi*r*r).

Explanation : फलशरीरम् - area, समदलकोटी - perpendicular,

समपरिणाह - circumference, विष्कम्भ - diameter

Area of a circle = (circumference/2) * (diameter/2)

= ((2*pi*r)/2) * (2r/2)

= (pi*r) * r = pi*r*r
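A quick numerical check shows that half the circumference times half the diameter agrees with pi*r*r. The function names are hypothetical and the sample values are arbitrary.

```python
import math

def triangle_area(base, height):
    """Half of base times perpendicular height."""
    return 0.5 * base * height

def circle_area_from_halves(radius):
    """Half the circumference times half the diameter."""
    circumference = 2 * math.pi * radius
    diameter = 2 * radius
    return (circumference / 2) * (diameter / 2)

print(triangle_area(4, 3))  # 6.0
print(math.isclose(circle_area_from_halves(3), math.pi * 3 * 3))  # True
```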


९)चतुरधिकं शतं अष्टगुणं द्वासष्टीस्तथा सहस्त्रानाम |

अयुतद्वयं विष्कम्भ: स्या सन्नो वृत्तपरिणाह: ||

Meaning : A circle whose diameter is 20,000 has a circumference of approximately 62,832.

Explanation : Aryabhata gives the value of π ('pi') in this shloka.

चतुरधिकं शतं अष्टगुणं - 104 * 8 (= 832)

द्वासष्टीसहस्त्रानाम = 62,000, अयुतद्वयं = 20,000

विष्कम्भ: - diameter, वृत्तपरिणाह: - circumference

So, π = 62,832/20,000 = 3.1416
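The arithmetic of the shloka can be replayed step by step and compared with the modern value of π. This is only an illustrative check; the variable names are my own.

```python
from fractions import Fraction
import math

circumference = (100 + 4) * 8 + 62 * 1000  # 832 + 62,000
diameter = 2 * 10_000                       # two ayutas (an ayuta is 10,000)
ratio = Fraction(circumference, diameter)

print(circumference)                        # 62832
print(float(ratio))                         # 3.1416
print(abs(float(ratio) - math.pi) < 1e-5)   # True
```

The value agrees with π to four decimal places.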

१०) वर्ग: समचतुरस्त्र फलम् च सदृशद्वयस्य संवर्ग:| सदृशत्रयसम्वर्गो घनस्तथा द्वादशाश्रि: स्यात् ||

Meaning : The figure with 4 equal sides is called a varg (square). Similarly, the product of two equal numbers is called (its) varg (square).
The product of three equal numbers is called (its) ghan (cube). Similarly, the solid with 12 equal edges is called a ghan (cube).

उदाहरनानि (Examples) :

११)क्षेत्रस्य समकोणस्य दैर्घ्यम् यदि चतुर्दश | दैघ्यार्ध-मात्राविस्तार: फलम् परिमितिम् वद ||

Meaning : Tell (me) the area and perimeter of a rectangular field whose length is 14 units and whose width is half of the length.
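One way to work this exercise, assuming the field is a rectangle (the function name is hypothetical):

```python
def rectangle_area_and_perimeter(length, width):
    """Return (area, perimeter) of a rectangle."""
    return length * width, 2 * (length + width)

length = 14
width = length / 2  # the width is half of the length
area, perimeter = rectangle_area_and_perimeter(length, width)
print(area)       # 98.0
print(perimeter)  # 42.0
```

So the area is 98 square units and the perimeter is 42 units.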

१२)कूपस्य सप्त विष्कभो निम्नता पंचविन्शति: | परिपूर्णे हि कूपेSस्मिन कियद्वारी भविष्यति? ||

Meaning : The diameter of a well is 7 units and its depth is 25 units. How much water will there be when the well is completely full?
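Treating the well as a circular cylinder, which is my assumption about its shape, the answer is the area of the circular opening times the depth. The function name is hypothetical.

```python
import math

def cylinder_volume(diameter, depth):
    """Volume of a circular cylinder from its diameter and depth."""
    radius = diameter / 2
    return math.pi * radius * radius * depth

volume = cylinder_volume(7, 25)
print(round(volume, 1))  # 962.1
```

The full well holds about 962.1 cubic units of water.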

संख्यारचना (Positional Number System)

१) एकं द्वे त्रिणि चत्वारि पञ्च षट सप्त चाष्ट च |

नव शून्यं दशांका स्यु: संख्या-लेखन-हेतवे ||

Meaning : 1, 2, 3, 4, 5, 6, 7, 8, 9 and 0 are the ten symbols used to write numbers.

२) तस्मात् संयुज्य तान् अंकान् स्थानमानानुसारत: |

वामतो गति: अंकानाम् ज्ञात्वा संख्या च लिख्यते ||

Meaning : Hence, by combining these number-symbols (and) by knowing that their (local/positional) value increases (from right) towards left, a number is written.
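The positional rule can be sketched by starting at the rightmost digit and multiplying the place value by ten at each step towards the left. The function name is hypothetical, and base ten is assumed from the ten symbols listed above.

```python
def number_from_digits(digits):
    """digits: the digits as written, leftmost first."""
    value = 0
    place = 1
    for digit in reversed(digits):  # start from the rightmost digit
        value += digit * place
        place *= 10                 # place value grows towards the left
    return value

print(number_from_digits([6, 2, 8, 3, 2]))  # 62832
```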