Hindu Sacred Texts Dataset

December 24, 2025

Overview

This repository contains a structured collection of Hindu sacred texts, formatted as JSON for easy access and analysis. It includes the Ramcharitmanas, Srimad Bhagavad Gita, Mahabharata, Valmiki Ramayana, Rigveda, Yajurveda, and Atharvaveda.

File Structure

The dataset is organized into seven main directories: Ramcharitmanas, Srimad Bhagavad Gita, Mahabharata, Valmiki Ramayana, Rigveda, Yajurveda, and Atharvaveda, each containing JSON files for individual chapters or kandas (काण्ड).

| Text | Structure Details | Estimated Number of Verses | Source |
| --- | --- | --- | --- |
| Ramcharitmanas | Composed of chaupais, dohas (sorthas), and various chhands | Approximately 10,000 chaupais | IIT Kanpur Ramcharitmanas Project |
| Srimad Bhagavad Gita | Verses (shlokas) in classical Sanskrit | Approximately 700 | Gita Supersite by IIT Kanpur |
| Mahabharata | 18 books divided into multiple chapters | Approximately 100,000 | Sacred Texts Mahabharata |
| Valmiki Ramayana | Verses (shlokas) in classical Sanskrit | Approximately 24,000 | GitHub Repository - Ramayana Book |
| Rigveda | Hymns (suktas) divided into mandalas | Over 10,000 (including all Mandalas) | Vedic Heritage Portal |
| Yajurveda Shukla | Vajasaneyi Madhyandina Samhita: Prose and verses (shlokas); Vajasaneyi Kanva Samhita: Prose and verses (shlokas) | Madhyandina: Approximately 1,900; Kanva: Approximately 1,980 | Vedic Heritage Portal |
| Yajurveda Krishna (to be added) | Prose and verses (shlokas), often mixed with Brahmana and Aranyaka sections | Varies widely | - |
| Atharvaveda | Hymns and prose | Approximately 6,000 | Vedic Heritage Portal |
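
To check which of these directories a local clone actually contains, a short Python sketch along the following lines may help. It assumes only that the repository has been cloned and that the data files use a .json extension; the function name count_json_files is illustrative and is not part of the repository.

```python
from pathlib import Path


def count_json_files(repo_root):
    """Return {top-level directory name: number of .json files beneath it}."""
    root = Path(repo_root)
    counts = {}
    for entry in sorted(root.iterdir()):
        # Skip plain files and hidden directories such as .git
        if not entry.is_dir() or entry.name.startswith("."):
            continue
        # rglob also finds files in nested folders (e.g. per-Samhita subfolders)
        counts[entry.name] = sum(1 for _ in entry.rglob("*.json"))
    return counts


if __name__ == "__main__":
    # "." assumes the script is run from the root of the cloned repository.
    for name, n in count_json_files(".").items():
        print(f"{name}: {n} JSON file(s)")
```

Because the sketch discovers directory names at run time, it does not depend on the exact folder spellings used in the repository.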

Ramcharitmanas

The Ramcharitmanas directory contains one file for each of the seven kandas (काण्ड).

Srimad Bhagavad Gita

The SrimadBhagvadGita directory contains one file for each of the 18 chapters of the Bhagavad Gita.

Mahabharata

The Mahabharata consists of 18 books containing around 100,000 (1 lakh) shlokas, available in the Mahabharata folder.

Valmiki Ramayana

All seven kandas (~24,000 shlokas) of the Valmiki Ramayana are present in the Valmiki Ramayana directory. Each kanda is available as a separate JSON file covering one phase of Lord Rama's life:

  • Balakanda - बालकाण्ड: Describes the birth of Rama, his childhood and marriage to Sita.
  • Ayodhyakanda - अयोध्याकाण्ड: Details the preparations for Rama's coronation, his exile, and his departure to the forest.
  • Aranyakanda - अरण्यकाण्ड: Chronicles the forest life of Rama and his encounters with sages and demons.
  • Kishkindhakanda - किष्किंधाकाण्ड: Covers the meeting of Rama with Hanuman and the vanara (monkey) kingdom of Kishkindha.
  • Sundarakanda - सुंदरकाण्ड: Depicts Hanuman's journey to Lanka, his meeting with Sita, and his fiery escape.
  • Yuddhakanda - युद्धकाण्ड: Describes the great war between Rama's army and the forces of Ravana.
  • Uttarakanda - उत्तरकाण्ड: Covers Rama's life after returning to Ayodhya, his coronation, and the banishment of Sita.

Credit to the Ramayana Book repository for providing the data.

Rigveda

Structure of Rigveda

The Rigveda directory contains JSON files for each of the ten Mandalas.

Yajurveda

The Yajurveda directory has been added to the repository, containing JSON files for the Shukla Yajurveda Samhitas:

  • Vajasaneyi Madhyandina Samhita
  • Vajasaneyi Kanva Samhita

Shukla Yajurveda Samhitas

Vajasaneyi Madhyandina Samhita

The Vajasaneyi Madhyandina Samhita directory contains one JSON file per chapter of the Vajasaneyi Madhyandina Samhita.

Vajasaneyi Kanva Samhita

The Vajasaneyi Kanva Samhita directory contains one JSON file per chapter of the Vajasaneyi Kanva Samhita. Chapter 1 is currently missing (see Upcoming Additions).

Atharvaveda

The Atharvaveda has 20 kandas, which are available in the Atharvaveda Repo.

Usage

To use this dataset, clone the repository and open the JSON files you need. File names are intended to be self-explanatory: each file corresponds to a specific chapter or kanda (काण्ड).
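
As a rough illustration, the Python sketch below loads each JSON file in one directory and prints the top-level shape of its contents. The JSON schema is not documented here, so the code deliberately avoids assuming any field names; the helper names load_json and describe are illustrative, and the SrimadBhagvadGita path should be replaced with whichever directory exists in your clone.

```python
import json
from pathlib import Path


def load_json(path):
    """Parse one JSON file, reading it as UTF-8 so Devanagari text is preserved."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def describe(data):
    """Summarise the top-level shape of parsed JSON without assuming a schema."""
    if isinstance(data, list):
        return f"list with {len(data)} item(s)"
    if isinstance(data, dict):
        # Show at most five keys, sorted for stable output
        keys = ", ".join(sorted(map(str, data))[:5])
        return f"object with {len(data)} key(s), e.g. {keys}"
    return type(data).__name__


if __name__ == "__main__":
    # Assumed directory name; adjust to match your local clone.
    for path in sorted(Path("SrimadBhagvadGita").glob("*.json")):
        print(path.name, "->", describe(load_json(path)))
```

Inspecting the shape first is a cautious way to learn the actual field names before writing code that depends on them.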

Applications

Projects and applications built using this dataset

  • Dharmik - App to browse the Bhagavad Gita with audio transcriptions

Upcoming Additions

We plan to expand this collection by including the following texts:

  • Samaveda
  • Krishna Yajurveda Samhita
  • The missing chapter 1 of the Vajasaneyi Kanva Samhita

These will be added in a similar structured format, making it easy for users to access and study these ancient texts.

Contributing

Contributions to this dataset are welcome. Please submit a pull request or raise an issue if you find any errors or have suggestions for improvements.

License

This dataset is made available under the Open Database License (ODbL). By using this dataset, you agree to the terms of the license.

Open Database License (ODbL)