text-parsing.md

July 15, 2021

Bookmarks tagged [text-parsing]

www.codever.land/bookmarks/t/text-parsing

tablib

https://github.com/kennethreitz/tablib

A module for tabular datasets in XLS, CSV, JSON and YAML.


openpyxl

https://openpyxl.readthedocs.io/en/stable/

A library for reading and writing Excel 2010 xlsx/xlsm/xltx/xltm files.
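
As a rough illustration of the read/write cycle described above, here is a minimal sketch that writes a few rows to an in-memory workbook and reads them back. The helper name `roundtrip_rows` and the sample data are made up for this example, and it assumes `openpyxl` is installed.

```python
from io import BytesIO

from openpyxl import Workbook, load_workbook


def roundtrip_rows(rows):
    """Write rows to an in-memory .xlsx workbook, then read them back.

    Hypothetical helper for illustration; not part of openpyxl.
    """
    workbook = Workbook()
    sheet = workbook.active
    for row in rows:
        sheet.append(row)  # each call adds one row at the bottom of the sheet

    # Save to a bytes buffer instead of a file on disk.
    buffer = BytesIO()
    workbook.save(buffer)
    buffer.seek(0)

    loaded = load_workbook(buffer)
    # values_only=True yields plain cell values rather than Cell objects.
    return [tuple(r) for r in loaded.active.iter_rows(values_only=True)]


rows = roundtrip_rows([["name", "qty"], ["apple", 3]])
print(rows)
```

Saving to a `BytesIO` buffer keeps the sketch free of temporary files; passing a filename to `save` and `load_workbook` works the same way.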


pyexcel

https://github.com/pyexcel/pyexcel

Provides one API for reading, manipulating and writing csv, ods, xls, xlsx and xlsm files.


python-docx

https://github.com/python-openxml/python-docx

Reads, queries and modifies Microsoft Word 2007/2008 docx files.


python-pptx

https://github.com/scanny/python-pptx

Python library for creating and updating PowerPoint (.pptx) files.


unoconv

https://github.com/unoconv/unoconv

Convert between any document format supported by LibreOffice/OpenOffice.


XlsxWriter

https://github.com/jmcnamara/XlsxWriter

A Python module for creating Excel .xlsx files.


xlwings

https://github.com/ZoomerAnalytics/xlwings

A BSD-licensed library that makes it easy to call Python from Excel and vice versa.


xlwt

https://github.com/python-excel/xlwt

Writes data and formatting information to legacy Excel (.xls) files.


PDFMiner

https://github.com/euske/pdfminer

A tool for extracting information from PDF documents.


PyPDF2

https://github.com/mstamy2/PyPDF2

A library capable of splitting, merging and transforming PDF pages.


ReportLab

https://www.reportlab.com/opensource/

Allows rapid creation of rich PDF documents.


Mistune

https://github.com/lepture/mistune

A fast, full-featured pure Python Markdown parser.


Python-Markdown

https://github.com/waylan/Python-Markdown

A Python implementation of John Gruber’s Markdown.
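
For a sense of what using this library looks like, the following minimal sketch converts a short Markdown string to HTML. The sample text is invented for the example, and it assumes the `markdown` package is installed.

```python
import markdown

# Sample input made up for this sketch: a heading and a paragraph.
source = "# Title\n\nSome *emphasis*."

# markdown.markdown() takes Markdown text and returns an HTML string.
html = markdown.markdown(source)
print(html)
```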


PyYAML

http://pyyaml.org/

A YAML parser and emitter for Python.
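
As a small illustration, this sketch parses a YAML string into Python objects and serializes it back. The document content is invented for the example, and it assumes PyYAML 5.1 or later is installed (for the `sort_keys` parameter).

```python
import yaml

# Sample YAML made up for this sketch.
text = "name: demo\ntags:\n  - csv\n  - yaml\n"

# safe_load builds only plain Python types (dict, list, str, ...),
# which is the usual choice for untrusted input.
data = yaml.safe_load(text)

# safe_dump serializes back to YAML; sort_keys=False keeps insertion order.
dumped = yaml.safe_dump(data, sort_keys=False)
print(data)
print(dumped)
```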


csvkit

https://github.com/wireservice/csvkit

Utilities for converting to and working with CSV.


unp

https://github.com/mitsuhiko/unp

A command line tool that can unpack archives easily.