Command Line Tools
October 29, 2024
This list contains network and data processing tools with a command-line interface, written in any programming language.
Contents
Network
Web Scraping
- pipet - A Swiss Army tool for scraping and extracting data using selectors, JavaScript, and Unix pipes
- trafilatura - Gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML
URLs
- courlan - Clean, filter and sample URLs to optimize data collection: Deduplication, spam, content and language filters