WikiExtractor

August 25, 2021

Extracts and cleans text from a Wikipedia database dump and stores the output in a number of files of similar size in a given directory.
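The "files of similar size" behaviour can be illustrated with a small sketch. This is not WikiExtractor's own code: the `RotatingWriter` class, the `part_NN.txt` naming, and the `max_bytes` parameter are all hypothetical, chosen only to show one way of rolling over to a new output file once a size limit is reached.

```python
import os
import tempfile


class RotatingWriter:
    """Hypothetical helper (not part of WikiExtractor): writes text into
    numbered files and starts a new file when the current one would
    grow past max_bytes."""

    def __init__(self, out_dir, max_bytes):
        self.out_dir = out_dir
        self.max_bytes = max_bytes
        self.index = 0       # number of the next file to open
        self.written = 0     # bytes written to the current file
        self.handle = None
        os.makedirs(out_dir, exist_ok=True)

    def _open_next(self):
        if self.handle:
            self.handle.close()
        path = os.path.join(self.out_dir, "part_%02d.txt" % self.index)
        # newline="" keeps "\n" as-is so byte counts match on every platform
        self.handle = open(path, "w", encoding="utf-8", newline="")
        self.index += 1
        self.written = 0

    def write(self, text):
        size = len(text.encode("utf-8"))
        # Roll over only if the current file already holds something,
        # so a single oversized article still gets written to one file.
        if self.handle is None or (self.written and self.written + size > self.max_bytes):
            self._open_next()
        self.handle.write(text)
        self.written += size

    def close(self):
        if self.handle:
            self.handle.close()
            self.handle = None


def demo():
    """Write six short 'articles' with a 64-byte limit and list the files."""
    out_dir = tempfile.mkdtemp()
    writer = RotatingWriter(out_dir, max_bytes=64)
    for i in range(6):
        writer.write("article %d: some cleaned text\n" % i)
    writer.close()
    return out_dir, sorted(os.listdir(out_dir))
```

Calling `demo()` returns the temporary directory and the sorted list of files it created. Splitting on whole articles rather than exact byte offsets means the files are only approximately the same size, which matches the "similar size" wording above.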