Data

July 10, 2025

Here you can find the code we used to curate these datasets:

  • FineWeb-Edu
  • FineMath
  • SmolTalk