Sparkonda

May 29, 2016

A minimalistic utility library for managing conda environments for PySpark jobs on YARN clusters.

Features

Manage conda environments on PySpark executors so that jobs can use specific packages on the remote workers, without asking admins to install the required software on the Hadoop cluster.
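To illustrate the general idea only, here is a minimal sketch of packing an environment directory into an archive that could be shipped to workers and unpacked there. This is not Sparkonda's API: the function names `pack_env` and `unpack_env` are hypothetical, and the sketch uses only the Python standard library. In a real job the archive would still have to be distributed to the executors, for example with PySpark's `SparkContext.addFile`.

```python
import os
import tarfile


def pack_env(env_dir, archive_path):
    """Pack a directory (e.g. a conda environment) into a gzipped tarball.

    Hypothetical helper, not part of Sparkonda.
    """
    env_name = os.path.basename(os.path.normpath(env_dir))
    with tarfile.open(archive_path, "w:gz") as tar:
        # Store files under the environment name so the archive
        # unpacks into a single top-level directory.
        tar.add(env_dir, arcname=env_name)
    return env_name


def unpack_env(archive_path, target_dir, env_name):
    """Unpack the archive and return the path of the extracted environment.

    Hypothetical helper, not part of Sparkonda.
    """
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(target_dir)
    return os.path.join(target_dir, env_name)
```

On a worker, the Python interpreter of the unpacked environment would then be expected under `bin/python` inside the returned path, assuming a standard conda layout on Linux.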

Docs

http://sparkonda.readthedocs.org

Contents

  1. Features
  2. Docs