batch-processing.md

July 15, 2021

Bookmarks tagged [batch-processing]

www.codever.land/bookmarks/t/batch-processing

Apache Beam Home Page

https://beam.apache.org/

Apache Beam is an open source, unified model and set of language-specific SDKs for defining and executing data processing workflows, and also data ingestion and integration flows, supporting Enterpris...


PySpark

https://pypi.python.org/pypi/pyspark/

Apache Spark Python API.


dask

https://github.com/dask/dask

A flexible parallel computing library for analytic computing.


luigi

https://github.com/spotify/luigi

A module that helps you build complex pipelines of batch jobs.
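
To make "pipelines of batch jobs" concrete, here is a rough standard-library sketch of the general idea: tasks declare what they require, run in dependency order, and already-completed tasks are skipped. This is not luigi's API. `run_pipeline`, `tasks`, `requires`, and `completed` are hypothetical names chosen for illustration, and the sketch assumes Python 3.9+ for `graphlib`.

```python
from graphlib import TopologicalSorter


def run_pipeline(tasks, requires, completed=None):
    """Run tasks in dependency order, skipping ones already completed.

    tasks:     maps a task name to a zero-argument callable (hypothetical shape)
    requires:  maps a task name to the set of task names it depends on
    completed: names of tasks whose output is assumed to exist already
    Returns the names of the tasks that were actually executed, in order.
    """
    completed = set(completed or ())
    executed = []
    # static_order() yields every task after all of its prerequisites.
    for name in TopologicalSorter(requires).static_order():
        if name in completed:
            continue  # output already present, nothing to redo
        tasks[name]()
        completed.add(name)
        executed.append(name)
    return executed


if __name__ == "__main__":
    log = []
    tasks = {
        "extract": lambda: log.append("extract"),
        "transform": lambda: log.append("transform"),
        "load": lambda: log.append("load"),
    }
    requires = {"transform": {"extract"}, "load": {"transform"}}
    print(run_pipeline(tasks, requires))
```

Skipping completed tasks is what lets a failed batch run be resumed without redoing earlier steps.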


mrjob

https://github.com/Yelp/mrjob

Run MapReduce jobs on Hadoop or Amazon Web Services.
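
For readers new to the MapReduce model, here is a minimal in-memory sketch of its three phases (map, shuffle, reduce) applied to a word count. It uses only the Python standard library and is not mrjob's API. The function names `mapper`, `shuffle`, `reducer`, and `word_count` are illustrative assumptions, and a real job would distribute these phases across a cluster.

```python
from collections import defaultdict


def mapper(line):
    """Map phase: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle phase: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reducer(key, values):
    """Reduce phase: collapse the values for one key into a single total."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence over an iterable of text lines."""
    pairs = (pair for line in lines for pair in mapper(line))
    return dict(reducer(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["to be or not to be", "to do"]))
```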


Ray

https://github.com/ray-project/ray/

A system for parallel and distributed Python that unifies the machine learning ecosystem.