batch-processing.md
July 15, 2021
Bookmarks tagged [batch-processing]
www.codever.land/bookmarks/t/batch-processing
Apache Beam Home Page
Apache Beam is an open source, unified model and set of language-specific SDKs for defining and executing data processing workflows, and also data ingestion and integration flows, supporting Enterpris...
- tags: batch-processing, stream-processing, apache-beam
- :octocat: source code
PySpark
https://pypi.python.org/pypi/pyspark/
Apache Spark Python API.
dask
A flexible parallel computing library for analytic computing.
- tags: python, distributed-computing, batch-processing
- :octocat: source code
luigi
https://github.com/spotify/luigi
A module that helps you build complex pipelines of batch jobs.
- tags: python, distributed-computing, batch-processing
- :octocat: source code
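As a rough illustration of what a pipeline of batch jobs involves, the sketch below orders jobs so that each one runs only after the jobs it depends on. It uses only the Python standard library and is not luigi's API; the job names and the `run_pipeline` helper are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical job names; each job maps to the set of jobs it depends on.
dependencies = {
    "extract": set(),
    "clean": {"extract"},
    "aggregate": {"clean"},
    "report": {"aggregate", "clean"},
}


def run_pipeline(deps):
    """Return the jobs in an order that respects every dependency."""
    completed = []
    for job in TopologicalSorter(deps).static_order():
        # A real pipeline would execute the job here; this sketch only records it.
        completed.append(job)
    return completed


order = run_pipeline(dependencies)
```

Pipeline tools typically add features such as scheduling, retries, and tracking of completed outputs on top of this kind of dependency ordering.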
mrjob
Run MapReduce jobs on Hadoop or Amazon Web Services.
- tags: python, distributed-computing, batch-processing
- :octocat: source code
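For readers new to the MapReduce model, here is a minimal in-memory word count showing the map, shuffle, and reduce steps. It is a plain-Python sketch, not mrjob's API, and it runs in a single process rather than on Hadoop; the `mapper`, `reducer`, and `run_job` names are hypothetical.

```python
from collections import defaultdict
from itertools import chain


def mapper(line):
    # Map step: emit a (word, 1) pair for every word in one input line.
    for word in line.lower().split():
        yield word, 1


def reducer(word, counts):
    # Reduce step: combine all counts emitted for one word.
    return word, sum(counts)


def run_job(lines):
    # Shuffle step: group mapper output by key before reducing.
    groups = defaultdict(list)
    for key, value in chain.from_iterable(mapper(line) for line in lines):
        groups[key].append(value)
    return dict(reducer(key, values) for key, values in groups.items())


word_counts = run_job(["batch jobs run in batch", "jobs run nightly"])
```

On a cluster, the framework distributes the map and reduce calls across machines and performs the grouping between them; the per-record logic keeps the same shape.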
Ray
https://github.com/ray-project/ray/
A system for parallel and distributed Python that unifies the machine learning ecosystem.
- tags: python, distributed-computing, batch-processing
- :octocat: source code