Contributing
March 17, 2024
Suggestions
If you have feedback or a suggestion, or if you simply found a bug, feel free to submit an issue.
Contributions
If you would like to submit a pull request, consider the following:
- do not sacrifice readability for performance, toc is already fast enough for its use case
- add comments to explain complex parts of your function
- add a dedicated unit test if you implemented a new function
- make sure that all tests pass before committing
- contributions are licensed according to the open-source LICENSE
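As an illustration of the "dedicated unit test" guideline, here is a minimal sketch using the standard unittest module. The function heading_level and the test class are hypothetical and not part of toc; they only show the shape a test for a new function might take.

```python
import unittest


def heading_level(line: str) -> int:
    """Hypothetical helper: count the leading '#' characters of a line."""
    stripped = line.lstrip()
    return len(stripped) - len(stripped.lstrip("#"))


class TestHeadingLevel(unittest.TestCase):
    """Dedicated test case for the hypothetical heading_level function."""

    def test_counts_leading_hashes(self) -> None:
        self.assertEqual(heading_level("## Title"), 2)

    def test_plain_text_has_level_zero(self) -> None:
        self.assertEqual(heading_level("no heading here"), 0)
```

Saved as a test module that unittest discovery can find, a class like this would be run by the same "python -m unittest" call used in the Test coverage section.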
Tools
Here is a list of tools that you can use to improve code quality.
Type checking
Assigning a specific type (str, list, etc.) to variables and function outputs makes the code less prone to errors. Type checking ensures variable consistency and that functions are passed the expected type of input.
pip install mypy
mypy "toc/"
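To make the benefit concrete, here is a small sketch of an annotated function. The name indent_entries is hypothetical and not taken from toc; it only illustrates the kind of mistake mypy can report without running the code.

```python
from typing import List


def indent_entries(entries: List[str], width: int = 4) -> List[str]:
    """Hypothetical helper: prefix every entry with `width` spaces."""
    return [" " * width + entry for entry in entries]


print(indent_entries(["Intro", "Usage"], width=2))

# mypy would flag the call below, because a str is passed where a
# list of str is expected. Without type checking it would run and
# silently indent each character instead of each entry.
# indent_entries("Intro")
```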
Code linting
Linting checks that the code conforms to the PEP 8 (pepotto) style; black also reformats the files in place.
pip install black
black .
Test coverage
High test coverage means that more of the code is exercised by the tests and checked against unexpected behaviors, i.e. bugs.
pip install coverage
coverage run -m unittest
coverage html
firefox "htmlcov/index.html"
Virtual environment
Before running benchmarks, it is suggested to install toc in a virtual environment.
"(venv)" will be prepended to the shell prompt to indicate that every python/pip operation
is run in the "venv/" folder, without impacting the system.
python -m venv "venv/"
source venv/bin/activate
pip install -e .
Benchmarks
Running the code against a heavy workload amplifies the effect of inefficient sections in profiling operations.
Large files
To generate a single large file (to be processed with toc/cli.py tests/output/longfile.txt):
rm -f tests/output/longfile.txt
for i in {1..100000}; do
{
printf "# ################################################################ Test H1 $i\n"
printf "# ################################ Test H2 $i\n"
} >> tests/output/longfile.txt
done
More complex toc:
rm -f tests/output/longfile-complex.txt
for i in {1..100000}; do
{
printf "# ################################################################ Test H1 $i\n"
printf "# ################################ Test H2 $i\n"
printf "# ################ Test H3 $i\n"
printf "# ################ Test H3 $i\n"
printf "# ################################ Test H2 $i\n"
} >> tests/output/longfile-complex.txt
done
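The loops above rely on bash brace expansion. If that is not available, the same files can be produced from Python. This is only a sketch: the function name write_long_file and its parameters are hypothetical, and the hash counts (64, 32, 16) are assumed to mirror the printf lines above.

```python
from pathlib import Path

# Heading templates assumed to match the shell loops above.
H1 = "# " + "#" * 64 + " Test H1 {i}\n"
H2 = "# " + "#" * 32 + " Test H2 {i}\n"
H3 = "# " + "#" * 16 + " Test H3 {i}\n"


def write_long_file(
    path: Path, count: int = 100_000, complex_toc: bool = False
) -> None:
    """Write `count` groups of test headings to `path`, overwriting it."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The complex variant nests two H3 headings between two H2 headings.
    templates = [H1, H2, H3, H3, H2] if complex_toc else [H1, H2]
    with path.open("w", encoding="utf-8") as handle:
        for i in range(1, count + 1):
            for template in templates:
                handle.write(template.format(i=i))
```

Calling write_long_file(Path("tests/output/longfile.txt")) would correspond to the first loop, and passing complex_toc=True with the "longfile-complex.txt" path to the second one.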
Multiple files
To generate multiple small files (to be processed with toc/cli.py -l tests/output/multi/_list.txt):
mkdir -p tests/output/multi
rm -f tests/output/multi/_list.txt
for i in {1..10000}; do
file=tests/output/multi/$i.txt
{
printf "# ################################################################ Test H1 $i\n"
printf "# ################################ Test H2 $i\n"
} > "$file"
printf '%s\n' "$file" >> tests/output/multi/_list.txt
done
More complex toc:
mkdir -p tests/output/multi
rm -f tests/output/multi/_list-complex.txt
for i in {1..10000}; do
file=tests/output/multi/$i-complex.txt
{
printf "# ################################################################ Test H1 $i\n"
printf "# ################################ Test H2 $i\n"
printf "# ################ Test H3 $i\n"
printf "# ################ Test H3 $i\n"
printf "# ################################ Test H2 $i\n"
} > "$file"
printf '%s\n' "$file" >> tests/output/multi/_list-complex.txt
done
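For the multi-file case, the part worth getting right is the list file, which holds one path per line. A possible Python counterpart of the simpler loop is sketched below; write_multi_files is a hypothetical name, and the heading lines are assumed to match the shell version.

```python
from pathlib import Path

# Heading lines assumed to mirror the simpler shell loop (64 and 32 hashes).
HEADINGS = (
    "# " + "#" * 64 + " Test H1 {i}\n",
    "# " + "#" * 32 + " Test H2 {i}\n",
)


def write_multi_files(folder: Path, count: int = 10_000) -> Path:
    """Create `count` small files in `folder` and return the list file path."""
    folder.mkdir(parents=True, exist_ok=True)
    list_path = folder / "_list.txt"
    with list_path.open("w", encoding="utf-8") as listing:
        for i in range(1, count + 1):
            file_path = folder / f"{i}.txt"
            file_path.write_text(
                "".join(line.format(i=i) for line in HEADINGS),
                encoding="utf-8",
            )
            # One path per line, as in the shell loop.
            listing.write(f"{file_path}\n")
    return list_path
```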
Profiling
We can use a variety of profiling tools to understand the impact of functions on performance. For these tools to work, one line needs to be modified in "cli.py":
#from toc.toc import Toc
from toc import Toc
Function-level time
The built-in cProfile Python module allows for function-level profiling, and its output can be explored with snakeviz.
pip install snakeviz
python -m cProfile -o "tests/output/prof_cprofile.prof" toc/cli.py "tests/output/longfile.txt"
snakeviz "tests/output/prof_cprofile.prof"
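cProfile can also be driven from Python when only one call needs to be measured. The sketch below profiles a hypothetical workload, slow_join, which merely stands in for a call into toc; both function names are assumptions made for the example.

```python
import cProfile
import io
import pstats


def slow_join(count: int) -> str:
    """Hypothetical workload standing in for a call into toc."""
    text = ""
    for i in range(count):
        text += f"# Test H1 {i}\n"
    return text


def profile_call(count: int = 10_000, top: int = 5) -> str:
    """Profile slow_join and return the `top` entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    slow_join(count)
    profiler.disable()
    # Collect the report in memory instead of printing it directly.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


print(profile_call())
```

To inspect the result in snakeviz instead of reading the text report, profiler.dump_stats() can write the same data to a ".prof" file.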
Line-level time
pprofile allows for line-level profiling, and its output can be explored with a "callgrind" reader like qcachegrind.
pip install pprofile
pprofile --exclude-syspath -f callgrind -o "tests/output/prof_callgrind.prof" toc/cli.py "tests/output/longfile.txt"
qcachegrind "tests/output/prof_callgrind.prof"
Memory allocation
memray shows how much memory is allocated during code execution.
pip install memray
memray run -o "tests/output/prof_memray.bin" toc/cli.py "tests/output/longfile.txt"
memray summary "tests/output/prof_memray.bin"
memray tree "tests/output/prof_memray.bin"
memray flamegraph "tests/output/prof_memray.bin"
firefox "tests/output/memray-flamegraph-prof_memray.html"
rm "tests/output/prof_memray.bin" "tests/output/memray-flamegraph-prof_memray.html"
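For a quick check without installing anything, the standard library's tracemalloc module gives a rough peak figure. This is a different tool from memray, not its API, and it only tracks allocations made through Python. The functions build_lines and peak_memory_bytes are hypothetical stand-ins for real toc code.

```python
import tracemalloc
from typing import List


def build_lines(count: int) -> List[str]:
    """Hypothetical workload: build a list of heading lines in memory."""
    return [f"# Test H1 {i}" for i in range(count)]


def peak_memory_bytes(count: int = 100_000) -> int:
    """Return the peak number of bytes traced while build_lines runs."""
    tracemalloc.start()
    lines = build_lines(count)
    _current, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del lines
    return peak


print(f"peak: {peak_memory_bytes()} bytes")
```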
Launching without installing
If you really need to launch toc without installing it first, you can run
python -m toc.cli --version
python -m toc.cli -f "toc/toc.py"
Release
Steps for a new release:
- Update version in "pyproject.toml"
- Save changes with git commit
- Add a temporary tag with git tag v2.6.0 and rewrite the tag name
- Update the changelog with git-cliff -c pyproject.toml > CHANGELOG.md
- Run toc -lf .tocfiles
- Remove tag with git tag --delete v2.6.0
- Add changelog changes with git add CHANGELOG.md && git commit -m "minor: updated CHANGELOG.md"
- Move tag to the new commit with git tag -fa v2.6.0
- Upload the new commits and tags with git push --follow-tags
- Update AUR version once the new PyPI version is online
In case a tag has been pushed to GitHub but the release failed, run git push --delete origin v2.6.0 and repeat the steps above.