Contributing

March 17, 2024

Suggestions

If you have feedback, a suggestion or you simply found a bug, feel free to submit an issue.

Contributions

If you would like to submit a pull request, consider the following:

do not sacrifice readability for performance: toc is already fast enough for its use case
add comments to explain the complex parts of your function
add a dedicated unit test if you implement a new function
make sure that all tests pass before committing
contributions are licensed according to the open-source LICENSE
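
As an illustration of the "dedicated unit test" point above, here is a minimal sketch built on the standard unittest module. The helper `count_headings` is hypothetical and is not part of toc; it only stands in for a newly added function.

```python
import unittest


def count_headings(lines: list[str], marker: str = "#") -> int:
    """Hypothetical helper (not part of toc): count lines that start with the marker and a space."""
    return sum(1 for line in lines if line.startswith(marker + " "))


class TestCountHeadings(unittest.TestCase):
    """One dedicated test case per new function keeps failures easy to locate."""

    def test_counts_only_marked_lines(self) -> None:
        lines = ["# Title", "plain text", "# Section"]
        self.assertEqual(count_headings(lines), 2)

    def test_empty_input(self) -> None:
        self.assertEqual(count_headings([]), 0)
```

Saved in a test module, a case like this would be picked up by python -m unittest.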

Tools

Here is a list of tools that you can use to improve code quality.

Type checking

Assigning a specific type (str, list, etc.) to variables and function outputs makes the code less prone to errors. Type checking ensures that variables are used consistently and that functions receive the expected input types.

pip install mypy
mypy "toc/"
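
To show what kind of mistake a type checker catches, here is a small sketch. The function `format_entry` is a hypothetical example and is not taken from toc.

```python
def format_entry(title: str, level: int) -> str:
    """Hypothetical helper (not part of toc): indent a title according to its heading level."""
    return "   " * (level - 1) + title


# The annotation documents the expected type of the variable.
entry: str = format_entry("Introduction", 2)

# mypy would flag the call below before the code runs, because "2" is a str
# and not an int; without type checking it only fails at runtime (TypeError).
# format_entry("Introduction", "2")
```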

Code linting

Linting checks that the code conforms to the PEP 8 (pepotto) style. black reformats the files in place.

pip install black
black .

Test coverage

High test coverage exercises more of the code, so unexpected behaviors (i.e. bugs) are more likely to be caught.

pip install coverage
coverage run -m unittest
coverage html
firefox "htmlcov/index.html"

Virtual environment

Before running benchmarks, it is suggested to install toc in a virtual environment. Once the environment is activated, "(venv)" is prepended to the shell prompt to indicate that every python/pip operation runs inside the "venv/" folder, without impacting the system.

python -m venv "venv/"
source venv/bin/activate
pip install -e .

Benchmarks

Running the code against a heavy workload amplifies the effect of inefficient sections in profiling operations.

Large files

To generate a single large file (then run toc/cli.py tests/output/longfile.txt on it):

rm -f tests/output/longfile.txt
for i in {1..100000}; do
 {
  printf "# ################################################################ Test H1 $i\n"
  printf "# ################################ Test H2 $i\n"
 } >> tests/output/longfile.txt
done

More complex toc:

rm -f tests/output/longfile-complex.txt
for i in {1..100000}; do
 {
  printf "# ################################################################ Test H1 $i\n"
  printf "# ################################ Test H2 $i\n"
  printf "# ################ Test H3 $i\n"
  printf "# ################ Test H3 $i\n"
  printf "# ################################ Test H2 $i\n"
 } >> tests/output/longfile-complex.txt
done
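
For shells without brace expansion, the loops above could be approximated in Python. This is only a sketch: the function name `write_benchmark_file` is hypothetical, and the 64/32/16 marker widths are assumed to match the printf lines above.

```python
from pathlib import Path

# Heading templates, assumed to mirror the printf lines of the shell loops.
H1 = "# " + "#" * 64 + " Test H1 {i}\n"
H2 = "# " + "#" * 32 + " Test H2 {i}\n"
H3 = "# " + "#" * 16 + " Test H3 {i}\n"


def write_benchmark_file(path: Path, count: int, complex_toc: bool = False) -> int:
    """Write `count` heading groups to `path` and return the number of lines written."""
    templates = [H1, H2, H3, H3, H2] if complex_toc else [H1, H2]
    path.parent.mkdir(parents=True, exist_ok=True)
    # Mode "w" truncates any previous file, like the "rm -f" in the shell version.
    with path.open("w", encoding="utf-8") as handle:
        for i in range(1, count + 1):
            for template in templates:
                handle.write(template.format(i=i))
    return count * len(templates)
```

Calling write_benchmark_file(Path("tests/output/longfile.txt"), 100_000) would correspond to the first loop, and passing complex_toc=True with the longfile-complex.txt path to the second.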

Multiple files

To generate multiple small files (then run toc/cli.py -l tests/output/multi/_list.txt on them):

mkdir -p tests/output/multi
rm -f tests/output/multi/_list.txt
for i in {1..10000}; do
 file=tests/output/multi/$i.txt
 {
  printf "# ################################################################ Test H1 $i\n"
  printf "# ################################ Test H2 $i\n"
 } > "$file"
 printf '%s\n' "$file" >> tests/output/multi/_list.txt
done

More complex toc:

mkdir -p tests/output/multi
rm -f tests/output/multi/_list-complex.txt
for i in {1..10000}; do
 file=tests/output/multi/$i-complex.txt
 {
  printf "# ################################################################ Test H1 $i\n"
  printf "# ################################ Test H2 $i\n"
  printf "# ################ Test H3 $i\n"
  printf "# ################ Test H3 $i\n"
  printf "# ################################ Test H2 $i\n"
 } > "$file"
 printf '%s\n' "$file" >> tests/output/multi/_list-complex.txt
done
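
The multi-file setup can also be sketched in Python. The function name `write_benchmark_files` is hypothetical, and the sketch only covers the simple (H1/H2) variant, with the 64/32 marker widths assumed from the loops above.

```python
from pathlib import Path


def write_benchmark_files(folder: Path, count: int) -> Path:
    """Create `count` small files in `folder` plus a "_list.txt" index; return the index path."""
    folder.mkdir(parents=True, exist_ok=True)
    list_path = folder / "_list.txt"
    with list_path.open("w", encoding="utf-8") as index:
        for i in range(1, count + 1):
            file_path = folder / f"{i}.txt"
            file_path.write_text(
                f"# {'#' * 64} Test H1 {i}\n# {'#' * 32} Test H2 {i}\n",
                encoding="utf-8",
            )
            # One path per line, as in the shell version.
            index.write(f"{file_path}\n")
    return list_path
```

write_benchmark_files(Path("tests/output/multi"), 10_000) would then produce the folder and list file expected by the -l option shown above.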

Profiling

We can use a variety of profiling tools to understand the impact of functions on performance. For these tools to work, one line needs to be modified in "cli.py":

#from toc.toc import Toc
from toc import Toc

Function-level time

The built-in cProfile Python module allows for function-level profiling, and its output can be explored with snakeviz.

pip install snakeviz
python -m cProfile -o "tests/output/prof_cprofile.prof" toc/cli.py "tests/output/longfile.txt"
snakeviz "tests/output/prof_cprofile.prof"
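
When a browser is not available, a .prof file can also be summarized in the terminal with the standard pstats module. The sketch below profiles a hypothetical `busy_work` function (a stand-in for a toc run) so that it is self-contained; `top_functions` is likewise an illustrative name.

```python
import cProfile
import io
import pstats
import tempfile
from pathlib import Path


def busy_work(n: int) -> int:
    """Hypothetical workload standing in for a toc run."""
    return sum(i * i for i in range(n))


def top_functions(prof_path: str, limit: int = 5) -> str:
    """Return a text report of the `limit` costliest calls, sorted by cumulative time."""
    buffer = io.StringIO()
    stats = pstats.Stats(prof_path, stream=buffer)
    stats.strip_dirs().sort_stats("cumulative").print_stats(limit)
    return buffer.getvalue()


with tempfile.TemporaryDirectory() as tmp:
    prof_file = str(Path(tmp) / "example.prof")
    profiler = cProfile.Profile()
    profiler.enable()
    busy_work(100_000)
    profiler.disable()
    profiler.dump_stats(prof_file)  # same file format as "python -m cProfile -o"
    report = top_functions(prof_file)

print(report)
```

For the file produced by the command above, top_functions("tests/output/prof_cprofile.prof") should give a comparable report.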

Line-level time

pprofile allows for line-level profiling, and its output can be explored with a "callgrind" reader like qcachegrind.

pip install pprofile
pprofile --exclude-syspath -f callgrind -o "tests/output/prof_callgrind.prof" toc/cli.py "tests/output/longfile.txt"
qcachegrind "tests/output/prof_callgrind.prof"

Memory allocation

memray shows how much memory is allocated during code execution.

pip install memray
memray run -o "tests/output/prof_memray.bin" toc/cli.py "tests/output/longfile.txt"
memray summary "tests/output/prof_memray.bin"
memray tree "tests/output/prof_memray.bin"
memray flamegraph "tests/output/prof_memray.bin"
firefox "tests/output/memray-flamegraph-prof_memray.html"
rm "tests/output/prof_memray.bin" "tests/output/memray-flamegraph-prof_memray.html"

Launching without installing

If you really need to launch toc without installing it first, you can run:

python -m toc.cli --version
python -m toc.cli -f "toc/toc.py"

Release

Steps for a new release:

Update version in "pyproject.toml"
Save changes with git commit
Add a temporary tag with git tag v2.6.0 (replace v2.6.0 with the new version number)
Update the changelog with git-cliff -c pyproject.toml > CHANGELOG.md
Run toc -lf .tocfiles
Remove tag with git tag --delete v2.6.0
Add changelog changes with git add CHANGELOG.md && git commit -m "minor: updated CHANGELOG.md"
Move tag to the new commit with git tag -fa v2.6.0
Upload the new commits and tags with git push --follow-tags
Update AUR version once the new PyPI version is online

If a tag has been pushed to GitHub but the release failed, run git push --delete origin v2.6.0 and repeat the steps above.