README.org

November 3, 2024 · View on GitHub

#+TITLE: Clog: Declarative Program Analysis for C

Clog is a Datalog language extension that enables analysis of C programs inside Datalog.

#+BEGIN_SRC metadl FunctionWithGoto(func,g,l):<:func, g, l) :- <: type func(..)..:>,@func(..) { .. } :>, @func g <: goto label1;:>,@label1; :>, @func l <: label2:label2 : substmt :>, label1==label1 == label2.

BackwardGoto(g, l) :- g <: goto label1;:>,l<:label1; :>, l <: label2 : substmt:>,substmt :>, label1 == $label2, c_src_line_start(g) > c_src_line_start(l).

LabeledReturn(l) :- l <: label:returnlabel : return v; :>.

LabeledReturn(l) :- l <: $label : return ; :>.

WarnBackwardGoto(file, line) :- BackwardGoto(g, _), file = io_relative_path(c_src_file(g)), line = c_src_line_start(g).

WarnMoreThanOneLabel(file, line) :- FunctionWithGoto(f, _, l1), FunctionWithGoto(f, _, l2), l1 != l2, line = c_src_line_start(l1), line >= c_src_line_start(l2), file = io_relative_path(c_src_file(f)).

WarnLabeledReturn(file, line) :- LabeledReturn(l), file = io_relative_path(c_src_file(l)), line = c_src_line_start(l).

OUTPUT('WarnBackwardGoto, "WarnBackwardGoto.csv", "csv"). OUTPUT('WarnMoreThanOneLabel, "WarnMoreThanOneLabel.csv", "csv"). OUTPUT('WarnLabeledReturn, "WarnLabeledReturn.csv", "csv"). #+END_SRC

Ensure that the souffle executable is available in your $PATH.

For the hybrid and incremental evaluation modes, a version a modified version of Souffle is needed. This version exposes an extended JNI interface and uses a different format for SQLite input and output.

** Building Soufflé (optional)

Soufflé can be used for some execution modes, though by default Clog does not use it.

To build, you will need the following:

  • a complete build system for C, including the GNU autotools
  • mcpp
  • swig

Clone the repository #+BEGIN_SRC git clone https://github.com/creichen/souffle.git -b javadl #+END_SRC When building, make sure to enable 64-bit domains and SWIG (needed only for the hybrid mode): #+BEGIN_SRC cd souffle ./bootstrap ./configure --enable-swig --enable-64bit-domain make #+END_SRC

Environment variables (relative to clog project root):

  • PATH must contain ./souffle/src/, unless you install souffle globally (sudo make install)

** C support (required for Clog) For analyzing C, Clog uses the Clang front-end for pattern matching. We have built a Clang library to interface with Clog. Run the following commands to fetch and build a modified version of the LLVM repository:

Install the following dependencies:

  • swig4.0

#+BEGIN_SRC git clone git@github.com:alexdura/llvm-project.git -b clog #+END_SRC To build the library, run the following commands: #+BEGIN_SRC mkdir llvm-project/build-release cd llvm-project/build-release cmake -G Ninja -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS="clang;clang-tools-extra" -DBUILD_SHARED_LIBS=ON -DLLVM_APPEND_VC_REV=OFF ../llvm ninja #+END_SRC

Environment variables (relative to clog project root):

  • PATH must contain ./llvm-project/build-release/bin/
  • LD_LIBRARY_PATH must contain ./llvm-project/build-release/lib/

** Docker image (if you want to run in Podman or Docker) We provide a Docker image which contains Clog and evaluation scripts, packaged together with all the required dependencies. #+BEGIN_SRC cd docker ./build.sh #+END_SRC then run the image using: #+BEGIN_SRC docker run -it clog23:cc24 #+END_SRC

** Clog itself

Java versions that should work:

  • Java 11

#+BEGIN_SRC ./gradlew clean jar ./gradlew test #+END_SRC

*** Troubleshooting If any Clog tests fail, try the following tests to narrow down the problem. You can run an individual test (e.g., lang.CEvaluationTest) as follows:

#+BEGIN_SRC ./gradlew test --info --tests lang.CEvaluationTest #+END_SRC

  • Cannot find symbol: var: Make sure to use Java 11
  • Could not create an instance of type org.gradle.initialization.DefaultSettings_Decorated: Make sure to use Java 11
  • lang.CEvaluationTest fails with UnsatisfiedLinkError: See UnsatisfiedLinkError below
  • clang.ASTImporterTest fails with IOException: Cannot run program "clang": See "Missing clang" below
  • lang.EvaluationTest fails with IOException: Cannot run program "souffle": See "Missing Soufflé" below (not needed to run Clog with internal evaluation)

*** Troubleshooting: UnsatisfiedLinkError #+BEGIN_SRC ./gradlew test --info --tests lang.CEvaluationTest #+END_SRC

If you get the error java.lang.UnsatisfiedLinkError: no clangClogSWIG in ..., make sure that libclangClogSWIG.so is in your LD_LIBRARY_PATH. If you followed the "C support" steps above and are using bash, you might e.g. run:

#+BEGIN_SRC export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:pwd/llvm-project/build-release/lib/ #+END_SRC

*** Troubleshooting: Missing Soufflé

#+BEGIN_SRC ./gradlew test --info --tests lang.EvaluationTest #+END_SRC

If you get the error java.io.IOException: Cannot run program "souffle": error=2, No such file or directory, then the souffle binary is not in your PATH. Make sure it is visible there (i.e., you should be able to run souffle on the shell and see a help message). If you followed the "Building soufflé" steps above and are using bash, you might e.g. run:

#+BEGIN_SRC export PATH=${PATH}:pwd/souffle/src #+END_SRC

*** Troubleshooting: Missing Clang

#+BEGIN_SRC ./gradlew test --info --tests clang.ASTImporterTest #+END_SRC

If you get the error java.io.IOException: Cannot run program "clang": error=2, No such file or directory, then the clang binary is not in your PATH. Make sure it is visible there (i.e., you should be able to run clang --help on the shell and see a help message). If you followed the "C support" steps above and are using bash, you might e.g. run:

#+BEGIN_SRC export PATH=${PATH}:pwd/llvm-project/build-release/bin/ #+END_SRC

  • Running Clog supports multiple running modes, which represent a trade-off between speed and external dependencies. ** Internal evaluation Uses the internal semi-naive evaluator and it is reasonably fast when the number of tuples is small (< 1 million). #+BEGIN_SRC java -jar compiler.jar --eval metadl program.mdl -F fact_dir -D output_dir #+END_SRC ** Internal parallel evaluation Depending on the shape of the strata dependency is graph, the parallel evaluator may speed things up. #+BEGIN_SRC java -jar compiler.jar --eval metadl-par program.mdl -F fact_dir -D output_dir #+END_SRC

  • Language description ** Datalog [[https://en.wikipedia.org/wiki/Datalog][Datalog]] is a declarative query language, with roots in logic programming. Relations between tables are expressed as Horn clauses. Clog extends Datalog with syntactic patterns and associates side-effects to the following predicates EDB and OUTPUT. The order of evaluation is as follows:

  1. All predicates the EDB predicate depends upon are evaluated. For all tuples ('P, "file") in the the EDB relation, the file is read as a CSV and its tuples are added to the relation P.
  2. Fixpoint evaluation.
  3. For all values ('P) in the OUTPUT relation, the contents of relation P are written out to a file P.csv.

Additional Supported features:

  • Stratified negation !P(x1,...,xn)
  • Filtering expr1 < expr2, expr1 > expr2
  • Object creation v = expr binds a variable to the result of an expression
  • Arithmetic expressions ( +, -, *, /) and string concatenation (cat)
  • Monomorphic type inference

** Metalanguage description *** Syntactic patterns Patterns are a mechanism to match rules and bind metavariables to terms, expressions and predicate symbols.

*** Bounded patterns The root node of a pattern can be accessed by using a bounded pattern p<:p <:x + $y:>.

*** Gaps Datalog ... and C or Java ..

Gaps express missing elements inside a list.

*** Metavariables Datalog: x , x~, ~p or Java: c, i

Variables used inside analyze blocks to connect patterns with other literals in the rule

  • Terms: p(x,x, y)
  • Predicates: $p(x, y)
  • Arithmetic expressions: x+x + y
  • Index metavariables p(..., i:i:v, ...)
  • License This repository is covered by a BSD 2-clause license, see [[./LICENSE][LICENSE]].

  • Debugging The following commands are useful when debugging Clog:

  • Pretty print the desugared program in Clog format java -jar compiler.jar --pretty-print metadl program.mdl
  • Pretty print the desugared program in Souffle format java -jar compiler.jar --pretty-print metadl program.mdl
  • Enable internal debug printouts by setting METADL_LOG=debug|time|info in the environment.

** JastAdd [[http://jastadd.org/web/][JastAdd]] is a meta-compilation system that supports Reference Attribute Grammars (RAGs). It uses the parser generated from Beaver. In addition it takes an abstract grammar description file as input. The abstract grammar description is used to generate the classes that represent the pattern AST.