README.org
July 19, 2019
* Confluence to ElasticSearch
A collection of business processes used to mine Confluence, discover information, intelligently index it, and expose all of it as a lightning-fast search.
*** Prerequisites
- Git (of course).
- JRE 8 must exist in PATH (JRE 7 is fine too, but might be deprecated).
- Boot 2.x: installation instructions are available [[https://github.com/boot-clj/boot#install][here]].
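A quick way to confirm that the prerequisites are on the PATH is sketched below. The helper name ~missing_tools~ is hypothetical and not part of this repo. It only checks that the commands exist; it does not verify the JRE or Boot version.

#+BEGIN_SRC shell
# Print each named command that cannot be found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Report anything missing among the prerequisites listed above.
missing=$(missing_tools git java boot)
if [ -n "$missing" ]; then
  echo "Missing from PATH:" $missing >&2
fi
#+END_SRC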
*** Configuration
Three files need to be configured properly: the standard ~resources/log4j.properties~, plus the SeazMe-specific ~config.edn~ and ~mapping.edn~.
*** Usage
After making sure that the software from Prerequisites is installed, clone this repo and bring up a REPL with the ~boot repl~ command (the first run might take a few minutes while all dependencies are downloaded).
Examples:
#+BEGIN_EXAMPLE
export BOOT_JVM_OPTIONS="-Xmx12g -XX:+UseSerialGC"
./
./
./
./
./
#+END_EXAMPLE
Where the executable is either sources or ~java -jar sources-~, built with build. Add ~-Dlog4j.configuration=file:resources/log4j.properties~ to the ~java~ command if logging is desired.
Once the above has completed (it should take a few minutes), a ~db~ subdirectory is created. It contains a copy of Confluence as well as the other data sources.
*** crontab example
runme:
#+BEGIN_SRC
#!/bin/bash
cd ${0%/*}
java -Dlog4j.configuration=file:log4j.properties -jar sources.jar -a update -c context-12h
#+END_SRC
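The ~cd ${0%/*}~ line in ~runme~ changes into the directory the script was invoked from, so that ~log4j.properties~ and ~sources.jar~ resolve relative to it. A minimal sketch of what that parameter expansion does; ~dir_of~ is a hypothetical helper and the paths in the comments are made up for illustration.

#+BEGIN_SRC shell
# ${1%/*} removes the shortest suffix matching "/*",
# i.e. the last slash and everything after it.
dir_of() {
  printf '%s\n' "${1%/*}"
}

# dir_of /home/user/seazme-sources/runme  gives /home/user/seazme-sources
# dir_of ./runme                          gives .
# dir_of runme                            gives runme (no slash, so nothing is removed)
#+END_SRC

The last case suggests that ~runme~ should be invoked with a path containing a slash, as the crontab entry does; otherwise ~cd~ would be handed the script name itself.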
#+BEGIN_SRC
SHELL=/bin/bash
PATH=/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin

50 * * * * /usr/bin/flock -n /tmp/sources.lockfile seazme-sources/runme &>> runme.log
#+END_SRC
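The ~flock -n~ in the crontab entry keeps runs from overlapping: if a previous run still holds the lock file, the new one fails immediately instead of waiting. A minimal sketch of that behaviour, assuming the util-linux ~flock~ is installed; ~run_exclusive~ is a hypothetical helper, not part of this repo.

#+BEGIN_SRC shell
# Run a command only if no other process holds the lock file.
# With -n, flock does not wait: it exits with status 1 when the lock is taken.
run_exclusive() {
  lockfile=$1
  shift
  flock -n "$lockfile" "$@"
}
#+END_SRC

For example, ~run_exclusive /tmp/sources.lockfile seazme-sources/runme~ mirrors the crontab line above.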