Webis UUID Tool
June 23, 2020 ยท View on GitHub
Generator for version 5 (name-based SHA1) UUIDs to identify records in generated web corpus MapFiles.
Building the Source Code
Run
./gradlew build
inside the source directory. The generated JAR file will be in jar/webis-uuid.jar.
Example Usage
Command-line usage:
java -jar jar/webis-uuid.jar clueweb12 clueweb12-0200wb-93-16911
API usage:
import de.webis.WebisUUID;
// ...
System.out.println(WebisUUID.generateUUID("clueweb12", "clueweb12-0200wb-93-16911"));
Result: 7f476110-58fd-5698-b104-8b29c3ac6d55.
Other Languages
The Python standard library comes with UUID5 support out of the box and does not need this utility. The UUID from the example above can be generated in Python with
import uuid
uuid.uuid5(uuid.NAMESPACE_URL, "clueweb12:clueweb12-0200wb-93-16911")