Installation

This document covers the steps for downloading and setting up the PHGv2 package.

Quick start

Run on a Unix-based operating system (Windows not currently tested)
Make sure you have Java 17 or higher
Make sure you have Miniconda installed

Make sure you have the libmamba solver installed:

conda update -n base conda
conda install -n base conda-libmamba-solver
conda config --set solver libmamba

Download the latest release of the PHGv2 package:

curl -s https://api.github.com/repos/maize-genetics/phg_v2/releases/latest \
| awk -F': ' '/browser_download_url/ && /\.tar/ {gsub(/"/, "", $(NF)); system("curl -LO " $(NF))}'

Untar the package:
```
tar -xvf <PHGv2_Release>.tar
```
Navigate into the uncompressed PHGv2 package:
```
cd phg/bin
```
Invoke PHGv2 through the phg wrapper script:
```
./phg --version
```
Basic syntax is:
```
phg [<options>] <command> [<args>]...
```

Requirements

PHGv2 requires two basic software components: a Unix-based operating system and Java version 17 or higher. PHGv2 also relies on external programs for alignment and storage, including AnchorWave and TileDB-VCF. To manage these dependencies, we strongly recommend using the Conda environment management system, in particular Miniconda, a lightweight installer for Conda.
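The requirements above can be checked before going further. The following is a minimal sketch, not part of PHGv2 itself: the helper names `java_major` and `check_java` are hypothetical, and the parsing assumes the usual `java -version` output, where the version string appears in double quotes.

```shell
# Extract the major version from a Java version string.
# Handles the legacy "1.8.0_292" scheme and the modern "17.0.2" scheme.
java_major() {
    v="$1"
    case "$v" in
        1.*) v="${v#1.}" ;;          # legacy scheme: drop the leading "1."
    esac
    printf '%s\n' "${v%%[!0-9]*}"    # keep only the leading digits
}

# Report whether the java found on PATH is version 17 or higher.
check_java() {
    if ! command -v java >/dev/null 2>&1; then
        echo "java not found on PATH"
        return 1
    fi
    # java prints its version banner to stderr
    raw=$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
    major=$(java_major "$raw")
    if [ "${major:-0}" -ge 17 ]; then
        echo "Java $raw is new enough"
    else
        echo "Java $raw is too old (17 or higher is required)"
        return 1
    fi
}

check_java || echo "Install or upgrade Java before continuing."

# Conda is recommended rather than strictly required, so only report on it.
if command -v conda >/dev/null 2>&1; then
    echo "conda found: $(conda --version)"
else
    echo "conda not found on PATH"
fi
```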

Note

PHGv2 has currently been tested on Fedora- and Debian-derived Unix systems.

Note

AnchorWave is currently not supported on Windows. See AnchorWave documentation for further details.

Get PHGv2

You can download the latest version of PHGv2 from the releases page of the maize-genetics/phg_v2 GitHub repository. These instructions assume that you download PHGv2 locally and run the program directly. Obtain the .tar file manually from the releases page, or use the following curl and awk commands to retrieve the latest release:

curl -s https://api.github.com/repos/maize-genetics/phg_v2/releases/latest \
| awk -F': ' '/browser_download_url/ && /\.tar/ {gsub(/"/, "", $(NF)); system("curl -LO " $(NF))}'
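To see which file the pipeline above would fetch before actually downloading it, the awk step can print the URL instead of passing it to curl. This is a sketch under assumptions: the function name `extract_tar_urls` is hypothetical, and the sample JSON (including its example.org URL) is made up to mimic the shape of the GitHub release response.

```shell
# Print the .tar asset URL(s) found in GitHub release JSON read from stdin.
# Same matching logic as the download command, but with print instead of curl.
extract_tar_urls() {
    awk -F': ' '/browser_download_url/ && /\.tar/ {gsub(/"/, "", $NF); print $NF}'
}

# Hypothetical sample input, used here so the sketch runs without network access.
sample_json='{
  "assets": [
    {
      "name": "PHGv2-example.tar",
      "browser_download_url": "https://example.org/downloads/PHGv2-example.tar"
    }
  ]
}'

printf '%s\n' "$sample_json" | extract_tar_urls
```

Against the real API, the same function can be placed after `curl -s https://api.github.com/repos/maize-genetics/phg_v2/releases/latest |` to preview the release URL.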

Once downloaded, untar the release using:

tar -xvf <PHGv2_release>.tar

...where <PHGv2_release>.tar is the downloaded PHGv2 package. After the package has been extracted, you can remove the original tar file using:

rm <PHGv2_release>.tar

"Installation"

No traditional installation is required, as the precompiled jar files are designed to function on any POSIX platform meeting the specified requirements. Just open the downloaded package and place the folder containing the jar files and launch script in a preferred directory on your hard drive or server filesystem.
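After moving the folder, it can be useful to confirm that the launch script and the jar files are still where the wrapper expects them. The sketch below assumes the layout implied by this document (a `phg` directory containing `bin/phg` and a `lib` subdirectory with the jar files); the function name `check_phg_layout` is hypothetical.

```shell
# Report whether a directory looks like an extracted PHGv2 package.
# $1: path to the extracted "phg" directory (assumed layout: bin/phg and lib/*.jar)
check_phg_layout() {
    root="$1"
    if [ ! -x "$root/bin/phg" ]; then
        echo "missing or non-executable wrapper: $root/bin/phg"
        return 1
    fi
    # At least one .jar file must be present in lib/
    if ! ls "$root"/lib/*.jar >/dev/null 2>&1; then
        echo "no .jar files found in: $root/lib"
        return 1
    fi
    echo "layout looks OK: $root"
}

check_phg_layout "./phg" || echo "Check that the package was extracted where you expect."
```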

To run PHGv2, you can change into the bin directory of the package and run the phg wrapper script from there. Another option is to add the directory containing the wrapper script to your PATH variable. If you are using the bash terminal shell, the classic syntax is:

export PATH="/path/to/phgv2-package/:$PATH"

...where /path/to/phgv2-package/ is the path to the location of the phg executable wrapper script.

Note

The above path example must be the path to the bin subdirectory found in the phg directory.

Note

The lib subdirectory containing the Java JAR files (.jar) must keep its original location relative to the phg script for it to work.

Note

The final / in your path is optional; the shell finds phg with or without it.
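Running the export line repeatedly (for example, from a shell startup file that is sourced more than once) adds the same entry to PATH each time. A minimal sketch of a guard against that is shown below; the function name `prepend_path` is hypothetical, and `/path/to/phgv2-package/` remains a placeholder for your own bin directory.

```shell
# Prepend a directory to PATH only if it is not already listed.
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present, nothing to do
        *) PATH="$1:$PATH" ;;
    esac
    export PATH
}

# Placeholder: replace with the path to the bin subdirectory of your phg directory.
PHG_BIN="/path/to/phgv2-package/"
prepend_path "$PHG_BIN"

# Confirm whether the shell can now resolve the wrapper script.
if command -v phg >/dev/null 2>&1; then
    echo "phg resolves to: $(command -v phg)"
else
    echo "phg is not on PATH yet; check the value of PHG_BIN"
fi
```

To make the change persist across bash sessions, the same lines can be added to `~/.bashrc`.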

Test that PHGv2 works

To test that you can successfully run the phg executable, run the following command:

./phg --help

Note

The ./ prefix assumes that you are within the bin subdirectory. If you have added phg to your PATH using the above example, you can run phg --help from any directory instead.

This should output summary text to the terminal including syntax help and a list of subcommands and descriptions.

Setting memory

The amount of data you wish to process will affect the amount of computational resources that you will need. Since PHGv2 leverages a Java virtual machine (JVM) for a majority of its tasks, you can manually alter the maximum amount of memory allocated to the JVM using the following command:

export JAVA_OPTS="-Xmx<memory_amount>"

...where <memory_amount> is an amount of memory followed by a unit suffix. For example, to allocate a maximum of 50 gigabytes (GB) of memory for your operations, use the input "-Xmx50g", where g stands for GB:

export JAVA_OPTS="-Xmx50g"
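Instead of hard-coding a value, the limit can be derived from the memory available on the machine. The following is a sketch under assumptions: it reads `MemTotal` from `/proc/meminfo` (Linux only), the 80% share is an arbitrary illustrative choice rather than a PHGv2 recommendation, and the function name `xmx_for_kib` is hypothetical. The `m` suffix (megabytes) is used so that the result stays a whole number.

```shell
# Build an -Xmx option from a memory total and a percentage.
# $1: total memory in KiB, $2: percentage of it to allow the JVM
xmx_for_kib() {
    echo "-Xmx$(( $1 * $2 / 100 / 1024 ))m"
}

# /proc/meminfo reports MemTotal in kB on Linux systems.
if [ -r /proc/meminfo ]; then
    total_kib=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    JAVA_OPTS=$(xmx_for_kib "$total_kib" 80)   # 80% is an illustrative share
    export JAVA_OPTS
    echo "JAVA_OPTS=$JAVA_OPTS"
else
    echo "/proc/meminfo not available; set JAVA_OPTS manually"
fi
```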

Note

For the memory setting to take effect, you must set it before running any of the PHGv2 commands.

Note

Setting JVM memory will only affect JVM-intensive commands. Since PHGv2 utilizes several external pieces of software, several commands will not be affected by this. Currently, these are:

setup-environment
initdb
align-assemblies
agc-compress

...which rely on conda, TileDB, AnchorWave, and AGC, respectively.
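The mapping above can be written down as a small lookup, for example to remind yourself in a pipeline script when JAVA_OPTS will not matter. This is a sketch only: the function name `external_tool_for` is hypothetical, the mapping is copied from the list above, and `some-other-command` is a made-up name standing in for any command not on that list.

```shell
# Print the external tool a PHGv2 command relies on, per the list above.
# Returns non-zero for commands that are not on that list.
external_tool_for() {
    case "$1" in
        setup-environment) echo "conda" ;;
        initdb)            echo "TileDB" ;;
        align-assemblies)  echo "AnchorWave" ;;
        agc-compress)      echo "AGC" ;;
        *)                 return 1 ;;
    esac
}

for cmd in initdb align-assemblies some-other-command; do
    if tool=$(external_tool_for "$cmd"); then
        echo "$cmd: relies on $tool, so JAVA_OPTS does not affect it"
    else
        echo "$cmd: not on the list of externally backed commands"
    fi
done
```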