Installation
This document describes how to download and set up the PHGv2 package.
Quick start
- Run on a Unix-based operating system (Windows not currently tested)
- Make sure you have Java 17 or higher
- Make sure you have Miniconda installed
- Make sure you have the libmamba solver installed
- Download the latest release of the PHGv2 package
- Untar the package
- Navigate into the uncompressed PHGv2 package
- Invoke PHGv2 through the phg wrapper script
Requirements
PHGv2 requires two basic software components: a Unix-based operating system and Java version 17 or higher. PHGv2 also relies on external programs for alignment and storage, including AnchorWave and TileDB-VCF. To manage these dependencies, we strongly recommend using the Conda environment management system, specifically Miniconda, the minimal installer for Conda.
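The requirements above can be checked from a terminal before going further. The following is a minimal sketch, not part of the PHGv2 package: the helper name java_major is hypothetical, and the script only assumes that java and conda, if installed, are on your PATH. The solver query relies on conda's own `conda config --show` command, which reports the solver key on recent conda releases.

```shell
# Hypothetical helper: extract the major version from a Java version string,
# handling both "17.0.9" and the legacy "1.8.0_292" numbering scheme.
java_major() {
  v="$1"
  case "$v" in
    1.*) v="${v#1.}" ;;      # legacy scheme: 1.8.0_292 -> 8.0_292
  esac
  echo "${v%%[!0-9]*}"       # keep only the leading digits
}

if command -v java >/dev/null 2>&1; then
  # "java -version" writes to stderr; the version is the first quoted string
  ver=$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
  if [ "$(java_major "$ver")" -ge 17 ] 2>/dev/null; then
    echo "Java $ver: OK"
  else
    echo "Java $ver: version 17 or higher is required"
  fi
else
  echo "java not found on PATH"
fi

if command -v conda >/dev/null 2>&1; then
  conda --version
  # Report the configured solver; older conda releases may not know this key
  conda config --show solver || true
else
  echo "conda not found on PATH (install Miniconda first)"
fi
```

The script only reports what it finds and changes nothing on the system.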
Note
PHGv2 has so far been tested on Fedora- and Debian-derived systems.
Note
AnchorWave is currently not supported on Windows. See AnchorWave documentation for further details.
Get PHGv2
You can download the latest version of PHGv2 here.
These instructions assume that you have downloaded PHGv2 locally and will run the program directly. Obtain the .tar file manually from the provided link, or use the following curl and awk commands to retrieve the latest release:
curl -s https://api.github.com/repos/maize-genetics/phg_v2/releases/latest \
| awk -F': ' '/browser_download_url/ && /\.tar/ {gsub(/"/, "", $(NF)); system("curl -LO " $(NF))}'
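To see what the awk filter in the command above selects before anything is downloaded, the same pattern can be run with print in place of the system() call. This is an offline sketch: the sample line and its example.org URL are made-up placeholders standing in for one line of the API's JSON response.

```shell
# Hypothetical one-line excerpt of the releases JSON (placeholder URL)
sample='  "browser_download_url": "https://example.org/downloads/PHGv2-example.tar"'

# Same filter as the download command, but printing the URL instead of fetching it:
# split on ": ", keep lines naming a .tar asset, strip the quotes from the last field
url=$(printf '%s\n' "$sample" \
  | awk -F': ' '/browser_download_url/ && /\.tar/ {gsub(/"/, "", $(NF)); print $(NF)}')

echo "$url"
```

Piping the real curl output through this print variant lists the asset URL that the download command would fetch.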
Once downloaded, untar the release, where <PHGv2_release>.tar is the downloaded PHGv2 package. After the archive has been unpacked, the initial tar file can be removed.
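The unpack-and-remove step can be sketched with standard tar and rm. The function name unpack_release is hypothetical and only illustrates the order of operations; <PHGv2_release>.tar remains a placeholder for the file you actually downloaded.

```shell
# Hypothetical helper: unpack a release archive into the current directory,
# then delete the archive only if extraction succeeded.
unpack_release() {
  archive="$1"
  if [ ! -f "$archive" ]; then
    echo "Archive not found: $archive" >&2
    return 1
  fi
  tar -xvf "$archive" && rm "$archive"
}

# Usage (placeholder file name, quoted so the shell does not treat < > as redirects):
# unpack_release "<PHGv2_release>.tar"
```

Chaining rm behind && keeps the archive in place if tar reports an error.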
"Installation"
No traditional installation is required, as the precompiled jar files are designed to function on any POSIX platform meeting the specified requirements. Just open the downloaded package and place the folder containing the jar files and launch script in a preferred directory on your hard drive or server filesystem.
To run PHGv2, you can change into the package directory and run the phg wrapper script from the bin directory. Another option is to add the directory containing the wrapper script to your PATH variable. If you are using the bash shell, this is done with an export statement that adds /path/to/phgv2-package/ to PATH, where /path/to/phgv2-package/ is the path to the location of the phg executable wrapper script.
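A minimal sketch of the bash export, using the placeholder path from the text. Prepending (rather than appending) the directory is an assumption made here so that this phg takes precedence over any other copy on the PATH.

```shell
# Placeholder path: replace with the directory that holds the phg wrapper script.
# Keep the trailing slash.
export PATH="/path/to/phgv2-package/:$PATH"

# To make the change persistent in bash, add the same export line to ~/.bashrc.
```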
Note
The above path example must be the path to the bin subdirectory found in the phg directory.
Note
The Java JAR files (.jar) in the lib subdirectory must remain in the same directory as phg for it to work.
Note
Be sure to include the final / in your path.
Test that PHGv2 works
To test that you can successfully run the phg executable, invoke it from the terminal.
Note
This assumes that you have added phg to your PATH using the above example, or that you are within the bin subdirectory.
This should output summary text to the terminal including syntax help and a list of subcommands and descriptions.
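One way to run this check is sketched below. The --help flag is an assumption based on common command-line conventions; it is not confirmed by the text above. The guard simply reports when phg cannot be found.

```shell
if command -v phg >/dev/null 2>&1; then
  phg_found=yes
  phg --help            # assumed flag; expected to print the summary text
else
  phg_found=no
  echo "phg not found on PATH; add the bin subdirectory or run from inside it" >&2
fi
```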
Setting memory
The amount of data you wish to process determines the computational resources you will need. Since PHGv2 runs most of its tasks on a Java virtual machine (JVM), you can set the maximum amount of memory allocated to the JVM with the -Xmx<memory_amount> option, where <memory_amount> is a number followed by a unit of memory. For example, to allocate a maximum of 50 gigabytes (GB) of memory, use the input "-Xmx50g", where g stands for GB.
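A sketch of how the 50 GB example might be set in a bash session. The variable name JAVA_OPTS is an assumption: launch scripts generated by common JVM build tools read it, but the text above does not name the variable that the phg wrapper uses.

```shell
# Assumed variable name (JAVA_OPTS); -Xmx50g caps the JVM heap at 50 GB
export JAVA_OPTS="-Xmx50g"

# Confirm the value for the current shell session
echo "$JAVA_OPTS"
```

An exported variable applies only to the current shell session and the processes started from it.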
Note
For the memory setting to take effect, set it before running any PHGv2 commands.
Note
Setting JVM memory will only affect JVM-intensive commands. Since PHGv2 relies on several external pieces of software, some commands will not be affected by this setting. Currently, these are:
setup-environment
initdb
align-assemblies
agc-compress
...which rely on conda, TileDB, AnchorWave, and AGC, respectively.