TensorFlow Python Virtual Environment

This page is under construction.

This page documents the creation of Python virtual environments (virtualenvs) containing the TensorFlow machine learning software on the Caviness HPC system (see note 1). It assumes that the user is adding the software to the workgroup storage.

Prepare to add software in the standard sub-directories of the workgroup storage:

[user@login01 ~]$ workgroup -g my_workgroup
[(my_workgroup:user)@login01 ~]$ mkdir --mode=2775 --parent ${WORKDIR}/sw/tensorflow
[(my_workgroup:user)@login01 ~]$ mkdir --mode=2775 --parent ${WORKDIR}/sw/valet

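These commands create any missing directories. The directories named on the command line are created with group-write and group-inherit (setgid) permissions; with GNU mkdir, any missing intermediate directories take their permissions from the current umask instead.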
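
To see what the requested mode means in practice, the following sketch creates a directory the same way in a scratch location and prints its permission bits. The scratch path from mktemp and the variable names are illustrative assumptions, not part of the recipe, and GNU coreutils (as found on Linux) is assumed.

```shell
# Create a throwaway parent directory so nothing in the workgroup storage is touched.
scratch=$(mktemp -d)

# Same mode as above: the leading 2 is the setgid bit (new entries inherit the
# directory's group); 775 is rwx for owner and group, r-x for everyone else.
mkdir --mode=2775 --parents "${scratch}/sw/tensorflow"

# Record and print the octal mode and symbolic form of the requested directory.
mode=$(stat --format='%a' "${scratch}/sw/tensorflow")
stat --format='%a %A %n' "${scratch}/sw/tensorflow"

# Remove the scratch tree again.
rm -rf "${scratch}"
```

The same stat command can be pointed at ${WORKDIR}/sw/tensorflow to check the real directory.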

Next, load the Miniconda VALET package, which provides the conda command used to create the virtual environments:

[(my_workgroup:user)@login01 ~]$ vpkg_devrequire miniconda/25.1.1.2
Adding package `miniconda/25.1.1.2` to your environment

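The conda search command can be used to locate the specific version you wish to install. The example below searches for TensorFlow releases 2.17.0 and newer; in the output, a build string beginning with cpu_ marks a CPU build, and the py310 or py312 part gives the Python version it was built for.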

[(my_workgroup:user)@login01 ~]$ conda search 'tensorflow>=2.17.0'
Loading channels: done
# Name                       Version           Build  Channel
tensorflow                    2.17.0 cpu_py310h42475c5_0  conda-forge
tensorflow                    2.17.0 cpu_py310h42475c5_1  conda-forge
tensorflow                    2.17.0 cpu_py310h42475c5_2  conda-forge
 
...
 
tensorflow                    2.19.0 cpu_py312h69ecde4_3  conda-forge
tensorflow                    2.19.0 cpu_py312h69ecde4_52  conda-forge
tensorflow                    2.19.0 cpu_py312h69ecde4_53  conda-forge

All TensorFlow virtual environments will be stored in the directory $WORKDIR/sw/tensorflow; each virtual environment must have a unique name that will become the VALET version of TensorFlow. In this tutorial, we will install TensorFlow 2.17.0 twice: once with CPU support only and once with CUDA support. An appropriate version id for the former would be 2.17.0:cpu and for the latter 2.17.0:cuda. Those version ids can be translated to VALET-friendly directory names:

[(my_workgroup:user)@login01 ~]$ vpkg_id2path --version-id=2.17.0:cpu
2.17.0-cpu
[(my_workgroup:user)@login01 ~]$ mkdir --mode=3750 ${WORKDIR}/sw/tensorflow/2.17.0-cpu
 
[(my_workgroup:user)@login01 ~]$ vpkg_id2path --version-id=2.17.0:cuda
2.17.0-cuda
[(my_workgroup:user)@login01 ~]$ mkdir --mode=3750 ${WORKDIR}/sw/tensorflow/2.17.0-cuda
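
For the two version ids used here, the translation amounts to replacing the colon with a hyphen. The sketch below reproduces only that observed mapping with tr; it is not how vpkg_id2path is implemented and makes no claim about other characters VALET may translate. The helper name id_to_dirname is hypothetical. Use vpkg_id2path itself for real installs.

```shell
# Hypothetical helper: mimics the mapping shown above for these two ids only.
id_to_dirname() {
    # Replace every ':' in the version id with '-'.
    printf '%s\n' "$1" | tr ':' '-'
}

cpu_dir=$(id_to_dirname "2.17.0:cpu")
cuda_dir=$(id_to_dirname "2.17.0:cuda")
echo "${cpu_dir} ${cuda_dir}"
```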

The virtualenvs are created using the --prefix option to specify the directories created above:

[(my_workgroup:user)@login01 ~]$ conda create --prefix=${WORKDIR}/sw/tensorflow/2.17.0-cpu 'tensorflow[version=2.17.0,build=cpu_py310h42475c5_0]'
WARNING: A directory already exists at the target location '/work/workgroup/sw/tensorflow/2.17.0-cpu'
but it is not a conda environment.
Continue creating environment (y/[n])? y
 
   :
 
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
#     $ conda activate /work/workgroup/sw/tensorflow/2.17.0-cpu
#
# To deactivate an active environment, use
#
#     $ conda deactivate

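We will not activate that virtualenv. Instead, create the second one, which initially contains only Python 3.10; TensorFlow with CUDA support is added to it with pip in a later step: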

[(my_workgroup:user)@login01 ~]$ conda create --prefix=${WORKDIR}/sw/tensorflow/2.17.0-cuda 'python==3.10' -c conda-forge
WARNING: A directory already exists at the target location '/work/workgroup/sw/tensorflow/2.17.0-cuda'
but it is not a conda environment.
Continue creating environment (y/[n])? y
 
   :
 
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
#     $ conda activate /work/workgroup/sw/tensorflow/2.17.0-cuda
#
# To deactivate an active environment, use
#
#     $ conda deactivate

Activate the new environment with conda activate, then install TensorFlow with CUDA support using pip:

[(my_workgroup:user)@login01 ~]$ conda activate ${WORKDIR}/sw/tensorflow/2.17.0-cuda
(/work/workgroup/sw/tensorflow/2.17.0-cuda)[(my_workgroup:user)@login01 ~]$ pip install "tensorflow[and-cuda]==2.17.0"
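
Before leaving the environment, it can be worth confirming that the install is importable. The check below is a sketch and not part of the original recipe: it assumes that python3 on the PATH is the active environment's interpreter, the function name check_tensorflow is hypothetical, and it reports rather than fails when Python or TensorFlow is absent. If the node you run it on has no GPU, an empty GPU list is expected; run it on a GPU node to confirm CUDA support.

```shell
# Report the TensorFlow version and visible GPUs using the active interpreter.
check_tensorflow() {
    if ! command -v python3 >/dev/null 2>&1; then
        echo "tensorflow: python3 not found"
        return 0
    fi
    python3 - <<'EOF'
try:
    import tensorflow as tf
except ImportError:
    print("tensorflow: not importable")
else:
    print("tensorflow:", tf.__version__)
    # An empty list means TensorFlow found no usable GPU on this node.
    print("GPUs:", tf.config.list_physical_devices("GPU"))
EOF
}

tf_report=$(check_tensorflow)
echo "${tf_report}"
```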

Use the conda deactivate command to exit the virtual environment, then roll back the VALET environment changes before proceeding:

(/work/workgroup/sw/tensorflow/2.17.0-cuda)[(my_workgroup:user)@login01 ~]$ conda deactivate
[(my_workgroup:user)@login01 ~]$ vpkg_rollback all

Assuming the workgroup does not already have a TensorFlow VALET package definition, the following text:

tensorflow:
    prefix: /work/my_workgroup/sw/tensorflow
    description: TensorFlow Python environments
    flags:
        - no-standard-paths
    actions:
        - action: source
          script:
              sh: miniconda-activate.sh
          order: failure-first
          success: 0
    versions:
        "2.17.0:cpu":
            description: 2.17.0 with CPU support
            dependencies:
                - miniconda/25.1.1.2
        "2.17.0:cuda":
            description: 2.17.0 with CUDA support
            dependencies:
                - miniconda/25.1.1.2

would be added to ${WORKDIR}/sw/valet/tensorflow.vpkg_yaml. If that file already exists, add your new versions alongside the existing ones, at the same indentation level under versions:

tensorflow:
    prefix: /work/my_workgroup/sw/tensorflow
    description: TensorFlow Python environments
    flags:
        - no-standard-paths
    actions:
        - action: source
          script:
              sh: miniconda-activate.sh
          order: failure-first
          success: 0
    versions:
        "2.17.0:cpu":
            description: 2.17.0 with CPU support
            dependencies:
                - miniconda/25.1.1.2
        "2.17.0:cuda":
            description: 2.17.0 with CUDA support
            dependencies:
                - miniconda/25.1.1.2
        "1.8.0":
            description: 1.8.0 from pkgs/main
            dependencies:
                - miniconda/25.1.1.2

Make sure you change prefix: /work/my_workgroup/sw/tensorflow to match your workgroup. For example, if your workgroup is it_nss, use prefix: /work/it_nss/sw/tensorflow.

On Caviness, after a user has run the workgroup command, VALET searches for package definitions in ${WORKDIR}/sw/valet by default. VALET also searches the ~/.valet directory in your home directory if it exists, which makes that the best location for personal package definitions, for example for software you have installed in your home directory.
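
The prefix can also be filled in from ${WORKDIR} rather than typed by hand. The sketch below assumes that WORKDIR is set by the workgroup command as used earlier on this page; it falls back to a scratch directory when WORKDIR is unset so it can be tried safely, and it refuses to overwrite an existing definition. The variable names base and vpkg_file are illustrative.

```shell
# Use the workgroup directory if set; otherwise a scratch directory for a dry run.
base="${WORKDIR:-$(mktemp -d)}"
vpkg_file="${base}/sw/valet/tensorflow.vpkg_yaml"
mkdir -p "${base}/sw/valet"

if [ -e "${vpkg_file}" ]; then
    echo "${vpkg_file} already exists; add new versions to it by hand." >&2
else
    # Unquoted heredoc delimiter, so ${base} is expanded into the prefix line.
    cat > "${vpkg_file}" <<EOF
tensorflow:
    prefix: ${base}/sw/tensorflow
    description: TensorFlow Python environments
    flags:
        - no-standard-paths
    actions:
        - action: source
          script:
              sh: miniconda-activate.sh
          order: failure-first
          success: 0
    versions:
        "2.17.0:cpu":
            description: 2.17.0 with CPU support
            dependencies:
                - miniconda/25.1.1.2
        "2.17.0:cuda":
            description: 2.17.0 with CUDA support
            dependencies:
                - miniconda/25.1.1.2
EOF
fi

# Show the prefix line that ended up in the file.
grep 'prefix:' "${vpkg_file}"
```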

With a properly-constructed package definition file, you can now check for your versions of TensorFlow:

[(my_workgroup:user)@login01 ~]$ vpkg_versions tensorflow
 
Available versions in package (* = default version):
 
[/work/my_workgroup/sw/valet/tensorflow.vpkg_yaml]
tensorflow               TensorFlow Python environments
* 2.17.0:cpu              2.17.0 with CPU support
  2.17.0:cuda             2.17.0 with CUDA support
 
     :

Any job script that runs Python scripts in this virtualenv should include something like the following toward its end:

#
# Set up TensorFlow virtualenv:
#
vpkg_require tensorflow/2.17.0:cpu

#
# Run a Python script in that virtualenv:
#
python3 my_tf_work.py
rc=$?

#
# Do cleanup work, etc....
#

#
# Exit with whatever exit code our Python script handed back:
#
exit $rc
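
The exit code is saved right after the Python script runs because every later command, including cleanup, overwrites $?. A minimal sketch of that pattern follows, using a hypothetical stand-in function run_workload in place of python3 my_tf_work.py; the exit value 3 is arbitrary.

```shell
# Hypothetical stand-in for "python3 my_tf_work.py": fails with exit code 3.
run_workload() {
    return 3
}

# Capture the exit code at once. This has the same effect as putting rc=$? on
# the line after the command, and it also works if the script uses "set -e".
rc=0
run_workload || rc=$?

# Cleanup runs regardless of success; its own exit status does not replace rc.
echo "cleanup done"

echo "workload exit code: ${rc}"
```

In a real job script the last line would be exit $rc, so that the scheduler records the Python script's result rather than the result of the cleanup commands.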

1) The steps should also work on the DARWIN HPC system, though with different package versions.
  • technical/recipes/tensorflow-in-virtualenv.txt
  • Last modified: 2025-11-06 15:30
  • by thuachen