====== TensorFlow Python Virtual Environment ======

<note warning>
This page is under construction.
</note>

This page documents the creation of a Python virtual environment (virtualenv) containing the TensorFlow software for machine learning on the Caviness HPC system((The steps should also work on the DARWIN HPC system, though with different package versions.)). It assumes that the user is adding the software to the workgroup storage.

===== Prepare Workgroup Directory =====

Prepare to add software in the standard sub-directories of the workgroup storage:

<code bash>
[user@login01 ~]$ workgroup -g my_workgroup
[(my_workgroup:user)@login01 ~]$ mkdir --mode=2775 --parent ${WORKDIR}/sw/tensorflow
[(my_workgroup:user)@login01 ~]$ mkdir --mode=2775 --parent ${WORKDIR}/sw/valet
</code>

These commands create any missing directories. Mode ''2775'' makes the new directories group-writable and sets the setgid bit, so content created inside them inherits the workgroup's group ownership.

===== Create TensorFlow Virtualenv =====

First, load the Miniconda VALET package, which provides the ''conda'' command used to create the virtual environments:

<code bash>
[(my_workgroup:user)@login01 ~]$ vpkg_devrequire miniconda/25.1.1.2
Adding package `miniconda/25.1.1.2` to your environment
</code>

The ''conda search tensorflow'' command can be used to locate the specific version you wish to install. The example below searches for TensorFlow releases at version 2.17.0 or later (the builds listed here are CPU builds).

<code bash>
[(my_workgroup:user)@login01 ~]$ conda search 'tensorflow>=2.17.0'
Loading channels: done
# Name      Version  Build                 Channel
tensorflow  2.17.0   cpu_py310h42475c5_0   conda-forge
tensorflow  2.17.0   cpu_py310h42475c5_1   conda-forge
tensorflow  2.17.0   cpu_py310h42475c5_2   conda-forge
...
tensorflow  2.19.0   cpu_py312h69ecde4_3   conda-forge
tensorflow  2.19.0   cpu_py312h69ecde4_52  conda-forge
tensorflow  2.19.0   cpu_py312h69ecde4_53  conda-forge
</code>

All versions of the TensorFlow virtual environment will be stored in the directory ''$WORKDIR/sw/tensorflow''; each virtual environment must have a unique name that will become the VALET version of TensorFlow. In this tutorial, we will install TensorFlow version 2.17.0 twice: once with CPU support and once with CUDA support. An appropriate version ID for the former would be ''2.17.0:cpu'' and for the latter ''2.17.0:cuda''. Those version IDs can be translated to VALET-friendly directory names:

<code bash>
[(my_workgroup:user)@login01 ~]$ vpkg_id2path --version-id=2.17.0:cpu
2.17.0-cpu
[(my_workgroup:user)@login01 ~]$ mkdir --mode=3750 ${WORKDIR}/sw/tensorflow/2.17.0-cpu
[(my_workgroup:user)@login01 ~]$ vpkg_id2path --version-id=2.17.0:cuda
2.17.0-cuda
[(my_workgroup:user)@login01 ~]$ mkdir --mode=3750 ${WORKDIR}/sw/tensorflow/2.17.0-cuda
</code>

The virtualenvs are created using the ''%%--%%prefix'' option to specify the directories created above:

<code bash>
[(my_workgroup:user)@login01 ~]$ conda create --prefix=${WORKDIR}/sw/tensorflow/2.17.0-cpu 'tensorflow[version=2.17.0,build=cpu_py310h42475c5_0]'
WARNING: A directory already exists at the target location '/work/workgroup/sw/tensorflow/2.17.0-cpu'
but it is not a conda environment.
Continue creating environment (y/[n])?
y
:
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
#     $ conda activate /work/workgroup/sw/tensorflow/2.17.0-cpu
#
# To deactivate an active environment, use
#
#     $ conda deactivate
</code>

We're **not** going to activate that virtualenv yet; the CUDA environment is created next, starting from a bare Python 3.10 installation:

<code bash>
[(my_workgroup:user)@login01 ~]$ conda create --prefix=${WORKDIR}/sw/tensorflow/2.17.0-cuda 'python==3.10' -c conda-forge
WARNING: A directory already exists at the target location '/work/workgroup/sw/tensorflow/2.17.0-cuda'
but it is not a conda environment.
Continue creating environment (y/[n])? y
:
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
#     $ conda activate /work/workgroup/sw/tensorflow/2.17.0-cuda
#
# To deactivate an active environment, use
#
#     $ conda deactivate
</code>

Next, activate the new environment with ''conda activate'' and install TensorFlow with CUDA support using ''pip'':

<code bash>
[(my_workgroup:user)@login01 ~]$ conda activate /work/workgroup/sw/tensorflow/2.17.0-cuda
(/work/workgroup/sw/tensorflow/2.17.0-cuda)[(my_workgroup:user)@login01 ~]$ pip install "tensorflow[and-cuda]==2.17.0"
</code>

Use the ''conda deactivate'' command to exit the virtual environment.
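Before leaving the environment, it can be worth confirming that the freshly installed package actually imports. The following is a minimal sketch and not part of the original recipe: ''check_import'' is a hypothetical helper name, and the sketch assumes that ''python3'' on your ''PATH'' is the interpreter of the environment that is still active.

<code bash>
#!/bin/bash
# check_import is a hypothetical helper (not part of VALET, conda, or TensorFlow).
# It reports whether the named Python module can be imported by the first
# python3 found on PATH, which is assumed to be the active environment's.
check_import() {
    local module="$1"
    if python3 -c "import ${module}" >/dev/null 2>&1; then
        echo "${module}: import OK"
    else
        echo "${module}: import FAILED"
        return 1
    fi
}

# With the 2.17.0-cuda environment still active:
check_import tensorflow || echo "Review the pip output above for errors."
</code>

If the import succeeds, running ''python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"'' lists the GPUs TensorFlow can see; on a node without GPUs that list is expected to be empty.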
Roll back the environment changes before proceeding:

<code bash>
(/work/workgroup/sw/tensorflow/2.17.0-cuda)[(my_workgroup:user)@login01 ~]$ conda deactivate
[(my_workgroup:user)@login01 ~]$ vpkg_rollback all
</code>

===== VALET Package Definition =====

Assuming the workgroup does //not// already have a TensorFlow VALET package definition, the following text:

<file tensorflow.vpkg_yaml>
tensorflow:
    prefix: /work/my_workgroup/sw/tensorflow
    description: TensorFlow Python environments
    flags:
        - no-standard-paths
    actions:
        - action: source
          script:
              sh: miniconda-activate.sh
          order: failure-first
          success: 0
    versions:
        "2.17.0:cpu":
            description: 2.17.0 with CPU support
            dependencies:
                - miniconda/25.1.1.2
        "2.17.0:cuda":
            description: 2.17.0 with CUDA support
            dependencies:
                - miniconda/25.1.1.2
</file>

would be added to ''${WORKDIR}/sw/valet/tensorflow.vpkg_yaml''. If that file already exists, add your new version at the same level as the others:

<file tensorflow.vpkg_yaml>
tensorflow:
    prefix: /work/my_workgroup/sw/tensorflow
    description: TensorFlow Python environments
    flags:
        - no-standard-paths
    actions:
        - action: source
          script:
              sh: miniconda-activate.sh
          order: failure-first
          success: 0
    versions:
        "2.17.0:cpu":
            description: 2.17.0 with CPU support
            dependencies:
                - miniconda/25.1.1.2
        "2.17.0:cuda":
            description: 2.17.0 with CUDA support
            dependencies:
                - miniconda/25.1.1.2
        "1.8.0":
            description: 1.8.0 from pkgs/main
            dependencies:
                - miniconda/25.1.1.2
</file>

<note warning>Make sure you modify ''prefix: /work/my_workgroup/sw/tensorflow'' for your workgroup (e.g. if your workgroup is ''it_nss'', use ''prefix: /work/it_nss/sw/tensorflow'').</note>

<note tip>On Caviness, after a user has used the ''workgroup'' command, VALET searches for package definitions in ''${WORKDIR}/sw/valet'' by default.
VALET also searches a ''~/.valet'' directory (in your home directory) if it exists, so that's the best location for personal package definitions, such as those for software you've installed in your home directory.</note>

With a properly constructed package definition file, you can now check for your versions of TensorFlow:

<code bash>
[(my_workgroup:user)@login01 ~]$ vpkg_versions tensorflow

Available versions in package (* = default version):

[/work/my_workgroup/sw/valet/tensorflow.vpkg_yaml]
tensorflow     TensorFlow Python environments
* 2.17.0:cpu   2.17.0 with CPU support
  2.17.0:cuda  2.17.0 with CUDA support
:
</code>

===== Job Scripts =====

Any job script that runs Python code in this virtualenv should include something like the following toward its end:

<code>
#
# Setup TensorFlow virtualenv:
#
vpkg_require tensorflow/2.17.0:cpu

#
# Run a Python script in that virtualenv:
#
python3 my_tf_work.py
rc=$?

#
# Do cleanup work, etc....
#

#
# Exit with whatever exit code our Python script handed back:
#
exit $rc
</code>

technical/recipes/tensorflow-in-virtualenv.txt Last modified: 2025-11-06 15:30 by thuachen