Installing Python packages user guide

Python packages are collections of modules (reusable code) that extend the capabilities of the core Python language. Python developers publish their packages to the official Python Package Index (PyPI) repository, making them available to the Python community. The Python Packaging Authority (PyPA) maintains a collection of tools for creating, distributing, and installing Python packages.

The policy at W&M HPC has been altered to encourage users to install Python packages in their home folders. Individual users should install Python packages in their home directories, whereas a system-wide installation should be considered for packages required by a group.

This document is meant to help users choose an installation method for their packages and to illustrate the necessary steps.

Local installation of packages to complement or replace a system distribution

By default, installing Python packages into a system distribution must be done by the ‘root’ user. To let users extend these distributions without root access, most Python package installers and managers also allow installing a package into the user’s HOME folder.

You can also install Anaconda or Miniconda to get the Conda package manager, which lets you use a newer Python distribution than the system defaults. Conda gives you complete control over your Python environment, removing the reliance on system Python versions. Conda is covered in more detail later in this article.

In general, the Intel Python distribution is more feature-rich than the GNU one. It includes tools for improving multi-threaded code, which can help the performance of long-running Python applications. Because it does not require a commercial license, the GNU distribution may be the better choice if you plan to port your program to other systems. Most Python code will run on either distribution. As explained below, you can also use the Conda package manager to install a custom version of Python and manage packages in user-controlled environments.

If you don’t know how to choose among these options, Python 3 in the Intel package, which can be selected with “module load intel python3”, is a decent starting point.

The following are the methods for installing Python packages:

  • Pip installation: For packages that are available on PyPI and do not clash with the system Python packages currently installed.
  • Virtualenv installation: For packages, whether or not they are available through pip, that are incompatible with the installed system Python packages.
  • Private modules and installation from source: Packages that don’t support pip installation must be built from source. This method can also be used (with caution) to install packages that must be used across many sub-clusters.

PIP Installation

Pip is a standard package manager for Python that is used to install and maintain packages. A variety of built-in functions and packages are included in the Python standard library.

The Python standard library does not include data science tools like scikit-learn and statsmodels. They can be installed from the command line using pip, Python’s standard package manager.

For packages that support the pip installation method, the following command installs the package in the user’s home directory (replace <package_name> with the package to install):

pip install --user <package_name>

By default, the package is installed in the location specified by the $PYTHONUSERBASE environment variable, which is set by the Python environment module the user has loaded.

It’s worth noting that the installation location varies per Python module and sub-cluster; thus, a package installed with one Python module won’t be available in other Python modules. A package installed with python/2.7.13/intel on Bora, for example, will not be available in Bora’s anaconda3/4.4.0 or Vortex’s python/2.7.13. The same is true for the Storm node types.
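To see where pip install --user will place packages for the Python that is currently loaded, you can ask the interpreter itself. The following is a minimal sketch that uses only the standard library’s site module; the function name is illustrative, and the printed paths depend on your system and on whether PYTHONUSERBASE is set.

```python
import site
import sys


def user_install_locations():
    """Return the user base directory and the user site-packages directory."""
    return {
        # Honors the PYTHONUSERBASE environment variable when it is set.
        "user_base": site.getuserbase(),
        # Directory from which packages installed with --user are imported.
        "user_site": site.getusersitepackages(),
    }


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    for name, path in user_install_locations().items():
        print(f"{name}: {path}")
```

Running this once under each Python module you use makes it easy to confirm that each one has its own user installation location.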

Pip Documentation

Pip offers several commands and option flags for managing Python packages. They are described in the pip documentation and in the built-in help:

pip -h

The pip version can be printed in the same way as the Python version. The two versions must be compatible; pip 19.1.1, for example, works with Python 3.5.2.

pip --version
python --version
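The same two version numbers can also be read from inside Python, which is handy in a setup script. This is a small sketch, assuming Python 3.8 or newer (for importlib.metadata); the function name is illustrative.

```python
import sys
from importlib import metadata


def tool_versions():
    """Return the running Python version and the installed pip version (or None)."""
    python_version = ".".join(str(part) for part in sys.version_info[:3])
    try:
        pip_version = metadata.version("pip")
    except metadata.PackageNotFoundError:
        pip_version = None  # pip is not installed for this interpreter
    return python_version, pip_version


if __name__ == "__main__":
    python_version, pip_version = tool_versions()
    print("Python:", python_version)
    print("pip:", pip_version or "not installed")
```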

Pip’s Upgrade

If pip prompts you to upgrade, you can do it directly from pip:

pip install --upgrade pip

Taking a Look at a Pip List

It’s a good idea to check what’s currently installed before making any changes. From the command line, pip list displays the Python packages in your current working environment in alphabetical order.

pip list
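A similar listing can be produced from Python itself. The sketch below assumes Python 3.8 or newer and uses importlib.metadata; it approximates what pip list shows, and is not how pip implements the command.

```python
from importlib import metadata


def installed_packages():
    """Return (name, version) pairs for installed distributions, sorted by name."""
    pairs = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip entries whose metadata is incomplete
            pairs.add((name, dist.version))
    return sorted(pairs, key=lambda pair: pair[0].lower())


if __name__ == "__main__":
    for name, version in installed_packages():
        print(f"{name} {version}")
```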

Installing scikit-learn

The example below installs the scikit-learn package along with its required dependencies.

pip install scikit-learn

The logs may show that more than the scikit-learn package is being installed. This is because pip also installs any other packages that scikit-learn requires. These additional packages are called dependencies.

Installing a Specific Version of a Package

By default, pip installs the most recent version available; if you want to install an earlier version of scikit-learn, add a double equal sign and the version number to the installation statement:

pip install scikit-learn==0.19.2
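When installs are scripted, the same pinned requirement can be assembled in Python and handed to pip through the running interpreter. The helper below is a hypothetical sketch, not part of pip; it only builds the argument list, so nothing is installed until you pass the list to subprocess.run.

```python
import sys


def pip_install_command(package, version=None, user=True):
    """Build a pip install command as an argument list, without running it."""
    requirement = f"{package}=={version}" if version else package
    # "python -m pip" ensures pip runs for this exact interpreter.
    command = [sys.executable, "-m", "pip", "install"]
    if user:
        command.append("--user")
    command.append(requirement)
    return command


if __name__ == "__main__":
    # To actually install: subprocess.run(command, check=True)
    print(" ".join(pip_install_command("scikit-learn", "0.19.2")))
```

Building the command as a list, rather than as one string, avoids shell quoting problems when a requirement contains characters such as < or >.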

Package Upgrades

Suppose the package you want to use is already installed but has become outdated. You can upgrade the package in the same way that we upgraded pip.

pip install --upgrade scikit-learn

This upgrade also upgrades any dependencies that the new version requires. To install more than one Python package, list the packages in the same pip install command, separated by spaces. In this example, a single command installs both the scikit-learn and statsmodels packages.

pip install scikit-learn statsmodels

Multiple packages can also be upgraded with a single line of code.

pip install --upgrade scikit-learn statsmodels

Using requirements.txt to Install Packages

If you want to install many packages at once, you can list them in a text file named requirements.txt. This is what the file looks like when we preview it:

cat requirements.txt

Python package developers typically include a requirements.txt file in their GitHub repositories that lists all dependencies for pip to find and install.

The -r option flag tells pip install to install the packages listed in the file named after the flag. Naming this file requirements.txt is a common convention, but it is not required.

In our examples, pip install -r requirements.txt has the same effect as pip install scikit-learn statsmodels. If you needed to install ten packages, typing out each package name would become tedious; using a requirements.txt file is considerably easier.

pip install -r requirements.txt

is equivalent to the following

pip install scikit-learn statsmodels
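To make the equivalence concrete, the sketch below reads the content of a simple requirements file and returns the requirement strings that pip would receive on the command line. It is a simplified illustration that handles only one requirement per line plus comments; pip’s own parser supports more (options, includes, environment markers), and the function name is an assumption.

```python
import re

# A comment starts at the beginning of a line or after whitespace.
COMMENT = re.compile(r"(^|\s+)#.*$")


def read_requirements(text):
    """Return the requirement strings found in requirements.txt content."""
    requirements = []
    for line in text.splitlines():
        line = COMMENT.sub("", line).strip()
        # Skip blank lines and option lines such as "-r other.txt".
        if line and not line.startswith("-"):
            requirements.append(line)
    return requirements


if __name__ == "__main__":
    sample = "scikit-learn\nstatsmodels  # statistics\n"
    print("pip install " + " ".join(read_requirements(sample)))
```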

Adding Python Dependencies in an Interactive Example

Before running a Python model script, go through the following setup procedure to ensure your Python environment has all of the necessary library dependencies installed.

First, create the requirements.txt file and add the scikit-learn library to it.

# First, create a directory named interactive_example and move into it
mkdir -p interactive_example
cd interactive_example

# Add scikit-learn to the requirements.txt file
echo "scikit-learn" > requirements.txt

# Preview the file content
cat requirements.txt

The cat command prints the content of the file, which is the single line scikit-learn.

Virtual Environment

If the packages or dependencies required by a user’s programs are incompatible with the pip packages available on the system, a Python virtual environment is advised. It provides an isolated environment in which users can install their preferred packages without affecting the system. The Python version is unchanged.

For example, on Vortex, the user is advised to use virtualenv rather than update the system numpy, which may break other packages that depend on numpy 1.9.2.

A Python virtual environment must be created, activated, prepared for use, and deactivated after use. The steps are as follows:

Creating a virtual environment

The user creates a virtual environment once and then installs Python packages into it.

cd <directory_to_create_virtualenv>
virtualenv <virtualenvname>
virtualenv python_virtual_environment
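Python 3 also ships a standard-library module, venv, that creates a comparable environment without the separate virtualenv tool. The sketch below uses venv as a stand-in for the virtualenv command shown above; the function name is hypothetical, and the directory name simply mirrors the example.

```python
import tempfile
import venv
from pathlib import Path


def create_environment(directory, with_pip=True):
    """Create a virtual environment in `directory` and return its path."""
    env_dir = Path(directory).expanduser().resolve()
    venv.create(env_dir, with_pip=with_pip)
    return env_dir


if __name__ == "__main__":
    # Demonstration in a throwaway directory; with_pip=False keeps it quick.
    with tempfile.TemporaryDirectory() as scratch:
        env = create_environment(
            Path(scratch) / "python_virtual_environment", with_pip=False
        )
        print("created:", (env / "pyvenv.cfg").is_file())
```

For real use, keep with_pip=True so that the new environment has its own pip.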

Getting a virtual environment up and running

To use a virtual environment, it must first be activated. In a bash shell, this can be done with the command below, run from the directory in which the virtual environment was created:

source <virtualenvname>/bin/activate
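If you are unsure whether activation worked, a script can check whether its interpreter belongs to a virtual environment. This is a minimal sketch; the function name is illustrative.

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a virtual environment."""
    # venv and recent virtualenv releases make sys.prefix differ from
    # sys.base_prefix; older virtualenv releases set sys.real_prefix instead.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or sys.prefix != base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtual_environment())
    print("prefix:", sys.prefix)
```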

Getting ready to use

This is usually a one-time operation in which the user installs all of the relevant packages, either with pip or from source code. In our case, we install the necessary packages from requirements.txt as follows.

tuts@codeunderscored:~/interactive_example$ pip install -r requirements.txt

Deactivating a virtual environment

After use, a virtual environment should be deactivated. Deactivating resets the $PATH environment variable so that the system Python and its packages are used again. In our example, we use the following command:

(virtualenv_name)tuts@codeunderscored:~/interactive_example$ deactivate # generic python

Getting rid of a virtual environment

The following command(s) can be used to permanently delete a virtual environment:

tuts@codeunderscored:~/interactive_example$ rm -rf <virtualenvname> # generic python
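Because rm -rf deletes whatever path it is given, a scripted cleanup can first check that the target really looks like a virtual environment. The helper below is a hypothetical sketch of that idea; the marker files it checks (pyvenv.cfg, bin/activate) are the ones venv and virtualenv normally create on Linux.

```python
import shutil
import tempfile
from pathlib import Path


def remove_environment(directory):
    """Delete a virtual environment directory after a basic sanity check."""
    env_dir = Path(directory).expanduser().resolve()
    markers = [env_dir / "pyvenv.cfg", env_dir / "bin" / "activate"]
    if not any(marker.is_file() for marker in markers):
        raise ValueError(f"{env_dir} does not look like a virtual environment")
    shutil.rmtree(env_dir)


if __name__ == "__main__":
    # Demonstration on a throwaway directory that mimics an environment.
    demo = Path(tempfile.mkdtemp()) / "python_virtual_environment"
    demo.mkdir()
    (demo / "pyvenv.cfg").write_text("home = /usr/bin\n")
    remove_environment(demo)
    print("removed:", not demo.exists())
```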

Private Modules and Source Install

This is the preferred way for packages that aren’t available from PyPI or that need to be customized during the build process. The user is urged to read the package installation documentation before deciding on the best installation procedure. For packages that contain a setup.py file, a flag can be supplied to direct installation to a specific directory. Note that, for reliable results, packages compiled from source must be run on the same platform they were built on.

Once the package is installed, its directory must be added to the $PYTHONPATH environment variable before the package can be used. To accomplish this, create a private user environment module and load it along with the Python package.

The steps for building a private user environment module are outlined in the following section.

Creating a private user environment module

To accomplish this, a module file must be created and placed in the proper location for loading. Each of these steps is detailed below:

The following is an example of a basic module file for a Python package. More examples can be found in /usr/local/Modules/modulefiles.

#%Module1.0
#
# The first line above is essential for the file to be considered a module file
#
# Description: example mod module
#
#
proc ModulesHelp { } {
global version
puts stderr "[module-info name] - An example module file to load examplemod python module"
}

prereq isa

# Change the module name 'python/2.7.13/intel' below to reflect
# the correct module used to build this python package
if { [module-info mode load] && ![is-loaded python/2.7.13/intel] } { module load python/2.7.13/intel }

set xarch $env(XARCH)
set xchip $env(XCHIP)

set version "2.0b"
module-whatis "examplemod 2.0b"

#change the path below to reflect the correct installation path of your package
set path /sciclone/data10/<user>/<path_to_install_location>

prepend-path PYTHONPATH $path/lib/site-packages
prepend-path LD_LIBRARY_PATH $path/lib
setenv EXAMPLEMOD $path

This file (for example, examplemod) should be placed in the privatemodules directory of the user’s home directory (~/privatemodules). After that, the module can be loaded by first loading the use.own module and then the examplemod module.

The following is the whole process (with the examplemod file):

  • mkdir ~/privatemodules
  • cd ~/privatemodules
  • vim examplemod # The filename can reflect the package name
  • module load use.own # Adds visibility to users’ private modules
  • module load examplemod # Load actual module
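After loading the module, you may want to confirm that the package directory actually reached PYTHONPATH. The sketch below checks this from Python; the function names and the example path are assumptions for illustration, and the real path is whatever your module file prepends.

```python
import os


def pythonpath_entries(environ=None):
    """Return the directories listed in PYTHONPATH, in order."""
    environ = os.environ if environ is None else environ
    raw = environ.get("PYTHONPATH", "")
    return [entry for entry in raw.split(os.pathsep) if entry]


def is_on_pythonpath(directory, environ=None):
    """Return True if `directory` is one of the PYTHONPATH entries."""
    target = os.path.normpath(directory)
    return any(
        os.path.normpath(entry) == target for entry in pythonpath_entries(environ)
    )


if __name__ == "__main__":
    # Hypothetical location; substitute the path set in your module file.
    example = os.path.expanduser("~/examplemod/lib/site-packages")
    print(pythonpath_entries())
    print("on PYTHONPATH:", is_on_pythonpath(example))
```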

Using easy_install to install packages locally

Easy_install is another widely used utility for installing Python packages, and it is a supported method for many of them. Like pip, this tool will fail if you try to install a package without root access. Unlike pip, when easy_install fails it indicates that installing without root is possible, but it does not provide the command. By adding the --user option to easy_install, you can avoid the need for root access and have it install to your HOME folder instead:

easy_install --user <package_name>

easy_install is now deprecated in favor of pip, which was released in 2008 and builds on setuptools components. easy_install was originally bundled with setuptools as the module that handled downloading, building, and installing packages.

Using the Conda Package Manager to Install Packages

Some users may need a different version of a package than the one installed by the HPCC, and the system version will frequently override the version you try to install into your local HOME directory. Users may also come across Python packages that have a large number of dependencies, making installation problematic.

We strongly recommend that you install a copy of the Conda package manager into your HOME folder if you need to install a complex Python package, a package version other than the one we provided, or if you require a specific version of Python. Conda is a Python package manager that gives you complete control over your Python environment and makes installing complex Python workloads as simple as running a few Conda commands.

Anaconda is a well-known Python/R data science and machine learning platform used for large-scale data processing, predictive analytics, and scientific computing.

The Anaconda distribution includes 250 open-source data packages, and more than 7,500 additional packages are available from the Anaconda repositories. It also includes the conda command-line tool and Anaconda Navigator, a desktop graphical user interface.

Anaconda Installation

At the time of writing, the most recent stable version of Anaconda is 2020.02. Before downloading the installer script, check the Downloads page to see if a new version of Anaconda for Python 3 is available.

Anaconda Navigator is a Qt-based graphical user interface. To install Anaconda on Ubuntu 21.04, follow these steps:


If you are installing Anaconda on a desktop machine and want to use the GUI application, install the following packages first. Otherwise, you can skip this step.

sudo apt install libgl1-mesa-glx libegl1-mesa libxrandr2 libxss1 libxcursor1 libxcomposite1 libasound2 libxi6 libxtst6

Using your web browser or wget, download the Anaconda installation script:

wget -P /tmp https://repo.anaconda.com/archive/Anaconda3-2020.02-Linux-x86_64.sh

Depending on your connection speed, the download may take some time.


This step is optional, but it is recommended for verifying the script’s data integrity. To view the script checksum, use the sha256sum command:

sha256sum /tmp/Anaconda3-2020.02-Linux-x86_64.sh

The command prints the SHA-256 hash of the script followed by its file name. Compare the hash with the one published by Anaconda for your installer version; if they differ, download the script again.
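The same check can be done in Python with the standard hashlib module, which is useful where sha256sum is not available. This is a minimal sketch; the function names are illustrative, and the expected hash must come from the publisher, not from the downloaded file.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hexadecimal SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Return True if the file's SHA-256 digest equals the expected value."""
    return sha256_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Demonstration on a small temporary file instead of the real installer.
    with tempfile.TemporaryDirectory() as scratch:
        sample = Path(scratch) / "sample.sh"
        sample.write_bytes(b"abc")
        print(sha256_of_file(sample))
```

Reading in chunks keeps memory use small even for an installer of several hundred megabytes.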

To begin the installation process, run the script:

bash /tmp/Anaconda3-2020.02-Linux-x86_64.sh

The installer starts by asking you to review the license agreement.

To proceed, press ENTER.

Use the ENTER key to scroll through the license.

After you’ve finished reviewing the license, you’ll be prompted to approve the license terms. Type yes to accept them.

If you accept the license, you will be prompted to select an installation location. Most users should be fine with the default location; to confirm it, press ENTER.


The installation may take some time. Once it is finished, the script will ask whether you want to run conda init. Answer yes.


This will add the command-line tool conda to the PATH of your system.

Close and re-open your shell to activate the Anaconda installation, or reload your shell startup file in the current session (for bash, this is typically done with source ~/.bashrc).

Type conda in your terminal to verify the installation.

conda
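A script can run the same check by looking for the conda executable on PATH. This short sketch uses only the standard library; the function name is illustrative.

```python
import shutil


def find_conda():
    """Return the full path of the conda executable on PATH, or None."""
    return shutil.which("conda")


if __name__ == "__main__":
    location = find_conda()
    if location:
        print("conda found at", location)
    else:
        print("conda is not on PATH; re-open the shell or check the installation")
```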

That’s all!
You have now successfully installed Anaconda on your Ubuntu machine and are ready to use it.

Use the setup.py script to install a package

If you want to install a Python package from somewhere other than the PyPI repository, you can download and unpack the source distribution yourself, then follow its setup instructions. To install the package in the user’s site-packages directory with the setup.py script:

First, set up your environment.

Download the distribution archive (for example, pyglet) from the source. In our case, we clone the pyglet repository with git clone.

The distribution should be unpacked into a directory in your home directory with a similar name (for example, ~/pyglet).

Change to the new directory (cd), then enter the following command on the command line:

python setup.py install --user

The --user option instructs setup.py to install the package (for example, foo) in the running Python’s user site-packages directory; for example:

~/.local/lib/pythonX.Y/site-packages/foo

Python automatically scans this directory for modules; thus, adding this path to the PYTHONPATH environment variable isn’t required.

If you don’t provide the --user option, setup.py will attempt to install the package in the global site-packages directory (where you don’t have the necessary permissions) and will subsequently fail.

You can also use the --home or --prefix options to install your package in a different location (where you have the appropriate rights), such as a subdirectory (for example, python-pkgs):

Enter the following in your home directory:

python setup.py install --home=~/python-pkgs

Enter the following in your Slate storage space:

python setup.py install --prefix=/N/slate/$USER/python-pkgs

If you want to install your package somewhere other than the user’s site-packages directory, add the path to that directory to the PYTHONPATH environment variable.
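The directory to add depends on which option was used. The sketch below computes the location of pure-Python modules for each option, assuming the standard Unix installation schemes; the function names are illustrative, and some Linux distributions patch these layouts, so check the installer output if the result does not match.

```python
import os
import sys


def home_scheme_dir(home):
    """Directory used for pure-Python modules by: setup.py install --home=<home>."""
    return os.path.join(os.path.expanduser(home), "lib", "python")


def prefix_scheme_dir(prefix):
    """Directory used for pure-Python modules by: setup.py install --prefix=<prefix>."""
    version = f"{sys.version_info.major}.{sys.version_info.minor}"
    return os.path.join(prefix, "lib", f"python{version}", "site-packages")


if __name__ == "__main__":
    print("PYTHONPATH entry for --home:  ", home_scheme_dir("~/python-pkgs"))
    print("PYTHONPATH entry for --prefix:", prefix_scheme_dir("/N/slate/user/python-pkgs"))
```

Note that the --prefix location includes the Python version, so a package installed under one Python version is not visible to another.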

Conclusion

We went over how to install Python packages using a variety of options: pip, virtual environments, private modules and source installs, easy_install, Conda, and setup.py.

Setting up a Python package can be as simple as issuing a single command or considerably more involved, depending on the package.

Python’s documentation is extensive and offers a wealth of knowledge and experience. Furthermore, because the user community is so large, excellent resources are available in various locations across the web.

Overall, we hope that the methods covered here make installing Python packages a walk in the park.
