Installing Python Modules

How to install local python modules

The version of python installed in the labs is fixed, and the set of modules is limited to those included in the main lab build at the beginning of the year. While it is possible for the administrator to add more modules, it is not something that we let normal users do.

In some versions of python it is possible to run

help('modules')

Or to search we can use the following

help('modules zip')

However, due to some of the packages we have installed in the labs this will fail (it is a known issue with python, see here), so we need to use other methods.
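
One workaround (a minimal sketch, not the only option) is to list the importable top-level modules with pkgutil, which scans the search path rather than importing every package the way help('modules') does:

import pkgutil

# iter_modules() yields (importer, name, ispkg) for each top-level module
for importer, name, ispkg in pkgutil.iter_modules():
    print name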

Python site-packages (Global and Local)

When python starts it builds its module search path from a set of defaults plus the contents of the environment variable PYTHONPATH. By default the search path just includes the installed python site-packages.

The following code will print the contents of the user's module search path

import sys
print '\n'.join(sys.path)
NCCA Installed Packages
 
/usr/lib64/python2.7/site-packages/pyfits-3.5.dev-py2.7-linux-x86_64.egg
/usr/lib64/python27.zip
/usr/lib64/python2.7
/usr/lib64/python2.7/plat-linux2
/usr/lib64/python2.7/lib-tk
/usr/lib64/python2.7/lib-old
/usr/lib64/python2.7/lib-dynload
/usr/lib64/python2.7/site-packages
/usr/lib64/python2.7/site-packages/gst-0.10
/usr/lib64/python2.7/site-packages/gtk-2.0
/usr/lib64/python2.7/site-packages/wx-2.8-gtk2-unicode
/usr/lib/python2.7/site-packages
/usr/local/lib/python/site-packages
/usr/local/lib/python2.7/site-packages
/usr/local/lib64/python2.7/site-packages

As you can see, all of these directories are locked for any user but root, and hence normal users cannot install new packages.
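
You can confirm this from the shell:

ls -ld /usr/lib64/python2.7/site-packages   # owned by root, not writable by normal users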

easy_install

Python comes with a tool called easy_install which is installed as part of the setuptools package. This allows us to install either to the site-packages directory if we are root, or into any other specified directory that the user has permission to write to. In the following example we will use easy_install to install a tool called pip, which is not installed in the current lab build (but should be soon).

First we must create a directory to install the local packages into. Python will automatically search .local for site-packages, so we don’t need to add it to PYTHONPATH

mkdir -p .local/lib/python2.7/site-packages
easy_install --prefix=$HOME/.local pip
.local/bin/pip install --upgrade  --user setuptools

If you re-run the python script to check the sys.path you should now see that /local/[USERNAME]/.local/lib/python2.7/site-packages has been added to the output.
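
A quick way to confirm the directory python will search (the exact path will contain your own username) is:

python -m site --user-site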

We can now add an alias to our .bashrc to run this locally installed version of pip (see here for details of setting up the .bashrc)

alias piplocal='$HOME/.local/bin/pip'

If we now run

piplocal freeze 

It will give us a full list of the python packages installed in the base lab build as well as our local packages.

pip freeze
 
backports.ssl-match-hostname==3.4.0.2
Beaker==1.5.4
blivet==0.61.15.59
Brlapi==0.6.0
cffi==1.6.0
chardet==2.2.1
configobj==4.7.2
configshell-fb==1.1.18
coverage==3.6b3
cryptography==1.3.1
cupshelpers==1.0
custodia==0.1.0
decorator==3.4.0
di==0.3
dnspython==1.12.0
ecdsa==0.13
enum34==1.0.4
ethtool==0.8
firstboot==19.5
fros==1.0
glfw==1.4.0
gssapi==1.2.0
idna==2.0
iniparse==0.4
initial-setup==0.3.9.36
iotop==0.6
ipaclient==4.4.0
ipaddr==2.1.9
ipaddress==1.0.16
ipalib==4.4.0
ipaplatform==4.4.0
ipapython==4.4.0
IPy==0.75
javapackages==1.0.0
jsonpointer==1.9
jwcrypto==0.2.1
kerberos==1.1
kitchen==1.1.1
kmod==0.1
langtable==0.0.31
lxml==3.2.1
M2Crypto==0.21.1
Magic-file-extensions==0.2
Mako==0.8.1
MarkupSafe==0.11
MaterialX==1.35.2
matplotlib==1.2.0
mercurial==3.4.2
MySQL-python==1.2.5
netaddr==0.7.5
netifaces==0.10.4
nose==1.3.0
ntplib==0.3.2
numpy==1.7.1
OpenEXR==1.3.0
paramiko==1.16.1
Paste==1.7.5.1
pciutils==1.7.3
pcp==1.0
perf==0.1
Pillow==2.0.0
ply==3.4
policycoreutils-default-encoding==0.1
pyasn1==0.1.9
pycparser==2.14
pycrypto==2.6.1
pycups==1.9.63
pycurl==7.19.0
pyfits==3.5.dev0
pygobject==3.14.0
pygpgme==0.3
pyinotify==0.9.4
pykickstart==1.99.66.10
pyliblzma==0.5.3
PyOpenGL==3.1.0b2
pyOpenSSL==0.13.1
pyparsing==1.5.6
pyparted==3.9
PySDL2==0.9.5
python-augeas==0.5.0
python-dateutil==1.5
python-dmidecode==3.10.13
python-ldap==2.4.15
python-meh==0.25.2
python-nss==0.16.0
python-yubico==1.2.3
pytz===2012d
pyudev==0.15
pyusb==1.0.0b1
pyxattr==0.5.1
qrcode==5.0.1
redhat-access-insights==1.0.11
requests==2.6.0
rhnlib==2.5.65
rhsm==1.17.9
rtslib-fb==2.1.57
scipy==0.12.1
seobject==0.1
sepolicy==1.1
setroubleshoot==1.1
six==1.9.0
slip==0.4.0
slip.dbus==0.4.0
SSSDConfig==1.14.0
subscription-manager==1.17.15
targetcli-fb===2.1.fb41
Tempita==0.5.1
urlgrabber==3.10
urllib3==1.10.2
urwid==1.1.1
wxPython==2.8.12.0
wxPython-common==2.8.12.0
yum-langpacks==0.4.2
yum-metadata-parser==1.1.4

A simple test

The following example will install a simple python package, and we will then write a small program to test it. We are going to install a package called progressbar into our local python site-packages. To do this we need to pass the --user command line argument to pip.

piplocal install --user progressbar
Collecting progressbar
Downloading progressbar-2.3.tar.gz
Installing collected packages: progressbar
Running setup.py install for progressbar ... done
Successfully installed progressbar-2.3

We can now write the following program to test it, first create an empty python file and make it executable.

touch progress.py
chmod +x progress.py

Edit the file and add the following

#!/usr/bin/python
import progressbar
from time import sleep

bar = progressbar.ProgressBar()
for i in bar(range(100)):
  sleep(0.1)

To run just type ./progress.py and you should see a simple text progress bar running.

Tensorflow

This next example installs something more complex that a lot of people have been asking for: the neural network library tensorflow

piplocal install --user tensorflow

Using this test example

tensorflow example
 
'''
A linear regression learning algorithm example using TensorFlow library.
Author: Aymeric Damien
Project: https://github.com/aymericdamien/TensorFlow-Examples/
'''

from __future__ import print_function

import tensorflow as tf
import numpy
import matplotlib.pyplot as plt
rng = numpy.random

# Parameters
learning_rate = 0.01
training_epochs = 1000
display_step = 50

# Training Data
train_X = numpy.asarray([3.3,4.4,5.5,6.71,6.93,4.168,9.779,6.182,7.59,2.167,
                         7.042,10.791,5.313,7.997,5.654,9.27,3.1])
train_Y = numpy.asarray([1.7,2.76,2.09,3.19,1.694,1.573,3.366,2.596,2.53,1.221,
                         2.827,3.465,1.65,2.904,2.42,2.94,1.3])
n_samples = train_X.shape[0]

# tf Graph Input
X = tf.placeholder("float")
Y = tf.placeholder("float")

# Set model weights
W = tf.Variable(rng.randn(), name="weight")
b = tf.Variable(rng.randn(), name="bias")

# Construct a linear model
pred = tf.add(tf.multiply(X, W), b)

# Mean squared error
cost = tf.reduce_sum(tf.pow(pred-Y, 2))/(2*n_samples)

# Gradient descent
#  Note, minimize() knows to modify W and b because Variable objects are trainable=True by default
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

# Initialize the variables (i.e. assign their default value)
init = tf.global_variables_initializer()

# Start training
with tf.Session() as sess:

    # Run the initializer
    sess.run(init)

    # Fit all training data
    for epoch in range(training_epochs):
        for (x, y) in zip(train_X, train_Y):
            sess.run(optimizer, feed_dict={X: x, Y: y})

        # Display logs per epoch step
        if (epoch+1) % display_step == 0:
            c = sess.run(cost, feed_dict={X: train_X, Y: train_Y})
            print("Epoch:", '%04d' % (epoch+1), "cost=", "{:.9f}".format(c), \
                "W=", sess.run(W), "b=", sess.run(b))

    print("Optimization Finished!")
    training_cost = sess.run(cost, feed_dict={X: train_X, Y: train_Y})
    print("Training cost=", training_cost, "W=", sess.run(W), "b=", sess.run(b), '\n')

    # Graphic display
    plt.plot(train_X, train_Y, 'ro', label='Original data')
    plt.plot(train_X, sess.run(W) * train_X + sess.run(b), label='Fitted line')
    plt.legend()
    plt.show()

    # Testing example, as requested (Issue #2)
    test_X = numpy.asarray([6.83, 4.668, 8.9, 7.91, 5.7, 8.7, 3.1, 2.1])
    test_Y = numpy.asarray([1.84, 2.273, 3.2, 2.831, 2.92, 3.24, 1.35, 1.03])

    print("Testing... (Mean square loss Comparison)")
    testing_cost = sess.run(
        tf.reduce_sum(tf.pow(pred - Y, 2)) / (2 * test_X.shape[0]),
        feed_dict={X: test_X, Y: test_Y})  # same function as cost above
    print("Testing cost=", testing_cost)
    print("Absolute mean square loss difference:", abs(
        training_cost - testing_cost))

    plt.plot(test_X, test_Y, 'bo', label='Testing data')
    plt.plot(train_X, sess.run(W) * train_X + sess.run(b), label='Fitted line')
    plt.legend()
    plt.show()

Running the program should produce two plots: the fitted line against the original training data, and the fitted line against the testing data.

What about the DCC tools?

Both maya and houdini ship with their own versions of python, different from the one we use in the shell. It is possible to run both of these either in the actual tool or in the shell. The following examples will show this using the shell.

The following test program, taken from here, is going to be used; for these tests it will be saved as tftest.py

import tensorflow as tf

class SquareTest(tf.test.TestCase):
  def testSquare(self):
    with self.test_session():
      x = tf.square([2, 3])
      self.assertAllEqual(x.eval(), [4, 9])

if __name__ == '__main__':
  tf.test.main()

maya

The Maya python is simple to use: it lives in the directory /opt/autodesk/maya/bin/ and is called mayapy. The easiest way to use it is to add this directory to the current path in the .bashrc

export PATH=$PATH:/opt/autodesk/maya/bin/

We can then execute the following with the test program

mayapy tftest.py
2017-11-22 12:41:13.860018: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
..
----------------------------------------------------------------------
Ran 2 tests in 0.019s

OK
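
Since the Maya GUI embeds the same interpreter as mayapy, the user site-packages should also be visible inside Maya itself; as a quick sketch, pasting the following into the script editor (Python tab) will confirm the module loads:

import tensorflow as tf
print tf.__version__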

For more examples of using mayapy see this blog post.

Houdini

The houdini python environment can be set up using the installed houdini setup scripts; for the latest version installed in the lab use

cd /opt/hfs16.0.705/
source houdini_setup_bash
The Houdini 16.0.705 environment has been initialized.

We can now use hython to run the demo program.

hython tftest.py
2017-11-22 12:44:59.014949: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
..
----------------------------------------------------------------------
Ran 2 tests in 0.023s

OK

virtualenv

virtualenv is a tool that creates sandboxed python environments, allowing for a totally clean install of python with no modules associated with the base system build.

At present this is not installed in the lab, but it will be in the next lab update. However, as we have installed our own local pip we can use that to add it.

piplocal install --user virtualenv
ls ~/.local/bin
easy_install      f2py         pbr  pip2    saved_model_cli  virtualenv
easy_install-2.7  markdown_py  pip  pip2.7  tensorboard      wheel

To run virtualenv we can do the following to set up a new sandbox. In this example I’m using sandbox as the name; however, this is just a directory, and we can have as many unique environments as we like, just by changing the name.

~/.local/bin/virtualenv sandbox
New python executable in /local/vince/sandbox/bin/python
Installing setuptools, pip, wheel...done.

To enable the sandbox we need to change into the newly created directory and run the activate script

cd sandbox/
-bash-4.2$ source bin/activate
(sandbox) -bash-4.2$

You will notice the prompt has changed to (sandbox) to show that we are in the virtual environment. If we now run python we get the sandboxed version and not the global one.

which python
~/sandbox/bin/python

To install we use pip as normal, so for example to install the theano neural network library we can do the following.

(sandbox) -bash-4.2$ pip install theano
Collecting theano
  Downloading Theano-1.0.0.tar.gz (2.9MB)
    100% |████████████████████████████████| 2.9MB 414kB/s
Collecting numpy>=1.9.1 (from theano)
  Using cached numpy-1.13.3-cp27-cp27mu-manylinux1_x86_64.whl
Collecting scipy>=0.14 (from theano)
  Downloading scipy-1.0.0-cp27-cp27mu-manylinux1_x86_64.whl (46.7MB)
    100% |████████████████████████████████| 46.7MB 35kB/s
Collecting six>=1.9.0 (from theano)
  Using cached six-1.11.0-py2.py3-none-any.whl
Building wheels for collected packages: theano
  Running setup.py bdist_wheel for theano ... done
  Stored in directory: /local/vince/.cache/pip/wheels/e1/38/41/29fead4ea90d8fb9e23af0ba80d24222f8ba6effe93896ecbf
Successfully built theano
Installing collected packages: numpy, scipy, six, theano
Successfully installed numpy-1.13.3 scipy-1.0.0 six-1.11.0 theano-1.0.0

We can now test using this simple example, which computes a weighted sum of the inputs (0.2*1.0 + 0.7*1.0) and should print 0.9

import theano
import numpy

# symbolic input vector and shared weight vector
x = theano.tensor.fvector('x')
W = theano.shared(numpy.asarray([0.2, 0.7]), 'W')
# weighted sum of the inputs
y = (x * W).sum()

# compile the symbolic expression into a callable function
f = theano.function([x], y)

output = f([1.0, 1.0])
print output

To exit the sandbox and revert to our system-installed python we use the command deactivate.
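
For example:

deactivate
which python   # should now report the system python again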

Using virtualenv within python scripts

It is possible to use a pre-created sandbox in your own scripts (this will also work with the dcc tools).

The following code needs to be added to any python script to enable the environment. It is best to do this at the very start of the script, as it sets up the environment used for all subsequent imports.

import os
import virtualenv
sandboxenvdir = '/local/vince/sandbox'
# run the sandbox's activate_this.py to adjust sys.path for this interpreter
activate_script = os.path.join(sandboxenvdir, "bin", "activate_this.py")
execfile(activate_script, dict(__file__=activate_script))

Note that in the above example I am logged in as the user vince for testing and am using the directory '/local/vince/sandbox'; you will need to change this to the path of your own sandbox directory.
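
To avoid hard-coding the username, one option (just a sketch, assuming the sandbox lives in your home directory) is to build the path at run time:

import os
# expands to /local/[USERNAME]/sandbox for the current user
sandboxenvdir = os.path.expanduser('~/sandbox')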

For the above theano example we can use

import os
import virtualenv
sandboxenvdir='/local/vince/sandbox'
activate_script = os.path.join(sandboxenvdir, "bin", "activate_this.py")
execfile(activate_script, dict(__file__=activate_script))

import theano
import numpy

x = theano.tensor.fvector('x')
W = theano.shared(numpy.asarray([0.2, 0.7]), 'W')
y = (x * W).sum()

f = theano.function([x], y)

output = f([1.0, 1.0])
print output

And run it with the site python, mayapy or hython.
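
For example, assuming the script above has been saved as theanotest.py:

python theanotest.py
mayapy theanotest.py
hython theanotest.py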

virtualenv pycuda

To install the package pycuda we can use a combination of pip and building from source (this is due to how our system is built). Assuming you are using either the local version of virtualenv above (or, as some labs now have installed, the global one), run the following:

virtualenv pycuda
cd pycuda
source bin/activate
pip install numpy pycuda

This will install numpy and attempt to install pycuda and its associated packages. The pycuda install will fail, but we can then build it from source.

git clone --recursive http://git.tiker.net/trees/pycuda.git
cd pycuda/
./configure.py --cuda-root=/usr --cuda-inc-dir=/usr/include/cuda --cuda-enable-gl --cudart-lib-dir=/usr/lib64/nvidia/:/usr/lib64/ --curand-lib-dir=/usr/lib64/nvidia
make -j12
make install
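
This installs pycuda into the sandboxed python environment. As a quick check that the module is now importable we can print its version (a minimal sketch; pycuda exposes its version as the tuple pycuda.VERSION):

python -c "import pycuda; print pycuda.VERSION"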

We can now write a simple test program (based on this demo)

#!/usr/bin/env python
import pycuda.driver as cuda
import pycuda.autoinit
from pycuda.compiler import SourceModule

import numpy
a = numpy.random.randn(4,4)
a = a.astype(numpy.float32)
a_gpu = cuda.mem_alloc(a.nbytes)
cuda.memcpy_htod(a_gpu, a)

mod = SourceModule("""
  __global__ void doublify(float *a)
  {
    int idx = threadIdx.x + threadIdx.y*4;
    a[idx] *= 2;
  }
  """, options=['-I/usr/include/cuda/','-std=c++11'])

func = mod.get_function("doublify")
func(a_gpu, block=(4,4,1))

a_doubled = numpy.empty_like(a)
cuda.memcpy_dtoh(a_doubled, a_gpu)
print a_doubled
print a

Note that this script differs from the sample in that it adds the options options=['-I/usr/include/cuda/','-std=c++11'] to the SourceModule command; this is needed to allow our local version of nvcc to work.

Running this program should give output similar to the following (the values are random per run, but the first matrix should be double the second).

[[-0.41994292  3.34337902 -0.07818903  0.75090253]
[ 1.59274638 -0.11435074 -1.85094416 -0.13969003]
[ 2.59167218  0.39977184 -1.58711851  0.61941791]
[-0.58775586  0.76065254 -0.30258772 -1.32978904]]
[[-0.20997146  1.67168951 -0.03909452  0.37545127]
[ 0.79637319 -0.05717537 -0.92547208 -0.06984501]
[ 1.29583609  0.19988592 -0.79355925  0.30970895]
[-0.29387793  0.38032627 -0.15129386 -0.66489452]]

virtualenv pyopencl

To install the package pyopencl we can use a combination of pip and building from source

virtualenv pyopencl
cd pyopencl/
source bin/activate
pip install numpy Mako

Now we clone the pyopencl source

git clone --recursive http://git.tiker.net/trees/pyopencl.git
cd pyopencl/
./configure.py --cl-lib-dir=/usr/lib64/nvidia/ --cl-enable-gl

Next we need to edit the siteconf.py file and modify it to look like this:

CL_TRACE = False
CL_ENABLE_GL = True
CL_PRETEND_VERSION = "1.1"
CL_USE_SHIPPED_EXT = True
CL_INC_DIR = []
CL_LIB_DIR = ['/usr/lib64/nvidia/']
CL_LIBNAME = ['OpenCL']
CXXFLAGS = ['-std=c++11']
LDFLAGS = ['-Wl,--no-as-needed']

In particular we need to fool the CL version (via CL_PRETEND_VERSION) as per the discussion here

Next we build and install

make -j 12
make install

To test we need to add the runtime to the LD_LIBRARY_PATH (this can be added to the .bashrc)

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib64/nvidia
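
As a quick sanity check that the runtime can now be found, we can list the available OpenCL platforms (a minimal sketch using pyopencl's get_platforms()):

python -c "import pyopencl as cl; print cl.get_platforms()"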

Finally, use this test program from here; make sure it is not run from inside the pyopencl source directory.

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from __future__ import absolute_import, print_function
import numpy as np
import pyopencl as cl

a_np = np.random.rand(150000).astype(np.float32)
b_np = np.random.rand(150000).astype(np.float32)

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)

mf = cl.mem_flags
a_g = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a_np)
b_g = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b_np)

prg = cl.Program(ctx, """
__kernel void sum(
    __global const float *a_g, __global const float *b_g, __global float *res_g)
{
  int gid = get_global_id(0);
  res_g[gid] = a_g[gid] + b_g[gid];
}
""").build()

res_g = cl.Buffer(ctx, mf.WRITE_ONLY, a_np.nbytes)
prg.sum(queue, a_np.shape, None, a_g, b_g, res_g)

res_np = np.empty_like(a_np)
cl.enqueue_copy(queue, res_np, res_g)

# Check on CPU with Numpy:
print(res_np - (a_np + b_np))
print(np.linalg.norm(res_np - (a_np + b_np)))
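
If everything is working, the printed difference array should be all zeros and the norm should print as 0.0.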