CTYPES Mini Tutorial

Donald Kenney (donaldkenney@gmail.com)
Last Update: Thu Sep 1 07:40:17 2016



Introduction

Ctypes is a Python package that allows Python programs to access "foreign" libraries/APIs. Although ctypes is specifically targeted at C-language libraries, C's diverse data types typically relate directly to hardware implementation features. That sometimes allows ctypes to be used to access libraries written for other languages such as FORTRAN.

Both a Tutorial and Reference Manual for ctypes exist.

When I needed to use ctypes, I read both and found them to be pretty rough going. Object-oriented programming does not come naturally to me and I tended to find myself lost in twisty mazes of types, objects, methods and attributes. A lot of text was devoted to things I don't think I needed to know and, conversely, critical information was hidden or buried -- apparently because it was assumed (incorrectly) that it would be obvious to me. My feeling was, and is, that ctypes doesn't have to be that hard. Thus this document.

I subsequently found a less obscure tutorial at http://www.devx.com/opensource/Article/33153

THE MINI-TUTORIAL

Ctypes is a Python package that attempts to bridge the gap between Python and Unix/Windows libraries that are written to be used from C, C++ or assembler. Three problems need to be handled: accessing libraries, passing parameters, and dealing with returned values.

There is a brief section on dealing with C formatted data from Python.

A final section deals briefly with approaches other than ctypes for accessing foreign libraries in C.

ACCESSING LIBRARIES:

This is pretty simple. For most purposes, the following should do the job:

       from ctypes import *      #Load the ctypes names into the local namespace
       x = CDLL(name)            #Create a CDLL object x that can be used to
                                 #access the named library
       r = x.function_name()     #Call a library function and stash the
                                 #value it returns in r

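There is a choice of four kinds of libraries:

    CDLL      libraries that use the standard C (cdecl) calling convention
    WinDLL    Windows libraries that use the stdcall calling convention
    OleDLL    Windows stdcall libraries whose functions return an HRESULT result code
    PyDLL     like CDLL, but the Python global interpreter lock is held during the call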

Just plug the appropriate name into the second (x = ...) line. x can be any arbitrary name that catches your fancy. Since it is unconstrained, multiple libraries can be accessed simultaneously by using different names.
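As a rough illustration of the steps above, here is a minimal sketch written for Python 3 on a Unix-like system. It assumes the C library can be located with ctypes.util.find_library, a helper not mentioned above; the names libc_name and libc are arbitrary.

```python
import ctypes
import ctypes.util

# Assumption: a Unix-like system. find_library("c") returns a file name such
# as "libc.so.6"; if it returns None, CDLL(None) falls back to the symbols
# already loaded into the running Python process.
libc_name = ctypes.util.find_library("c")
libc = ctypes.CDLL(libc_name)

# Functions are looked up by name as attributes of the library object.
# abs() takes an int and returns an int, which matches the ctypes defaults.
result = libc.abs(-5)
print(libc_name, result)
```

A second library could be loaded the same way under a different variable name and used side by side with this one.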

There are many variations that can be used if you know what you are doing. For example, on Windows, but not Unix, you may be able to access a library as an attribute of a library object (e.g. windll) built into ctypes without specifying the library file name. But you don't need this document if you know what you are doing.

There are a couple of special cases.

PASSING PARAMETERS:

For the most part, this isn't much harder than accessing libraries. The basic situation is that Python/ctypes knows nothing about the library function that you are calling except its name, where it is, and how it returns a single value -- often used as an operation success indicator. Except for OleDLL libraries, it doesn't even know for sure what the format of that single value is and assumes that it is an integer.

To exacerbate things, C has a multitude of data types related to various permutations of what the data is, the 'size' of the variables, whether numbers are signed or unsigned, and whether strings are ASCII or Unicode. They range from single 8 bit chars to pointers to Unicode strings. Get the C data type for any parameter wrong and bad things will probably happen. Python has very different data types from C. Some Python types don't map to C data types at all, and those that do typically have a many-to-one or one-to-many relationship.

It is up to the ctypes user to tell Python how many parameters to call library functions with and what the nature of each parameter is. The programmer is going to have to look up the format of the function call in the documentation or the code. For the most part, only a handful of ordinary Python data types -- int, long, str, unicode (Unicode string), and None -- can be used directly. And they can only be used to pass data to a library function, not to retrieve data. Other types of data must be explicitly declared via ctypes. Ctypes allows 17 C variable types to be explicitly used:

    ctypes type    C type                                    Python type
    c_char         char                                      1-character string
    c_wchar        wchar_t                                   1-character Unicode string
    c_byte         char                                      int/long
    c_ubyte        unsigned char                             int/long
    c_short        short                                     int/long
    c_ushort       unsigned short                            int/long
    c_int          int                                       int/long
    c_uint         unsigned int                              int/long
    c_long         long                                      int/long
    c_ulong        unsigned long                             int/long
    c_longlong     __int64 or long long                      int/long
    c_ulonglong    unsigned __int64 or unsigned long long    int/long
    c_float        float                                     float
    c_double       double                                    float
    c_char_p       char * (NUL terminated)                   string or None
    c_wchar_p      wchar_t * (NUL terminated)                Unicode or None
    c_void_p       void *                                    int/long or None
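To make the table a little more concrete, the following sketch creates a few of these objects and reads them back. It is written for Python 3, where c_char_p expects a byte string (b"...") rather than an ordinary string; the variable names are arbitrary.

```python
from ctypes import c_char_p, c_double, c_int, c_ubyte, sizeof

n = c_int(42)            # a C int holding 42
d = c_double(2.5)        # a C double holding 2.5
s = c_char_p(b"hello")   # a char * aimed at a NUL terminated byte string

# The .value attribute converts back to an ordinary Python object.
print(n.value, d.value, s.value)

# ctypes objects are mutable: assigning to .value changes the C data.
n.value = 7

# Integer types wrap around silently on overflow instead of raising an error.
b = c_ubyte(300)
print(n.value, b.value, sizeof(c_double))
```

The silent wraparound in the last step is one reason that picking the wrong C type tends to produce wrong answers rather than error messages.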

So, for the most part, calling a library function once the library has been loaded is just a matter of setting up variables with appropriate layouts -- None, int, long, str or one of the ctypes types listed above; making sure that the variables have data in them; and invoking the function as an attribute of the library object, e.g.:

    from ctypes import *                              #Load ctypes module
    x = CDLL("libc.so.6")      #create a library object for the C library
    print x.time(None)                     #Get the time and print it out
    x.printf("%s %d\n", "this is a test", 3)             #print something
  
    #But to use a type other than int, str, long, None we need a Ctype
    x.printf("%s %f\n", "this is a test", c_double(3.14))

You can define Ctype objects explicitly as well as in line as I did with the c_double object above. Manipulating them in Python seems less than straightforward. Once they are initialized doing anything other than printing their value or passing them to a function requires working with their value. Finding where Python has stashed the value can be annoying and/or aggravating. It may be in a .value attribute, or a .contents attribute, or someplace else. It may be accessible via subscript[0]. More on that in the next section on dealing with values returned by library functions.

It is possible to set a library function's argtypes attribute to specify the number of parameters and the proper Ctype for each parameter of the library function. If that is done, calls to that function will be legality checked and type conversions -- where that is possible -- can be done automatically by ctypes.
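A small sketch of declaring a function's parameter list, using the C library's strlen as a stand-in for whatever function is actually being called. It assumes Python 3 on a Unix-like system where find_library can locate the C library.

```python
import ctypes
import ctypes.util

# Assumption: Unix-like system (see the comment on CDLL(None) fallback below).
# If find_library returns None, CDLL(None) uses the symbols already loaded.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C prototype: size_t strlen(const char *s)
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

length = libc.strlen(b"this is a test")

# With argtypes set, an argument of the wrong kind is rejected by ctypes
# before the C code ever runs.
try:
    libc.strlen(3.14)
    rejected = False
except ctypes.ArgumentError:
    rejected = True

print(length, rejected)
```

Without the argtypes declaration, the bad call would have been passed along to the library with unpredictable results.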

Again, there are several special cases.

DEALING WITH RETURNED VALUES:

Some functions do things. Others return information. Some do both. In general, most library functions will return a single value. This single value will often be a result code, but may be a computed value. It is accessed by assigning the result of the function call to a variable, e.g.:

    from ctypes import *                              #Load ctypes module
    x = CDLL("libc.so.6")      #create a library object for the C library
    y = x.time(None)

y now contains the returned value which -- in the case of the time function -- is the current time in seconds from epoch. If the returned value is not an integer, the function's restype attribute can be used to specify an appropriate ctypes type.
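For a function that returns something other than an integer, a sketch along these lines may help. It uses the C library's atof, which returns a double, and assumes Python 3 on a Unix-like system.

```python
import ctypes
import ctypes.util

# Assumption: Unix-like system; CDLL(None) is used if find_library finds nothing.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# C prototype: double atof(const char *nptr)
libc.atof.argtypes = [ctypes.c_char_p]
# Without this line ctypes would treat the returned double as an int.
libc.atof.restype = ctypes.c_double

value = libc.atof(b"3.14")
print(value)
```

Once restype is set, the call hands back an ordinary Python float, so no .value lookup is needed.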

What about functions that need to return multiple, complex, or variable length data? That's done by passing pointers. In some cases, the pointer may be to a buffer in the calling program. In others, the calling program passes an empty pointer and gets back a pointer to a buffer in the library's data area. Call by value parameters are never modified and bad things will surely result if a call by value parameter is specified where the function is looking for a pointer. The byref function also passes a pointer, but the object it creates can only be used as a function call argument. When a pointer object that Python itself can work with is required, the pointer function is what is needed.
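As one possible illustration of getting a second result back through a pointer, this sketch calls the C library's strtol, which returns a number and also stores a pointer to the unparsed remainder of the string. It assumes Python 3 on a Unix-like system and uses ctypes' byref function to pass the address of an empty char * variable.

```python
import ctypes
import ctypes.util

# Assumption: Unix-like system; CDLL(None) is used if find_library finds nothing.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# C prototype: long strtol(const char *str, char **endptr, int base)
libc.strtol.argtypes = [ctypes.c_char_p,
                        ctypes.POINTER(ctypes.c_char_p),
                        ctypes.c_int]
libc.strtol.restype = ctypes.c_long

text = b"42 apples"        # keep a reference so the bytes stay alive
end = ctypes.c_char_p()    # an empty (NULL) char * for the function to fill in

number = libc.strtol(text, ctypes.byref(end), 10)
rest = end.value           # the bytes starting where parsing stopped
print(number, rest)
```

Passing end itself instead of byref(end) would hand the function a NULL pointer by value, and nothing could come back through it.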

If the function returns (a) pointer(s) to its data space, you can simply define an appropriate pointer type (using POINTER) and proceed directly to the next section on dealing with C formatted data. If the function wants a pointer to Python's address space, you need to make sure that there is a usable Python accessible variable with the proper size and layout at the address that is pointed to. In the case of strings (which are not mutable (i.e. alterable) objects in Python), ctypes has a create_string_buffer utility function that can be used to create a string buffer that the library can modify.
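A sketch of the string buffer case, using the C library's getcwd to fill a buffer that Python owns. It assumes Python 3 on a Unix-like system; the 4096 byte buffer size is an arbitrary choice for illustration.

```python
import ctypes
import ctypes.util
import os

# Assumption: Unix-like system; CDLL(None) is used if find_library finds nothing.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# C prototype: char *getcwd(char *buf, size_t size)
libc.getcwd.argtypes = [ctypes.c_char_p, ctypes.c_size_t]
libc.getcwd.restype = ctypes.c_char_p

# A writable, zero filled buffer of 4096 bytes that the library may modify.
buf = ctypes.create_string_buffer(4096)

ok = libc.getcwd(buf, ctypes.sizeof(buf))   # None would signal failure
cwd = buf.value                             # bytes up to the first NUL
print(ok is not None, cwd == os.fsencode(os.getcwd()))
```

Telling the function how large the buffer is, as the second argument does here, is what keeps the library from writing past the end of it.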

Ctypes provides Structure and Union classes and a capability to specify bit field subsets of integers. These can be used to define complex data structures if they are really needed. See the ctypes Tutorial sections 1.10-1.12 for details.
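A minimal Structure sketch, using the C library's div function, which returns a small struct by value. It assumes Python 3 on a Unix-like system, and it assumes the div_t member order quot, rem used by common C libraries (the C standard does not fix the order). DivResult is a made-up class name.

```python
import ctypes
import ctypes.util

# Assumption: Unix-like system; CDLL(None) is used if find_library finds nothing.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

class DivResult(ctypes.Structure):
    # Mirrors C's div_t, assumed here to be: struct { int quot; int rem; }
    _fields_ = [("quot", ctypes.c_int),
                ("rem", ctypes.c_int)]

# C prototype: div_t div(int numerator, int denominator)
libc.div.argtypes = [ctypes.c_int, ctypes.c_int]
libc.div.restype = DivResult

r = libc.div(17, 5)
print(r.quot, r.rem)
```

Fields declared with simple types such as c_int are read back as ordinary Python numbers.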

DEALING WITH C FORMATTED DATA

When a library function returns data other than the single value that can be assigned from the function call, the data will be laid out in C format, not Python format. The data layout can be described using Ctype objects. Unfortunately, using Ctype objects in Python is not as easy as one might hope because even basic operations like addition or type conversion apparently are not defined for Ctype objects. If, for example, a has type c_int, the expression int(a) is not legal, although int(a.value) is.

To access c type variables, one apparently needs to use/set the object's value or contents attributes or for objects accessed by pointers, the [0] instance of the pointer. For example:

    x = c_int(2)
    y = c_int(3)
    z = x.value + y.value
    print x,y,z
      c_long(2) c_long(3) 5
    z = c_int(y.value**2)
    print z
      c_long(9)

I have yet to find a description of exactly what value and contents are and why there are two different attributes. The ctypes documentation seems to assume that the nature will be obvious. (In fact, I didn't see the value attribute mentioned at all although I imagine that it is there somewhere.) About all I know at this point is that value often works as does subscripting pointers with [0]. But sometimes contents or contents.value(???) has to be used instead. Moreover the .contents attribute doesn't always exist including some cases that seem to me to match examples of .contents usage found on the Internet. I'll update this paragraph if further understanding ever comes to me.
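For what it is worth, the following sketch shows how the attributes behave in Python 3 for one simple case: a c_int and a pointer to it. It is only a demonstration of that case, not a full account of every ctypes object.

```python
from ctypes import c_int, pointer

x = c_int(2)
p = pointer(x)              # a ctypes pointer object aimed at x

# A simple type such as c_int carries its Python value in .value
# and has no .contents attribute at all.
has_contents = hasattr(x, "contents")

# A pointer object has .contents, which is the ctypes object pointed at;
# that object's own .value gives the Python number.
via_contents = p.contents.value

# Subscripting a pointer to a simple type returns the Python value directly.
via_index = p[0]

# Writing through the pointer changes the original object.
p[0] = 7
print(has_contents, via_contents, via_index, x.value)
```

In this case, at least, .contents belongs to pointers and .value belongs to the thing pointed at, which may explain why .contents is sometimes missing.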

Ctypes also has a cast(obj, type) function that returns a new pointer of the given pointer type aimed at the same memory as obj. It can be used to change C variable types when calling C functions.
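A short sketch of cast in Python 3, viewing a character buffer as a run of unsigned bytes. No external library is needed; the variable names are arbitrary.

```python
from ctypes import POINTER, c_ubyte, cast, create_string_buffer

buf = create_string_buffer(b"ABC")        # char[4]: 'A', 'B', 'C', NUL

# The same memory, now viewed through an unsigned char * pointer.
as_bytes = cast(buf, POINTER(c_ubyte))
codes = [as_bytes[i] for i in range(3)]

# No copy was made, so a write through the new pointer shows up in buf.
as_bytes[0] = ord("Z")
print(codes, buf.value)
```

Note that a pointer carries no length information, so the loop has to know on its own that only three bytes are of interest.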

The normal approach to retrieving data from a library function is to construct a Structure and/or Union(s) of Ctype objects with the proper layout. Then set up a pointer to the Structure and pass that to the function. Upon return from the function, the data in the structure can be accessed using the contents and/or value attributes of the ctypes. This should work whether the data is stored within Python or the called function. If the data is in memory owned by the function, the pointer will have been quietly switched to point to it.
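The approach can be sketched as follows, with the C library's memcpy standing in for a library function that fills in a structure through a pointer. It assumes Python 3 on a Unix-like system; the Point class and its two int fields are invented purely for the example.

```python
import ctypes
import ctypes.util

# Assumption: Unix-like system; CDLL(None) is used if find_library finds nothing.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

class Point(ctypes.Structure):
    # Hypothetical layout, equivalent to C's: struct { int x; int y; }
    _fields_ = [("x", ctypes.c_int),
                ("y", ctypes.c_int)]

# C prototype: void *memcpy(void *dest, const void *src, size_t n)
libc.memcpy.argtypes = [ctypes.c_void_p, ctypes.c_void_p, ctypes.c_size_t]
libc.memcpy.restype = ctypes.c_void_p

source = (ctypes.c_int * 2)(10, 20)   # raw C data: two consecutive ints
pt = Point()                          # storage owned by Python, zero filled

# The function writes into pt through the pointer it is given.
libc.memcpy(ctypes.byref(pt), source, ctypes.sizeof(pt))
print(pt.x, pt.y)
```

Here the structure's fields can be read directly as pt.x and pt.y after the call, without going through .value or .contents.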

Adequate storage must be allocated for any fixed or variable length information that will be stored in Python.

There is a set of Ctype utility functions described in section 6 of the Ctypes Reference. I'll list them here. See the manual for more data should you need to use one of these.

OTHER APPROACHES TO FOREIGN LIBRARIES

Ctypes is one of a number of approaches to accessing foreign libraries from Python. Other approaches include: SWIG, Cython/Pyrex, Boost.Python and Psyco.

The principal advantages of ctypes are that it is built into Python 2.5 and later, that it is portable to non-x86 architectures, and that it does not require recompilation of existing libraries. The principal disadvantages seem to be the possible need to define complex data structures by hand and the fact that errors in doing so will likely make themselves known by crashing Python.

Other solutions tend to require recompiling the target C code which is a problem if it is not available. There is a discussion of various approaches at http://www.msri.org/web/msri/for-visitors/computing-handbook.


Copyright 2006-2012 Donald Kenney (Donald.Kenney@GMail.com). Unless otherwise stated, permission is hereby granted to use any materials on these pages under the Creative Commons License V2.5.
