CTYPES Mini Tutorial

Donald Kenney (donaldkenney@gmail.com)
Last Update: Tue May 9 02:57:48 2023

Introduction
THE MINI-TUTORIAL
DEALING WITH C FORMATTED DATA
OTHER APPROACHES TO FOREIGN LIBRARIES

Introduction

Ctypes is a Python package that allows Python programs to access "foreign" libraries/APIs. Although ctypes is specifically targeted at C-language libraries, C's diverse data types typically relate directly to hardware implementation features. That sometimes allows ctypes to be used to access libraries written for other languages such as FORTRAN.

Both a Tutorial and Reference Manual for ctypes exist.

When I needed to use ctypes, I read both and found them to be pretty rough going. Object oriented programming does not come naturally to me and I tended to find myself lost in twisty mazes of types, objects, methods and attributes. A lot of text was devoted to things I don't think I needed to know and conversely, critical information was hidden or buried -- apparently because it was assumed (incorrectly) that it would be obvious to me. My feeling was, and is, that ctypes doesn't have to be that hard. Thus this document.

I subsequently found a less obscure tutorial at http://www.devx.com/opensource/Article/33153

THE MINI-TUTORIAL

Ctypes is a Python package that attempts to bridge the gap between Python and Unix/Windows libraries that are written to be used from C, C++ or assembler. Three problems need to be handled.

The first is providing access to the libraries that typically have numerous entry points, often with complex calling conventions that vary from one entry point to the next.
The second problem is that of passing parameters to the library functions.
The third problem is that of accessing and using information returned by those library functions that return information.

There is a brief section on dealing with C formatted data from Python.

A final section deals briefly with approaches other than ctypes for accessing foreign libraries in C.

ACCESSING LIBRARIES: This is pretty simple. For most purposes, the following should do the job:

       from ctypes import *      #Load the ctypes names into the local
                                 #namespace
       x = CDLL(name)            #Create a CDLL object x that can be used to
                                 #access the named library
       r = x.function_name()     #call a library function and stash the
                                 #value it returns in r

There is a choice of four kinds of libraries:

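CDLL libraries use the standard C (cdecl) calling convention. That covers Unix shared libraries and also Windows DLLs built with cdecl.
WinDLL libraries are Windows libraries that use the stdcall calling convention and do not return an HRESULT code.
OleDLL libraries are Windows libraries that use stdcall and do return an HRESULT code.
The PyDLL library is the Python API.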

Just plug the appropriate name into the second (x = ...) line. x can be any arbitrary name that catches your fancy. Since it is unconstrained, multiple libraries can be accessed simultaneously by using different names.
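Here is a minimal runnable sketch of the three steps above. It assumes Python 3 on a Unix-like system. The name libc is just my choice for the library object, and find_library is used so that the exact file name of the C library does not have to be hard coded.

```python
import ctypes
import ctypes.util

# find_library("c") returns something like "libc.so.6" on Linux.
# If it returns None, CDLL(None) falls back to the symbols already
# loaded into the running process (Unix-like systems only).
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Call int abs(int) from the C library and stash the result in r.
r = libc.abs(-5)
print(r)
```

Nothing has to be declared for abs because it takes an int and returns an int, which is what ctypes assumes by default.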

There are many variations that can be used if you know what you are doing. For example, on Windows, but not Unix, you may be able to access a library as an attribute of a library object (e.g. windll) built into ctypes without specifying the library file name. But you don't need this document if you know what you are doing.

There are a couple of special cases.

A few libraries export functions by number (ordinal) rather than by name. This is handled by subscripting the library object, e.g. x[1].
Some libraries export variables as well as functions. Consult the real Tutorial if you need to access these.

PASSING PARAMETERS:

For the most part, this isn't much harder than accessing libraries. The basic situation is that Python/ctypes knows nothing about the library function that you are calling except its name, where it is, and how it returns a single value -- often used as an operation success indicator. Except for OleDLL libraries, it doesn't even know for sure what the format of that single value is and assumes that it is an integer.

To exacerbate things, C has a multitude of data types related to various permutations of what the data is, the 'size' of the variables, whether numbers are signed or unsigned, and whether strings are ASCII or Unicode. They range from single 8 bit chars to pointers to Unicode strings. Get the C data type for any parameter wrong and bad things will probably happen. Python has very different data types from C. Some Python types don't map to C data types at all, and those that do typically have a many-to-one or one-to-many relationship.

It is up to the ctypes user to tell Python how many parameters to call library functions with and what the nature of each parameter is. The programmer is going to have to look up the format of the function call in the documentation or the code. For the most part, only a handful of ordinary Python data types can be used directly: int, long, str, unicode, and None in Python 2, or int, bytes, str, and None in Python 3. And they can only be used to pass data to a library function, not to retrieve data. Other types of data must be explicitly declared via ctypes. Ctypes allows 17 C variable types to be explicitly used:

ctypes type    C type                                    Python type
c_char         char                                      1-character string
c_wchar        wchar_t                                   1-character Unicode string
c_byte         char                                      int/long
c_ubyte        unsigned char                             int/long
c_short        short                                     int/long
c_ushort       unsigned short                            int/long
c_int          int                                       int/long
c_uint         unsigned int                              int/long
c_long         long                                      int/long
c_ulong        unsigned long                             int/long
c_longlong     __int64 or long long                      int/long
c_ulonglong    unsigned __int64 or unsigned long long    int/long
c_float        float                                     float
c_double       double                                    float
c_char_p       char * (NUL terminated)                   string or None
c_wchar_p      wchar_t * (NUL terminated)                Unicode or None
c_void_p       void *                                    int/long or None

The Python types shown are the Python 2 names. In Python 3, int/long is simply int, c_char and c_char_p work with bytes objects, and c_wchar and c_wchar_p work with str.
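To make the table a little more concrete, here is a small sketch that creates a few of these types and reads them back. It assumes Python 3, so the char * example uses a bytes literal. The variable names are only for illustration.

```python
from ctypes import c_char_p, c_double, c_int, c_ubyte, c_wchar_p, sizeof

i = c_int(42)               # C int
d = c_double(3.5)           # C double
s = c_char_p(b"hello")      # char * to a NUL terminated byte string
w = c_wchar_p("hello")      # wchar_t * to a NUL terminated Unicode string
wrapped = c_ubyte(256)      # ctypes does no overflow checking: 256 wraps to 0

# The Python value of each object is read through its .value attribute.
print(i.value, d.value, s.value, w.value, wrapped.value)

# sizeof reports the size in bytes of a type (or of an object).
print(sizeof(c_ubyte), sizeof(c_double))
```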

So, for the most part, calling a library function once the library has been loaded is just a matter of setting up variables with appropriate layouts -- None, int, long, str or one of the ctypes listed above; making sure that the variables have data in them; and invoking the function as an attribute of the library object. e.g.:

    from ctypes import *                              #Load ctypes module
    x = CDLL("libc.so.6")      #create a library object for the C library
    print(x.time(None))                    #Get the time and print it out
    x.printf(b"%s %d\n", b"this is a test", 3)           #print something

    #Byte strings (b"...") are passed as char * in Python 2.6+ and Python 3
    #But to use a type other than int, a string, or None we need a Ctype
    x.printf(b"%s %f\n", b"this is a test", c_double(3.14))

You can define Ctype objects explicitly as well as in line as I did with the c_double object above. Manipulating them in Python seems less than straightforward. Once they are initialized, doing anything other than printing their value or passing them to a function requires working with their value. Finding where Python has stashed the value can be annoying and/or aggravating. It may be in a .value attribute, or a .contents attribute, or someplace else. It may be accessible via the subscript [0]. More on that in the next section on dealing with values returned by library functions.

It is possible to set a library function's argtypes attribute to a list of Ctypes that specifies the number of parameters and the proper Ctype for each one. If that is done, calls to that function will be legality checked and type conversions -- where that is possible -- can be done automatically by ctypes.
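A short sketch of what that checking looks like in practice, assuming Python 3 on a Unix-like system. It declares the single int parameter of the C library's abs function and then deliberately passes something that cannot be converted.

```python
import ctypes
import ctypes.util

libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the parameter list of: int abs(int)
libc.abs.argtypes = [ctypes.c_int]

# A Python int is converted to a C int automatically.
result = libc.abs(-7)

# A str cannot be converted to a C int, so ctypes refuses the call
# instead of handing garbage to the library.
try:
    libc.abs("seven")
    rejected = False
except ctypes.ArgumentError:
    rejected = True

print(result, rejected)
```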

Again, there are several special cases.

Callback parameters can be done with ctypes, but they are complicated because you have Python calling C which then calls Python which then possibly returns to C. Handling callbacks involves the use of the ctypes CFUNCTYPE function. You'll probably have to check the documentation for details. Fortunately, callbacks aren't all that common.
Pointers are more common. Strings are actually handled invisibly by ctypes by passing pointers. Unfortunately, pointers can't always be invisible. You may sometimes have to deal explicitly with pointer parameters. With ctypes, pointers can be created with a function called pointer. It is invoked as 'pointer(x)' where x is a Ctype object (e.g. c_int(5)); a plain Python int has to be wrapped in a Ctype first. There is also a function called byref that creates a lightweight pointer that can only be used as a function parameter. The called function can modify the object through either kind of pointer, so byref is usually enough. Use pointer when you need a pointer object that you can examine in Python afterwards. There is another function called POINTER which is NOT the same as pointer. It creates a pointer type (e.g. POINTER(c_int)) rather than a pointer, and calling that type with no arguments gives an empty (NULL) pointer. It appears to be useful when accessing data structures in the C program address space. Subscripting pointers is allowed. It appears to be useful for accessing arrays of C formatted information where [0] is the first entry in the array and [n] is the entry n positions after it.
We'll see more about pointers when we deal with returned values in the next section.
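The sketch below tries out both special cases, assuming Python 3 on a Unix-like system. The first half exercises pointer and POINTER. The second half passes a Python function to the C library's qsort as a callback, following the pattern used in the ctypes Tutorial. The names py_compare, CMPFUNC and IntPtr are mine, not part of ctypes.

```python
import ctypes
import ctypes.util
from ctypes import CFUNCTYPE, POINTER, c_int, c_size_t, c_void_p, pointer, sizeof

libc = ctypes.CDLL(ctypes.util.find_library("c"))

# --- pointers ---
n = c_int(5)
p = pointer(n)             # a real pointer object that refers to n
p.contents.value = 6       # writing through the pointer changes n
first = p[0]               # subscript 0 reads the value pointed to

IntPtr = POINTER(c_int)    # POINTER builds a pointer *type*
null_ptr = IntPtr()        # calling the type with no arguments gives NULL

# --- callback ---
# The comparison function qsort expects: int cmp(const int *, const int *)
CMPFUNC = CFUNCTYPE(c_int, POINTER(c_int), POINTER(c_int))

def py_compare(a, b):
    # a and b arrive as pointers to int; [0] dereferences them
    return a[0] - b[0]

compare = CMPFUNC(py_compare)   # keep a reference while C may call it

libc.qsort.argtypes = [c_void_p, c_size_t, c_size_t, CMPFUNC]
libc.qsort.restype = None

values = (c_int * 5)(5, 1, 7, 33, 99)
libc.qsort(values, len(values), sizeof(c_int), compare)
sorted_values = list(values)
print(n.value, first, bool(null_ptr), sorted_values)
```

The one trap I know of with callbacks is that the CFUNCTYPE object has to stay alive for as long as the C code might call it, which is why it is kept in the compare variable here.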

DEALING WITH RETURNED VALUES:

Some functions do things. Others return information. Some do both. In general, most library functions will return a single value. This single value will often be a result code, but may be a computed value. It is accessed by assigning the result of the function call to a variable, e.g.

    from ctypes import *                              #Load ctypes module
    x = CDLL("libc.so.6")      #create a library object for the C library
    y = x.time(None)                #call time and keep the value returned

y now contains the returned value which -- in the case of the time function -- is the current time in seconds from epoch. If the returned value is not an integer, the function's restype attribute can be used to specify an appropriate type or c_type.
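As an illustration of restype, here is a sketch that calls the C library's atof, which returns a double rather than an int. It assumes Python 3 on a Unix-like system. Without the restype line, ctypes would try to interpret the result as an int and the answer would be garbage.

```python
import ctypes
import ctypes.util

libc = ctypes.CDLL(ctypes.util.find_library("c"))

# double atof(const char *nptr)
libc.atof.argtypes = [ctypes.c_char_p]
libc.atof.restype = ctypes.c_double   # tell ctypes the result is a double

value = libc.atof(b"2.5")
print(value)
```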

What about functions that need to return multiple, complex, or variable length data? That's done by passing pointers. In some cases, the pointer may be to a buffer in the calling program. In others, the calling program passes an empty pointer and gets back a pointer to a buffer in the library's data area. Call by value parameters are never modified and bad things will surely result if a call by value parameter is specified where the function is looking for a pointer. Either pointer or byref will do for passing such a pointer. Byref is the cheaper choice when the pointer is only needed for the call, and the function can still write through it. Use pointer when you want a pointer object that you can keep and inspect in Python.
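Here is a sketch of a function that hands back a second result through a pointer parameter, assuming Python 3 on a Unix-like system. The C library's strtol returns the converted number and also stores, through its second parameter, a pointer to the first character it could not convert. The variable names are illustrative.

```python
import ctypes
import ctypes.util
from ctypes import POINTER, byref, c_char_p, c_int, c_long

libc = ctypes.CDLL(ctypes.util.find_library("c"))

# long strtol(const char *nptr, char **endptr, int base)
libc.strtol.argtypes = [c_char_p, POINTER(c_char_p), c_int]
libc.strtol.restype = c_long

text = b"123abc"          # kept in a variable so the buffer stays alive
end = c_char_p()          # empty pointer that strtol will fill in

number = libc.strtol(text, byref(end), 10)
rest = end.value          # the unconverted tail of the string

print(number, rest)
```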

If the function returns (a) pointer(s) to its data space, you can simply define an appropriate pointer (POINTER?) and proceed directly to the next section on dealing with C formatted data. If the function wants a pointer to Python's address space, you need to make sure that there is a usable Python accessible variable with the proper size and layout at the address that is pointed to. In the case of strings (which are not mutable (i.e. alterable) objects in Python), ctypes has a create_string_buffer utility function that can be used to create a string buffer that the library can modify.
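A sketch of the string buffer case, assuming Python 3 on a Unix-like system. The C library's strcpy writes into a buffer that Python owns, and the result is then read back from Python.

```python
import ctypes
import ctypes.util
from ctypes import c_char_p, create_string_buffer, sizeof

libc = ctypes.CDLL(ctypes.util.find_library("c"))

# char *strcpy(char *dest, const char *src)
libc.strcpy.argtypes = [c_char_p, c_char_p]
libc.strcpy.restype = c_char_p

buf = create_string_buffer(32)       # 32 bytes, initially all NUL
libc.strcpy(buf, b"written by C")    # the library modifies the buffer

text = buf.value                     # bytes up to the first NUL
size = sizeof(buf)                   # the full size of the buffer
print(text, size)
```

It is up to the caller to make the buffer big enough. The library has no idea how much room it was given.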

Ctypes provides Structure and Union classes and a capability to specify bit field subsets of integers. These can be used to define complex data structures if they are really needed. See the ctypes Tutorial sections 1.10-1.12 for details.
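For a feel of what these definitions look like, here is a small sketch in Python 3. The Point, Flags and Word classes are made up for the example and do not correspond to any real library.

```python
from ctypes import Structure, Union, c_int, c_ubyte, c_uint, c_uint32, sizeof

class Point(Structure):
    # Equivalent to: struct { int x; int y; }
    _fields_ = [("x", c_int), ("y", c_int)]

class Flags(Structure):
    # Bit fields: a third tuple element gives the width in bits
    _fields_ = [("ready", c_uint, 1), ("mode", c_uint, 3)]

class Word(Union):
    # The same four bytes seen as one integer or as four separate bytes
    _fields_ = [("as_int", c_uint32), ("as_bytes", c_ubyte * 4)]

pt = Point(3, 4)
flags = Flags(ready=1, mode=5)

w = Word()
w.as_int = 0x01020304
byte_sum = sum(w.as_bytes)   # 1 + 2 + 3 + 4 whatever the byte order

print(pt.x, pt.y, flags.ready, flags.mode, byte_sum, sizeof(Word))
```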

DEALING WITH C FORMATTED DATA

When a library function returns data other than the single parameter that can be assigned from the function call, the data will be laid out in C format, not Python format. The data layout can be described using Ctype objects. Unfortunately, using Ctype objects in Python is not as easy as one might hope because even basic operations like addition or type conversion apparently are not defined for Ctype objects. If, for example, a has type c_int, the expression int(a) is not legal.

To access c type variables, one apparently needs to use/set the object's value or contents attributes or for objects accessed by pointers, the [0] instance of the pointer. For example:

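    x = c_int(2)
    y = c_int(3)
    z = x.value + y.value
    print(x, y, z)
      #prints c_long(2) c_long(3) 5 where int and long are the same
      #size, otherwise c_int(2) c_int(3) 5
    z = c_int(y.value**2)
    print(z)
      #prints c_long(9) or c_int(9)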

I have yet to find a description of exactly what value and contents are and why there are two different attributes. The ctypes documentation seems to assume that the nature will be obvious. (In fact, I didn't see the value attribute mentioned at all although I imagine that it is there somewhere). About all I know at this point is that value often works as does subscripting pointers with [0]. But sometimes contents or contents.value(???) has to be used instead. Moreover the .contents attribute doesn't always exist including some cases that seem to me to match examples of .contents usage found on the Internet. I'll update this paragraph if further understanding ever comes to me.
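For what it is worth, the following Python 3 sketch shows the pattern as far as I can make it out from experiment: simple types such as c_int carry a value attribute, while pointer objects carry a contents attribute that gives back the Ctype object being pointed at. Treat it as an observation rather than a definitive rule.

```python
from ctypes import c_int, pointer

n = c_int(7)
p = pointer(n)

direct = n.value               # simple types expose their Python value here
target = p.contents            # pointers expose the Ctype object pointed to
via_contents = p.contents.value
via_subscript = p[0]           # same value, obtained by subscripting

# The attributes do not cross over between the two kinds of object.
int_has_contents = hasattr(n, "contents")
ptr_has_value = hasattr(p, "value")

# Assigning through contents changes the original object.
p.contents.value = 8
changed = n.value

print(direct, via_contents, via_subscript, int_has_contents, ptr_has_value, changed)
```

If that holds up, it would explain why .contents "doesn't always exist": it is simply not there on anything that is not a pointer.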

Ctypes also has a cast(obj, type) function that returns obj reinterpreted as a pointer of a different Ctype. It can be used to change C variable types when setting up C function calls.
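A minimal Python 3 sketch of cast, turning an array of C ints into a pointer to int that can then be subscripted. The names are illustrative.

```python
from ctypes import POINTER, c_int, cast

arr = (c_int * 3)(10, 20, 30)        # a C array of three ints

int_ptr = cast(arr, POINTER(c_int))  # view the same memory as an int *
second = int_ptr[1]                  # subscripting walks along the array
all_three = int_ptr[0:3]             # a slice comes back as a Python list

print(second, all_three)
```

Note that a pointer does not know how long the array is, so nothing stops you from subscripting past the end.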

The normal approach to retrieving data from a library function is to construct a Structure and/or Union(s) of Ctype objects with the proper layout. Then set up a pointer to the Structure and pass that to the function. Upon return from the function, the data in the structure can be accessed using the contents and/or value attributes of the ctypes. This should work whether the data is stored within Python or the called function. If the data is in memory owned by the library, the function hands back a pointer to it, either as its return value or through a pointer parameter, and that pointer can be used in the same way.
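Here is a sketch of that approach using the C library's gettimeofday, which fills in a structure supplied by the caller. It assumes Python 3 on 64-bit Linux, where struct timeval is two longs. On other platforms the field types may differ, so check the headers before trusting this layout. TimeVal is my name for the Python class.

```python
import ctypes
import ctypes.util
from ctypes import POINTER, Structure, c_int, c_long, c_void_p, pointer

libc = ctypes.CDLL(ctypes.util.find_library("c"))

class TimeVal(Structure):
    # Assumed layout of struct timeval on 64-bit Linux
    _fields_ = [("tv_sec", c_long), ("tv_usec", c_long)]

# int gettimeofday(struct timeval *tv, void *tz)
libc.gettimeofday.argtypes = [POINTER(TimeVal), c_void_p]
libc.gettimeofday.restype = c_int

tv = TimeVal()                              # storage owned by Python
rc = libc.gettimeofday(pointer(tv), None)   # the library fills it in

# Structure fields of simple types read back as plain Python values.
print(rc, tv.tv_sec, tv.tv_usec)
```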

Adequate storage must be allocated for any fixed or variable length information that will be stored in Python.

There is a set of Ctype utility functions described in section 6 of the Ctypes Reference. I'll list them here. See the manual for more data should you need to use one of these.

addressof(obj) -- return address of memory buffer
alignment(obj_or_type) -- get alignment requirements
byref(obj) -- construct a lightweight pointer to obj
cast(obj, type) -- return obj reinterpreted as a pointer of a different Ctype
create_string_buffer(size or init,size) -- create a mutable character buffer
create_unicode_buffer(size or init,size) -- create a mutable Unicode character buffer
DllCanUnloadNow() -- windows only
DllGetClassObject() -- windows only
FormatError(code) -- windows only; return error description
GetLastError() -- windows only; return last error code
memmove(dst,src,count) -- copy count bytes from src to dst
memset(dst, c, count) -- fill count bytes at dst with the value c
POINTER(type) -- create a new ctypes pointer type
pointer(obj) -- create a pointer to obj
resize(obj, size) -- resize the memory buffer of obj
set_conversion_mode(encoding, errors) -- set mode for Unicode conversions (Python 2 only)
sizeof(obj_or_type) -- return size in bytes
string_at(address,size or address) -- retrieve string from address
WinError(code=None, descr=None) -- windows only
wstring_at(address) -- retrieve Unicode string from address
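A few of these are easy to try out without any foreign library at all. The following Python 3 sketch pokes at a string buffer with sizeof, addressof, string_at, memset and memmove. The variable names are illustrative.

```python
from ctypes import addressof, create_string_buffer, memmove, memset, sizeof, string_at

buf = create_string_buffer(b"abc", 8)   # 8 byte buffer holding "abc" plus NULs

size = sizeof(buf)                      # total size of the buffer in bytes
copy = string_at(addressof(buf), 3)     # read 3 bytes starting at buf's address

memset(buf, ord("x"), 3)                # overwrite the first 3 bytes with 'x'
after_memset = buf.value

memmove(buf, b"hi", 2)                  # copy 2 bytes over the start of buf
after_memmove = buf.value

print(size, copy, after_memset, after_memmove)
```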

OTHER APPROACHES TO FOREIGN LIBRARIES

Ctypes is one of a number of approaches to accessing foreign libraries from Python. Other approaches include: SWIG, Cython/Pyrex, Boost.Python and Psyco.

The principal advantages of ctypes are that it is built into Python 2.5 and later, that it is portable to non-x86 architectures, and that it does not require recompilation of existing libraries. The principal disadvantages seem to be the possible need to define complex data structures by hand and the fact that errors in doing so will likely make themselves known by crashing Python.

Other solutions tend to require recompiling the target C code which is a problem if it is not available. There is a discussion of various approaches at http://www.msri.org/web/msri/for-visitors/computing-handbook.
