Bridging Python and C++ with pybind11
pybind11 lets you write performance-critical code in C++ and call it directly from Python. Understanding how the binding layer works explains why this is a serious tool, not just a workaround.
I am an engineer. I like to engineer software the same way I think about physical products: you pick the right material for each part of the structure. Python is expressive, fast to write, and has an enormous ecosystem. C++ is compiled, has direct memory control, and runs close to the hardware. The question is not which one is better. The question is how to use both where each belongs. pybind11 is the answer to that question.
Why Python Needs C++ at All
Python’s strength is productivity. Its weakness is raw compute speed. When you process a large vector, run a physics simulation, apply a numerical algorithm to millions of data points, or interface with hardware, Python’s overhead becomes the bottleneck.
The standard solution is to write the expensive routine in C or C++ and expose it to Python. NumPy, OpenCV, PyTorch, and most performance-sensitive Python libraries work exactly this way. The Python layer handles orchestration and user-facing logic. The compiled layer does the heavy work.
pybind11 is a header-only C++ library that makes writing that compiled layer straightforward. It automatically converts Python types to C++ types and back, exposes C++ functions and classes to Python, and requires no intermediate code generation or separate translation step.
How the Binding Works
The core idea is simple. You write a C++ function, then use pybind11’s PYBIND11_MODULE macro to declare a Python module and bind that function to a Python-callable name. When Python imports the compiled module, it can call the C++ function directly.
#include <pybind11/pybind11.h>
#include <pybind11/stl.h>
#include <vector>
namespace py = pybind11;
int process_vector(const std::vector<int>& data) {
int sum = 0;
for (int value : data) {
sum += value;
}
return sum;
}
PYBIND11_MODULE(datamodule, m) {
m.doc() = "pybind11 module for data transfer.";
m.def("process_data", &process_vector,
"Accepts a list of integers and returns their sum.");
}
PYBIND11_MODULE takes the module name and a variable m that represents the module interface. m.def binds the C++ function process_vector to the Python name process_data. The &process_vector is a pointer to the function. pybind11 uses template metaprogramming to infer the argument and return types automatically.
The #include <pybind11/stl.h> header is critical. It enables automatic conversion between Python lists and std::vector, Python dicts and std::map, and other standard library containers. Without it, passing a Python list to a function expecting std::vector<int> will fail.
Calling It from Python
Once compiled, the module is imported like any other Python module. Python types pass in, C++ executes, and the result comes back as a Python type.
import datamodule
python_list = [10, 20, 30, 40, 50]
result = datamodule.process_data(python_list)
print(result) # 150
pybind11 converts the Python list to const std::vector<int>& before the function executes, and converts the int return value back to a Python integer. No manual conversion code is required on either side.
Compiling the Module
There are three ways to compile a pybind11 module. The simplest for a single file is the direct compiler command:
# Linux
c++ -O3 -Wall -shared -std=c++11 -fPIC \
$(python3 -m pybind11 --includes) datamodule.cpp \
-o datamodule$(python3-config --extension-suffix)
# macOS (add -undefined dynamic_lookup)
c++ -O3 -Wall -shared -std=c++11 -fPIC -undefined dynamic_lookup \
$(python3 -m pybind11 --includes) datamodule.cpp \
-o datamodule$(python3-config --extension-suffix)
python3 -m pybind11 --includes outputs the include paths needed to find the pybind11 headers. python3-config --extension-suffix outputs the correct file extension for the platform (.so on Linux, .dylib on macOS).
For a proper project, use setuptools with a setup.py:
from glob import glob
from setuptools import setup
from pybind11.setup_helpers import Pybind11Extension, build_ext
ext_modules = [
Pybind11Extension(
"datamodule",
sorted(glob("src/*.cpp")),
),
]
setup(
name="datamodule",
ext_modules=ext_modules,
cmdclass={"build_ext": build_ext},
)
And a pyproject.toml that declares the build requirements:
[build-system]
requires = ["setuptools", "pybind11"]
build-backend = "setuptools.build_meta"
Build with:
pip install pybind11
python setup.py build_ext --inplace
The compiled .so file appears in the same directory and can be imported immediately.
Type Conversion at the Boundary
pybind11 handles the conversion between Python and C++ types at the boundary automatically for all built-in types and standard library containers when the appropriate headers are included.
| Python type | C++ type | Header required |
|---|---|---|
list | std::vector | pybind11/stl.h |
dict | std::map | pybind11/stl.h |
tuple | std::tuple | pybind11/stl.h |
int | int, long | none |
float | double, float | none |
str | std::string | none |
numpy.ndarray | Eigen::MatrixXd | pybind11/eigen.h |
The conversion happens at the call boundary. Data is copied, not shared by reference, which means modifications inside C++ do not affect the Python object and vice versa. For high-throughput use cases where copying is too expensive, pybind11 supports buffer protocol and NumPy array access without copying.
What You Can Do Now
Install pybind11 and compile the example above yourself:
pip install pybind11
Create datamodule.cpp with the C++ code from above, then compile it:
c++ -O3 -shared -std=c++11 -fPIC \
$(python3 -m pybind11 --includes) datamodule.cpp \
-o datamodule$(python3-config --extension-suffix)
Then run:
import datamodule
print(datamodule.process_data([1, 2, 3, 4, 5])) # 15
Once that works, extend it. Write a C++ function that sorts a vector and returns it. Bind it and call it from Python with a shuffled list. Then write one that takes a 2D vector and computes a dot product. Each step makes the boundary between Python and C++ more familiar, and familiarity is what makes you reach for it when the problem actually needs it.