Heterogeneous computing is the practice of leveraging dissimilar processor architectures, e.g. CPUs and GPUs, in the solution of a numerical problem. Writing or adapting algorithms that are suited to different architectures can be a daunting task, but one that rewards the user with access to massive parallelism. In this talk, we will begin with an overview of GPU architecture, focusing on its strengths and weaknesses as they relate to algorithm design. A short series of examples will demonstrate how a user can access heterogeneous computing capability directly from Python, using PyOpenCL. We will then introduce an assembly-free strategy for finite element method simulations which is well suited for GPU parallelism, particularly when many successive simulations over a domain with varying material properties are desired. This heterogeneous computing strategy is used in the solution of a 3D coefficient inverse problem.