As empirical improvements in deep learning become harder to achieve, we will need a better understanding of its successes and shortcomings. Fortunately, approximation theory has a lot to say about deep learning from a theoretical perspective. In this talk, we will explore how classical results on radial basis functions, highly composite functions, rational functions, and the entropy of function spaces can give us insight into neural networks. This talk describes joint work with Austin Benson, Nicolas Boulle, Anil Damle, and Yuji Nakatsukasa.