This refers to building a memory subsystem out of several levels of storage, arranged as a hierarchy.
Storage at higher levels of the hierarchy has faster access times but is smaller, because it costs more per byte. Storage at lower levels has slower access times but is larger, because it costs less per byte.
The storage at each level acts as a cache for the storage at the level below it. The result is a memory subsystem with a large effective size and a low cost per byte (close to that of the least expensive storage), yet with a fast average access time (close to that of the fastest storage).
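The trade-off described above can be made concrete with a small calculation. The following is a minimal sketch for a two-level hierarchy, not a model of any particular machine. The function names, the hit rate, and all timing and cost figures are hypothetical values chosen for illustration, and the sketch assumes that a miss pays for the fast-level lookup plus the slow-level access.

```python
def average_access_time(hit_rate, fast_time_ns, slow_time_ns):
    """Average access time of a two-level hierarchy, in nanoseconds.

    Assumption: every access first checks the fast level; on a miss
    the slow level is accessed as well.
    """
    miss_rate = 1.0 - hit_rate
    return fast_time_ns + miss_rate * slow_time_ns


def effective_cost_per_mb(fast_mb, fast_cost_per_mb, slow_mb, slow_cost_per_mb):
    """Total cost divided by usable capacity.

    Assumption: the fast level only holds copies of data that also
    lives in the slow level, so usable capacity equals slow_mb.
    """
    total_cost = fast_mb * fast_cost_per_mb + slow_mb * slow_cost_per_mb
    return total_cost / slow_mb


if __name__ == "__main__":
    # Hypothetical figures: 1 ns fast level, 100 ns slow level, 95% hit rate.
    t = average_access_time(hit_rate=0.95, fast_time_ns=1.0, slow_time_ns=100.0)
    print(f"average access time: {t:.2f} ns")

    # Hypothetical figures: 1 MB at 100 cost units/MB over 1000 MB at 1 unit/MB.
    c = effective_cost_per_mb(1, 100.0, 1000, 1.0)
    print(f"effective cost: {c:.2f} units per MB")
```

With these illustrative numbers the average access time comes out at about 6 ns, far closer to the 1 ns fast level than to the 100 ns slow level, while the effective cost is about 1.1 units per MB, close to that of the slow level. How near the average gets to the fast level depends on the hit rate, which is why the caching behaviour between levels matters.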