One of the problems I recently discovered with the Drupal cache is that it doesn’t properly handle https when cached content is shared between the http and https versions of a site. Let’s take a look at the problem, an interim solution for the database cache, and what a long-term solution might look like.
The Problem
The page cache properly respects https because it uses the full URL, including the protocol, when generating a cache id. But Drupal uses multiple layers of caching for the HTML within a page. For example, the content of blocks can be cached. Suppose the HTML for a block contains absolute URLs pointing back to the site, and that HTML is generated and cached on the http version of the site. When the same cache entry is later used to build the https version of the page, those cached absolute URLs still use http.
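To make the failure mode concrete, here is a minimal sketch in Python. It is a toy model of a fragment cache, not Drupal's actual cache API: the names `block_cache`, `render_block`, `get_block`, the `key_includes_scheme` flag, and the `example.com` host are all made up for illustration. It only shows what happens when a cache id leaves out the protocol, and how including it keeps the two variants apart.

```python
# Toy model of a fragment (block) cache. This is NOT Drupal's API;
# every name here is hypothetical and exists only for illustration.
block_cache = {}


def render_block(block_id, scheme, host="example.com"):
    """Build block HTML containing an absolute URL for the request's scheme."""
    return f'<img src="{scheme}://{host}/files/{block_id}.png">'


def get_block(block_id, scheme, key_includes_scheme=False):
    """Return cached block HTML, rendering and storing it on a cache miss."""
    # The cache id either ignores the protocol (the problem case)
    # or includes it (one possible way to keep the variants separate).
    cid = f"{scheme}:{block_id}" if key_includes_scheme else block_id
    if cid not in block_cache:
        block_cache[cid] = render_block(block_id, scheme)
    return block_cache[cid]


# Problem case: an http request warms the cache first...
get_block("logo", "http")
# ...and a later https request gets the http markup back.
leaked = get_block("logo", "https")

# With the scheme in the cache id, each protocol gets its own entry.
block_cache.clear()
get_block("logo", "http", key_includes_scheme=True)
safe = get_block("logo", "https", key_includes_scheme=True)

print(leaked)
print(safe)
```

In the first case the https page ends up with an `http://` image URL, which is exactly the situation described above. Keying on the protocol is only one option and it roughly doubles the number of cache entries for affected content.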
Media Module, Where I Found The Problem
The example that led me to this problem was the media module. With the media module you can embed images and video into a text area (like the body of an article). Media tags are converted to HTML by the filter system and the results are cached. Images, for example, use absolute URLs (this is part of core and is a good thing for some use cases). If the cache for this text was generated on the http version of the site, the path to the image on the https version of the page will use http as the protocol. This opens up all kinds of possible issues. Just imagine someone in a coffee shop thinking they are on https, only to have cookies sent back to the server in plain text with the request for that image. Or maybe you have a better imagination than I do and can think of something more sinister.