This property determines whether or not to cache the table metadata to a file store.
As you execute queries with this property set, table metadata in the Spark SQL catalog are cached to the file store specified by CacheLocation if set or the user's home directory otherwise. A table's metadata will be retrieved only once, when the table is queried for the first time.
When to Use CacheMetadata
The driver automatically persists metadata in memory for up to two hours when you first discover the metadata for a table or view and therefore, CacheMetadata is generally not required. CacheMetadata becomes useful when metadata operations are expensive such as when you are working with large amounts of metadata or when you have many short-lived connections.
When Not to Use CacheMetadata
- When you are working with volatile metadata: Metadata for a table is only retrieved the first time the connection to the table is made. To pick up new, changed, or deleted columns, you would need to delete and rebuild the metadata cache. Therefore, it is best to rely on the in-memory caching for cases where metadata changes often.
- When you are caching to a database: CacheMetadata can only be used with CacheLocation. If you are caching to another database with the CacheDriver and CacheConnection properties, use AutoCache to cache implicitly. Or, use CACHE Statements to cache explicitly.