JDBC Driver for Google BigQuery

Build 23.0.8839

Explicitly Caching Data

With explicit caching (AutoCache = false), you decide exactly what data is cached and when to query the cache instead of the live data. You control the cache contents by issuing CACHE statements. This section describes some strategies for using the caching features offered by the driver.

Creating the Cache

To load data into the cache, issue the following statement:

CACHE SELECT * FROM tableName WHERE ...

When the statement executes, all rows in tableName that match the WHERE clause are loaded into the corresponding table in the cache.

Updating the Cache

This section describes two ways to update the cache.

Updating with the SELECT Statement

The following example executes a CACHE SELECT statement, which updates modified rows and adds missing rows in the cached table. It only merges new and changed rows; extra rows that are already in the cache are not deleted.

// Assumes an open java.sql.Connection named connection.
String cmd = "CACHE SELECT * FROM [publicdata].[samples].github_nested WHERE repository.name = 'EntityFramework'";
Statement stat = connection.createStatement();
stat.execute(cmd);
connection.close();

Updating with the TRUNCATE Statement

The following example executes a CACHE WITH TRUNCATE statement. Like CACHE SELECT, it updates modified rows and adds missing rows in the cached table. It also deletes rows in the cached table that are not present in the live data source.

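// Assumes an open java.sql.Connection named connection.
String cmd = "CACHE WITH TRUNCATE SELECT * FROM [publicdata].[samples].github_nested WHERE repository.name = 'EntityFramework'";
Statement stat = connection.createStatement();
stat.execute(cmd);
connection.close();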
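
The two update statements above differ only in the WITH TRUNCATE clause, so an application can build either one from the same SELECT text. The following is a minimal sketch rather than part of the driver: the class CacheRefresher and its methods are hypothetical helpers, and the sketch assumes the caller supplies an open connection that is not in Offline mode.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper for illustration; not part of the driver API.
final class CacheRefresher {

    private CacheRefresher() {}

    // Prefixes a SELECT with the CACHE keywords described in this section.
    // truncate = true produces the CACHE WITH TRUNCATE form.
    static String buildCacheStatement(String selectSql, boolean truncate) {
        String select = selectSql.trim();
        if (!select.regionMatches(true, 0, "SELECT", 0, 6)) {
            throw new IllegalArgumentException("Expected a SELECT statement: " + selectSql);
        }
        return (truncate ? "CACHE WITH TRUNCATE " : "CACHE ") + select;
    }

    // Executes the CACHE statement on a connection supplied by the caller.
    // The statement is closed automatically; the connection is left open.
    static void refresh(Connection connection, String selectSql, boolean truncate)
            throws SQLException {
        try (Statement stat = connection.createStatement()) {
            stat.execute(buildCacheStatement(selectSql, truncate));
        }
    }
}
```

Keeping the connection open in refresh is a deliberate choice in this sketch, so that the caller decides when to close it.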

Query the Data in Online or Offline Mode

This section describes how to query the data in online or offline mode.

Online: Select Cached Tables

While still online, you can use the tableName#CACHE syntax to explicitly execute queries against the cache, as shown in the following example.

SELECT * FROM [publicdata].[samples].github_nested#CACHE
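
From Java, the #CACHE suffix is simply part of the table name in the query text. The sketch below is illustrative only: CachedTableQuery and its methods are hypothetical names, and countCachedRows assumes an open connection supplied by the caller.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper for illustration; not part of the driver API.
final class CachedTableQuery {

    private static final String CACHE_SUFFIX = "#CACHE";

    private CachedTableQuery() {}

    // Appends the #CACHE suffix unless the name already carries it.
    static String cachedName(String tableName) {
        return tableName.endsWith(CACHE_SUFFIX) ? tableName : tableName + CACHE_SUFFIX;
    }

    // Builds a SELECT that targets the cached copy of the table.
    static String selectAll(String tableName) {
        return "SELECT * FROM " + cachedName(tableName);
    }

    // Runs the query against the cache and counts the rows it returns.
    static int countCachedRows(Connection connection, String tableName) throws SQLException {
        int rows = 0;
        try (Statement stat = connection.createStatement();
             ResultSet rs = stat.executeQuery(selectAll(tableName))) {
            while (rs.next()) {
                rows++;
            }
        }
        return rows;
    }
}
```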

Offline: Select Cached Tables

With Offline = true, SELECT statements always execute against the local cache database, regardless of whether you explicitly specify the cached table or not. Modification of the cache is disabled in Offline mode to prevent accidentally updating only the cached data. Executing a DELETE/UPDATE/INSERT statement while in Offline mode results in an exception.

Because Offline = true, the following example selects from the local cache rather than the live data source.

Connection connection = DriverManager.getConnection("jdbc:googlebigquery:InitiateOAuth=GETANDREFRESH;ProjectId=NameOfProject;DatasetId=NameOfDataset;Offline=true;Cache Location=C:\\cache.db;");
Statement stat = connection.createStatement();
String query = "SELECT * FROM [publicdata].[samples].github_nested WHERE repository.name='EntityFramework' ORDER BY repository.name ASC";
stat.execute(query);
connection.close();
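
Because Offline is a connection property, switching between the cache and the live source comes down to which connection string is used. A small sketch of building both variants follows. CacheConnectionUrls is a hypothetical helper, the property names are the ones used in the example above, and NameOfProject and NameOfDataset remain placeholders.

```java
// Hypothetical helper for illustration; not part of the driver API.
final class CacheConnectionUrls {

    private CacheConnectionUrls() {}

    // Assembles a connection string from the properties shown in this section.
    // Values are inserted as-is; this sketch does not escape semicolons.
    static String build(String projectId, String datasetId,
                        String cacheLocation, boolean offline) {
        return "jdbc:googlebigquery:InitiateOAuth=GETANDREFRESH;"
                + "ProjectId=" + projectId + ";"
                + "DatasetId=" + datasetId + ";"
                + "Offline=" + offline + ";"
                + "Cache Location=" + cacheLocation + ";";
    }
}
```

Calling build("NameOfProject", "NameOfDataset", "C:\\cache.db", true) yields the same connection string as the example above, and passing false as the last argument gives the online variant that can be used to update the cache.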

Delete Data from the Cache

The driver does not support manually deleting data from the cache. To delete cached data, build a direct connection to the cache database and delete the data there.

Common Use Case

A common use for caching is to have an application always query the cached data and only update the cache at set intervals, such as once every day or every two hours. There are two ways in which this can be implemented:

  • AutoCache = false and Offline = false. All queries issued by the application explicitly reference the tableName#CACHE table. When the cache needs to be updated, the application executes a CACHE SELECT * FROM tableName ... statement to bring the cached data up to date.
  • Offline = true. Caching is transparent to the application. All queries are executed against the table as normal, so most application code does not need to be aware that caching is done. To update the cached data, create a separate connection with Offline = false and execute a CACHE SELECT * FROM tableName ... statement.
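
Either approach needs something that runs the cache update at the chosen interval. One possible sketch uses a standard ScheduledExecutorService. The class ScheduledCacheRefresh and its methods are hypothetical names, and the connection string and CACHE statement passed to jdbcRefresh are assumed to come from the application. Failures are caught and counted because a task scheduled with scheduleAtFixedRate stops running once it throws.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper for illustration; not part of the driver API.
final class ScheduledCacheRefresh {

    private final Runnable refreshAction;
    private final AtomicLong successes = new AtomicLong();
    private final AtomicLong failures = new AtomicLong();

    ScheduledCacheRefresh(Runnable refreshAction) {
        this.refreshAction = refreshAction;
    }

    // Builds an action that opens a separate online connection (Offline = false
    // in onlineUrl is assumed), executes the CACHE statement, and closes both.
    static Runnable jdbcRefresh(String onlineUrl, String cacheSql) {
        return () -> {
            try (Connection connection = DriverManager.getConnection(onlineUrl);
                 Statement stat = connection.createStatement()) {
                stat.execute(cacheSql);
            } catch (SQLException e) {
                throw new IllegalStateException("Cache refresh failed", e);
            }
        };
    }

    // Runs one refresh and records the outcome instead of propagating errors,
    // so a failed refresh does not cancel later scheduled runs.
    void runOnce() {
        try {
            refreshAction.run();
            successes.incrementAndGet();
        } catch (RuntimeException e) {
            failures.incrementAndGet();
        }
    }

    // Starts refreshing immediately and then once per period.
    // The caller should call shutdown() on the returned executor when done.
    ScheduledExecutorService start(long period, TimeUnit unit) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(this::runOnce, 0, period, unit);
        return scheduler;
    }

    long successCount() {
        return successes.get();
    }

    long failureCount() {
        return failures.get();
    }
}
```

For a refresh every two hours, an application could pass the result of jdbcRefresh to the constructor and call start(2, TimeUnit.HOURS), while its regular queries keep reading from the cache.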

Copyright (c) 2024 CData Software, Inc. - All rights reserved.