ADO.NET Provider for HDFS

Build 24.0.9060

Automatically Caching Data

Automatic caching is useful when you do not want to rebuild the cache for each query. The first time you query data, the provider automatically initializes and builds a cache in the background. When AutoCache = true, the provider serves subsequent executions of the query from the cache, resulting in faster response times.

Configuring Automatic Caching

Caching the Files Table

The following example caches the Files table in the file specified by the CacheLocation property of the connection string.

C#

String connectionString = "Cache Location=C:\\cache.db;AutoCache=true;Host=sandbox-hdp.hortonworks.com;Port=50070;Path=/user/root;";
using (HDFSConnection connection = new HDFSConnection(connectionString)) {
  HDFSCommand cmd = new HDFSCommand("SELECT FileId, ChildrenNum FROM Files WHERE FileId = '119116'", connection);
  HDFSDataReader rdr = cmd.ExecuteReader();
  while (rdr.Read()) {
    // Read the FileId column, which is the key column the query selects.
    Console.WriteLine("Read and cached the row with Id " + rdr["FileId"]);
  }
}

VB.NET

Dim connectionString As String = "Cache Location=C:\cache.db;AutoCache=true;Host=sandbox-hdp.hortonworks.com;Port=50070;Path=/user/root;"
Using connection As New HDFSConnection(connectionString)
  Dim cmd As New HDFSCommand("SELECT FileId, ChildrenNum FROM Files WHERE FileId = '119116'", connection)
  Dim rdr As HDFSDataReader = cmd.ExecuteReader()
  While rdr.Read()
    ' Read the FileId column, which is the key column the query selects.
    Console.WriteLine("Read and cached the row with Id " & rdr("FileId").ToString())
  End While
End Using

Common Use Case

A common use for automatic caching is to improve driver performance when making repeated requests to a live data source, such as when building a report or creating a visualization. With automatic caching enabled, repeated requests for the same data are served from the cache and return quickly, as long as the cached data is within the allowable tolerance (CacheTolerance) of what is considered "live" data.
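
As a rough illustration of this use case, the sketch below sets CacheTolerance in the connection string and runs the same query twice, as a report refresh might. It rests on assumptions that are not part of this page: the System.Data.CData.HDFS namespace, the value 600, and the reading of CacheTolerance as a number of seconds are illustrative and should be checked against the connection property reference. The class name is hypothetical.

```csharp
using System;
using System.Data.CData.HDFS; // assumed namespace for the provider classes

class AutoCacheToleranceSketch {  // hypothetical class name
  static void Main() {
    // CacheTolerance=600 is an illustrative value, assumed to be in seconds.
    String connectionString = "Cache Location=C:\\cache.db;AutoCache=true;CacheTolerance=600;Host=sandbox-hdp.hortonworks.com;Port=50070;Path=/user/root;";
    using (HDFSConnection connection = new HDFSConnection(connectionString)) {
      // Run the same query twice. The second pass is expected to be
      // answered from the cache while the cached data is within tolerance.
      for (int pass = 1; pass <= 2; pass++) {
        HDFSCommand cmd = new HDFSCommand("SELECT FileId, ChildrenNum FROM Files", connection);
        using (HDFSDataReader rdr = cmd.ExecuteReader()) {
          int rows = 0;
          while (rdr.Read()) {
            rows++;
          }
          Console.WriteLine("Pass " + pass + " read " + rows + " rows");
        }
      }
    }
  }
}
```

A larger tolerance favors response time over freshness; a smaller one keeps results closer to the live data at the cost of more frequent requests to the source.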

Copyright (c) 2024 CData Software, Inc. - All rights reserved.