From Pandas

When combined with the connector, Pandas can be used to generate data frames that contain your Greenplum data. Once created, a data frame can be passed to various other Python packages.

Connecting

Pandas relies on an SQLAlchemy engine to execute queries. Before you can query data, import Pandas and SQLAlchemy's create_engine function, then create an engine from a connection string:

import pandas as pd
from sqlalchemy import create_engine
engine = create_engine("cdata_greenplum:///?User=user;Password=admin;Database=dbname;Server=127.0.0.1;Port=5432")
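
The connection string above is a scheme prefix followed by semicolon-separated Property=Value pairs. As a minimal sketch, the string can be assembled from a dictionary so that individual properties are easier to change. The helper name build_connection_url is hypothetical and is not part of the connector or SQLAlchemy. It does not escape special characters in values.

```python
def build_connection_url(properties, scheme="cdata_greenplum"):
    """Hypothetical helper: join Property=Value pairs with semicolons."""
    pairs = ";".join(f"{key}={value}" for key, value in properties.items())
    return f"{scheme}:///?{pairs}"


# Same placeholder values as the connection string shown above.
url = build_connection_url({
    "User": "user",
    "Password": "admin",
    "Database": "dbname",
    "Server": "127.0.0.1",
    "Port": 5432,
})
print(url)
```

The resulting string can be passed to create_engine() in place of the literal.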

Querying Data

In Pandas, SELECT queries are passed to the read_sql() function along with a connection object, such as the engine created above. Pandas executes the query on that connection and returns the results as a data frame, which you can then filter, aggregate, or pass to other packages.

# Run the SELECT through the engine and load the result set into a data frame
df = pd.read_sql("""
	SELECT
	   ShipName,
	   ShipCity
	FROM "template1"."public".Orders;""", engine)
print(df)
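
Because read_sql() returns an ordinary data frame, the usual Pandas operations apply to the result. The sketch below illustrates this with an in-memory SQLite database as a stand-in for the Greenplum engine. This is an assumption made only so the example runs without a server, and the sample rows are invented for illustration.

```python
import sqlite3

import pandas as pd

# Stand-in connection (assumption): in-memory SQLite instead of the Greenplum engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (ShipName TEXT, ShipCity TEXT)")
conn.executemany(
    "INSERT INTO Orders (ShipName, ShipCity) VALUES (?, ?)",
    [("Raleigh", "New York"), ("Durham", "Boston"), ("Salem", "New York")],
)
conn.commit()

# Load the query result into a data frame.
df = pd.read_sql("SELECT ShipName, ShipCity FROM Orders", conn)

# Filter rows and count orders per city with standard data frame operations.
new_york = df[df["ShipCity"] == "New York"]
orders_per_city = df.groupby("ShipCity").size()
print(new_york)
print(orders_per_city)
```

With the connector, the same filtering and grouping would be applied to the data frame returned from the engine-based query.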

Modifying Data

To insert new records into a table, create a new data frame and define its fields accordingly. Then call to_sql() on the data frame to perform the INSERT operation with the connector, as shown in the example below. You must set the if_exists argument to "append" to prevent Pandas from attempting to build the table from scratch. To prevent Pandas from writing the data frame index as a column, set index=False.

df = pd.DataFrame({"ShipName": ["Raleigh"], "ShipCity": ["New York"]})
df.to_sql("\"template1\".\"public\".Orders", con=engine, if_exists="append", index=False)
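
To see the effect of if_exists="append" and index=False, the following sketch appends one row to an existing table and reads the table back. It uses an in-memory SQLite database as a stand-in for the Greenplum engine, which is an assumption made so the example runs without a server. The pre-existing row is invented for illustration.

```python
import sqlite3

import pandas as pd

# Stand-in connection (assumption): in-memory SQLite instead of the Greenplum engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (ShipName TEXT, ShipCity TEXT)")
conn.execute("INSERT INTO Orders (ShipName, ShipCity) VALUES ('Durham', 'Boston')")
conn.commit()

# Append a new row; the existing table and its rows are left in place.
new_rows = pd.DataFrame({"ShipName": ["Raleigh"], "ShipCity": ["New York"]})
new_rows.to_sql("Orders", con=conn, if_exists="append", index=False)

# Read the table back to confirm the insert.
result = pd.read_sql("SELECT ShipName, ShipCity FROM Orders", conn)
print(result)
```

The existing row is preserved, the new row is added, and no index column is written because index=False was set.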

CData Python Connector for Greenplum
