CData Python Connector for Greenplum

Build 21.0.7930

From Pandas

When combined with the connector, Pandas can be used to generate data frames that contain your Greenplum data. Once created, a data frame can be passed to various other Python packages.

Connecting

Pandas must be imported before it can be used. Pandas also relies on a SQLAlchemy engine to execute queries, as shown below:

import pandas as pd
from sqlalchemy import create_engine
engine = create_engine("cdata_greenplum:///?User=user;Password=admin;Database=dbname;Server=127.0.0.1;Port=5432")
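
The connection string above is a list of Key=Value pairs separated by semicolons. As a minimal sketch, assuming no value contains a semicolon or any other character that would need escaping, the string could be assembled from a dictionary rather than hard-coded. The helper name "build_connection_url" is hypothetical and is not part of the connector:

```python
def build_connection_url(properties):
    """Join connection properties into a cdata_greenplum URL (hypothetical helper)."""
    # Each property becomes Key=Value; pairs are separated by semicolons.
    pairs = ";".join(f"{key}={value}" for key, value in properties.items())
    return "cdata_greenplum:///?" + pairs


url = build_connection_url({
    "User": "user",
    "Password": "admin",
    "Database": "dbname",
    "Server": "127.0.0.1",
    "Port": 5432,
})
print(url)
```

The resulting string can be passed to "create_engine()" in the same way as the literal string in the example above.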

Querying Data

SELECT queries are passed to the "read_sql()" function in pandas, along with a connection object. Pandas executes the query on that connection and returns the results as a data frame, which can then be used for a variety of purposes.

df = pd.read_sql("""
    SELECT
        ShipName,
        ShipCity
    FROM "template1"."public".Orders;""", engine)
print(df)
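
The sketch below walks through the same "read_sql()" pattern end to end, including a bound parameter. It assumes an in-memory SQLite database from Python's standard library as a stand-in for Greenplum, so that it runs without a server. The Orders rows are made-up sample data, and the "?" placeholder style is specific to sqlite3 and may differ for other drivers:

```python
import sqlite3

import pandas as pd

# Stand-in connection (assumption): in-memory SQLite instead of the Greenplum engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (ShipName TEXT, ShipCity TEXT)")
conn.executemany(
    "INSERT INTO Orders (ShipName, ShipCity) VALUES (?, ?)",
    [("Raleigh", "New York"), ("Durham", "Boston"), ("Cary", "New York")],
)
conn.commit()

# read_sql() runs the query on the connection and returns a data frame.
df = pd.read_sql(
    "SELECT ShipName, ShipCity FROM Orders WHERE ShipCity = ? ORDER BY ShipName",
    conn,
    params=("New York",),
)
print(df)

# The data frame can then be handled like any other pandas object.
names = df["ShipName"].tolist()
```

Binding the city as a parameter, rather than formatting it into the SQL text, keeps the query string fixed while the filter value changes.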

Modifying Data

To insert new records into a table, create a new data frame and define its fields accordingly. Then call "to_sql()" on the data frame to perform the INSERT operation with the connector, as in the example below. The "if_exists" argument must be set to "append" to prevent Pandas from attempting to build the table from scratch. Set index=False if needed to prevent Pandas from writing the data frame index as a column:

df = pd.DataFrame({"ShipName": ["Raleigh"], "ShipCity": ["New York"]})
df.to_sql("\"template1\".\"public\".Orders", con=engine, if_exists="append", index=False)
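
To illustrate the effect of if_exists="append" and index=False, the following sketch appends one row to an existing table and reads the table back. It assumes an in-memory SQLite connection as a stand-in for the Greenplum engine, and the pre-existing row is invented sample data:

```python
import sqlite3

import pandas as pd

# Stand-in connection (assumption): in-memory SQLite instead of the Greenplum engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (ShipName TEXT, ShipCity TEXT)")
conn.execute("INSERT INTO Orders VALUES ('Durham', 'Boston')")
conn.commit()

new_rows = pd.DataFrame({"ShipName": ["Raleigh"], "ShipCity": ["New York"]})

# "append" adds rows to the existing table; index=False skips the data frame index.
new_rows.to_sql("Orders", con=conn, if_exists="append", index=False)

# Read the table back to confirm that the new row was added.
result = pd.read_sql("SELECT ShipName, ShipCity FROM Orders", conn)
print(result)
```

With the default index=True, pandas would also try to write the data frame index as an extra column, which this table does not define.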

Copyright (c) 2021 CData Software, Inc. - All rights reserved.