To write a Pandas DataFrame to Google Cloud Storage, you can upload the output of df.to_csv() with the google-cloud-storage client; to write one to BigQuery, you can load it with the google-cloud-bigquery client. Both approaches are shown below.
Writing a Pandas DataFrame to Google Cloud Storage
You need to install the google-cloud-storage package.
pip install google-cloud-storage
Then, you can use the following code to write a DataFrame to a CSV file in Google Cloud Storage:
import pandas as pd
from google.cloud import storage
# Your DataFrame
data = {'col1': [1, 2], 'col2': [3, 4]}
df = pd.DataFrame(data)
# Set up Google Cloud Storage
project_id = 'your_project_id'
bucket_name = 'your_bucket_name'
destination_blob_name = 'your_destination_blob_name.csv'
# Authenticate using a JSON key file. Replace 'path/to/keyfile.json'
# with the path to your JSON key file.
storage_client = storage.Client.from_service_account_json('path/to/keyfile.json')
bucket = storage_client.get_bucket(bucket_name)
# Write the DataFrame to a CSV file in memory
csv_data = df.to_csv(index=False).encode('utf-8')
# Upload the CSV data to Google Cloud Storage
blob = bucket.blob(destination_blob_name)
blob.upload_from_string(csv_data, content_type='text/csv')
print(f"DataFrame uploaded to {destination_blob_name}")
Writing a Pandas DataFrame to BigQuery
First, you need to install the google-cloud-bigquery package, along with pyarrow, which the BigQuery client uses to serialize the DataFrame (pandas-gbq is optional; it is only needed for the df.to_gbq() shortcut shown at the end of this section):
pip install google-cloud-bigquery pyarrow pandas-gbq
Then, you can use the following code to write a DataFrame to a BigQuery table:
import pandas as pd
from google.cloud import bigquery
# Your DataFrame
data = {'col1': [1, 2], 'col2': [3, 4]}
df = pd.DataFrame(data)
# Set up BigQuery
project_id = 'your_project_id'
dataset_id = 'your_dataset_id'
table_id = 'your_table_id'
# Authenticate using a JSON key file.
# Replace 'path/to/keyfile.json' with the path to your JSON key file.
client = bigquery.Client.from_service_account_json('path/to/keyfile.json')
# Create a job config object to specify the write disposition
job_config = bigquery.LoadJobConfig(
    write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE
)
# Write the DataFrame to BigQuery
table_ref = client.dataset(dataset_id).table(table_id)
job = client.load_table_from_dataframe(df, table_ref, job_config=job_config)
job.result()
print(f"DataFrame uploaded to {dataset_id}.{table_id}")
This code will overwrite the existing table in BigQuery. If you want to append the data instead, change the write_disposition to bigquery.WriteDisposition.WRITE_APPEND.
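As a rough sketch using the same placeholder names as above, the append variant looks like this; the df.to_gbq() call at the end is the optional pandas-gbq shortcut and relies on Application Default Credentials unless you pass a credentials object:
# Append to the existing table instead of overwriting it
job_config = bigquery.LoadJobConfig(
    write_disposition=bigquery.WriteDisposition.WRITE_APPEND
)
client.load_table_from_dataframe(df, table_ref, job_config=job_config).result()

# Alternatively, pandas-gbq wraps the same load in a single call
df.to_gbq(f"{dataset_id}.{table_id}", project_id=project_id, if_exists='append')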

Amit Doshi is a Cloud Engineer with more than 5 years of experience in AWS, Azure, and Google Cloud. He is an IT professional responsible for designing, implementing, managing, and maintaining cloud computing infrastructure, applications, and services.