Photo by Campaign Creators on Unsplash
Hi All! Hope you all had a great weekend!! Last week, we looked into creating a Postgres SQL database using the Amazon RDS DB instance. This week's we will look into how to insert your python Data Frame into the Postgres DB table that we created.
The two steps include:
Updating the Data frames to reflect real-time youtube data
Inserting the Data Frame to the Postgres SQL using python script
Updating the Data Frame - Realtime Data
To get the real-time data, we need to create two functions. One for — check if the youtube video already exists and second, update the row with updated data.
def check_if_video_exists(curr, videoid):
query = ("""SELECT video_id FROM VIDEOS WHERE videoid = %s""")curr.execute(query, (videoid,))
return curr.fetchone() is not None
def update_db(curr,df):
tmp_df = pd.DataFrame(columns=['videoid', 'title', 'publisheddt', 'views','likes', 'dislikes', 'comments'])
for i, row in df.iterrows():
if check_if_video_exists(curr, row['video_id']):
# If video already exists then we will update
update_row(curr,row['videoid'], row['title'], row['views'], row['likes'], row['dislikes'],row['comments']
else: # The video doesn't exists so we will add it to a temp df and append it using append_from_df_to_dbtmp_df =
tmp_df.append(row)return tmp_df
Below snippet for insertion of data:
def insert_into_table(curr, videoid, title, publisheddt, views, likes, dislikes, comments):
insert_into_videos = ("""INSERT INTO videos (videoid, title, publisheddt,views, likes, dislikes,comments) VALUES(%s,%s,%s,%s,%s,%s,%s);""")
row_to_insert = (videoid, title, publisheddt,views, likes, dislikes,comments)
curr.execute(insert_into_videos, row_to_insert)
def append_from_df_to_db(curr,df):
for i, row in df.iterrows():
insert_into_table(curr, row['videoid'], row['title'], row['publisheddt'], row['views'], row['likes'], row['dislikes'], row['comments'])
Once that is done, create a connection and a cursor to execute the above written function.
#establish a connection to dbconn = connect_to_db(host_name, dbname, port, username, password)curr = conn.cursor()
#update data for existing videosnew_vid_df = update_db(curr,df)conn.commit()
#insert new videos into db table
append_from_df_to_db(curr, new_vid_df)conn.commit()
Finally, view the inserted table using the code below:
#view data in db tablecurr.execute("SELECT * FROM VIDEOS")print(curr.fetchall())
This marks the end of importing the python data frame to Postgres SQL DB dynamically. Next up we can start exploring more data analysis and ML stuff. Happy coding!!
If you like this post, feel free to like, share, and subscribe!!
Комментарии