top of page

Importing Dataframe to Postgres SQL Database - Part 4


Hi All! Hope you all had a great weekend!! Last week, we looked into creating a Postgres SQL database using the Amazon RDS DB instance. This week's we will look into how to insert your python Data Frame into the Postgres DB table that we created.


The two steps include:

  • Updating the Data frames to reflect real-time youtube data

  • Inserting the Data Frame to the Postgres SQL using python script

Updating the Data Frame - Realtime Data

To get the real-time data, we need to create two functions. One for — check if the youtube video already exists and second, update the row with updated data.


def check_if_video_exists(curr, videoid):
    query = ("""SELECT video_id FROM VIDEOS WHERE videoid = %s""")curr.execute(query, (videoid,))
    return curr.fetchone() is not None

def update_db(curr,df):
    tmp_df = pd.DataFrame(columns=['videoid', 'title', 'publisheddt', 'views','likes', 'dislikes', 'comments'])
    for i, row in df.iterrows():
        if check_if_video_exists(curr, row['video_id']): 
            # If video already exists then we will update      
             update_row(curr,row['videoid'], row['title'], row['views'], row['likes'], row['dislikes'],row['comments']
        else: # The video doesn't exists so we will add it to a temp df     and append it using append_from_df_to_dbtmp_df =         
                tmp_df.append(row)return tmp_df

Below snippet for insertion of data:

def insert_into_table(curr, videoid, title, publisheddt, views, likes, dislikes, comments):
    insert_into_videos = ("""INSERT INTO videos (videoid, title, publisheddt,views, likes, dislikes,comments)    VALUES(%s,%s,%s,%s,%s,%s,%s);""")
    row_to_insert = (videoid, title, publisheddt,views, likes, dislikes,comments)
    curr.execute(insert_into_videos, row_to_insert)
    
def append_from_df_to_db(curr,df):
    for i, row in df.iterrows():
        insert_into_table(curr, row['videoid'], row['title'], row['publisheddt'], row['views'], row['likes'], row['dislikes'], row['comments'])

Once that is done, create a connection and a cursor to execute the above written function.


#establish a connection to dbconn = connect_to_db(host_name, dbname, port, username, password)curr = conn.cursor()

#update data for existing videosnew_vid_df = update_db(curr,df)conn.commit()

#insert new videos into db table
append_from_df_to_db(curr, new_vid_df)conn.commit()

Finally, view the inserted table using the code below:



#view data in db tablecurr.execute("SELECT * FROM VIDEOS")print(curr.fetchall())

This marks the end of importing the python data frame to Postgres SQL DB dynamically. Next up we can start exploring more data analysis and ML stuff. Happy coding!!


If you like this post, feel free to like, share, and subscribe!!

Opmerkingen


bottom of page