Git-ing Familiar with Python - Part 1: autodelete_tweets.py

Introduction

In this post, I'd like to briefly chronicle some of the fun I recently had learning to code up some simple functionality in Python and committing it to a GitHub repository along the way. While I've used GitHub in the past to document minor changes to configuration files, using it as a repository for active development has really helped ingrain the commit/push process in my head.


The Idea

So I initially thought up the idea for my Python project while browsing Twitter and thinking to myself "how can I like or retweet things without contributing to a massive backlog of my online activity?"

Enter: autodelete_tweets!

The application will use the Twitter API to pull down all of my tweets, retweets, and favorites every day, sort through those items by date, and delete anything that's over 10 days old.

Seems easy enough, right? Let's dive in!


The Setup

First, we'll import all the modules the script needs:

# Import modules
import os
import sys
import time
import argparse
import tweepy as tw
from datetime import datetime, timedelta
from pushover import init, Client

Next, we'll import our authentication secrets from a file we created named api_secrets:

# Import the api_secrets variables
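from api_secrets import consumer_key, consumer_secret, access_token, access_token_secret, pushover_api_token, pushover_user_key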

NOTE: It's crucial that you add this file to the .gitignore file in your GitHub repository! If you don't, you risk uploading all your secrets to your public GitHub repo, and that would be very bad!
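For reference, here's a minimal sketch of what api_secrets.py might contain. The six variable names are the ones the script uses; the values are placeholders rather than real credentials, and the exact layout of the file is an assumption on my part.

```python
# api_secrets.py (sketch; every value below is a placeholder)

# Twitter API credentials, issued when you request API access
consumer_key = "YOUR_CONSUMER_KEY"
consumer_secret = "YOUR_CONSUMER_SECRET"
access_token = "YOUR_ACCESS_TOKEN"
access_token_secret = "YOUR_ACCESS_TOKEN_SECRET"

# Pushover credentials, used for the mobile notifications
pushover_user_key = "YOUR_PUSHOVER_USER_KEY"
pushover_api_token = "YOUR_PUSHOVER_API_TOKEN"
```

A single line reading api_secrets.py in .gitignore is enough for Git to skip the file.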

Then, we'll use those variables to authenticate with Twitter:

# Authenticate with Twitter using the credentials from api_secrets.py
auth = tw.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tw.API(auth, wait_on_rate_limit=True)
user = api.me()
user_id = user.id

You'll notice that we build the authentication handler with the tweepy Python module, using the consumer_key, consumer_secret, access_token, and access_token_secret that Twitter provides when we request API access. We then pass that handler to tw.API and look up the authenticated user with api.me(). The resulting user_id identifies which Twitter user we're pulling tweets in from. If it's not your account, the attempts to delete will fail.

Next, I'm going to set up Pushover so that I can receive log messages on my mobile phone. The pushover_api_token and pushover_user_key are both stored in the api_secrets file as well.

# Initialize Pushover using authentication information from api_secrets.py
pushoverClient = Client(pushover_user_key, api_token=pushover_api_token)

Now for the retention period variables:

#* Set how far back you'd like to retain tweets
daysAgo = 10
oldestDateToKeep = datetime.now() - timedelta(days=daysAgo)

We set the daysAgo variable, then subtract that many days from the current time to get oldestDateToKeep. Anything created before that moment is old enough to delete.
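To make the arithmetic concrete, here's a small sketch with a fixed "now" so the result is reproducible. The cutoff_for helper is a hypothetical name for illustration and isn't part of the script.

```python
from datetime import datetime, timedelta

def cutoff_for(now, days_ago):
    """Return the oldest timestamp that should be kept."""
    return now - timedelta(days=days_ago)

# A fixed "now" keeps the example reproducible
now = datetime(2021, 3, 15, 12, 0, 0)
cutoff = cutoff_for(now, 10)
print(cutoff)  # 2021-03-05 12:00:00

# Anything created before the cutoff counts as too old
print(datetime(2021, 3, 1, 9, 30) < cutoff)   # True
print(datetime(2021, 3, 10, 9, 30) < cutoff)  # False
```

One thing to watch: datetime.now() returns local time with no timezone attached. If the timestamps your Tweepy version returns are in UTC, the cutoff can be off by your UTC offset. That hardly matters for a 10-day window, but it's worth knowing.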

And finally, we set up the script to accept CLI arguments that it can use as part of the automation process:

#* Import the CLI arguments, if any
def cliSetup():
	cliParser = argparse.ArgumentParser(description="Deletes and unfavorites tweets more than X days old.")
	cliParser.add_argument("-c", "--confirm", help="'y' confirms the deletion on launch, 'n' will just print the lists of collected tweets")
	args = cliParser.parse_args()
	return args.confirm
cliConfirm = cliSetup()

Ultimately, I added the CLI argument so that you can run this script via a cron job and have it auto-accept the confirmation prompt at the end of the script. Passing -c y executes the delete functions, whereas passing -c n just reports the number of tweets that would be deleted if the script were run.
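If you want to see how the argument behaves without touching Twitter at all, here's a standalone sketch. It passes explicit argument lists to parse_args instead of reading the real command line. The choices=["y", "n"] restriction is my own addition and changes the behavior slightly: argparse rejects any other value with an error, whereas the script above accepts anything and falls back to the interactive prompt.

```python
import argparse

def build_parser():
    parser = argparse.ArgumentParser(
        description="Deletes and unfavorites tweets more than X days old."
    )
    # choices= makes argparse reject anything other than 'y' or 'n'
    parser.add_argument("-c", "--confirm", choices=["y", "n"],
                        help="'y' deletes on launch, 'n' only reports counts")
    return parser

parser = build_parser()
print(parser.parse_args(["-c", "y"]).confirm)  # y
print(parser.parse_args(["-c", "n"]).confirm)  # n
print(parser.parse_args([]).confirm)           # None
```

When no -c is given, confirm is None, which is what sends the script to the interactive prompt later on.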


The Execution

Now I will run through the actual execution of the script. This is where the fun begins!

Collect all the tweets

First, we'll request up to 600 of our most recent tweets (rawTweets) and create an empty list (tweetsToDelete) to hold the IDs of the ones we want to delete:

#* Collect all the tweets we can (Important Fields: tweet.id, tweet.created_at, tweet.favorited, tweet.retweeted)
tweetsToDelete = []
rawTweets = tw.Cursor(api.user_timeline, id=user_id).items(600)

Then, we'll narrow this down to just the tweetsToDelete using the oldestDateToKeep variable we defined earlier:

for tweet in rawTweets:
	if tweet.created_at < oldestDateToKeep:
		tweetsToDelete.append(tweet.id)

Now, we'll do the same for favorites:

tweetsToUnfavorite = []
rawFavorites = tw.Cursor(api.favorites).items(600)
for tweet in rawFavorites:
	if tweet.created_at < oldestDateToKeep:
		tweetsToUnfavorite.append(tweet.id)
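Since the two loops are identical apart from their input, they could be folded into one helper. Here's a sketch that runs against fake tweet objects, so no API access is needed. The collect_expired_ids name and the sample data are made up for illustration.

```python
from datetime import datetime
from types import SimpleNamespace

def collect_expired_ids(items, cutoff):
    """Return the ids of items created before the cutoff."""
    return [item.id for item in items if item.created_at < cutoff]

cutoff = datetime(2021, 3, 5)

# Stand-ins for tweets: only the two fields the filter reads
fake_tweets = [
    SimpleNamespace(id=101, created_at=datetime(2021, 2, 20)),
    SimpleNamespace(id=102, created_at=datetime(2021, 3, 4)),
    SimpleNamespace(id=103, created_at=datetime(2021, 3, 9)),
]
print(collect_expired_ids(fake_tweets, cutoff))  # [101, 102]
```

In the real script, you'd pass rawTweets or rawFavorites as items and oldestDateToKeep as cutoff.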

Define the deletion function

Now that we've collected the tweets and favorites we want to delete, let's define our delete function for each item type:

def deleteTweets(tweetsList):
	deletedTweets = 0
	errorTweets = 0
	for tweet in tweetsList:
		try:
			api.destroy_status(tweet)
			print("Deleted:", tweet)
			deletedTweets += 1
		except Exception:
			print("Failed to delete:", tweet)
			errorTweets += 1
	return deletedTweets, errorTweets

and

def unfavoriteTweets(tweetsList):
	deletedFaves = 0
	errorFaves = 0
	for tweet in tweetsList:
		try:
			api.destroy_favorite(tweet)
			print("Unfavorited:", tweet)
			deletedFaves += 1
		except Exception:
			print("Failed to unfavorite:", tweet)
			errorFaves += 1
	return deletedFaves, errorFaves

Both functions work the same way: we start the counts of successfully deleted items and errors at 0, then go through the list of IDs and call api.destroy_status or api.destroy_favorite for each one. If the call succeeds, we increment the success count by 1. If it raises an exception, we increment the error count by 1.
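The same counting pattern can be written once and reused by passing the destroy call in as a parameter. Here's a sketch with a fake destroy function so it runs offline. The destroy_all and fake_destroy names are hypothetical and not part of the script.

```python
def destroy_all(ids, destroy):
    """Call destroy(id) for each id; return (succeeded, failed) counts."""
    succeeded = 0
    failed = 0
    for item_id in ids:
        try:
            destroy(item_id)
            succeeded += 1
        except Exception as error:
            # Report why the call failed instead of hiding the reason
            print("Failed:", item_id, "-", error)
            failed += 1
    return succeeded, failed

def fake_destroy(item_id):
    # Stand-in for an API call; pretends id 2 no longer exists
    if item_id == 2:
        raise ValueError("not found")

print(destroy_all([1, 2, 3], fake_destroy))  # (2, 1)
```

In the real script, you'd pass api.destroy_status or api.destroy_favorite as destroy.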

Execute the script

Now for the final bit of code, the actual execution of the script.

If -c y is passed at runtime, the following will execute:

if cliConfirm == 'y':
	pushoverClient.send_message("Deleting " + str(len(tweetsToDelete)) + " and unfavoriting " + str(len(tweetsToUnfavorite)) + " tweets now!", title="Autodelete Initialized")
	deletedTweets, errorTweets = deleteTweets(tweetsToDelete)
	deletedFaves, errorFaves = unfavoriteTweets(tweetsToUnfavorite)
	pushoverClient.send_message("Deleted " + str(deletedTweets) + " and unfavorited " + str(deletedFaves) + " tweets! Errors deleting " + str(errorTweets) + " tweets and " + str(errorFaves) + " favorites.", title="Autodelete Complete")

If -c n is passed at runtime, the following will execute:

elif cliConfirm=='n':
	print("No deletion requested - To delete, change your cli argument to '-c y'.")
	pushoverClient.send_message("Collected " + str(len(tweetsToDelete)) + " tweets to delete and " + str(len(tweetsToUnfavorite)) + " to unfavorite. No action taken!", title="Autodelete Incomplete")

And finally, if no CLI argument is passed, or if the script doesn't recognize its value, the script will summarize the total number of tweets and favorites it plans to delete and ask the user to enter y or n to proceed.

else:
	# Keep asking until the answer is y or n
	while (res := input('Do you want to delete ' + str(len(tweetsToDelete)) + ' and unfavorite ' + str(len(tweetsToUnfavorite)) + ' tweets? (y/n): ').lower()) not in {"y", "n"}:
		pass
	if res == 'y':
		print("Deleting/Unfavoriting tweets now!")
		deleteTweets(tweetsToDelete)
		unfavoriteTweets(tweetsToUnfavorite)
	elif res == 'n':
		print("Deletion cancelled!")
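The := operator in that loop requires Python 3.8 or newer. If you're on an older version, or just want something easier to test, the prompt can be pulled out into a small function. This is a sketch, and the ask_yes_no name and its read parameter are my own invention.

```python
def ask_yes_no(prompt, read=input):
    """Keep asking until the answer is 'y' or 'n' (case-insensitive)."""
    while True:
        answer = read(prompt).strip().lower()
        if answer in {"y", "n"}:
            return answer

# Scripted answers stand in for keyboard input
answers = iter(["maybe", "", "Y"])
print(ask_yes_no("Proceed? (y/n): ", read=lambda prompt: next(answers)))  # y
```

Leaving read at its default of input gives the same interactive behavior as the loop above.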

Gracefully close

And of course, we need to gracefully quit the script, lest we be relegated to a special place in development hell:

print("Exiting in 3 seconds...")
time.sleep(3)
sys.exit()

Setting up the cron job

Now for the final bit of work: making this script run every day. To accomplish this, I'm going to set up a daily cron job on my Linux system.

  1. Open the crontab editor by typing crontab -e.
  2. Add the following line to the bottom:
    5 4 * * * python3 /home/wendingtuo/autodelete_tweets/autodelete_tweets.py -c y
  3. If your editor is nano, hit Ctrl+X to close the window, type y to save the changes we made, and press Enter to confirm the file name.

This will save the cron job to your system and allow it to execute automatically at 4:05 AM every day.
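If the five leading fields look cryptic, they are minute, hour, day of month, month, and day of week, in that order, followed by the command to run. Here's a small sketch that splits the entry above into those parts. The split_cron_entry helper is hypothetical and only handles this simple five-field form.

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")

def split_cron_entry(entry):
    """Split a five-field crontab line into its schedule and its command."""
    # The first five whitespace-separated tokens are the schedule;
    # everything after them is the command, spaces included
    parts = entry.split(None, 5)
    schedule = dict(zip(CRON_FIELDS, parts[:5]))
    return schedule, parts[5]

entry = "5 4 * * * python3 /home/wendingtuo/autodelete_tweets/autodelete_tweets.py -c y"
schedule, command = split_cron_entry(entry)
print(schedule["minute"], schedule["hour"])  # 5 4
print(command)
```

Minute 5 of hour 4, with every other field left as *, is how we get 4:05 AM every day.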


Closing Thoughts

This was an extremely fun exercise for me. I really enjoy the methodical approach that programming demands. It's also really fun to look back on the commits I made along the way while developing this code to see how it evolved.

For example, I initially had no intention of adding command line arguments to the workflow, but once I got to setting up the cron job on the server, I realized I needed some way of testing that the cron job ran successfully without actually deleting all the tweets on my profile! And I know I probably should have been working on a development instance of Twitter, but I didn't have any other account with more than 10 days' worth of tweets.

Maybe for my next project, I'll create a Twitter account that tweets quotes or auto-favorites certain tweets so I have a development-safe environment with some actual historical data to manage.

Anyway, thanks for coming along for the ride!
