I run a Django website. It's a pilot logbook application, where pilots can use the site to keep track of their flying experience. Each flight you enter into your logbook is represented by the Flight model. The Flight model contains all the information about the flight, such as comments about the flight, length of the flight, when the flight took place, and the route of the flight. The route of the flight is represented by the Route model, which is attached to the Flight model by a ForeignKey relation. The user logs a route by entering a string into the route field of the FlightForm, such as "KLGA - KBOS - KLGA", which represents a flight from La Guardia to Boston, then back to La Guardia. When the FlightForm is saved, the string gets translated to a Route instance via a class method I created. Adding the route to the flight looks a little like this:
r=Route.from_string("KLGA-KBOS-KLGA")
flight.route = r
flight.save()
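The body of Route.from_string isn't shown here. As a rough sketch of just the string-splitting step, assuming identifiers are separated by hyphens with optional whitespace around them (the helper name parse_route_string is hypothetical and not part of the site's code), it might look something like this:

```python
def parse_route_string(route):
    """Split a route string such as "KLGA - KBOS - KLGA" into identifiers.

    Hypothetical helper for illustration only; the real from_string method
    also creates the Route and RouteBase rows in the database.
    """
    # Split on hyphens, trim whitespace, upper-case each identifier,
    # and skip empty pieces left behind by stray separators.
    return [part.strip().upper() for part in route.split("-") if part.strip()]
```

Each identifier in the returned list would then be looked up as an Airport and attached to the route in order.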
The problem is that the Route object is fairly complex. It contains a few fields that represent pre-rendered HTML, as well as RouteBase instances (in this case three: two for KLGA and one for KBOS). Each RouteBase object is in turn connected to an Airport instance, which stores the airport's name, city, and coordinates. Since the Route object needs foreign key relations for it to make sense (a route with no airports doesn't make sense), the Route object has to be saved to the database before the RouteBase instances can be added.
What ends up happening is that whenever a flight is edited, whether or not the route is edited, a new Route instance is created and the old one is just discarded. Likewise, if the user deletes a flight, the Route object stays behind. Over time, orphaned routes start to build up, and may cause queries to be slower than they otherwise could be.
The Solution:
The solution here is quite easy. Create a function like this:
def delete_empty_routes():
    from route.models import Route
    # delete every Route that no Flight points to
    Route.objects.filter(flight__isnull=True).delete()
...and run it once a day.
But how should we do this? There are a few ways. Some are easy, and some are hard. The easiest way (well, the way I do it) is to make this function a view:
def delete_empty_routes(request):
    from route.models import Route
    from django.http import HttpResponse
    qs = Route.objects.filter(flight__isnull=True)
    c = qs.count()  # count before deleting, so we can report it
    qs.delete()
    return HttpResponse("deleted %s Routes" % c,
                        content_type="text/plain")
Attach this view to a URL, and whenever you want to clear out all empty routes, just hit that URL. For automation, you can easily add this to your crontab:
30 6 * * * wget http://domain.com/delete-empty-routes
...which will hit that URL every day at 6:30 AM and clear out all empty routes.
Another Problem:
My site also has a feature where the user can elect to have an email sent to them every few days or weeks with a zipped CSV file attached that contains all their flights they have logged with the site. In the aviation world, your logbook data is very important. If you lose all that data you are screwed. A lot of my users are apprehensive about storing their data with me, fearing that the site will go down one day, taking their data with it. Once I implemented the email feature, I saw my signups skyrocket.
Another Solution:
Again, the solution to this problem is to create a view function that emails each user a backup of their data:
def email_all_users(request, interval):
    from django.http import HttpResponse
    # interval = an int representing weekly/monthly/biweekly
    profiles = Profile.objects.filter(backup_freq=interval)
    for p in profiles:
        email_backup(p.user)  # create backup, then send it to the user
    return HttpResponse("success!")
...then set it to a URL, and then to a crontab:
30 3 1 * * wget http://domain.com/send_email-1
30 3 7 * * wget http://domain.com/send_email-2
30 3 14 * * wget http://domain.com/send_email-3
30 3 21 * * wget http://domain.com/send_email-4
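The email_backup helper called in the view isn't shown in this post. As a minimal sketch of only the "zipped CSV" part, using nothing but the standard library, something like the following could build the attachment in memory. The function name build_backup_zip, its parameters, and the default filename are all hypothetical, chosen for illustration:

```python
import csv
import io
import zipfile


def build_backup_zip(rows, header, filename="logbook.csv"):
    """Return the bytes of a zip archive holding a single CSV file.

    Hypothetical helper: `rows` is an iterable of sequences (one per
    flight) and `header` is the sequence of column names.
    """
    # Write the CSV into an in-memory text buffer.
    csv_buffer = io.StringIO()
    writer = csv.writer(csv_buffer)
    writer.writerow(header)
    writer.writerows(rows)

    # Compress that CSV text into an in-memory zip archive.
    zip_buffer = io.BytesIO()
    with zipfile.ZipFile(zip_buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(filename, csv_buffer.getvalue())
    return zip_buffer.getvalue()
```

The returned bytes could then be attached to the outgoing message, for example with the attach method of Django's EmailMessage, without ever writing a temporary file to disk.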
One last Problem:
Now you successfully have a URL on your site that, when hit, will send an email to each user who has elected to receive backups. The problem now is that you have a URL on your site that, when hit, sends an email to each user. What if a search engine gets hold of this URL? What if a user gets hold of it? Your users will get spammed.
One last Solution:
To protect this function from being hit by unauthorized persons, we must edit the function a little bit:
def email_all_users(request, interval):
    from django.conf import settings
    from django.http import HttpResponse
    # make sure the secret key is present, or else fail;
    # wget sends the key in the query string, so read it from GET
    assert request.GET.get('key', "") == settings.SECRET_KEY
    # interval = an int representing weekly/monthly/biweekly
    profiles = Profile.objects.filter(backup_freq=interval)
    for p in profiles:
        email_backup(p.user)  # create backup, then send it to the user
    return HttpResponse("success!")
Now in your crontab, add the following:
30 3 1 * * wget http://domain.com/send_email-1?key=4f46g...
30 3 7 * * wget http://domain.com/send_email-2?key=4f46g...
30 3 14 * * wget http://domain.com/send_email-3?key=4f46g...
30 3 21 * * wget http://domain.com/send_email-4?key=4f46g...
If you want to be even more thorough, you can have the crontab call a Python script which adds your settings file to the Python path (if it isn't already there) and imports the SECRET_KEY that way, but copying and pasting the SECRET_KEY into the crontab works too. We could also have the view return a 404 error instead of failing an assertion, but either way it still works.
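As a sketch of the key check pulled out on its own, so that the view can decide what to do when it fails, here is one possible helper. The name key_is_valid is hypothetical; it uses the standard library's hmac.compare_digest rather than ==, which is a swap from the comparison used in the view above:

```python
import hmac


def key_is_valid(supplied, expected):
    """Return True only if the supplied key matches the expected key.

    Hypothetical helper for illustration; both arguments are text strings.
    """
    # A missing or empty key on either side never counts as a match.
    if not supplied or not expected:
        return False
    # compare_digest is designed to avoid revealing, through timing,
    # how much of the supplied key was correct.
    return hmac.compare_digest(supplied.encode("utf-8"),
                               expected.encode("utf-8"))
```

In the view, a False result could be turned into a 404 by raising Http404 from django.http, so an unauthorized visitor can't tell the URL apart from one that doesn't exist.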
You do not hit your own urls from cron jobs...this is *wrong*
You should have the crontabs just execute Python scripts that include the Django environment (look at setup_environ). When sending emails to users, use EmailMessage and give them all the same SMTPConnection object.