published: 2008/11/23. tags: sql optimization, dos

SQL Optimization and DOS (Denial of Service)

Watchout for the bad guys!

Last few years, Social Networking and as part of that, user generated content has gained a lot of popularity. If you have a website that allows users to provide their own content, there are always people who try to abuse it. I recently got a firsthand experience of this after having experimented with a website that allowed people provide their own content. Ofcourse, the more popular a website becomes, the more the problems are. The website I am talking about has been listed on a few websites that list the latest cool websites. What's more, it was even talked about in a local TV channel! As a result, a lot of traffic poured in and that's when the abuse of the power to provide user generated content started.

This issue immediately prompted for a few changes to the system like providing a way for the administrators to censor the content. It is within this context, I want to write about a specific issue of over optimizing a SQL could lead to denial of service attacks.

The data model Say there are two tables, header and lines with 1-to-many relationship. The report displays the headers and their lines based on the most recent N lines which could span across headers. So, initially the query was simply

  select h.*,l.*
    from headers h,lines l,(select distinct header_id from lines where ... order by creation_date limit 0,50) sl
   where l.header_id = h.id
     and h.id = sl.header_id
This worked fine till the abuse incident started. To fix the problem a censor flag has been added to the header. Now, the query has been initially modified to
  select h.*,l.*
    from headers h,lines l,(select distinct header_id from lines where ... order by creation_date limit 0,50) sl
   where l.header_id = h.id
     and h.id = sl.header_id
     and h.censored = 0;
Here, the query tries to filter the headers which have been censored. This should fix the problem right? Well, yes it does. But there is one small problem. What if the person trying to abuse the system tries to continuously create lines against the censored header? Say the most 50 recent lines happened to be against 10 headers that have been censored. Now the report is going to be blank. The root cause of this is the fact that the above query is trying to use as fewer tables in the join as possible. So, the right way to fix this issue is
  select h.*,l.*
    from headers h,lines l
        ,(select distinct l.header_id from lines l,headers h where ... 
                and h.censored = 0 order by creation_date limit 0,50) sl
   where l.header_id = h.id
     and h.id = sl.header_id
Essentially the lines have to be joined with the header in the sub-query as well and to start with pick only those headers that are not censored among the recent lines and then again join with the headers and lines.

© 2008 Dirisala.Net/articles