Text
Helpful Tip To Speed Up Forums

 
This forum is locked: you cannot post, reply to, or edit topics.   This topic is locked: you cannot edit posts or make replies.    howtodoit.myfreeforum.org Forum Index -> General Information
View previous topic :: View next topic  
Author Message
Please Register and Login to this forum to stop seeing this advertsing.





Add Karma

Posted:     Post subject:

Back to top
symon
Site Admin
Site Admin

Location: 

Joined: 13 Mar 2006
Posts: 866


Location: Hampshire Add Karma

PostPosted: Wed Mar 07, 2007 11:00 pm    Post subject: Helpful Tip To Speed Up Forums Reply with quote

- Index page

By default phpBB collects all topics that are new to see which forum have been updated since your last visit. When user post count approached 50,000 posts a day this process took more than 10 seconds to load. Eventually the process was dropped, and a cheat was put in-place.

This was taken out:

Code:
 if ( $userdata['session_logged_in'] )
   {
      $sql = "SELECT t.forum_id, t.topic_id, p.post_time
         FROM " . TOPICS_TABLE . " t, " . POSTS_TABLE . " p
         WHERE p.post_id = t.topic_last_post_id
            AND p.post_time > " . $userdata['user_lastvisit'] . "
            AND t.topic_moved_id = 0";
      if ( !($result = $db->sql_query($sql)) )
      {
         message_die(GENERAL_ERROR, 'Could not query new topic information', '', __LINE__, __FILE__, $sql);
      }

      $new_topic_data = array();
      while( $topic_data = $db->sql_fetchrow($result) )
      {
         $new_topic_data[$topic_data['forum_id']][$topic_data['topic_id']] = $topic_data['post_time'];
      }
   }



The cheat is the one line right after $forum_data[] = $row;

Code:
$sql = "SELECT f.*, p.post_time, p.post_username, u.username, u.user_id
        FROM (( " . FORUMS_TABLE . " f
        LEFT JOIN " . POSTS_TABLE . " p ON p.post_id = f.forum_last_post_id )
        LEFT JOIN " . USERS_TABLE . " u ON u.user_id = p.poster_id )
        ORDER BY f.cat_id, f.forum_order";



   if ( !($result = $db->sql_query($sql)) )
   {
      message_die(GENERAL_ERROR, 'Could not query forums information', '', __LINE__, __FILE__, $sql);
   }

   $forum_data = array();
   while( $row = $db->sql_fetchrow($result) )
   {
      $forum_data[] = $row;
        if ($row['post_time'] > $userdata['user_lastvisit']) $new_topic_data[$row['forum_id']][$row['topic_id']] = $row['post_time'];
   }


It just end up checking to see if the latest post is posted after the user's last visit time. Quick and dirty change.

- Users Online

The index page also scans the users plus session tables for active users. When online users reaches close to a thousand this check becomes costly in time. This isn't even a hack, I ended up having an external script on a cron job to run the exact query on a per minute basis. Result is stored in a hash table, and phpBB then queries the hash table for the list of on-line users.

This code was taken out

Code:
$user_forum_sql = ( !empty($forum_id) ) ? " WHERE session_page = " . intval($forum_id) : '';
   $sql = "SELECT u.username, u.user_id, u.user_allow_viewonline, u.user_level, s.session_logged_in, s.session_ip FROM ".USERS_TABLE." u, ".SESSIONS_TABLE." s WHERE u.user_id = s.session_user_id AND s.session_time >= ".( $board_config['time_now'] - 300 ) . " $user_forum_sql ORDER BY u.username ASC, s.session_ip ASC";


And replaced with a query that grabs everything from a hash table:

Code:
$user_forum_sql = ( !empty($forum_id) ) ? " WHERE session_page = " . intval($forum_id) : '';
    $sql = "SELECT * FROM " . $session_table_name . $user_forum_sql;


- Viewtopic.php

Perhaps this is one of the more interesting hack and more useful to everyone. When fetching a particular page within a topic, there is a large query that fetches the user info, post info, and post contents. I had the believe that we should always join queries to avoid latency so I never touched that query. As it turns out, when a thread gets bigger and bigger, a large joined query with the where and sort by clause will take a ton of memory to process. If we're looking for the last page of a topic with 1000 replies, MYSQL grabs the post text and user data from all the posts, then splits out the last 15. (or whatever posts each page shows)

So, instead of one query containing:


Code:
$sql = "SELECT u.username, u.user_id, u.user_posts, u.user_from, u.user_website, u.user_email, u.user_icq, u.user_aim, u.user_yim, u.user_regdate, u.user_msnm, u.user_viewemail, u.user_rank, u.user_sig, u.user_sig_bbcode_uid, u.user_avatar, u.user_session_time, u.user_allow_viewonline, u.user_allowsmile, p.*,  pt.post_text, pt.post_subject, pt.bbcode_uid
   FROM " . POSTS_TABLE . " p, " . USERS_TABLE . " u, " . POSTS_TEXT_TABLE . " pt
   WHERE p.post_id in  p.topic_id = $topic_id $limit_posts_time
      AND pt.post_id = p.post_id
      AND u.user_id = p.poster_id
   ORDER BY p.post_time $post_time_order
   LIMIT $start, ".$board_config['posts_per_page'];

Now start off with

Code:
$p_array = array();
$sql = "SELECT p.post_id FROM " . POSTS_TABLE . " p WHERE p.topic_id = $topic_id $limit_posts_time LIMIT $start, " . $board_config['posts_per_page'];
if ( !($result = $db->sql_query($sql)) )
{
   message_die(GENERAL_ERROR, "Could not obtain post index information.", '', __LINE__, __FILE__, $sql);
}
while (list($p_id) = $db->sql_fetchrow($result)) {
    $p_array[] = $p_id;
}
$post_index = implode(",",$p_array);

A list of post_id are obtained, now I slip it into the large query

Code:
$sql = "SELECT u.username, u.user_id, u.user_posts, u.user_from, u.user_website, u.user_email, u.user_icq, u.user_aim, u.user_yim, u.user_regdate, u.user_msnm, u.user_viewemail, u.user_rank, u.user_sig, u.user_sig_bbcode_uid, u.user_avatar, u.user_session_time, u.user_allow_viewonline, u.user_allowsmile, p.*,  pt.post_text, pt.post_subject, pt.bbcode_uid
   FROM " . POSTS_TABLE . " p, " . USERS_TABLE . " u, " . POSTS_TEXT_TABLE . " pt
   WHERE p.post_id in ($post_index)
      AND pt.post_id = p.post_id
      AND u.user_id = p.poster_id
   ORDER BY p.post_time $post_time_order";


And all is well. The difference is not noticeable on small systems. The latency in having 2 queries might actually make itself slower in some cases. Though on a forum with 3.3M posts, the difference is 5 minutes vs 1 minute on a topic with 70,000 replies. There is a performance increase even on topics with as little as a few hundred replies. Overall it saves memory, which made everyone happy.



_________________
Join me on FaceBook = symon field
Back to top
View user's profile Send private message
Display posts from previous:   
This forum is locked: you cannot post, reply to, or edit topics.   This topic is locked: you cannot edit posts or make replies.    howtodoit.myfreeforum.org Forum Index -> General Information All times are GMT
Page 1 of 1

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum

Card File  Gallery  Forum Archive
HOWTODOIT was designed and built by Symon Field
Creator Of ☆ FarnboroughInvinciblesAgonyAuntFairiesAsylum Alliance FunnyWorldHOWTODOIT PHPbb3
HOWTODOIT© Symon Field 2006,2007,2008

Powered by phpBB Group © 2001,2005.
 

Create your own free forum | Buy a domain to use with your forum