PHP Developers Network

A community of PHP developers offering assistance, advice, discussion, and friendship.
 

All times are UTC - 5 hours




PostPosted: Mon Apr 05, 2010 3:54 am 
Offline
Forum Commoner

Joined: Thu Apr 01, 2010 7:28 pm
Posts: 96
Location: Chicagoland, IL, USA


PostPosted: Mon Apr 05, 2010 3:42 pm 
Offline
DevNet Master

Joined: Wed Feb 11, 2004 4:23 pm
Posts: 4872
Location: Palm beach, Florida
I don't see how that is possible. It worked fine last week; it's not like the old syntax highlighter was just magically guessing where to put the line feeds. To me it looks like the contents of the posts were mutilated during the last 2 weeks, either by a person or by the upgrade process.

If this was a conscious decision (to not worry about old posts) then at least say so...

Assuming that the information is still there in the database (despite the fact we don't see it in the view source), which is highly probable, this is an easy fix. I don't see how you can observe hundreds of thousands of mutilated posts and not come to the conclusion that they were altered for a reason. The posts didn't just decide to commit suicide; either the upgrade process altered the content of the posts, or some new functionality is filtering out the line breaks. It really has to be that simple...


PostPosted: Tue Apr 06, 2010 4:33 am 
Offline
Site Administrator
User avatar

Joined: Sun May 19, 2002 10:24 pm
Posts: 6887
We will need a skilled programmer to create an algorithm which will parse the posts and format them correctly for the new parser. Josh, if you can go ahead and commit yourself to this that would be great. I'll arrange to send you a data set to work with.

Thanks!

PostPosted: Tue Apr 06, 2010 6:31 pm 
Offline
DevNet Master

Joined: Wed Feb 11, 2004 4:23 pm
Posts: 4872
Location: Palm beach, Florida
I'd be happy to help however I can.


PostPosted: Tue Apr 06, 2010 11:28 pm 
Offline
Site Administrator
User avatar

Joined: Sun May 19, 2002 10:24 pm
Posts: 6887
OK, this is where we are. I have written an algorithm to reparse the posts but have run out of time for the evening.

My code will take an existing record consisting of:

[syntax=php]<div class="php" id="{CB}" style="font-family: monospace;"><ol><li style="" class="li1"> </li><li style="" class="li2">    <span style="color: #000000; font-weight: bold;">function</span> <span style="color: #000000; font-weight: bold;">__construct</span><span style="color: #66cc66;">&#40;</span><span style="color: #0000ff;">$username</span>, <span style="color: #0000ff;">$password</span>, <span style="color: #0000ff;">$db_name</span>=<span style="color: #ff0000;">''</span>, <span style="color: #0000ff;">$host</span>=<span style="color: #ff0000;">''</span>, <span style="color: #0000ff;">$tns</span>=<span style="color: #ff0000;">''</span><span style="color: #66cc66;">&#41;</span></li><li style="" class="li1">    <span style="color: #66cc66;">&#123;</span></li><li style="" class="li2">        <span style="color: #808080; font-style: italic;">//Db Vars</span></li><li style="" class="li1">        <span style="color: #0000ff;">$this</span>-&gt;<span style="color: #006600;">username</span>       = <span style="color: #0000ff;">$username</span>;</li><li style="" class="li2">        <span style="color: #0000ff;">$this</span>-&gt;<span style="color: #006600;">password</span>   = <span style="color: #0000ff;">$password</span>;</li><li style="" class="li1">        <span style="color: #0000ff;">$this</span>-&gt;<span style="color: #006600;">db_name</span>        = <span style="color: #0000ff;">$db_name</span>;</li><li style="" class="li2">        <span style="color: #0000ff;">$this</span>-&gt;<span style="color: #006600;">host</span>           = <span style="color: #0000ff;">$host</span>;</li><li style="" class="li1">        <span style="color: #0000ff;">$this</span>-&gt;<span style="color: #006600;">tns</span>            = <span style="color: #0000ff;">$tns</span>;</li><li style="" class="li2">        </li><li style="" class="li1">        <span style="color: #808080; font-style: italic;">//Connect to db</span></li><li style="" class="li2">        <a 
href="http://www.php.net/if"><span style="color: #b1b100;">if</span></a><span style="color: #66cc66;">&#40;</span>!<span style="color: #66cc66;">&#40;</span><span style="color: #0000ff;">$this</span>-&gt;<span style="color: #006600;">conn</span> = self::<span style="color: #006600;">db_connect</span><span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">&#41;</span> die<span style="color: #66cc66;">&#40;</span><span style="color: #ff0000;">&quot;&lt;b&gt;Error:&lt;/b&gt; No Connection to Database&quot;</span><span style="color: #66cc66;">&#41;</span>;</li><li style="" class="li1">    <span style="color: #66cc66;">&#125;</span></li><li style="" class="li2">    </li><li style="" class="li1">    <span style="color: #000000; font-weight: bold;">function</span> db_connect<span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#41;</span></li><li style="" class="li2">    <span style="color: #66cc66;">&#123;</span></li><li style="" class="li1">        <a href="http://www.php.net/if"><span style="color: #b1b100;">if</span></a><span style="color: #66cc66;">&#40;</span><span style="color: #0000ff;">$oci</span> = @oci_connect<span style="color: #66cc66;">&#40;</span><span style="color: #0000ff;">$this</span>-&gt;<span style="color: #006600;">username</span>, <span style="color: #0000ff;">$this</span>-&gt;<span style="color: #006600;">password</span>, <span style="color: #0000ff;">$this</span>-&gt;<span style="color: #006600;">tns</span><span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">&#41;</span> <a href="http://www.php.net/return"><span style="color: #b1b100;">return</span></a> <span style="color: #0000ff;">$oci</span>;</li><li style="" class="li2">        <a href="http://www.php.net/else"><span style="color: #b1b100;">else</span></a> <a href="http://www.php.net/return"><span style="color: #b1b100;">return</span></a> <span style="color: 
#000000; font-weight: bold;">false</span>;</li><li style="" class="li1">    <span style="color: #66cc66;">&#125;</span></li></ol></div>[/syntax]


And convert it into this:

[syntax=php]
    function __construct&#40;$username, $password, $db_name='', $host='', $tns=''&#41;
    &#123;
        //Db Vars
        $this-&gt;username       = $username;
        $this-&gt;password   = $password;
        $this-&gt;db_name        = $db_name;
        $this-&gt;host           = $host;
        $this-&gt;tns            = $tns;
       
        //Connect to db
        if&#40;!&#40;$this-&gt;conn = self::db_connect&#40;&#41;&#41;&#41; die&#40;&quot;&lt;b&gt;Error:&lt;/b&gt; No Connection to Database&quot;&#41;;
    &#125;
   
    function db_connect&#40;&#41;
    &#123;
        if&#40;$oci = @oci_connect&#40;$this-&gt;username, $this-&gt;password, $this-&gt;tns&#41;&#41; return $oci;
        else return false;
    &#125;[/syntax]


You'll notice that there are still entities in the database record. The issue we are now running into is that if I run html_entity_decode() on the records, they display correctly, but the ability to edit them is lost. It seems that only certain characters must be encoded. If you can determine which entities these are by digging around in the code, that would be good.

<?php
set_time_limit(0);
require 'db.php';

# counter for progress indication
$i = 0;

# separate handles for reading (SELECT) and writing (UPDATE)
$dbConnector = new dbconnection();
$db  = $dbConnector->create('localhost', 'developer', '', 'forum', '3306');
$db2 = $dbConnector->create('localhost', 'developer', '', 'forum', '3306');

$db->query("SELECT post_id, post_text FROM phpbb_posts ORDER BY post_id DESC");

while ($post = $db->fetch_assoc()) {
    # split the post_text on code tags, keeping the tags themselves as chunks
    # (-1 means "no limit"; passing null is deprecated in newer PHP versions)
    $post_text = preg_split('#(\[/?(?:php|code)(?:=[a-z]*?)?(?::[^\]].*?)?\])#im', $post['post_text'], -1, PREG_SPLIT_DELIM_CAPTURE);

    # count the number of chunks
    $cnt = count($post_text);

    # if there is only 1, there are no matching code tags
    if ($cnt < 2) continue;

    # current state variables
    $new_post_text  = '';
    $is_code        = false;
    $c_open_tag     = null;
    $c_close_tag    = null;

    # iterate through the pieces
    foreach ($post_text as $chunk) {
        if ($is_code) { # reformat the code
            # convert <br> and </li> to new lines
            $chunk = preg_replace('#<br ?/? ?>|< ?/ ?li>#i', "\n", $chunk);

            # strip all the tags and trim surrounding whitespace;
            # entities are deliberately left encoded for now
            #$chunk = html_entity_decode(trim(strip_tags($chunk)));
            $chunk = trim(strip_tags($chunk));

            # processing of entities should be done here

            # append the code to the post, wrapped in the new tags
            $new_post_text .= "{$c_open_tag}{$chunk}{$c_close_tag}";
            $is_code = false;
        } else {
            # determine if this chunk is a code tag
            if (preg_match('#\[(/)?(?:php|code)(?:=([a-z]*?))?(?::[^\]].*?)?\]#im', $chunk, $matches)) {

                # is this an opening or closing code tag?
                if (isset($matches[1]) && $matches[1] == '/') {
                    # closing tags were already written along with the code chunk
                    $is_code = false;
                    continue;
                } else {
                    # next chunk contains code
                    $is_code = true;

                    # determine the best code tags to use based on the previous tags
                    if (empty($matches[2]) || $matches[2] == 'text') {
                        $c_open_tag = '[syntax=php]';
                    } else {
                        $c_open_tag = "[syntax={$matches[2]}]";
                    }

                    $c_close_tag = '[/syntax]';
                }
            } else {
                # this chunk is not a code tag, so it's a part of the post text
                $new_post_text .= $chunk;
                $is_code = false;
            }
        }
    }

    # update the post (post_id is cast to int before being placed in the query)
    $post_id = (int) $post['post_id'];
    $db2->query("UPDATE phpbb_posts SET post_text = '" . $db->escape($new_post_text) . "' WHERE post_id = {$post_id}");

    # output a status message after processing every 100 posts
    if (++$i % 100 == 0) echo "Updated " . number_format($i) . " posts\n";
}
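
For anyone who wants to try the clean-up step on a single fragment without touching the database, here is a minimal sketch of just that step pulled out of the loop. The function name highlighted_to_plain() and the sample markup are made up for illustration; the sample only imitates the shape of the stored highlighter output shown earlier.

```php
<?php
// Hypothetical helper (not part of the script above): isolates the per-chunk clean-up.
function highlighted_to_plain(string $html): string
{
    // Each <br> or </li> in the highlighter output marks the end of a source line.
    $text = preg_replace('#<br ?/? ?>|< ?/ ?li>#i', "\n", $html);

    // Drop the remaining markup (<div>, <ol>, <li>, <span>, <a>) and trim the ends.
    // Entities such as &#40; and &gt; are left exactly as stored.
    return trim(strip_tags($text));
}

// Made-up two-line sample in the same shape as the stored records.
$sample = '<div class="php"><ol>'
    . '<li class="li1"><span style="color: #0000ff;">$a</span> = foo<span>&#40;</span><span>&#41;</span>;</li>'
    . '<li class="li2">    <span style="color: #b1b100;">return</span> <span style="color: #0000ff;">$a</span>;</li>'
    . '</ol></div>';

echo highlighted_to_plain($sample), "\n";
```

Note that trim() also removes the indentation of the very first line, which matches the converted record shown earlier, where the leading blank line is kept by the [syntax=php] tag rather than by the code itself.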
 

PostPosted: Wed Apr 07, 2010 2:38 am 
Offline
DevNet Master

Joined: Wed Feb 11, 2004 4:23 pm
Posts: 4872
Location: Palm beach, Florida
That's wonderful news. I don't know anything about the new syntax highlighter, or which version of phpBB we are using, so I wouldn't know what code to dig around in.

Also, as a side note, it's too bad the new highlighter doesn't add line numbers. That was another complaint I had.

However, I think your existing algorithm works fine. The temptation to get the "edit posts" working is probably unnecessary. As long as I can see old posts I would be very happy. If someone wants to edit a year old post, they can copy and paste the code from the view page, onto the edit page.

The real problem was that I could not get a copy of my old posts, and this should solve that, I would think. Not being able to edit my year-old posts without messing up a little formatting is a much better situation than not being able to SEE them at all (as it stands right now, if I can't see them I can't edit them anyway, so it's just an improvement)!

What do you say? Let's roll with it... I like it... Thanks for your time in fixing the old posts.


PostPosted: Wed Apr 07, 2010 11:17 am 
Offline
Site Administrator
User avatar

Joined: Sun May 19, 2002 10:24 pm
Posts: 6887
I have a hunch it could be < > characters. I'll do some more research.
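
If that hunch holds, one way to test it would be to decode everything and then re-encode only the HTML-special characters. This is only a sketch under that assumption: the helper name normalize_entities() is hypothetical, and whether this is the exact set of entities the edit form expects has not been confirmed from the phpBB code.

```php
<?php
// Hypothetical helper: decode every entity, then re-encode only & < > and ".
// ASSUMPTION: these are the characters that must stay encoded; not verified against phpBB.
function normalize_entities(string $code): string
{
    // Turns &#40; into (, &#123; into {, &gt; into >, &quot; into ", and so on.
    $decoded = html_entity_decode($code, ENT_QUOTES, 'UTF-8');

    // Re-encodes only the HTML-special characters (single quotes are left alone).
    return htmlspecialchars($decoded, ENT_COMPAT, 'UTF-8');
}

$stored = 'if&#40;$this-&gt;conn&#41; die&#40;&quot;&lt;b&gt;Error&lt;/b&gt;&quot;&#41;;';

echo normalize_entities($stored), "\n";
// prints: if($this-&gt;conn) die(&quot;&lt;b&gt;Error&lt;/b&gt;&quot;);
```

The result would slot in at the "processing of entities should be done here" comment in the reparse script, and it is worth trying on a copy of a few records before running it against the whole table.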

PostPosted: Thu Apr 08, 2010 12:47 am 
Offline
Site Administrator
User avatar

Joined: Sun May 19, 2002 10:24 pm
Posts: 6887

PostPosted: Thu Apr 08, 2010 2:38 am 
Offline
DevNet Master

Joined: Wed Feb 11, 2004 4:23 pm
Posts: 4872
Location: Palm beach, Florida


PostPosted: Thu Apr 08, 2010 3:37 am 
Offline
Forum Commoner

Joined: Thu Apr 01, 2010 7:28 pm
Posts: 96
Location: Chicagoland, IL, USA
Good show Benjamin! Thank you so much!

I noticed that the first post in the Posting Code thread still has one wonky block, but I have noticed the changes in the tutorials and a few other posts.

I'll dutifully report any issues I see to you guys.


PostPosted: Thu Apr 08, 2010 12:19 pm 
Offline
Site Administrator
User avatar

Joined: Sun May 19, 2002 10:24 pm
Posts: 6887

PostPosted: Mon Apr 12, 2010 9:05 pm 
Offline
DevNet Master

Joined: Wed Feb 11, 2004 4:23 pm
Posts: 4872
Location: Palm beach, Florida
viewtopic.php?f=20&t=114699
FYI, "[syntax=foo]tags have been depreciated. They still work, but the new Geshi system is significantly more advanced. Use [syntax=foo]"....
Don't you hate it when find & replace replaces the wrong stuff? :-)


PostPosted: Mon Apr 12, 2010 10:34 pm 
Offline
Site Administrator
User avatar

Joined: Sun May 19, 2002 10:24 pm
Posts: 6887
Fixed.
