preg_split problem

PHP programming forum. Ask questions or help people concerning PHP code. Don't understand a function? Need help implementing a class? Don't understand a class? Here is where to ask. Remember to do your homework!

Moderator: General Moderators

Post Reply
daboss
Forum Newbie
Posts: 2
Joined: Mon Oct 31, 2011 11:04 pm

preg_split problem

Post by daboss »

Okay so I've got this very simply preg_split function searching for a semicolon (';') in a very long string (that is actually an sql query exported from mySQL).

Code: Select all

$queries = preg_split('/;/',$sql);
The string it is searching through ($sql):

Code: Select all

$sql = "
-- phpMyAdmin SQL Dump
-- version 3.4.5
-- http://www.phpmyadmin.net
--
-- Host: localhost
-- Generation Time: Oct 30, 2011 at 09:38 AM
-- Server version: 5.1.56
-- PHP Version: 5.2.9

SET SQL_MODE="NO_AUTO_VALUE_ON_ZERO";
SET time_zone = "+00:00";


/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!40101 SET NAMES utf8 */;

--
-- Database: `newssc3_bt`
--

-- --------------------------------------------------------

--
-- Table structure for table `wp_links`
--

CREATE TABLE IF NOT EXISTS `wp_links` (
  `link_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `link_url` varchar(255) NOT NULL DEFAULT '',
  `link_name` varchar(255) NOT NULL DEFAULT '',
  `link_image` varchar(255) NOT NULL DEFAULT '',
  `link_target` varchar(25) NOT NULL DEFAULT '',
  `link_description` varchar(255) NOT NULL DEFAULT '',
  `link_visible` varchar(20) NOT NULL DEFAULT 'Y',
  `link_owner` bigint(20) unsigned NOT NULL DEFAULT '1',
  `link_rating` int(11) NOT NULL DEFAULT '0',
  `link_updated` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `link_rel` varchar(255) NOT NULL DEFAULT '',
  `link_notes` mediumtext NOT NULL,
  `link_rss` varchar(255) NOT NULL DEFAULT '',
  PRIMARY KEY (`link_id`),
  KEY `link_visible` (`link_visible`)
) ENGINE=MyISAM  DEFAULT CHARSET=utf8 AUTO_INCREMENT=28 ;

--
-- Dumping data for table `wp_links`
--

INSERT INTO `wp_links` (`link_id`, `link_url`, `link_name`, `link_image`, `link_target`, `link_description`, `link_visible`, `link_owner`, `link_rating`, `link_updated`, `link_rel`, `link_notes`, `link_rss`) VALUES
(22, 'http://kikolani.com/', 'Kikolani', '', '', '', 'Y', 1, 0, '0000-00-00 00:00:00', '', '', ''),
(23, 'http://jobhits.net/blog/', 'Job Search Blog', '', '', 'Lastest news and updates recruitment and employment.', 'Y', 2, 0, '0000-00-00 00:00:00', '', '', ''),
(24, 'http://jobhits.net/Seo-jobs', 'Seo Jobs', '', '', 'Seo Jobs', 'Y', 2, 10, '0000-00-00 00:00:00', '', '', 'http://jobhits.net/services/rss?k=seo'),
(21, 'http://www.webuildyourblog.com/', 'Blogging Guide', '', '', '', 'Y', 1, 0, '0000-00-00 00:00:00', '', '', ''),
(17, 'http://ma.tt/', 'Matt Mullenweg', '', '', '', 'Y', 1, 0, '0000-00-00 00:00:00', '', '', ''),
(19, 'http://www.blogussion.com/', 'Blogussion', '', '', '', 'Y', 1, 0, '0000-00-00 00:00:00', '', '', ''),
(20, 'http://weblogbetter.com/', 'We Blog Better', '', '', '', 'Y', 1, 0, '0000-00-00 00:00:00', '', '', ''),
(25, 'http://www.buyrealfacebookfans.com', 'Get Facebook Fans', '', '', '', 'Y', 1, 0, '0000-00-00 00:00:00', '', '', ''),
(26, 'http://www.cssformatter.com/', 'Format CSS', '', '', '', 'Y', 1, 0, '0000-00-00 00:00:00', '', '', ''),
(27, 'http://wwww.place.vn/', 'Vietnam Place', '', '_blank', 'Vietnam Places', 'Y', 2, 0, '0000-00-00 00:00:00', '', '', '');

/*!40101 SET CHARACTER_SET_CLIENT=@OLD_CHARACTER_SET_CLIENT */;
/*!40101 SET CHARACTER_SET_RESULTS=@OLD_CHARACTER_SET_RESULTS */;
/*!40101 SET COLLATION_CONNECTION=@OLD_COLLATION_CONNECTION */;
";
The idea is, ultimately, to use the array created by pre_split to break up the very large query into individual queries I can send through with the mysql_query() function, but I'm not there yet. My problem is that preg_split trips when it reaches this part:

Code: Select all

SET SQL_MODE="NO_AUTO_VALUE_ON_ZERO";
SET time_zone = "+00:00";
Why?!?
If I take that part out, preg_split functions properly and splits the giant string just the way I want it. But it won't process the above lines. if I include those lines it doesn't store anything into the array. any help would be greatly appreciated ...
User avatar
twinedev
Forum Regular
Posts: 984
Joined: Tue Sep 28, 2010 11:41 am
Location: Columbus, Ohio

Re: preg_split problem

Post by twinedev »

Simple:

You are using double quotes inside of of double quotes without escaping them, so PHP sees the quote ending, and then following by a what it will assume is a constant, and just rammed up against a string, so it will error.

When you past the DUMP data to your editor, before you wrap it with $sql = ""; you need to do a search/replace over that chunk of code and replace " with \"

(Also, if you look at your post here, you will notice this in syntax highlighting on those two lines, the info between the quotes are not blue as the rest of the string is. (this is where a decent IDE can really help.)

-Greg
daboss
Forum Newbie
Posts: 2
Joined: Mon Oct 31, 2011 11:04 pm

Re: preg_split problem

Post by daboss »

Thanks for identifying the problem, Greg. I suspected it was the quotation marks tripping. A search and replace can't be the solution as the whole purpose of the script is to automate sending SQL queries by putting them in an array. I will figure it out from here though. Thanks!
User avatar
twinedev
Forum Regular
Posts: 984
Joined: Tue Sep 28, 2010 11:41 am
Location: Columbus, Ohio

Re: preg_split problem

Post by twinedev »

You can also do the following:

Code: Select all

$sql = <<< ENDOFSQL
(pasted SQL code...)
ENDOFSQL;
Some other things to consider....

1. No need to execute items that are only comments
2. Via PHP, how does a query with a single line comment + line break + actual SQL statement get treated (never tried it)
3. What happens if part of a value actually contains a line break with type hyphens starting the next line
4. (most likely issue) What happens if part of a value contains a semicolon, the initial split would have split here as well.

Here is some sample code that handles all of that, with the following assumptions which I think is normal for a SQL DUMP:
1. line comments only start with type hyphens, not a hash (#), and they are always at the beginning of the line
2. block comments are never within a set of quotes as data.

Code: Select all


$aryLines = explode(';',$sql);

$strPrevLine = '';
foreach($aryLines as $strCmd) {
	$strCleanCmd = ($strPrevLine!='') ? $strPrevLine.';'.$strCmd : $strCmd;
	$strCleanCmd = trim($strCleanCmd); // mainly to get rid the newline at the start for the next one
	$strCleanCmd = preg_replace('/^(--.*?(\r?\n)+)+/s', '', $strCleanCmd); // Get rid of line comments (only from start of line)
	$strCleanCmd = preg_replace('%/\*.*?\*/%s', '', $strCleanCmd); // Get rid of block comments
	$strCleanCmd = trim($strCleanCmd);
	if (substr_count(str_replace("\'",'',$strCleanCmd),"'")%2) {
		// There was an odd number of single quotes, so the semicolon here was part of a string....
		$strPrevLine .= ';'.$strCmd; // Add raw current line to previous line and loop
	}
	else {
		// There was even number of single quotes, so should be a good semicolon break
		if ($strCleanCmd!='') {
			// EXECUTE QUERY HERE
			echo "<hr>$strCleanCmd<hr>";
		}
		$strPrevLine = ''; // Reset to blank
	}
}
-Greg
Post Reply