Bitcoin Forum
Author Topic: Help me with a regex  (Read 584 times)
Bitcoin_BOy$ (OP)
June 26, 2015, 03:31:12 AM
 #1

Hello,
I want to build a regex to get the text between quote tags.

I built this one, but it does not seem to work well: (\[quote?.*\])(.*)(\[.*\])
Could anyone please help me build one that works in PHP?

Thanks for your time,
Bitcoin Boy.


Note: I don't know if this is the right way to ask.
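
For reference, here is a minimal sketch of how the pattern above might be tightened. It assumes the input is raw BBCode with non-nested [quote]...[/quote] blocks; the sample string and variable names are made up for illustration. The `.*` in the original pattern is greedy and, without the s modifier, does not cross newlines; a lazy `.*?` with an explicit closing tag avoids both problems.

```php
<?php
// Hypothetical sample input; not taken from the thread.
$sPost = "[quote author=alice]first\nquote[/quote] reply [quote]second[/quote]";

// \[quote[^\]]*\]  opening tag with optional attributes
// (.*?)            lazy capture of the quoted text
// \[\/quote\]      explicit closing tag
// i = case-insensitive, s = dot also matches newlines
$sBbPattern = '/\[quote[^\]]*\](.*?)\[\/quote\]/is';

preg_match_all($sBbPattern, $sPost, $aBbMatches);

// $aBbMatches[1] holds the captured quote bodies.
print_r($aBbMatches[1]);
```

This sketch does not handle nested quotes; it only shows the lazy-match idea.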
secrethedgehog
June 26, 2015, 03:57:54 AM
 #2

Well, you would have to fetch the page source first, then use preg_match.

Test this out on any thread. It probably won't work for all of them, but it's a start; I need to look into it more.

Quote
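<?php

// Thread to scrape.
$sForumThread = 'https://bitcointalk.org/index.php?topic=1095300.msg11689120#msg11689120';

// Fetch the raw HTML; file_get_contents() returns false on failure.
$sRequest = file_get_contents($sForumThread);
if ($sRequest === false) {
    die('Could not fetch the thread.');
}

// Group 1: text of the quote header link; group 2: quote body.
// Lazy quantifiers stop at the first closing tag; the s modifier lets . match newlines.
$sPattern = '|<div class="quoteheader"><a[^>]*>(.*?)</a></div><div class="quote">(.*?)</div>|s';

preg_match_all($sPattern, $sRequest, $sMatches);

var_dump($sMatches);

?>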
bitnanigans
June 26, 2015, 01:02:59 PM
 #3

You can use this pattern:
Code:
/<div class="quote">(.*?)<\/div>/is

However, this gets tricky with nested quotes, because the lazy match stops at the first closing </div>. Looking at the forum thread's HTML, there appears to always be a <br> after the div.quote element, so this may achieve the desired result for nested quotes:
Code:
/<div class="quote">(.*?)<\/div><br/is

It may not work 100% of the time though, so you should always check your results.
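
To make the difference concrete, here is a small sketch that runs both patterns against a made-up nested quote. The sample HTML is an assumption, not copied from the forum: it puts a <br /> only after the outer div.quote. If the inner quote were also followed by <br, the second pattern would stop early as well.

```php
<?php
// Hypothetical markup: an inner quote nested inside an outer quote.
$sHtml = '<div class="quote"><div class="quote">inner</div>outer text</div><br />reply';

// Pattern 1: stops at the first closing </div>, which belongs to the inner quote.
preg_match_all('/<div class="quote">(.*?)<\/div>/is', $sHtml, $aPlain);

// Pattern 2: keeps going until a </div> that is immediately followed by <br.
preg_match_all('/<div class="quote">(.*?)<\/div><br/is', $sHtml, $aWithBr);

print_r($aPlain[1]);   // truncated capture, ends inside the outer quote
print_r($aWithBr[1]);  // capture of the whole outer quote, inner markup included
```

Note that the second capture still contains the inner quote's tags, so further cleanup would be needed to get plain text.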

hexafraction
June 26, 2015, 03:47:21 PM
 #4

Is it necessary to use a regex? Regexes can easily break (especially ones like bitnanigans' hack that rely on extra tags) if a forum software update slightly changes the generated HTML or if the theme changes, and they can also trip over edge cases that the forum software allows or creates. You might have better luck with a DOM/HTML parser.
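
As a rough illustration of the parser route, here is a sketch using PHP's built-in DOMDocument and DOMXPath. This is only one possible choice, and the sample markup is invented; the real page structure may differ, and the XPath test assumes the class attribute is exactly "quote".

```php
<?php
// Hypothetical sample markup mimicking the div.quote structure discussed above.
$sMarkup = '<div class="post"><div class="quote"><div class="quote">inner</div>outer text</div><br />reply</div>';

// Collect parse warnings internally instead of printing them.
libxml_use_internal_errors(true);
$oDoc = new DOMDocument();
$oDoc->loadHTML($sMarkup);
libxml_clear_errors();

$oXPath = new DOMXPath($oDoc);
// Select every div whose class attribute is exactly "quote", nested or not.
$oNodes = $oXPath->query('//div[@class="quote"]');

$aQuotes = [];
foreach ($oNodes as $oNode) {
    // textContent is the node's text with all tags stripped.
    $aQuotes[] = $oNode->textContent;
}

print_r($aQuotes);
```

Because the parser tracks nesting itself, the outer and inner quotes come back as separate nodes, with no reliance on a trailing <br>.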

Bitcoin_BOy$ (OP)
June 26, 2015, 03:58:35 PM
 #5


Thanks, I used a DOM parser because a regex is not reliable and the result would not be 100% correct.
Problem resolved!

Bitcoin Boy.