Bitcoin Forum

Other => Meta => Topic started by: andytoshi on August 29, 2014, 10:38:01 PM



Title: Non-truncated RSS feeds
Post by: andytoshi on August 29, 2014, 10:38:01 PM
As described at http://wiki.simplemachines.org/smf/SMF2.0:News_and_newsletters#Settings it is possible to change the RSS feeds so that they do not truncate the text (setting Maximum message length to zero means no limit).

Currently the limit is set to some small number, which makes the RSS feed unusable without a web browser. The wiki page I linked recommends this "because some users have broken RSS readers", which is a non sequitur, but presumably is the reason for this setting.

If we could remove the RSS message length limit, that'd be awesome. I get most of my news over RSS in my email inbox, and it severely interrupts my flow when individual feeds require special attention to be readable. It also prevents me from forwarding or archiving forum messages.

Thanks!

Andrew


Title: Re: Non-truncated RSS feeds
Post by: btc4ever on October 09, 2014, 02:48:09 AM
The truncation breaks the HTML badly. Here is a real example from this site's RSS feed today:

Quote
                        <description>
<![CDATA[I can&#39;t believe the senators applauded him at the end. &nbsp;<img src="https://bitcointalk.org/Smileys/default/shocked.gif" alt="Shocked" border="0" /><br /><br /><a href="http://hocca.wmod.llnwd.net/a4502/e2/20141008161400_9692_990.wmv" target="_blank"><img class="userimg" src="https://ip.bitcointalk.org/?u=http%3A%2F%2Fs27.postimg.org%2Fjvou3ugsj%2Fzandreas.png]]>
                        </description>

Notice that the final <img> tag is truncated. What isn't immediately obvious is that the image URL itself is truncated too.

The full image URL is:

https://ip.bitcointalk.org/?u=http%3A%2F%2Fs27.postimg.org%2Fjvou3ugsj%2Fzandreas.png&t=545&c=QfibD-q76XHMGA

NOT

https://ip.bitcointalk.org/?u=http%3A%2F%2Fs27.postimg.org%2Fjvou3ugsj%2Fzandreas.png

The latter URL gives an "image proxy invalid" image.

So in this instance we have:

1) unterminated html tag/attribute <img src="...
2) invalid URL

This is an absolute mess for the RSS consumer to deal with.
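
For anyone consuming the feed in the meantime, here is a rough sketch of how a reader could at least detect that a description was cut off inside a tag, so it can discard the broken tail. It is only an illustration: the function name ends_inside_tag is made up for this post, and the scanner is deliberately naive (it ignores HTML comments and treats any "<" as the start of a tag).

```python
def ends_inside_tag(fragment):
    """Return True if the HTML fragment stops partway through a tag.

    Hypothetical helper, not part of SMF or any feed library.
    Assumption: every '<' opens a tag and quoted attribute values
    may contain '>' characters.
    """
    in_tag = False   # currently between '<' and its closing '>'
    quote = None     # the quote character of an open attribute value
    for ch in fragment:
        if in_tag:
            if quote:
                if ch == quote:
                    quote = None      # attribute value closed
            elif ch in "\"'":
                quote = ch            # attribute value opened
            elif ch == ">":
                in_tag = False        # tag closed normally
        elif ch == "<":
            in_tag = True
    return in_tag


truncated = '<br /><img class="userimg" src="https://ip.bitcointalk.org/?u=http%3A'
print(ends_inside_tag(truncated))         # the tag never closes
print(ends_inside_tag("<b>complete</b>"))
```

A reader that gets True back could drop everything from the last "<" onward before rendering, which avoids the dangling <img src="... but of course does not recover the lost text.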

As a fix I would suggest at least one of these:

a) provide a plaintext description only, sans HTML, e.g. using PHP's strip_tags(). (Easy)

b) truncate in a way that leaves syntactically correct HTML, e.g. truncate and then run the result through htmltidy. (Harder)

c) provide an option for retrieving the full text of the post/article.
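
To make (a) concrete, here is a minimal sketch of the idea: strip the markup first, then truncate the remaining plain text, so the cut can never land inside a tag or a URL. The forum itself runs on PHP, so this Python version is only an illustration of the approach, not the actual SMF code. The names plain_text_description and _TextExtractor are invented for this post, and treating a limit of 0 as "no limit" is my assumption, mirroring the SMF setting described at the top of the thread.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes only; tags and attributes are dropped."""

    def __init__(self):
        super().__init__(convert_charrefs=True)  # decode &#39;, &nbsp;, etc.
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Keep a word break where the post had a line break.
        if tag == "br":
            self.parts.append(" ")

    def handle_data(self, data):
        self.parts.append(data)


def plain_text_description(html, limit=0):
    """Strip tags from the full post HTML, then truncate the text.

    limit=0 means no truncation (assumed convention).
    """
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace, including non-breaking spaces.
    text = " ".join("".join(parser.parts).split())
    if limit and len(text) > limit:
        text = text[:limit].rstrip() + "..."
    return text


post = ('I can&#39;t believe it. &nbsp;<img src="x.gif" alt="Shocked" '
        'border="0" /><br /><br /><a href="http://example.com/a" '
        'target="_blank">link</a>')
print(plain_text_description(post))
print(plain_text_description(post, 10))
```

The important design point is the order of operations: the function must be given the complete post HTML, because stripping tags from an already truncated string would still leave the half-cut tag to deal with.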