Topic: utf8 vs entities (do not want)
YoJoe

« on: May 27, 2010, 04:43:38 PM »

Hello
I noticed some time ago that all national characters are being converted to entities.
I always use UTF-8 encoding on my websites, because the ISO charsets got on my nerves a long time ago.
I've found two threads about Chinese and Arabic characters not being displayed properly:
http://www.websitebaker2.org/forum/index.php/topic,14963.0.html
http://www.websitebaker2.org/forum/index.php/topic,14355.0.html
and a thread where someone asked why there are two tables for the content: one for plain text and one for encoded text (with entities).
But I haven't found a solution.
(UTF-8 characters written directly in the .php template file are displayed as expected.)

Neither adding this to .htaccess
Code:
php_value default_charset utf-8
nor the whole directive
Code:
AddCharset UTF8 .utf8
php_value default_charset utf-8
worked; both only led to an Internal Server Error 500.

Is there a way to turn off the conversion of characters to entities and show the content exactly as it was added to the database?
The main reason is SEO phrases and links: instead of an example word like "rozwój", I see its entity-encoded form (e.g. "rozw&oacute;j") in the code.


edit: forgot to mention that I'm using wb2.8.0 + 2.8.1 security fix
« Last Edit: May 27, 2010, 04:45:50 PM by Joduai »

ruebenwurzel
WebsiteBaker Org e.V.

« Reply #1 on: May 27, 2010, 06:19:03 PM »

Hello,

If you are using WB 2.8.1, the default charset is utf8 and there is no need to change it.

Quote
how it has been added to database

That's exactly the issue. WB 2.x stores content in the database without negotiating the character set with the database. The database collation is often latin1_swedish_ci, so using WB with charset utf8 stores special characters in the database in a funny way, not as you typed them. WB can handle this and convert it so that it is displayed correctly in the frontend.
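To illustrate what "stored in a funny way" can look like, here is a minimal sketch in JavaScript (Node.js assumed). It is not WB code; it only imitates the assumed situation where UTF-8 bytes are interpreted as latin1, one byte per character. The variable names are made up for the example.

```javascript
// Sketch: UTF-8 bytes that end up being read as latin1 (one byte = one character).
const original = "rozw\u00F3j"; // "rozwój"

// The application encodes the text as UTF-8; the ó becomes the two bytes C3 B3.
const utf8Bytes = Buffer.from(original, "utf8");

// Assumption: the database side treats every byte as a separate latin1 character.
const asSeenByDatabase = utf8Bytes.toString("latin1");
console.log(asSeenByDatabase); // prints: rozwÃ³j

// Reading the value back through the same misinterpretation restores the text,
// which is why the frontend can still look correct.
const roundTrip = Buffer.from(asSeenByDatabase, "latin1").toString("utf8");
console.log(roundTrip === original); // prints: true
```

As long as writing and reading make the same mistake, the page looks fine; the problem only shows up when something else (phpMyAdmin, an export, a changed connection charset) reads the table differently.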

Can you check your database to see how your special characters are actually stored?
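One possible way to check this is to look at the raw bytes, for example with MySQL's HEX() function on the content column, and then classify them. The helper below is a hypothetical sketch (Node.js assumed, function name invented for this example) and only a heuristic: it distinguishes plain UTF-8, text that was probably UTF-8-encoded twice, and bytes that are not valid UTF-8 at all.

```javascript
// Hypothetical helper: classify a stored value from the hex string of its raw bytes.
function classifyStoredBytes(hex) {
  const bytes = Buffer.from(hex, "hex");
  const asUtf8 = bytes.toString("utf8");

  // Node inserts U+FFFD for byte sequences that are not valid UTF-8.
  if (asUtf8.includes("\uFFFD")) {
    return { kind: "not-utf8", text: bytes.toString("latin1") };
  }

  // Heuristic: valid UTF-8 that still contains pairs like "Ã³" was probably
  // encoded twice (UTF-8 bytes read as latin1, then encoded as UTF-8 again).
  if (/[\u00C2-\u00C3][\u0080-\u00BF]/.test(asUtf8)) {
    const repaired = Buffer.from(asUtf8, "latin1").toString("utf8");
    return { kind: "double-encoded", text: repaired };
  }

  return { kind: "utf8", text: asUtf8 };
}

// Three ways the word "rozwój" might be stored:
console.log(classifyStoredBytes("726F7A77C3B36A").kind);     // prints: utf8
console.log(classifyStoredBytes("726F7A77C383C2B36A").kind); // prints: double-encoded
console.log(classifyStoredBytes("726F7A77F36A").kind);       // prints: not-utf8
```

The double-encoding check can give false positives for text that legitimately contains such character pairs, so treat the result as a hint, not a verdict.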

If your database does not use a utf8 collation, adding AddCharset UTF8 or php_value default_charset utf8 to .htaccess will give you funny results.

On the other hand, if you are talking about the WYSIWYG content: if you use FCKEditor as the WYSIWYG editor, you can add the following lines to wb_fckconfig.js

Code:
FCKConfig.ProcessHTMLEntities = true ;
FCKConfig.IncludeLatinEntities = true ;
FCKConfig.IncludeGreekEntities = true ;

FCKConfig.ProcessNumericEntities = false ;

and change the values to suit your needs, so that special characters are no longer automatically converted to entities.
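To make the effect of such switches concrete, here is a small standalone sketch in JavaScript. It is not FCKeditor's implementation: the function, its options and the tiny entity table are invented for illustration, loosely mirroring the idea of "named entities on/off" and "numeric entities on/off". It also does not escape &, < or >.

```javascript
// Hypothetical, deliberately tiny entity table; real editors ship far larger ones.
const NAMED_ENTITIES = {
  "\u00F3": "&oacute;", // ó
  "\u00E4": "&auml;",   // ä
  "\u00FC": "&uuml;",   // ü
};

// Convert non-ASCII characters to entities, depending on the chosen options.
function toEntities(text, { named = true, numeric = false } = {}) {
  return Array.from(text, (ch) => {
    const code = ch.codePointAt(0);
    if (code < 128) return ch;                              // ASCII is left alone
    if (named && NAMED_ENTITIES[ch]) return NAMED_ENTITIES[ch];
    if (numeric) return `&#${code};`;
    return ch;                                              // stays a raw character
  }).join("");
}

const word = "rozw\u00F3j"; // "rozwój"
console.log(toEntities(word));                                  // prints: rozw&oacute;j
console.log(toEntities(word, { named: false, numeric: true })); // prints: rozw&#243;j
console.log(toEntities(word, { named: false }));                // prints: rozwój
```

The last call corresponds to what the original poster wants: with all conversions switched off, the word reaches the page source as plain UTF-8. In FCKeditor 2.x that would, as far as I know, mean setting FCKConfig.ProcessHTMLEntities to false, but check this against the editor documentation for your version.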

Have fun

Matthias