WB 2.6.5 and the pagenames in languages with special chars
ruebenwurzel (WebsiteBaker Org e.V.)
« on: February 06, 2007, 09:51:58 PM »
Hello,
I know this has been discussed in different places in the English forum and the German subforum. It is a very complex topic and at the moment there is no final version which works for all. As different components of WB work together here, I tried to make a summary of the most likely variations. Here are the components involved:
1. Core files of WB in the framework and admin folders
In WB 2.6.5 we added htmlentities to the core files. As we now know, this causes problems with some pages which are stored in UTF8 and not in the language-related charset. As UTF8 is a worldwide standard, I made a package which changes htmlentities back to plain htmlspecialchars.
2. Charset options in WB admin
In WB admin you can choose your language-related charset. This option changes the charset for the WB admin interface and, if supported, the template of your page. The default charset of WB is UTF8, but if htmlentities is used, then for all languages with special chars you need to change this to the right charset for your language (ISO-8859-1 should work in most cases), otherwise your page titles get broken. (Read here for more information: http://us2.php.net/manual/en/function.htmlentities.php)
3. The convert.php
The convert.php only affects the link of the page (not the page title) and, together with this, the filename in the pages folder of WB. The default one in WB is stored as "ANSI as UTF8".
Points 1 and 2 are relevant for the page_title.
Point 3 is relevant for the page link (filename).
As you see, there are a lot of ways to combine these, with different results.
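To illustrate the problem behind point 1, here is a minimal sketch. It assumes a page title stored as UTF-8 bytes and an escaping call that treats it as ISO-8859-1; the sample title and variable names are made up for illustration and are not taken from the WB core:

```php
<?php
// "Café & Bar" written as explicit UTF-8 bytes, so the result does not
// depend on how this file itself is saved.
$title = "Caf\xC3\xA9 & Bar";

// Treating UTF-8 bytes as ISO-8859-1 turns each byte of the two-byte "é"
// into its own entity, which is what breaks the page title.
$broken = htmlentities($title, ENT_COMPAT, 'ISO-8859-1');

// htmlspecialchars only escapes &, <, > and double quotes and leaves the
// multi-byte character alone.
$safe = htmlspecialchars($title, ENT_COMPAT, 'UTF-8');

echo $broken, "\n"; // Caf&Atilde;&copy; &amp; Bar
echo $safe, "\n";   // Café &amp; Bar
```

The charset is passed explicitly in both calls because the default used when it is omitted differs between PHP versions.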
If you ask why it worked in WB 2.6.4 with everything at the default UTF8, let me answer: it did not really work. The page names and page links look good in frontend and backend, but please have a look in the database. So if you have used WB 2.6.4 together with UTF8 and want to upgrade to WB 2.6.5, you get problems. To solve this you can download the attached file, which replaces htmlentities with htmlspecialchars. Now your page should work as before.
For all who start new with WB, I recommend using the right charset together with a working convert.php. This is the only combination which will definitely work. As the default convert.php does not work in all cases, I will add the working convert files for the different languages to the posts of this thread. So we have everything together in one place and don't need to discuss the same problems in different threads.
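What a language-specific convert.php has to achieve can be sketched roughly as follows. This is a hypothetical illustration for German, assuming UTF-8 input; the function name sketch_page_filename and the mapping are assumptions, not the contents of the real or attached convert.php:

```php
<?php
// Hypothetical helper: turn a page title into an ASCII-only filename.
function sketch_page_filename($title)
{
    // German special characters as UTF-8 byte sequences (assumed input charset).
    $map = array(
        "\xC3\xA4" => 'ae', // ä
        "\xC3\xB6" => 'oe', // ö
        "\xC3\xBC" => 'ue', // ü
        "\xC3\x84" => 'Ae', // Ä
        "\xC3\x96" => 'Oe', // Ö
        "\xC3\x9C" => 'Ue', // Ü
        "\xC3\x9F" => 'ss', // ß
    );
    // Replace the mapped characters first, then lowercase the ASCII result.
    $ascii = strtolower(strtr($title, $map));
    // Collapse everything that is not a-z or 0-9 into single hyphens.
    $ascii = preg_replace('/[^a-z0-9]+/', '-', $ascii);
    return trim($ascii, '-');
}

echo sketch_page_filename("\xC3\x9Cber uns"), "\n"; // ueber-uns
```

A mapping like this only works when it is written in the same charset as the titles it receives, which is why the encoding of the convert file itself matters.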
Matthias
Tested on a blank new WB 2.6.5 with default EN language. XP Pro with Apache 2.0.59, PHP 5.2.0, MySQL 5.0.27. Database collations utf8_unicode_ci and latin1_swedish_ci both give the same results as above.
P.S.
In WB 2.7 Ryan plans to add a new way of generating page names, which makes the convert.php redundant.
« Last Edit: February 07, 2007, 05:36:18 AM by ruebenwurzel »
ruebenwurzel (WebsiteBaker Org e.V.)
ANSI convert.php for german and portuguese
« Reply #1 on: February 06, 2007, 10:55:10 PM »
Hello,
Just unzip and overwrite the original file in the framework dir.
Matthias
garny
« Reply #2 on: February 17, 2007, 11:08:41 AM »
Hi!
Thanks for a great CMS and community!
Back to the problem with the Russian language... I have a new installation of WB 2.6.5, so I can still adjust settings and files in any way.
First of all, I have read all the threads on this topic...
It is no problem for me to change or adjust the "convert.php" file, and I don't care about the language of the admin interface.
I really only need to display the Russian characters in the page menu.
I played with all the charsets, to no avail... Strange symbols keep appearing.
I can't understand why this happens if there are separate values in "page settings" for "page title" and "menu title". I'd even prefer to set "page title" to English and "menu title" to Russian, just for appearance... But we have what we have...
So my question is: should I try to find the right charset (is there any possibility to add some charsets?) and probably try to adjust the SQL database, or something else,
or is the only way to replace some core files (which is a step backward, as I understood), using the archive in this topic?
By the way, the Russian google.ru comes in UTF8, with no problem at all.
Thanks
Cheers
garny
« Reply #3 on: February 17, 2007, 01:28:45 PM »
Hi again!
we have a result... final and precise
1) Core files replaced -> menu items become clean and correct (even those created earlier)
2) "convert.php" replaced -> didn't work; created a new one in clean ANSI (attached) -> works perfectly (only for new pages)
Note: the adapted ANSI file has some changes in interpretation. A couple of the replacements did not make sense at all (for a native Russian speaker), and some I changed based on my personal view, mostly the last 8 symbols. If you have any questions on Russian specifics, please post or contact me.
Still wondering if there is another way besides replacing core files... but it probably makes sense to just wait for another release.
Many thanks to Ryan and the community.
Special thanks to Matthias for the detailed write-up on the language issue... and for actually supplying the solution
garny
« Reply #4 on: February 17, 2007, 06:28:55 PM »
My mistake...
One letter is wrong in "convert.php".
Found on my site draft... Corrected file attached.
Sorry
icouto
« Reply #5 on: March 15, 2007, 10:52:01 PM »
I have to put together sites that are in (or use) a variety of languages, including Esperanto - the international language. It was, therefore, absolutely crucial for me to be able to setup WebsiteBaker so that the entire installation is fully utf-8 compliant. I needed the database to be utf-8, the php scripts to make queries in utf-8 and the interface to be utf-8. If any of the components is *not* working in utf-8, then we get problems, as described on the table above.
Reading some of the postings in these forums, I thought it would be extremely hard to adapt websitebaker, in order to make it fully utf-8 compliant. However, after reading this thread, I decided to try and experiment.
The first thing I needed to do, was to have the DATABASE created by WebsiteBaker fully utf-8 compliant. I noticed that WebsiteBaker creates the database tables using default charset (which in my version of MySQL is Latin-1 Swedish). So, I did a search in my WebsiteBaker install package for all instances of "CREATE TABLE". This enabled me to find all places in the script where tables were being created. In order to make it fully utf-8 compliant, all I needed to do was add "CHARACTER SET utf8" to the end of the sql query. So that if the original is, for instance:
$mod_wrapper = 'CREATE TABLE `'.TABLE_PREFIX.'mod_wrapper` ('
. ' `section_id` INT NOT NULL DEFAULT \'0\','
. ' `page_id` INT NOT NULL DEFAULT \'0\','
. ' `url` TEXT NOT NULL,'
. ' `height` INT NOT NULL DEFAULT \'0\','
. ' PRIMARY KEY ( `section_id` ) '
. ' )';
It becomes:
$mod_wrapper = 'CREATE TABLE `'.TABLE_PREFIX.'mod_wrapper` ('
. ' `section_id` INT NOT NULL DEFAULT \'0\','
. ' `page_id` INT NOT NULL DEFAULT \'0\','
. ' `url` TEXT NOT NULL,'
. ' `height` INT NOT NULL DEFAULT \'0\','
. ' PRIMARY KEY ( `section_id` ) '
. ' ) CHARACTER SET utf8';
This change needs to be done 18 times, so even if you have to do it manually, it is *very* quick, and saves you from having to *manually* change the database/table/field collations afterwards using phpMyAdmin!
Next, after reading the 'htmlentities' problem described in this thread, I decided to consult the php documentation. There I found that htmlentities CAN take some optional parameters related to character encoding, just as htmlspecialchars. So, I did a search through the entire WebsiteBaker install package, and wherever I found 'htmlentities', such as in:
"...htmlentities($page['page_title'])..."
I added the 2 optional parameters for defining character encoding, like this:
"...htmlentities($page['page_title'],ENT_COMPAT,"UTF-8")..."
The text 'htmlentities' appears 24 times only in the entire installation, so even doing this manually, it took only about 5 minutes to do.
I then installed the site, with no problems.
Last of all, I made sure in the admin interface of my site, that the default charset of the site is set to 'utf-8'.
Voila! Everything works!
Because utf-8 is multilingual, my users can use any character set I can think of (cyrillic, greek, hebrew, japanese, chinese, devanagari), and it displays correctly, both in admin as well as on the live site, as well as on the database in phpMyAdmin. ALL PAGE TITLES WORK AS THEY SHOULD IN ANY LANGUAGE.
I have not tried this with any right-to-left languages (arabic, for instance), but my suspicion is that it will work as well.
I hope that this may help other users in the future.
Although making future releases of WebsiteBaker 'full utf-8' by default (as described here) may cause problems for existing users whose database may be already set to local character sets, the development team should perhaps consider making this an ALTERNATIVE during installation.
Although some work is needed for the 'transition', slowly moving all users to utf-8 may be a way to avoid endless charset problems in the future!
And last of all: keep up the good work! WebsiteBaker is AWESOME!
ruebenwurzel (WebsiteBaker Org e.V.)
« Reply #6 on: March 16, 2007, 05:49:12 AM »
Hello,
Quote
htmlentities($page['page_title'],ENT_COMPAT,"UTF-8")
I'm not a PHP guru, but does this make sense? Does this line mean that all htmlentities get converted back to UTF8?
Quote
PRIMARY KEY ( `section_id` ) '
. ' ) CHARACTER SET utf8';
This would raise the requirements of WB to MySQL 4.1 or greater, as lower MySQL versions don't support this. I would not recommend setting such a high requirement, because a lot of hosters have lower MySQL versions and so a lot of users could not use WB.
Quote
Although making future releases of WebsiteBaker 'full utf-8' by default (as described here) may cause problems for existing users whose database may be already set to local character sets, the development team should perhaps consider making this an ALTERNATIVE during installation.
I agree that at the moment we have a mixed-up version (with the WYSIWYG editors too), which is not really a good solution and causes a lot of discussion here in the forum. In my opinion we should have (as you suggested) a 100% UTF8 version and a charset version. But this means not only the few changes you wrote above; it means adaptations in a lot of core files, most modules and the WYSIWYG editors. Offering an alternative during install is not a good idea. I think it's enough to use the settings in the advanced options. The default should be (also as you suggested) UTF8, and this info from the settings table should then be used in every place it is needed. Changing it there to another charset should change it in all needed places (templates, modules, editors ...) too. As I prefer charset pages (and I think a lot of other users do too), this option should be on the same level as UTF8.
Quote
Although some work is needed for the 'transition', slowly moving all users to utf-8 may be a way to avoid endless charset problems in the future!
I totally disagree, because WB should be a system which can be used by all, not only UTF8 fans. UTF8 is not the absolute solution; it is only one solution and for me not the best, only the one which seems to work in most cases, but not in all. Why do MySQL, PHP and Apache add more and more charset options in their latest versions if UTF8 is enough? So there seems to be a world outside of UTF8, and this world should also be included in the WB community.
Matthias
icouto
« Reply #7 on: March 16, 2007, 06:27:14 AM »
Dear Matthias,
Thank you for the feedback. I hope I can provide you with some more information, that may prove useful to you.
1) htmlentities:
Quote
Quote
htmlentities($page['page_title'],ENT_COMPAT,"UTF-8")
I'm not a PHP guru, but does this make sense? Does this line mean that all htmlentities get converted back to UTF8?
What the parameters of the htmlentities function do may be best explained by looking at the documentation at php.net:
http://www.php.net/htmlentities
.
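As a rough illustration of what the two extra arguments do (the sample string is made up): ENT_COMPAT escapes double quotes but leaves single quotes alone, and the third argument only tells PHP which charset the input is already in. Nothing is converted to UTF-8.

```php
<?php
// Input is UTF-8: the "é" is written as explicit bytes so the example
// does not depend on the encoding of this file.
$title = "\"Caf\xC3\xA9\" isn't <b>";

// Double quotes, <, > and the accented letter become entities;
// the single quote is left untouched by ENT_COMPAT.
$escaped = htmlentities($title, ENT_COMPAT, 'UTF-8');

echo $escaped, "\n"; // &quot;Caf&eacute;&quot; isn't &lt;b&gt;
```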
2) MySQL requirements:
Quote
Quote
PRIMARY KEY ( `section_id` ) '
. ' ) CHARACTER SET utf8';
This would raise the requirements of WB to MySQL 4.1 or greater, as lower MySQL versions don't support this. I would not recommend setting such a high requirement, because a lot of hosters have lower MySQL versions and so a lot of users could not use WB.
Indeed, this would increase the requirements. There are still a few service providers in Australia using MySQL 4.0, and it would, indeed, be nice of you to continue to give support to users in older systems. At the same time, however, you *can* support the users who have more up-to-date systems. For instance, the installer could easily check the version of MySQL, and insert the SQL code mentioned above, only *if* the database was 4.1 or higher:
Code:
if (version_compare(mysql_get_server_info(), '4.1.0', '>=')) {...}
This would have the advantage - for you, the developer - that when (in the very near future) everyone upgrades their MySQL to a higher version, your code will scale nicely. And in a year or two, it will make it easier for you to 'clean up', and remove the 'legacy' code (by removing these 'if' statements).
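The version check could be wrapped in a small helper so that every CREATE TABLE statement picks up the clause in one place. A minimal sketch, assuming the server version string is passed in (for example the value returned by mysql_get_server_info()); the function name and the demo table are hypothetical:

```php
<?php
// Hypothetical helper: returns the charset clause only for servers that
// understand it (MySQL 4.1.0 or newer), otherwise an empty string.
function sketch_table_charset_clause($serverVersion)
{
    return version_compare($serverVersion, '4.1.0', '>=')
        ? ' CHARACTER SET utf8'
        : '';
}

// Appended to a query the same way the installer builds its SQL strings.
$sql = 'CREATE TABLE `demo` ( `id` INT NOT NULL DEFAULT \'0\' )'
     . sketch_table_charset_clause('5.0.27');

echo $sql, "\n";
```

On an older server the helper returns an empty string, so the statement stays exactly as it was before the change.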
3) Unicode and UTF-8:
Being a person who has to deal with a wide variety of languages (and charsets) on a daily basis, I can tell you that the adoption of unicode - and utf-8 - is a given. Regardless of whether it is good, bad, or whether there would be better options, in my experience, it is already *vastly* adopted by software developers everywhere. In my own experience, as a developer, although it is not perfect, it still makes it much easier to give support to an inordinate amount of charsets, which would be very difficult to do otherwise, and still have your data being able to be imported, read, massaged and generally used by a wide number of applications.
A couple of years ago it was difficult to find an email program that used unicode. Sending messages to users with non-latin charsets meant having to juggle through a myriad of settings, and often having to use more than one email program. Let's not even talk about using open source php scripts in different languages! These days, any email program that is worth having, supports it. A similar thing is happening to most open-source php scripts I know. Indeed, I have just setup a forum (using the open source Vanilla) which is capable of handling every charset my users may want - and that was the default, out-of-the-box install.
Most of the problems that I've come across with charsets, have to do with giving support to legacy systems. You mentioned, for instance, 'why do we still have all these alternative encodings?' The main reason, that I've come across, is just to give support to information in legacy (older) systems, and try to make an older system more compliant. This is why, in my opinion, if you have a look at all the 'collation' options available for MySQL, for instance, you will see a plethora of possibilities (and, as you said, people still asking for more!). This, in my view, is all legacy support, and does not in any way detract from the widespread use - and usefulness - of unicode, and utf-8.
WebsiteBaker is not legacy software. This is a sleek, modern, and evolving package. While there is still need to give support to several different encodings and character sets (as you still have users in legacy systems), it may be easier to start already to support a single system that supports all character sets, and direct everyone to use that. This is the solution most developers seem to be adopting, and you could not be criticised for doing the same - especially if, at least for a while, you still gave support to your old users...
Keep up the good work! - it's certainly appreciated!
ruebenwurzel (WebsiteBaker Org e.V.)
« Reply #8 on: March 16, 2007, 06:38:37 AM »
Hello,
Quote
What the parameters of the htmlentities function do may be best explained by looking at the documentation at php.net:
http://www.php.net/htmlentities
.
What htmlentities does (and the link you gave) I know; the question was more what 'ENT_COMPAT,"UTF-8"' does.
I'm asking this because if it doesn't break charset pages (like ISO-8859-1) and fixes the issues we have with UTF8 pages when upgrading from 2.6.4 to 2.6.5, maybe we can include it in WB 2.6.6.
I agree that setting everything to UTF8 makes life a lot easier, but I'm not sure if I want to have this. But I'm pretty sure one of the next versions will have the option to be 100% UTF8 by default, with the option to use other charsets too.
Matthias
icouto
« Reply #9 on: March 16, 2007, 07:08:01 AM »
Dear Matthias,
Quote
But I'm pretty sure one of the next versions will have the option to be 100% UTF8 by default, with the option to use other charsets too.
Oh, I am SOOOOOO looking forward to this new version!
Many thanks in advance, and
Keep up the good work!!!
nomasis (Guest)
« Reply #10 on: March 20, 2007, 12:26:14 PM »
Thanks Matthias
working fine now.
icouto
« Reply #11 on: March 25, 2007, 02:37:54 AM »
Matthias, just wanted to give you some feedback on the htmlentities issue. I've now installed 4 sites which are running WB with *all* the modifications to the core that I've mentioned in this thread (including the addition of those 2 parameters to 'htmlentities'). All 4 sites have now been running for 2 weeks, with page names being added in 3 different languages (including English), and so far I have not come across any problem.
Due to the modifications, all sites are uniformly using UTF-8, both in the backend database, as well as in the front end and admin. It is all pretty seamless to the users, who have remained totally unaware of any encoding issues.
The database being in UTF-8 is a *big* bonus, as searches now give perfect results (not always so in the past, when the database encoding did not always match the browser HTML encoding), and I can perform searches directly in the database, with all results being visible and exportable correctly by phpMyAdmin.
I would, of course, recommend further testing by others, but I am now satisfied that these modifications are sufficiently stable for my own use, and have now permanently altered my base-installation folder of WB, and will be using this modified version for all my future installs.
ruebenwurzel (WebsiteBaker Org e.V.)
« Reply #12 on: March 26, 2007, 06:22:15 AM »
Hello,
do the changes you made work with ISO-xxx charsets and with non-UTF8 databases too, and are there no issues when upgrading from any possible configuration to a version with your changes? Only if all of this can be answered with yes will we change something in our core files.
By the way, have a look at the SVN: we have fixed the search so that it gives all results, no matter whether the charset is UTF8 or ISO-xxxx and no matter whether the database is UTF8 or another charset.
Matthias
icouto
« Reply #13 on: March 28, 2007, 11:11:08 PM »
Matthias, sorry about the delay in replying to you - I just noticed your last post here today!
I have now used WB in 5 different websites. After the first website, I changed the CORE files in the default installation folder I use, so now I no longer have to apply the changes individually to every site I build. I have now installed the 4 other sites with my modifications already in place, and I have not experienced any problems with the installer. The installer now creates UTF-8 tables by default, IF the MySQL server version is >= 4.1 (as described in the code above).
The change to using 'htmlentities' so far has not affected *anything* in my sites. I have installed and tested almost every add-on available, and have not experienced any problem specifically because of this. My sites, however, are totally in UTF-8. A possible solution for backwards compatibility might be to make the 3rd parameter in the calls to htmlentities a variable (which will be UTF-8, ISO-XXXX, or whatever other encoding the user has chosen for their site/page).
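The idea of making the third parameter a variable might look like this. It is only a sketch: sketch_escape_title and $siteCharset are hypothetical names, and how WB actually exposes the configured charset is not shown in this thread:

```php
<?php
// Hypothetical wrapper: escape a page title using whatever charset the
// site is configured for, instead of a hard-coded one.
function sketch_escape_title($title, $siteCharset)
{
    return htmlentities($title, ENT_COMPAT, $siteCharset);
}

// The same visible title, stored once as UTF-8 and once as ISO-8859-1,
// yields the same entity when the matching charset is passed in.
echo sketch_escape_title("Caf\xC3\xA9", 'UTF-8'), "\n";  // Caf&eacute;
echo sketch_escape_title("Caf\xE9", 'ISO-8859-1'), "\n"; // Caf&eacute;
```

With a single wrapper like this, a site that stays on an ISO charset and a site that is fully UTF-8 would go through the same code path.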
I cannot tell you how well this would work as far as UPGRADING an existing site. All my sites are brand new, and I have not had to do an upgrade yet. In theory, however, the changes as I have described them here in this message *would* be backwards compatible, so that even if the user were using an ISO-XXXX encoding, it *should* work. The only way to really find out, and iron out any possible incompatibilities, is to try it out on a test site.
So, based on my own experience, I would definitely recommend doing these changes to the core.
Last of all, that is GREAT news about the Search! Keep up the good work!!!
maerik
« Reply #14 on: March 30, 2007, 07:50:07 AM »
Quote from: icouto on March 28, 2007, 11:11:08 PM
I cannot tell you how well this would work as far as UPGRADING an existing site. All my sites are brand new, and I have not had to do an upgrade yet. In theory, however, the changes as I have described them here in this message *would* be backwards compatible, so that even if the user were using an ISO-XXXX encoding, it *should* work. The only way to really find out, and iron out any possible incompatibilities, is to try it out on a test site.
I will do an UPGRADE to UTF-8 for my 3 WB sites and check if it works.
But I have one question:
I am stuck on the search problem: searching for one page lists pages in every language. Are you using multiple WB installations, one per language, so that every language has its own directory?
« Last Edit: March 30, 2007, 07:52:18 AM by maerik »
icouto
« Reply #15 on: April 01, 2007, 12:35:32 AM »
Quote from: maerik on March 30, 2007, 07:50:07 AM
I am stuck on the search problem: searching for one page lists pages in every language. Are you using multiple WB installations, one per language, so that every language has its own directory?
I don't know if my setup is similar to yours or not. In my multi-lingual sites the users can see all the languages, and all pages in the site. Usually, at the top level there is a language menu (English, Esperanto, Portuguese, etc.). Under each language there is a set of pages - not always the same set of pages for every language. Pages in different languages have different menu and page titles. So, in my setup, the search results do not cause that much of a problem, because even if the user searches for a word that may be in pages in ALL languages, it will be pretty obvious which ones are in his language, and which aren't (just like when we search in this forum, and get a bunch of results in German or French!)
There is, however, a very serious bug with multi-lingual setups, which is described in this discussion:
http://forum.websitebaker.org/index.php/topic,5738.0.html
This means, that in all my sites, regardless of the languages the user wishes to use, I only install ONE language, and all the site's pages have their 'language' set as that. Adding ANY extra language causes the INFINITE REDIRECTION BUG to eventually show itself, as users start changing their own language in their Preferences. This is 'less than optimal' to say the least, and something I truly hope will be addressed in the next release of WB!