New title: New script does not correctly display strange character encoding of stored data-PHP Chinese Network Q&A

I'm trying to rewrite an old website.

It is in Persian and uses Persian/Arabic characters.

CREATE DATABASE `db` DEFAULT CHARACTER SET utf8 COLLATE utf8_persian_ci; USE `db`;

Almost all my tables/columns have COLLATE set toutf8_persian_ci

I'm using codeigniter for my new script, and I've

'char_set' => 'utf8', 'dbcollat' => 'utf8_persian_ci',

is in the database settings, so no problem.

So this is the weird part

The old scripts used some kind of database engine calledTUBADBENGINEorTUBA DB ENGINE... nothing special.

When I entered some data (in Farsi) into the database using an old script, when I looked at the database, the characters were stored asØ1Ù…Ø±Ø§Ù†.

The old script gets/displays the data fine, but the new script displays them using the same weird font/charset as the database

So when I type???the data stored in the database looks likeØ1Ù...Ø±Ø§Ùand when I get it in a new script when I seeØ1Ù...Ø±Ø§Ùbut in the old script I see??

CREATE TABLE IF NOT EXISTS `tnewsgroups` ( `ID` int(11) NOT NULL AUTO_INCREMENT, `fName` varchar(200) COLLATE utf8_persian_ci DEFAULT NULL, PRIMARY KEY (`ID`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_persian_ci AUTO_INCREMENT=11 ; -- -- Dumping data for table `tnewsgroups` -- INSERT INTO `tnewsgroups` (`ID`, `fName`) VALUES (1, 'Ø¹Ù…Ø±Ø§Ù†'), (2, 'Ù…Ø¹Ù…Ø§Ø±ÛŒ'), (3, 'Ø¨Ø±Ù‚'), (4, 'Ù…Ú©Ø§Ù†ÛŒÚ©'), (5, 'test'), (6, 'test2');

On the other hand, when I enter???directly into the database

Of course, I stored the same in the database???

The new script displays very well

But in the old script I get????

Can anyone understand this?

This is a large engine

https://github.com/maxxxir/mz-codeigniter-crud/blob/master/tuba.php

Usage example of old script:

define("database_type" , "MYSQL"); define("database_ip" , "localhost"); define("database_un" , "root"); define("database_pw" , ""); define("database_name" , "nezam2"); define("database_connectionstring" , ""); $db = new TUBADBENGINE(database_type , database_ip , database_un , database_pw , database_name , database_connectionstring); $db->Select("SELECT * FROM tnews limit 3"); if ($db->Lasterror() != "") { echo "ÎØÇ ! áØÝÇ ãÌÏøÏÇ ÊáÇÔ ˜äíÏ"; exit(); } for ($i = 0 ; $i < $db->Count() ; $i++) { $row = $db->Next(); var_dump($row); }

reply all (2)

P粉2573421662023-11-18 09:06:47 2 floor

deceze's answeris very good, but I can add some information that might help with handling a large number of records without having to test them manually.

If the conversionCONVERT(BINARY CONVERT(field_name USING latin1) USING utf8)fails,NULLwill be printed instead of thefield_namecontent.

So I use this to find those records:

SELECT IFNULL( CONVERT(BINARY CONVERT(field_name USING latin1) USING utf8) , '**************************************************') FROM table_name

Or this:

SELECT id, field_name, CONVERT(BINARY CONVERT(field_name USING latin1) USING utf8) FROM table_name WHERE CONVERT(BINARY CONVERT(field_name USING latin1) USING utf8) IS NULL

UPDATEwith this clause only affects records that have been converted successfully:

UPDATE table_name SET field_name = CONVERT(BINARY CONVERT(field_name USING latin1) USING utf8mb4 ) WHERE CONVERT(BINARY CONVERT(field_name USING latin1) USING utf8mb4) IS NOT NULL

Like+0

Add Reply

P粉6638838622023-11-18 00:37:25 1 floor

In short, because this question has been discussed a thousand times before:

PHP saves a string, such as"汉字", encoded in UTF-8. The byte isE6 BC A2 E5 AD 97.
It sends this string over thedatabase connection set tolatin1.
The database received the bytesE6 BC A2 E5 AD 97and thought they representedlatin1characters.
Database storage charactersæ¡ ¡ ¿ Li>
The same process in reverse causes PHP to receive the same bytes and then treat them as UTF-8. The round trip works just fine for PHP, although the database doesn't handle characters the way it should.

So the problem here is that the database connection is not set up correctly when the data is entered into the database. You have to convert the data in the database to the correct characters. Try this:

SELECT CONVERT(BINARY CONVERT(field_name USING latin1) USING utf8) FROM table_name

Maybeutf8is not what you need, please try it. If it works, change it to aUPDATEstatement to permanently update the data.

Php8, I'm coming too

Learn website layout in 30 minutes

Shangguan Oracle Beginner to Proficient Video Tutorial

Your first line of UNI-APP code

Flutter from scratch to app launch

Brother Lian New Linux Video Tutorial

AXURE 9 Video Tutorial (Suitable for Product Manager Interactive Product Design UI)

Zero Basic Proficiency PS Video Tutorial

16 day UI video tutorial to get you started

PS Techniques and Slicing Techniques Video Tutorial

Alibaba Cloud Environment Construction and Project Launch Video Tutorial

Overview of Computer Networks - Basic Knowledge that Programmers Must Master

Essential Tutorial for Programmers - HTTP Protocol Explanation

Websocket Video Tutorial