
How to remove non-UTF-8 characters in PHP

2021-06-12 Category: Programming Knowledge

To remove non-UTF-8 characters in PHP, first create a PHP sample file, then strip the invalid bytes with a regular expression by calling "preg_replace($regex, '$1', $text);".

Environment used for this article: Windows 7, PHP 7.1, DELL G3 computer

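The problem:

How do I remove non-UTF-8 characters in PHP?

PHP: removing non-UTF-8 characters from a string

I am having trouble removing non-UTF-8 characters from a string. These characters do not display correctly, for example the bytes 0x97 0x61 0x6C 0x6F (hexadecimal representation).

What is the best way to remove them? A regular expression, or something else?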

Solution:

Using a regular expression:

$regex = <<<'END'
/
  (
    (?: [\x00-\x7F]                 # single-byte sequences   0xxxxxxx
    |   [\xC0-\xDF][\x80-\xBF]      # double-byte sequences   110xxxxx 10xxxxxx
    |   [\xE0-\xEF][\x80-\xBF]{2}   # triple-byte sequences   1110xxxx 10xxxxxx * 2
    |   [\xF0-\xF7][\x80-\xBF]{3}   # quadruple-byte sequence 11110xxx 10xxxxxx * 3
    ){1,100}                        # ...one or more times
  )
| .                                 # anything else
/x
END;
// Keep only what was captured in group 1; stray bytes are dropped.
$text = preg_replace($regex, '$1', $text);

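The pattern searches for UTF-8 sequences and captures them in group 1. It also matches any single byte that cannot be recognized as part of a UTF-8 sequence, but does not capture it. The replacement is whatever was captured in group 1, so all invalid bytes are effectively removed.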
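As a quick check of this behaviour, the sketch below wraps the same pattern in a helper and applies it to the bytes from the question. The function name strip_invalid_utf8 is hypothetical and chosen only for illustration; it is not part of the original answer.

```php
<?php
// Hypothetical helper (the name is an assumption for illustration).
// Keeps structurally valid UTF-8 sequences and drops every other byte.
function strip_invalid_utf8(string $text): string
{
    // Nowdoc: the backslashes reach PCRE unchanged. The closing marker
    // stays at column 0 so this also parses on PHP versions before 7.3.
    $regex = <<<'END'
/
  (
    (?: [\x00-\x7F]
    |   [\xC0-\xDF][\x80-\xBF]
    |   [\xE0-\xEF][\x80-\xBF]{2}
    |   [\xF0-\xF7][\x80-\xBF]{3}
    ){1,100}
  )
| .
/x
END;
    // preg_replace() returns null on error; fall back to an empty string.
    return preg_replace($regex, '$1', $text) ?? '';
}

// Bytes from the question: 0x97 is a stray continuation byte.
$clean = strip_invalid_utf8("\x97\x61\x6C\x6F");
echo bin2hex($clean), "\n"; // 616c6f, i.e. "alo"
```

Note that the pattern only checks the lead-byte and continuation-byte structure, so sequences such as the overlong "\xC0\x80" still pass through unchanged.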

Alternatively, the string can be repaired by encoding the invalid bytes as UTF-8 characters. If the errors are random, however, this may leave some strange symbols behind.

$regex = <<<'END'
/
  (
    (?: [\x00-\x7F]               # single-byte sequences   0xxxxxxx
    |   [\xC0-\xDF][\x80-\xBF]    # double-byte sequences   110xxxxx 10xxxxxx
    |   [\xE0-\xEF][\x80-\xBF]{2} # triple-byte sequences   1110xxxx 10xxxxxx * 2
    |   [\xF0-\xF7][\x80-\xBF]{3} # quadruple-byte sequence 11110xxx 10xxxxxx * 3
    ){1,100}                      # ...one or more times
  )
| ( [\x80-\xBF] )                 # invalid byte in range 10000000 - 10111111
| ( [\xC0-\xFF] )                 # invalid byte in range 11000000 - 11111111
/x
END;
function utf8replacer($captures) {
  if ($captures[1] != "") {
    // Valid byte sequence. Return unmodified.
    return $captures[1];
  }
  elseif ($captures[2] != "") {
    // Invalid byte of the form 10xxxxxx.
    // Encode as 11000010 10xxxxxx.
    return "\xC2".$captures[2];
  }
  else {
    // Invalid byte of the form 11xxxxxx.
    // Encode as 11000011 10xxxxxx.
    return "\xC3".chr(ord($captures[3])-64);
  }
}
$text = preg_replace_callback($regex, "utf8replacer", $text);
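The byte arithmetic in the callback can be checked in isolation. The sketch below is a minimal illustration, assuming a hypothetical helper named encode_stray_byte that applies the same two rules to a single byte. The result is the UTF-8 encoding of the code point whose number equals the byte value.

```php
<?php
// Hypothetical helper (the name is an assumption for illustration).
// Re-encodes one stray byte using the same rules as the callback.
function encode_stray_byte(string $byte): string
{
    $value = ord($byte);
    if ($value < 0x80) {
        // ASCII bytes are already valid UTF-8.
        return $byte;
    }
    if ($value < 0xC0) {
        // 10xxxxxx becomes 11000010 10xxxxxx.
        return "\xC2" . $byte;
    }
    // 11xxxxxx becomes 11000011 10xxxxxx (subtracting 64 clears bit 6).
    return "\xC3" . chr($value - 64);
}

echo bin2hex(encode_stray_byte("\x97")), "\n"; // c297, code point U+0097
echo bin2hex(encode_stray_byte("\xE9")), "\n"; // c3a9, code point U+00E9
```

The first result shows where the strange symbols can come from: U+0097 is a control character. If the input was actually Windows-1252 text, byte 0x97 was probably meant as an em dash (U+2014), which this byte-for-byte mapping does not recover.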

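Edit:

!empty(x) matches non-empty values ("0" is considered empty).
x != "" matches non-empty values, including "0".
x !== "" matches anything other than "".

In this case, x != "" seems to be the best choice.

I also sped up the matching. Instead of matching each character individually, the pattern now matches whole runs of valid UTF-8 characters.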
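The difference between the three tests is easiest to see side by side. The sketch below is only an illustration; the helper name describe is hypothetical.

```php
<?php
// Hypothetical helper (the name is an assumption for illustration).
// Evaluates the three candidate tests against one value.
function describe($x): array
{
    return [
        'not_empty' => !empty($x),  // false for "", "0" and null
        'loose'     => $x != "",    // true for "0", false for "" and null
        'strict'    => $x !== "",   // false only for the exact string ""
    ];
}

var_export(describe(""));   // all three are false
var_export(describe("0"));  // not_empty is false, loose and strict are true
var_export(describe(null)); // only strict is true
```

A captured "0" is a perfectly valid single-byte sequence, so !empty() would wrongly treat it as no match, while the loose comparison keeps it.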
