For an interface integration, I needed Base64 encoding and decoding with a custom encoding table. I searched online and found an article that explains the principles thoroughly; it gives encoding examples but no decoding. Below is my own implementation of Base64 decoding with a custom dictionary. It is fairly rough, but after testing it there should be no problem. If you need this, take a look. First, here are the principles, borrowed from someone else's blog.
Base64 is an encoding we often use in program development. It represents binary data using 64 printable characters, and it is typically used to store and transmit binary data. It is also one of the common encodings in MIME (Multipurpose Internet Mail Extensions, mainly used as an email standard) for representing binary data as printable characters. It only defines a way of transmitting content with printable characters; it does not create a new character set. Once we understand the conversion idea, we can build our own encoding for an interface based on actual needs. Okay, let's take a look at how the conversion works!
Base64 conversion principle
Base64 uses 64 printable characters to represent arbitrary binary data. Since 2 to the 6th power equals 64, every 6 bits form one unit that maps to one printable character. Three bytes are 24 bits, which is exactly 4 Base64 units, so every 3 bytes are represented by 4 printable Base64 characters. The printable characters include the letters A-Z and a-z and the digits 0-9, which makes 62 characters in total. The remaining two symbols differ between systems, but in the Base64 we usually refer to they are "+" and "/". The index table for these 64 characters is as follows.
<table class="table" border="1" rules="all" cellspacing="0" align="center"> <tbody> <tr><th scope="col">Index</th><th scope="col">Character</th><th rowspan="18"> </th><th scope="col">Index</th><th scope="col">Character</th><th rowspan="18"> </th><th scope="col">Index</th><th scope="col">Character</th><th rowspan="18"> </th><th scope="col">Index</th><th scope="col">Character</th></tr> <tr> <td>0</td> <td>A</td> <td>16</td> <td>Q</td> <td>32</td> <td>g</td> <td>48</td> <td>w</td> </tr> <tr> <td>1</td> <td>B</td> <td>17</td> <td>R</td> <td>33</td> <td>h</td> <td>49</td> <td>x</td> </tr> <tr> <td>2</td> <td>C</td> <td>18</td> <td>S</td> <td>34</td> <td>i</td> <td>50</td> <td>y</td> </tr> <tr> <td>3</td> <td>D</td> <td>19</td> <td>T</td> <td>35</td> <td>j</td> <td>51</td> <td>z</td> </tr> <tr> <td>4</td> <td>E</td> <td>20</td> <td>U</td> <td>36</td> <td>k</td> <td>52</td> <td>0</td> </tr> <tr> <td>5</td> <td>F</td> <td>21</td> <td>V</td> <td>37</td> <td>l</td> <td>53</td> <td>1</td> </tr> <tr> <td>6</td> <td>G</td> <td>22</td> <td>W</td> <td>38</td> <td>m</td> <td>54</td> <td>2</td> </tr> <tr> <td>7</td> <td>H</td> <td>23</td> <td>X</td> <td>39</td> <td>n</td> <td>55</td> <td>3</td> </tr> <tr> <td>8</td> <td>I</td> <td>24</td> <td>Y</td> <td>40</td> <td>o</td> <td>56</td> <td>4</td> </tr> <tr> <td>9</td> <td>J</td> <td>25</td> <td>Z</td> <td>41</td> <td>p</td> <td>57</td> <td>5</td> </tr> <tr> <td>10</td> <td>K</td> <td>26</td> <td>a</td> <td>42</td> <td>q</td> <td>58</td> <td>6</td> </tr> <tr> <td>11</td> <td>L</td> <td>27</td> <td>b</td> <td>43</td> <td>r</td> <td>59</td> <td>7</td> </tr> <tr> <td>12</td> <td>M</td> <td>28</td> <td>c</td> <td>44</td> <td>s</td> <td>60</td> <td>8</td> </tr> <tr> <td>13</td> <td>N</td> <td>29</td> <td>d</td> <td>45</td> <td>t</td> <td>61</td> <td>9</td> </tr> <tr> <td>14</td> <td>O</td> <td>30</td> <td>e</td> <td>46</td> <td>u</td> <td>62</td> <td>+</td> </tr> <tr> <td>15</td> <td>P</td> <td>31</td> <td>f</td> <td>47</td> <td>v</td> <td>63</td> <td>/</td> </tr> </tbody> 
</table>
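The table does not have to be typed out by hand. Here is a minimal sketch, assuming the standard alphabet with "+" and "/" as the last two symbols; the variable name `$alphabet` is only for illustration.

```php
<?php
// Build the 64-character table: A-Z, a-z, 0-9, then '+' and '/'.
$alphabet = implode('', array_merge(range('A', 'Z'), range('a', 'z'), range(0, 9), ['+', '/']));

echo strlen($alphabet), "\n";       // 64
echo $alphabet[19], "\n";           // T  (index -> character)
echo strpos($alphabet, 'u'), "\n";  // 46 (character -> index)
```

Keeping the table as a single string makes both directions cheap: string offset access for encoding and `strpos` for decoding.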
During conversion, three bytes of data are placed one after another into a 24-bit buffer, with the first byte occupying the high bits. If fewer than 3 bytes remain, the unused bits in the buffer are filled with 0s. Then 6 bits are taken out at a time, and the character at that index in
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
is emitted as the encoded output. This continues until all input data has been converted.
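The buffer step described above can be sketched for one full group of three bytes. This is only an illustration with the standard alphabet; the function name `encodeGroup` is made up, and it assumes the input is exactly three bytes long.

```php
<?php
// Encode exactly three bytes into four Base64 characters (standard alphabet assumed).
function encodeGroup(string $threeBytes): string
{
    $alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

    // The first byte goes into the high bits of the 24-bit buffer.
    $buffer = (ord($threeBytes[0]) << 16) | (ord($threeBytes[1]) << 8) | ord($threeBytes[2]);

    $out = '';
    // Take 6 bits at a time, starting from the top of the buffer.
    for ($shift = 18; $shift >= 0; $shift -= 6) {
        $out .= $alphabet[($buffer >> $shift) & 0x3F];
    }
    return $out;
}

echo encodeGroup('Man'), "\n"; // TWFu
```

For full groups the result should match PHP's built-in `base64_encode`, which is a quick way to check the bit arithmetic.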
If two input bytes are left over at the end, append one "=" to the encoded result; if one byte is left over, append two "="; if nothing is left over, append nothing. This ensures the data can be restored accurately.
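The padding rule can be checked against PHP's built-in `base64_encode`. In this small sketch the helper name `paddingCount` is an assumption for illustration.

```php
<?php
// Number of "=" signs: 0 for a multiple of 3 bytes, 2 for one leftover byte, 1 for two.
function paddingCount(int $byteLength): int
{
    return (3 - $byteLength % 3) % 3;
}

foreach (['Man', 'BC', 'A'] as $input) {
    echo $input, ' -> ', base64_encode($input),
        ' (padding: ', paddingCount(strlen($input)), ")\n";
}
// Man -> TWFu (padding: 0)
// BC -> QkM= (padding: 1)
// A -> QQ== (padding: 2)
```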
The encoded data is longer than the original: about 4/3 of the original length. Every byte is encoded regardless of which character it represents, whereas Quoted-printable leaves some printable characters as they are. As a result, Base64 output is not as readable as Quoted-printable output!
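To make the 4/3 ratio concrete, the padded output length can be computed from the input length alone. This is a rough sketch; `encodedLength` is a hypothetical helper name.

```php
<?php
// Padded Base64 length: 4 characters for every started group of 3 input bytes.
function encodedLength(int $byteLength): int
{
    return intdiv($byteLength + 2, 3) * 4;
}

foreach ([1, 3, 4, 300] as $n) {
    echo $n, ' bytes -> ', encodedLength($n), " characters\n";
}
// 1 bytes -> 4 characters
// 3 bytes -> 4 characters
// 4 bytes -> 8 characters
// 300 bytes -> 400 characters
```

The ratio is exactly 4/3 only when the input length is a multiple of 3; shorter tails are rounded up to a full group of 4 characters.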
Text | M | a | n
---|---|---|---
ASCII code | 77 | 97 | 110
Binary bits | 01001101 | 01100001 | 01101110

6-bit group | 010011 | 010110 | 000101 | 101110
---|---|---|---|---
Index | 19 | 22 | 5 | 46
Base64 encoding | T | W | F | u
The ASCII code of M is 77 (binary 01001101). Its first six bits, 010011, equal 19, which maps to the Base64 character T, and so on for the remaining groups. Any other input is converted the same way. Now let's look at the case where the input is not exactly 3 bytes!
Text (1 byte) | A
---|---
Binary bits | 01000001
Binary bits (zero-padded) | 010000 010000
Base64 encoding | Q Q = =

Text (2 bytes) | B C
---|---
Binary bits | 01000010 01000011
Binary bits (zero-padded) | 010000 100100 001100
Base64 encoding | Q k M =
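The zero-padding of an incomplete group can be reproduced by working on the bit string directly. This is an illustrative sketch only: `sixBitGroups` is a made-up name, it assumes non-empty input, and building bit strings is meant for inspection rather than speed.

```php
<?php
// Split bytes into 6-bit groups, zero-padding the last group on the right.
function sixBitGroups(string $bytes): array
{
    $bits = '';
    foreach (str_split($bytes) as $char) {
        $bits .= sprintf('%08b', ord($char)); // 8 bits per byte, high bit first
    }
    $paddedLength = intdiv(strlen($bits) + 5, 6) * 6; // round up to a multiple of 6
    return str_split(str_pad($bits, $paddedLength, '0'), 6);
}

echo implode(' ', sixBitGroups('A')), "\n";                       // 010000 010000
echo implode(' ', sixBitGroups('BC')), "\n";                      // 010000 100100 001100
echo implode(' ', array_map('bindec', sixBitGroups('BC'))), "\n"; // 16 36 12
```

Indexes 16, 36 and 12 map to Q, k and M in the table; the "=" signs are then appended to bring the output up to 4 characters.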
This explanation is very thorough. Original article: http://www.cnblogs.com/chengmo/archive/2014/05/18/3735917.html
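<?php
class base64{
    // Custom 64-character dictionary: same as the standard table, but with '_' and '-' at positions 62 and 63.
    public $base64_config = ['A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z','a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z','0','1','2','3','4','5','6','7','8','9','_','-'];

    // Convert a UTF-8 string to GBK and return its bytes as a 1-indexed array.
    public function getBytes($string) {
        $data = iconv("UTF-8","GBK",$string);
        return unpack("C*",$data);
    }

    // Return the dictionary index of one encoded character (false if it is not in the dictionary).
    public function array_index($t){
        return array_search($t, $this->base64_config, true);
    }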
    // Decode text produced by encode(). Returns raw bytes (GBK, because encode() converts to GBK first).
    public function decode($str){
        $str = str_replace("!","",$str); // strip the "!" padding
        $slen = strlen($str);
        $mod = $slen%4;
        $num = intdiv($slen,4);
        $desc = [];
        // Full groups: 4 characters -> 3 bytes.
        for($i=0;$i<$num;$i++){
            $arr = array_map([$this,"array_index"],str_split(substr($str,$i*4,4)));
            $desc_0 = ($arr[0]<<2)|(($arr[1]&48)>>4);
            $desc_1 = (($arr[1]&15)<<4)|(($arr[2]&60)>>2);
            $desc_2 = (($arr[2]&3)<<6)|$arr[3];
            $desc = array_merge($desc,[$desc_0,$desc_1,$desc_2]);
        }
        if($mod == 0) return implode('', array_map("chr",$desc));
        // Trailing partial group (what is left after the padding was stripped).
        $arr = array_map([$this,"array_index"], str_split(substr($str,$num*4,4)));
        if(count($arr) == 1) {
            // A single leftover character cannot hold a full byte; kept as a best-effort fallback.
            $desc_0 = $arr[0]<<2;
            if($desc_0 != 0) $desc = array_merge($desc,[$desc_0]);
        }else if(count($arr) == 2) {
            // 2 characters -> 1 byte
            $desc_0 = ($arr[0]<<2)|(($arr[1]&48)>>4);
            $desc = array_merge($desc,[$desc_0]);
        }else if(count($arr) == 3) {
            // 3 characters -> 2 bytes
            $desc_0 = ($arr[0]<<2)|(($arr[1]&48)>>4);
            $desc_1 = (($arr[1]&15)<<4)|(($arr[2]&60)>>2); // mask to the low 4 bits so the value stays within one byte
            $desc = array_merge($desc,[$desc_0,$desc_1]);
        }
        return implode('', array_map("chr",$desc));
    }
    // Encode a UTF-8 string: convert it to GBK bytes, then map every 3 bytes to 4 dictionary characters.
    public function encode($str){
        $byte_arr = $this->getBytes($str); // 1-indexed, hence the +1 offsets below
        $slen=count($byte_arr);
        $smod = ($slen%3);
        $snum = intdiv($slen,3);
        $desc = array();
        // Full groups: 3 bytes -> 4 characters.
        for($i=1;$i<=$snum;$i++){
            $index_num = ($i-1)*3;
            $_dec0= $byte_arr[$index_num+1]>>2;
            $_dec1= (($byte_arr[$index_num+1]&3)<<4)|($byte_arr[$index_num+2]>>4);
            $_dec2= (($byte_arr[$index_num+2]&0xF)<<2)|($byte_arr[$index_num+3]>>6);
            $_dec3= $byte_arr[$index_num+3]&63;
            $desc = array_merge($desc,array($this->base64_config[$_dec0],$this->base64_config[$_dec1],$this->base64_config[$_dec2],$this->base64_config[$_dec3]));
        }
        if($smod==0) return implode('',$desc);
        // Leftover bytes: pad the output with "!" instead of the usual "=".
        $n = ($snum*3)+1;
        $_dec0= $byte_arr[$n]>>2;
        // only one byte left
        if(!isset($byte_arr[$n+1])){
            $_dec1= (($byte_arr[$n]&3)<<4);
            $_dec2=$_dec3="!";
        }else{
            // two bytes left
            $_dec1= (($byte_arr[$n]&3)<<4)|($byte_arr[$n+1]>>4);
            $_dec2= $this->base64_config[($byte_arr[$n+1]&0xF)<<2];
            $_dec3="!";
        }
        $desc = array_merge($desc,array($this->base64_config[$_dec0],$this->base64_config[$_dec1],$_dec2,$_dec3));
        return implode('',$desc);
    }
}
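$base64 = new base64();
//echo array_search("E",$base64->base64_config);
//exit;
$tt = $base64->encode("中文那在场也不怕asdasdas23232323,。、");
echo $tt."\n";
$ttt = $base64->decode($tt);
// encode() converts the input to GBK, so the decoded bytes are GBK; convert back to UTF-8 for display.
echo iconv("GBK","UTF-8",$ttt)."\n";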